Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: ros2 port for all filters #189

Merged
merged 18 commits into from
Aug 23, 2024

Conversation

berend-kupers
Copy link
Contributor

Hello,

I have added ROS2 support for the following filters, which did have support in noetic-devel:

  • polygon filter
  • scan blob filter
  • sector filter

Key changes

  • Port polygon filter to ROS2
  • Port scan blob filter to ROS2
  • Port sector filter to ROS2
  • Port launch test for polygon filter and speckle filter to ROS2
  • Port unit tests for speckle filter and scan shadows filter to ROS2
  • Added "remove_shadow_start_point_" parameter to scan shadows filter, which wasn´t included in the port to ROS2

Benefits:

  • More filters available in ROS2

Functionality Status:

  • No changes have been made to the (default) functionality of other filters

Note:
I want to mention two issues that I've noticed in the already ported laser filters, which would be best to fix in the filters package.

  1. In box_filter and speckle_filter a (lifecycle-) node is created to create a tf listener (box_filter) and to create a on_set_parameters_callback (speckle_filter). When you start a filter chain and rename the node via launch files, it overrides the name for all node instances. So you get the error:

Publisher already registered for provided node name.

  1. Dynamically setting a parameter doesn´t seem to work, even though the speckles_filter does have an on_set_parameters_callback implemented. If I use ros2 param set ..., I get the error:

Setting parameter failed: parameter 'filter2.params.filter_type' cannot be set because it is read-only

I think these parameters are set to read-only in the filters package.

Please let me know if you have any feedback. I would be happy to make any necessary changes.

Kind regards,

Berend Kupers
Lowpad

@bjsowa
Copy link
Contributor

bjsowa commented Jul 10, 2024

What's the status of this PR? @jonbinney Could you please merge this?
I'm interested in using the polygon filter but I'd rather avoid copying the whole implementation to my project.

@jonbinney
Copy link
Contributor

@bjsowa sorry i haven't had time to review and merge this! Are able to try out this branch and give some feedback?

include/laser_filters/polygon_filter.h Outdated Show resolved Hide resolved
include/laser_filters/polygon_filter.h Outdated Show resolved Hide resolved
include/laser_filters/polygon_filter.h Outdated Show resolved Hide resolved
@bjsowa
Copy link
Contributor

bjsowa commented Jul 15, 2024

I tested the polygon_filter and it seems to work after implementing the fixes from my review.

However, I found a general flaw in how some of the filters are implemented. Only the logging and parameter interfaces are passed from the main node to the filters. Thus, any filter that needs to add some Subscriptions or Services, is implemented as LifecycleNode. The problem is, the filter nodes are not spun by any executor, thus the subscription or services callbacks are not called.

The result of it is that, for example, the base_footprint_exclude subscription, as well as, parameter and lifecycle related services in laser_scan_polygon_filter do not work.

IMO, all filters should be implemented as a Node (perhaps rclcpp::Node instead of LifecycleNode), either spawning their own spinning thread like the tf2 transform listener or created as the main node's sub-node

@berend-kupers
Copy link
Contributor Author

@bjsowa I implemented your comments.

As for your comments about the lifecycle node. I tried to stay close to the implementation of the other filters, e.g. box filter and indeed noticed that for example the parameter reconfigure doesn´t work. I thought it would be out of scope for this PR, but agree it needs to be fixed.

In our code, I solved it by forking the filters package and passing a pointer to the node instead of just the param/logging interfaces. That works for us, but would introduce a breaking change. I can try to get one of your suggestions working, but as I said, I think that should be a separate PR

@bjsowa
Copy link
Contributor

bjsowa commented Jul 16, 2024

As for your comments about the lifecycle node. I tried to stay close to the implementation of the other filters, e.g. box filter and indeed noticed that for example the parameter reconfigure doesn´t work. I thought it would be out of scope for this PR, but agree it needs to be fixed.

Agree, it can be addressed in future PRs. @jonbinney I have no more feedback.

@jonbinney
Copy link
Contributor

Thanks for taking a look @bjsowa !

I've got some time to review this PR this week. It looks like tests are failing on some EOL ROS2 distros, so i've updated the CI here: #196

@jonbinney
Copy link
Contributor

@berend-kupers could you rebase onto current head? Then the tests will run on the updated list of distros. Thanks!

@jonbinney
Copy link
Contributor

(or merge ros2 branch into this one, whichever you prefer)

@berend-kupers
Copy link
Contributor Author

@jonbinney Done

@jonbinney
Copy link
Contributor

Looks like the speckle filter test is failing on CI, and I have the same issue when I run locally. It looks like an error during rclcpp shutdown - do you know why this is? Maybe because of the multiple nodes with the same name you mentioned in your Note 1 in the PR description?

@jonbinney
Copy link
Contributor

I've been looking into what changes would be needed to the filters library to avoid needing to create extra nodes. I think a few things:

Add protected member function to FilterBase to get the NodeLoggingInterface. This would enable logging in the filter. (this should be an easy fix)

Either of the following changes for parameters:
1. Add protected member function in FilterBase for filters get the NodeParametersInterface
2. Add ability in FilterBase for filters to mark parameters as writeable and add protected member function in FilterBase to add parameter setting callbacks.
The first option is easier to implement, but is a bit weird because I don't think there is a way to create a NodeParametersInterface that_always_ applies a given prefix, and so the filters would each see the parameters for all the other filters. Their callback functions would also be called anytime the parameters are changed for any filter.

Add a transform listener somewhere... and give filters access to it.

OR we could say forget all of this careful passing of only the interfaces needed by the filters, and provide them a getter() for the entire Node object. This seems quite ugly, although i guess it isn't as ugly as using a singleton node in ROS1.

OR we could go crazy and make each filter its own Component which can all then be loaded dynamically or something.....

In any case, I'd like to get most of this PR merged before any of that stuff happens. These filters are quite useful for people, even without some of their features. @berend-kupers what do you think about removing the dynamic reconfiguring of these filters for now, as well as removing the ability to have two different frames in BoxFilter? Then we can review and merge this PR and keep on with the longer discussion of how to add the extra functionality to the filters package?

@jonbinney
Copy link
Contributor

Still some tests failing - i should have time to debug them later this week to see what is going on.

@jonbinney
Copy link
Contributor

ok i think i know why the tests are failing intermittently. In test_speckle_filter.test.py the subscribers are created, then the publishers, then the raw laser scan is published, then we wait for the filtered scan. But in the background, asynchronous network stuff is happening to actually hook up the subscribers. So even though the subscribers are created "first" sometimes they don't actually connect until after the filtered laser scan has already come back. In that case we wait forever.

I added

        rate = node.create_rate(1)
        rate.sleep()

after node.start_subscribers() and before node.publish_laser_scan(), and after that i was able to run the test a dozen times with no failures. This is too hacky though..... ideas on a better fix?

@berend-kupers
Copy link
Contributor Author

Shouldn't making the publisher transient local do the trick?

Otherwise, I could either

  • publish the scan message at a fixed rate, or
  • create another wait_for_() function and use the node.count_subscribers()

@jonbinney
Copy link
Contributor

By "transient local" do you mean make it a local variable? I'm not sure how that would fix this. I like the idea of publishing the scan at a fixed rate though, that sounds simple and robust to failures.

@berend-kupers
Copy link
Contributor Author

I meant setting the quality of service setting durability to "transient local" for the publisher. Then the publisher should persist samples for "late-joining" subscribers.

But I'm fine with publishing the scan at a fixed rate as well

@jonbinney
Copy link
Contributor

I meant setting the quality of service setting durability to "transient local" for the publisher. Then the publisher should persist samples for "late-joining" subscribers.

Ah, perfect! I'm not familiar with the ROS2 quality of service settings yet, so I just learned something :-)

@jonbinney
Copy link
Contributor

Interesting - a couple tests still failing but I'm having a harder time reproducing it locally. Debugging now.....

We need to publish scans repeatedly in case the filter chain runs and
processes the output scan before we finish subscribing to that topic.
@jonbinney
Copy link
Contributor

Looks like you changed the QoS on the publisher created in the python test node - that's not the one that's causing the problem. We're missing the return message published by the filter chain and received by the python test node. I tried changing the QoS in the subscriber in the python node, but it failed (something about the underlying middleware or the options used in the filter chain).

The other fix you suggested (publishing at a fixed rate until we get a reply) does work. I've implemented that and sent a PR to your PR branch - could you review and merge that? :-)

If you're curious, here is the command I used to run the tests repeatedly to reproduce the intermittent failure:

colcon test --retest-until-fail=100 --packages-select laser_filters --event-handlers=console_cohesion+ --ctest-args --output-on-failure; colcon test-result --all

@jonbinney
Copy link
Contributor

Awesome, tests pass! I'll do a code review tomorrow to make sure nothing major is wrong. Since these filters weren't in ros2 at all before, I don't plan on nit-picking about small changes - those can be fixed in smaller PRs later.

Copy link
Contributor

@jonbinney jonbinney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I have some things I'd like to add, but I think the best approach is to merge this and I'll submit PRs for those other things.

@jonbinney jonbinney merged commit 4945cd7 into ros-perception:ros2 Aug 23, 2024
5 checks passed
@jonbinney
Copy link
Contributor

Thanks @berend-kupers, and also @bjsowa for reviewing. Sorry it took me so long to get to!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants