    halfbit
    @halfbit:matrix.org
    [m]
    I'd rather see latency histograms than throughput though to be honest
    kydos
    @kydos
    @halfbit:matrix.org, what are you using for benchmarking NATS, dds-bench? If so, this is not really an apples-to-apples comparison; please take a look at this line https://github.com/nats-io/nats.go/blob/3b1f6fcc1e1014c838036367494c3012523166b0/examples/nats-bench/main.go#L168
    halfbit
    @halfbit:matrix.org
    [m]
    nats-bench
    they have their own bench tool, which, I mean, yeah, it's hard to say it's doing even remotely the same stuff
    kydos
    @kydos
    Their test is written to do user-level batching/flushing; real-world applications rarely do that manually.
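For illustration, here is a minimal Python sketch (hypothetical `CountingTransport`, not real NATS client code) of the difference between flushing per message and the buffer-then-flush-once pattern a benchmark like nats-bench uses:

```python
class CountingTransport:
    """Stands in for a socket; counts simulated syscalls (writes)."""
    def __init__(self):
        self.syscalls = 0
        self.buffer = bytearray()

    def write(self, data: bytes):
        self.buffer += data  # buffered in user space, no syscall yet

    def flush(self):
        self.syscalls += 1   # one write(2) for everything buffered
        self.buffer.clear()

def publish_naive(transport, messages):
    # What a typical application effectively does: flush per message.
    for msg in messages:
        transport.write(msg)
        transport.flush()

def publish_batched(transport, messages):
    # What the benchmark does: flush once for the whole batch.
    for msg in messages:
        transport.write(msg)
    transport.flush()

msgs = [b"hello"] * 1000
naive, batched = CountingTransport(), CountingTransport()
publish_naive(naive, msgs)     # 1000 simulated syscalls
publish_batched(batched, msgs) # 1 simulated syscall
```

The throughput gap between the two styles is mostly this syscall count, which is why a benchmark written in the batched style flatters any transport.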
    halfbit
    @halfbit:matrix.org
    [m]
    yeah, we don't do that when using nats
    kydos
    @kydos
    In zenoh, if you look at the pub/sub code used for throughput, it's what a beginner would write. We think that is important, as your application should not need to get into these nasty details to get decent performance.
    Our scheduling is done at the transport level, thus it is transparent to the user.
    halfbit
    @halfbit:matrix.org
    [m]
    it's interesting that batching is even needed, I get the ideal scenario is avoiding too many syscalls... would io_uring do better without having to batch up messages?
    push messages into a queue with the kernel, let it do things as fast as it can kind of deal
    rather than this goofy "let's buffer up a million messages then call write once"
    that nats-bench is doing at least
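As a rough illustration of the "hand the kernel a queue" idea: io_uring isn't reachable from pure stdlib Python, but POSIX `writev(2)` (exposed as `os.writev` on Unix) already submits many user-space buffers in a single syscall, and io_uring generalises this to asynchronous submission queues. A minimal Unix-only sketch:

```python
import os

# 100 separate messages, but only one syscall to hand them all to the
# kernel via writev(2) -- no user-level concatenation into one big buffer.
msgs = [b"msg%03d\n" % i for i in range(100)]

r, w = os.pipe()
written = os.writev(w, msgs)   # one syscall for all 100 buffers
os.close(w)

data = os.read(r, 65536)       # 700 bytes fit in one pipe read
os.close(r)

assert written == sum(len(m) for m in msgs)
```

This still amortises the syscall over many messages, so some queueing is always happening somewhere; io_uring just moves the queue into a ring shared with the kernel.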
    kydos
    @kydos
    At the transport level zenoh sends frames, and a frame can include multiple messages. One frame is sent with a single syscall (if we do not need to fragment). How we pack it up is part of our scheduler, but the goal is to reduce syscalls while at the same time ensuring message prioritisation.
    To be more precise, within a zenoh session we can use multiple links, e.g. TCP/IP connections, to ensure that we can better recover from congestion over WAN; that is a relatively well-known trick.
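A toy sketch of that kind of transport-level scheduling (hypothetical `pack_frames`, not zenoh's actual scheduler): messages are drained in priority order and packed into MTU-sized frames, each of which would go out in a single syscall; fragmentation of oversized messages is omitted here:

```python
import heapq

MTU = 64  # pretend frame size limit, in bytes

def pack_frames(messages):
    """Pack (priority, payload) messages into frames of at most MTU bytes,
    draining higher-priority (lower number) messages first. Each frame
    would then be sent with one syscall."""
    heap = list(messages)
    heapq.heapify(heap)          # orders by priority first
    frames, current, size = [], [], 0
    while heap:
        prio, payload = heapq.heappop(heap)
        if size + len(payload) > MTU and current:
            frames.append(b"".join(current))  # frame full: "send" it
            current, size = [], 0
        current.append(payload)
        size += len(payload)
    if current:
        frames.append(b"".join(current))
    return frames

msgs = [(1, b"control!"), (7, b"bulk-data-" * 3),
        (1, b"urgent--"), (7, b"bulk-data-" * 3)]
frames = pack_frames(msgs)
# Priority-1 messages are packed ahead of the bulk data, and four
# messages go out in two frames instead of four writes.
```

The point is that the application just publishes; the batching and prioritisation decisions live entirely below the API.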
    halfbit
    @halfbit:matrix.org
    [m]
    makes sense, and seemingly what nats does as well
    zenoh seems to be capped by the line rate between two computers, which is fantastic
    at least on this puny gigabit network I have
    no 10g to test against :-)
    Geoff
    @geoff_or:matrix.org
    [m]
    I can't get the zenoh-bridge-dds to work across NAT ☹️

    On a native PC, I launch:
    ./target/release/zenoh-bridge-dds -m peer -d 0 -l tcp/10.89.76.138:7447 -e tcp/10.89.76.139:7447

    On a container running on another PC, I launch:
    ./target/release/zenoh-bridge-dds -m peer -d 0 -l tcp/10.92.70.153:7447 -e tcp/10.89.76.138:7447

    The container's host is forwarding port tcp/7447 into the container. I confirmed that port forwarding is working with iperf.

    However when I run a ROS 2 publisher on the native PC, and then list topics in the container, I don't see the published topic.

    Geoff
    @geoff_or:matrix.org
    [m]
    I can see the two instances of zenoh-bridge-dds talking to each other in wireshark
    Geoff
    @geoff_or:matrix.org
    [m]
    That's a wireshark capture from the host
    Interestingly, in captured packet 19 I can see the IP address of the container (10.92.70.153) being sent to the native PC (10.89.76.138). I don't know how zenoh discovery works but is it possible that the instance of zenoh running inside the container is telling the instance running on the native PC to connect to the container's IP rather than the IP of the host PC (which would then NAT and forward that connection into the container)?
    Geoff
    @geoff_or:matrix.org
    [m]
    Is there any documentation on what configuration parameters can be set in a configuration file?
    Geoff
    @geoff_or:matrix.org
    [m]
    It's working using a router on the host PC. I can communicate over ROS between the native PC and the container :)
    I'd like to get it going over NAT without the router though, if that's possible.
    Interesting... if I stop publishing the topic, it's still listed as available. I guess that zenoh-bridge-dds keeps making the topic available even though there are no publishers for it anymore.
    That's not ideal.
    I think that's also why communication worked across the router. Topic discovery doesn't seem to work, but if I publish the same topic locally then ROS thinks the topic exists now and will start publishing it
    Yep, that's what's happening. I have to publish a topic in the container to make ROS think it exists before I can echo it. So topic discovery across the DDS bridge isn't working.
    Julien Enoch
    @JEnoch

    Hi @geoff_or:matrix.org,

    On a native PC, I launch:
    ./target/release/zenoh-bridge-dds -m peer -d 0 -l tcp/10.89.76.138:7447 -e tcp/10.89.76.139:7447
    On a container running on another PC, I launch:
    ./target/release/zenoh-bridge-dds -m peer -d 0 -l tcp/10.92.70.153:7447 -e tcp/10.89.76.138:7447

    A single TCP connection is good enough for 2 zenoh peers to communicate with each other (as it's a bi-directional link). Usually, a container only accepts incoming connections and cannot establish outgoing connections. Thus, you can remove the -e tcp/10.89.76.138:7447 argument from the bridge in the container (but it doesn't harm; there will be no duplicate traffic).

    However when I run a ROS 2 publisher on the native PC, and then list topics in the container, I don't see the published topic.

    For scalability purposes, the zenoh-dds-bridge doesn't route all the DDS discovery traffic (if a DDS system declares thousands of Readers/Writers, we don't want the DDS discovery traffic to overwhelm all the other bridged systems that might need only a few topics). Only users' publications are routed, and only if there is a matching subscription somewhere.
    In detail, here is how one bridge works:

    • It creates a DDS Participant that discovers all DDS Readers and Writers via the regular DDSI protocol (UDP multicast by default).
    • For each discovered DDS Writer, the bridge creates a matching DDS Reader and declares a publication on a zenoh resource corresponding to the DDS topic name (and partition).
    • For each discovered DDS Reader, the bridge creates a matching DDS Writer and declares a subscription on a zenoh resource corresponding to the DDS topic name (and partition).
    • And, the way zenoh works, the published DDS data routed by the bridge will be transmitted to another bridge only if that one declares a subscription.
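The steps above can be sketched roughly as follows (hypothetical `Bridge` class, not the real zenoh-bridge-dds code):

```python
class Bridge:
    """Toy model of one zenoh-bridge-dds instance: it mirrors each
    discovered local DDS endpoint as a zenoh publication/subscription."""
    def __init__(self):
        self.zenoh_pubs = {}   # topic -> declared zenoh publication
        self.zenoh_subs = {}   # topic -> declared zenoh subscription

    def on_dds_writer_discovered(self, topic):
        # Mirror: local DDS Reader + zenoh publication for that topic.
        self.zenoh_pubs[topic] = f"dds/{topic}"

    def on_dds_reader_discovered(self, topic):
        # Mirror: local DDS Writer + zenoh subscription for that topic.
        self.zenoh_subs[topic] = f"dds/{topic}"

def routed_topics(pub_side, sub_side):
    # Data crosses the bridge only where a publication on one side
    # matches a subscription on the other; raw DDS discovery traffic
    # itself is never forwarded.
    return sorted(set(pub_side.zenoh_pubs) & set(sub_side.zenoh_subs))

a, b = Bridge(), Bridge()
a.on_dds_writer_discovered("chatter")    # native PC publishes /chatter
b.on_dds_reader_discovered("chatter")    # container node subscribes
a.on_dds_writer_discovered("telemetry")  # no remote subscriber
```

Here only `chatter` is routed; `telemetry` stays local, which is exactly why `ros2 topic list` on the other side cannot see it.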

    Notice that I'm currently working on an admin space for the zenoh-bridge-dds (similar to the zenoh router's admin space). With this, you'll be able to browse the discovered DDS Readers/Writers and the established routes for each zenoh-bridge-dds. A zenoh admin space is accessible via GET operations using any zenoh API (including the REST API).

    Interestingly, in captured packet 19 I can see the IP address of the container (10.92.70.153) being sent to the native PC (10.89.76.138). I don't know how zenoh discovery works but is it possible that the instance of zenoh running inside the container is telling the instance running on the native PC to connect to the container's IP rather than the IP of the host PC (which would then NAT and forward that connection into the container)?

    That's to be confirmed ( @OlivierHecart ?), but I think this packet 19 is part of the "gossip discovery": each zenoh peer sends to the other peers the list of locators it has discovered. Thus another peer will try to connect to those locators to establish a more robust connectivity graph. Again, if ever this connection is established in parallel with an already existing connection, it doesn't harm.
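A toy model of that gossip step (hypothetical helper, not zenoh's wire protocol): a peer behind NAT advertises its internal locator, which the remote side simply fails to reach, while the already-established TCP link keeps working:

```python
def gossip(advertised_locators, reachable):
    """Return which gossiped locators a peer can actually connect to.
    Unreachable ones (e.g. a container's internal IP seen from outside
    the NAT) are just failed connection attempts, nothing breaks."""
    return [loc for loc in advertised_locators if loc in reachable]

# The container gossips its own (internal) address...
container_advertises = ["tcp/10.92.70.153:7447"]
# ...but the native PC can only reach the container's host.
native_pc_can_reach = {"tcp/10.89.76.139:7447"}

extra_links = gossip(container_advertises, native_pc_can_reach)
# No extra link is established; the original connection carries the data.
```

This matches what Geoff saw in packet 19: the internal IP crossing the wire is expected gossip, and its unreachability is harmless.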

    I think that's also why communication worked across the router. Topic discovery doesn't seem to work, but if I publish the same topic locally then ROS thinks the topic exists now and will start publishing it
    Yep, that's what's happening. I have to publish a topic in the container to make ROS think it exists before I can echo it. So topic discovery across the DDS bridge isn't working.

    I don't understand why you need to "publish a topic in the container to make ROS think it exists". Does ROS create the DDS subscription only if it discovers the DDS publication? That would explain the behaviour you see.
    But that's not what I saw when testing with the ROS2 turtlesim + teleop: both declare their DDS subscriptions at startup anyway.

    It's working using a router on the host PC. I can communicate over ROS between the native PC and the container :)
    I'd like to get it going over NAT without the router though, if that's possible.

    It should have been working with your 1st configuration, without the router.
    Maybe you have been confused by ros2 topic list not working because of the non-routed discovery traffic?
    Can you please give it another try, starting the ROS2 nodes on each side and checking if the data flows? You can also set the RUST_LOG=debug environment variable to see the DDS discovery information and the established zenoh routes.

    Geoff
    @geoff_or:matrix.org
    [m]
    Thanks for the advice. I'll give it another go in the morning.
    Geoff
    @geoff_or:matrix.org
    [m]

    I don't understand why you need to "publish a topic in the container to make ROS think it exists". Does ROS create the DDS subscription only if it discovers the DDS publication? That would explain the behaviour you see.
    But that's not what I saw when testing with the ROS2 turtlesim + teleop: both declare their DDS subscriptions at startup anyway.

    The case you were testing will work the way you describe, but that's effectively a finished system: it works because the nodes you run on both sides of the bridge explicitly open the topics they want. A lot of things in ROS don't work that way. No ROS developer is going to be happy if the developer tools don't work, and the developer tools won't work if introspection information isn't available. (There are important use cases in ROS where having that information available without needing a shell on the actual robot matters.) That's why ros2 topic list won't show topics that are actually available: it doesn't subscribe to them to list them. It's also why ros2 topic echo won't echo a topic that actually exists on the far side of the bridge: it thinks the topic doesn't exist because it can't see it among the discovered topics, and so it refuses to subscribe to it. And it's why rviz won't be able to visualise data by topic: it doesn't know the topics exist, so the developer must know them in advance.
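The refusal Geoff describes can be sketched like this (simplified, hypothetical `topic_echo`, not the actual ros2cli code): the tool consults the locally discovered topic set before it will subscribe at all:

```python
def topic_echo(topic, discovered_topics):
    """Mimic the guard in the CLI tools: refuse to subscribe to a
    topic that doesn't show up in local DDS discovery."""
    if topic not in discovered_topics:
        return f"topic '{topic}' does not appear to be published yet"
    return f"subscribed to '{topic}'"

# The bridge didn't route discovery, so the remote topic is invisible:
discovered = {"/rosout", "/parameter_events"}
print(topic_echo("/chatter", discovered))
```

Since the bridge only routes data for declared subscriptions, and the tool never declares one for an "invisible" topic, neither side ever triggers a route.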

    Julien Enoch
    @JEnoch
    That’s exactly the piece of information we were missing about ROS2 design, thanks!
    Do you also know if those dev tools rely only on DDS Topic discovery, or also on DDS Reader/Writer discovery?
    Better, can you point me to the implementation code of the ros2 topic * commands that relies on DDS discovery?
    Then we can figure out if there is something smart we could do without bringing back overwhelming DDS discovery traffic.
    kydos
    @kydos
    @JEnoch and @geoff_or:matrix.org I think that we should provide an alternative implementation for ros2 topic based on zenoh queries.
    Julien Enoch
    @JEnoch
    @kydos according to Geoff, I understood that would not solve the issues for rviz.
    Gergely Kis
    @kisg
    I have not yet tried it myself, but Iceoryx does have a DDS bridge using CycloneDDS: https://github.com/eclipse-iceoryx/iceoryx/tree/master/iceoryx_dds
    I don't know if their implementation was done with the above requirements in mind, but it might be worth a look.
    Geoff
    @geoff_or:matrix.org
    [m]
    @JEnoch: The ros2 topic command is implemented here:
    https://github.com/ros2/ros2cli/tree/master/ros2topic/ros2topic
    Geoff
    @geoff_or:matrix.org
    [m]
    @JEnoch: Knowing now that ros2 topic list won't work, I've confirmed that running the zenoh-dds-bridge in peer mode across NAT without a router on the host PC works just fine. Thanks!
    halfbit
    @halfbit:matrix.org
    [m]
    in testing iceoryx today (not that I did a thorough job), it didn't really seem to be much faster than zenoh for my small-message scenario, the bonus of zenoh being I don't need to suddenly jump to another protocol for multi-host pub/sub
    yingliangzhe
    @yingliangzhe:matrix.org
    [m]
    Hello Everyone,
    I am trying to subscribe to a message from ros2. The zenoh dds bridge is already running on the remote machine. ROS2 is running on the remote machine, and my python script is running on my local machine. I have successfully published a message from my local machine and the remote one received it. But when I try to receive messages from the remote machine, it seems that the callback function is not called.
    Does someone have the same experience?
    Julien Enoch
    @JEnoch
    Hi @yingliangzhe:matrix.org ,
    if you activate the debug logs for the bridge (exporting the RUST_LOG=debug environment variable prior to starting it), do you see in the logs a route being created for the topic you're subscribing to?
    The log should be something like:
    New route: DDS ‘…’ => zenoh ‘…’
    yingliangzhe
    @yingliangzhe:matrix.org
    [m]
    for the zenoh publisher and ros2 subscriber I did see the log New route: DDS '...'