peddinavk
@peddinavk
Thank you @budrus, @sebastian:reparts.org for your replies. Please allow me to provide more details about this use case, which is similar to the "chain" mentioned by @sebastian:reparts.org. A V2X stack (in automotive) is similar to a network stack in user space, with a layered structure (AccessLayer (AL) --> NetworkLayer (NL) --> MessagingLayer (ML) --> SafetyApps (SA)). With respect to data access they form a chain: (AL) pub --> (NL) mod-sub-pub --> (ML) mod-sub-pub --> (SA) multiple subs. In a normal network stack each layer strips its respective header and routes the payload to the upper layers; a V2X stack works the same way, but with an extra catch: some of these layers unpack the data and create a new header before passing it to the next layer. For example, the NL header is ASN.1-serialized to save over-the-air bandwidth, so the NL has to deserialize the header, create and attach a new header to the payload, and send it to the ML, where the ML verifies/decrypts the payload and provides the final safety data to the safety apps. Safety apps like FCW (Forward Collision Warning) and ICW (Intersection Collision Warning) use the same "safety payload" to determine the threat, but these safety apps only consume the data and won't modify it, as they are the last users in the chain. We are looking for a zero-copy mechanism for this kind of layered stack, operating in the same shared memory segment created by the AL and released by the final-layer apps (the safety apps).
peddinavk
@peddinavk

@peddinavk one question where your input would be interesting. If you want to do this, do you want to have the contract that only one subscriber is allowed for this topic? Then the single subscriber could always change the data, and multiple subscribers would be an error. Or would you like the functionality that a subscriber can request a writable reference, and if there are other readers a copy is made behind the scenes? The first option is maybe easier.

Yes, the first option would suit us, as there is only one modifier; the end subscribers can be more than one, since they are only readers.

peddinavk
@peddinavk

> (quoting the earlier message above about the layered V2X zero-copy use case)

Out of curiosity, has this group come across this kind of use case (a chain model of pub/sub) in the past?

sebastian
@sebastian:reparts.org
[m]
@peddinavk: Hah, I hadn't thought about networking stacks before, but back in the "dark ages" I was developing IP and 802.15.4 stacks on low-power microcontrollers, and we did exactly that: often only a single buffer (sometimes for RX and TX) used across all layers, data changed in place, and pointers pointing... everywhere 🤪 Good times. It seems those thinking patterns are still engraved in my brain 😁
bound-check
@bound-check
Hi. I would like to know how hard it would be to make iceoryx compiler-agnostic. I am considering transforming all structures in shared memory to ones based on FlatBuffers.
Simon Hoinkis
@mossmaurice
Hi @bound-check, what exactly do you mean by "agnostic to compiler"? You can build iceoryx with gcc, clang and msvc. I'm not familiar with FlatBuffers in detail, but you should be able to loan a chunk in shared memory via iceoryx and write your serialized FlatBuffers data into the chunk. In the subscriber app you could then read and interpret the data via FlatBuffers. If you have more details or other questions, feel free to join our developer meetup, which is happening in 25 minutes!
Simon Hoinkis
@mossmaurice
rmw_iceoryx was updated to iceoryx v1.0.1 for ROS 2 Foxy. It's far from finished, but feel free to give it a try :blush:
bound-check
@bound-check
@mossmaurice The docs say that apps and RouDi should be compiled with the same compiler and flags. As far as I can see from the source code, this is because iceoryx stores C++ objects directly in shared memory, leaving the underlying memory layout at the disposal of the compiler. I want to lift this restriction so that apps and RouDi can each use the compiler of their choice.
11 replies
Joe Speed is hiring
@joespeed
maybe help them rerun this with loaned message API cyclonedds+iceoryx? ros2/rclcpp#1642
Joe Speed is hiring
@joespeed
loaned message results for cyclonedds w built-in iceoryx https://github.com/ros2/rclcpp/issues/1642#issuecomment-900387789
Michael Pöhnl
@budrus
Thanks @joespeed . Further graphs we need to understand ;-)
Nikhil Marathe
@nikhilm:matrix.org
[m]
Hello! I've been evaluating Iceoryx for a project and been really impressed so far. I have a question about plugging WaitSet or similar into an existing event loop. We have a loop based on select/epoll that uses file descriptors. Is there a way to get a file descriptor from Iceoryx that becomes readable when events have arrived? I was thinking of something like eventfd (Linux specific) instead of the condition variable. Is it possible to implement something like a WaitSet using the Iceoryx API where I could use an eventfd to wake up subscribers? An eventfd cannot be shared among processes however, nor would it make sense to put it in shared memory. Any tips appreciated. Alternatively we would run a separate thread that had a WaitSet, and then use our own callback to notify the main loop via eventfd, but that leads to 2 thread wakeups instead of 1, so just trying to avoid that.
Michael Pöhnl
@budrus
Hey @nikhilm:matrix.org. Interesting question. I assume you want to wait for several events with epoll, and a new message for an iceoryx subscriber could be one of them? As you already wrote, the fallback would be to have a separate "bridging" thread. In this case the iceoryx Listener is maybe even the better choice: you could write to the eventfd in the Listener callback. Avoiding two waiting threads would only be possible if we could trigger the event you are waiting for with epoll from another process. As I think we cannot solve this with the semaphore we are currently using, the question is whether there is an alternative inter-process event mechanism that can be used with epoll.
Michael Pöhnl
@budrus
We currently support Linux, QNX, MacOS, Windows and, hopefully with low effort, every OS that is POSIX compliant. There is a risk that we might find a solution that works for Linux with epoll and eventfd but not for other platforms. The only idea @elfenpiff and I have is to provide the file handle of the shared memory semaphore we are using to the user (when switching to a named one). How exactly this could look is not yet clear to me. The first thing to try would be creating a semaphore, getting its file handle, using it in a select/epoll, and triggering the semaphore from another thread. If this works, we could try it with a shared memory semaphore and multiple processes. If all this works, it should be feasible.
Ulrich Eck
@ulricheck
Hello! I've also recently started using iceoryx for one of our research projects at TUM. Recently we deployed the system on multi-processor (NUMA) machines. Are there any plans to provide support for memory locality (e.g. multiple segments, pinned to NUMA nodes) at some point? At the moment I can only utilize one physical node with the shm-connected process pipeline, otherwise I'd saturate the inter-processor comm channel. Besides that - iceoryx is awesome so far!
Nikhil Marathe
@nikhilm:matrix.org
[m]
Thank you! I guess we will just have to use the 2 thread mechanism. It will add another 1-2us to our latency, but it will have to do until we can redo some other parts of the program.
Michael Pöhnl
@budrus
@ulricheck We already thought about having different memory segments in different physical memories (e.g. CPU memory and GPU memory) and to do the memory transfers behind the scenes when a publisher and a subscriber do not have access to the same memory. But this is not yet implemented. What is possible is to write some kind of bridge on user side that takes care of this. I'm not sure if I fully understand your use case. You won't get around sending data over the inter-processor comm channel. But now you would have to send everything and if it is optimized, you would only have to transfer the messages needed by one node? Maybe you want to show up on one of our next developer meetings and we can discuss this https://github.com/eclipse-iceoryx/iceoryx/wiki/Developer-meetup
Ulrich Eck
@ulricheck
@budrus thanks for your answer! As I'm new to iceoryx I might not be aware of some more advanced options. So far my understanding is that I can only run one instance of RouDi, and that it allocates one segment for management and one for data; I can control where they are allocated using numactl on Linux. My use case would involve allocating a data shm segment on each NUMA node and having applications select which segment to use for each publisher/subscriber, so an explicit approach would work for me at this point. Being able to allocate on GPU memory (specifically CUDA) would be another huge improvement for our system; currently we have to download/upload GPU data at the process boundaries. I'd be interested to join one of the meetups.
16 replies
Yonggang Luo
@lygstate
Hi, I am new to iceoryx. I have a question: can I use iceoryx without running a daemon, like DDS does for service discovery?
bob
@bob:matrix.bobedibob.de
[m]
@lygstate: iceoryx has a central daemon called RouDi. This daemon creates all the resources, like shared memory, and takes care of the service discovery. Without this daemon, applications will not run.
Yonggang Luo
@lygstate

@lygstate: iceoryx has a central daemon called RouDi. This daemon creates all the resources, like shared memory, and takes care of the service discovery. Without this daemon, applications will not run.

Yep, I've seen that. Is it possible for RouDi to not be a daemon, but instead be embedded into each running process and freed by the last running process?
In other words, what would I need to do so that RouDi can be embedded into each process and used like DDS on a single machine?

elBoberido
@elBoberido
@lygstate that's not possible. There is an option to embed RouDi into a single process (see the singleprocess example) but then this application must be started first and stopped last. There is currently no mechanism for what you described.
bob
@bob:matrix.bobedibob.de
[m]
@lygstate: you might be interested in this eclipse-iceoryx/iceoryx#970
Nikhil Marathe
@nikhilm:matrix.org
[m]
Is it safe to use an Iceoryx publisher from any thread, or does it have to be used on the thread it is created on? Does it need external synchronization? Just trying to decide if I should be creating a publisher per-thread in my process, or having one shared by all of them. Thanks!
1 reply
Yonggang Luo
@lygstate
@bob:matrix.bobedibob.de Thanks a lot
Nikhil Marathe
@nikhilm:matrix.org
[m]
Thank you! Another question I have is around accidentally quadratic behavior of Iceoryx. Are there downsides to increasing IOX_MAX_SUBSCRIBERS for example? Or, for a Listener changing the max allowed subscribers to a much higher value like 255 (uint8_t max). Will it lead to extra iterations over all 255 entries instead of over 128 entries? We have a system with a lot of subscribers for certain topics, and while I can eventually reduce some of them, I have tweaked these values for a rough and dirty prototype. Would like to know if I'm paying hidden costs.
1 reply
elfenpiff
@elfenpiff

@nikhilm:matrix.org the short answer is: it is very likely that you will not see any increase in runtime unless you massively increase the number to something like 16384 or higher, and even then the increase should be in the range of microseconds. As long as the number is less than 4096 (the page size), the runtime should be more or less identical.

Here is why: when you increase the number of supported attachments (MAX_NUMBER_OF_EVENTS_PER_LISTENER) you have to increase MAX_NUMBER_OF_NOTIFIERS_PER_CONDITION_VARIABLE, and with that you increase the size of an array of bools. It is not yet optimized, but eventually 8 bools will be packed into one byte, so an array of 256 entries would take 32 bytes. For now it is an array of 256 bytes, one byte per bool.

The page size is usually 4096 bytes, so the CPU can fetch the whole array in a single page read and then start working. Unless you increase the number of events beyond 4096, there should be no increase in the memory-to-CPU transfer overhead; if you go beyond 4096, expect roughly an additional 100 ns (something in that range) of overhead for every extra page. See the article below on where this number comes from.

The CPU then checks every entry; if it is true, the entry id is stored in a vector, and the listener later only checks and handles the ids where something actually happened. In the worst case the vector could be as large as the number of events, but then you have other serious problems in the system, like very heavy load with the listener rarely being scheduled; the normal case is one id in the vector, or maybe two.
This check is performed directly on the CPU without acquiring new pages and is so fast that the difference between 128 and 1024 entries should not be measurable, since the next step is again to load something into the cache and do work, which takes much longer than checking a bunch of bools.

The final big task is to transfer the vector of ids back to memory, but since it rarely contains much, it should also fit into a single page write.

Here is a nice article on how long a read from memory to CPU cache takes: https://formulusblack.com/blog/compute-performance-distance-of-data-as-a-measure-of-latency/

Summary:

  • Less than 4096: no measurable increase in overhead.
  • N * 4096: maybe another N * 100 ns of runtime cost.
  • If your runtime increases massively, it is because a huge number of subscribers is triggered in each run, which can only happen when either the callbacks take a very long time (multiple ms or even seconds) or your system is under extreme CPU load. If the system is not under load, the listener should react more or less instantly, so it should be hard to accumulate more than one triggered subscriber (maybe two, but then they had to receive their samples at more or less the same time).
Nikhil Marathe
@nikhilm:matrix.org
[m]
Thanks! Yeah, I am seeing that our performance is generally dominated more by thread wake-up costs than by anything iceoryx itself is doing. I only increased the number of events per listener to 255.
I'm curious whether you folks have also observed this (thread wakeups dominating, on Linux) and what you've done about it? I mentioned before (Nov 2, above) that I'm currently stuck with a 2-thread model, and it doesn't help that waking up a thread is more expensive than the Listener subscription simply putting messages in another queue to integrate with our main event loop.
kekpirat
@kekpirat:matrix.org
[m]
Hello, is there any way to check from within a program whether the PoshRuntime has been started for it? Something like a static bool PoshRuntime::isRuntimeInitialised() - I'd like to be able to check if it is not initialized, so I can prevent the construction of my publishers/subscribers without getting terminated.
elBoberido
@elBoberido
@kekpirat:matrix.org currently there is no option to check whether the runtime is initialized. What is your use case for not just initializing it at the beginning of main?
2 replies
Simon Hoinkis
@mossmaurice

Hi @/all
I just opened a draft PR #997 to discuss the new service discovery user API for v2.0.0 and I need your help! What do you think is the best fitting name for the class which provides information about the current services in the system?

  1. iox::runtime::DiscoveryInfo
  2. iox::runtime::DiscoveryProvider
  3. iox::runtime::PoshDiscovery

Any other ideas? Would appreciate your feedback! Also, because many will be gone soonish, I wish you Merry Christmas and a Happy New Year! :santa: :christmas_tree:

2 replies
Simon Hoinkis
@mossmaurice
PSA: There is no iceoryx developer meetup on December 23rd 2021!
xR3b0rn
@xR3b0rn
Is there a way to publish data with dynamic length?
xR3b0rn
@xR3b0rn
Or is there a way to set the size of a publisher chunk at runtime? Theoretically it must be possible, since the chunk size has to be registered with RouDi at runtime in any case. However, I am not able to find this functionality exposed by the API.
Michael Pöhnl
@budrus
@xR3b0rn If you use the untyped API, the publisher can send chunks with different sizes. See https://github.com/eclipse-iceoryx/iceoryx/blob/master/iceoryx_examples/icedelivery/iox_publisher_untyped.cpp#L58 But you still need to know how many bytes you need when calling loan(). The typed API will always allocate a chunk of the sizeof() your SampleType with loan(). There are first ideas to also support SampleTypes with dynamic size, but this is not yet implemented. There is a PR with a design doc showing how this could look: https://github.com/eclipse-iceoryx/iceoryx/pull/912/files Would this be a solution for your use case?
6 replies
Simon Hoinkis
@mossmaurice

Hi @/all

the merge window is closing and v2.0 will likely land end of this week. We still need a name for it! In our established tradition to name each release alphabetically after ice cream flavours, please participate in the following poll: https://nuudel.digitalcourage.de/KHMZu2N1RxQQJTOD

Simon Hoinkis
@mossmaurice
I'm happy to announce that the Eclipse iceoryx v2.0 release will be called Blueberry 🫐 :tada: Thanks for voting everyone! We should enjoy some blueberry ice cream in the upcoming developer meetup :yum:
Simon Hoinkis
@mossmaurice
Hi @/all Eclipse iceoryx Blueberry 🫐 aka v2.0.0 has landed! More info is available on the release page. One of the new features is request/response communication, closing the gaps towards AUTOSAR's ara::com :car: :truck: :tractor: Come & join the developer meetup on Thursday if you have questions!
Simon Hoinkis
@mossmaurice
Quick reminder due to the current daylight saving in the US: the developer meetup today is happening at 17:00 CET which is in 1h and 40 mins from now.
sgf201
@sgf201
Is there a description of the improvements in Blueberry compared with v1.0?
Simon Hoinkis
@mossmaurice
Sure, a compact version is here; the full changelog is here.
sgf201
@sgf201
(two screenshots: the feature table from the current README.md and from the v1.0 docs)
The README.md is at the top and v1.0 below. Did iceoryx support macOS at v1.0? If so, is the README.md out of date?
elBoberido
@elBoberido
@sgf201 the "not planned for implementation" only applies to the access rights. So one cannot create multiple memory segments with different user rights.
elBoberido
@elBoberido
Could we switch our developer meetups from Zoom to another solution, e.g. jitsi? It seems Zoom gets creepy https://nypost.com/2022/05/11/zoom-must-nix-emotion-tracking-feature-human-rights-groups/
Simon Hoinkis
@mossmaurice
Hi @/all due to a bank holiday in Germany tomorrow, the developer meetup will not take place. Next one is on 2022/06/09
Simon Hoinkis
@mossmaurice
Hi @/all next Tuesday on 2022/06/07 at 17:00 CET there will be a community meetup "Eclipse iceoryx Blueberry - Speeds up AUTOSAR Adaptive" happening. The talk will briefly discuss the history of iceoryx and introduce new features like request-response communication. You can register via this link: https://www.crowdcast.io/e/June7_22_EclipseMeetUp/register Hope to see you there!