Ivan Zemlyanskiy
@QIvan

(sorry, didn't press the send button =) )

Unsafe in Java 11 can often not inline as an intrinsic which results in a JNI function call per operation.
We spend time minimising the impact with Aeron so people can get a reasonable experience on Java 11. We hope people appreciate our efforts.

Wow, that's scary. I can't speak for all Aeron users, but I'm sure most of them definitely appreciate your efforts, guys. Aeron is awesome; it's the fastest growing transport (and probably project) I've ever known. Thank you for all your hard work!

Feras
@feribg
@mjpt777 Wondering if you have any suggestions on how the muxing protocol suggested in the Best Practices Guide (https://github.com/real-logic/aeron/wiki/Best-Practices-Guide) would work without having to send the entire contents of the stream, in real time or via the archiver, and do all the filtering on the receiver.
Martin Thompson
@mjpt777
@feribg I cannot think of a simple answer to that question without understanding a lot more about the requirement.
Feras
@feribg
@mjpt777 Since the overall number of streams is required to be low, if the full number of "topics" of interest to clients is vastly larger than the number of streams, how do you avoid having to send the entire stream to all the clients all the time? For instance, if there are 3-500 tickers per stream and a client is interested in at most 20-30 at a time, it's forced to consume the entire list of 500 regardless. This becomes more problematic when your topics of interest are spread across streams. Does that clarify it a bit better? In other terms, how do you abstract, in a reasonably straightforward way, a more traditional concept of topics over a smaller number of streams without having to send the full stream?
Martin Thompson
@mjpt777
@feribg How do you think topics work in other messaging systems?
Feras
@feribg
@mjpt777 My understanding is that topics map to streams 1:1 from a building block perspective, but it's usually an unbounded number or a very large number. For a very large number of consumers the traffic multiplication factor is huge without the filtering burden being put on the broker.
Todd L. Montgomery
@tmontgomery
@feribg broker filtering per client is a huge drag and limits scaling to a HUGE degree. Brokerless designs do scale better, provably by 2-3 orders of magnitude, as you can control the topic to transport mapping (in UM and in Aeron). It is easier to throw data away at the receiver than to filter and route at the broker
for example, throwing away 100% of 20-30M msgs/sec on the receiver can be done at < 10% of a single CPU, whereas a broker routing > 10 clients with 10 topics can't break more than 100K msgs/sec.
just anecdotally from systems I've worked on
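A minimal sketch of the receiver-side filtering described above, assuming a hypothetical message layout in which every message starts with a 4-byte ticker id. The class name `TickerFilter`, the header layout and the `replay` helper are illustrative assumptions, not Aeron API; in a real subscriber the same check would sit at the top of the fragment handler.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical receiver-side filter: discard messages for tickers nobody asked for.
final class TickerFilter {
    private final Set<Integer> interested;
    private long delivered;
    private long dropped;

    TickerFilter(Integer... interestedTickerIds) {
        this.interested = new HashSet<>(Arrays.asList(interestedTickerIds));
    }

    // Assumed layout: a 4-byte ticker id at the start of each message.
    // Returns true if the message should be handed to the application.
    boolean onMessage(ByteBuffer buffer, int offset) {
        int tickerId = buffer.getInt(offset);
        if (!interested.contains(tickerId)) {
            dropped++; // cheap path: one header read and one set lookup
            return false;
        }
        delivered++;
        return true;
    }

    long delivered() { return delivered; }

    long dropped() { return dropped; }

    // Feeds one message per ticker id in [0, tickerCount) through the filter.
    static TickerFilter replay(int tickerCount, Integer... interestedTickerIds) {
        TickerFilter filter = new TickerFilter(interestedTickerIds);
        ByteBuffer message = ByteBuffer.allocate(Integer.BYTES);
        for (int tickerId = 0; tickerId < tickerCount; tickerId++) {
            message.putInt(0, tickerId);
            filter.onMessage(message, 0);
        }
        return filter;
    }
}
```

The boxed `Set<Integer>` keeps the sketch short; a latency-sensitive receiver would more likely use a primitive set or a bitmap indexed by ticker id.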
Feras
@feribg
@tmontgomery Yep, that all makes sense. I'm just trying to solve for the case of catching up faster from a stream replayed by the archive, by avoiding streaming a giant amount of data just to throw most of it away. There's some fixed cost for bookkeeping in the media driver per stream. If one goes against the recommendation for a small number of streams and instead uses a rather large number, with only a few hot ones at a time, do you see any stability issues with the media driver? I imagine there will be some performance penalty because of locality, and I can measure that, but does anything else come to mind?
Todd L. Montgomery
@tmontgomery
Depends on the numbers... for merging a replay into a live channel, see ReplayMerge, it's a different kind of problem.
but, for the issue of "How many streams is 'small'?"
that depends on the application somewhat.... the largest barrier is, actually, the subscription polling strategy.
for an aggressive policy, just going to each image and polling is going to induce delay and latency once a large number of images is concerned
@mjpt777 and I have a plan for how best to address that and to make the number of streams more a function of just memory usage.... but no one has sponsored that development yet. It essentially is a "select/poll" type interaction instead of a poll per image.
functionally, we've seen some stream counts in the 100s or even 1000s, though
Feras
@feribg
@tmontgomery Thanks a lot for the feedback, clarifies things quite a bit, I will do some benchmarks and see where it nets out. Do you think a dirty approach could be a) minimizing the memory impact per stream as you mentioned and b) a config parameter per stream that maps to a probability of checking it during polling, so busy streams are visited more often than less busy ones and roughly follow the distribution of traffic rather than a uniform iteration. If this sounds reasonable I would be interested to hack on it and get to know the codebase a bit more. The removal of stream # limitation would make it a very viable Kafka alternative out of the box.
Todd L. Montgomery
@tmontgomery
possibly. A more ideal mechanism is to have the driver signal data to process via a dedicated mechanism and data structure.
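The weighted-polling idea floated in this exchange could be sketched roughly as follows. This is only an illustration of the proposal, not how Aeron polls subscriptions: `WeightedPoller`, its per-stream probabilities and the visit counters are all hypothetical, and the point where a stream would actually be polled is marked with a comment.

```java
import java.util.Random;

// Hypothetical duty cycle that visits each stream with a configured probability,
// so busy streams are polled more often than quiet ones.
final class WeightedPoller {
    private final double[] visitProbability;
    private final long[] visits;
    private final Random random;

    WeightedPoller(double[] visitProbability, long seed) {
        this.visitProbability = visitProbability.clone();
        this.visits = new long[visitProbability.length];
        this.random = new Random(seed);
    }

    // One pass over all streams; a stream is visited when the random draw
    // falls below its probability (1.0 means "visit on every pass").
    void doWork() {
        for (int i = 0; i < visitProbability.length; i++) {
            if (random.nextDouble() < visitProbability[i]) {
                visits[i]++; // a real client would poll this stream here
            }
        }
    }

    long visits(int stream) { return visits[stream]; }
}
```

The trade-off is that a message arriving on a rarely visited stream can wait many passes before it is seen, which is one reason a driver-side signal of which streams have data is described above as the more ideal mechanism.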
Deniz Evrenci
@denizevrenci

Hi. I have a case of a deadlock between two nodes due to backpressure from each side.

To elaborate, there are three nodes A, B and C. A generates messages and sends them to B, which receives them with a subscriber and sends a different message to C, mapping one to one with what it receives from A. While it is waiting for the responses from C, B keeps sending more messages to C, which eventually backpressures B’s publisher. C’s publisher is also backpressured because B did not prioritize messages from C over the ones from A. This only happens in a throughput test where A constantly keeps generating messages as fast as it can.

I can think of solving this by having two subscribers with different stream IDs or explicitly polling C's stream at B with a higher priority so that C’s stream gets drained before A’s stream while keeping everything in a single thread. But I am wondering if there is a better solution that I might be missing.

ratcash
@ratcashdev
@denizevrenci very interesting situation. Which messages can't B process in your case? The status messages from C, or the application level confirmation messages from C (if such exist)? Which threading model are you using in your media driver(s)?
Deniz Evrenci
@denizevrenci
B makes requests and C sends replies to each request. The publications at B and C get backpressured. Nodes are on different boxes and communicate through UDP, and each driver uses the dedicated threading model with sender and receiver on a no-op idle strategy and the conductor on a busy-spin idle strategy. Each box has 8 hardware threads, so 3 threads for the driver and 2 threads for the client should be able to run concurrently without starving each other.
ratcash
@ratcashdev
and I am assuming that the replies C is sending are in the same channel/stream as the data from "A"?
Deniz Evrenci
@denizevrenci
Yes, exactly.
ratcash
@ratcashdev
@denizevrenci how about this idea: while B is waiting for C to be available and doing the busy-spin (or sleeping in a loop), it should rather do an explicit poll on C's image and process it? In other words, use the time doing something useful? You can do this, for example, by creating a special IdleStrategy that instead of sleeping (or parking) does a poll using Subscription.imageBySessionId().poll(). However, I think using a separate channel/subscription/thread for C's responses may be a better option, if feasible, as you already suggested.
But honestly, I am not sure if it's possible or even sane to do a poll inside a FragmentHandler.
Martin Thompson
@mjpt777
@denizevrenci Separate the streams/channels if you can to avoid the deadlock. This can be the simplest solution.
Deniz Evrenci
@denizevrenci

Thanks @ratcashdev and @mjpt777.

I suppose we can consider that A acts as a source that fills B and C’s publication buffers, and B’s publication to A as a drain for the same. If B processes A’s messages first, it leads the buffers to fill up and eventually to the backpressure deadlock. If B polls messages from C first, it drains the buffers. I was able to fix the issue via that approach. And I presume it can be generalized to a case where there are A0..An which are sources and C0..Cn which bounce messages from B. As long as B prioritizes C0..Cn before A0..An, the backpressure deadlock will not occur.
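The effect of the polling order can be illustrated with a small single-threaded model. This is a sketch under strong assumptions, not the setup from the thread: bounded in-memory queues stand in for the B-to-C and C-to-B publications, A always has data, and `BackpressureSim` with its `capacity`, `burst` and `ticks` parameters is entirely hypothetical.

```java
import java.util.ArrayDeque;

// Hypothetical model: A is an endless source, B forwards requests to C over a
// bounded queue, and C bounces replies back to B over another bounded queue.
final class BackpressureSim {
    // Returns how many replies B consumed after the given number of ticks.
    static int repliesConsumed(boolean repliesFirst, int capacity, int burst, int ticks) {
        ArrayDeque<Integer> requests = new ArrayDeque<>(); // B -> C
        ArrayDeque<Integer> replies = new ArrayDeque<>();  // C -> B
        int nextId = 0;
        int consumed = 0;

        for (int t = 0; t < ticks; t++) {
            // B: forward up to `burst` messages from A to C.
            boolean backPressured = false;
            for (int i = 0; i < burst; i++) {
                if (repliesFirst) {
                    consumed += drain(replies); // drain C before touching A
                }
                if (requests.size() < capacity) {
                    requests.add(nextId++);
                } else {
                    backPressured = true; // offer failed, retry on the next tick
                    break;
                }
            }
            // Naive B only reaches the replies when it is not stuck retrying an offer.
            if (!repliesFirst && !backPressured) {
                consumed += drain(replies);
            }

            // C: turn one request into one reply if its own publication has room.
            if (!requests.isEmpty() && replies.size() < capacity) {
                replies.add(requests.poll());
            }
        }
        return consumed;
    }

    private static int drain(ArrayDeque<Integer> queue) {
        int count = queue.size();
        queue.clear();
        return count;
    }
}
```

With `repliesFirst` set to `false` both queues fill and the count stops growing however long the model runs; with `true` the count keeps rising, which matches the observation that draining C's stream before A's avoids the stall.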

ratcash
@ratcashdev
Good morning. Is it normal behavior that a cluster client is disconnected by the cluster with the reason TIMEOUT, but the cluster client believes it is connected (even minutes or hours later)?
and keeps sending messages to the cluster happily? Although, obviously, it will never ever receive an application level response, because the ClusterService will never receive those messages.
Todd L. Montgomery
@tmontgomery
@ratcashdev are you polling egress?
and if so, does your EgressListener handle sessionEvent?
ratcash
@ratcashdev
@tmontgomery Yes, I was polling egress (I assume the timeout arose because of starvation), but I wasn't handling the sessionEvent (not much javadoc there :) ). I had assumed that the session would simply be disconnected and I'd eventually get the "NOT_CONNECTED" result when trying to send. Is there a specific EventCode that's related to the disconnection? And just out of interest, why isn't (or can't) the channel simply be disconnected?
Martin Thompson
@mjpt777
@ratcashdev The default handler could be changed to close the cluster client. It is something I'd like to ponder.
ratcash
@ratcashdev
I'd consider that a safer default.
keithwong1
@keithwong1
Hi, I would like to ask if publication.offer is thread safe?
If I use the dedicated threading mode, should Aeron use a single thread to send messages?
Martin Thompson
@mjpt777
@keithwong1 ConcurrentPublication is thread safe; ExclusivePublication is not thread safe. ConcurrentPublication is returned from Aeron.addPublication. ThreadingMode for the driver is an independent feature for configuring thread usage by the driver.
keithwong1
@keithwong1
I only find addPublication and addExclusivePublication
Martin Thompson
@mjpt777
As I said above, Aeron.addPublication returns a ConcurrentPublication.
keithwong1
@keithwong1
Thanks
Sorry... missed this
Martin Thompson
@mjpt777
Going forward, can people please ask specific feature questions on Stack Overflow and leave this channel for discussions.
Francesco Nigro
@franz1981
@mjpt777 @tmontgomery I hope this isn't OT: have you ever collected in a doc or articles your thoughts on the benefits of using a brokerless messaging system vs a brokered one? I mean both perf and arch wise.
Many thanks for both Aeron and Agrona, both awesome projects: just groundbreaking :)
Todd L. Montgomery
@tmontgomery
@franz1981 back in the 29West LBM days, yes, I wrote and contributed to much material on the brokered vs. brokerless models
but have no idea where any of it would have ended up in Informatica now
Francesco Nigro
@franz1981
@tmontgomery thanks for the hint, Todd; will try to search around these keywords to see if I can find something :)
Hristo I Stoyanov
@hrstoyanov
Where can I look at some Aeron cluster samples? I need the "appointed leader" use-case, no fancy elections. Also, my case is pretty close to what I see in the FileReceiver/FileSender samples, where the serialization format is not using SBE. Do I still need to use SBE if I use Aeron cluster, or can I get by with something straightforward like in FileReceiver/FileSender?
Michael Barker
@mikeb01
@hrstoyanov At the moment we're not yet in a position to do community level support for Cluster. We're currently reserving support for contracted customers.