Martin Thompson
@mjpt777
@pieceofchum We only answer cluster questions to those on a support contract.
David Raymond
@pieceofchum
ok np
William
@ilove7412369_twitter

I think there is a bug in Aeron IPC.
I run Aeron Archive on the channel, say IPC with stream id 1001.

I publish things in order,
but the subscriber reads message 100057 before 100055, out of order, as shown in the log.

The archive, however, is in totally correct order.
The Aeron version I use is 1.28.2.

I wonder if this is a known issue or not.

Martin Thompson
@mjpt777
@ilove7412369_twitter We are not aware of such an issue. We have tests that assert the correct order. You can use the LogInspector on the log buffers to see contents. Are you certain the logic of your usage is correct? Subscriptions are not concurrent so each thread requires its own instance.
William
@ilove7412369_twitter
Thanks, I will investigate further.
Judd Gaddie
@juddgaddie

I have been looking at the design of the TimerService in Aeron Cluster and it places strict requirements on the ClusteredService to always call scheduleTimer in the same sequence after a replay or snapshot and subsequent replay.
The constraint that a ClusteredService must not do anything non-deterministic (e.g. new Random()) feels reasonable to me.

However, it also requires the ClusteredService to record and restore all of its state in a snapshot in order to call scheduleTimer() reliably. Strictly speaking, the snapshot of a ClusteredService should store its entire state, but sometimes a user may store only a subset of it. The service may still behave correctly, yet the sequence of “schedule” and “cancel” calls to the TimerService may no longer be deterministic, so an attempt to schedule a timer after loading the snapshot may end up as a no-op.

I did see that the Javadoc on Cluster hints at this.

Martin Thompson
@mjpt777
@juddgaddie The timers are recorded in the snapshot managed by the consensus module. The service does not need to worry about that.
Judd Gaddie
@juddgaddie
It needs to worry about it a little bit as the state of the ClusteredService needs to be in sync with the state of the timers in the consensus module (at least wrt the sequence of calls to schedule and cancel and correlationIds used). Otherwise because of expiredTimerCountByCorrelationIdMap in the consensus module some scheduled timers may be ignored.
Martin Thompson
@mjpt777
@juddgaddie You can take this up as a support issue as this is not the place to discuss. I'm not sure you are using things correctly.
Judd Gaddie
@juddgaddie
Fair enough, thanks
feyyaz91
@feyyaz91
Hello there, is there a way to use ReplayMerge when the live publication is on a different media driver from the archived recording? As far as I am aware, at the moment the recording and the publication must share a media driver to merge successfully.
William
@ilove7412369_twitter
Is it illegal to modify the buffer passed to the onFragment callback?
Martin Thompson
@mjpt777
@ilove7412369_twitter The buffer should be considered readonly.
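One way to respect that is to copy the fragment into memory you own before changing anything. A minimal sketch, using a plain `byte[]` as a stand-in for the buffer handed to `onFragment` (the class and method names are made up for illustration and are not part of Aeron):

```java
import java.util.Arrays;

final class FragmentCopySketch
{
    // Same (buffer, offset, length) shape as a fragment callback;
    // a byte[] stands in for the buffer type used by the real callback.
    static byte[] copyFragment(final byte[] buffer, final int offset, final int length)
    {
        // copyOfRange allocates a fresh array, so the caller owns the result.
        return Arrays.copyOfRange(buffer, offset, offset + length);
    }

    public static void main(final String[] args)
    {
        final byte[] shared = {10, 20, 30, 40, 50}; // pretend this is the shared log buffer
        final byte[] mine = copyFragment(shared, 1, 3);

        mine[0] = 99; // mutate only the private copy

        System.out.println("copy:   " + Arrays.toString(mine));
        System.out.println("shared: " + Arrays.toString(shared));
    }
}
```

With the real callback, the equivalent copy would be something like `buffer.getBytes(offset, dst, 0, length)` into a `byte[]` you allocated yourself.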
Ivan Zemlyanskiy
@QIvan
Hi! Happy New Year, guys!
If you recall, I asked some time ago about restarting the archive and a ports issue: with the Java media driver everything worked fine, but with the C media driver it didn't (you said the C media driver is way faster than the Java one and that is the cause of the problem).
Could you help to clarify this scenario: I create an Aeron instance (with Aeron.connect()) and add a subscription for an endpoint, let's say 10.1.1.1:1234. I work with it for some time, then I close the subscription, close the Aeron instance, and repeat everything once again. My question: is there anything I should do before I start everything over, and if so, what?
Thank you in advance.
Todd L. Montgomery
@tmontgomery
@QIvan you will want to wait until the counter associated with the channel has been removed.
Ivan Zemlyanskiy
@QIvan
But if I have closed the Aeron instance, where can I get the CountersReader from?
I mean, I tried to do it with aeron.countersReader(), but I got a segFault =)
Martin Thompson
@mjpt777
@QIvan Same way AeronStat does it.
Ivan Zemlyanskiy
@QIvan
ahhh... you're right, now I feel stupid =)
thank you!
zaradai
@zaradai

Q, Possible bug?
aeron_driver_receiver.c
aeron_driver_receiver_on_add_publication_image

If ensure capacity fails, the error is logged, but the code then continues and increments the array pointer, which will surely fail. Should the capacity check be separated so that the images array update is skipped?

Lorenzo Nicora
@nicusX

Hi everybody! Apologies for the newbie question. I am trying to run some of the samples in AWS, between separate EC2 instances, same subnet.
Ping/Pong works perfectly, but I cannot make BasicPublisher/BasicSubscriber talk to each other, even via simple unicast UDP.

Publisher (10.0.101.101):

java -cp ./aeron-all-1.32.0-SNAPSHOT.jar \
     -Daeron.sample.embeddedMediaDriver=true \
     -Daeron.sample.channel="aeron:udp?endpoint=10.0.101.101:9000" \
     io.aeron.samples.BasicPublisher

Subscriber (10.0.101.201):

java -cp ./aeron-all-1.32.0-SNAPSHOT.jar \
   -Daeron.sample.embeddedMediaDriver=true \
   -Daeron.sample.channel="aeron:udp?endpoint=10.0.101.101:9000" \
   io.aeron.samples.BasicSubscriber

The subscriber always fails to connect:

io.aeron.exceptions.ChannelEndpointException: ERROR - AeronException : ERROR - channel error - Cannot assign requested address (at java.base/sun.nio.ch.Net.bind0(Native Method)): aeron:udp?endpoint=10.0.101.101:9000

Security Groups, NACL and routing seem to be OK. The two instances can ping each other and can send UDP on port 9000 with netcat. I also tried specifying the interface explicitly and nothing changes (the instances have a single network interface anyhow).

I am sure I am missing something stupid.

Todd L. Montgomery
@tmontgomery
The subscriber should use 10.0.101.201. You are using the publisher IP (10.0.101.101). So use 10.0.101.201 for both, if that IP is the subscriber IP.
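The "Cannot assign requested address" error in the question comes from a bind call: the subscriber tries to bind a UDP socket to the endpoint address, and that only works if the address belongs to its own machine. A minimal sketch of that rule using only the JDK's `DatagramSocket`, not Aeron itself (the class and method names are made up for illustration):

```java
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.net.SocketException;

final class EndpointBindSketch
{
    // Channel URI shared by publisher and subscriber: it names the SUBSCRIBER's address.
    static String channelFor(final String subscriberHost, final int port)
    {
        return "aeron:udp?endpoint=" + subscriberHost + ":" + port;
    }

    // Tries to bind a UDP socket to host:port, as a receiver would have to.
    static boolean canBind(final String host, final int port)
    {
        try (DatagramSocket socket = new DatagramSocket(new InetSocketAddress(host, port)))
        {
            return socket.isBound();
        }
        catch (final SocketException ex)
        {
            // Typical cause: the address is not assigned to any local interface.
            return false;
        }
    }

    public static void main(final String[] args)
    {
        // Both sides use the subscriber's IP from the conversation above.
        System.out.println(channelFor("10.0.101.201", 9000));

        // Loopback is local, so a bind to it (on an ephemeral port) normally succeeds.
        System.out.println("bind loopback: " + canBind("127.0.0.1", 0));

        // On a host that does not own this address, the bind is expected to fail.
        System.out.println("bind 10.0.101.101: " + canBind("10.0.101.101", 9000));
    }
}
```

Applied to the commands in the question, this means passing `aeron:udp?endpoint=10.0.101.201:9000` as `aeron.sample.channel` on both instances.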
Lorenzo Nicora
@nicusX
Many thanks @tmontgomery
I must say this was not obvious at all, looking at all the samples and the wiki. Everything uses localhost, which makes it not obvious whether the address is the publisher's or the recipient's.
Todd L. Montgomery
@tmontgomery
just FYI
Lorenzo Nicora
@nicusX
Thanks. I read that.
Re-reading "The socket address to which publications will send messages and from which subscriptions will receive messages" now I understand the subtlety. But it wasn't massively obvious.
(I don't want to criticise. Please just take this as newbie feedback.)
Todd L. Montgomery
@tmontgomery
thanks! If you have any suggestions on where that info should be, feel free to let us know
i.e. where would you have expected it, etc.
Lorenzo Nicora
@nicusX
The endpoint paragraph on the wiki is fine. Clarifying that the "socket address" is that of the subscriber (recipient) machine would have helped.
Martin Thompson
@mjpt777
@nicusX I've just added that.
Lorenzo Nicora
@nicusX
Cool
Martin Thompson
@mjpt777
@zaradai Thanks, I've pushed a fix.
Mendel Monteiro-Beckerman
@MendelMonteiro
Hi, has there ever been any interest in being able to set QoS traffic classes on the sockets created by Aeron?
Todd L. Montgomery
@tmontgomery
@MendelMonteiro nothing that anyone has been willing to sponsor development on. But if interested in sponsoring it, we could look into it.
Mendel Monteiro-Beckerman
@MendelMonteiro
@tmontgomery thanks for the reply. I'll get in touch if sponsorship is an option.
William
@ilove7412369_twitter
I looked at the stats:
sessionId=-1385188173 streamId=1001 channel=aeron:ipc?term-length=268435456 : pub-pos (sampled):251:1694624000 pub-lmt:251:1694623552
Is it that when the publication reaches the pub-lmt, it takes a long time for the archive to create new memory-mapped files and zero them out, causing such a 10 s delay?
Is there any way to avoid it?
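For reading those two counters: a position counts bytes since the start of the stream, so dividing it by the term length gives the number of completed terms and the remainder gives the offset inside the current term. A small worked sketch with the numbers from the stats line above (the class and method names are made up for illustration):

```java
final class PositionSketch
{
    // Number of whole terms already filled before this position.
    static long termCount(final long position, final long termLength)
    {
        return position / termLength;
    }

    // Byte offset of this position inside the current term.
    static long termOffset(final long position, final long termLength)
    {
        return position % termLength;
    }

    public static void main(final String[] args)
    {
        final long termLength = 268_435_456L;   // term-length from the channel URI
        final long pubPos = 1_694_624_000L;     // pub-pos (sampled)
        final long pubLmt = 1_694_623_552L;     // pub-lmt

        System.out.println("completed terms: " + termCount(pubPos, termLength));
        System.out.println("offset in term:  " + termOffset(pubPos, termLength));
        System.out.println("pos minus limit: " + (pubPos - pubLmt));
    }
}
```

With these numbers pub-pos is 448 bytes past pub-lmt and sits roughly 80 MiB into the seventh 256 MiB term, which suggests that at the time of the sample the publication was at its limit rather than at a term boundary.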
William
@ilove7412369_twitter
sorry problem solved
Ivan Zemlyanskiy
@QIvan
Hi! A quick question if you don't mind:
Should taggedReplicate work, in theory, with the C Media Driver?
I'm getting a timeout exception even on my basic tests, while with the Java MD everything is just fine.
Thanks.
Martin Thompson
@mjpt777
This room is not being used to answer Aeron questions. Questions can go to our support channels or to Stack Overflow, which we may eventually get round to.
This room is being reserved for discussions with public contributors.
Martin Thompson
@mjpt777
We are considering shutting down this room. For now we will keep it open but will no longer be giving out free support.
Ivan Zemlyanskiy
@QIvan
oh, cool! glad to read it! It would be awesome if Aeron tag would become more popular on SO
Martin Thompson
@mjpt777
@QIvan Please stop posting here if it is not related to contributions.
Ivan Zemlyanskiy
@QIvan
OK, sorry. Speaking about contributions, a little feedback about reading with Aeron:
why do we need this errorHandler in TermReader https://github.com/real-logic/aeron/blob/master/aeron-client/src/main/java/io/aeron/logbuffer/TermReader.java#L82 ?
I found it very inconvenient, because when I read with my own FH and I throw an exception, I want to use my own try-catch block around a poll, not the global errorHandler which I provided for Aeron and which I expect to handle Aeron-related exceptions.
Martin Thompson
@mjpt777
@QIvan This is really a question and not a contribution. I'll allow grace just this once. Look at the following: https://github.com/real-logic/aeron/blob/master/aeron-client/src/main/java/io/aeron/Aeron.java#L1124
William
@ilove7412369_twitter
Aeron Archive replay seems quite slow when replicating remotely from, say, Tokyo to SG. Is this not the right use for it? Is there a better way to get higher throughput?
Martin Thompson
@mjpt777
@ilove7412369_twitter As previously stated, we will not be offering free consultancy in the room any longer. This room is reserved for public contributor discussions. If you need support we have a commercial offering.