Carlo
@entangled90
and the inconsistency is between these two buffers
Dmitry Vyazelenko
@vyazelenko
So you are using io.aeron.Publication#offer(org.agrona.DirectBuffer, int, int, org.agrona.DirectBuffer, int, int)? Do you have a reproducer (test case) that you can share?
Carlo
@entangled90
not yet
I'm trying to upgrade right now
I'll try to see if I can reproduce it
Dmitry Vyazelenko
@vyazelenko
ok
Carlo
@entangled90
another doubt I had was whether I can use ExclusivePublication from different threads (each one with its own ExclusivePublication)
the publication is not shared between threads
Dmitry Vyazelenko
@vyazelenko
ExclusivePublication cannot be shared between threads. If you have multiple threads publishing then you need to use ConcurrentPublication.
Carlo
@entangled90
ah ok! that's probably it then..
Carlo
@entangled90
thanks very much
Carlo
@entangled90
actually that change didn't fix it, as every thread already used a local (per-thread) ExclusivePublication
I entered the media driver machine and I found strange data from AeronStat
NAKs are consistently increasing, +1 every 20 sec or so
It doesn't sound normal for two processes in the same rack
Martin Thompson
@mjpt777
@entangled90 NAKs can happen even over loopback when buffer sizes are not correct given sending rate or congestion.
Carlo
@entangled90
ok thanks
What's the best way to investigate missing "messages" in a stream? I tried the LossReport but the timestamps don't match
Martin Thompson
@mjpt777
@entangled90 Messages will not go missing due to loss. Loss gets recovered. Without spending time understanding your application it is difficult to guess what can be wrong.
Mostly, when people say messages have gone missing and we have investigated, it turned out to be a bug in their app.
Are you registering image unavailable handlers to see if you have connections dropping out and reconnecting?
Carlo
@entangled90
yes, but I don't explicitly reconnect
Martin Thompson
@mjpt777
Aeron will reconnect if the publication and subscription are still active unless rejoin=false is set on the URI for the subscription.
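As a rough illustration of the rejoin=false setting mentioned above, here is a sketch of appending that parameter to a subscription channel URI. The class ChannelUris, its withParam method, and the localhost:40123 endpoint are hypothetical names made up for this example; only the rejoin parameter itself comes from the discussion. Aeron channel URIs put "?" before the first parameter and "|" between later ones, which is all the helper relies on.

```java
public final class ChannelUris {
    private ChannelUris() {
    }

    /**
     * Appends key=value to an Aeron-style channel URI.
     * Hypothetical helper for illustration, not part of the Aeron API.
     */
    public static String withParam(String channel, String key, String value) {
        // The first parameter follows '?', later parameters are separated by '|'.
        char separator = channel.indexOf('?') < 0 ? '?' : '|';
        return channel + separator + key + '=' + value;
    }

    public static void main(String[] args) {
        // The endpoint is an assumption chosen for the example.
        String channel = withParam("aeron:udp?endpoint=localhost:40123", "rejoin", "false");
        System.out.println(channel);
    }
}
```

In real code Aeron's own ChannelUriStringBuilder is probably the better tool; the string helper above only avoids the dependency for the sake of a self-contained sketch.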
Carlo
@entangled90
ok, so in fact it reconnects normally
I don't understand one thing about the LossReport:
I have lines where LAST_OBSERVATION = 11:02:19 & FIRST_OBSERVATION = 13:19:01
Martin Thompson
@mjpt777
Not only should exclusive publications not be shared across threads, subscriptions should never be shared across threads.
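One way to catch accidental sharing early is to wrap a thread-confined object in a guard that remembers which thread created it. The sketch below is a generic, hypothetical helper (ThreadConfined is not an Aeron class) and assumes the object is created on the thread that will use it; a plain String stands in for an ExclusivePublication or Subscription so that the example has no dependencies.

```java
public final class ThreadConfined<T> {
    private final T resource;
    // The thread that constructed the wrapper is treated as the sole owner.
    private final Thread owner = Thread.currentThread();

    public ThreadConfined(T resource) {
        this.resource = resource;
    }

    /** Returns the resource, or throws if called from a thread other than the owner. */
    public T get() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException(
                "accessed from " + Thread.currentThread().getName() +
                " but owned by " + owner.getName());
        }
        return resource;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadConfined<String> confined = new ThreadConfined<>("stand-in for a publication");
        System.out.println("owner thread sees: " + confined.get());

        Thread other = new Thread(() -> {
            try {
                confined.get();
            } catch (IllegalStateException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        });
        other.start();
        other.join();
    }
}
```

The check costs one thread comparison per access, so it may be better suited to tests and debugging builds than to a latency-sensitive hot path.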
Carlo
@entangled90
shouldn't first_observation be before last_observation?
Martin Thompson
@mjpt777
Which media driver?
Carlo
@entangled90

Not only should exclusive publications not be shared across threads, subscriptions should never be shared across threads.

Now I'm using ConcurrentPublication and a single thread with a Subscription

C Media driver
Martin Thompson
@mjpt777
There is a bug in the recording of loss observations in the C media driver. It is writing the last and first observations the wrong way round. I'll fix it.
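Until a fixed driver is deployed, a reader of the report could normalise the two timestamps on its own side. The sketch below assumes both values are milliseconds on the same clock; ObservationWindow and its fields are hypothetical names for this example and are not part of Aeron's LossReport API. The sample values are the two times of day quoted earlier in the conversation (13:19:01 and 11:02:19), expressed as milliseconds since midnight.

```java
public final class ObservationWindow {
    public final long firstMs;
    public final long lastMs;
    public final boolean wasSwapped;

    private ObservationWindow(long firstMs, long lastMs, boolean wasSwapped) {
        this.firstMs = firstMs;
        this.lastMs = lastMs;
        this.wasSwapped = wasSwapped;
    }

    /** Builds a window from the values as reported, putting the earlier one first. */
    public static ObservationWindow of(long reportedFirstMs, long reportedLastMs) {
        boolean swapped = reportedFirstMs > reportedLastMs;
        return new ObservationWindow(
            Math.min(reportedFirstMs, reportedLastMs),
            Math.max(reportedFirstMs, reportedLastMs),
            swapped);
    }

    public long durationMs() {
        return lastMs - firstMs;
    }

    public static void main(String[] args) {
        long reportedFirst = 47_941_000L; // 13:19:01 as ms since midnight
        long reportedLast = 39_739_000L;  // 11:02:19 as ms since midnight
        ObservationWindow window = of(reportedFirst, reportedLast);
        System.out.println("swapped=" + window.wasSwapped + " durationMs=" + window.durationMs());
    }
}
```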
Carlo
@entangled90
ah ok, thanks
btw, is it normal that the loss report is not empty?
Martin Thompson
@mjpt777
It depends on congestion and how you size your buffers. A little loss is common.
Carlo
@entangled90
ok. Is there a way to check whether NAKs sent by one media driver are received by the other?
I suppose the output of AeronStat is enough: NAKs received and Retransmits sent on the other side should mean it is receiving the NAKs from the first media driver
Martin Thompson
@mjpt777
Correct.
Carlo
@entangled90
I still think there is something strange happening. Yesterday I restarted all the services in the afternoon and no loss was reported for the rest of the day. Today I logged in and saw that the application is not receiving some messages. It started at 07:42 on machine 1 (machine 2 reports no losses). The LossReport is filled with rows starting from 07:42, around 280 rows for that stream id.
The publisher publishes around 3k msgs/s of about 700 bytes each
Carlo
@entangled90
is there a "golden rule" for sizing buffers in the media driver given a throughput?
Martin Thompson
@mjpt777
@entangled90 You need to read up on Bandwidth Delay Product and queuing theory.
@entangled90 If you need more detailed help we can provide consulting.
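For readers unfamiliar with the term, the Bandwidth Delay Product is the link bandwidth multiplied by the round-trip time, which gives the number of bytes that can be in flight before any acknowledgement can come back. The sketch below works through that arithmetic using the 3k msgs/s of roughly 700 bytes quoted above; the 10 Gbit/s link speed and 100 microsecond round-trip time are assumptions chosen only for illustration, and the class and method names are hypothetical. It counts payload bytes only, ignoring protocol headers, and it is not an Aeron buffer sizing rule.

```java
public final class BandwidthDelay {
    private BandwidthDelay() {
    }

    /** Payload throughput in bytes per second for a given message rate and size. */
    public static long payloadBytesPerSecond(long messagesPerSecond, long messageBytes) {
        return messagesPerSecond * messageBytes;
    }

    /** Bandwidth Delay Product in bytes: (bits per second / 8) * round-trip time in seconds. */
    public static long bdpBytes(long linkBitsPerSecond, long rttMicros) {
        return linkBitsPerSecond / 8 * rttMicros / 1_000_000;
    }

    public static void main(String[] args) {
        // Figures from the conversation: about 3k msgs/s of about 700 bytes each.
        long payloadRate = payloadBytesPerSecond(3_000, 700);
        System.out.println("payload bytes/s: " + payloadRate);

        // Assumed link: 10 Gbit/s with a 100 microsecond round trip.
        long bdp = bdpBytes(10_000_000_000L, 100);
        System.out.println("BDP bytes: " + bdp);
    }
}
```

With these numbers the application sends about 2.1 MB/s of payload, while the assumed link can hold 125,000 bytes in flight. The average rate alone says little about bursts, which is where the queuing theory part of the advice comes in.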
Carlo
@entangled90
ok I'll take a look
@entangled90 If you need more detailed help we can provide consulting.
atm we are still assessing aeron, I'll let you know in the future
Ghost
@ghost~5fab2f45d73408ce4ff3c0e0
Hello, I had a quick question about how Aeron Cluster nodes behave. This is not a support question; I am just trying to understand how the nodes interact and what impact network latency would have. My question is: if you have nodes in different regions, where one region has much more latency than another, is it expected that the slower region would slow down all nodes? Is there communication happening that could have that effect, possibly in how Raft communicates with each node? Thanks for any help in understanding the performance characteristics of the cluster.
Martin Thompson
@mjpt777
@pieceofchum We only answer cluster questions to those on a support contract.
Ghost
@ghost~5fab2f45d73408ce4ff3c0e0
ok np
William
@ilove7412369_twitter

I think there is a bug in Aeron IPC.
I run Aeron Archive on a channel, say IPC with stream id 1001.

I publish things in order,
but the subscriber read message 100057 before 100055, out of order, as shown in the log.

The archive, however, is in completely correct order.
The Aeron version I use is 1.28.2.

I wonder whether this is a known issue.

Martin Thompson
@mjpt777
@ilove7412369_twitter We are not aware of such an issue. We have tests that assert the correct order. You can use the LogInspector on the log buffers to see contents. Are you certain the logic of your usage is correct? Subscriptions are not concurrent so each thread requires its own instance.
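When checking application logic for this kind of problem, it can help to verify ordering at the point of consumption rather than from log output. The sketch below assumes the application embeds its own monotonically increasing sequence number in each message, as the numbers 100055 and 100057 above suggest. SequenceChecker is a hypothetical helper, not an Aeron class, and it is deliberately not thread-safe: one instance would belong to the single thread that polls a given subscription.

```java
public final class SequenceChecker {
    public enum Result { IN_ORDER, GAP, OUT_OF_ORDER_OR_DUPLICATE }

    private long expected;
    private long missed;
    private long regressions;

    public SequenceChecker(long firstExpected) {
        this.expected = firstExpected;
    }

    /** Classifies the next observed sequence number against the expected one. */
    public Result onSequence(long sequence) {
        if (sequence == expected) {
            expected = sequence + 1;
            return Result.IN_ORDER;
        }
        if (sequence > expected) {
            // Sequence numbers were skipped: count them and move forward.
            missed += sequence - expected;
            expected = sequence + 1;
            return Result.GAP;
        }
        // An older number arrived after a newer one: late, reordered, or duplicated.
        regressions++;
        return Result.OUT_OF_ORDER_OR_DUPLICATE;
    }

    public long missed() {
        return missed;
    }

    public long regressions() {
        return regressions;
    }

    public static void main(String[] args) {
        SequenceChecker checker = new SequenceChecker(100055);
        System.out.println(checker.onSequence(100057)); // newer number seen first
        System.out.println(checker.onSequence(100055)); // older number arrives afterwards
        System.out.println("missed=" + checker.missed() + " regressions=" + checker.regressions());
    }
}
```

If two threads feed one checker, or one subscription, the results become meaningless, which is itself a hint that the consumer side is being shared.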