Ivan Topolnjak
@ivantopo
can you please try to narrow it down to the specific interaction where the context is lost? and of course, ask if you need any help!
Rajat Khandelwal
@prongs
appreciate the help :)
Ivan Topolnjak
@ivantopo
if there are any akka streams in the middle that could also be breaking propagation
Rajat Khandelwal
@prongs
akka implements websockets as streams
:(
Ivan Topolnjak
@ivantopo
but that happens before it gets to your code, right?
Rajat Khandelwal
@prongs
Yup
Ivan Topolnjak
@ivantopo
it shouldn't be a problem if you are not using streams directly in your code. Try to focus on all places where there is an async boundary: creating or transforming futures, sending messages to actors, throwing tasks into schedulers
those are the typical places where a context would be lost
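To make the "async boundary" idea concrete, here is a minimal sketch of capturing the current context and restoring it on another thread by hand. It assumes Kamon 2.x on the classpath; the `session-id` key and the `ws-42` value are made up for illustration. With the Kanela agent attached, the first task may already see the context, so treat this only as a way to check where propagation stops.

```scala
import java.util.concurrent.{Callable, Executors}
import kamon.Kamon
import kamon.context.Context

object ManualPropagation {
  // Hypothetical context key, used only for this illustration.
  val SessionId: Context.Key[String] = Context.key[String]("session-id", "none")

  // Returns (value seen without manual capture, value seen with manual capture).
  def demo(): (String, String) = {
    val pool = Executors.newSingleThreadExecutor()
    try {
      Kamon.runWithContext(Context.of(SessionId, "ws-42")) {
        // Capture the context on the submitting thread.
        val captured = Kamon.currentContext()

        // Task that relies on whatever context the pool thread happens to have.
        val withoutCapture = pool.submit(new Callable[String] {
          def call(): String = Kamon.currentContext().get(SessionId)
        }).get()

        // Task that restores the captured context before reading the key.
        val withCapture = pool.submit(new Callable[String] {
          def call(): String =
            Kamon.runWithContext(captured) { Kamon.currentContext().get(SessionId) }
        }).get()

        (withoutCapture, withCapture)
      }
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    val (withoutCapture, withCapture) = demo()
    println(s"without capture: $withoutCapture, with capture: $withCapture")
  }
}
```

If the two values differ, the boundary in between is one that nothing is instrumenting for you.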
Rajat Khandelwal
@prongs

Another weird thing, I have enabled trace id in the logs. I'm able to see logs like this:

[info][2020-06-11_13:34:56.636] [8cb3f9b6f92ceebe|5e401da57b753cea] c.ClassName - Log Msg

Where the format is [%traceID|%spanID].

But when I open the URL http://jaeger-host:16686/trace/8cb3f9b6f92ceebe it shows 404

Ivan Topolnjak
@ivantopo
that probably means that the trace was not sampled
I don't remember whether we have a conversion rule for the sampling decision yet
but that's something that annoys me all the time! seeing trace id in the logs and then it wasn't sampled
we should encourage users to log the sampling decision as well!
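One way to log the sampling decision next to the ids is to build the prefix by hand from the current span. This is only a sketch assuming Kamon 2.x; `TraceLogPrefix` is a hypothetical helper, not something Kamon ships.

```scala
import kamon.Kamon
import kamon.trace.Span

object TraceLogPrefix {
  // Formats "[traceID|spanID|samplingDecision]" for the given span.
  def of(span: Span): String =
    s"[${span.trace.id.string}|${span.id.string}|${span.trace.samplingDecision}]"

  // Same prefix for whatever span is current on the calling thread.
  def current(): String = of(Kamon.currentSpan())
}
```

A log line like `logger.info(s"${TraceLogPrefix.current()} Log Msg")` then shows directly whether it is worth looking for the trace in Jaeger.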
Rajat Khandelwal
@prongs
alright, sampling is fine, but I do have lot of async boundaries like you mention. Is there any recommended way to deal with that? I already wrapped my first actor forward in a runWithContext, do I need to do this in the whole chain?
And does sampling imply that if I try with just one websocket session, it might not even come up in jaeger? I need to try multiple sessions?
Ivan Topolnjak
@ivantopo
no, the Kamon instrumentation will automatically propagate context across actors and futures
regarding sampling, yes
the first Span of the chain takes a sampling decision and then all related spans just follow the same sampling decision
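While debugging with a single session, it can help to sample every trace so the root span's decision is always "sample". A minimal sketch, assuming Kamon 2.x and that this runs before the session starts; `SampleEverything` is a hypothetical name and this is not something to leave on in production.

```scala
import com.typesafe.config.ConfigFactory
import kamon.Kamon

object SampleEverything {
  // Overrides the sampler on top of the configuration Kamon is currently using.
  def enable(): Unit = {
    val overrides = ConfigFactory.parseString("kamon.trace.sampler = always")
    Kamon.reconfigure(overrides.withFallback(Kamon.config()))
  }
}
```

The same `kamon.trace.sampler` setting can also go into `application.conf` instead of being applied at runtime.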
Rajat Khandelwal
@prongs

Ah. Then logs are a better indicator than jaeger UI, and I might not have an updated picture of what's working and what's not.

So I'll put logging in the whole chain, that way I'll know if/when/where the trace id gets dropped.

Ivan Topolnjak
@ivantopo
yeap
that's the way to go
Rajat Khandelwal
@prongs

Thanks @ivantopo, I'm able to verify trace propagation in logs across async boundaries. I'm no longer seeing the small individual traces (the ones for outbound http calls or db calls) in jaeger, as they are now part of another parent -- the WebSocket one. I'm not seeing the WebSocket trace in jaeger either, but that's because of sampling.

This is what I did in a nutshell (so as to help others):

  • Created a context in the WebSocket actor class -- instance level
  • Created a span out of this context to represent the whole WebSocket session
  • In message handling, created a child span from the session-level span, put proper tags in it and forwarded messages to worker actors.
  • After the worker actor, context propagation works out of the box.
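The steps above can be sketched roughly as follows. This assumes Kamon 2.x; the operation names, tag keys and the `WebSocketSpans` helper are made up for illustration and are not taken from the original code.

```scala
import kamon.Kamon
import kamon.trace.Span

object WebSocketSpans {
  // One span for the whole WebSocket session; keep it in an actor field
  // and call finish() on it when the socket closes.
  def startSession(sessionId: String): Span =
    Kamon.spanBuilder("websocket.session")
      .tag("session.id", sessionId)
      .start()

  // One child span per incoming message. `forward` is whatever hands the
  // message to a worker actor, e.g. `worker ! msg`.
  def handleMessage[T](session: Span, messageType: String)(forward: => T): T = {
    val child = Kamon.spanBuilder("websocket.message")
      .asChildOf(session)
      .tag("message.type", messageType)
      .start()

    // Makes `child` the current span while forwarding, then finishes it.
    Kamon.runWithSpan(child, finishSpan = true)(forward)
  }
}
```

Note that the child span is finished as soon as `forward` returns, so it only covers the hand-off and not the worker's processing time.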
Nihat Hosgur
@nhosgur
Hi @ivantopo, I wish to use a Kamon reporter with New Relic. You guys used to have New Relic documentation for 0.6.x, yet I don't see a reporter for 2.x
Yaroslav Derman
@yarosman
Hello. Has anyone had a problem with Kamon, Docker, Alpine, an ash script and the Play integration, where the metrics contain the wrong gc and generation, e.g.
jvm_gc_seconds_bucket{collector="scavenge",le="+Inf",component="jvm",generation="unknown"} when using G1?
Alexis Hernandez
@AlexITC
@prongs would you share an example? How do you create the child span? Possibly asChildOf is the only way; I was looking for a way to create a span from the parent
Alexis Hernandez
@AlexITC
@ivantopo I understand I should be able to get a view like this one from Zipkin on the APM dashboard, but I have no idea how, can you give me some insights please? https://zipkin.io/public/img/web-screenshot.png
abhihub
@abhihub
Is it possible to be assigned to an issue @ivantopo ? This is an issue that NR can fix : kamon-io/Kamon#789. So wanted to be assigned to it so I can track and prioritize it.
Ivan Topolnjak
@ivantopo
@abhihub it seems like we will need to invite you guys to the team and then we can assign
Rajat Khandelwal
@prongs
@ivantopo in my websocket use-case, the websocket actor sends one message to a worker actor and the worker actor then sends multiple replies to the WebSocket actor. I'm seeing that context is intact until the first reply, it breaks in the subsequent replies
Rajat Khandelwal
@prongs

I think it's fine in 1:1 request-response cases, but when 1 request has multiple replies -- like a stream -- then you need to resort to context propagation.

Nevertheless, for me, for now, it's fine even without that. I'm creating new spans for new requests from UI, from a parent span. The child spans might close too fast -- giving incorrect information, but the parent span is there for the whole life of the socket.

Rajat Khandelwal
@prongs
Actually, found the case where context propagation breaks: my actor sends periodic tracking messages to itself. This is where it ends up breaking, context propagation doesn't happen, e.g.
context.system.scheduler.scheduleOnce(trackingInterval, self, Track)
Ivan Topolnjak
@ivantopo
oh man
I had this conversation before
I know
Ivan Topolnjak
@ivantopo
I don't know where I wrote it down, but I remember having this conversation several times and realizing that we have to instrument all the scheduleOnce calls to keep the same context
Rajat Khandelwal
@prongs
any band-aid fix I can do? Other than have the "websocketcontext" propagate down everywhere? Problem is, it's not quite readable when some message handlers will be Kamon.runWithContext, and some won't.
Rajat Khandelwal
@prongs
ended up doing this
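  import akka.actor.{Actor, ActorRef, Cancellable}
  import kamon.Kamon
  import scala.concurrent.ExecutionContext
  import scala.concurrent.duration.FiniteDuration

  // Adds a scheduleOnce variant that keeps the Kamon context of the caller.
  implicit class KamonScheduler(scheduler: akka.actor.Scheduler) {
    final def scheduleOnceWithKamon(delay: FiniteDuration, receiver: ActorRef, message: Any)(
      implicit
      executor: ExecutionContext,
      sender:   ActorRef = Actor.noSender
    ): Cancellable = {
      // Capture the context at scheduling time, on the caller's thread.
      val ctx = Kamon.currentContext()
      scheduler.scheduleOnce(delay, new Runnable {
        // Restore the captured context on the scheduler thread before sending.
        override def run(): Unit = Kamon.runWithContext(ctx) { receiver ! message }
      })
    }
  }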
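For anyone wanting to try the same band-aid, here is a usage sketch of that extension method inside an actor that sends itself periodic `Track` messages. It assumes classic Akka actors and Kamon 2.x; `KamonSchedulerSyntax` and `TrackingActor` are hypothetical names, and the wrapper is repeated only so the example compiles on its own.

```scala
import akka.actor.{Actor, ActorRef, Cancellable, Scheduler}
import kamon.Kamon
import scala.concurrent.ExecutionContext
import scala.concurrent.duration.FiniteDuration

object KamonSchedulerSyntax {
  implicit class KamonScheduler(scheduler: Scheduler) {
    def scheduleOnceWithKamon(delay: FiniteDuration, receiver: ActorRef, message: Any)(
      implicit executor: ExecutionContext, sender: ActorRef = Actor.noSender
    ): Cancellable = {
      // Capture at scheduling time, restore when the timer fires.
      val ctx = Kamon.currentContext()
      scheduler.scheduleOnce(delay, new Runnable {
        override def run(): Unit = Kamon.runWithContext(ctx) { receiver ! message }
      })
    }
  }
}

case object Track

class TrackingActor(trackingInterval: FiniteDuration) extends Actor {
  import KamonSchedulerSyntax._
  import context.dispatcher // ExecutionContext used by the scheduler

  override def preStart(): Unit = scheduleNext()

  def receive: Receive = {
    case Track =>
      // With the Akka instrumentation active, the context captured when the
      // message was scheduled should be the current one here.
      scheduleNext()
  }

  // Drop-in replacement for scheduler.scheduleOnce(trackingInterval, self, Track).
  private def scheduleNext(): Cancellable =
    context.system.scheduler.scheduleOnceWithKamon(trackingInterval, self, Track)
}
```

The call site stays a one-liner, so handlers don't need to be wrapped in Kamon.runWithContext individually.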
Franco Albornoz
@dannashirn
Hey everybody, I'm trying to migrate an existing library that uses kamon 1.x to 2.x and am almost done, but I'm struggling with migrating some custom http clients which used the Kamon.withContextKey method, which I believe now should be replaced with preStartHooks, but can't seem to find any documentation about how to use those. Could you maybe point me in the right direction there?
Rajat Khandelwal
@prongs
Hey, is there a config to disable kamon for test cases? I don't need to instrument tests.
Rajat Khandelwal
@prongs

image.png

Incorrect "invalid parent span id". Weird behaviour

Rajat Khandelwal
@prongs

Another weird behaviour I see is incorrect ordering of spans in the UI. I do a future call and in the onComplete I send a message to an actor. The jaeger UI is showing them in reverse order. It shows the message passing span was adjusted by (-x seconds), leading to an incorrect order in the UI. And if I add x to that time, it's actually correct -- as in, after the future span completion. The adjustment is what causes the incorrect ordering.

AFAIK, UI has no way of disabling adjustment.

Ivan Topolnjak
@ivantopo
hey @prongs, that issue with Jaeger.. is it just about the order in which the Spans are shown in the UI or the parent-child relationships are wrong?
Alexey Kiselev
@alexeykiselev

Hello, Kamon devs!
I'm trying to understand how Kamon generates host tag. In local application configuration file I see:

kamon {
  enable = yes
  environment.host = "1"

In base application configuration file:

kamon {
  # Set to "yes", if you want to report metrics
  enable = no

  environment {
    service = "xxx"

So, in InfluxDB for Histogram metrics made by Kamon I see:

host =1
instance=xxx@1
service=xxx

But if an org.influxdb.Point is written directly using the org.influxdb.InfluxDB driver, the metric also contains a host tag that equals the hostname of the machine. By the way, the application also reports JVM metrics using kamon-system-metrics. Is it possible that Kamon adds its host tag to a metric it did not produce?

Arjun Karnwal
@arjunkarnwal

Hello, Kamon devs

I am having some issues using tapir with the akka-http backend together with kamon. I observe a problem with resolving operation names in the span metrics, and wonder if there's some workaround? cc @matwojcik I see you have faced this issue before. Did you find a solution for it?

Ivan Topolnjak
@ivantopo
hey @alexeykiselev, what Kamon versions are you using there? some of those names sound like from previous Kamon versions! Regarding the host tag, ideally you will leave kamon.environment.host set to auto and only change it if you really need to. For example, in our own deployments we use an environment variable with the actual name of the host since we are running everything in containers. There are a few settings in the InfluxDB reporter to decide whether you want to set the host tag or not
@arjunkarnwal what exactly is the problem you are seeing?
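Regarding the host setting mentioned above, a minimal sketch of "auto by default, environment variable when present" could look like this. It assumes the Typesafe Config library; `NODE_NAME` is a hypothetical environment variable and `HostTagConfig` a hypothetical helper.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object HostTagConfig {
  // "auto" lets Kamon pick the host name itself; the optional substitution
  // only takes effect when NODE_NAME (hypothetical) is set in the environment.
  val overrides: Config = ConfigFactory.parseString(
    """
      |kamon.environment.host = "auto"
      |kamon.environment.host = ${?NODE_NAME}
      |""".stripMargin)

  // Layers the overrides on top of the regular application configuration.
  def resolved(): Config =
    overrides.withFallback(ConfigFactory.load()).resolve()
}
```

The result could then be passed to `Kamon.init(HostTagConfig.resolved())`, or the same two lines could simply live in `application.conf`.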
Arjun Karnwal
@arjunkarnwal
@ivantopo when I use tapir with the akka-http backend together with kamon, I observe span metrics like span_processing_time_seconds_count{operation="http.server.request"}, whereas if I don't use tapir I get span_processing_time_seconds_count{operation="api/v1/mycustomerAPIPath"}. I don't know why, but when I use tapir the operation attribute gets overridden, i.e. instead of having the api path, it contains "http.server.request".
Alexey Kiselev
@alexeykiselev

hey @alexeykiselev, what Kamon versions are you using there?

It's 2.1.0, but configuration may be from older versions.
We set host to internal node ID to show Kamon histograms in Grafana the same as we show metrics created directly. In latter case we just add tag node with the same ID.
I wonder, is it possible that in a project where both Kamon and kamon-system-metrics are used, direct calls to org.influxdb.InfluxDB somehow get polluted with the tags that Kamon sets?

Red Benabas
@red-benabas
Hi Kamon devs! We're upgrading our SBT native packager project from Kamon 1.X to Kamon 2.X, following the steps in the guide https://kamon.io/docs/latest/guides/migration/from-1.x-to-2.0/. We've added the Kanela plugin as well as the Kanela agent. When I point the project to a zipkin server running on localhost we can see the traces. However, we can't see them on a remote zipkin server. Has anyone come across this?
Also, is there a way to enable DEBUG level logging in kamon.zipkin?