Ivan Topolnjak
@ivantopo
so, for example, one incoming message in the socket generates a Span and a related JDBC/HTTP call also generates a Span, but those Spans are not tied to the same trace?
Rajat Khandelwal
@prongs
yup
Ivan Topolnjak
@ivantopo
do you have any Cats or Monix in between the websocket and the JDBC/HTTP calls?
Rajat Khandelwal
@prongs
I think context is not being propagated somehow with both Kamon.runWithContext and Kamon.runWithSpan
no cats or monix
(akka + play + slick)
Ivan Topolnjak
@ivantopo
ok.. what versions are you using?
of Play, Slick and Kamon
Rajat Khandelwal
@prongs
  val akkaVersion           = "2.6.5"
  val playVersion           = "2.6.13"
  val playSlickVersion      = "3.0.3"
  val kamonVersion          = "2.1.1"
Ivan Topolnjak
@ivantopo
and also using the SBT plugin for dev mode?
Rajat Khandelwal
@prongs
addSbtPlugin("io.kamon" % "sbt-kanela-runner-play-2.6" % "2.0.6")
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.9.2")
resolvers += Resolver.bintrayIvyRepo("kamon-io", "sbt-plugins")
addSbtPlugin("io.kamon" % "sbt-aspectj-runner-play-2.6" % "1.1.2")
but I'm not running in dev mode as of now. It's packaged as docker image and running in k8s.
Ivan Topolnjak
@ivantopo
ok ok
are you able to access the status page on that container and see what's the status of the instrumentation modules?
Rajat Khandelwal
@prongs
Actually I was targeting 3 of our services with Kamon. 2 of them are working great. The 3rd one has WebSockets. In that one HTTP also works fine; only WebSocket is problematic. I've been debugging that for 2 days now and tried creating a context in different places (mostly wrapping the message passing to worker actors), but it looks like it gets lost somewhere.
curl for instrumentation module gives this json
{
  "present": true,
  "modules": {
    "annotation": {
      "name": "Annotation Instrumentation",
      "description": "Provides a set of annotations to create Spans and Metrics out of annotated methods",
      "enabled": true,
      "active": false
    },
    "akka-http": {
      "name": "Akka HTTP Instrumentation",
      "description": "Provides context propagation, distributed tracing and HTTP client and server metrics for Akka HTTP",
      "enabled": true,
      "active": true
    },
    "executor-service": {
      "name": "Executor Service Instrumentation",
      "description": "Provides automatic Context propagation to all non-JDK Runnable and Callable implementations which enables\n         Context propagation on serveral situations, including Scala, Twitter and Scalaz Futures",
      "enabled": true,
      "active": true
    },
    "play-framework": {
      "name": "Play Framework Instrumentation",
      "description": "Provides context propagation, distributed tracing and HTTP client and server metrics for Play Framework",
      "enabled": true,
      "active": true
    },
    "mongo-driver": {
      "name": "Mongo Driver Instrumentation",
      "description": "Provides automatic tracing of client operations on the official Mongo driver",
      "enabled": true,
      "active": false
    },
    "akka-remote": {
      "name": "Akka Remote Instrumentation",
      "description": "Provides distributed Context propagation and Cluster Metrics for Akka",
      "enabled": true,
      "active": true
    },
    "jdbc": {
      "name": "JDBC Instrumentation",
      "description": "Provides instrumentation for JDBC statements, Slick AsyncExecutor and the Hikari connection pool",
      "enabled": true,
      "active": false
    },
    "scala-future": {
      "name": "Scala Future Intrumentation",
      "description": "Provides automatic context propagation to the thread executing a Scala Future's body and callbacks",
      "enabled": true,
      "active": true
    },
    "akka": {
      "name": "Akka Instrumentation",
      "description": "Provides metrics and message tracing for Akka Actor Systems, Actors, Routers and Dispatchers",
      "enabled": true,
      "active": true
    },
    "logback": {
      "name": "Logback Instrumentation",
      "description": "Provides context propagation to the MDC and on AsyncAppenders",
      "enabled": true,
      "active": true
    }
  },
  "errors": {}
}
Ivan Topolnjak
@ivantopo
when it gets to this point and you need to start debugging it gets a bit annoying but relatively easy: println everywhere! :joy:
I would start logging the current trace id everywhere and see where it gets lost
if it is on a future, actor or something we already support then maybe there is an issue with the agent or initialization... if it is on something else that we don't support at the moment then new (or manual) instrumentation will be necessary
can you please try to narrow it down to the specific interaction where the context is lost? and of course, ask if you need any help!
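A minimal sketch of that kind of logging, assuming Kamon 2.x on the classpath. `TraceDebug` and its method names are hypothetical; only `Kamon.currentSpan()` and the `trace.id` / `id` accessors come from Kamon itself.

```scala
import kamon.Kamon

// Hypothetical helper for "println everywhere" debugging.
object TraceDebug {

  // Describes the span that is current on the calling thread.
  // An empty trace id means no context reached this point.
  def describe(): String = {
    val span = Kamon.currentSpan()
    s"trace=${span.trace.id.string} span=${span.id.string}"
  }

  // Prints the current trace and span ids with a label for the call site.
  def log(where: String): Unit =
    println(s"[$where] ${describe()}")
}
```

Calling something like `TraceDebug.log("before worker ! msg")` on both sides of each async boundary should show the first place where the trace id changes or comes back empty.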
Rajat Khandelwal
@prongs
appreciate the help :)
Ivan Topolnjak
@ivantopo
if there are any akka streams in the middle that could also be breaking propagation
Rajat Khandelwal
@prongs
akka implements websockets as streams
:(
Ivan Topolnjak
@ivantopo
but that happens before it gets to your code, right?
Rajat Khandelwal
@prongs
Yup
Ivan Topolnjak
@ivantopo
it shouldn't be a problem if you are not using streams directly in your code. Try to focus on all places where there is an async boundary: creating or transforming futures, sending messages to actors, throwing tasks into schedulers
those are the typical places where a context would be lost
Rajat Khandelwal
@prongs

Another weird thing, I have enabled trace id in the logs. I'm able to see logs like this:

[info][2020-06-11_13:34:56.636] [8cb3f9b6f92ceebe|5e401da57b753cea] c.ClassName - Log Msg

Where the format is [%traceID|%spanID].

But when I open the URL http://jaeger-host:16686/trace/8cb3f9b6f92ceebe it shows 404

Ivan Topolnjak
@ivantopo
that probably means that the trace was not sampled
I don't remember whether we have a conversion rule for the sampling decision yet
but that's something that annoys me all the time! seeing trace id in the logs and then it wasn't sampled
we should encourage users to log the sampling decision as well!
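A minimal sketch of reading the sampling decision next to the trace id, assuming Kamon 2.x. `SamplingDebug` is a hypothetical name; the decision is read from the current span's trace.

```scala
import kamon.Kamon
import kamon.trace.Trace.SamplingDecision

// Hypothetical helper that exposes the sampling decision of the current trace.
object SamplingDebug {

  // True only when the current trace was selected for reporting.
  def currentTraceIsSampled(): Boolean =
    Kamon.currentSpan().trace.samplingDecision == SamplingDecision.Sample

  // Trace id plus sampling decision, suitable for a log line.
  def describe(): String = {
    val trace = Kamon.currentSpan().trace
    s"trace=${trace.id.string} sampling=${trace.samplingDecision}"
  }
}
```

If the decision logged for a trace id is not `Sample`, a 404 for that trace in Jaeger is expected rather than a sign of lost context.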
Rajat Khandelwal
@prongs
alright, sampling is fine, but I do have a lot of async boundaries like you mention. Is there any recommended way to deal with that? I already wrapped my first actor forward in a runWithContext, do I need to do this in the whole chain?
And does sampling imply that if I try with just one websocket session, it might not even come up in jaeger? I need to try multiple sessions?
Ivan Topolnjak
@ivantopo
no, the Kamon instrumentation will automatically propagate context across actors and futures
regarding sampling, yes
the first Span of the chain makes a sampling decision and then all related spans just follow the same sampling decision
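While debugging with a single WebSocket session, one option is to sample every trace so that sampling cannot hide it. A minimal sketch, assuming Kamon 2.x and that the `kamon.trace.sampler` setting accepts the value `always`; `SampleEverything` is a hypothetical name, and the same setting could instead go in `application.conf`.

```scala
import com.typesafe.config.ConfigFactory
import kamon.Kamon

// Hypothetical debugging switch: sample every trace.
object SampleEverything {

  def enable(): Unit = {
    // Override only the sampler and keep the rest of the current configuration.
    val overrides = ConfigFactory.parseString("kamon.trace.sampler = always")
    Kamon.reconfigure(overrides.withFallback(Kamon.config()))
  }
}
```

This is meant for debugging only; sampling everything in production can produce a lot of spans.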
Rajat Khandelwal
@prongs

Ah. Then logs are a better indicator than jaeger UI, and I might not have an updated picture of what's working and what's not.

So I'll put logging in the whole chain, that way I'll know if/when/where the trace id gets dropped.

Ivan Topolnjak
@ivantopo
yeap
that's the way to go
Rajat Khandelwal
@prongs

Thanks @ivantopo, I'm able to verify trace propagation in logs across async boundaries. I'm no longer seeing the small individual traces (the ones for outbound HTTP calls or DB calls) in Jaeger, as they are now part of another parent -- the WebSocket one. I'm not seeing the WebSocket trace in Jaeger either, but that's because of sampling.

This is what I did in a nutshell (so as to help others):

  • Created a context in the WebSocket actor class -- instance level
  • Created a span out of this context to represent the whole WebSocket session
  • In message handling, created a child span from the session-level span, put proper tags on it and forwarded messages to worker actors.
  • After the worker actor, context propagation works out of the box.
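
The steps above could be sketched roughly as follows, assuming classic Akka actors and Kamon 2.x. The actor, the span names and the tag are hypothetical, not the original code; the child span is created with `asChildOf` on the span builder.

```scala
import akka.actor.{Actor, ActorRef}
import kamon.Kamon
import kamon.trace.Span

// Hypothetical actor backing one WebSocket connection.
class WebSocketSessionActor(worker: ActorRef) extends Actor {

  // One span for the whole lifetime of the socket.
  private val sessionSpan: Span =
    Kamon.spanBuilder("websocket.session").start()

  def receive: Receive = {
    case text: String =>
      // One child span per incoming message, tagged as needed.
      val messageSpan = Kamon.spanBuilder("websocket.message")
        .asChildOf(sessionSpan)
        .tag("message.length", text.length.toLong)
        .start()

      // The span is current while the message is sent, so the Akka
      // instrumentation can carry it to the worker. With the default
      // finishSpan = true it is finished as soon as the block returns.
      Kamon.runWithSpan(messageSpan) {
        worker ! text
      }
  }

  // Close the session-level span when the socket actor stops.
  override def postStop(): Unit = sessionSpan.finish()
}
```

Because the child span finishes right after the message is handed off, its duration covers the send and not the work done by the worker.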
Nihat Hosgur
@nhosgur
Hi @ivantopo, I'd like to use a Kamon reporter with New Relic. You used to have New Relic documentation for 0.6.x, but I don't see a reporter for 2.x.
Yaroslav Derman
@yarosman
Hello. Has anyone had a problem with Kamon (Docker, Alpine, ash script, Play integration) where the metrics contain the wrong GC collector and generation? With G1 in use I get:
jvm_gc_seconds_bucket{collector="scavenge",le="+Inf",component="jvm",generation="unknown"}
Alexis Hernandez
@AlexITC
@prongs would you share an example? How do you create the child span? Possibly asChildOf is the only way; I was looking for a way to create a span from the parent.
Alexis Hernandez
@AlexITC
@ivantopo I understand I should be able to get a view like this one from Zipkin on the APM dashboard, but I have no idea how. Can you give me some insights please? https://zipkin.io/public/img/web-screenshot.png
abhihub
@abhihub
Is it possible to be assigned to an issue, @ivantopo? This is an issue that NR can fix: kamon-io/Kamon#789. I'd like to be assigned to it so I can track and prioritize it.
Ivan Topolnjak
@ivantopo
@abhihub it seems like we will need to invite you guys to the team and then we can assign
Rajat Khandelwal
@prongs
@ivantopo in my WebSocket use case, the WebSocket actor sends one message to a worker actor and the worker actor then sends multiple replies to the WebSocket actor. I'm seeing that the context is intact until the first reply, but it breaks in the subsequent replies
Rajat Khandelwal
@prongs

I think it's fine in 1:1 request-response cases, but when 1 request has multiple replies -- like a stream -- then you need to resort to manual context propagation.

Nevertheless, for me, for now, it's fine even without that. I'm creating new spans for new requests from the UI, from a parent span. The child spans might close too fast -- giving incorrect information -- but the parent span is there for the whole life of the socket.
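
For the one-request, many-replies case, carrying the context by hand could look roughly like this. It is a sketch assuming classic Akka actors and Kamon 2.x; `ReplyChannel` is a hypothetical helper, not something Kamon provides.

```scala
import akka.actor.ActorRef
import kamon.Kamon
import kamon.context.Context

// Hypothetical helper: create it while handling the request, so that
// Kamon.currentContext() is still the context that arrived with it.
final class ReplyChannel(replyTo: ActorRef, from: ActorRef) {

  // Captured once, at construction time.
  private val requestContext: Context = Kamon.currentContext()

  // Every reply, including late ones, is sent under the request's context.
  def send(reply: Any): Unit =
    Kamon.runWithContext(requestContext) {
      replyTo.tell(reply, from)
    }
}
```

Inside the worker's `receive`, something like `val replies = new ReplyChannel(sender(), self)` would be created when the request arrives, and each later reply would go through `replies.send(...)`.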