    janxyz
    @janxyz:matrix.org
    [m]
    Thanks! If I understand that correctly, it's a function on the pipeline, right? So all my histograms have the same buckets?
    Dawid Nowak
    @dawid-nowak
    opentelemetry_otlp::new_metrics_pipeline(tokio::spawn, delayed_interval)
        .with_export_config(export_config)
        .with_period(std::time::Duration::from_secs(open_telemetry.metric_window.unwrap_or(30)))
        .with_aggregator_selector(selectors::simple::Selector::Histogram(vec![0.0, 0.1, 0.2, 0.3, 0.5, 0.8, 1.3, 2.1]))
        .build()?,
    There is a PR open so you can customize aggregators, and the buckets they use, based on the metric descriptor name: open-telemetry/opentelemetry-rust#497
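
To make the bucket list above concrete, here is a minimal sketch, independent of the opentelemetry crates, of how one fixed list of boundaries turns recorded values into per-bucket counts. The `bucket_counts` helper is hypothetical, and the convention that a value lands in the first bucket whose upper boundary is greater than or equal to it is an assumption; the SDK's own aggregator may treat boundary values differently.

```rust
/// Count how many values fall into each bucket defined by sorted upper
/// boundaries. Returns `boundaries.len() + 1` counts; the last entry is the
/// overflow bucket for values above the largest boundary.
/// Hypothetical helper, not part of any opentelemetry crate.
fn bucket_counts(boundaries: &[f64], values: &[f64]) -> Vec<u64> {
    let mut counts = vec![0u64; boundaries.len() + 1];
    for &v in values {
        // Index of the first boundary that is >= v (assumed convention).
        let idx = boundaries.partition_point(|&b| b < v);
        counts[idx] += 1;
    }
    counts
}
```

Because the selector in the snippet above takes a single boundary list, every histogram built by that pipeline would be bucketed against the same list, which is the limitation the linked PR aims to lift.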
    Zhongyang Wu
    @TommyCpp
    I think our coverage broke somehow. Haven't seen a coverage report for a while
    Found this error in build log
    ./target/debug/deps/sha2-0dbd11ac79fa6266.gcno:version '408*', prefer 'A93*'
    find: ‘gcov’ terminated by signal 11
    Julian Tescher
    @jtescher
    hm
    sorta odd, we don't specify much for that?
    Zhongyang Wu
    @TommyCpp
    It worked until open-telemetry/opentelemetry-rust#465, but we haven't changed anything in the CI pipeline around that PR
    Dawid Nowak
    @dawid-nowak

    hey all, i am running this example here

    fn init_meter() -> metrics::Result<PushController> {
        let export_config = ExporterConfig {
            endpoint: "http://localhost:4317".to_string(),
            protocol: Protocol::Grpc,
            ..ExporterConfig::default()
        };
        opentelemetry_otlp::new_metrics_pipeline(tokio::spawn, delayed_interval)
            .with_export_config(export_config)
            .with_aggregator_selector(selectors::simple::Selector::Exact)
            .build()
    }

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
        let _started = init_meter()?;
        let meter = global::meter("test2");
        let value_counter = meter.u64_counter("blah9").init();
        let labels = vec![KeyValue::new("key1", "val1")];
        for i in 0..100 {
            let j = i % 4;
            println!("{} {}", i, j);
            let mut labels = vec![]; //labels.clone();
            let kv = match j {
                0 => KeyValue::new("method", "GET"),
                1 => KeyValue::new("method", "POST"),
                2 => KeyValue::new("method", "PUT"),
                3 => KeyValue::new("method", "DELETE"),
                _ => KeyValue::new("method", "HEAD"),
            };
            labels.push(kv);
            // labels.push(KeyValue::new("key4", j.to_string()));

            value_counter.add(1, &labels);
            tokio::time::sleep(Duration::from_secs(1)).await;
        }

        // Wait for 1 minute so that we can see metrics being pushed via OTLP every 10 seconds.
        tokio::time::sleep(Duration::from_secs(60)).await;

        shutdown_tracer_provider();

        Ok(())
    }

    At the end, in Prometheus dashboard I am getting

    agent_blah9{collector="pf", instance="otel-agent:8889", job="otel-collector", method="PUT", type="docker"}   25

    where I would expect

    agent_blah9{collector="pf", instance="otel-agent:8889", job="otel-collector", method="PUT", type="docker"}   25
    agent_blah9{collector="pf", instance="otel-agent:8889", job="otel-collector", method="POST", type="docker"}   25
    agent_blah9{collector="pf", instance="otel-agent:8889", job="otel-collector", method="GET", type="docker"}   25
    agent_blah9{collector="pf", instance="otel-agent:8889", job="otel-collector", method="DELETE", type="docker"}   25

    Any ideas? I had a look at the opentelemetry-agent logs and there seems to be only one type of metric there

    Data point labels:
         -> method: PUT
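
The expectation of four series at 25 each can be checked without any exporter. This is a minimal sketch that replays the loop from the question and keeps one running sum per distinct `method` value; `simulate_counter` is a hypothetical helper and says nothing about how the SDK or the collector aggregate internally.

```rust
use std::collections::BTreeMap;

/// Replays the loop from the question: 100 increments, cycling through four
/// `method` label values. Returns the total per label value.
/// Hypothetical helper, not part of any opentelemetry crate.
fn simulate_counter() -> BTreeMap<&'static str, u64> {
    let mut totals = BTreeMap::new();
    for i in 0..100u64 {
        let method = match i % 4 {
            0 => "GET",
            1 => "POST",
            2 => "PUT",
            _ => "DELETE",
        };
        // Each distinct label set gets its own running sum.
        *totals.entry(method).or_insert(0u64) += 1;
    }
    totals
}
```

If the counter is keyed by label set, each of the four methods ends at 25, so a single `PUT` series in Prometheus would point at something downstream of the `add` calls rather than at the loop itself.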
    Zhongyang Wu
    @TommyCpp
    I tried it and it seems to be the right data
    Dawid Nowak
    @dawid-nowak
    More interested in the general approach. Are you guys using some sophisticated tooling, or peppering the code with printfs?
    Zhongyang Wu
    @TommyCpp
    Clion has a debugger for Rust, it’s not as powerful as the one for Java but it works fine I think
    Dawid Nowak
    @dawid-nowak
    Yeah, it crashed when I was loading the project 🤣
    Zhongyang Wu
    @TommyCpp
    @jtescher Should we consider pushing a new patch version for opentelemetry-otlp to (hopefully) fix the broken doc?
    Julian Tescher
    @jtescher
    @TommyCpp could do a patch release for it
    Zhongyang Wu
    @TommyCpp
    That’s great. Thanks!
    Ido Barkan
    @barkanido
    hey folks. I am writing a service which reports to datadog. should I use opentelemetry-rust? how stable is it for production?
    Jasper
    @jbg
    hi all, i'm trying to build something that depends on the opentelemetry-otlp rust crate, and I get this cryptic error from opentelemetry-otlp's build.rs:
    error: failed to run custom build command for `opentelemetry-otlp v0.5.0`

    Caused by:
      process didn't exit successfully: `/.../target/release/build/opentelemetry-otlp-13bb7928af03e4cb/build-script-build` (exit code: 101)
      --- stderr
      thread 'main' panicked at 'Error generating protobuf: Os { code: 2, kind: NotFound, message: "No such file or directory" }', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/opentelemetry-otlp-0.5.0/build.rs:28:10
    it's happening when opentelemetry-otlp's build.rs calls tonic_build, so i thought my build container might be missing protoc, but it's installed
    i could hack the build.rs to provide a useful error message but it's about 5 dependencies deep so will involve a fair amount of dependency patching. has anyone seen this error before and knows what might be wrong?
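
The `Os { code: 2, kind: NotFound }` in the panic has the same shape as the error `std::process::Command` returns when it cannot find an executable, so one possible (unconfirmed) cause is a missing external tool rather than a missing input file. A minimal sketch of the friendlier check a build script could run first; `check_tool` is a hypothetical helper, and which tools the real build.rs needs is an assumption to verify against its source.

```rust
use std::io::ErrorKind;
use std::process::Command;

/// Try to run `<tool> --version` and turn a spawn failure into a readable
/// message. Hypothetical helper for a build script; the tool names passed in
/// are up to the caller.
fn check_tool(tool: &str) -> Result<(), String> {
    match Command::new(tool).arg("--version").output() {
        // The tool could be spawned; its exit status is not inspected here.
        Ok(_) => Ok(()),
        // This is the case that otherwise surfaces as a bare
        // "No such file or directory".
        Err(e) if e.kind() == ErrorKind::NotFound => Err(format!(
            "`{}` was not found on PATH; install it or fix PATH before building",
            tool
        )),
        Err(e) => Err(format!("failed to run `{}`: {}", tool, e)),
    }
}
```

Calling something like `check_tool("protoc")` inside the build container before the real build would at least separate "tool missing" from "file missing" without patching the dependency chain.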
    Nisheeth Barthwal
    @nbaztec
    Hey folks, of late it seems the otel (jaeger) propagator has stopped working on the tonic gRPC clients, even though the traceId is in the current context (via withContext) and the propagator is set. My other Go service thus cannot join spans correctly. Has anyone experienced this and/or have an idea about it? Thanks!
    Noel Campbell
    @nlcamp
    I'm running into errors trying to adapt the basic-otlp example to have a SumObserver with a callback that calls observe() using a variable instead of a constant. Is there a recommended approach for this? Here's a diff showing my changes to basic-otlp/src/main.rs, which currently don't compile:
    diff --git a/examples/basic-otlp/src/main.rs b/examples/basic-otlp/src/main.rs
    index 51ac886..6a981ac 100644
    --- a/examples/basic-otlp/src/main.rs
    +++ b/examples/basic-otlp/src/main.rs
    @@ -51,6 +51,7 @@ lazy_static::lazy_static! {
         ];
     }
    
    +
     #[tokio::main]
     async fn main() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
         let _ = init_tracer()?;
    @@ -59,7 +60,8 @@ async fn main() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
         let tracer = global::tracer("ex.com/basic");
         let meter = global::meter("ex.com/basic");
    
    -    let one_metric_callback = |res: ObserverResult<f64>| res.observe(1.0, COMMON_LABELS.as_ref());
    +    let mut f64_metric_val: f64 = 1.0;
    +    let one_metric_callback = |res: ObserverResult<f64>| res.observe(f64_metric_val, COMMON_LABELS.as_ref());
         let _ = meter
             .f64_value_observer("ex.com.one", one_metric_callback)
             .with_description("A ValueObserver set to 1.0")
    @@ -101,8 +103,15 @@ async fn main() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
             });
         });
    
    -    // wait for 1 minutes so that we could see metrics being pushed via OTLP every 10 seconds.
    -    tokio::time::sleep(Duration::from_secs(60)).await;
    +    let mut count = 0u32;
    +    loop {
    +        tokio::time::sleep(Duration::from_secs(10)).await;
    +        count += 1;
    +        f64_metric_val += 1.0;
    +        if count == 6 {
    +            break;
    +        }
    +    }
    
         shutdown_tracer_provider();
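
One common way around this kind of compile error is to share the value through an atomic instead of borrowing a local `mut` variable, since the closure in the diff borrows `f64_metric_val` while the loop later mutates it. A minimal std-only sketch follows; `SharedF64` and `register_observer` are hypothetical stand-ins, and the `Fn + Send + Sync + 'static` bound on the callback is an assumption about what the meter requires.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// An `f64` that can be shared between the main task and a callback.
/// Stores the value's bit pattern in an `AtomicU64`.
#[derive(Clone)]
struct SharedF64(Arc<AtomicU64>);

impl SharedF64 {
    fn new(v: f64) -> Self {
        SharedF64(Arc::new(AtomicU64::new(v.to_bits())))
    }
    fn get(&self) -> f64 {
        f64::from_bits(self.0.load(Ordering::Relaxed))
    }
    fn set(&self, v: f64) {
        self.0.store(v.to_bits(), Ordering::Relaxed)
    }
}

/// Stand-in for registering an observer: accepts any callback that is
/// `Fn + Send + Sync + 'static` (assumed bounds) and returns it boxed.
fn register_observer<F>(callback: F) -> Box<dyn Fn() -> f64 + Send + Sync>
where
    F: Fn() -> f64 + Send + Sync + 'static,
{
    Box::new(callback)
}
```

With this shape, the callback owns a clone (`let for_callback = value.clone();` then `move || for_callback.get()`), while the loop calls `value.set(value.get() + 1.0)` on the original handle. In the diff above, the closure body would presumably become `res.observe(for_callback.get(), COMMON_LABELS.as_ref())` with the `move` keyword added.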
    elbaro
    @elbaro

    Hi, I am trying out the tracing api with local jaeger instance,
    only a short lived #[instrument] appears in jaeger ui, and ui shows no unfinished span or event.

    Even this simple example does not work; it only shows 'doing_work' span and no event at all. what is wrong?

    fn main() {
        opentelemetry::global::set_text_map_propagator(opentelemetry_jaeger::Propagator::new());
        let tracer = opentelemetry_jaeger::new_pipeline()
            .install_simple()
            .unwrap();
        use opentelemetry::trace::Tracer;
        tracing::error!("first error event");
        tracer.in_span("doing_work", |cx| {
            tracing::error!("nested error event");
            tracing::info!("nested info event");
            tracing::warn!("nested warn event");
            tracing::debug!("nested debug event");
            tracing::trace!("nested trace event");
        });
        opentelemetry::global::shutdown_tracer_provider();
    }
    Julian Tescher
    @jtescher
    @elbaro you probably want to stick with either the tracing or the otel API to not run into issues when mixing the two in the same app
    then you can switch the in_span calls to https://docs.rs/tracing/0.1.25/tracing/span/struct.Span.html#method.in_scope and you should see both your spans and the logs
    (jaeger doesn't show logs outside of spans, so the "first error event" log won't appear there)
    elbaro
    @elbaro

    Thanks, didn't realize the tokio-tracing and otel APIs could conflict.
    Now I am left with an incomplete span problem. For example, in the code below, jaeger shows async_fn2 (child) with its parent span id, but the parent span itself is missing.

    #[instrument]
    async fn async_fn2() {
        tracing::info!("enter2");
        tokio::time::sleep(std::time::Duration::from_secs(5)).await;
        tracing::info!("exit2");
    }
    
    #[instrument]
    async fn async_fn() {
        tracing::info!("enter");
        async_fn2().await;
        // panic!();  // <- this flushes all spans
        tokio::time::sleep(std::time::Duration::from_secs(1000)).await;
        tracing::info!("exit");
    }
    
    #[tokio::main]
    async fn main() -> Result<()> {
        let tracer = opentelemetry_jaeger::new_pipeline()
            .with_service_name("timer")
            .install_simple()?;
    
        tracing_subscriber::registry()
            .with(tracing_opentelemetry::subscriber().with_tracer(tracer))
            .try_init()?;
    
        async_fn().await;
        Ok(())
    }

    Is there a way to flush long-running spans periodically? Is this a limitation of jaeger, the opentelemetry interface, or the rust implementation?
    It would be great to have a 'growing' span: the span's name and start time is immediately visible in jaeger (batching delay is ok), and its end time and additional log/context is updated over time.

    Julian Tescher
    @jtescher
    it is a limitation of jaeger. opentelemetry has a hook for exporters to be notified when a span starts; however, the jaeger API only supports showing completed spans
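
The difference between the two behaviours can be modelled in a few lines. This is a minimal sketch with a hypothetical `SpanHooks` trait that only mirrors the idea of start and end notifications; it is not the SDK's actual trait, and both implementations are illustrative.

```rust
use std::collections::HashSet;

/// Hypothetical hooks modelled on the start/end notifications mentioned above.
trait SpanHooks {
    fn on_start(&mut self, span_id: u64, name: &str);
    fn on_end(&mut self, span_id: u64, name: &str);
}

/// Exports a span only when it ends, as a completed-span backend would.
#[derive(Default)]
struct CompletedOnly {
    exported: Vec<String>,
}

impl SpanHooks for CompletedOnly {
    fn on_start(&mut self, _span_id: u64, _name: &str) {}
    fn on_end(&mut self, _span_id: u64, name: &str) {
        self.exported.push(name.to_string());
    }
}

/// Also remembers spans that have started but not yet ended.
#[derive(Default)]
struct InFlightAware {
    in_flight: HashSet<u64>,
    exported: Vec<String>,
}

impl SpanHooks for InFlightAware {
    fn on_start(&mut self, span_id: u64, _name: &str) {
        self.in_flight.insert(span_id);
    }
    fn on_end(&mut self, span_id: u64, name: &str) {
        self.in_flight.remove(&span_id);
        self.exported.push(name.to_string());
    }
}

/// Replays the scenario from the question: the child finishes while the
/// parent is still running.
fn replay(hooks: &mut dyn SpanHooks) {
    hooks.on_start(1, "async_fn");
    hooks.on_start(2, "async_fn2");
    hooks.on_end(2, "async_fn2");
}
```

After `replay`, `CompletedOnly` has exported only `async_fn2`, which matches the child appearing without its parent, while `InFlightAware` still knows that span 1 is open and could report it to a backend that accepts unfinished spans.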
    janxyz
    @janxyz:matrix.org
    [m]
    What would be a backend to support this? That sounds interesting!
    Julian Tescher
    @jtescher
    None that I'm aware of, you could make one and open source it though 🙂
    Nuutti Kotivuori
    @nakedible-p
    Hi! I'm trying to use a tracer to simply trace an async function. It seems that the helper for that is with_context from FutureExt. However, it seems horribly difficult to use? I mean, I need to create a tracer, and a span, and then set that span in the current context and then pass that context on to the with_context helper - and it still doesn't seem to automatically set the span status based on the response, or to record any errors in the span from the future.
    Nuutti Kotivuori
    @nakedible-p
    Am I missing something, or is this sort of half-way baked still?
    janxyz
    @janxyz:matrix.org
    [m]
    You can use with_current_context() to simply attach the currently active context. https://docs.rs/opentelemetry/0.13.0/opentelemetry/trace/trait.FutureExt.html#method.with_current_context
    Nuutti Kotivuori
    @nakedible-p
    Should the example be updated to reflect that? https://github.com/open-telemetry/opentelemetry-rust/blob/main/examples/async/src/main.rs - I'm still not sure how that makes things easier - and it still doesn't seem to handle setting the error response into the span.
    (afk a bit)
    janxyz
    @janxyz:matrix.org
    [m]
    I understood from your previous post that you find it hard to extract the span context from a running process and forward it to an asynchronous function. Was I wrong with that assumption?
    Nuutti Kotivuori
    @nakedible-p
    Guess what I'm looking for is more what tracing provides in their instrument method - that a span is automatically entered on that async call, and exited when done, and status and exception in the span is set automatically based on the results of that function call. To easily track how long it takes for an async function to complete, and if the result was success or error.
    janxyz
    @janxyz:matrix.org
    [m]
    Something like with_span(span_name, fn) that takes a closure or something similar and traces it?
    Nuutti Kotivuori
    @nakedible-p
    I'd like to be able to write TcpStream::connect(proxy_addr).with_span("connect").await?; inside an async function - and the result would be that there would be a "connect" span which is entered when that call starts, and exited when that call finishes, and if the Result from that call is Err, then the Err is recorded into the span - and if the Result is Ok, then 200 OK is recorded as the status.
    And the span would automatically have as parent the currently active span, as is customary
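
A helper of the kind described above can be sketched with the standard library alone. Everything here is hypothetical: `SpanExt`, `WithSpan`, `SpanRecord` and `SpanLog` are illustrative names, the "span" is just a record pushed to a shared list, and parent/child linking is left out. A real version would start and end an OpenTelemetry span at the marked points instead.

```rust
use std::fmt::Debug;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
use std::time::{Duration, Instant};

/// Hypothetical record of a finished span.
#[derive(Debug, Clone)]
struct SpanRecord {
    name: &'static str,
    error: Option<String>,
    elapsed: Duration,
}

type SpanLog = Arc<Mutex<Vec<SpanRecord>>>;

/// Future wrapper that times the inner future and records its outcome.
struct WithSpan<F> {
    inner: Pin<Box<F>>,
    name: &'static str,
    started: Option<Instant>,
    log: SpanLog,
}

impl<F, T, E> Future for WithSpan<F>
where
    F: Future<Output = Result<T, E>>,
    E: Debug,
{
    type Output = Result<T, E>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `WithSpan` is `Unpin` because the inner future is boxed.
        let this = self.get_mut();
        // First poll: this is where a real span would be started.
        let started = *this.started.get_or_insert_with(Instant::now);
        match this.inner.as_mut().poll(cx) {
            Poll::Pending => Poll::Pending,
            Poll::Ready(result) => {
                // Completion: record the error (if any) and end the span.
                let error = result.as_ref().err().map(|e| format!("{:?}", e));
                this.log.lock().unwrap().push(SpanRecord {
                    name: this.name,
                    error,
                    elapsed: started.elapsed(),
                });
                Poll::Ready(result)
            }
        }
    }
}

/// Extension trait adding `.with_span(name, log)` to futures that yield a `Result`.
trait SpanExt<T, E>: Future<Output = Result<T, E>> + Sized {
    fn with_span(self, name: &'static str, log: SpanLog) -> WithSpan<Self> {
        WithSpan { inner: Box::pin(self), name, started: None, log }
    }
}

impl<F, T, E> SpanExt<T, E> for F where F: Future<Output = Result<T, E>> {}

/// Tiny busy-polling executor so the sketch runs without an async runtime.
/// Only suitable for demos.
fn block_on<F: Future>(fut: F) -> F::Output {
    fn noop_raw_waker() -> RawWaker {
        fn no_op(_: *const ()) {}
        fn clone(_: *const ()) -> RawWaker {
            noop_raw_waker()
        }
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    // Safety: every vtable function is a no-op and ignores the data pointer.
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

With this in scope, a call such as `some_future.with_span("connect", log.clone()).await?` costs one line at the call site. Passing the log explicitly is a simplification; a production helper would more likely pick up the tracer and parent span from the current context.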
    janxyz
    @janxyz:matrix.org
    [m]
    Sounds nice! I don't think that's in the library yet though so you'd need to roll your own 😬 but I might be mistaken
    Nuutti Kotivuori
    @nakedible-p
    That sounds like it should be quite a basic feature... all in all it would feel like opentelemetry-rust has a lot of implementation done on the internal pieces, but the actual consumer side is kinda clunky. For example, the fact that I usually need to create a tracer object, then start a span inside that tracer object, and then set that span active in the context, all on separate lines is awfully spammy. I'm willing to maybe spend 1 line in a function to ensure it is traced, not spend more lines on handling tracing than actual implementation.
    Anyway, will need to think on this and maybe write some helper of my own.
    Nuutti Kotivuori
    @nakedible-p
    Thank you for the help!