@awssandra It's Matt (MSFT), good to meet you today. Some questions about the X-Ray Spec (by Section):
Workflow
1) Do you consider X-Ray "head-based", "tail-based" sampling, or something in between?
2) Does X-Ray impact throughput of the "entry service" (assuming no memory constraints)?
3) Is there a scenario where sampling at the Client would be useful (before data is sent)?
Sampling Rule
4) Do you have any requests to cap volume by default? Seems like "reservoir" + 5% rate could be expensive at high volume.
5) Does "Service type" only refer to the entry service? What if two difference service types are in a dependency chain?
6) Do you have any usage data on these filters? For instance, is "URL Path" used enough to justify it?
Work Modes
7) How long does it take for sampling rules changes to take effect? Any customer feedback here?
8) Without the X-Ray SDK, is sampling unavailable? Do customers have other sampling options?
9) Can customers using 3rd-party monitoring vendors take advantage of X-Ray?
I've been looking at the sampler for .NET and wanted to propose adding the concept of an Aggregate Sampler (AndSampler, OrSampler). Here is a draft of an OTEP describing it. At a high level, it describes building a sampler that evaluates the results of multiple inner samplers to make a decision; there's a rough sketch just below. I feel it may make things easier in scenarios where multiple considerations/approaches go into making a sampling decision.
https://github.com/jifeingo/oteps/blob/AggregateSamplers/0000-sampler-and.md
Is this the right place to discuss something like this?
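To make the idea concrete, here's roughly what an AndSampler could look like against the OpenTelemetry .NET Sampler API. The class shape below is my own illustration of the concept, not what the OTEP mandates:

```csharp
using OpenTelemetry.Trace;

// Illustrative sketch only: sample a span only if every inner sampler agrees
// (logical AND). An OrSampler would instead sample as soon as any inner
// sampler returns RecordAndSample.
public sealed class AndSampler : Sampler
{
    private readonly Sampler[] samplers;

    public AndSampler(params Sampler[] samplers)
    {
        this.samplers = samplers;
    }

    public override SamplingResult ShouldSample(in SamplingParameters samplingParameters)
    {
        foreach (var sampler in this.samplers)
        {
            var result = sampler.ShouldSample(samplingParameters);
            if (result.Decision != SamplingDecision.RecordAndSample)
            {
                // One "no" vote is enough to drop the span.
                return new SamplingResult(SamplingDecision.Drop);
            }
        }

        return new SamplingResult(SamplingDecision.RecordAndSample);
    }
}
```

Two open questions a sketch like this surfaces: how attributes returned by the inner SamplingResults should be merged, and how RecordOnly decisions should be combined.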
sampling.rate or sampling.rule would make sense to fill that gap. Or maybe there's a reason not to include such information?
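In case a sketch helps the discussion: a sampler can already surface this through the attributes on its SamplingResult. The wrapper below is purely illustrative, assumes the SamplingResult overload that accepts attributes, and uses sampling.rate only because it's the key name floated above, not a defined semantic convention:

```csharp
using System.Collections.Generic;
using OpenTelemetry.Trace;

// Sketch: wrap a ratio sampler and record the rate it applied as an attribute
// on every sampled span, so a backend could weight/upscale counts later.
public sealed class RateReportingSampler : Sampler
{
    private readonly Sampler inner;
    private readonly KeyValuePair<string, object>[] rateAttribute;

    public RateReportingSampler(double ratio)
    {
        this.inner = new TraceIdRatioBasedSampler(ratio);
        this.rateAttribute = new[]
        {
            // "sampling.rate" is the attribute name suggested above, nothing official.
            new KeyValuePair<string, object>("sampling.rate", ratio),
        };
    }

    public override SamplingResult ShouldSample(in SamplingParameters samplingParameters)
    {
        var result = this.inner.ShouldSample(samplingParameters);
        return result.Decision == SamplingDecision.RecordAndSample
            ? new SamplingResult(SamplingDecision.RecordAndSample, this.rateAttribute)
            : result;
    }
}
```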
Hi all, I'm new here, and I'm not sure if this is the right place for my concern or whether it has already been addressed elsewhere. With trace-ID-ratio-based sampling, a trace can break into several parts if there is a sampler with a smaller ratio in the middle. You then know that all spans belong to the same trace, but it is not possible to determine which part is a child of another.
For example, consider a trace consisting of 3 spans, A -> B -> C. The sampling ratio is 1 for A and C, while it is 0.01 for B. This means that only A and C are collected, while B is discarded with 99% probability. Since both will have the same trace ID, we at least know they are part of the same trace. However, the parent span ID of C will be the span ID of B, which is not collected and therefore ultimately unknown. Hence, it is not possible in the general case to conclude that span C is a (grand-)child of A. (In this particular case, we know that C must be a descendant of A, because A is the root span. In the general case, this reasoning is not possible, though.)
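To make the scenario concrete, this is roughly the per-service sampler configuration that produces it, using the stock OpenTelemetry .NET samplers (the three providers would of course live in three different services; the names are made up):

```csharp
using OpenTelemetry;
using OpenTelemetry.Trace;

// Key point: even when B's sampler drops B's span, B still propagates its own
// span id downstream, so the span recorded by C names a parent that is never exported.
using var serviceA = Sdk.CreateTracerProviderBuilder()
    .SetSampler(new AlwaysOnSampler())                  // ratio 1
    .Build();

using var serviceB = Sdk.CreateTracerProviderBuilder()
    .SetSampler(new TraceIdRatioBasedSampler(0.01))     // ratio 0.01
    .Build();

using var serviceC = Sdk.CreateTracerProviderBuilder()
    .SetSampler(new AlwaysOnSampler())                  // ratio 1
    .Build();
```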
A possible solution would be forwarding the parent span ID when a sampler decides not to sample a span. For that example, this means that C would receive the span ID of A instead of B as its parent, and the span data of C would finally report span A as its parent. A further improvement would be to also count the number of ancestor spans that have not been sampled. If this counter is carried by the span context, incremented at every unsampled hop, and finally added to the span data, it would be easy to tell how many generations lie between a span and its reported parent. In other words, we would know, for example, whether a span is a child, a grandchild, or a great-grandchild.
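Sketching what that would mean for the A -> B -> C example, with made-up field names (nothing below is in the spec or any SDK; it only shows the two extra pieces of state that would have to travel with the span context):

```csharp
using System;
using System.Diagnostics;

// Conceptual walkthrough only: carry the nearest sampled ancestor's span id
// plus a count of unsampled generations since that ancestor.
var a = ActivitySpanId.CreateRandom();   // A is sampled and exported

// State propagated downstream as (lastSampledAncestor, skippedGenerations):
var afterA = (lastSampledAncestor: a, skippedGenerations: 0);   // A sampled -> advertise A, reset counter
var afterB = (lastSampledAncestor: afterA.lastSampledAncestor,  // B dropped -> keep A, bump counter
              skippedGenerations: afterA.skippedGenerations + 1);

// C is sampled: it reports the carried ancestor as its parent, and the counter
// says C sits skippedGenerations + 1 = 2 generations below it, i.e. C is a
// grandchild of A rather than a direct child.
Console.WriteLine($"C.parent = {afterB.lastSampledAncestor} (span A), " +
                  $"generations below = {afterB.skippedGenerations + 1}");
```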
As long as spans share the same trace_id (and it's used to compute the hash), they will be consistently sampled (unless there are some collectors with a different hash seed configured at the same tier).
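Here's a toy illustration of why a shared seed gives consistent decisions. The real probabilistic sampler processor uses its own hash function over the trace ID, so treat the SHA-256 stand-in below as showing only the shape of the argument:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

var traceId = "4bf92f3577b34da6a3ce929d0e0e4736";
Console.WriteLine(Keep(traceId, hashSeed: 22, samplingPercentage: 15)); // tier 1
Console.WriteLine(Keep(traceId, hashSeed: 22, samplingPercentage: 15)); // tier 2: same seed, same answer
Console.WriteLine(Keep(traceId, hashSeed: 14, samplingPercentage: 15)); // different seed: may disagree

// A decision derived only from (seed, trace_id): because no other input is
// involved, every collector configured with the same seed and percentage
// reaches the same verdict for every span of a given trace.
static bool Keep(string traceId, uint hashSeed, double samplingPercentage)
{
    using var sha = SHA256.Create();
    var digest = sha.ComputeHash(Encoding.UTF8.GetBytes($"{hashSeed}:{traceId}"));
    var bucket = BitConverter.ToUInt32(digest, 0) % 10_000u;   // 0..9999
    return bucket < samplingPercentage * 100;                  // 15% -> buckets 0..1499
}
```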