enhancement: honor span limits when collecting events #93

Status: Draft. mladedav wants to merge 2 commits into base v0.1.x.

Conversation

mladedav (Contributor) commented:

## Motivation

The tracing-opentelemetry library uses SpanBuilder to create spans and stores events in it, but if there are more events than the configured limit, the excess events are kept around, ignored by the Tracer when the span is actually built, and only then discarded.

This is an issue especially for long-running spans that record far more events than the limit, since the excess events effectively just leak memory.

## Solution

I asked on the OpenTelemetry side (see the linked issue) whether this can be achieved without relying on the SDK, but then found out that this dependency already exists and is used for filtering spans.

I've added a method to PreSampledTracer to obtain the limits, with a default implementation, since tracers are not required to support limits at all. It might be better to omit the default implementation and let implementors explicitly declare that they do not support any limits; I'm happy to change it if you think that would be better.

I am not sure this is the correct place for it, but it felt more appropriate than creating a new LimitedTracer trait and also requiring that trait in OpenTelemetryLayer. If this goes through as is, the current trait could later be renamed in a breaking change to better convey its meaning.
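For illustration, the shape of such a trait method might look like the sketch below. The `SpanLimits` struct here is a stand-in for the SDK's limits configuration, and returning an `Option` (with `None` meaning "no limits") is an assumption for the sketch, not necessarily what this PR does.

```rust
// Sketch only: a possible shape for a limits accessor with a default
// implementation. `SpanLimits` is a stand-in struct, not the SDK type.
#[derive(Clone, Copy, Debug)]
pub struct SpanLimits {
    pub max_events_per_span: u32,
    pub max_attributes_per_span: u32,
    pub max_links_per_span: u32,
}

pub trait PreSampledTracer {
    // ...the trait's existing methods are elided here...

    /// The limits this tracer applies when building spans, if any.
    fn span_limits(&self) -> Option<SpanLimits> {
        // Default: the tracer does not advertise any limits, so the layer
        // would not filter anything for it.
        None
    }
}
```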

The limits are then loaded when an event is about to be stored in a SpanBuilder, and if the limit has been reached, the layer discards the event. I am not sure whether there are side effects later in the function (namely in Event::record) that should run regardless of the limits, so the point at which event handling is aborted may need to be moved.
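A minimal sketch of that check, with illustrative names rather than the PR's actual code:

```rust
// Sketch only: decide whether one more event may be stored, given an
// optional per-span event limit. Names here are illustrative.
fn should_store_event(recorded_events: usize, max_events: Option<u32>) -> bool {
    match max_events {
        // The tracer exposes no limit: keep every event.
        None => true,
        // Otherwise keep the event only while under the configured maximum.
        Some(max) => recorded_events < max as usize,
    }
}

fn main() {
    assert!(should_store_event(10, None));
    assert!(should_store_event(127, Some(128)));
    assert!(!should_store_event(128, Some(128))); // at the limit: discard
}
```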

@@ -1022,6 +1022,20 @@ where
let mut otel_data = span.extensions_mut().remove::<OtelData>();
let span_builder = otel_data.as_mut().map(|data| &mut data.builder);

let limits = self.tracer.span_limits();
Collaborator commented on the diff above:

It could be worth looking at the benchmarks to see what the perf impact of this is. Could this alternatively be loaded when the layer is initialized, or otherwise lazily loaded just once?

mladedav (author) replied:

Sure, I'll try to run the benchmarks and get back to you hopefully tomorrow.

> Could this alternatively be loaded when the layer is initialized, or otherwise lazily loaded just once?

Right now, probably. I'm not sure about the long term though, because the limits may be changed through a TracerProvider, and all Tracers should honor that change even if they were created earlier. Currently the limits cannot be changed after a provider is created in opentelemetry-sdk, but that may be added later. I don't know of any other Tracer implementation, but that could also be a consideration.
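For illustration, the "load once" alternative could be built on `std::sync::OnceLock`; this sketch uses stand-in types, and it also makes the trade-off mentioned above visible (a later change to the limits would never be picked up):

```rust
use std::sync::OnceLock;

// Sketch only: cache whatever the tracer reports the first time it is
// asked, and reuse that value afterwards. The `Option<u32>` stands in for
// "max events per span, if any".
struct CachedLimits {
    cache: OnceLock<Option<u32>>,
}

impl CachedLimits {
    fn new() -> Self {
        Self { cache: OnceLock::new() }
    }

    fn max_events(&self, load: impl FnOnce() -> Option<u32>) -> Option<u32> {
        // `load` runs at most once; later calls return the cached value.
        *self.cache.get_or_init(load)
    }
}

fn main() {
    let limits = CachedLimits::new();
    // The first call queries the "tracer" (simulated by a closure)...
    assert_eq!(limits.max_events(|| Some(128)), Some(128));
    // ...and later calls keep the cached value even if the source changed.
    assert_eq!(limits.max_events(|| Some(1000)), Some(128));
}
```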

mladedav (author) commented:

I've added a benchmark with events; see the linked PR for details.

Current HEAD

otel_many_events/spec_baseline
                        time:   [19.951 µs 19.969 µs 19.991 µs]
                        change: [-0.2111% +0.0843% +0.3784%] (p = 0.58 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) high mild
  8 (8.00%) high severe
otel_many_events/no_data_baseline
                        time:   [48.757 µs 48.803 µs 48.859 µs]
                        change: [+0.1993% +0.4747% +0.7119%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe
otel_many_events/data_only_baseline
                        time:   [68.263 µs 68.286 µs 68.313 µs]
                        change: [+0.2342% +0.5605% +0.8355%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe
otel_many_events/full_filtered
                        time:   [251.47 µs 251.64 µs 251.83 µs]
                        change: [+82.121% +82.453% +82.717%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) high mild
  8 (8.00%) high severe
otel_many_events/full_not_filtered
                        time:   [280.82 µs 280.92 µs 281.02 µs]
                        change: [+2.0043% +2.1354% +2.2589%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) low severe
  3 (3.00%) low mild
  6 (6.00%) high mild
  4 (4.00%) high severe

This PR

otel_many_events/spec_baseline
                        time:   [19.504 µs 19.520 µs 19.540 µs]
                        change: [-2.4723% -2.1766% -1.8365%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  8 (8.00%) high mild
  10 (10.00%) high severe
otel_many_events/no_data_baseline
                        time:   [48.471 µs 48.558 µs 48.669 µs]
                        change: [-0.7518% -0.6127% -0.4556%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) high mild
  13 (13.00%) high severe
otel_many_events/data_only_baseline
                        time:   [67.843 µs 67.891 µs 67.947 µs]
                        change: [-0.6862% -0.5205% -0.3061%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe
otel_many_events/full_filtered
                        time:   [136.64 µs 136.72 µs 136.81 µs]
                        change: [-45.679% -45.580% -45.464%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  2 (2.00%) high mild
  11 (11.00%) high severe
otel_many_events/full_not_filtered
                        time:   [292.62 µs 292.72 µs 292.82 µs]
                        change: [+4.1160% +4.2252% +4.3375%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

The benchmarks do not seem to be very consistent, though. It might just be background workload on my machine, but the first benchmarks, which do not depend on the changes at all, show wildly different results, so it would be great if someone else could also run them in a more controlled environment.

The default limit is 128 events and the benchmark generates 1000, so the filtered result performs much better because it does not need to process the events that would be discarded anyway.

The not-filtered benchmark sets the limit to 1000 so that nothing is filtered; there the penalty is about 4%.
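For context, the filtered and not-filtered cases differ only in the configured event limit. Setting that limit on the SDK might look roughly like the following; the `trace::config()` and `with_max_events_per_span` calls are my assumption about the opentelemetry_sdk API and may not match the benchmark's actual setup:

```rust
use opentelemetry_sdk::trace::{self, TracerProvider};

// Sketch only: configure the SDK so that up to 1000 events per span are
// kept, mirroring the "not filtered" case (the default limit is 128).
// The builder calls are assumed, not taken from the benchmark code.
fn not_filtered_provider() -> TracerProvider {
    TracerProvider::builder()
        .with_config(trace::config().with_max_events_per_span(1000))
        .build()
}
```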

mladedav (author) commented:

I've converted this into a draft. I've found that there is an issue with the events (#94), and it might be better to finish this on top of that change, since it moves things around.

I've also realized that simply ignoring the events will lead to incorrect information about the number of dropped events, so we should at least keep a count somewhere and then inflate the builder with dummy events. Or we could decide that losing this information is an acceptable trade-off for the performance cost of tracking it, but that remains to be seen.
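As a sketch of the counting idea (illustrative types only, not code from this PR):

```rust
// Sketch only: keep a count of discarded events next to the stored ones,
// so the "dropped events" number is not silently lost at the limit.
struct EventBuffer<E> {
    max_events: usize,
    events: Vec<E>,
    dropped: usize,
}

impl<E> EventBuffer<E> {
    fn new(max_events: usize) -> Self {
        Self { max_events, events: Vec::new(), dropped: 0 }
    }

    fn push(&mut self, event: E) {
        if self.events.len() < self.max_events {
            self.events.push(event);
        } else {
            // Past the limit: remember that something was dropped.
            self.dropped += 1;
        }
    }
}

fn main() {
    let mut buf = EventBuffer::new(2);
    for i in 0..5 {
        buf.push(i);
    }
    assert_eq!(buf.events.len(), 2);
    assert_eq!(buf.dropped, 3);
}
```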

jtescher pushed a commit that referenced this pull request Feb 26, 2024
## Motivation

#94 - Multi-threaded tracing drops most of the events

## Solution

I have reorganized the code so that the lock on extensions is held, but not while `record` is called, which should still avoid the deadlock originally solved in #59 (see the sketch after this commit message).

### Benchmark

I've used the benchmark from #93; please ignore the "filtered" vs. "non_filtered" distinction, as for the purposes of this PR they should be identical cases.

See the results in a later comment, as the code changed afterwards.
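For illustration, the locking pattern described in the commit message above, reduced to a generic sketch: a plain `Mutex` stands in for the span's extensions, and the closure stands in for the `record` call.

```rust
use std::sync::Mutex;

// Sketch only: take the data out while holding the lock, release the
// lock, run the callback, then put the data back. The callback can never
// deadlock on `slot` because no guard is alive while it runs.
fn with_data_unlocked<T, R>(slot: &Mutex<Option<T>>, f: impl FnOnce(&mut T) -> R) -> Option<R> {
    // Hold the lock only long enough to move the data out.
    let mut data = slot.lock().unwrap().take()?;
    // No guard is held here, so `f` may freely touch `slot` itself.
    let result = f(&mut data);
    // Re-acquire the lock briefly to put the data back.
    *slot.lock().unwrap() = Some(data);
    Some(result)
}

fn main() {
    let slot = Mutex::new(Some(vec![1, 2, 3]));
    let sum = with_data_unlocked(&slot, |v| {
        v.push(4);
        v.iter().sum::<i32>()
    });
    assert_eq!(sum, Some(10));
}
```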
mladedav (author) commented:

It should be ready for review now. The code now discards all events, attributes, and links that would be discarded by otel anyway.

One thing I'm particularly unhappy with is that we do not signal that some events have been dropped. Fixing that would require either changes on the otel side so that we could construct a span and set the value directly, or inserting dummy values into the events or attributes vectors. The latter carries the risk that opentelemetry-rust changes how it selects which events/attributes to keep and ends up keeping just the dummy values, and binding their hands in that regard through the implementation here would also be somewhat sad.

Benchmark

It seems that this change has a negative impact on performance when filtering does not kick in. I'll try to see why that is and whether I can fix it with some inlining, or by working better with the optional limits. I feel like I must be doing something wrong, because the checks themselves should not account for 10%, even though the benchmarks do seem to fluctuate a lot.

I seem to remember that you suggested we load the span limits just once (I can't find that comment in this PR); I'll probably try that and see if it has any effect.

Details

     Running benches/trace.rs (target/release/deps/trace-811e6591eded5464)
Gnuplot not found, using plotters backend
otel_many_children/spec_baseline
                        time:   [11.434 µs 11.492 µs 11.551 µs]
                        change: [-0.5528% +0.2298% +1.0020%] (p = 0.56 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
otel_many_children/no_data_baseline
                        time:   [23.979 µs 24.078 µs 24.180 µs]
                        change: [-2.5883% -1.7865% -1.0361%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
otel_many_children/data_only_baseline
                        time:   [28.804 µs 28.958 µs 29.117 µs]
                        change: [-2.9538% -2.3271% -1.6733%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
otel_many_children/full time:   [58.390 µs 58.680 µs 58.995 µs]
                        change: [+3.0099% +3.7960% +4.5802%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

otel_many_events/spec_baseline
                        time:   [20.585 µs 20.672 µs 20.761 µs]
                        change: [-2.8551% -2.2160% -1.5239%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
otel_many_events/no_data_baseline
                        time:   [50.324 µs 50.606 µs 50.899 µs]
                        change: [-2.0556% -1.4063% -0.7466%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
otel_many_events/data_only_baseline
                        time:   [71.144 µs 71.525 µs 71.926 µs]
                        change: [-1.2366% -0.5851% +0.1102%] (p = 0.10 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
otel_many_events/full_filtered
                        time:   [140.88 µs 141.32 µs 141.80 µs]
                        change: [-40.172% -39.848% -39.474%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  11 (11.00%) high mild
  2 (2.00%) high severe
otel_many_events/full_not_filtered
                        time:   [256.88 µs 258.02 µs 259.28 µs]
                        change: [+8.9896% +9.7865% +10.611%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
