feat(tracebuffer): flush on drop, drain on shutdown, and support nested tokio runtimes#2005
feat(tracebuffer): flush on drop, drain on shutdown, and support nested tokio runtimes#2005Aaalibaba42 wants to merge 8 commits into
Conversation
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🔒 Cargo Deny Results📦
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2005 +/- ##
==========================================
+ Coverage 72.83% 72.92% +0.09%
==========================================
Files 458 458
Lines 75789 76036 +247
==========================================
+ Hits 55201 55453 +252
+ Misses 20588 20583 -5
🚀 New features to boost your workflow:
|
|
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
b5a716f to
d55d383
Compare
d4d669c to
463495c
Compare
yannham
left a comment
There was a problem hiding this comment.
We've discussed this offline, and the consensus seemed to be that the actual issue of the runtime-in-the-runtime problem is that some functions of our API pretend to be async but actually use block_on under the hood. It seems one possible solution is to actually make them async, and to offer a customary blocking API at the top-level only. Then, from Rust we could only use the purely async API and not care about the shared runtime at all.
On the other hand, the current solution secretly spawns a thread and run a second runtime on the side, while blocking the original executor's thread waiting for the answer. This is a bit fishy, and we never want to block an async executor thread, as this will block a whole bunch of unrelated async tasks queued on this thread.
13fe812 to
f8d6d54
Compare
…kio-inside-tokio problems
… have tokio-inside-tokio problems" This reverts commit a9fceca.
f8d6d54 to
fd24165
Compare
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
What?
Make the
TraceBufferandSharedRuntimesafe to use from within a tokio runtime, add a synchronousflush_and_waitAPI, ensure pending spans are drained on shutdown, and fire a best-effort flush onDrop.Why?
SharedRuntime::block_onandshutdownpanic with "Cannot start a runtime from within a runtime" when called from inside a tokio context (e.g. the Rust tracer embedded in an async application).TraceBufferwithout an explicit flush also lost any buffered spans.How?
SharedRuntime(libdd-shared-runtime):block_onandshutdownnow detect a running tokioHandleviaHandle::try_current()and, when found, drive the blocking work on a scoped OS thread (std::thread::scope) instead of callingRuntime::block_ondirectly. TheArc<Runtime>is explicitly dropped on that same thread so its join-on-drop doesn't panic either. AddedSendbounds on the future and its output to support this.TraceBuffer(libdd-data-pipeline): Addedflush_and_wait(timeout)— triggers a flush, captures the batch generation, and blocks until the worker has processed it. AddedReceiver::drain()to synchronously pull remaining chunks without waiting for a flush trigger;TraceExporterWorker::shutdownnow calls it to export any leftover spans. ImplementedDrop for TraceBuffer<T>that fires a non-blocking flush notify.TraceExporter: AddedSendbounds onT::TextandT::Bytesforsend_trace_chunksto satisfy the newblock_onsignature.Additional Notes
Dropimpl is intentionally best-effort (errors are swallowed) — if the runtime is already gone, there's nothing useful to do.flush_and_waitshort-circuits and returns immediately when the batch is empty.block_on,shutdown, anddropfrom inside#[tokio::test].