Fix performance of tracing rate limiter #1530
Pull request overview
Introduces configurable rate limiting for tracing output by adding a tracing-throttle-based layer to tracectl, wiring it into dataplane startup, and exposing configuration via a new CLI argument in args.
Changes:
- Add `TracingRateLimitConfig` / `TracingControl::init_with_rate_limit()` and initialize tracing with optional rate limiting.
- Add `--tracing-rate-limit BURST:REPLENISH_PER_SECOND` parsing and propagate it into the launch configuration.
- Add the `tracing-throttle` dependency to the workspace and `tracectl`.
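To illustrate the `BURST:REPLENISH_PER_SECOND` format, here is a minimal parsing sketch. `RateLimitSpec` is a hypothetical stand-in for the `TracingRateLimit` type in `args`, whose actual fields and validation rules are not shown here. Rejecting zero values is an assumption of this sketch.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the rate-limit CLI value (not the PR's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RateLimitSpec {
    /// Maximum number of events allowed in a burst.
    pub burst: u32,
    /// Number of events replenished per second.
    pub replenish_per_second: u32,
}

impl FromStr for RateLimitSpec {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split on the first ':'; anything else is a format error.
        let (burst, rate) = s
            .split_once(':')
            .ok_or_else(|| format!("expected BURST:REPLENISH_PER_SECOND, got '{s}'"))?;
        let burst: u32 = burst
            .trim()
            .parse()
            .map_err(|e| format!("invalid burst '{burst}': {e}"))?;
        let rate: u32 = rate
            .trim()
            .parse()
            .map_err(|e| format!("invalid replenish rate '{rate}': {e}"))?;
        // Assumption of this sketch: a zero value would disable all output.
        if burst == 0 || rate == 0 {
            return Err("burst and replenish rate must both be non-zero".to_string());
        }
        Ok(Self { burst, replenish_per_second: rate })
    }
}
```

With this sketch, `"100:10".parse::<RateLimitSpec>()` yields a burst of 100 and a replenish rate of 10 per second, while inputs such as `"100"` or `"a:10"` return an error.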
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tracectl/src/lib.rs | Re-exports rate-limit config type from control. |
| tracectl/src/control.rs | Adds rate-limit config + subscriber initialization changes, including throttling logic and tests. |
| tracectl/Cargo.toml | Adds tracing-throttle dependency. |
| dataplane/src/main.rs | Initializes tracing with optional rate limit from CLI args; adjusts default log level behavior. |
| Cargo.toml | Adds tracing-throttle to workspace dependencies. |
| Cargo.lock | Locks new tracing-throttle transitive dependency. |
| args/src/lib.rs | Adds TracingRateLimit type, CLI parsing/help, and config propagation + tests. |
This reverts commit 75f964b. Signed-off-by: Sergey Matov <sergey.matov@githedgehog.com>
    /// Handle used to atomically swap in a new `EnvFilter` from outside the
    /// subscriber. Mirrors `reload::Handle`.
    #[derive(Clone)]
    struct AtomicEnvFilterHandle {
This type is identical to AtomicEnvFilter, except that it implements Clone.
Why do we need two types that are the same?
Asking because both types are private to this crate?
Fredi-raspall left a comment
Looks good to me overall, but I wish the PR included several changes.
- why two types for AtomicEnvFilter? Any chance a single type suffices?
- I'd remove the no-longer-used Result cases.
- I would like to see a test where a bunch of logs are produced and rate-limited. There is a way to know if an event would be traced under a given config. Unsure if that is extensible when you have rate limiting. It would be good to at least be able to see the effect.
- Super ideally: I wonder if two subscribers (one with rate limiting, one without) could be compared (profiled) to see the performance impact of the rate limiter. Unsure if this can be done in any meaningful / reliable way.
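One way to make the effect of a `BURST:REPLENISH_PER_SECOND` limit visible in a test is to drive a limiter with explicit timestamps and count what gets through. The sketch below uses a hypothetical `TokenBucket`; it is not the `tracing-throttle` implementation and not the code in this PR, only an illustration of the burst/replenish behavior a test could check.

```rust
use std::time::Instant;

/// Hypothetical token bucket (illustration only, not `tracing-throttle`).
pub struct TokenBucket {
    burst: f64,
    replenish_per_second: f64,
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Starts with a full bucket of `burst` tokens at time `now`.
    pub fn new(burst: u32, replenish_per_second: u32, now: Instant) -> Self {
        Self {
            burst: f64::from(burst),
            replenish_per_second: f64::from(replenish_per_second),
            tokens: f64::from(burst),
            last_refill: now,
        }
    }

    /// Returns true if an event arriving at `now` may be emitted.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        // Refill according to elapsed time, capped at the burst size.
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.replenish_per_second).min(self.burst);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

/// Counts how many of `events` back-to-back events pass at the instant `now`.
pub fn count_emitted(bucket: &mut TokenBucket, events: usize, now: Instant) -> usize {
    (0..events).filter(|_| bucket.try_acquire(now)).count()
}
```

Passing the timestamp in explicitly keeps such a test deterministic: with a burst of 5 and a replenish rate of 2 per second, 100 events at the start instant let 5 through, and 100 more events one second later let 2 through.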
Avoid using RwLock to reduce complexity for EnvFilter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Sergey Matov <sergey.matov@githedgehog.com>
I will add a test as a separate commit. Thanks @Fredi-raspall
Move to our own atomic operations for the per-level callsite filter. Reloading of `Interest` is done via `ArcSwap` rather than `RwLock`.
The layered structure of throttled/unthrottled filters results in atomic reloading of `Interest` for each callsite. `Layered<EnvFilter>::enabled()` is invoked for every tracing message, but the `tracing_subscriber::reload` path that was previously used to check the filter state hit `RwLock::read_contended` on each such call. Pyroscope flame graph analysis showed that these operations were consuming an extreme amount of CPU time.
The proposed solution is to use `ArcSwap` and precisely update the callsite `Interest` for each `Layer`.
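The general idea of replacing a lock on the hot path with an atomic read can be sketched with the standard library alone. The example below is not the PR's code: it uses a plain `AtomicU8` instead of `ArcSwap`, and `Level` and `AtomicLevelFilter` are hypothetical names. It only shows that the per-event check becomes a single atomic load while reconfiguration stays a cheap store.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical severity levels, ordered from least to most verbose.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[repr(u8)]
pub enum Level {
    Error = 1,
    Warn = 2,
    Info = 3,
    Debug = 4,
    Trace = 5,
}

/// Hypothetical filter whose maximum enabled level can be changed at runtime
/// without taking a lock (illustration only).
pub struct AtomicLevelFilter {
    max_level: AtomicU8,
}

impl AtomicLevelFilter {
    pub const fn new(max: Level) -> Self {
        Self { max_level: AtomicU8::new(max as u8) }
    }

    /// Hot path: called for every event; one atomic load, no lock acquisition.
    pub fn enabled(&self, level: Level) -> bool {
        (level as u8) <= self.max_level.load(Ordering::Relaxed)
    }

    /// Control path: publish a new maximum level to all readers.
    pub fn set_max_level(&self, max: Level) {
        self.max_level.store(max as u8, Ordering::Relaxed);
    }
}
```

A full `EnvFilter` does not fit in a single integer, which is presumably why the change swaps a pointer to the whole filter via `ArcSwap`; the read side remains lock-free in the same way.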