Fix performance of tracing rate limiter #1530

Open
sergeymatov wants to merge 2 commits into main from pr/smatov/rate-limit-recover

@sergeymatov (Contributor) commented May 12, 2026

Move to our own atomic operations for the per-level callsite filter. Reloading of interest is done via ArcSwap rather than RwLock.

The layered structure of throttled/unthrottled filters results in an atomic reload of Interest for each callsite. Layered<EnvFilter>::enabled() is invoked for every tracing message, but the tracing_subscriber::reload path that was previously used to check the atomic state ended up in RwLock::read_contended on each of those calls.
Pyroscope flame graph analysis showed that these operations were consuming an extreme amount of CPU time.

The proposed solution is to use ArcSwap and precisely update the callsite Interest for each Layer.
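
To illustrate why a lock-free read path helps here, below is a minimal sketch that uses only the standard library. It is not the code from this PR: `Level`, `AtomicLevelFilter`, `enabled` and `set_max` are hypothetical names, and a single `AtomicU8` stands in for the real `EnvFilter` state (which the PR swaps via ArcSwap). The point is only that the per-message check becomes one atomic load instead of a read-lock acquisition.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical severity levels, ordered from least to most verbose.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Level {
    Error = 0,
    Warn = 1,
    Info = 2,
    Debug = 3,
    Trace = 4,
}

/// Hypothetical lock-free filter: the maximum enabled level lives in one atomic.
struct AtomicLevelFilter {
    max_level: AtomicU8,
}

impl AtomicLevelFilter {
    fn new(max: Level) -> Self {
        Self { max_level: AtomicU8::new(max as u8) }
    }

    /// Hot path: a single relaxed atomic load, no lock acquisition.
    fn enabled(&self, level: Level) -> bool {
        (level as u8) <= self.max_level.load(Ordering::Relaxed)
    }

    /// Reload path: readers observe the new value on a later load.
    fn set_max(&self, max: Level) {
        self.max_level.store(max as u8, Ordering::Relaxed);
    }
}

fn main() {
    let filter = Arc::new(AtomicLevelFilter::new(Level::Info));
    assert!(filter.enabled(Level::Warn));
    assert!(!filter.enabled(Level::Debug));

    // Several readers check the filter concurrently without touching a lock.
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let f = Arc::clone(&filter);
            thread::spawn(move || (0..1000).filter(|_| f.enabled(Level::Error)).count())
        })
        .collect();
    for r in readers {
        assert_eq!(r.join().unwrap(), 1000);
    }

    // Raising verbosity is a single store; no reader is ever blocked.
    filter.set_max(Level::Trace);
    assert!(filter.enabled(Level::Debug));
}
```

A real filter carries more state than one integer, which is presumably why the PR reaches for ArcSwap: the whole filter is replaced behind a pointer while readers keep a cheap load on the hot path.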

Copilot AI review requested due to automatic review settings May 12, 2026 11:47
@sergeymatov sergeymatov requested a review from a team as a code owner May 12, 2026 11:47
@sergeymatov sergeymatov requested review from daniel-noland and removed request for a team May 12, 2026 11:47
Copilot AI (Contributor) left a comment

Pull request overview

Introduces configurable rate limiting for tracing output by adding a tracing-throttle-based layer to tracectl, wiring it into dataplane startup, and exposing configuration via a new CLI argument in args.

Changes:

  • Add TracingRateLimitConfig / TracingControl::init_with_rate_limit() and initialize tracing with optional rate limiting.
  • Add --tracing-rate-limit BURST:REPLENISH_PER_SECOND parsing and propagate it into the launch configuration.
  • Add the tracing-throttle dependency to the workspace and tracectl.
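
As a rough illustration of the `BURST:REPLENISH_PER_SECOND` format listed above, here is a hedged parsing sketch. `RateLimitSpec` and its fields are hypothetical names, not the `TracingRateLimit` type from this PR, and the error messages and integer widths are assumptions.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the PR's rate-limit argument type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RateLimitSpec {
    burst: u32,
    replenish_per_second: u32,
}

impl FromStr for RateLimitSpec {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split on the first ':'; anything without one is rejected.
        let (burst_str, rate_str) = s
            .split_once(':')
            .ok_or_else(|| format!("expected BURST:REPLENISH_PER_SECOND, got '{s}'"))?;
        let burst = burst_str
            .trim()
            .parse::<u32>()
            .map_err(|e| format!("invalid burst '{burst_str}': {e}"))?;
        // A second ':' ends up in rate_str and fails the integer parse.
        let replenish_per_second = rate_str
            .trim()
            .parse::<u32>()
            .map_err(|e| format!("invalid replenish rate '{rate_str}': {e}"))?;
        Ok(Self { burst, replenish_per_second })
    }
}

fn main() {
    let spec: RateLimitSpec = "100:10".parse().unwrap();
    assert_eq!(spec, RateLimitSpec { burst: 100, replenish_per_second: 10 });

    // Malformed inputs produce an error instead of a panic.
    assert!("100".parse::<RateLimitSpec>().is_err());
    assert!("fast:10".parse::<RateLimitSpec>().is_err());
}
```

Implementing `FromStr` keeps the parsing testable on its own, independent of how the CLI layer wires the argument in.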

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tracectl/src/lib.rs | Re-exports rate-limit config type from control. |
| tracectl/src/control.rs | Adds rate-limit config + subscriber initialization changes, including throttling logic and tests. |
| tracectl/Cargo.toml | Adds tracing-throttle dependency. |
| dataplane/src/main.rs | Initializes tracing with optional rate limit from CLI args; adjusts default log level behavior. |
| Cargo.toml | Adds tracing-throttle to workspace dependencies. |
| Cargo.lock | Locks new tracing-throttle transitive dependency. |
| args/src/lib.rs | Adds TracingRateLimit type, CLI parsing/help, and config propagation + tests. |

Comment thread tracectl/src/control.rs
Comment thread tracectl/src/control.rs
Comment thread args/src/lib.rs
Comment thread Cargo.toml
@sergeymatov sergeymatov force-pushed the pr/smatov/rate-limit-recover branch from 7f3a6a7 to 0814455 Compare May 12, 2026 16:14
@mvachhar mvachhar added the dont-merge Do not merge this Pull Request label May 12, 2026
@sergeymatov sergeymatov force-pushed the pr/smatov/rate-limit-recover branch from 0814455 to 0e90334 Compare May 13, 2026 08:13
@qmonnet qmonnet marked this pull request as draft May 13, 2026 08:19
@sergeymatov sergeymatov force-pushed the pr/smatov/rate-limit-recover branch 4 times, most recently from 81f0207 to 65ae6b0 Compare May 18, 2026 12:56
@sergeymatov sergeymatov marked this pull request as ready for review May 19, 2026 09:05
@sergeymatov sergeymatov force-pushed the pr/smatov/rate-limit-recover branch from 65ae6b0 to 510ff04 Compare May 19, 2026 09:08
This reverts commit 75f964b.

Signed-off-by: Sergey Matov <sergey.matov@githedgehog.com>
@sergeymatov sergeymatov force-pushed the pr/smatov/rate-limit-recover branch from 510ff04 to c7c6cb5 Compare May 19, 2026 09:09
@sergeymatov sergeymatov changed the title from 'DONT MERGE!! TEST ONLY: Revert "revert: feat(logging): Add configurable rate limit for logging"' to 'Fix performance of tracing rate limiter' May 19, 2026
@sergeymatov sergeymatov removed the dont-merge Do not merge this Pull Request label May 19, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Comment thread tracectl/src/control.rs
Comment thread Cargo.toml
Comment thread args/src/lib.rs
@sergeymatov sergeymatov force-pushed the pr/smatov/rate-limit-recover branch 3 times, most recently from dc1c83c to 55a1289 Compare May 19, 2026 13:08
Comment thread tracectl/src/control.rs Outdated
Comment thread tracectl/src/control.rs Outdated
/// Handle used to atomically swap in a new `EnvFilter` from outside the
/// subscriber. Mirrors `reload::Handle`.
#[derive(Clone)]
struct AtomicEnvFilterHandle {

This type is identical to AtomicEnvFilter, except that it implements Clone.
Why do we need two types that are the same?

Asking because both types are private to this crate?

@Fredi-raspall (Contributor) left a comment

Looks good to me overall, but I would like to see several changes in this PR.

  1. Why two types for AtomicEnvFilter? Any chance a single type suffices?
  2. I'd remove the no-longer-used Result cases.
  3. I would like to see a test where a bunch of logs are produced and rate-limited. There is a way to know whether an event would be traced under a given config, but I'm unsure whether that extends to rate limiting. It would be good to at least be able to see the effect.
  4. Ideally, I wonder whether two subscribers (one with rate limiting, one without) could be compared (profiled) to see the performance impact of the rate limiter. I'm unsure whether this can be done in any meaningful or reliable way.
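
For the test requested in point 3, one possible shape is sketched below. It assumes a token-bucket model for the BURST / REPLENISH_PER_SECOND pair, which may not match how tracing-throttle actually behaves. `TokenBucket` and `try_acquire` are hypothetical names, and time is passed in explicitly so the assertions are deterministic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token bucket; not the tracing-throttle implementation.
struct TokenBucket {
    burst: f64,
    replenish_per_second: f64,
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Starts full, so the first `burst` events pass immediately.
    fn new(burst: u32, replenish_per_second: u32, now: Instant) -> Self {
        Self {
            burst: f64::from(burst),
            replenish_per_second: f64::from(replenish_per_second),
            tokens: f64::from(burst),
            last_refill: now,
        }
    }

    /// Returns true if an event arriving at `now` may be emitted.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Refill proportionally to elapsed time, capped at the burst size.
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.replenish_per_second).min(self.burst);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let start = Instant::now();
    let mut bucket = TokenBucket::new(5, 2, start);

    // 100 events at the same instant: only the burst gets through.
    let emitted = (0..100).filter(|_| bucket.try_acquire(start)).count();
    assert_eq!(emitted, 5);

    // One second later, two tokens have been replenished.
    let later = start + Duration::from_secs(1);
    let emitted = (0..100).filter(|_| bucket.try_acquire(later)).count();
    assert_eq!(emitted, 2);
}
```

A test against the real subscriber would count emitted events through a capturing writer rather than a model like this, but the same pattern of a fixed burst followed by a timed refill could drive its expectations.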

Comment thread tracectl/src/control.rs Outdated
Comment thread tracectl/src/control.rs Outdated
Avoid using RwLock to reduce complexity for EnvFilter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Sergey Matov <sergey.matov@githedgehog.com>
@sergeymatov sergeymatov force-pushed the pr/smatov/rate-limit-recover branch from 55a1289 to dcc9ef1 Compare May 21, 2026 11:56
@sergeymatov (Contributor, Author) commented

I will add a test as a separate commit. Thanks @Fredi-raspall

4 participants