Feat/consumer based rate limit #1681

Closed
Saadha123 wants to merge 9 commits into wso2:main from Saadha123:Feat/consumer-based-rate-limit

@Saadha123
Contributor

@Saadha123 Saadha123 commented Apr 9, 2026

Purpose

All LLM gateway rate limits are currently applied at the provider level with a single shared counter. If App A exhausts the token or cost budget, App B gets blocked too. This PR adds consumer-based rate limiting so each GenAI application has its own independent counter.

Goals

  • Track token, cost, and request count limits independently per GenAI application (x-wso2-application-id)
  • Support backend-wide and per-consumer limits running simultaneously in the same request chain
  • Unauthenticated requests (no app ID) fall back to a shared "default" counter rather than colliding with the backend counter
  • Sync the default policy YAML copies in gateway/gateway-controller/default-policies/ with upstream
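
The goals above can be sketched as a small in-memory limiter. This is only an illustration under stated assumptions: `consumerKey`, `consumerCounters`, and `allow` are hypothetical names that do not come from this PR, the limiter counts requests only, and it is not safe for concurrent use. Only the header name x-wso2-application-id and the "default" fallback come from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultConsumer is the shared fallback bucket for requests without an app ID.
const defaultConsumer = "default"

// consumerKey (hypothetical helper) maps the value of the
// x-wso2-application-id header to a counter key. An empty or blank
// value falls back to the shared "default" bucket.
func consumerKey(appID string) string {
	if id := strings.TrimSpace(appID); id != "" {
		return id
	}
	return defaultConsumer
}

// consumerCounters (hypothetical type) keeps one request counter per consumer.
// Illustration only: no time window, no locking.
type consumerCounters struct {
	limit int
	used  map[string]int
}

func newConsumerCounters(limit int) *consumerCounters {
	return &consumerCounters{limit: limit, used: make(map[string]int)}
}

// allow increments the caller's own counter and reports whether the
// request is still within that consumer's limit.
func (c *consumerCounters) allow(appID string) bool {
	key := consumerKey(appID)
	if c.used[key] >= c.limit {
		return false
	}
	c.used[key]++
	return true
}

func main() {
	limiter := newConsumerCounters(1)
	fmt.Println(limiter.allow("app-a")) // true: first request from app-a
	fmt.Println(limiter.allow("app-a")) // false: app-a exhausted its own budget
	fmt.Println(limiter.allow("app-b")) // true: app-b has an independent counter
	fmt.Println(limiter.allow(""))      // true: no app ID, counted under "default"
}
```

In this sketch App A running out of budget leaves App B untouched, which is the behaviour the Purpose section asks for.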

Approach

platform-api/src/internal/service/llm_deployment.go

  • Consumer request limit: emits advanced-ratelimit with x-wso2-application-id in key extraction (fallback: default) and quota name consumer-request-limit
  • Consumer token limit: emits token-based-ratelimit with consumerBased: true
  • Consumer cost limit: emits llm-cost-based-ratelimit with consumerBased: true; a hasPolicy guard prevents llm-cost from appearing twice when both backend and consumer cost limits are configured
  • Fixed addOrAppendPolicyPath to be scope-aware so backend and consumer entries with the same policy name are not merged
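
A minimal sketch of what scope-aware merging and the hasPolicy guard might look like. The `policy` struct, its fields, and both function signatures are assumptions made for illustration; the actual types and signatures in llm_deployment.go are not shown in this PR description.

```go
package main

import "fmt"

// policy is a hypothetical, simplified stand-in for a generated policy entry.
type policy struct {
	Name          string
	ConsumerBased bool // scope marker: true = per consumer, false = backend-wide
	Paths         []string
}

// hasPolicy reports whether any entry with the given name already exists.
// Signature assumed for illustration.
func hasPolicy(policies []policy, name string) bool {
	for _, p := range policies {
		if p.Name == name {
			return true
		}
	}
	return false
}

// addOrAppendPolicyPath appends path to an existing entry only when both
// the name and the scope match; otherwise it adds a new entry, so backend
// and consumer entries with the same policy name are never merged.
func addOrAppendPolicyPath(policies []policy, name string, consumerBased bool, path string) []policy {
	for i := range policies {
		if policies[i].Name == name && policies[i].ConsumerBased == consumerBased {
			policies[i].Paths = append(policies[i].Paths, path)
			return policies
		}
	}
	return append(policies, policy{Name: name, ConsumerBased: consumerBased, Paths: []string{path}})
}

func main() {
	var policies []policy
	policies = addOrAppendPolicyPath(policies, "token-based-ratelimit", false, "/chat")
	policies = addOrAppendPolicyPath(policies, "token-based-ratelimit", true, "/chat")
	policies = addOrAppendPolicyPath(policies, "token-based-ratelimit", false, "/embeddings")

	// Emit llm-cost once, however many cost limits reference it.
	for i := 0; i < 2; i++ {
		if !hasPolicy(policies, "llm-cost") {
			policies = append(policies, policy{Name: "llm-cost"})
		}
	}

	fmt.Println(len(policies)) // 3: backend token, consumer token, llm-cost
}
```

Matching on name alone would fold the consumer entry into the backend entry and lose the consumerBased flag, which appears to be the bug the last bullet describes.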

gateway/gateway-controller/default-policies/

  • Updated token-based-ratelimit.yaml, llm-cost-based-ratelimit.yaml, advanced-ratelimit.yaml, and api-key-auth.yaml to match upstream policy definitions

gateway/it/

  • Added consumer-token-based-ratelimit.feature, consumer-request-based-ratelimit.feature, consumer-cost-based-ratelimit.feature (registered in suite_test.go)

Automation tests

Unit tests — llm_deployment_test.go (11 tests)

Covers generateLLMProviderDeploymentYAML for: backend-only limits, consumer-only limits, both limits together, all three consumer limit types combined, and disabled limits being skipped.

Integration tests — 9 scenarios across 3 feature files

| Feature | Scenarios |
|---|---|
| consumer-token-based-ratelimit | Independent consumer counters; backend blocks all when exhausted; fallback default counter when no app ID |
| consumer-request-based-ratelimit | Same three scenarios for request count limits |
| consumer-cost-based-ratelimit | Same three scenarios for cost limits |

Related PRs

  • gateway-controllers Feat/consumer-based-rl — policy-level changes (api-key-auth, token-based-ratelimit, llm-cost-based-ratelimit, advanced-ratelimit)
  • apim-saas fix/consumer-ratelimit-ui — removes the "Coming Soon" chip from the Per Consumer card in the UI

Fixes: #2307

Summary by CodeRabbit

  • New Features

    • Optional per‑consumer rate limiting for token-, request-, and cost-based policies so each application can have independent counters.
  • Updates

    • Bumped default policy versions for authentication and rate-limiting policies.
  • Tests

    • Added end-to-end integration tests covering consumer-based request, token, and cost rate-limiting scenarios and suite registration updates.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 9, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 76cae86c-4f2f-44a4-bdcb-d793e9eddd17

📥 Commits

Reviewing files that changed from the base of the PR and between f36fd26 and 3e99bf4.

📒 Files selected for processing (1)
  • gateway/it/features/consumer-cost-based-ratelimit.feature
🚧 Files skipped from review as they are similar to previous changes (1)
  • gateway/it/features/consumer-cost-based-ratelimit.feature

Walkthrough

Policy version bumps and addition of consumerBased parameters enable per-consumer vs shared scoping for token, request, and cost rate limits. Policy-generation logic now emits consumer-scoped policies and merges scope-aware policy paths. New integration tests validate consumer-based behaviors and added unit tests cover YAML generation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Policy Version Updates: gateway/gateway-controller/default-policies/advanced-ratelimit.yaml, gateway/gateway-controller/default-policies/api-key-auth.yaml | Minor metadata version bumps only (v1.x.x updates); no functional changes. |
| Consumer-Based Policy params: gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml, gateway/gateway-controller/default-policies/token-based-ratelimit.yaml | Added consumerBased boolean parameter (default false) and bumped versions to support per-consumer scoping keyed by x-wso2-application-id. |
| Policy Generation Logic: platform-api/src/internal/service/llm_deployment.go | Expanded generation to emit consumer- and resource-scoped policies for tokens, requests, and costs; added scope-aware merging (addOrAppendPolicyPath) and hasPolicy helper; generates llm-cost policies when needed. |
| Unit Tests for YAML generation: platform-api/src/internal/service/llm_deployment_test.go | New tests and helpers validating consumer-only, backend+consumer, backend-only, and disabled-limit scenarios; asserts presence/absence of policy entries and consumer scoping markers. |
| Integration Tests (Gherkin): gateway/it/features/consumer-token-based-ratelimit.feature, gateway/it/features/consumer-request-based-ratelimit.feature, gateway/it/features/consumer-cost-based-ratelimit.feature | Three new feature files covering per-consumer independent counters, shared backend exhaustion, fallback/default buckets, and combined-policy interactions. |
| IT Suite Registration: gateway/it/suite_test.go | Registered the three new consumer-based rate-limiting feature files in the integration test suite. |

Sequence Diagram

sequenceDiagram
    participant Consumer as Consumer
    participant Gateway as Gateway
    participant Auth as api-key-auth
    participant RateLimit as RateLimiter
    participant Backend as SharedBackend

    Consumer->>Gateway: Request (may include API Key)
    activate Gateway
    Gateway->>Auth: Validate/extract API key -> x-wso2-application-id
    activate Auth
    Auth-->>Gateway: app-id or none
    deactivate Auth

    Gateway->>RateLimit: Evaluate policy (consumerBased true/false, limits)
    activate RateLimit
    alt consumerBased == true
        RateLimit->>RateLimit: Lookup counter for app-id (or "default")
        RateLimit->>RateLimit: Increment & compare vs consumer limit
        alt within consumer limit
            RateLimit-->>Gateway: Allow (200)
        else
            RateLimit-->>Gateway: Reject (429)
        end
    else consumerBased == false
        RateLimit->>Backend: Increment/check shared backend counter
        Backend->>Backend: Compare vs backend limit
        alt within backend limit
            Backend-->>RateLimit: Allow
            RateLimit-->>Gateway: Allow (200)
        else
            Backend-->>RateLimit: Reject
            RateLimit-->>Gateway: Reject (429)
        end
    end
    deactivate RateLimit

    Gateway-->>Consumer: Response (200 or 429)
    deactivate Gateway
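
The flow in the diagram can be approximated in Go as follows. This is a sketch under assumptions, not the gateway's implementation: `rateLimitPolicy`, `newPolicy`, `evaluate`, `handle`, and the `sharedKey` constant are hypothetical names, counters are plain in-memory integers with no time window, and an earlier policy in the chain keeps its increment even when a later policy rejects the request.

```go
package main

import (
	"fmt"
	"net/http"
)

// sharedKey is a hypothetical key for the single backend-wide counter.
const sharedKey = "__backend__"

// rateLimitPolicy is a simplified stand-in for one rate-limit policy in the chain.
type rateLimitPolicy struct {
	consumerBased bool
	limit         int
	counters      map[string]int
}

func newPolicy(consumerBased bool, limit int) *rateLimitPolicy {
	return &rateLimitPolicy{consumerBased: consumerBased, limit: limit, counters: make(map[string]int)}
}

// key picks the counter: the shared backend counter when consumerBased is
// false, otherwise the app ID, or "default" when no app ID was extracted.
func (p *rateLimitPolicy) key(appID string) string {
	if !p.consumerBased {
		return sharedKey
	}
	if appID == "" {
		return "default"
	}
	return appID
}

// evaluate compares the selected counter against the limit, increments it
// when there is room, and returns the HTTP status the gateway would send.
func (p *rateLimitPolicy) evaluate(appID string) int {
	k := p.key(appID)
	if p.counters[k] >= p.limit {
		return http.StatusTooManyRequests
	}
	p.counters[k]++
	return http.StatusOK
}

// handle runs every policy in order and stops at the first rejection.
func handle(chain []*rateLimitPolicy, appID string) int {
	for _, p := range chain {
		if status := p.evaluate(appID); status != http.StatusOK {
			return status
		}
	}
	return http.StatusOK
}

func main() {
	// Backend-wide limit of 3 plus a per-consumer limit of 1 in the same chain.
	chain := []*rateLimitPolicy{newPolicy(false, 3), newPolicy(true, 1)}
	fmt.Println(handle(chain, "app-a")) // 200
	fmt.Println(handle(chain, "app-a")) // 429: consumer limit for app-a reached
	fmt.Println(handle(chain, "app-b")) // 200: app-b has its own counter
	fmt.Println(handle(chain, "app-c")) // 429: shared backend counter exhausted
}
```

The last call shows the "backend blocks all when exhausted" scenario: app-c has never made a request, yet it is rejected because the shared counter is full.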

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A rabbit hops where counters bloom,
Per-app buckets find their room,
Tokens, requests, and costs align,
Each little app keeps its own line,
Thump-thump—rate limits celebrate with a tune 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 65.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Feat/consumer based rate limit' clearly summarizes the main feature being added: consumer-based rate limiting functionality that allows independent tracking per GenAI application. |
| Description check | ✅ Passed | The PR description comprehensively covers Purpose, Goals, Approach with implementation details, Automation tests with unit and integration test breakdown, and Related PRs. All major template sections are well-documented. |

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
platform-api/src/internal/service/llm_deployment_test.go (1)

22-22: Unused helper function.

float32Ptr is defined but not used in any of the tests. Consider removing it to avoid dead code.

🧹 Proposed fix
-func float32Ptr(f float32) *float32 { return &f }
-
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@platform-api/src/internal/service/llm_deployment_test.go` at line 22, The
helper function float32Ptr is unused and should be removed to eliminate dead
code: delete the float32Ptr(f float32) *float32 { return &f } declaration from
the test file (or, if it was intended to be used, replace its callers to use it
and ensure tests import it); ensure no references to float32Ptr remain in the
test package before committing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9929c116-db65-417d-a5dc-dcf3fdafdc9a

📥 Commits

Reviewing files that changed from the base of the PR and between 9f931fb and f36fd26.

📒 Files selected for processing (10)
  • gateway/gateway-controller/default-policies/advanced-ratelimit.yaml
  • gateway/gateway-controller/default-policies/api-key-auth.yaml
  • gateway/gateway-controller/default-policies/llm-cost-based-ratelimit.yaml
  • gateway/gateway-controller/default-policies/token-based-ratelimit.yaml
  • gateway/it/features/consumer-cost-based-ratelimit.feature
  • gateway/it/features/consumer-request-based-ratelimit.feature
  • gateway/it/features/consumer-token-based-ratelimit.feature
  • gateway/it/suite_test.go
  • platform-api/src/internal/service/llm_deployment.go
  • platform-api/src/internal/service/llm_deployment_test.go

@Saadha123
Contributor Author

Moved changes to #1817, #1818, and #1820.

@Saadha123 Saadha123 closed this Apr 29, 2026