
🤖 feat: add Fast/Slow service-tier control to chat input #3476

Open
ammar-agent wants to merge 5 commits into main from chat-input-f8xm

@ammar-agent commented Jun 5, 2026

Summary

Adds a service-tier (speed) control to the chat input: a lightning-bolt to the right of the thinking slider that appears only for models supporting service_tier (OpenAI/GPT today). The bolt glows orange for Fast, turns blue for Slow, and is neutral grey for Auto/default. Clicking it opens a small menu to set a chat-specific override.

The UI deliberately says Fast/Slow (not provider wire values) so it generalizes to future providers, even though only OpenAI GPT is supported for now.

Background

service_tier already existed end-to-end as a global, per-provider config setting (config.openai.serviceTier), threaded through providerOptions.openai.serviceTier. There was no per-chat override and no way to set it from the chat input. This adds that affordance.

Mapping (single source of truth in serviceTier.ts): Fast → priority (low latency), Slow → flex (cheaper, higher latency), Auto → no override (provider/global default applies).
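For readers skimming the thread, the mapping above can be sketched as a small lookup. This is an illustration only: the type and function names (`ServiceTierSpeed`, `speedToWireTier`, `wireTierToSpeed`) are assumptions and may not match what `serviceTier.ts` actually exports.

```typescript
// Hypothetical names; only the Fast/Slow/Auto <-> priority/flex pairs come from the PR text.
type ServiceTierSpeed = "fast" | "slow" | "auto";
type OpenAIServiceTier = "priority" | "flex";

const SPEED_TO_TIER: Record<Exclude<ServiceTierSpeed, "auto">, OpenAIServiceTier> = {
  fast: "priority", // low latency
  slow: "flex", // cheaper, higher latency
};

// Auto maps to undefined, meaning "send no override; provider/global default applies".
function speedToWireTier(speed: ServiceTierSpeed): OpenAIServiceTier | undefined {
  return speed === "auto" ? undefined : SPEED_TO_TIER[speed];
}

// Reverse direction: any missing or unrecognized wire value is shown as Auto.
function wireTierToSpeed(tier: string | undefined): ServiceTierSpeed {
  if (tier === "priority") return "fast";
  if (tier === "flex") return "slow";
  return "auto";
}
```

Keeping both directions next to each other is one way to make the UI label and the wire value hard to drift apart.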

An earlier revision of this PR also added /fast and /slow one-shot slash commands. They were removed: as standalone service-tier-only one-shots they did not compose with the existing /<model>[+thinking] one-shot syntax, so the bolt control is now the sole entry point.

Implementation

  • Shared module src/common/utils/ai/serviceTier.ts: supportsServiceTier(model) (OpenAI-gated, and only for direct/passthrough routes since non-passthrough gateways drop the field), Fast/Slow ↔ wire mappings, getServiceTierSpeed, and withServiceTierOverride(providerOptions, tier, model) — the one place every send path merges the tier (returns options unchanged for no-override or unsupported models, never mutating input).
  • Per-chat state: useServiceTier(scopeId) backed by usePersistedState (localStorage, keyed by workspace/project scope, cross-component synced). Like the existing Anthropic 1M-context toggle, it is intentionally not persisted to workspace metadata — it rides along per request via providerOptions.
  • UI: ServiceTierPicker (lightning-bolt + keyboard-navigable menu, conditional-rendered for happy-dom testability). Colors use new --color-service-tier-fast/-slow tokens in globals.css. Returns null (occupies no layout) for unsupported models.
  • Send paths: useSendMessageOptions and the non-React getSendOptionsFromStorage both apply the override via the shared helper. The creation flow carries a tier chosen pre-workspace into the new workspace scope.
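A rough sketch of the non-mutating merge described for `withServiceTierOverride(providerOptions, tier, model)` follows. It is not the PR's code: the `ProviderOptions` shape, the `Sketch` suffix, and the injected `supports` predicate (defaulting to a naive `openai:` prefix check) are assumptions made so the example is self-contained.

```typescript
type WireTier = "priority" | "flex";
// Assumed shape: one options bag per provider namespace.
type ProviderOptions = Record<string, Record<string, unknown> | undefined>;

function withServiceTierOverrideSketch(
  providerOptions: ProviderOptions | undefined,
  tier: WireTier | undefined,
  model: string,
  // Stand-in for supportsServiceTier(model); the real gating is stricter.
  supports: (model: string) => boolean = (m) => m.startsWith("openai:"),
): ProviderOptions | undefined {
  // No override selected, or the model cannot honor it: hand back the input as-is.
  if (tier === undefined || !supports(model)) return providerOptions;
  // Otherwise build new objects at both levels so the caller's input is never mutated.
  return {
    ...providerOptions,
    openai: { ...providerOptions?.openai, serviceTier: tier },
  };
}
```

Under these assumptions, calling it with `"priority"` and `"openai:gpt-5"` yields a new object whose `openai.serviceTier` is set while other `openai` options are preserved; calling it for a non-OpenAI model returns the same reference that was passed in.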

Validation

  • New unit tests: serviceTier helpers (mapping/gating/merge/no-mutation, including passthrough-gateway routing).
  • New component test: ServiceTierPicker gating, open, Fast/Slow/Auto selection + persistence (5 cases).
  • Creation-flow test asserts the tier override follows into the new workspace scope.
  • make static-check green; targeted suites pass.

Risks

Low. The change is additive and OpenAI-gated. The shared withServiceTierOverride never attaches a tier for unsupported models, so non-OpenAI requests are unaffected. The bolt only renders for supporting models, so existing layouts/snapshots for non-OpenAI models are unchanged; ChatInput stories using a GPT model will now show the bolt (expected).


Generated with mux • Model: anthropic:claude-opus-4-8 • Thinking: xhigh • Cost: $41.61

Add a lightning-bolt service-tier control to the chat input (right of the
thinking slider) for models that support service_tier (OpenAI/GPT today).
The bolt glows orange for Fast (priority), turns blue for Slow (flex), and
is neutral grey for Auto/default. Clicking opens a per-chat override menu.

Also add /fast and /slow one-shot slash commands that reuse the existing
model-oneshot send path, with TipsCarousel discovery. Uses Fast/Slow wording
to stay provider-agnostic for future models.
@ammar-agent
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23d1497908

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/common/utils/ai/serviceTier.ts Outdated
Addresses Codex P2: non-passthrough gateway-routed OpenAI models (e.g.
openrouter:openai/gpt-5) canonicalize to 'openai' but the backend drops
providerOptions.openai.serviceTier on those routes. supportsServiceTier now
only returns true for direct openai:* models and passthrough gateways
(mux-gateway), so the UI never advertises a no-op Fast/Slow override.
@ammar-agent
Collaborator Author

@codex review

Addressed the P2 (commit 8c7fdf9): supportsServiceTier() now only returns true for routes that actually forward providerOptions.openai.serviceTier:

  • direct openai:* models, and
  • passthrough gateways (mux-gateway, passthrough: true).

Non-passthrough gateways such as openrouter:openai/gpt-5 and github-copilot:* now return false, so the bolt is hidden and /fast and /slow are blocked for them (no silent no-op). This mirrors the backend resolveProviderOptionsNamespaceKey passthrough logic. Added unit tests for the openrouter (false), github-copilot (false), and mux-gateway (true) cases.
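The gating rule can be illustrated with the sketch below. The gateway table, the `provider:modelId` parsing, and the assumption that a passthrough gateway names upstream models as `openai/<model>` are all hypothetical; the real `supportsServiceTier` mirrors backend routing logic that is not shown in this thread.

```typescript
// Hypothetical registry of gateways and whether they forward provider options untouched.
const GATEWAYS: Record<string, { passthrough: boolean }> = {
  "mux-gateway": { passthrough: true },
  openrouter: { passthrough: false },
  "github-copilot": { passthrough: false },
};

function supportsServiceTierSketch(model: string): boolean {
  const sep = model.indexOf(":");
  if (sep < 0) return false; // not in "provider:modelId" form
  const provider = model.slice(0, sep);
  const modelId = model.slice(sep + 1);

  // Direct OpenAI models always forward providerOptions.openai.serviceTier.
  if (provider === "openai") return true;

  // Gateways qualify only if they pass provider options through unchanged...
  const gateway = GATEWAYS[provider];
  if (!gateway || !gateway.passthrough) return false;

  // ...and the upstream model is an OpenAI one (assumed "openai/<model>" id shape).
  return modelId.startsWith("openai/");
}
```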

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8c7fdf93cb

Comment thread src/browser/features/ChatInput/index.tsx
Addresses Codex P2: syncCreationPreferences() copied model/agent/thinking
from the project scope to the new workspace but not the service-tier override,
so a Fast/Slow choice made during creation reverted to Auto after the workspace
opened. Now copies getServiceTierKey(projectScope) -> getServiceTierKey(workspace),
mirroring thinking level. Extends the creation success test to assert the sync.
@ammar-agent
Collaborator Author

@codex review

Addressed the creation-flow P2 (commit 615aa48): syncCreationPreferences() now also copies the chat-specific service-tier override from the project scope to the new workspace scope (getServiceTierKey(projectScope) → getServiceTierKey(workspaceId)), mirroring how model/agent/thinking are carried over. So a Fast/Slow choice made during creation persists for subsequent messages in the new workspace instead of reverting to Auto. Extended the creation success test to assert the sync.
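The scope-to-scope copy can be pictured roughly as below. Everything here is a stand-in: the `KeyValueStore` interface approximates localStorage, the key format in `getServiceTierKeySketch` is invented, and the real `syncCreationPreferences()` goes through the app's persisted-state helpers rather than a raw store.

```typescript
// Minimal localStorage-like surface so the sketch runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Assumed key format; the real getServiceTierKey may build keys differently.
const getServiceTierKeySketch = (scopeId: string): string => `serviceTier:${scopeId}`;

// Carry a tier chosen before the workspace existed into the new workspace scope.
function copyServiceTierOverride(
  store: KeyValueStore,
  projectScope: string,
  workspaceId: string,
): void {
  const value = store.getItem(getServiceTierKeySketch(projectScope));
  // Nothing stored means Auto; leave the workspace scope untouched in that case.
  if (value !== null) store.setItem(getServiceTierKeySketch(workspaceId), value);
}

// In-memory store used only for illustration.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```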

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.

The /fast and /slow slash commands were standalone service-tier-only
one-shots that could not compose with the existing /<model>[+thinking]
one-shot syntax. Remove them entirely (parser, suggestions, tips, the
ParsedCommand serviceTier field, the index.tsx one-shot guard/merge, and
the command-key helpers) while keeping the bolt UI control and the
per-chat service-tier override plumbing.
@ammar-agent changed the title from "🤖 feat: add Fast/Slow service-tier control and /fast /slow one-shots" to "🤖 feat: add Fast/Slow service-tier control to chat input" on Jun 5, 2026
@ammar-agent
Collaborator Author

@codex review

Removed the /fast and /slow one-shot slash commands. They were standalone service-tier-only one-shots that didn't compose with the existing /<model>[+thinking] one-shot syntax. The lightning-bolt service-tier control and all per-chat override plumbing remain; this commit only reverts the slash-command surface (parser, suggestions, tips, the ParsedCommand.serviceTier field, the one-shot guard/merge in ChatInput, and the command-key helpers in serviceTier.ts).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f75162084

Comment thread src/browser/hooks/useSendMessageOptions.ts
Re-merge the persisted Fast/Slow tier against the effective (one-shot)
model at send time. useSendMessageOptions bakes the tier against the
saved model, so a /<model> one-shot to an OpenAI model previously dropped
a tier set while the saved model was non-OpenAI. withServiceTierOverride
is now authoritative: it attaches the tier for supported models and
strips any stale tier when the effective model can't honor it.
@ammar-agent
Collaborator Author

@codex review

Addressed the composition gap. The per-chat Fast/Slow tier is now re-merged against the effective (one-shot) model at send time in ChatInput.handleSend: when a /<model> one-shot is present, providerOptions is recomputed via withServiceTierOverride(...) against the override model, reading the persisted tier from useServiceTier. withServiceTierOverride is now authoritative — it attaches the tier for supported effective models and strips any stale tier when the effective model can't honor it (e.g. an OpenAI saved model + /haiku one-shot). Added a unit test for the strip case.
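The attach-or-strip behavior described here can be sketched as follows. This is an approximation under stated assumptions: the function name, the `ProviderOptions` shape, and the `isOpenAIDirect` predicate are hypothetical, and the actual `withServiceTierOverride` and `ChatInput.handleSend` wiring may differ.

```typescript
type WireTier = "priority" | "flex";
// Assumed shape: one options bag per provider namespace.
type ProviderOptions = Record<string, Record<string, unknown> | undefined>;

// Simplified stand-in for supportsServiceTier(model).
const isOpenAIDirect = (model: string): boolean => model.startsWith("openai:");

function remergeServiceTierSketch(
  providerOptions: ProviderOptions,
  persistedTier: WireTier | undefined,
  effectiveModel: string,
  supports: (model: string) => boolean = isOpenAIDirect,
): ProviderOptions {
  if (supports(effectiveModel)) {
    // Supported effective model: attach the persisted tier if one is set.
    if (persistedTier === undefined) return providerOptions;
    return {
      ...providerOptions,
      openai: { ...providerOptions.openai, serviceTier: persistedTier },
    };
  }
  // Unsupported effective model: drop any tier baked in against the saved model.
  if (providerOptions.openai?.serviceTier === undefined) return providerOptions;
  const openai: Record<string, unknown> = { ...providerOptions.openai };
  delete openai.serviceTier;
  return { ...providerOptions, openai };
}
```

In this sketch, options built for an OpenAI saved model and then sent with a non-OpenAI one-shot lose `serviceTier` but keep their other `openai` fields, and the input object is left untouched in both branches.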

@ammar-agent
Collaborator Author

@codex review

Friendly re-ping: the previous P2 (per-chat Fast/Slow tier not composing with /<model> one-shots) is fixed in commit c180454: ChatInput.handleSend now re-merges the persisted tier against the effective one-shot model via the authoritative withServiceTierOverride (attaches for supported effective models, strips a stale tier otherwise). The entire /fast and /slow slash-command surface has been removed. Please take another look.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Swish!
