🤖 feat: add workflow visibility surfaces#3495

Open
ThomasK33 wants to merge 5 commits into main from workflows-azg3

Conversation

@ThomasK33

Member

Summary

Adds workflow visibility surfaces for Dynamic Workflows: a shared browser workflow store, a right-sidebar Workflows tab, top-bar workflow indicator, and chat-card store integration. Also hardens the implementation against the deep-review findings by gating the UI behind the Dynamic Workflows experiment, preserving state on transient API failures, restarting polling after resubscribe, discovering externally-created runs, and limiting retry/run actions to supported flows.

Implementation

  • Added a shared workflow store with active-run polling, low-frequency workspace refresh polling, queued invalidations, error-preserving snapshot refreshes, and stable external-store subscriptions.
  • Added/registered Workflows sidebar and top-bar indicator surfaces behind the Dynamic Workflows feature flag.
  • Defaulted sidebar starts to background runs so Current runs can show progress immediately, with duplicate-start protection while requests are pending.
  • Reused checkpoint retry eligibility in the sidebar and invalidated the shared store from slash commands and workflow card actions.
  • Fixed scratch/project definition precedence in the workflow definition scanner.
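
As an illustration of the error-preserving snapshot refresh mentioned above, here is a minimal sketch. It assumes the store keeps the last good data next to the latest error; `Snapshot`, `RefreshResult`, and `applyRefreshResult` are hypothetical names, not the identifiers used in this PR.

```typescript
// Hypothetical shapes for illustration; not the PR's actual store types.
interface Snapshot<T> {
  data: T | null; // last successfully loaded value
  error: string | null; // most recent refresh error, if any
}

type RefreshResult<T> =
  | { status: "ok"; value: T }
  | { status: "error"; message: string };

// Fold one refresh outcome into the snapshot. A failed refresh records the
// error but keeps the previous data, so a transient API failure does not
// blank out what the UI is already showing.
function applyRefreshResult<T>(previous: Snapshot<T>, result: RefreshResult<T>): Snapshot<T> {
  if (result.status === "ok") {
    return { data: result.value, error: null };
  }
  return { data: previous.data, error: result.message };
}
```

Under this sketch, a successful refresh clears the error, so a consumer can show a stale-data hint only while `error` is non-null.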

Validation

  • bun test src/browser/utils/commands/sources.test.ts src/browser/features/Workflows/WorkflowStore.test.ts src/browser/features/Workflows/WorkflowsTab.test.tsx src/browser/features/Workflows/workflowStatusPresentation.test.ts src/browser/components/WorkflowIndicator/WorkflowIndicator.test.tsx
  • make typecheck
  • make static-check
  • Dogfooding artifacts from the feature implementation are in dogfood-output/workflows-ui/ with screenshots/videos for empty state, definition discovery, live foreground/background runs, failure surfacing, scratch promotion, and narrow layout checks.

Risks

Workflow visibility is experiment-gated and uses polling rather than a backend subscription. The main regression risk is extra local API traffic while the Workflows tab/indicator/card is mounted; the shared store deduplicates run polling and tears down timers when subscribers leave.


📋 Implementation Plan

Workflow Visibility UX Plan

Goal

Improve workflow visibility in Mux so users can discover available workflows, understand where they come from, and monitor live workflow execution outside the chat transcript. The design should preserve chat as the canonical chronological record while adding a dashboard-style right-sidebar tab and a glanceable top-bar indicator.

Evidence and constraints

  • Right-sidebar tabs use a two-layer architecture: static metadata in src/browser/features/RightSidebar/Tabs/tabConfig.ts, React labels/panels in src/browser/features/RightSidebar/Tabs/tabRegistry.tsx. Keep React imports out of tabConfig.ts.
  • GoalTab is the closest UX precedent: current/active content appears first, with secondary items organized into collapsible board sections below. Workflows need a multi-active variant because multiple runs can be pending/running/backgrounded at once.
  • Workflow definitions have four scopes: project, global, built-in, and scratch.
  • Workflow run statuses are pending, running, backgrounded, interrupted, completed, and failed.
  • Current workflow cards refresh live state with local polling in WorkflowRunToolCall.tsx; there is no verified workflow-specific browser store or oRPC subscription today.
  • featureFlag and keepAlive in tab config are metadata only for this task unless explicit runtime behavior is added and verified.
  • Existing scratch promotion support is limited to APIs already present (promoteScratch, promoteScratchDefinition); do not imply broader definition management like rename/delete/edit unless separately verified.

Recommendation

Build the feature as a three-surface system:

  1. Chat workflow cards remain the canonical execution transcript and detailed event/log surface.
  2. New Workflows right-sidebar tab becomes the discovery, summary, and control surface.
  3. New top-bar Workflows indicator becomes the glanceable status entry point near the existing Skills indicator.

Recommended implementation approach: Full phased system with shared store, sidebar tab, then top-bar indicator.

  • Net product LoC estimate: 1,200–1,800 lines.
  • Test/story LoC estimate, not counted as product code: 500–900 lines.
  • Why this approach: it directly solves workflow invisibility, avoids duplicated polling across chat/sidebar/top-bar, and preserves chat context.

Alternative implementation approaches:

| Approach | Product LoC estimate | Pros | Cons | Recommendation |
| --- | --- | --- | --- | --- |
| Sidebar-only MVP | 650–1,000 | Fastest user-visible improvement; definitions/runs become discoverable. | Top-bar still lacks glanceable workflow status; likely repeats polling unless store still built. | Acceptable first slice only if timeboxed. |
| Full phased system with shared store | 1,200–1,800 | Best UX and performance path; supports chat/sidebar/top-bar consistently. | More moving parts; requires careful store tests. | Recommended. |
| Backend subscription-first system | 1,600–2,400 | Best long-term live-state model; avoids polling. | Larger backend/API scope; unnecessary to prove the UX. | Defer until polling store reveals real pain. |

Information architecture

Workflows sidebar tab

Workflows
├─ Header summary
│  ├─ active run count
│  ├─ highest-severity badge
│  └─ Run workflow action
├─ Current runs
│  ├─ pending / running / backgrounded
│  └─ interrupted / failed needing attention
├─ Available definitions
│  ├─ Project
│  ├─ Global
│  ├─ Built-in
│  └─ Scratch
└─ History
   ├─ Failed / interrupted recent runs
   └─ Completed recent runs

Sidebar cards should show name, scope/source, status, elapsed time, latest meaningful event/phase, and the next supported action. Keep detailed logs behind an explicit disclosure, dialog, or “View in chat” jump.

Top-bar Workflows indicator

Add a sibling indicator near SkillIndicator, not inside it. Skills are capabilities; workflows are capabilities plus live processes, so urgent workflow state should not be hidden in the skills popover.

Indicator behavior:

  • No active/problem runs: quiet icon or hidden-at-narrow-width state.
  • Active runs: badge with count.
  • Failed/interrupted runs: warning/error badge takes precedence.
  • Popover sections: Active runs, recent failures/interrupted runs, available definitions by scope, and “Open Workflows tab”.
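
The badge precedence described above can be sketched as a pure function over run statuses. This is only an illustration: `IndicatorBadge` and `indicatorBadge` are hypothetical names, and showing a count on the warning/error badge is an assumption rather than something the plan specifies.

```typescript
type RunStatus = "pending" | "running" | "backgrounded" | "interrupted" | "completed" | "failed";

// Hypothetical badge model for illustration only.
type IndicatorBadge =
  | { kind: "quiet" }
  | { kind: "active"; count: number }
  | { kind: "warning"; count: number }
  | { kind: "error"; count: number };

function indicatorBadge(statuses: readonly RunStatus[]): IndicatorBadge {
  let active = 0;
  let interrupted = 0;
  let failed = 0;
  for (const status of statuses) {
    if (status === "failed") failed += 1;
    else if (status === "interrupted") interrupted += 1;
    else if (status !== "completed") active += 1; // pending, running, backgrounded
    // Completed runs never contribute to the badge.
  }
  // Failed outranks interrupted, which outranks plain activity.
  if (failed > 0) return { kind: "error", count: failed };
  if (interrupted > 0) return { kind: "warning", count: interrupted };
  if (active > 0) return { kind: "active", count: active };
  return { kind: "quiet" };
}
```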

Shared status severity

| Status | Severity | UI treatment |
| --- | --- | --- |
| failed | error | Highest priority; red/error badge. |
| interrupted | warning | Needs attention; amber/warning badge. |
| running | active | Success/active cue. |
| backgrounded | active-background | Active-neutral or amber with background badge. |
| pending | pending | Neutral/subtle active cue. |
| completed | terminal | Muted in history; do not dominate active indicator. |

Use a shared workflow-specific helper rather than exporting WorkflowRunToolCall.tsx’s local toToolStatus() directly.
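
One possible shape for such a helper is sketched below. The six statuses and the severity names come from the table above; the identifiers (`SEVERITY_BY_STATUS`, `severityForStatus`) and the choice of `terminal` as the fallback for unrecognized persisted values are assumptions made for illustration.

```typescript
type WorkflowStatus = "pending" | "running" | "backgrounded" | "interrupted" | "completed" | "failed";

type WorkflowSeverity = "error" | "warning" | "active" | "active-background" | "pending" | "terminal";

// Record<WorkflowStatus, ...> makes the compiler reject this table if a known
// status is missing, which gives the exhaustive mapping at build time.
const SEVERITY_BY_STATUS: Record<WorkflowStatus, WorkflowSeverity> = {
  failed: "error",
  interrupted: "warning",
  running: "active",
  backgrounded: "active-background",
  pending: "pending",
  completed: "terminal",
};

function isWorkflowStatus(value: string): value is WorkflowStatus {
  return Object.prototype.hasOwnProperty.call(SEVERITY_BY_STATUS, value);
}

// Accepts a plain string because persisted data may contain values this
// build does not know. The "terminal" fallback is an assumed safe default.
function severityForStatus(status: string): WorkflowSeverity {
  return isWorkflowStatus(status) ? SEVERITY_BY_STATUS[status] : "terminal";
}
```

Typing the input as `string` keeps the runtime path total, while the `Record` type keeps the development-time check strict.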

Phased implementation plan

Phase 0 — Verify event shape and define shared presentation

Files/subsystems to inspect or update:

  • src/common/orpc/schemas/workflow.ts
  • src/browser/features/Tools/WorkflowRunToolCall.tsx
  • new shared helper near workflow UI code, e.g. src/browser/features/Workflows/workflowStatusPresentation.ts

Tasks:

  1. Confirm exact event fields needed for “latest meaningful event” summaries.
  2. Add workflow status/severity presentation helpers.
  3. Add debug assertions for impossible/unknown statuses in helper tests while keeping runtime fallback safe for persisted older data.

Acceptance criteria:

  • All six known workflow statuses map exhaustively to severity and display labels.
  • Unknown persisted status values do not crash the UI, but tests/assertions catch missing known statuses during development.
  • Existing workflow chat cards keep their current behavior.

Quality gate:

  • Run targeted unit tests for the presentation helper.
  • Run make typecheck or a narrower typecheck target if available.

Phase 1 — Add a shared browser workflow store

Likely files/subsystems:

  • new src/browser/stores/WorkflowStore.ts or equivalent colocated workflow store
  • hooks such as useWorkflowSummary(workspaceId), useWorkflowRuns(workspaceId), useWorkflowRun(workspaceId, runId)
  • existing oRPC calls: workflows.listRuns, workflows.getRun, workflows.listDefinitions

Tasks:

  1. Implement workspace-level run/definition snapshot loading.
  2. Implement run-detail polling for active statuses using existing 2s cadence.
  3. Deduplicate polling by workspaceId and runId.
  4. Poll only while subscribers exist.
  5. Expose narrow selectors for tab label/top-bar usage: active count, problem count, highest severity, grouped definitions.
  6. Add defensive assertions around required IDs and store key construction.

Acceptance criteria:

  • New sidebar/top-bar consumers share one store-backed polling loop per active run; full chat/sidebar/top-bar dedupe is not claimed until Phase 4 integrates chat cards with the store.
  • Store subscriptions are scoped: summary consumers do not receive every event-log detail update unless the summary changes.
  • Definitions are grouped by all four scopes.
  • Store teardown clears timers/subscriptions when no consumers remain.

Quality gate:

  • Unit tests for deduped polling, teardown, severity aggregation, and scope grouping.
  • Instrument or test a fake client to ensure duplicate subscribers do not duplicate getRun polling.
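
The dedupe and teardown requirements of this phase can be sketched as a reference-counted registry keyed by workspaceId and runId. This is a simplified illustration under stated assumptions, not the store added in this PR: `RunPollingRegistry`, `Scheduler`, and the injected `pollRun` callback are hypothetical, and the real store would call `workflows.getRun` and publish snapshots rather than invoke a bare callback.

```typescript
type Unsubscribe = () => void;

// Injected so tests can replace timers with a fake (hypothetical interface).
interface Scheduler {
  start(callback: () => void, intervalMs: number): () => void; // returns a stop function
}

const realScheduler: Scheduler = {
  start(callback, intervalMs) {
    const handle = setInterval(callback, intervalMs);
    return () => clearInterval(handle);
  },
};

class RunPollingRegistry {
  private readonly entries = new Map<string, { subscribers: number; stop: () => void }>();
  private readonly pollRun: (workspaceId: string, runId: string) => void;
  private readonly scheduler: Scheduler;
  private readonly intervalMs: number;

  constructor(
    pollRun: (workspaceId: string, runId: string) => void,
    scheduler: Scheduler = realScheduler,
    intervalMs = 2000, // the existing 2s cadence named in the tasks above
  ) {
    this.pollRun = pollRun;
    this.scheduler = scheduler;
    this.intervalMs = intervalMs;
  }

  subscribe(workspaceId: string, runId: string): Unsubscribe {
    // Defensive assertion on required IDs before building the store key.
    if (workspaceId.length === 0 || runId.length === 0) {
      throw new Error("workspaceId and runId are required");
    }
    const key = JSON.stringify([workspaceId, runId]);
    let entry = this.entries.get(key);
    if (entry === undefined) {
      // First subscriber for this run starts the only polling loop.
      const stop = this.scheduler.start(() => this.pollRun(workspaceId, runId), this.intervalMs);
      entry = { subscribers: 0, stop };
      this.entries.set(key, entry);
    }
    entry.subscribers += 1;

    let released = false;
    return () => {
      if (released) return; // unsubscribing twice must not over-decrement
      released = true;
      const current = this.entries.get(key);
      if (current === undefined) return;
      current.subscribers -= 1;
      if (current.subscribers === 0) {
        current.stop(); // last subscriber left: tear down the timer
        this.entries.delete(key);
      }
    };
  }

  activeLoopCount(): number {
    return this.entries.size;
  }
}
```

Injecting the scheduler is what lets a fake-client test count polling loops without waiting on real timers.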

Phase 2 — Build Workflows sidebar tab MVP

Likely files/subsystems:

  • src/browser/features/RightSidebar/Tabs/tabConfig.ts
  • src/browser/features/RightSidebar/Tabs/tabRegistry.tsx
  • new src/browser/features/RightSidebar/WorkflowsTab.tsx or src/browser/features/Workflows/WorkflowsTab.tsx
  • optional WorkflowsTabLabel in TabLabels.tsx or a colocated label imported by the registry

Tasks:

  1. Add a workflows static tab metadata entry without React imports in tabConfig.ts.
  2. Register label and panel renderers in tabRegistry.tsx.
  3. Render header summary, current runs, definitions grouped by scope, and recent history.
  4. Keep completed history collapsed/truncated by default.
  5. Provide supported actions only: run/start where existing UI/API supports it, interrupt/resume/retry where supported by run state, scratch promotion where existing APIs apply, and “View in chat” where a chat anchor can be resolved.
  6. Add empty states for no definitions, no active runs, and no history.

Acceptance criteria:

  • User can see available workflows and their sources/scopes in the right sidebar.
  • Active/backgrounded/interrupted/failed runs are visible above definitions/history.
  • Sidebar never becomes a full log dump by default.
  • The tab works when API context is unavailable in isolated stories/tests by rendering a non-crashing empty or unavailable state.
  • Keyboard access exists for open tab, run workflow, and primary run actions; shortcuts should not be shown on mobile views.

Quality gate:

  • Component tests for grouping, empty states, severity ordering, and current-run sorting.
  • Storybook stories or UI fixtures for empty, many definitions, active run, failed run, backgrounded run, and narrow width.
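
The scope grouping that the component tests would exercise could look roughly like the sketch below. The scope strings mirror the four scopes named in this plan, but their exact spelling in the schema (for example `built-in`) is an assumption, as are `DefinitionSummary`, `groupDefinitionsByScope`, and the choice to sort by name within each group.

```typescript
// Scope spellings are assumed; check the real workflow schema before reuse.
type DefinitionScope = "project" | "global" | "built-in" | "scratch";

interface DefinitionSummary {
  name: string;
  scope: DefinitionScope;
}

const SCOPE_ORDER: readonly DefinitionScope[] = ["project", "global", "built-in", "scratch"];

// Always returns all four groups, so empty scopes can render an empty state.
function groupDefinitionsByScope(
  definitions: readonly DefinitionSummary[],
): Record<DefinitionScope, DefinitionSummary[]> {
  const groups: Record<DefinitionScope, DefinitionSummary[]> = {
    project: [],
    global: [],
    "built-in": [],
    scratch: [],
  };
  for (const definition of definitions) {
    groups[definition.scope].push(definition);
  }
  for (const scope of SCOPE_ORDER) {
    // Sorting by name is an assumed presentation choice.
    groups[scope].sort((a, b) => a.name.localeCompare(b.name));
  }
  return groups;
}
```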

Phase 3 — Add top-bar Workflows indicator

Likely files/subsystems:

  • src/browser/components/WorkspaceMenuBar/WorkspaceMenuBar.tsx
  • new src/browser/components/WorkflowIndicator/WorkflowIndicator.tsx or colocated component
  • shared workflow store hooks from Phase 1

Tasks:

  1. Add a compact indicator near SkillIndicator, respecting menu-bar density and platform window-control spacing.
  2. Use summary selector only; do not pass full workflow arrays through WorkspaceMenuBar.
  3. Popover includes active runs, recent failures/interrupted runs, grouped definitions, and “Open Workflows tab”.
  4. Add responsive behavior for narrow widths.

Acceptance criteria:

  • User can tell from the top bar when workflows are active or need attention.
  • Popover exposes workflow definitions and sources without crowding the main bar.
  • Indicator does not overlap or make existing controls unclickable on narrow layouts or Windows/Linux titlebar-control layouts.
  • Clicking “Open Workflows tab” focuses the sidebar tab.

Quality gate:

  • Component tests for badge priority, active counts, quiet state, and popover grouping.
  • Visual stories/Chromatic snapshots for normal and narrow menu bars.

Phase 4 — Integrate chat cards with the shared store where safe

Likely files/subsystems:

  • src/browser/features/Tools/WorkflowRunToolCall.tsx
  • workflow store from Phase 1

Tasks:

  1. Replace or wrap component-local polling with store-backed run detail subscriptions.
  2. Preserve existing chat-card behavior: auto-collapse completed runs, promotion controls, task/event expansion, and dialogs.
  3. Keep chat cards as the canonical detailed execution surface.

Acceptance criteria:

  • No user-visible regression in workflow chat cards.
  • Existing workflow card tests still pass.
  • When chat integration is complete, one run visible in chat, sidebar, and top bar shares the store-backed polling path instead of each surface owning independent polling.

Quality gate:

  • Existing WorkflowRunToolCall tests pass.
  • Targeted regression test proves one shared polling path when chat and sidebar both observe the same run.

Dogfooding and self-verification plan

Use the project dev-server-sandbox flow so dogfooding does not interfere with the user’s normal Mux data.

Setup:

make dev-server-sandbox DEV_SERVER_SANDBOX_ARGS="--clean-projects"

Then use the assigned Vite URL from the sandbox output, normally http://localhost:<VITE_PORT>, and drive the UI with the direct agent-browser binary, not npx.

Before running browser automation, load the installed agent-browser command guidance and the dogfood workflow guidance so commands match the local version:

agent-browser skills get core
agent-browser skills get dogfood

Prepare a dogfood report from the dogfood template and read the issue taxonomy/checklist before exploring. If using the Mux skill files, copy templates/dogfood-report-template.md into the output directory and read references/issue-taxonomy.md.

Dogfood artifacts directory:

mkdir -p dogfood-output/workflows-ui/screenshots dogfood-output/workflows-ui/videos

Required dogfood scenarios:

  1. Empty state
    • Open Mux in the sandbox.
    • Open the Workflows tab.
    • Capture annotated screenshot of no active workflows / available definitions state.
  2. Definition discovery
    • Seed or open a workspace with project/global/built-in/scratch workflow definitions.
    • Capture screenshot showing grouped definitions and source/scope labels.
  3. Foreground active run
    • Start a workflow.
    • Record video showing the workflow appearing in chat, the sidebar Current Runs section, and the top-bar indicator.
    • Capture screenshots at start, mid-run, and completion.
  4. Backgrounded run
    • Start or background a workflow if supported by the existing workflow UX.
    • Capture top-bar/sidebar state and verify severity/count presentation.
  5. Interrupted/failed run
    • Reproduce an interrupted or failed workflow using a safe test workflow.
    • Record video showing warning/error surfacing in the indicator and sidebar.
  6. Scratch promotion
    • Run or create a scratch workflow.
    • Capture the promotion affordance and result after promoting to project/global.
  7. Narrow layout
    • Resize to a narrow width and capture screenshot/video that menu-bar controls remain clickable and the indicator degrades gracefully.
  8. Dogfood report package
    • Update the report summary counts.
    • Attach screenshots/videos for each verified dogfood claim.
    • Close the browser session after the report is complete.

For interactive issues discovered during dogfooding, follow the dogfood skill evidence standard: reproduce once, then record a watchable repro video with step-by-step screenshots before filing the issue. Static visual issues need at least one annotated screenshot. Final implementation handoff should attach the screenshots and videos so reviewers can validate the claims.

Validation plan

Run validation in this order while implementing:

  1. Targeted unit/component tests for new helpers/store/components.
  2. Store polling instrumentation or fake-client tests proving duplicate subscribers do not duplicate listRuns/getRun polling for the same key.
  3. make typecheck.
  4. make lint or make static-check if the touched area warrants full validation.
  5. Storybook/visual validation for sidebar/menu-bar states if stories are added.
  6. Black-box dogfood scenarios above with screenshots and videos.

Risks and mitigations

| Risk | Mitigation |
| --- | --- |
| Sidebar becomes too dense and duplicates chat logs. | Summary-first cards; detailed logs behind disclosure/dialog/chat jump. |
| Multiple surfaces duplicate polling. | Build shared store before top-bar/sidebar live rendering; test dedupe. |
| Tab config runtime assumptions are wrong. | Only add static metadata there; wire actual React UI in registry; do not rely on existing keepAlive/featureFlag behavior. |
| Top-bar gets crowded. | Quiet default state, compact badge, responsive hiding/overflow behavior, narrow-layout dogfood. |
| Unsupported workflow management slips in. | Limit actions to verified APIs; follow-up issue for rename/delete/edit if desired. |
| Unknown persisted workflow data crashes UI. | Exhaustive known-status tests plus safe runtime fallback for unrecognized persisted values. |

Advisor review status

Advisor review fully approved the final plan. Required clarifications from the first review were applied and re-reviewed: Phase 1/Phase 4 dedupe scope, dogfood setup, black-box dogfood separation from implementation validation, and approval status. The second advisor review reported no remaining blockers or required changes.


Generated with mux • Model: openai:gpt-5.5 • Thinking: xhigh • Cost: 1223199{MUX_COSTS_USD:-0}

ThomasK33 added 2 commits June 8, 2026 16:56
Implement shared workflow visibility state, Workflows sidebar, topbar indicator, chat-card integration, workflow definition precedence, and tests.

---

_Generated with `mux` • Model: `openai:gpt-5.5` • Thinking: `xhigh` • Cost: `$207.49`_

<!-- mux-attribution: model=openai:gpt-5.5 thinking=xhigh costs=207.49 -->
@ThomasK33

Member Author

@codex review

@chatgpt-codex-connector


Codex Review: Didn't find any major issues. Breezy!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ThomasK33

Member Author

@codex review

@chatgpt-codex-connector


Codex Review: Didn't find any major issues. More of your lovely PRs please.


@ThomasK33

Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8400b636ad


Comment thread src/browser/App.tsx Outdated
@ThomasK33

Member Author

@codex review

Please take another look. I preserved all enabled right-sidebar feature flags for command palette tab visibility, including Browser and Desktop.

@chatgpt-codex-connector


Codex Review: Didn't find any major issues. 🎉


1 participant