fix(core): implement workspace grounding and exploration abort guards#4
fix(core): implement workspace grounding and exploration abort guards#4JessicaMulein wants to merge 1 commit into
Conversation
Ground implement/resume turns with lib/test snapshots and flutter test verification; abort ls/repetition/read-range dead ends; default heavy keep_alive to -1 so router models stay loaded between agent LLM calls. Pin cecli dev-integration (ReadRange int marker coerce). Co-authored-by: Cursor <cursoragent@cursor.com>
|
Your Typo free trial has ended. To continue receiving code reviews, please ask your admin to upgrade to a paid plan. |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Review Summary by QodoImplement workspace grounding and exploration abort guards with flutter test verification
WalkthroughsDescription• Implement workspace grounding: inject on-disk lib//test/ snapshot and next-action hints on implement/resume turns; run flutter test verification at turn end when applicable • Exploration dead-end guards: abort turns after repeated ls, ReadRange errors, repetition detection, or empty Ollama exploration with no edits; skip stall auto-continue when exploration was aborted • Model router keep-alive: default keep_alive_heavy to -1 (normalize legacy 0); expose Heavy keep-alive in Settings UI • Session/UI: release inflight SSE on done/error; Resume work UI in TodoPanel; resume prompt tells agent to use snapshot not ls • Edit failure recovery: auto-continue once after EditText failure with ReadRange guidance; abort after consecutive failures Diagramflowchart LR
A["Implement/Resume Turn"] -->|"Inject workspace snapshot"| B["Snapshot + Next Action"]
B -->|"Avoid ls loops"| C["ReadRange + EditText"]
C -->|"Test-focused checklist"| D["Run flutter test"]
E["Exploration Activity"] -->|"Track ls/ReadRange/repetition"| F["Abort Guards"]
F -->|"No auto-continue"| G["User Recovery Prompt"]
H["Model Router"] -->|"Normalize keep_alive=0 to -1"| I["Heavy Model Stays Loaded"]
J["EditText Failure"] -->|"Auto-continue once"| K["ReadRange Guidance"]
K -->|"Consecutive failures"| L["Abort Turn"]
File Changes1. bright_vision_core/agent_turn.py
|
Code Review by Qodo
1. buildResumeWorkMessage lacks attribution
|
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
|
Closing — merging directly to dev-integration (no internal PR workflow). |
| if (todoHasSpecLayers(item)) { | ||
| return ( | ||
| '/agent Continue the active task from where you stopped. ' + | ||
| 'A **workspace snapshot** is injected — do **not** ls, Grep, or GitStatus. ' + | ||
| 'Use ReadRange + EditText on the **Next action** file only. ' + | ||
| 'Do not reset completed checklist items; work the next incomplete item.' + | ||
| blockedNote | ||
| ) | ||
| } | ||
|
|
||
| return ( | ||
| '/agent Continue the active task checklist from where you stopped. ' + | ||
| 'Use ReadRange and EditText on the next incomplete item — ' + | ||
| 'do not repeat exploration unless necessary. Do not uncheck completed items.' + | ||
| blockedNote | ||
| ) |
There was a problem hiding this comment.
1. buildresumeworkmessage lacks attribution 📘 Rule violation ⚙ Maintainability
New user-facing agent guidance text describes BrightVision agent behavior (resume/implement workflow, workspace snapshot) but does not attribute the underlying agent system to Cecli as required. This can create incorrect product attribution in UI copy and docs.
Agent Prompt
## Issue description
User-facing copy added/modified in this PR describes the agent workflow (resume/implement, workspace snapshot, model router behavior) without attributing the underlying agent system to Cecli and clarifying BrightVision’s role as the Vision HTTP layer.
## Issue Context
PR Compliance ID 3 requires Cecli partnership attribution wherever agent capabilities are described to users.
## Fix Focus Areas
- src/todos/formatContext.ts[139-154]
- docs/TROUBLESHOOTING.md[277-283]
- bright_vision_core/implement_workspace.py[76-191]
- bright_vision_core/agent_turn.py[182-187]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| if (turnComplete && isTauriRuntime()) { | ||
| void invoke('cancel_vision_message') | ||
| } |
There was a problem hiding this comment.
2. Cross-turn sse cancel 🐞 Bug ☼ Reliability
On Tauri, the UI unblocks and may start a new send immediately after a done/error event, while the previous send() still runs its finally block and invokes cancel_vision_message. Because the Tauri cancel flag is stored as a single global slot, this cleanup can cancel the *next* turn’s SSE stream, truncating or dropping the response.
Agent Prompt
### Issue description
On desktop (Tauri), `cancel_vision_message` cancels whatever stream is currently stored in a single global `VisionMessageStreamState.cancel` slot. The React client now calls `releaseInflightAfterTurn()` on `done`/`error`, which can trigger a *new* send before the previous send’s `finally` runs; the previous send then calls `cancel_vision_message`, potentially cancelling the new stream.
### Issue Context
- `releaseInflightAfterTurn()` is executed on core `done/error` events and can lead to subsequent sends starting quickly.
- `visionApi.send()` calls `invoke('cancel_vision_message')` in `finally` when a turn completes.
- Tauri’s cancel mechanism uses a single shared `Option<Arc<AtomicBool>>` that is overwritten on every new `send_vision_message`.
### Fix Focus Areas
Implement one of the following robust fixes (preferred order):
1) **Make cancel scoped**: Change Tauri state to track cancel flags per **session_id** or a monotonic **stream_id** returned/known to JS, and update `cancel_vision_message` to accept that identifier and cancel only the matching stream.
2) **Prevent overlap**: Ensure the UI cannot start a new `send()` until the previous `send()` has fully finished its cleanup (separate “UI busy” from “network inflight”, or add a dedicated “sendInProgress” gate).
Fix locations:
- src/ipc/visionApi.ts[201-249]
- src/hooks/useVisionSession.ts[174-190]
- src/App.tsx[826-834]
- src-tauri/src/vision_message.rs[14-16]
- src-tauri/src/vision_message.rs[48-52]
- src-tauri/src/vision_message.rs[127-133]
### Acceptance checks
- Add an automated test or an e2e mock to simulate: turn N emits `done`, immediate second send starts, and verify the second stream is not cancelled by cleanup from turn N.
- Verify `cancelSend()` still cancels the current in-flight stream reliably.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
Summary
lib/+test/workspace snapshot and next-action hints on implement/resume turns; runflutter testverification at turn end when applicable.ls, ReadRange errors, repetition detection, or empty Ollama exploration with no edits; skip stall auto-continue when exploration was aborted.keep_alive_heavyto-1(normalize legacy0); expose Heavy keep-alive in Settings.ls.dev-integration(cecli-dev/cecli#561).Test plan
python -m pytest tests/core/test_implement_workspace.py tests/core/test_agent_turn.py tests/core/test_spec_focus.py tests/core/test_model_router.py tests/core/test_session_io_events.py -qyarn test src/theme/modelRouterPrefs.test.ts src/todos/formatContext.test.tslsloopsRelated
Made with Cursor