Commit edccfad

fix: DE-135 - resilience to / better errors for bad frontmatter
1 parent 516ec0b commit edccfad

20 files changed

Lines changed: 1486 additions & 30 deletions

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
1+
---
2+
id: ISSUE-053
3+
name: Codex preboot context not loaded via AGENTS.md @path includes
4+
created: "2026-04-26"
5+
updated: "2026-04-26"
6+
status: open
7+
kind: issue
8+
categories: []
9+
severity: p3
10+
impact: user
11+
---
12+
13+
# Codex preboot context not loaded via AGENTS.md @path includes
14+
15+
## Summary
16+
17+
Codex sessions in this repo do not receive the generated spec-driver boot
18+
context at startup. The `# Spec-Driver Boot Context` heading is absent from
19+
the model's preloaded context, so `/boot` validation fails unless the agent
20+
explicitly reads the boot file.
21+
22+
## Observed behaviour
23+
24+
- Root `AGENTS.md` contains only two lines:
25+
  - `@.spec-driver/AGENTS.md`
26+
  - `@.spec-driver/agents/boot.md`
27+
- Reported from a sibling spec-driver repo: in a Codex session those
28+
`@path` lines surfaced **literally** in model context rather than being
29+
expanded into the referenced files' contents. Behaviour in this repo is
30+
expected to match (same AGENTS shape) but has not been independently
31+
reproduced here.
32+
- `.agents/spec-driver-boot.md` exists and contains the expected
33+
`# Spec-Driver Boot Context` heading (generated via
34+
`spec-driver admin preboot`), but it is not referenced from `AGENTS.md`,
35+
so Codex never sees it.
36+
37+
## Why this matters
38+
39+
- The `/boot` skill instructs agents to validate by checking for the
40+
`Spec-Driver Boot Context` heading and to print `BOOT ERROR !!!` when
41+
missing. On Codex this validation will trip on every session.
42+
- Without the preboot bundle, agents start without doctrine, glossary,
43+
workflow stance, accepted ADRs, required policies/standards, and routing
44+
rules — the exact context the project relies on for correct routing.
45+
46+
## Root cause hypothesis
47+
48+
Codex AGENTS discovery (per public docs) reads `AGENTS.md` /
49+
`AGENTS.override.md` along the instruction chain, but does not document
50+
recursive `@path` include expansion. Claude Code expands `@path`; Codex
51+
appears not to. Adding more `@path` lines (e.g. `@.agents/spec-driver-boot.md`)
52+
will likely surface as literal text in Codex too.
53+
54+
## Candidate fixes
55+
56+
1. **Inline the preboot bundle** into a Codex-visible AGENTS file
57+
(e.g. make `AGENTS.md` or `.spec-driver/AGENTS.md` contain the generated
58+
boot context literally, regenerated by `spec-driver admin preboot`).
59+
2. **Codex-specific boot fallback**: have the `/boot` skill detect the
60+
missing heading and explicitly read `.agents/spec-driver-boot.md` before
61+
warning.
62+
3. **Harness-aware preboot output**: extend `spec-driver admin preboot` to
63+
emit a Codex-compatible artefact (inlined AGENTS content) alongside the
64+
existing symlink/file used by Claude Code.
65+
66+
Option 1 is simplest but couples agent-harness ergonomics into a single
67+
file; option 3 keeps harness adapters explicit. Decide before scoping a
68+
delta.
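
Option 1 could look roughly like the sketch below. `inline_includes` is a hypothetical helper, not an existing spec-driver function: it replaces each `@path` line with the referenced file's contents so that a harness without include expansion still sees the text. The one-level (non-recursive) expansion and the choice to leave unresolvable lines untouched are assumptions made for illustration.

```python
from pathlib import Path


def inline_includes(agents_text: str, root: Path) -> str:
    """Replace each `@relative/path` line with that file's contents.

    Hypothetical helper: expands one level only and leaves a line
    untouched when the referenced file does not exist.
    """
    out = []
    for line in agents_text.splitlines():
        stripped = line.strip()
        # Only lines that start with "@" are treated as include directives.
        target = root / stripped[1:] if stripped.startswith("@") else None
        if target is not None and target.is_file():
            out.append(target.read_text(encoding="utf-8").rstrip("\n"))
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Keeping unresolved `@path` lines in place means a missing file stays visible in the output rather than disappearing silently.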
69+
70+
## References
71+
72+
- `AGENTS.md` (root) — current `@path`-only contents
73+
- `.agents/spec-driver-boot.md` — generated preboot bundle
74+
- `.spec-driver/skills/boot/SKILL.md` — boot validation logic
75+
- `.spec-driver/agents/boot.md`: `/boot` invocation directive
76+
- Codex AGENTS docs: https://developers.openai.com/codex/guides/agents-md
77+
- Source: report from sibling spec-driver repo (memory
78+
`mem.fact.codex.agents.preboot-include` in that repo); not yet
79+
reproduced here.
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
---
2+
id: ISSUE-054
3+
name: list deltas dumps Rich traceback when a phase file has invalid YAML frontmatter
4+
created: "2026-05-01"
5+
updated: "2026-05-01"
6+
status: open
7+
kind: issue
8+
categories: []
9+
severity: p3
10+
impact: user
11+
---
12+
13+
# list deltas dumps Rich traceback when a phase file has invalid YAML frontmatter
14+
15+
## Symptom
16+
17+
`spec-driver list deltas` (and likely siblings) crashes with a full Rich
18+
traceback when any phase file under a delta has a YAML parse error in its
19+
frontmatter. The user-facing output is a stack trace ending in
20+
`yaml.scanner.ScannerError: mapping values are not allowed in this context`
21+
with no indication of which file or line is at fault.
22+
23+
## Root cause
24+
25+
`load_change_artifact` (`supekku/scripts/lib/changes/artifacts.py:180`) wraps
26+
the per-phase-file `load_markdown_file` call in
27+
`except (ValueError, OSError)`. PyYAML raises `yaml.YAMLError` (e.g.
28+
`ScannerError`), which is not a subclass of either, so the parse error
29+
escapes the per-file guard.
30+
31+
The surrounding `ChangeRegistry.collect()` catch
32+
(`supekku/scripts/lib/changes/registry.py:84-87`) is also `except ValueError`,
33+
so the YAML error propagates all the way out to the CLI handler and is
34+
rendered as a traceback rather than a friendly message.
35+
36+
The same pattern likely affects sibling loaders (specs, decisions,
37+
backlog) wherever `load_markdown_file` is called inside `except ValueError`.
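
The escape can be reproduced without PyYAML. In the sketch below, `StandInYAMLError` is a hypothetical stand-in that, like `yaml.YAMLError`, derives directly from `Exception`. `load_phase` and `guarded_load` are illustrative names, not the real loaders.

```python
class StandInYAMLError(Exception):
    """Stand-in for yaml.YAMLError, which also subclasses Exception directly."""


def load_phase(path: str) -> dict:
    # Simulates a frontmatter parse failure inside the loader.
    raise StandInYAMLError(f"{path}: mapping values are not allowed in this context")


def guarded_load(path: str, extra_errors: tuple = ()) -> str:
    """Mirror the per-file guard; `extra_errors` widens what it catches."""
    try:
        load_phase(path)
    except (ValueError, OSError, *extra_errors) as exc:
        return f"skipped: {exc}"
    return "loaded"
```

With the default guard the error propagates to the caller. Passing the YAML error type in `extra_errors` is enough for the guard to catch it.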
38+
39+
## Expected
40+
41+
- A clear, actionable message identifying the offending file and ideally the
42+
line/column of the YAML error (consistent with PROD-010.FR-010).
43+
- The command should skip the bad file and continue listing the rest, or fail
44+
fast with a clear error — but never dump a Python traceback.
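
The skip-and-continue option might be sketched as follows. `FrontmatterError` and `collect` are hypothetical names introduced here. The sketch assumes the real loader would translate a YAML parse error into such an exception, for example from the `problem_mark` line and column that PyYAML's marked errors carry.

```python
from __future__ import annotations

from typing import Callable, Iterable


class FrontmatterError(Exception):
    """Hypothetical error carrying the location of a frontmatter problem."""

    def __init__(self, path: str, line: int, column: int, problem: str) -> None:
        super().__init__(f"{path}:{line}:{column}: {problem}")
        self.path = path
        self.line = line
        self.column = column
        self.problem = problem


def collect(
    paths: Iterable[str], load: Callable[[str], dict]
) -> tuple[list[dict], list[str]]:
    """Load every path, skipping bad files and recording one diagnostic each."""
    loaded: list[dict] = []
    diagnostics: list[str] = []
    for path in paths:
        try:
            loaded.append(load(path))
        except FrontmatterError as exc:
            # One line per bad file; the listing continues with the rest.
            diagnostics.append(f"warning: skipping {exc}")
    return loaded, diagnostics
```

Returning the diagnostics instead of printing them leaves the choice between warn-and-continue and fail-fast to the CLI handler.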
45+
46+
## Repro
47+
48+
1. Edit any `.spec-driver/deltas/DE-XXX/phases/phase-0N.md` so its frontmatter
49+
contains a YAML parse error (e.g. an unquoted colon inside a value).
50+
2. Run `spec-driver list deltas`.
51+
3. Observe Rich traceback instead of a friendly diagnostic.
Lines changed: 154 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,154 @@
1+
---
2+
id: DE-134
3+
slug: subagent_worktree_base_ref_alignment_and_compensation_defences
4+
name: Delta - Subagent worktree base-ref alignment and compensation defences
5+
created: "2026-04-26"
6+
updated: "2026-04-26"
7+
status: draft
8+
kind: delta
9+
aliases: []
10+
relations:
11+
  - type: relates_to
12+
    target: DE-132
13+
  - type: relates_to
14+
    target: DE-133
15+
context_inputs:
16+
  - type: brief
17+
    ref: ./brief.md
18+
    summary: "Source brief — three-layer defence for subagent worktree isolation (stale-fork + silent-compensation)."
19+
  - type: reference
20+
    ref: supekku/agents/dispatch-worker.md
21+
    summary: "Existing managed subagent that uses isolation: worktree."
22+
  - type: reference
23+
    ref: supekku/claude.settings.json
24+
    summary: "Hook registration template installed to .claude/settings.json."
25+
  - type: reference
26+
    ref: supekku/claude.hooks/
27+
    summary: "Source for installable hook scripts (currently startup.sh, artifact_event.py)."
28+
applies_to:
29+
  specs: []
30+
  requirements: []
31+
---
32+
33+
# DE-134 – Subagent worktree base-ref alignment and compensation defences
34+
35+
```yaml supekku:delta.relationships@v1
36+
schema: supekku.delta.relationships
37+
version: 1
38+
delta: DE-134
39+
revision_links:
40+
  introduces: []
41+
  supersedes: []
42+
specs:
43+
  primary: []
44+
  collaborators: []
45+
requirements:
46+
  implements: []
47+
  updates: []
48+
  verifies: []
49+
phases: []
50+
```
51+
52+
## 1. Summary & Context
53+
54+
- **Source brief**: [brief.md](./brief.md) — three-layer defence for subagent worktree isolation.
55+
- **Implementation Plan**: [IP-134](./IP-134.md)
56+
- **Design Revision**: [DR-134](./DR-134.md)
57+
- **Change Drivers**: Empirical incident in a sibling spec-driver project where an `isolation: worktree` subagent forked from `origin/main` (38 commits behind the supervisor's local `main`), then silently compensated by re-staging files from elsewhere. Result: a branch unmergeable against trunk and only diagnosable by inspecting the diff base.
58+
- **Related deltas**: DE-132 (sub-agent orchestration / `/dispatch`), DE-133 (installer support for `.claude/agents/`). This delta extends the same surface — managed subagent definitions, installer-managed Claude config, and harness hook scripts.
59+
60+
## 2. Motivation
61+
62+
Two coupled failure modes in worktree-isolated subagents:
63+
64+
1. **Stale fork**: Claude Code's `isolation: worktree` does not reliably fork from the supervisor's HEAD; it has been observed forking from a tracking ref (e.g. `origin/main`) instead. The subagent then operates in a tree missing all in-flight supervisor work.
65+
2. **Silent compensation**: When the worktree state contradicts the delegation prompt (missing files, missing commits), subagents have been observed reconstructing state by copying files from outside the worktree. The resulting branch looks plausible in isolation but is unmergeable against trunk because its diff baseline is wrong.
66+
67+
Loud failure is recoverable; silent compensation is not. The defence must be three-layer (pre-spawn alignment, in-prompt refusal, handback verification) because no single layer covers all paths.
68+
69+
## 3. Scope & Objectives
70+
71+
- **Primary Outcomes**:
72+
- **Layer 1 — pre-spawn alignment**: a `SubagentStart` hook script that aligns a worktree-isolated subagent's working tree to the supervisor's HEAD (or a documented project-level override).
73+
- **Layer 2 — compensation refusal**: a shared subagent prompt fragment that instructs worktree-isolated subagents to stop and report rather than reconstruct missing state. Inherited by `dispatch-worker` (and any future managed subagent declaring `isolation: worktree`) without per-file edits.
74+
- **Layer 3 — handback verification**: a `SubagentStop` hook script that runs a merge-base sanity check against the captured base ref, scans for the compensation signature (added files identical to base-ref state), and writes incidents to a project-local log.
75+
- **Installer support**: `spec-driver install` sources both new hook scripts from `supekku/claude.hooks/` and registers them in the installed `.claude/settings.json`. Per-agent log/state directory is created (or auto-created on first run).
76+
- **Documentation**: `supekku/claude.hooks/README.md` (or equivalent) documents the parent-HEAD-vs-trunk policy choice and the project-level + per-subagent override mechanisms.
77+
- **Operational Constraints**:
78+
- Source-of-truth changes belong in `supekku/`. The `.spec-driver/` installation is regenerated by `spec-driver install` and must not be hand-edited.
79+
- Hooks must no-op cleanly for non-isolated subagents; existing dispatch flow must remain functional.
80+
- No persistent per-agent state files after subagent completion.
81+
- **Dependencies**:
82+
- DE-133 lands installer-managed `.claude/agents/` sync. DE-134 extends installer-managed scope to `.claude/hooks/` entries that are *added* by this delta and to the `SubagentStart` / `SubagentStop` keys in the settings template.
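
The Layer 3 compensation-signature scan can be illustrated as a pure function. This is a minimal sketch under assumptions: `compensation_suspects` is a hypothetical name, and the caller is assumed to have already gathered both inputs from git (for instance the files the branch adds relative to its merge-base, and the blobs at the captured base ref). The planned hooks are shell scripts; Python is used here only to show the comparison.

```python
from __future__ import annotations


def compensation_suspects(
    added: dict[str, bytes], base_tree: dict[str, bytes]
) -> list[str]:
    """Return added paths whose content is byte-identical to the base-ref copy.

    `added` maps each path the branch adds to its content; `base_tree` maps
    paths present at the captured base ref to their content. A match suggests
    the file was copied in rather than authored.
    """
    return sorted(
        path for path, blob in added.items() if base_tree.get(path) == blob
    )
```

A path missing from `base_tree` never matches, so files that are new relative to the base ref are not flagged.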
83+
84+
## 4. Out of Scope
85+
86+
- Path enforcement during subagent execution (separate brief if needed).
87+
- Read sandboxing.
88+
- Replacing Claude Code's worktree creation entirely.
89+
- Process or network isolation.
90+
- Automatic resolution of stale-fork incidents — the supervisor diagnoses, the hook only surfaces.
91+
- Generic hook framework refactor — extend the existing `supekku/claude.hooks/` pattern rather than redesigning it.
92+
93+
## 5. Approach Overview
94+
95+
- **System Touchpoints**:
96+
- `supekku/claude.hooks/` — two new scripts (working names `align-worktree-to-parent.sh`, `verify-worktree-base.sh`).
97+
- `supekku/claude.settings.json` — register `SubagentStart` and `SubagentStop` hook entries.
98+
- `supekku/agents/dispatch-worker.md` — adopt the shared compensation-refusal directive.
99+
- `supekku/templates/agents/` (or a new shared snippet location) — single-source compensation-refusal directive that managed subagents reference.
100+
- `supekku/scripts/` — installer changes if `_install_claude_config` / `_install_agents` need to learn about new hook files or new state-dir conventions.
101+
- **Key Changes**:
102+
1. Author both hook scripts under `supekku/claude.hooks/`. Resolve hook input shape (parent CWD, agent_id, worktree path) by capturing real `SubagentStart` / `SubagentStop` invocations — see DR open question 1.
103+
2. Extend `supekku/claude.settings.json` with the two new hook entries. Confirm the settings installer copies/merges this file correctly into per-project `.claude/settings.json`.
104+
3. Add the compensation-refusal directive as a single-source fragment, included by `dispatch-worker.md` (and any future worktree-isolated managed subagent) without copy-paste.
105+
4. Document parent-HEAD-vs-trunk policy and the override surface (`.claude/agent-base-ref` project-level, per-subagent frontmatter opt-out) in a README under `supekku/claude.hooks/`.
106+
5. Confirm installer behaviour for the new hook script files and the runtime state directory (`.claude/state/agent-base-ref/`, `.claude/state/worktree-incidents.log`).
107+
- **Migration / Rollout Notes**: Existing installs pick up the new hooks on next `spec-driver install`. No data migration required. New state directory is created on first subagent spawn.
108+
109+
## 6. Verification Strategy
110+
111+
- **VT**:
112+
- Unit-style coverage for the hook scripts via shellcheck plus a small harness that feeds synthetic hook input JSON and asserts side-effects (worktree HEAD match, exit code, state file presence/absence).
113+
- Installer test confirming new hook files land in the installed workspace and `.claude/settings.json` registers the hook entries.
114+
- **VA**:
115+
- Controlled `/dispatch` run where the supervisor is several commits ahead of the tracking ref; confirm subagent worktree HEAD matches supervisor and `worktree-incidents.log` is empty.
116+
- Adversarial run where a delegation prompt mentions a file the subagent is not given; confirm the subagent reports rather than fabricates, and the SubagentStop hook flags the run if compensation occurs.
117+
- **VH**: User attestation that the integration into a real delta pass works end-to-end without disrupting `/dispatch` ergonomics.
118+
- **Acceptance Criteria** (from brief):
119+
- Worktree-isolated subagent starts at supervisor's HEAD-at-delegation regardless of Claude Code's default resolution.
120+
- Subagent whose worktree state contradicts its delegation prompt stops and reports rather than compensating.
121+
- Stale or different-ancestry merge-base at handback is surfaced before the supervisor accepts the result.
122+
- Compensation signature (added files identical to base-ref state) is detected and flagged.
123+
- Subagents without `isolation: worktree` are unaffected — hooks no-op cleanly.
124+
- Project-level base-ref override via `.claude/agent-base-ref`.
125+
- Per-subagent opt-out via documented frontmatter flag.
126+
- No per-agent state files persist after completion.
127+
- Incidents accumulate in `.claude/state/worktree-incidents.log` for pattern review.
128+
- Subagent template/generator update is single-source; existing definitions inherit without per-file edits.
129+
- `supekku/claude.hooks/README.md` documents the design and override surface.
130+
131+
## 7. Risks & Mitigations
132+
133+
- **Risk**: Hook input shape (especially how to retrieve parent session CWD from `SubagentStart`) is undocumented and may vary across Claude Code releases. – _Likelihood_: medium – _Impact_: medium – _Mitigation_: Capture real invocations during DR work; fall back to deriving parent CWD from `.git/worktrees/{name}/gitdir` or `git worktree list --porcelain` run from inside the worktree.
134+
- **Risk**: `git reset --hard` in a worktree could destroy uncommitted state if Claude Code has already populated the worktree with WIP. – _Likelihood_: low – _Impact_: high – _Mitigation_: Hook checks for clean tree before reset; aborts loudly otherwise. Per-subagent opt-out covers any legitimate WIP-carry case.
135+
- **Risk**: Compensation-signature scan produces false positives when a subagent legitimately re-introduces a file deleted on a sibling branch. – _Likelihood_: medium – _Impact_: low – _Mitigation_: Default to warn-only output; surface signal without blocking. Tunable threshold deferred to later if needed.
136+
- **Risk**: Installer-managed settings merge clobbers user-customised hook entries. – _Likelihood_: medium – _Impact_: medium – _Mitigation_: Verify installer's existing settings-handling semantics; document install-time behaviour in DR.
137+
- **Risk**: Single-source directive mechanism does not yet exist for subagent prompts; introducing one risks scope creep into a generic templating concern. – _Likelihood_: medium – _Impact_: low – _Mitigation_: Choose the simplest viable mechanism (e.g. a referenced skill or an installer-time include) and resist building a generic engine.
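
The clean-tree guard from the `git reset --hard` mitigation might look like the sketch below. `align_worktree`, `DirtyWorktreeError`, and the injected `run_git` callable are all hypothetical names. A real runner would presumably wrap a `git` invocation inside the worktree, and the planned hook is a shell script rather than Python.

```python
from typing import Callable, Sequence

# A callable that runs `git <args>` in the worktree and returns its stdout.
GitRunner = Callable[[Sequence[str]], str]


class DirtyWorktreeError(RuntimeError):
    """Raised when the worktree has uncommitted changes (hypothetical name)."""


def align_worktree(run_git: GitRunner, base_ref: str) -> None:
    """Reset to `base_ref` only when `git status --porcelain` reports nothing."""
    status = run_git(["status", "--porcelain"])
    if status.strip():
        # Abort loudly instead of discarding work the harness may have staged.
        raise DirtyWorktreeError(
            f"refusing to reset; uncommitted changes:\n{status}"
        )
    run_git(["reset", "--hard", base_ref])
```

Injecting the runner keeps the guard testable with synthetic output, in the spirit of the harness described in the verification strategy.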
138+
139+
## 8. Follow-ups & Tracking
140+
141+
- **Future Phases / Deltas**:
142+
- Path-enforcement layer for subagent execution (separate brief).
143+
- Consider extending merge-base verification to all worker-produced branches, not only worktree-isolated ones.
144+
- **Backlog Items**: To be created if scope splits during DR refinement.
145+
- **Open Decisions / Questions** (carried into DR-134):
146+
1. Confirm exact JSON shape of `SubagentStart` / `SubagentStop` hook inputs and the reliable way to retrieve the parent session's working directory.
147+
2. Confirm Claude Code's actual base-ref resolution rule for `isolation: worktree`.
148+
3. Decide whether stale-fork warnings hard-block via SubagentStop exit code, or warn-only. Default: warn-only.
149+
4. Decide precedence between `.claude/agent-base-ref` (project) and per-subagent frontmatter override. Default: per-subagent wins, project config is the framework default.
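
Under the proposed default for question 4, the precedence could be expressed as below. This is only a sketch of one possible reading: `resolve_base_ref` is a hypothetical name, `project_ref` stands for the contents of `.claude/agent-base-ref` (or `None` when the file is absent), and the per-subagent frontmatter flag is modelled as a plain boolean because its name is not yet decided.

```python
from typing import Optional


def resolve_base_ref(
    supervisor_head: str,
    project_ref: Optional[str],
    subagent_opt_out: bool,
) -> Optional[str]:
    """Pick the ref to align to, or None when alignment is skipped."""
    if subagent_opt_out:
        # The per-subagent setting wins over any project-level configuration.
        return None
    if project_ref and project_ref.strip():
        return project_ref.strip()
    # No override anywhere: fall back to the supervisor's HEAD.
    return supervisor_head
```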
150+
151+
## 9. Implementation Notes
152+
153+
- All source-of-truth edits land in `supekku/`. The `.spec-driver/` installed copy is regenerated via `spec-driver install` and must not be edited directly.
154+
- Manual end-to-end verification requires a `/dispatch` invocation against a deliberately-stale tracking ref to exercise Layer 1 + Layer 3 in concert.
