Skip to content

fix(agent): add drop_session for cleanup paths; require base_sample in finish_session#2117

Closed
jingshenghang wants to merge 2 commits into
THUDM:mainfrom
jingshenghang:fix-base-sample-drop-session
Closed

fix(agent): add drop_session for cleanup paths; require base_sample in finish_session#2117
jingshenghang wants to merge 2 commits into
THUDM:mainfrom
jingshenghang:fix-base-sample-drop-session

Conversation

@jingshenghang

@jingshenghang jingshenghang commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

What

In the SWE coding-agent rollout, generate() calls adapter.finish_session(session_id) in its finally block as a catch-all cleanup. Since this path passes no base_sample, finish_session / get_trajectory defaulted base_sample=None.

On the normal path the trajectory is already consumed (the finally call is a no-op), but when the rollout raises before consuming the trajectory (e.g. sandbox boot, evaluate, or the wall-clock timeout guard fires), the session tree is still in the manager. The cleanup call then reaches to_sample(base_sample=None) and crashes on None.index.

Overloading base_sample=None to mean "cleanup only" also hid the intent: a reader can't tell from the call site whether it consumes the trajectory into trainable Samples or just frees state.

Changes

  • TrajectoryManager.drop_session(sid) / BaseAdapter.drop_session(sid) — free a session's state (drain in-flight turns, pop the tree) without linearizing it into Samples. For paths that must release the session but produce nothing trainable.
  • finish_session: base_sample is now required (was =None). The consume path can never silently build Samples from a None base; misuse fails loudly.
  • generate.py finally: use drop_session(session_id) instead of finish_session(session_id) — the cleanup path now states its intent and can't crash.
  • run_command (harness/common.py): the done marker holds the exit code as text, but echo N > done truncates the file before writing, so a poll landing in that window reads "" and int("") raised. Skip empty reads and keep polling instead.

Behavior

  • Success path: unchanged — finish_session(base_sample=...) consumes and returns Samples; the finally drop_session is a no-op (tree already popped).
  • Abort/timeout path: drop_session releases the session cleanly, no crash, nothing trained.

jingshenghang added 2 commits June 22, 2026 03:17
…n finish_session

- TrajectoryManager.drop_session / BaseAdapter.drop_session: free a session's
  state without linearizing it into Samples, for rollouts that ended before the
  trajectory could be consumed.
- finish_session: base_sample is now required (was =None) so the consume path
  can never silently produce Samples from a None base.
- generate.py finally: use drop_session instead of finish_session(sid).
- run_command: skip empty done-marker reads (echo > truncates before write) to
  avoid int('') crashes; keep polling instead.
@jingshenghang jingshenghang changed the title Fix base sample drop session fix(agent): add drop_session for cleanup paths; require base_sample in finish_session Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant