Skip to content

docs(schema): warn against in-place mutation of shared stream elements#1074

Open
hi-pender wants to merge 89 commits into
alpha/10from
docs/stream-element-cow
Open

docs(schema): warn against in-place mutation of shared stream elements#1074
hi-pender wants to merge 89 commits into
alpha/10from
docs/stream-element-cow

Conversation

@hi-pender

Copy link
Copy Markdown
Contributor

What this PR does

Adds GoDoc warnings about a recurring concurrency pitfall: StreamReader.Copy fan-out duplicates readers, not elements, so pointer elements (e.g. *schema.Message) are shared across branches and may be consumed concurrently. Mutating a received message in place — especially map fields like msg.Extra[k] = v — causes concurrent map read/write panics.

Comment-only change; no behavior change.

Where the guidance was added

  • schema/stream.goStreamReader.Copy: copies share the same element values; treat received elements as read-only, use copy-on-write to modify
  • schema/stream.goStreamReaderWithConvert: the convert callback must not modify its input in place; return a new value (shallow-copy + clone changed map/slice fields)
  • schema/message.goMessage.Extra: never write to Extra on a message you did not create; copy-on-write steps
  • schema/agentic_message.goAgenticMessage.Extra: same guidance
  • adk/handler.goTypedChatModelAgentMiddleware.WrapModel: wrappers transforming the response stream must shallow-copy + clone instead of writing in place

Why

Users writing stream convert callbacks or ADK middleware commonly attach metadata via msg.Extra[k] = v on messages received from streams, which intermittently panics under fan-out. The doc comments now steer users to the copy-on-write pattern at every place they're likely to look.

🤖 Generated with Claude Code

Change-Id: I09b9754cf51f10fe662fb6cb33935ad92a4d1656
…mance

- Add HumanReadableSerializer that produces human-readable JSON output
- Add GobSerializer for comparison benchmarks
- Refactor serialization_test.go to use table-driven tests for both serializers
- Add comprehensive benchmarks comparing InternalSerializer, HumanReadableSerializer, and GobSerializer

Performance improvements over InternalSerializer:
- 30-68% faster marshal/unmarshal operations
- 17-49% less memory allocation
- 30-76% fewer allocations
- 33-59% smaller serialized output size

The HumanReadableSerializer uses standard JSON encoding with type annotations
only for interface{} fields, making the output both human-readable and
type-preserving for round-trip serialization.~

Change-Id: Ifeae12484fc73b74067a0ac3a496e73e0674b975
Change-Id: Iecbcfd8ab5905e61df9a5578e8d3c99387dace85
Change-Id: I5ac5d8b60be92a28d1e57929f03fa4201649fbd0
Change-Id: I318ff16a6bd81bb57b38791eeea17b3385b1fde4
Change-Id: Iefb047967d4cb4796edd363e7e6ba0a9c4f49148
Change-Id: I3de2ba37914fe779cd880b38ce067acf5fbde7c8
The runner previously wrote the interrupt checkpoint inline, before the
session event persister had flushed. This risks a checkpoint that
references events not yet durable in the SessionStore.

Introduce deferredRunnerCheckpoint: on the interrupt/cancel path the
checkpoint payload is captured but not written until finalize() confirms
persister.closeAndWait succeeded. If event persistence failed, the
checkpoint write is skipped entirely (fail-closed).

Also adds regression tests for checkpoint ordering invariants and the
InMemoryStore checkpoint round-trip.

Change-Id: Ib29263add4a261513f1349a9173a913c74ee65ef
Resume previously assumed the agent_tool interrupt state was always the
JSON envelope (agentToolInterruptState) introduced for SessionID-based
event filtering. Pre-envelope checkpoints stored raw gob bridge bytes,
so json.Unmarshal failed outright and resume errored.

Try the JSON envelope first; on parse failure or empty BridgeCheckpoint,
treat the raw bytes as the legacy bridge checkpoint and synthesize a
fresh childSessionID. Pre-envelope checkpoints predate session
persistence and have no parent-session filter to coordinate with, so
the synthesized ID is harmless.

Un-skip the v0.7.37, v0.8.2, v0.8.3, v0.8.4 compat fixtures so the
resume path is now exercised for real on-disk legacy bytes.

Change-Id: I5000424ba028e9fb9fe3843ea9673a3dd23a7860
Merge AfterCursor and PageToken into a single `After` field in
LoadEventsOptions. Rename NextPageToken to `Next` in LoadEventsResult.
Rename afterEventCursor to afterCursor in LoadLatestTurnEnd returns.

One concept, one name: the caller seeds After from LoadLatestTurnEnd,
then passes res.Next back as After on subsequent pages.

Change-Id: Icd0960940b7f69938e2b4a5062d220c709bc0e66
The Options suffix implies a functional-options pattern; Request better
reflects the struct-pointer parameter style and pairs with LoadEventsResult.
Also removes dead variable in conformance test.

Change-Id: Ia28d8a1f92f4307182508dfba56e2b3c454acfa4
…ializer

Correct the tool-call middleware ordering in chatmodel.go doc comments
(cancelMonitor wraps around user handlers, not inside). Add architectural
doc comment to newTypedInvokableAgentToolRunner explaining why AgentTool
lacks its own SessionStore. Remove the unused GobSerializer type alias
from schema/serialization.go (already aliased internally). Update
conformance test variable names to match LoadEventsRequest rename.

Change-Id: I2251e190c9d343dbd1e57cbebfc236840be4a0fa
…igurable page size

AppendEvents failures in the session persister now retry with exponential
backoff (default 3 retries, 50ms initial delay, 2x multiplier, 25% jitter)
before latching the error. This prevents transient store failures from
causing irrecoverable session log corruption.

Also extracts the hard-coded page-size=100 in reconstructFromEventLog and
replayTailEvents into the configurable LoadPageSize field on
SessionPersistenceConfig (default 100, preserving current behavior).

New config fields on SessionPersistenceConfig:
- MaxFlushRetries (default 3, set to -1 to disable)
- FlushRetryInitialBackoff (default 50ms)
- LoadPageSize (default 100)

Change-Id: I407b6038fd61df74420a5ddd16ce8ec0f8860948
…preservation

When a SessionStore is configured, the Runner now reuses the exact tool
list from the previous turn's TurnEndState to feed the model, ensuring
byte-exact prompt cache hits across turns. The ToolSearch middleware
skips its initialization strip logic when it detects pre-seeded tool
infos via the new ToolInfosPreSeededKey RunLocalValue.

Users can opt out with WithRefreshToolInfos() when tools have genuinely
changed between turns.

Change-Id: I95585e105d41689f08116325478f21552482b79b
Change-Id: Id35944259eaee9254bc9ffa7288fe6bc43bc021b
…ion mode

When sessions are enabled, TurnEndState.Messages carries the previous
turn's system message. The Runner prepends this history to the new input,
then defaultGenModelInput unconditionally prepends a fresh system message,
causing duplication. Strip the leading system message from history before
prepending the fresh instruction so that dynamic SessionValues are always
re-evaluated without duplicating the system prompt.

Change-Id: Id14acc6aa5f2941ea64c4de82393a36bcea2fb36
…to single file

Merge three scattered test files (human_readable_test.go,
human_readable_edge_test.go, schema/toolinfo_humanreadable_test.go) into
one unified file in internal/serialization. Replace schema.ToolInfo usage
with a local mock type to eliminate the circular dependency, and remove
duplicate test cases that exercised identical code paths.

Change-Id: I0b1fa9d001201382917abb8bf4b3bd220ffdf13d
Reduce SessionStore from 4 methods to 2 (AppendEvents + LoadEvents) by
merging TurnEndState into the event log as a SessionEvent variant. This
eliminates duplicate message storage and unifies reconstruction into a
single reverse-scan algorithm.

- Rewrite session/in_memory_store.go with forward/reverse pagination
- Remove SaveTurnEnd/LoadLatestTurnEnd from all test mocks
- Replace encodeTurnEndState/decodeTurnEndState with encodeSessionEvent
- Replace reconstructFromEventLog with reconstructSessionState

Change-Id: I979e88727dd33bfaa6983241b7e90b7bbbe040e0
Replace the opaque integer-index cursor with the per-event UUIDv4
event_id, giving SSE consumers a stable identity for Last-Event-ID
resume and de-duplication. AppendEvents becomes idempotent
(first-write-wins on duplicate event_id) so persister retries no longer
double-write. Introduce ErrInvalidEventID and ErrEventIDOutOfRange
sentinels with isProtocolError classification, so the persister
fail-fasts protocol violations while still retrying infrastructure
errors. Stores treat event_id as an opaque non-empty string; UUIDv4 is
the Runner allocation convention, not a store-enforced format.

Change-Id: I70c277af5505d57c7354e8056372c1021d6009eb
Allocate EventID once at the AgentEvent emission boundary (execCtx.send)
so user-land stream consumers and persisted SessionStore records share
the same identity. SSE adapters can now use AgentEvent.EventID as the
SSE id: line, and clients reconnecting with Last-Event-ID can pass that
value directly to SessionStore.LoadEvents(After: ...) without going
through the store.

toSessionEvent reuses event.EventID instead of minting a new UUID, with
a defensive fallback for test fixtures that construct events directly.
makeInputSessionEvent is unchanged (no upstream AgentEvent).

Change-Id: I525675f948aef55d6afc0dcdc46bd7e255a68311
Persist lifecycle, model span, retry/failover, and tool observation events through the managed session log while preserving the EventID identity contract between live AgentEvents and stored SessionEvents.

Add focused coverage for replay boundaries, timeline exposure, retry/failover observability, model usage metadata, gob compatibility, and persistence guards.

Change-Id: Id78c137b40265324d315237067a82e8ea1ffc8ef
Change-Id: I84f8155f66807d6e229741b9f5b3f6345f58f1d5
Change-Id: Id53b725579b6e7ceaeb9f4edd3c65a11e092f8e1
Delay loaded checkpoint abandonment until fresh-turn agent preparation succeeds, and fail before execution if checkpoint deletion fails.

Fold durable attack coverage into normal ADK test suites and remove the standalone attack test file.

Change-Id: I43fa3cd8d25dbd4ea3d1ca4c845191fa5a9c503d
Change-Id: I6714153ef2f58bed54dce9dfe3fe73bace3f6f04
Change-Id: I6fdded746daacdaa973c1914bfb789230254b17f
Change-Id: Ib830e9e647f9f2f9429d50136153cd253305c382
Add a JSONL-backed session store, expose schema serializers for session persistence, and wire configurable session event serialization through runner reconstruction and persistence.

Change-Id: Ic6bf3905162c2f730428cf1bc5b65f1c8d8567cd
Add exported helper docs, reduce TurnLoop.run length by extracting input collection, and group model span end parameters to satisfy revive in CI.

Change-Id: Ie23013d91d3d502d50c3787fa3e9ec191feeaf35
Replace the blocking ErrPendingSessionCheckpoint behavior with automatic
checkpoint deletion via deleteCheckPointIfSupported. Calling Run is an
explicit intent to start a new turn; session correctness is guaranteed
by event log replay, not checkpoint presence.

Change-Id: I9a19b4b44c90d134c28a4d2cc2746cd44d705144
shentongmartin and others added 26 commits June 11, 2026 13:18
Change-Id: Ia1c5c6eb00a2734b4e6841fe3729a31c813e39e3
Change-Id: I93e66a21e371479ca5c88956208cccbd91109b11
Change-Id: Id2582f94957a547e8c9883d26db2d40a42d2c285
Add opt-in cleanup for orphan and duplicate tool results, strict validation for malformed histories, and persisted session mutation events for replay consistency.

Change-Id: Ib01ec6d0b5a8b0c621cc435e82ef5f13f9944d9f
Stop tracking examples and ext gitlinks from the root repository and ignore those local working trees going forward.

Change-Id: Ieda10721a80041d4e73f93b4b0503c55931196d8
Add rich execute input handling for filesystem shell tools, document DeepAgent's manual middleware path for advanced filesystem configuration, and cover shell-only and DeepAgent integration paths.

Change-Id: If3b23cb34cb3336b96611fc25ea2aa451519d4e8
Change-Id: I4381f0e50ad1a58f14bc390607f74cf5d0956bfd
Move fenced session ownership to an external token callback consumed only at append boundaries, strengthen append-tail conformance coverage, and document the atomic append and exact replay contract.

Change-Id: I4eea15e85785b8e9ad1c5373c0286dfdec0dd71d
Change-Id: I29e8e458608bec90a4b5a779c4299fa7d1ce5d65
Change-Id: I46efb3a9f136f18a666b5812723b5743e6d8f3f3
Change-Id: Id6e6ecc790137f3b4f2d7a55b37598047e2c92b9
Change-Id: I898c2fe90d3457176f59f3641b6562f36f16b282
Change-Id: Ibda1e56cd2cd0173a87f574a07598db80596b17e
Change-Id: Ia50beebf6962206cb146db52c1df8aafede3b7c9
Change-Id: I544f633257f94e9eab91c74ce434148020d44e59
Change-Id: I67d64f006af0a29374eb68737b5d198b72efb9f7
Make managed session persistence use explicit boundary commits and remove the configurable async batching path. Preserve event identity through the runner and add coverage for checkpoint durability and handle cleanup failure paths.

Change-Id: I29d9c957cfbb23017cd0a3edee637eececa2938a
Change-Id: Ibd45f8772b459399d6fb93d1f623e2b7f13e559d
Change-Id: I23d56f5e7c295357c73f34f5270f196de5c25bc7
Change-Id: Icf5f48ef8da03d419aa2629ddc60a7874e95d7dc
Change-Id: I282c18e00757471e708379a8e09b6db7df682606
Change-Id: Ibb04a1fd7ddd47f68ef7a05faaca9d2856fa88b1
Change-Id: Ie4c567b307f834ad9e3e2a7efadaaab06623bfb8
Carry interrupt semantics solely on AgentInterruptContext.Info; ADK no
longer classifies interrupts and lets the triggering component own the
payload type.

Change-Id: I7ae6a00ff34719167b4bfe8218df2b46bcdcf014
StreamReader.Copy duplicates readers, not elements, so pointer elements
(e.g. *Message) are shared across fan-out branches and may be consumed
concurrently. In-place writes to map fields such as Message.Extra cause
concurrent map read/write panics.

Add GoDoc guidance at the places users are most likely to hit this:
- StreamReader.Copy: copies share elements; treat them as read-only
- StreamReaderWithConvert: convert must not modify its input in place
- Message.Extra / AgenticMessage.Extra: copy-on-write instructions
- adk WrapModel: stream-transform wrappers must return copies

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (alpha/10@12955df). Learn more about missing BASE report.

Additional details and impacted files
@@             Coverage Diff             @@
##             alpha/10    #1074   +/-   ##
===========================================
  Coverage            ?   83.07%           
===========================================
  Files               ?      172           
  Lines               ?    28228           
  Branches            ?        0           
===========================================
  Hits                ?    23449           
  Misses              ?     3222           
  Partials            ?     1557           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants