Base URL: `http://<host>:8000`
Protocol: HTTP/1.1 REST (JSON)
Content-Type: `application/json`
Tenant Isolation: When `auth.enabled=false` (development default), memory and graph requests MUST carry the `X-Tenant-ID` header. Public ops endpoints (`/health`, `/metrics`, `/metrics_prom`) are exempt. When `auth.enabled=true`, the tenant is parsed from the Token/JWT, but it is still highly recommended to include `X-Tenant-ID` for explicit alignment and debugging.
This document targets developers integrating with LATRACE Memory. The goal is to provide a comprehensive, runnable, and aligned integration contract, detailing HTTP boundaries, public vs internal endpoints, and data contracts.
LATRACE Memory is a multimodal / full-modal memory service: text, image, audio, and video are compiled into the same tenant-isolated backbone and converge on the same typed graph and retrieval surface.
Applicability:
- For ADK Semantic Tools (Layer 1), please review `adk_integration.md` first (it covers tool schemas, runtime wiring, and agent orchestration).
- For tenant boundaries and request scoping, please review `tenant_isolation.md` (it covers `tenant_id`, `user_tokens`, and namespace guidance).
- This documentation strictly covers External HTTP Contracts; internal processes are omitted unless necessary.
- Overview
- Authentication & Security
- Quick Start
- Core Concepts & Data Contracts
- API Reference (HTTP)
- Errors & Retries
- Limits & Performance
- Versioning & Compatibility
- Best Practices: Dialog Ingress (Session Write)
LATRACE Memory is a "Multimodal Memory Service" organized into three core layers:
- Memory Layer (Recall + Filter + Optional Graph Expansion + Rerank): Accessed via `POST /search`. Built for injecting broad context into LLMs.
- Graph Layer (Typed TKG: Entities/Events/Evidences/TimeSlices/Knowledge): Accessed via `/graph/v0/*` and `/graph/v1/*`. Built for strict structure, exact event tracking, and explicit timelines.
- Multimodal Ingress Layer (Dialog / Audio / Video): Accessed via `POST /ingest/dialog/v1`, `POST /ingest/media/video/v1`, and `POST /ingest/media/audio/v1`. Built for compiling raw sources into the same graph and retrieval backbone.
Rule of Thumb: /search handles fuzzy conceptual recall. The Graph APIs handle structured exact topological queries. Multimodal ingest handles source-to-graph compilation; it does not create a separate memory silo for media.
- Writing (Recommended: Batch upon session close)
  - High-Level: `POST /ingest/dialog/v1` (the recommended session-ingest endpoint; handles the ingestion workflow automatically).
  - High-Level: `POST /ingest/media/video/v1` and `POST /ingest/media/audio/v1` (source-based multimodal ingress; compile into the same graph).
  - Low-Level (Fallback): `POST /write` (Store `MemoryEntry` vectors explicitly).
- Retrieving (During Agent Runtime)
  - High-Level: `POST /retrieval/dialog/v2` (Multi-path recall + fusion + optional Synthesis).
  - Low-Level (Fallback): `POST /search` (Returns evidence `hits` + `neighbors` + `trace`).
- `POST /retrieval/dialog/v2`: Advanced Dialogue Orchestration (Multi-channel recall + Graph interpretation + Optional QA). This is the default recommendation.
- `POST /search`: Vector Recall (Qdrant ANN). Optional overlays for BM25, TKG expansion, and time decay. Best for implicit natural language.
- `POST /graph/v1/search`: Structured Event Search (Neo4j). Explicit keyword/entity matching. It is not an approximation fallback.
If your query relies heavily on implicit natural language, do not force Graph-first recall. Use `/retrieval/dialog/v2` instead.
Memory Server (Service Root)
│
├── 📦 1. Write & Lifecycle
│ ├── POST /write (Atomic Write: Text/Vector)
│ ├── POST /update (Deep-patch Entry)
│ ├── POST /delete (Delete Entry)
│ ├── POST /link (Edge Creation)
│ ├── POST /batch_delete (Batch Remove)
│ ├── POST /memory/v1/clear (Tenant Cache Wipe, supports dry-run)
│ └── POST /rollback (Version Rollback)
│
├── 🔍 2. Core Search
│ ├── POST /search (★ Main Entry: Hybrid + Graph)
│ ├── POST /timeline_summary (Time-Series Summary)
│ ├── POST /speech_search (Audio-Transcription Keyword hit)
│ ├── POST /entity_event_anchor (Spatiotemporal Anchor localization)
│ └── POST /object_search (Visual Object Search)
│
├── 🕸️ 3. Graph TKG
│ ├── POST /graph/v1/search (Structured Event Target Search)
│ ├── POST /graph/v0/upsert (Node Write injection)
│ ├── GET /graph/v0/events/{event_id}
│ ├── GET /graph/v0/entities/{entity_id}/timeline
│ ├── GET /graph/v0/entities/{entity_id}/evidences
│ ├── GET /graph/v0/entities/resolve
│ └── GET /graph/v0/explain/* (Evidence Chain Derivation)
│
├── 🤖 4. Multimodal Ingress & Retrieval
│ ├── POST /ingest/dialog/v1 (Session dialog ingestion)
│ ├── POST /ingest/media/video/v1 (Video source ingestion)
│ ├── POST /ingest/media/audio/v1 (Audio source ingestion)
│ ├── GET /ingest/jobs/{id} (Job state tracking)
│ └── POST /retrieval/dialog/v2 (Orchestrated inference recall)
│
└── ⚙️ 5. Infrastructure & Ops
├── GET /health (Deep Health Status)
├── GET /metrics (Telemetry)
├── GET /config (System Snapshots)
├── PATCH /config (Hot-Reload Configurations)
└── POST /config/search/* (Dynamic Overrides)
Security operates across three tiers: Tenant Isolation → API Token (JWT) → HMAC Operations Signature.
| Header | Required | Description |
|---|---|---|
| `Content-Type` | Yes (POST/PATCH) | Strictly `application/json`. |
| `Authorization` | Recommended | `Bearer <token>` |
| `X-API-Token` | Optional | Fallback token format. |
| `X-Tenant-ID` | Dependent | MANDATORY for memory and graph routes if `auth.enabled=false`. Public ops routes (`/health`, `/metrics`, `/metrics_prom`) are exempt. If auth is enabled, the server maps the tenant from the token, but explicit alignment is encouraged. |
| `X-Request-ID` | Recommended | UUID request tracing tag. |
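The header rules above can be sketched as a small Python helper. `build_headers` is a hypothetical name used for illustration only; it is not part of LATRACE. The sketch always sends `X-Tenant-ID`, which the table requires when auth is disabled and encourages otherwise.

```python
import uuid
from typing import Dict, Optional


def build_headers(tenant_id: str, token: Optional[str] = None,
                  request_id: Optional[str] = None) -> Dict[str, str]:
    """Assemble the request headers listed in the table above."""
    headers = {
        "Content-Type": "application/json",
        # Required when auth.enabled=false; still encouraged otherwise.
        "X-Tenant-ID": tenant_id,
        # A fresh UUID per request unless the caller supplies one.
        "X-Request-ID": request_id or str(uuid.uuid4()),
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


print(build_headers("t1", token="<token>", request_id="req-1"))
```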
When `auth.signing.required=true`, mutating endpoints (write/update/delete) require:
- `X-Signature-Ts`: Unix timestamp (integer). Default tolerance: ±300s.
- `X-Signature`: Hex digest computed with the tenant secret over `f"{ts}.{request_path}.{raw_request_body_bytes}"`.
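A minimal signing sketch, under two loudly stated assumptions: the digest is HMAC-SHA256 (this document does not name the hash function), and the raw body bytes are appended as-is after the `"{ts}.{request_path}."` prefix. `sign_request` and `within_tolerance` are hypothetical helper names. Confirm both assumptions against your deployment before relying on this.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional


def sign_request(secret: bytes, request_path: str, raw_body: bytes,
                 ts: Optional[int] = None) -> Dict[str, str]:
    """Return the two signature headers for a mutating request."""
    ts = int(time.time()) if ts is None else ts
    # Message layout from the text: "{ts}.{request_path}.{raw body bytes}".
    message = f"{ts}.{request_path}.".encode("utf-8") + raw_body
    # ASSUMPTION: HMAC-SHA256; the hash function is not specified here.
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {"X-Signature-Ts": str(ts), "X-Signature": digest}


def within_tolerance(ts: int, now: int, tolerance_s: int = 300) -> bool:
    """Client-side check mirroring the default ±300s timestamp window."""
    return abs(now - ts) <= tolerance_s


body = b'{"entries": []}'
signed = sign_request(b"tenant-secret", "/write", body, ts=1734775200)
print(signed["X-Signature-Ts"], len(signed["X-Signature"]))
```

Sign the exact bytes that go on the wire: re-serializing the JSON after signing can change whitespace or key order and invalidate the digest.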
- End-user provider keys are never passed via raw header parameters.
- LATRACE resolves `api_token -> api_key_id` locally, behind the firewall.
- Unmapped keys fall back to platform-curated inference pools.
curl -sS "http://127.0.0.1:8000/health" -H "X-Tenant-ID: t1"

curl -sS "http://127.0.0.1:8000/write" \
-H "Content-Type: application/json" \
-H "X-Tenant-ID: t1" \
-H "Authorization: Bearer <token>" \
-d '{
"upsert": true,
"entries": [
{
"kind": "semantic",
"modality": "text",
"contents": ["User likes Apples"],
"metadata": {
"user_id": ["u:1001"],
"memory_domain": "dialog",
"timestamp": 1734775200
}
}
]
}'

curl -sS "http://127.0.0.1:8000/search" \
-H "Content-Type: application/json" \
-H "X-Tenant-ID: t1" \
-d '{
"query": "What does he like eating?",
"topk": 10,
"expand_graph": true,
"graph_backend": "memory",
"filters": {
"user_id": ["u:1001"],
"memory_domain": "dialog"
}
}'

curl -sS "http://127.0.0.1:8000/ingest/media/video/v1" \
-H "Content-Type: application/json" \
-H "X-Tenant-ID: t1" \
-d '{
"routing": {
"user_id": ["u:1001"],
"memory_domain": "video"
},
"source_ref": {
"source_id": "demo.mp4",
"file_path": "/data/demo.mp4"
},
"overwrite_existing": true,
"enable_visual_operator": true,
"enable_audio_operator": true
}'

curl -sS "http://127.0.0.1:8000/ingest/media/audio/v1" \
-H "Content-Type: application/json" \
-H "X-Tenant-ID: t1" \
-d '{
"routing": {
"user_id": ["u:1001"],
"memory_domain": "audio"
},
"source_ref": {
"source_id": "demo.wav",
"file_path": "/data/demo.wav"
},
"overwrite_existing": false,
"enable_audio_operator": true
}'

A `MemoryEntry` is the atomic unit of vector storage.
| Field | Type | Required | Description |
|---|---|---|---|
| `id` | `string` | No | Server assigns a UUID if absent. |
| `kind` | `string` | Yes | `"episodic"` or `"semantic"` |
| `modality` | `string` | Yes | `"text"`, `"image"`, `"audio"`, `"video"`, or `"structured"` |
| `contents` | `string[]` | Yes | Core payload, placed in `contents[0]`. |
| `metadata` | `object` | Yes | Target routing filters (`user_id`, `run_id`, `memory_domain`, `timestamp`). |
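As an illustration, the field rules above can be checked client-side before calling `POST /write`. `make_entry` is a hypothetical helper, not a LATRACE API; it only enforces the enumerations and required fields listed in the table.

```python
from typing import Any, Dict, List, Optional

ALLOWED_KINDS = {"episodic", "semantic"}
ALLOWED_MODALITIES = {"text", "image", "audio", "video", "structured"}


def make_entry(kind: str, modality: str, contents: List[str],
               metadata: Dict[str, Any],
               entry_id: Optional[str] = None) -> Dict[str, Any]:
    """Build a MemoryEntry dict, rejecting values the table does not allow."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported kind: {kind!r}")
    if modality not in ALLOWED_MODALITIES:
        raise ValueError(f"unsupported modality: {modality!r}")
    if not contents:
        raise ValueError("contents must hold at least one item")
    entry: Dict[str, Any] = {
        "kind": kind,
        "modality": modality,
        "contents": list(contents),
        "metadata": dict(metadata),
    }
    if entry_id is not None:
        # Omit "id" entirely so the server can assign a UUID.
        entry["id"] = entry_id
    return entry


entry = make_entry(
    "semantic", "text", ["User likes Apples"],
    {"user_id": ["u:1001"], "memory_domain": "dialog", "timestamp": 1734775200},
)
print({"upsert": True, "entries": [entry]})
```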
Standard JSON fields accepted inside the `filters` object.
| Field | Type | Description |
|---|---|---|
| `user_id` | `string[]` | Restricts the search to the listed user identities. |
| `user_match` | `string` | `"any"` or `"all"` |
| `memory_domain` | `string` | Workspace scope (e.g. `dialog`, `project_alpha`). |
| `time_range` | `object` | `{gte: "start", lte: "end"}` bounds, given as Unix timestamps or ISO strings. |
| `topic_path` | `string[]` | Explicit topic-tree paths (e.g. `travel/japan`). |
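A short sketch of composing a `filters` object from these fields. `build_filters` is a hypothetical helper; it omits optional keys that were not supplied, on the assumption that the server treats absent keys as "no constraint".

```python
from typing import Any, Dict, List, Optional


def build_filters(user_id: List[str],
                  memory_domain: Optional[str] = None,
                  user_match: str = "any",
                  gte: Optional[str] = None,
                  lte: Optional[str] = None,
                  topic_path: Optional[List[str]] = None) -> Dict[str, Any]:
    """Compose a filters object using only the fields in the table above."""
    if user_match not in ("any", "all"):
        raise ValueError("user_match must be 'any' or 'all'")
    filters: Dict[str, Any] = {"user_id": list(user_id), "user_match": user_match}
    if memory_domain is not None:
        filters["memory_domain"] = memory_domain
    # Only include the bounds that were actually provided.
    time_range = {key: value for key, value in (("gte", gte), ("lte", lte))
                  if value is not None}
    if time_range:
        filters["time_range"] = time_range
    if topic_path:
        filters["topic_path"] = list(topic_path)
    return filters


print(build_filters(["u:1001"], memory_domain="dialog",
                    gte="2024-12-01T00:00:00Z", topic_path=["travel/japan"]))
```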
- Response: Status map checking `vectors` (Qdrant), `graph` (Neo4j), `llm_provider`, and `disk` buffers.
- Error Code Mappings: `API_KEY_MISSING`, `AUTH_FAILED`, `BALANCE_BELOW_THRESHOLD`.
`POST /ingest/dialog/v1` is the safest way to ingest whole conversations.
With the default self-hosted .env.example settings, API auth is disabled and callers must send X-Tenant-ID. In that mode, user_tokens can be omitted because the server derives a stable user token from the tenant boundary.
| Field | Type | Req | Description |
|---|---|---|---|
| `session_id` | `string` | Yes | Session lock guaranteeing identical message boundaries. |
| `user_tokens` | `string[]` | No | Optional user mappings; derived from tenant scope when omitted. |
| `memory_domain` | `string` | No | Partition workspace isolation. |
| `turns` | `object[]` | Yes | Array of `{turn_id, text/content, role, timestamp_iso}` objects; `role` is `user`, `assistant`, or `tool`. |
| `commit_id` | `string` | No | Idempotency key that blocks duplicate parallel execution. |
| `client_meta` | `object` | Yes | Must include at least `memory_policy` and `user_id`; can also carry model overrides. |
Returns: `{"ok": true, "job_id": "job_123", "status": "RECEIVED"}`. The job is queued automatically.
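A hedged sketch of assembling a dialog-ingest request body from the fields above. `build_dialog_ingest` is a hypothetical helper. The `turn_id` scheme (`"<session_id>-<index>"`), the one-second spacing between turn timestamps, the `"default"` value for `memory_policy`, and passing `user_id` as a plain string are all illustrative assumptions, not documented values.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Tuple

ALLOWED_ROLES = {"user", "assistant", "tool"}


def build_dialog_ingest(session_id: str, user_id: str, memory_policy: str,
                        messages: List[Tuple[str, str]],
                        started_at: datetime) -> Dict[str, Any]:
    """Turn (role, text) pairs into a request body with one turn per speaker."""
    turns = []
    for index, (role, text) in enumerate(messages):
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        turns.append({
            "turn_id": f"{session_id}-{index}",  # hypothetical ID scheme
            "role": role,
            "text": text,
            # Illustrative timestamps: one second apart from the session start.
            "timestamp_iso": (started_at + timedelta(seconds=index)).isoformat(),
        })
    return {
        "session_id": session_id,
        "turns": turns,
        # memory_policy and user_id are the minimum required client_meta keys.
        "client_meta": {"memory_policy": memory_policy, "user_id": user_id},
    }


payload = build_dialog_ingest(
    "s1", "u:1001", "default",
    [("user", "I like apples."), ("assistant", "Noted.")],
    datetime(2024, 12, 21, 10, 0, tzinfo=timezone.utc),
)
print(payload["turns"][1])
```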
Media ingest is source-based. You submit a source_ref, the service creates a job, the worker compiles the source into the same TKG backbone, and the graph writer persists it with the same tenant/user/domain isolation as dialog ingest.
Common request shape
| Field | Type | Req | Description |
|---|---|---|---|
| `routing.user_id` | `string[]` | Yes | Strong isolation axis. |
| `routing.memory_domain` | `string` | Yes | Domain for governance and retrieval. |
| `routing.run_id` | `string` or `null` | No | Optional compile/run identifier. |
| `routing.trace_id` | `string` or `null` | No | Optional tracing identifier. |
| `source_ref.source_id` | `string` | Yes | Stable source ID and purge scope. |
| `source_ref.file_path` | `string` or `null` | One of | Local or mounted source path. |
| `source_ref.blob_ref` | `string` or `null` | One of | Object storage reference. |
| `source_ref.recorded_at` | `string` or `null` | No | Optional RFC3339 recording timestamp. |
| `commit_id` | `string` or `null` | No | Idempotent key. |
| `overwrite_existing` | `boolean` | No | When true, the service rewrites the same source scope safely. |
The service accepts source references, not raw file bodies. For public deployments, prefer blob_ref; for same-host / on-prem deployments, file_path is acceptable.
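The request shape above can be sketched as a small builder. `build_media_ingest` is a hypothetical helper. The table marks `file_path` and `blob_ref` as "One of"; this sketch interprets that as exactly one of the two, which is an assumption (the service may also accept both).

```python
from typing import Any, Dict, List, Optional


def build_media_ingest(user_id: List[str], memory_domain: str, source_id: str,
                       file_path: Optional[str] = None,
                       blob_ref: Optional[str] = None,
                       overwrite_existing: bool = False,
                       commit_id: Optional[str] = None) -> Dict[str, Any]:
    """Build a video/audio ingest body from a source reference."""
    # ASSUMPTION: "One of" is read as exactly one of file_path / blob_ref.
    if (file_path is None) == (blob_ref is None):
        raise ValueError("provide exactly one of file_path or blob_ref")
    source_ref: Dict[str, Any] = {"source_id": source_id}
    if file_path is not None:
        source_ref["file_path"] = file_path
    else:
        source_ref["blob_ref"] = blob_ref
    body: Dict[str, Any] = {
        "routing": {"user_id": list(user_id), "memory_domain": memory_domain},
        "source_ref": source_ref,
        "overwrite_existing": overwrite_existing,
    }
    if commit_id is not None:
        body["commit_id"] = commit_id
    return body


print(build_media_ingest(["u:1001"], "video", "demo.mp4",
                         file_path="/data/demo.mp4", overwrite_existing=True))
```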
`GET /ingest/jobs/{id}` reports the state of an asynchronous ingestion job.
Returns Status Enums: RECEIVED, STAGE2_RUNNING, STAGE2_FAILED, STAGE3_RUNNING, STAGE3_FAILED, COMPLETED.
Job Types: dialog, media_video, media_audio.
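A minimal polling sketch over these status values. `wait_for_job` is a hypothetical helper that takes any callable returning the current status string, so the HTTP client is left to the caller. Treating `STAGE2_FAILED` and `STAGE3_FAILED` as terminal is an assumption; this document does not say whether failed jobs are retried server-side.

```python
import time
from typing import Callable

# ASSUMPTION: failed stages are terminal and will not resume on their own.
TERMINAL_STATES = {"COMPLETED", "STAGE2_FAILED", "STAGE3_FAILED"}


def wait_for_job(fetch_status: Callable[[], str],
                 interval_s: float = 2.0,
                 max_polls: int = 30,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the job reaches a terminal state or the poll budget runs out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_polls} polls")


# Demo with a canned status sequence instead of real HTTP calls.
canned = iter(["RECEIVED", "STAGE2_RUNNING", "STAGE3_RUNNING", "COMPLETED"])
print(wait_for_job(lambda: next(canned), sleep=lambda _seconds: None))
```

In real use, `fetch_status` would call `GET /ingest/jobs/{id}` and return the status field from the response.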
`POST /retrieval/dialog/v2` provides advanced orchestration for LLM inference (Hybrid Match + Graph Logic).
| Field | Type | Req | Description |
|---|---|---|---|
| `query` | `string` | Yes | Natural-language user prompt. |
| `user_tokens` | `string[]` | No | Optional user mappings; derived from tenant scope when omitted. |
| `with_answer` | `boolean` | No | Asks the server to synthesize an LLM answer from the retrieved context. |
| `topk` | `number` | No | Default 30. |
| `client_meta` | `object` | Yes | Must include at least `memory_policy` and `user_id`; can also supply BYOK/provider metadata. |
Returns an Evidence Map containing `tkg_event_id`, `score`, `text`, `_base_score`, and optionally `tkg_explain` chains that show exactly where each memory was derived from.
The same retrieval surface can explain dialog, image, audio, and video-derived evidence because they land in the same graph and share the same isolation keys.
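For illustration, a retrieval request body can be assembled from the fields above. `build_retrieval_request` is a hypothetical helper; the `"default"` value for `memory_policy` and the string form of `user_id` are placeholders, not documented values.

```python
from typing import Any, Dict, List, Optional


def build_retrieval_request(query: str, user_id: str, memory_policy: str,
                            with_answer: bool = False, topk: int = 30,
                            user_tokens: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build a body for POST /retrieval/dialog/v2 from the documented fields."""
    if not query.strip():
        raise ValueError("query must not be empty")
    body: Dict[str, Any] = {
        "query": query,
        "topk": topk,  # the documented default is 30
        "with_answer": with_answer,
        "client_meta": {"memory_policy": memory_policy, "user_id": user_id},
    }
    if user_tokens:
        # Optional: the server derives tokens from tenant scope when omitted.
        body["user_tokens"] = list(user_tokens)
    return body


print(build_retrieval_request("What does he like eating?", "u:1001", "default",
                              with_answer=True))
```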
`POST /search` is the raw Vector + Graph query; it runs nearest-neighbor search over Qdrant.
It accepts `expand_graph` (boolean), which controls whether Neo4j neighborhoods are loaded, and `threshold` (number), which cuts off results by similarity before the Reranker pipeline runs.
Low-level operations that bypass the semantic LLM generation pipelines. Each requires an explicit payload: `entries: []`, `patch: {}`, or graph `links: [{src_id, dst_id, rel_type}]`.
Governs merges between separate identity mappings through an approval step. Endpoints: `/equiv/pending/add`, `/confirm`, `/remove`.
Hot-reloads system heuristics without restarting the memory core. The endpoints overlay `memory.search.rerank`, graph limits, and scope resolution, and modify parameters such as `alpha_vector`, `beta_bm25`, and `gamma_graph`.
Structured entity, relation, and state surfaces.
- `GET /memory/v1/entities`: Returns paginated Entity objects (`name`, `type`, `first_mentioned`, `mention_count`).
- `GET /memory/v1/topics`: Returns hierarchical topics organized by `topic_path`.

Takes a `{topic: "..."}` body and returns the topic's chronological state changes as a `timeline` JSON array; each entry holds `event_id`, `when`, and `summary`.
Pass `{"entity": "Alice"}` to retrieve a 360-degree profile with these fields:
- `facts`: Raw immutable graph mappings.
- `relations`: `co_occurs_with` relationship networks mapping to known contacts.
- `recent_events`: Immediate temporal context hits.
Fetches verbatim spoken records from `UtteranceEvidence` blocks, each mapped back to a `speaker_id`, to prevent hallucinated citations.
Reports when an entity or topic was last observed, and returns a `days_ago` float representing temporal decay.
Atomic tracing interface. Given an `event_id` (e.g. `"xyz"`), returns the connected `entities`, `places`, `evidences`, `utterances`, and `knowledge` tags that define the event's context, intended for frontend visualizers.
`POST /memory/v1/clear` is a total wipe operation scoped to the targeted tenant.
It requires an explicit `{"scope": "tenant", "reason": "wipe", "confirm": true}` payload. If `confirm=false`, the call is a dry run that returns estimated vector counts.
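A cautious pattern is to run the dry run first and only confirm after inspecting its result. The sketch below assumes nothing about the response shape: `post` is any caller-supplied callable that sends the body and returns the parsed response, and `approve` is a caller-supplied decision function. Both names, and `clear_tenant` itself, are hypothetical.

```python
from typing import Any, Callable, Dict, Optional


def clear_payload(reason: str, confirm: bool) -> Dict[str, Any]:
    """Body for the tenant wipe; confirm=False requests a dry run."""
    return {"scope": "tenant", "reason": reason, "confirm": confirm}


def clear_tenant(post: Callable[[Dict[str, Any]], Dict[str, Any]],
                 reason: str,
                 approve: Callable[[Dict[str, Any]], bool]) -> Optional[Dict[str, Any]]:
    """Dry-run first, then wipe only if the caller approves the preview."""
    preview = post(clear_payload(reason, confirm=False))
    if not approve(preview):
        return None  # nothing was deleted
    return post(clear_payload(reason, confirm=True))


# Demo with a fake transport that records what would be sent.
sent = []


def fake_post(body: Dict[str, Any]) -> Dict[str, Any]:
    sent.append(body)
    return {"echo": body["confirm"]}


print(clear_tenant(fake_post, "wipe", approve=lambda preview: False), sent)
```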
Tracks how a subject's property values change over time, with ISO-dated validity bounds.
- `POST /memory/state/current`: Takes `subject_id` and a property (e.g. `property="job"`). Returns `item: {value: "employed", valid_from: "2026-01-01"}`.
- `POST /memory/state/changes`: Retrieves the full sequential history of changes to an object over time.
- `POST /memory/state/pending/list`: Lists pending, non-approved state collisions.
- `POST /memory/state/pending/approve`: Human-In-The-Loop authorization pipeline.
- `GET /memory/agentic/tools?format=openai`: Compiles the Tool Schemas of the local Python endpoints into a REST-parsable schema format.
- `POST /memory/agentic/execute`: Executes an explicitly named function, without routing.
- `POST /memory/agentic/query`: Submit a `query="Natural phrase"`. LATRACE internally prompts a secondary LLM Router, decides the exact `tool_used`, serializes `tool_args`, executes the matching system process asynchronously, and returns a `ToolResult`.
| HTTP | Meaning | Developer Mitigation |
|---|---|---|
| `201` | Async Job Created | Wait; poll `GET /ingest/jobs/{id}` for completion. |
| `400` | Payload Missing | Send valid JSON with valid headers, roles, and all required fields. |
| `401` | Unauthorized | Refresh the JWT or `X-API-Token`. Ensure the HMAC signature uses the correct timestamp. |
| `403` | Tenant Breach | Hard rejection: cross-tenant `X-Tenant-ID` access is blocked. |
| `404` | Data Missing | The object does not exist within the tenant; handle empty results gracefully. |
| `409` | Conflict | `commit_id` matched an existing processing task. Safe to skip. |
| `413` | Over-Capacity | Keep JSON payloads under 10MB. |
| `429` | Rate Limit Exhausted | Back off exponentially with random jitter, commonly +60s. |
| `503` | Reranker Subsystem Down | Subsystem temporarily isolated by an active circuit breaker. Retry later. |
| `504` | Graph Timeout | Multi-hop derivation exhausted system capacity. Reduce `expand_graph` bounds. |
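One way to encode the table above in a client is a status classifier plus a jittered backoff. This is a sketch, not a prescribed policy: the action labels, the `classify` and `backoff_delay` names, the 1-second base, and the 60-second cap are illustrative choices, and the jitter scheme used here ("full jitter") is one common variant of exponential backoff.

```python
import random
from typing import Callable

# Action labels are hypothetical; they only summarize the table's guidance.
ACTIONS = {
    201: "poll",           # async job created: poll the job endpoint
    409: "skip",           # commit_id already being processed
    429: "retry",          # rate limited: back off and retry
    503: "retry",          # subsystem temporarily unavailable
    504: "retry_smaller",  # graph timeout: retry with reduced expansion
}


def classify(status: int) -> str:
    """Map an HTTP status to a client action."""
    if status in ACTIONS:
        return ACTIONS[status]
    return "ok" if 200 <= status < 300 else "fail"


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 60.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full-jitter backoff: a random delay up to min(cap, base * 2**attempt)."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng() * ceiling


for code in (200, 201, 403, 409, 429, 504):
    print(code, classify(code))
```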
- Asynchronous Batches: Submit entire dialogue sessions once they are finished or paused. Submitting on every single character generates unstable subgraph entities.
- Never Consolidate Text: Supplying `{"role": "user", "text": "USER said X, and AI said Y"}` is architecturally forbidden. Give each speaker its own turn so that `UtteranceEvidence` extraction attributes text correctly.
- Commit Flags: Pass highly stable `commit_id` hashes to `POST /ingest` calls so that frontend network errors do not multiply entities on retry attempts.
- Media Sources: For video and audio, submit a stable `source_ref` and let the service compile the source into the same graph. Prefer `blob_ref` for remote deployments and `file_path` only when the service can access the path directly.
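One way to obtain a stable `commit_id` is to hash a canonical serialization of the session, so a retried request carries the same key. This is a sketch under assumptions: `stable_commit_id` is a hypothetical helper, and SHA-256 over sorted-key JSON is an illustrative choice; the service only needs the value to be stable and unique per submission.

```python
import hashlib
import json
from typing import Any, Dict, List


def stable_commit_id(session_id: str, turns: List[Dict[str, Any]]) -> str:
    """Derive the same commit_id for the same session content on every retry."""
    canonical = json.dumps(
        {"session_id": session_id, "turns": turns},
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace differences
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


turns = [
    {"turn_id": "s1-0", "role": "user", "text": "I like apples."},
    {"turn_id": "s1-1", "role": "assistant", "text": "Noted."},
]
print(stable_commit_id("s1", turns))
```

Because the hash covers the turns, appending a new turn yields a new `commit_id`, which matches the batch-on-session-close guidance above.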