Engram docs #345
Open: g-despot wants to merge 24 commits into main from engram
Commits
301bac7 Add initial docs (g-despot)
bc28bd0 Update docs (g-despot)
1b934bb Remove duplicate reference (g-despot)
a129d89 Merge branch 'main' into engram (g-despot)
8157717 Update docs (g-despot)
bd35628 Update docs (g-despot)
cbde720 Add three tutorials (g-despot)
6ba99c6 Merge branch 'main' into engram (g-despot)
d8469d5 Update concepts (g-despot)
4c90794 Update docs (g-despot)
11e3181 Minor update (g-despot)
567503c Improve Engram docs (g-despot)
4e54443 Improve code verification (g-despot)
4ed4952 Update docs (g-despot)
165cda3 Improve code (g-despot)
85f9cab Update concepts (g-despot)
f398fad Minor updates (g-despot)
70a714d Improve diagrams and linking (g-despot)
0a69c5a Add links (g-despot)
ff9a955 Improve diagram (g-despot)
5f40a64 Improve concepts (g-despot)
fc28b39 Link to REST API (g-despot)
4c6bac3 Update diagrams (g-despot)
c4f56a6 Implement feedback (g-despot)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| --- | ||
| title: API reference | ||
| sidebar_position: 4 | ||
| description: "Engram REST API overview: authentication, base URL, and interactive API reference." | ||
| --- | ||
|
|
||
| The Engram API is a REST API for storing, searching, and managing memories. | ||
|
|
||
| ## Base URL | ||
|
|
||
| Your project's API URL is available in the [Weaviate Cloud console](https://console.weaviate.cloud). It follows the format: | ||
|
|
||
| ``` | ||
| https://your-project.engram.weaviate.cloud | ||
| ``` | ||
|
|
||
| ## Authentication | ||
|
|
||
| Include your API key in the `Authorization` header: | ||
|
|
||
| ``` | ||
| Authorization: Bearer eng_your_api_key | ||
| ``` | ||
|
|
||
| API keys are scoped to a project. All requests authenticated with a key operate within that project's scope. You can create and manage API keys in the Weaviate Cloud console. | ||
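
As a rough illustration of the header format above, the following Python sketch builds (but does not send) an authenticated request with the standard library. The helper name `build_request`, the project URL, and the key value are placeholders chosen for this example, not part of the Engram API.

```python
import urllib.request


def build_request(base_url: str, api_key: str, path: str) -> urllib.request.Request:
    """Return a request carrying the Bearer token; nothing is sent here."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Placeholder values in the formats shown above.
req = build_request(
    "https://your-project.engram.weaviate.cloud",
    "eng_your_api_key",
    "/v1/runs/run-uuid",
)
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the header logic easy to check without network access.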
|
|
||
| ## Interactive API reference | ||
|
|
||
| The full API reference is generated from the OpenAPI spec and includes request/response schemas, parameter details, and example payloads. | ||
|
|
||
| **[Open the interactive API reference](/engram/api-reference/rest)** | ||
|
|
||
| ## Endpoints | ||
|
|
||
| | Method | Path | Description | | ||
| |--------|------|-------------| | ||
| | POST | `/v1/memories` | Store a new memory (async) | | ||
| | GET | `/v1/memories/{id}` | Get a memory by ID | | ||
| | DELETE | `/v1/memories/{id}` | Delete a memory | | ||
| | POST | `/v1/memories/search` | Search memories | | ||
| | GET | `/v1/runs/{run_id}` | Get pipeline run status | | ||
|
|
||
| ## Error format | ||
|
|
||
| All error responses use a consistent format: | ||
|
|
||
| ```json | ||
| { | ||
| "status": 400, | ||
| "message": "error description" | ||
| } | ||
| ``` | ||
|
|
||
| ### HTTP status codes | ||
|
|
||
| | Code | Description | | ||
| |------|-------------| | ||
| | 400 | Bad request — invalid input or missing required fields | | ||
| | 401 | Unauthorized — missing or invalid API key | | ||
| | 403 | Forbidden — insufficient permissions | | ||
| | 404 | Not found — resource does not exist | | ||
| | 500 | Internal server error | | ||
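
A client can map the error body above onto a single exception type. The sketch below is one possible approach; `EngramAPIError` and `parse_error` are hypothetical names introduced here, and the only assumption about the payload is the `status` and `message` fields shown in the example.

```python
import json


class EngramAPIError(Exception):
    """Hypothetical exception type wrapping the documented error body."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


# Status codes listed in the table above.
DOCUMENTED_CODES = {400, 401, 403, 404, 500}


def parse_error(body: str) -> EngramAPIError:
    """Turn a JSON error body with `status` and `message` into an exception."""
    payload = json.loads(body)
    return EngramAPIError(int(payload["status"]), str(payload["message"]))


err = parse_error('{"status": 400, "message": "error description"}')
print(err.status in DOCUMENTED_CODES, err)
```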
|
|
||
| ## Questions and feedback | ||
|
|
||
| import DocsFeedback from '/_includes/docs-feedback.mdx'; | ||
|
|
||
| <DocsFeedback/> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,131 @@ | ||
| --- | ||
| title: Concepts | ||
| sidebar_position: 2 | ||
| description: "Core concepts in Engram: memories, topics, groups, scoping, pipelines, retrieval types, and runs." | ||
| --- | ||
|
|
||
| This page explains the core concepts behind Engram's memory system. | ||
|
|
||
| ## Memories | ||
|
|
||
| A memory is a discrete piece of information stored in Engram. Each memory has: | ||
|
|
||
| - **Content** — The text of the memory (e.g. "The user prefers dark mode"). | ||
| - **Topic** — The category it belongs to (e.g. `user_facts`, `preferences`). | ||
| - **Group** — The memory group that defines how it was processed. | ||
| - **Scope** — The project, user, and conversation it belongs to. | ||
| - **Tags** — Optional string labels for additional classification. | ||
|
|
||
| Memories are automatically embedded as vectors, making them searchable by meaning. | ||
|
|
||
| ## Topics | ||
|
|
||
| Topics are named categories within a group. They tell Engram what kinds of information to extract and how to scope it. | ||
|
|
||
| Each topic has: | ||
|
|
||
| | Property | Description | | ||
| |----------|-------------| | ||
| | `name` | Unique identifier within the group (e.g. `user_facts`) | | ||
| | `description` | Natural language description used in LLM prompts during extraction (e.g. "What food the user likes to eat") | | ||
| | `scoping` | Whether the topic requires a `user_id` and/or `conversation_id` | | ||
| | `is_bounded` | Whether the topic has size constraints | | ||
|
|
||
| The topic `description` is important — it's what the extraction pipeline uses to decide how to categorize information. For example, a travel agent might have separate topics with descriptions like "The places the user would like to visit" and "What food the user likes to eat" so the pipeline can route extracted facts to the right topic. | ||
|
|
||
| When you create a project, Engram sets up a default group with a default topic automatically. | ||
|
|
||
| ## Groups | ||
|
|
||
| A group bundles a pipeline definition with one or more topics. Each project can have multiple named groups, but most use cases only need the `default` group. | ||
|
|
||
| Groups provide: | ||
|
|
||
| - A stable UUID identifier for the pipeline configuration | ||
| - Topic definitions that control what gets extracted | ||
| - Pipeline steps that define the processing flow | ||
| - Topic name isolation — different groups can have topics with the same name without collision (e.g. two agents can each have a `user_preferences` topic in separate groups) | ||
|
|
||
| ## Scoping | ||
|
|
||
| Engram uses a multi-level scoping system to isolate memories: | ||
|
|
||
| - **Project** — Always required. Every memory belongs to a project, identified by the API key. | ||
| - **User** — Required for user-scoped topics. Memories are strictly isolated between users. | ||
| - **Conversation** — Required when storing to conversation-scoped topics. Optional when searching (see below). | ||
|
|
||
| Which scopes are required depends on the topic configuration: | ||
|
|
||
| ### User-scoped topics | ||
|
|
||
| User-scoped topics store memories that belong to an individual user, such as preferences or personal details. Memories are strictly isolated between users — a query for one `user_id` never returns another user's memories. Both storing and searching require the `user_id`. | ||
|
|
||
| ### Project-wide topics | ||
|
|
||
| Topics that are not user-scoped are shared across the entire project. These are useful for procedural memory — things an agent learns about how to perform a task, regardless of which user it is working with. No `user_id` is needed for storing or searching. | ||
|
|
||
| ### Conversation-scoped topics | ||
|
|
||
| Conversation-scoped topics associate memories with a specific conversation. When **storing**, you must provide the `conversation_id`. When **searching**, the `conversation_id` is optional: | ||
|
|
||
| - **With `conversation_id`** — Returns only memories from that conversation (e.g. to get a summary of a specific chat). | ||
| - **Without `conversation_id`** — Returns memories across all conversations (e.g. to find everything a user has discussed). | ||
|
|
||
| Conversation-scoped topics are typically also user-scoped (e.g. conversation summaries are private to a user). | ||
|
|
||
| ### Multiple topics in one request | ||
|
|
||
| A single request can interact with multiple topics. When it does, the required scope parameters are the union of each topic's requirements. For example, if one topic requires `user_id` and another requires `conversation_id`, the request must include both. | ||
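
The union rule can be sketched in a few lines of Python. The topic table below is a hypothetical configuration for illustration only (`user_facts` appears in these docs; `conversation_summary` and `procedures` are made-up names), and the helper functions are not part of any Engram client.

```python
from typing import Dict, Iterable, Mapping, Optional, Set

# Hypothetical topic configuration: topic name -> scope parameters it requires.
TOPIC_SCOPES: Dict[str, Set[str]] = {
    "user_facts": {"user_id"},
    "conversation_summary": {"user_id", "conversation_id"},
    "procedures": set(),  # project-wide topic, no extra scope needed
}


def required_scopes(topics: Iterable[str]) -> Set[str]:
    """Union of the scope parameters required by each topic in the request."""
    required: Set[str] = set()
    for name in topics:
        required |= TOPIC_SCOPES[name]
    return required


def missing_scopes(topics: Iterable[str], provided: Mapping[str, Optional[str]]) -> Set[str]:
    """Scope parameters the request still needs, given what was provided."""
    present = {key for key, value in provided.items() if value}
    return required_scopes(topics) - present


print(sorted(required_scopes(["user_facts", "conversation_summary"])))
print(sorted(missing_scopes(["conversation_summary"], {"user_id": "user-uuid"})))
```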
|
|
||
| ## Pipelines | ||
|
|
||
| When you send content to Engram, it runs through an asynchronous pipeline that extracts, transforms, and commits memories. Pipelines are defined as a directed acyclic graph (DAG) of steps. | ||
|
|
||
| ### Input types | ||
|
|
||
| Engram accepts three types of input content: | ||
|
|
||
| | Type | Description | Use case | | ||
| |------|-------------|----------| | ||
| | `string` | Raw text | Free-form notes, agent observations | | ||
| | `pre_extracted` | Already-structured content | When you've done your own extraction | | ||
| | `conversation` | Multi-turn messages with roles | Chat transcripts, agent conversations | | ||
|
|
||
| ### Pipeline steps | ||
|
|
||
| Each pipeline processes content through a sequence of steps: | ||
|
|
||
| 1. **Extract** — Pulls structured memories from the input content. The extraction method depends on the input type (`ExtractFromString`, `ExtractFromConversation`, or `ExtractFromPreExtracted`). | ||
| 2. **Transform** — Refines extracted memories using existing context. Steps like `TransformWithContext` and `TransformOperations` deduplicate, merge, and resolve conflicts with existing memories. | ||
| 3. **Commit** — Finalizes the operations (create, update, delete) and persists them to storage. | ||
|
|
||
| ## Retrieval types | ||
|
|
||
| Engram supports three search strategies: | ||
|
|
||
| | Type | Description | Best for | | ||
| |------|-------------|----------| | ||
| | `vector` | Pure semantic search using embeddings | Finding conceptually related memories | | ||
| | `bm25` | Full-text keyword search | Exact term matching | | ||
| | `hybrid` | Combination of vector and BM25 | General-purpose search (recommended) | | ||
|
|
||
| You specify the retrieval type in the `retrieval_config` when searching. | ||
|
|
||
| ## Runs | ||
|
|
||
| Each call to store memories creates a **run** — a trackable unit of pipeline execution. Runs have four possible states: | ||
|
|
||
| | Status | Meaning | | ||
| |--------|---------| | ||
| | `running` | Pipeline is actively processing | | ||
| | `in_buffer` | Queued and waiting to start | | ||
| | `completed` | All operations committed successfully | | ||
| | `failed` | An error occurred during processing | | ||
|
|
||
| When a run completes, its `committed_operations` field shows exactly which memories were created, updated, or deleted. | ||
|
|
||
| ## Questions and feedback | ||
|
|
||
| import DocsFeedback from '/_includes/docs-feedback.mdx'; | ||
|
|
||
| <DocsFeedback/> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| --- | ||
| title: Check run status | ||
| sidebar_position: 4 | ||
| description: "How to poll pipeline run status and interpret committed operations in Engram." | ||
| --- | ||
|
|
||
| When you store memories, Engram processes them asynchronously through a pipeline. Each request returns a `run_id` that you can use to track progress. | ||
|
|
||
| ## Poll a run | ||
|
|
||
| ```bash | ||
| curl "$ENGRAM_API_URL/v1/runs/{run_id}" \ | ||
| -H "Authorization: Bearer $ENGRAM_API_KEY" | ||
| ``` | ||
|
|
||
| ### Response | ||
|
|
||
| ```json | ||
| { | ||
| "run_id": "run-uuid", | ||
| "status": "completed", | ||
| "group_id": "group-uuid", | ||
| "starting_step": 0, | ||
| "input_type": "string", | ||
| "error": null, | ||
| "committed_operations": { | ||
| "created": [ | ||
| { | ||
| "memory_id": "memory-uuid-1", | ||
| "committed_at": "2025-01-01T00:00:01Z" | ||
| } | ||
| ], | ||
| "updated": [], | ||
| "deleted": [] | ||
| }, | ||
| "created_at": "2025-01-01T00:00:00Z", | ||
| "updated_at": "2025-01-01T00:00:01Z" | ||
| } | ||
| ``` | ||
|
|
||
| ## Run statuses | ||
|
|
||
| | Status | Meaning | | ||
| |--------|---------| | ||
| | `running` | Pipeline is actively processing the content | | ||
| | `in_buffer` | Run is queued and waiting to start | | ||
| | `completed` | All operations have been committed successfully | | ||
| | `failed` | An error occurred during processing | | ||
|
|
||
| ## Committed operations | ||
|
|
||
| When a run completes, the `committed_operations` field tells you exactly what changed: | ||
|
|
||
| - **`created`** — New memories that were added to storage. | ||
| - **`updated`** — Existing memories that were modified (e.g. merged or refined). | ||
| - **`deleted`** — Memories that were removed (e.g. superseded by an update). | ||
|
|
||
| Each entry includes the `memory_id` and a `committed_at` timestamp. | ||
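
To show how the `committed_operations` field might be consumed, here is a small Python sketch that collects the memory IDs per operation type. It assumes only the response shape shown on this page (including `null` for a failed run); `committed_ids` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def committed_ids(run: Dict[str, Any]) -> Dict[str, List[str]]:
    """Collect memory IDs per operation kind; tolerate a null field."""
    ops = run.get("committed_operations") or {}
    result: Dict[str, List[str]] = {}
    for kind in ("created", "updated", "deleted"):
        entries = ops.get(kind) or []
        result[kind] = [entry["memory_id"] for entry in entries]
    return result


# Trimmed copy of the example response above.
sample_run = {
    "run_id": "run-uuid",
    "status": "completed",
    "committed_operations": {
        "created": [
            {"memory_id": "memory-uuid-1", "committed_at": "2025-01-01T00:00:01Z"}
        ],
        "updated": [],
        "deleted": [],
    },
}
print(committed_ids(sample_run))
```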
|
|
||
| ## Handling failures | ||
|
|
||
| If a run fails, the `error` field contains a description of what went wrong. | ||
|
|
||
| ```json | ||
| { | ||
| "run_id": "run-uuid", | ||
| "status": "failed", | ||
| "error": "extraction failed: invalid input format", | ||
| "committed_operations": null, | ||
| "created_at": "2025-01-01T00:00:00Z", | ||
| "updated_at": "2025-01-01T00:00:01Z" | ||
| } | ||
| ``` | ||
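
One way to surface a failed run in client code is to convert it into an exception as soon as the run object is read. This is a minimal sketch under the assumption that the run is already parsed into a dictionary with the `status`, `run_id`, and `error` fields shown above; `RunFailed` and `raise_if_failed` are hypothetical names.

```python
from typing import Any, Dict


class RunFailed(Exception):
    """Hypothetical exception raised for runs whose status is `failed`."""


def raise_if_failed(run: Dict[str, Any]) -> Dict[str, Any]:
    """Return the run unchanged, or raise with the server-provided error text."""
    if run.get("status") == "failed":
        raise RunFailed(f"run {run.get('run_id')} failed: {run.get('error')}")
    return run


failed_run = {
    "run_id": "run-uuid",
    "status": "failed",
    "error": "extraction failed: invalid input format",
    "committed_operations": None,
}
try:
    raise_if_failed(failed_run)
except RunFailed as exc:
    print(exc)
```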
|
|
||
| :::tip | ||
| For production systems, implement a polling loop that checks the run status at regular intervals (e.g. every 1-2 seconds) until the status is `completed` or `failed`, and add a timeout so the loop cannot run indefinitely. | ||
| ::: | ||
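
A polling loop along these lines could look like the following Python sketch. The fetch function is injected so the loop stays independent of any HTTP library; `wait_for_run`, its parameters, and the default interval and timeout are assumptions made for this example, not values prescribed by Engram.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_run(
    fetch_run: Callable[[str], Dict[str, Any]],
    run_id: str,
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the run reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        run = fetch_run(run_id)
        if run["status"] in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run['status']} after {timeout}s")
        sleep(interval)


# Stand-in fetcher; a real one would call GET /v1/runs/{run_id}.
_responses = iter(
    [{"status": "in_buffer"}, {"status": "running"}, {"status": "completed"}]
)
final = wait_for_run(lambda _id: next(_responses), "run-uuid", sleep=lambda _s: None)
print(final["status"])
```

Injecting `sleep` as a parameter lets tests run the loop instantly, and the explicit deadline keeps a stuck run from blocking the caller forever.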
|
|
||
| ## Questions and feedback | ||
|
|
||
| import DocsFeedback from '/_includes/docs-feedback.mdx'; | ||
|
|
||
| <DocsFeedback/> | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| --- | ||
| title: Guides | ||
| sidebar_position: 3 | ||
| description: "Step-by-step guides for storing, searching, and managing memories in Engram." | ||
| --- | ||
|
|
||
| These guides cover common Engram operations with detailed examples. | ||
|
|
||
| ## Available guides | ||
|
|
||
| - **[Store memories](store-memories.md)** — Send string, pre-extracted, or conversation content to Engram. | ||
| - **[Search memories](search-memories.md)** — Query memories using vector, BM25, or hybrid search with filtering and scoping. | ||
| - **[Manage memories](manage-memories.md)** — Get and delete individual memories by ID. | ||
| - **[Check run status](check-run-status.md)** — Poll pipeline runs and interpret committed operations. | ||
|
|
||
| ## Questions and feedback | ||
|
|
||
| import DocsFeedback from '/_includes/docs-feedback.mdx'; | ||
|
|
||
| <DocsFeedback/> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,64 @@ | ||
| --- | ||
| title: Manage memories | ||
| sidebar_position: 3 | ||
| description: "How to get and delete individual memories in Engram by ID." | ||
| --- | ||
|
|
||
| You can retrieve and delete individual memories using their ID. | ||
|
|
||
| ## Get a memory | ||
|
|
||
| Retrieve a single memory by its ID. | ||
|
|
||
| ```bash | ||
| curl "$ENGRAM_API_URL/v1/memories/{id}?user_id={user-uuid}&topic={topic-name}&group={group-name}" \ | ||
| -H "Authorization: Bearer $ENGRAM_API_KEY" | ||
| ``` | ||
|
|
||
| ### Query parameters | ||
|
|
||
| | Parameter | Type | Description | | ||
| |-----------|------|-------------| | ||
| | `user_id` | string | User scope (required if the topic is user-scoped) | | ||
| | `conversation_id` | string | Conversation scope (required if the topic is conversation-scoped) | | ||
| | `topic` | string | The topic the memory belongs to | | ||
| | `group` | string | The memory group name | | ||
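
When building this URL programmatically, the query string should be URL-encoded and unused scope parameters omitted. The Python sketch below shows one way to do that with the standard library; `memory_url` is a hypothetical helper, and the base URL and IDs are placeholders.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def memory_url(
    base_url: str,
    memory_id: str,
    topic: str,
    group: str,
    user_id: Optional[str] = None,
    conversation_id: Optional[str] = None,
) -> str:
    """Build the /v1/memories/{id} URL, skipping scope parameters left unset."""
    params = {
        "user_id": user_id,
        "conversation_id": conversation_id,
        "topic": topic,
        "group": group,
    }
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{base_url.rstrip('/')}/v1/memories/{quote(memory_id, safe='')}?{query}"


url = memory_url(
    "https://your-project.engram.weaviate.cloud",
    "memory-uuid",
    topic="user_facts",
    group="default",
    user_id="user-uuid",
)
print(url)
```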
|
|
||
| ### Response | ||
|
|
||
| ```json | ||
| { | ||
| "id": "memory-uuid", | ||
| "project_id": "project-uuid", | ||
| "user_id": "user-uuid", | ||
| "conversation_id": null, | ||
| "content": "The user prefers dark mode", | ||
| "topic": "user_facts", | ||
| "group": "default", | ||
| "tags": ["preference", "ui"], | ||
| "created_at": "2025-01-01T00:00:00Z", | ||
| "updated_at": "2025-01-01T00:00:00Z", | ||
| "score": null | ||
| } | ||
| ``` | ||
|
|
||
| ## Delete a memory | ||
|
|
||
| Remove a memory permanently by its ID. | ||
|
|
||
| ```bash | ||
| curl -X DELETE "$ENGRAM_API_URL/v1/memories/{id}?user_id={user-uuid}&topic={topic-name}&group={group-name}" \ | ||
| -H "Authorization: Bearer $ENGRAM_API_KEY" | ||
| ``` | ||
|
|
||
| The query parameters are the same as for the get request. You must provide the correct scoping parameters to identify the memory. | ||
|
|
||
| :::warning | ||
| Deleting a memory is permanent and cannot be undone. | ||
| ::: | ||
|
|
||
| ## Questions and feedback | ||
|
|
||
| import DocsFeedback from '/_includes/docs-feedback.mdx'; | ||
|
|
||
| <DocsFeedback/> |
I don't think we want to encourage this generally. The idea is that most of the time people shouldn't be waiting on the pipelines completing, the memories should just be eventually consistent.
While the status could change to `failed` part way through a pipeline running, that should only be because of internal errors. It's probably best IMO to encourage the user to just check the run status that is returned from `memories.add` to see if it errored immediately.