ABES: Structured Belief Revision & Dynamic Epistemic Memory for Autonomous Agents

ABES is a structured belief revision backend for autonomous agents. It ingests natural language inputs, extracts and stores discrete beliefs, and manages them through a 15-phase agent pipeline covering confidence scoring, contradiction detection, salience decay, and evidence tracking.

Install & Quickstart

Requirements

Python 3.10+
Node.js 18+ (optional debug UI only)
Ollama (optional, for LLM-backed mutation and chat)

External models used:

cross-encoder/nli-deberta-v3-base — contradiction detection via NLI
sentence-transformers/all-MiniLM-L6-v2 — embedding similarity and scoring
Ollama or compatible LLM provider — mutation, narrative, and chat generation

Local setup

git clone https://github.com/Aftermath-Technologies-Ltd/abes.git
cd abes

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Start the backend

# Terminal 1
PYTHONPATH=$PWD uvicorn backend.api.app:app --host 0.0.0.0 --port 8000

# Terminal 2 — add a belief
curl -X POST http://localhost:8000/beliefs \
  -H "Content-Type: application/json" \
  -d '{"content": "System target is alpha-node-4", "confidence": 0.9, "source": "agent"}'

# Retrieve all beliefs
curl http://localhost:8000/beliefs | python3 -m json.tool

Optional debug UI

cd frontend
npm install
npm run dev

Docker

docker compose up
docker compose up --profile ui
docker compose up --profile llm --profile ui
STORAGE_BACKEND=sqlite docker compose up

Architecture Overview

ABES is a backend-first service with an optional TypeScript debug UI. The core is a belief store (in-memory or SQLite) managed by a 15-phase agent pipeline.

Core flow:

Agents or chat clients send payloads via REST or WebSocket.
The scheduler runs 15 ordered phases over the current belief set.
Beliefs are updated in the store.
The relevance stack is selected and ranked for response generation.
Safety and response validation run before output is returned.

Belief Pipeline (15 Phases)

Phase	Agent	What it does
1	Perception	Extracts candidate belief statements from raw input
2	Creation	Creates new Belief records from extracted candidates
3	Reinforcement	Increases confidence and salience on corroborating evidence
4	Decay	Applies time-based confidence and salience decay
5	Contradiction	Detects contradicting pairs via rule-based analysis + NLI fallback
6	Mutation	Proposes revised beliefs for high-tension, low-confidence entries
7	Resolution	Applies resolution strategies to contradiction pairs
8	Relevance	Scores and ranks beliefs by cosine similarity to current context
9	RL Policy	Applies learned parameter adjustments to control variables
10	Consistency	Checks belief state invariants
11	Safety	Detects prompt injection and system prompt extraction attempts
12	Baseline	Bridges to baseline memory implementations for comparison
13	Narrative	Generates summaries of belief state changes
14	Experiment	Collects telemetry for evaluation runs
15	Consolidation	Merges near-duplicate beliefs; prunes low-salience orphans

Belief Model Fields

id, content, confidence, tension, salience
status: active, decaying, dormant, mutated, deprecated
is_axiom: immutable, immune to decay and deprecation
memory_tier: L1 (working), L2 (episodic), L3 (deep)
half_life_days, evidence_for, evidence_against, evidence_balance
links, parent_id, user_id, session_id, origin

Key formulas:

Salience decay: s(t) = s0 * 0.5^(elapsed_hours / (half_life_days * 24))
Confidence update: posterior = 0.7 * evidence_weight + 0.3 * prior_confidence
Stack ranking: weighted score over confidence, relevance, salience, recency, and tension

Usage

CLI commands

Command	Description
`abes demo`	Run scripted ingestion demo (12-turn sequence)
`abes chat`	Launch backend
`abes seed`	Load seed beliefs from JSON
`abes inspect`	Show current belief state
`abes verify-quick`	Run cognitive smoke test
`abes verify-determinism`	Compare repeated runs for reproducibility

abes demo --headless
abes demo --headless --with-decay --decay-hours 12
abes inspect --json-out | jq .
abes verify-quick --prompts 200
abes verify-determinism --runs 5

API Reference Summary

Route	Method	Description
`/beliefs`	GET	List beliefs with optional status/tag filtering
`/beliefs`	POST	Create a new belief
`/beliefs/{id}`	GET	Get single belief
`/beliefs/{id}/ecology`	GET	Full belief state including evidence ledger and links
`/beliefs/fitness`	GET	Beliefs ranked by fitness with tier utilization stats
`/beliefs/clusters`	GET	Cluster summaries with tension and fitness metrics
`/chat/`	POST	Ingest a message through the full pipeline
`/clusters`	GET	List semantic clusters
`/rl/status`	GET	RL policy status and training metrics
`/bel/stack`	GET	Current ranked belief stack for a context query
`/snapshots`	GET	List belief state snapshots
`/docs`	GET	Interactive OpenAPI documentation

Eval Results Summary

1000-Prompt Cognitive Evaluation

Result: 825/1000 (82.5%), 95% CI [0.800, 0.848] — tested on Llama 3.1 8B, single model, self-reported. See docs/EVALUATIONS.md for full breakdown, failure analysis, and planned baselines.

Domain	Score
Episodic Memory	96.8%
Working Memory	94.4%
Semantic Memory	92.8%
Selective Attention	85.6%
Self-Correction	82.4%
Reasoning	79.2%
Language Comprehension	70.4%
Social Cognition	58.4%

Side-by-Side: Structured Memory Layer vs. Stateless LLM

This comparison demonstrates the value of a structured memory layer over a stateless LLM. ABES and raw Ollama receive identical prompts; the score difference reflects what persistent belief management adds, not model quality.

Metric	ABES (structured memory)	Ollama (stateless)
Blocks passed	14/15	6/15
Contradiction detection	Rule-based + NLI	Context window only
Session isolation	Zero cross-session leakage	No memory
Safety (prompt injection)	0 leaks across 5 vectors	3 leaks

Full protocol: docs/side_by_side_eval.md | Results: results/side_by_side_eval.json

Limitations

Modality and temporal contradiction detection are weaker than quantifier and numeric rules.
Social cognition and moral reasoning scores are primarily limited by LLM refusal behavior in the underlying model, not belief-layer mechanics.
In-memory storage is the default; use SQLite (STORAGE_BACKEND=sqlite) for persistence across restarts.
Multi-agent concurrency has not been load-tested.
All evaluations used a single model (Llama 3.1 8B). No external baselines (Mem0, LangGraph MemoryStore, vector DB + reranker) have been benchmarked yet.

Contributing

See CONTRIBUTING.md.

License

GNU Affero General Public License v3.0 (AGPL-3.0)

Attribution requirements: NOTICE. Citation metadata: CITATION.cff.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.github/workflows		.github/workflows
backend		backend
baselines		baselines
beliefs		beliefs
data		data
docs		docs
examples		examples
experiments		experiments
frontend		frontend
interfaces		interfaces
metrics		metrics
results		results
scripts		scripts
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CITATION.cff		CITATION.cff
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
NOTICE		NOTICE
README.md		README.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ABES: Structured Belief Revision & Dynamic Epistemic Memory for Autonomous Agents

Quick Navigation

Install & Quickstart

Requirements

Local setup

Start the backend

Optional debug UI

Docker

Architecture Overview

Belief Pipeline (15 Phases)

Belief Model Fields

Usage

CLI commands

API Reference Summary

Eval Results Summary

1000-Prompt Cognitive Evaluation

Side-by-Side: Structured Memory Layer vs. Stateless LLM

Limitations

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ABES: Structured Belief Revision & Dynamic Epistemic Memory for Autonomous Agents

Quick Navigation

Install & Quickstart

Requirements

Local setup

Start the backend

Optional debug UI

Docker

Architecture Overview

Belief Pipeline (15 Phases)

Belief Model Fields

Usage

CLI commands

API Reference Summary

Eval Results Summary

1000-Prompt Cognitive Evaluation

Side-by-Side: Structured Memory Layer vs. Stateless LLM

Limitations

Contributing

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages