Hybrid rules-engine + LLM system for healthcare claims processing — deterministic logic for auditable decisions, RAG-powered reasoning for edge cases.
Live at healthcare-agent-mu.vercel.app
Healthcare claims adjudication is a compliance-sensitive workflow. Every decision must be auditable — "the model decided" is not a valid answer in a regulatory audit. At the same time, the rules are complex enough that a pure rules engine leaves too many edge cases unhandled.
The standard industry approach: manual review for everything the rules don't catch. That's slow, expensive, and inconsistent.
Two-tier processing:
Tier 1: Deterministic rules engine handles ~90% of claims. Every decision produces a line-by-line audit trail. No LLM involved. Fast, explainable, auditable.
Tier 2: RAG-augmented LLM handles edge cases the rules don't cover. ChromaDB retrieves similar historical claims as context. Every LLM decision is logged with its retrieved context and confidence score, so auditors can see exactly what evidence the model was given and how confident it was in the outcome.
Human reviewers see both tiers in a unified UI and can override any decision with a logged reason.
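A minimal sketch of how Tier 1 could work, assuming claims arrive as plain dictionaries. The `Rule` and `Tier1Result` types, the two example rules, and the field names (`member_id`, `category`, `amount`) are hypothetical and are not taken from the project's code. The point is that every rule check appends a line to the audit trail, and a claim that matches no rule is escalated rather than guessed at.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Rule:
    name: str
    applies: Callable[[dict], bool]  # predicate over the claim
    decision: str                    # decision issued when the rule matches


@dataclass
class Tier1Result:
    decision: Optional[str]          # None means "escalate to Tier 2"
    audit_trail: List[str] = field(default_factory=list)


# Hypothetical rules, evaluated in order; the first match wins.
RULES = [
    Rule("missing_member_id", lambda c: not c.get("member_id"), "deny"),
    Rule(
        "routine_under_limit",
        lambda c: c.get("category") == "routine" and c.get("amount", 0) <= 500,
        "approve",
    ),
]


def run_tier1(claim: dict) -> Tier1Result:
    """Apply each rule in order and record one audit line per check."""
    result = Tier1Result(decision=None)
    for rule in RULES:
        matched = rule.applies(claim)
        result.audit_trail.append(
            f"{rule.name}: {'matched' if matched else 'no match'}"
        )
        if matched:
            result.decision = rule.decision
            result.audit_trail.append(f"decision={rule.decision} (by {rule.name})")
            return result
    result.audit_trail.append("no rule matched: escalate to Tier 2")
    return result
```

Calling `run_tier1({"member_id": "M1", "category": "surgery", "amount": 9000})` returns a result whose `decision` is `None`, which is the signal to hand the claim to Tier 2.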
Claim input (CSV or JSON)
↓
Rules engine → deterministic decision + audit log
↓ (if ambiguous)
ChromaDB similarity search → top-k historical claims
↓
LLM adjudication → decision + retrieved context logged
↓
Review UI → human override (optional, logged)
↓
Final decision + full audit trail
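The flow above can be sketched as a single function that threads one audit log through every stage. This is an illustration under stated assumptions, not the project's implementation: `adjudicate`, its parameters, and the log field names are all hypothetical, and the rules engine, retriever, and LLM are passed in as plain callables so the sketch stays independent of any particular library.

```python
from datetime import datetime, timezone
from typing import Callable, List, Optional, Tuple


def adjudicate(
    claim: dict,
    rules: Callable[[dict], Optional[str]],              # returns None if ambiguous
    retrieve: Optional[Callable[[dict, int], List[dict]]],  # top-k similar claims
    llm: Optional[Callable[[dict, List[dict]], Tuple[str, float]]],
    override: Optional[dict] = None,                     # {"decision": ..., "reason": ...}
    top_k: int = 3,
) -> Tuple[str, List[dict]]:
    """Run a claim through both tiers and return (decision, audit_trail)."""
    audit: List[dict] = []

    def log(stage: str, **details) -> None:
        entry = {"stage": stage, "at": datetime.now(timezone.utc).isoformat()}
        entry.update(details)
        audit.append(entry)

    # Tier 1: deterministic rules.
    decision = rules(claim)
    log("rules_engine", decision=decision)

    # Tier 2: only reached when the rules engine could not decide.
    if decision is None:
        similar = retrieve(claim, top_k)
        log("retrieval", retrieved_ids=[s["id"] for s in similar])
        decision, confidence = llm(claim, similar)
        log("llm", decision=decision, confidence=confidence)

    # Optional human override, always logged with its reason.
    if override is not None:
        log(
            "human_override",
            previous=decision,
            decision=override["decision"],
            reason=override["reason"],
        )
        decision = override["decision"]

    log("final", decision=decision)
    return decision, audit
```

Because the retrieved claim IDs and the confidence score are written to the log before any override is applied, the audit trail preserves what the system originally decided even when a reviewer changes the outcome.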
| Layer | Tech |
|---|---|
| Backend | FastAPI, Python |
| Frontend | React, Vite |
| Vector DB | ChromaDB (RAG) |
| LLM | OpenAI-compatible API / local LLMs |
| Storage | SQLite (claims), ChromaDB (vectors) |
| Deploy | Docker Compose, Vercel (frontend) |
- Auditability requirement — healthcare adjudication requires every decision to be explainable. Pure LLM systems fail here. The hybrid architecture exists specifically so every decision has a traceable reason
- RAG context quality — retrieving the wrong historical claims makes the LLM worse, not better. The ChromaDB schema and embedding strategy matter more than the LLM choice
- Human-in-the-loop UX — reviewers aren't engineers. The UI must surface the right context (why did the system decide this?) without overwhelming the reviewer with raw model output
- Edge case definition — deciding what goes to Tier 1 vs Tier 2 is itself a design problem. Too aggressive with rules = brittle. Too aggressive with LLM = unauditable
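One way to act on the RAG context quality point is to filter retrieved claims before they reach the LLM. The sketch below assumes the input has the shape that ChromaDB's `collection.query` returns (lists of lists, one inner list per query, with lower distance meaning more similar). The `filter_context` helper, the `procedure_code` and `outcome` metadata fields, and the `0.35` threshold are hypothetical choices for illustration only.

```python
from typing import Dict, List


def filter_context(
    result: Dict[str, list],
    procedure_code: str,
    max_distance: float = 0.35,   # hypothetical cutoff; tune on real data
) -> List[dict]:
    """Keep only retrieved claims that are close enough and comparable.

    `result` is expected to hold "ids", "distances", and "metadatas",
    each a list containing one inner list for the single query issued.
    """
    kept = []
    rows = zip(result["ids"][0], result["distances"][0], result["metadatas"][0])
    for claim_id, distance, meta in rows:
        if distance > max_distance:
            continue  # too dissimilar to be useful context
        if meta.get("procedure_code") != procedure_code:
            continue  # a different procedure would mislead the model
        kept.append(
            {"id": claim_id, "distance": distance, "outcome": meta.get("outcome")}
        )
    return kept
```

If the filter returns an empty list, a reasonable design choice is to route the claim straight to human review instead of letting the LLM decide without comparable precedent.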
# Clone and set up environment
git clone https://github.com/Shubh3005/HealthcareAgent
cd HealthcareAgent
cp .env.example .env # add your OpenAI API key
# Run with Docker Compose
docker-compose up -d
# Frontend at localhost:3000
# Backend API at localhost:8000/docs
Built by Shubham Gupta · Penn State CS, Schreyer Honors College