Commit 7c04fc7

Author: Aegis Stack
Commit message: AI Docs
1 parent 815dc6b commit 7c04fc7

13 files changed: 305 additions & 91 deletions

File tree

- docs/images/llm_catalog.png (523 KB)
- docs/images/rag_1.png (463 KB)
- docs/images/rag_2.png (535 KB)
- docs/images/usage.png (544 KB)
- docs/images/voice.png (414 KB)

docs/services/ai/cost-tracking.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -1,5 +1,7 @@
 # Cost Tracking & Usage Analytics
 
+![Usage Analytics](../../images/usage.png)
+
 Every AI request is automatically tracked — tokens consumed, cost calculated, and success or failure recorded. No instrumentation required. The data is available immediately via API and visualized in the frontend analytics dashboard.
 
 !!! info "Requires Database Backend"
```

docs/services/ai/illiana.md

Lines changed: 274 additions & 0 deletions
# Illiana

![Illiana Demo](../../images/illiana-demo.gif)

**Illiana** is the conversational AI interface that ships with every AI-enabled Aegis Stack project. She is not a generic chatbot. She has live awareness of your running system through context injection, and she can search your codebase when RAG is enabled.

She is not required to use Aegis Stack, and nothing in the system depends on her being present. When enabled, she becomes another way to understand what your application is doing and why, alongside the CLI and Overseer.

## What Makes Her Different

Illiana receives live data injected into her system prompt before every response. This means she answers based on what your system is actually doing right now, not what it could theoretically do.

| Context | What She Knows | Example Questions |
|---------|----------------|-------------------|
| **Health** | Component status, uptime, resource usage | "Is my database healthy?" "What's the scheduler doing?" |
| **Usage** | Her own token consumption, costs, success rate | "How much have I spent today?" "What's my most-used model?" |
| **RAG** | Your codebase (when indexed) | "How does auth work in this project?" "Where is the scheduler configured?" |
| **Catalog** | Available models, pricing, capabilities | "What's the cheapest model with vision?" "Compare Claude vs GPT-4o pricing" |

```mermaid
graph TB
    subgraph "AI Service"
        Illiana[Illiana<br/>System-Aware AI Assistant]

        subgraph "Interfaces"
            CLI[CLI Interface<br/>ai chat, llm, rag]
            API[REST API<br/>/ai, /llm, /rag, /voice]
        end

        subgraph "Capabilities"
            Catalog[LLM Catalog<br/>~2000 models]
            RAG[RAG Service<br/>ChromaDB + Embeddings]
            Voice[Voice<br/>TTS + STT]
            Usage[Cost Tracking<br/>Usage Analytics]
        end

        subgraph "Context Injection"
            Health[Health Context]
            UsageCtx[Usage Context]
            RAGCtx[RAG Context]
            CatalogCtx[Catalog Context]
        end

        Providers[Providers<br/>OpenAI, Anthropic, Google<br/>Groq, Mistral, Cohere<br/>Ollama, PUBLIC]
        Conv[Conversations<br/>Memory / SQLite / PostgreSQL]
    end

    Backend[Backend Component<br/>FastAPI]

    Illiana --> CLI
    Illiana --> API
    Illiana --> Providers
    Illiana --> Conv
    Catalog --> Illiana
    RAG --> Illiana
    Usage --> Illiana
    Health --> Illiana
    UsageCtx --> Illiana
    RAGCtx --> Illiana
    CatalogCtx --> Illiana
    API --> Backend

    style Illiana fill:#e8f5e8,stroke:#2e7d32,stroke-width:3px
    style CLI fill:#e1f5fe,stroke:#1976d2,stroke-width:2px
    style API fill:#e1f5fe,stroke:#1976d2,stroke-width:2px
    style Providers fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style Conv fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style Catalog fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
    style RAG fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
    style Voice fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
    style Usage fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
    style Backend fill:#e1f5fe,stroke:#1976d2,stroke-width:2px
```

## Getting Started

```bash
# Generate a project with AI
aegis init my-app --services "ai[sqlite,rag]"
cd my-app && uv sync && source .venv/bin/activate

# Start chatting
my-app ai chat
```

```
Illiana v0.6.4
Provider: public | Model: auto

You: What can you tell me about my system?
Illiana: I can see your system is running with...
```

### With Codebase Context

Index your code so Illiana can reference specific files and line numbers:

```bash
# Index your codebase
my-app rag index ./app --collection code --extensions .py

# Chat with RAG enabled
my-app ai chat --rag --collection code --top-k 20 --sources
```

Now she answers from your actual code instead of generic knowledge:

```
You: How does the auth service validate tokens?
Illiana: Based on your codebase, token validation happens in
app/services/auth/service.py [1]. The validate_token() method...

Sources:
  [1] app/services/auth/service.py:42
  [2] app/components/backend/api/auth/router.py:15
```

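The numbered source references shown above can be produced by a small formatter over the retrieved chunks. This is a hedged sketch, not the project's actual `rag_context.py` code: the `format_sources` name and the `{"path", "line"}` chunk shape are assumptions for illustration.

```python
# Hypothetical sketch: number retrieved chunks and build the
# "Sources:" footer shown in the transcript above.
def format_sources(chunks: list[dict]) -> str:
    lines = ["Sources:"]
    for i, chunk in enumerate(chunks, start=1):
        lines.append(f"  [{i}] {chunk['path']}:{chunk['line']}")
    return "\n".join(lines)

chunks = [
    {"path": "app/services/auth/service.py", "line": 42},
    {"path": "app/components/backend/api/auth/router.py", "line": 15},
]
print(format_sources(chunks))
```

The `[n]` markers let the model cite a specific chunk inline while the footer maps each marker back to a file and line.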
## Slash Commands

During interactive chat, use slash commands for quick actions:

| Command | Description |
|---------|-------------|
| `/help` | Show available commands |
| `/model [name]` | Show current model or switch to a new one |
| `/status` | Show current configuration |
| `/new` | Start a new conversation |
| `/rag [off\|collection]` | Toggle RAG mode or select collection |
| `/sources [enable\|disable]` | Toggle source references in output |
| `/clear` | Clear the screen |
| `/exit` | Exit the chat session |

### Switching Models Mid-Conversation

```
You: /model gpt-4o
✓ Switched to OpenAI/gpt-4o

You: /model claude-sonnet-4-20250514
✓ Switched to Anthropic/claude-sonnet-4-20250514
```

Tab completion is available for model names (populated from Ollama and configured cloud providers).

### RAG Controls

```
You: /rag code
✓ RAG enabled with collection: code

You: /sources enable
✓ Source references enabled

You: /rag off
RAG disabled
```

## Context Injection

Illiana's system prompt is assembled dynamically before every response. Four context formatters inject live data:

### Health Context

**Source:** `app/services/ai/health_context.py`

Injects component health status. Illiana reports what **is** running, not what **could** run.

### Usage Context

**Source:** `app/services/ai/usage_context.py`

Gives Illiana awareness of her own activity: tokens consumed, costs, success rates. Supports a compact mode for smaller models (Ollama) where the context window is limited.

### RAG Context

**Source:** `app/services/ai/rag_context.py`

When RAG is enabled, search results are formatted as markdown with file names, line numbers, and syntax highlighting. Illiana is instructed to answer from this code, not generic knowledge.

### LLM Catalog Context

**Source:** `app/services/ai/llm_catalog_context.py`

Top models per featured vendor (OpenAI, Anthropic, Google, xAI, Mistral, Groq, DeepSeek) with pricing and capabilities. This lets Illiana recommend models when asked.

### Prompt Assembly

**Source:** `app/services/ai/prompts.py`

All contexts are combined via `build_system_prompt()`. Health context is injected last so the LLM weights it more heavily for status questions.

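A minimal sketch of what this assembly step might look like. Only the health-last ordering is taken from the description above; the signature, the section headings, and the `contexts` dict are illustrative assumptions, not the actual `prompts.py` API.

```python
# Hypothetical sketch: each formatter returns a markdown section
# (or "" when its feature is disabled); health is appended last
# so status questions weight it most heavily.
def build_system_prompt(base: str, contexts: dict[str, str]) -> str:
    ordered = ["catalog", "usage", "rag", "health"]  # health last
    sections = [base]
    for name in ordered:
        text = contexts.get(name, "")
        if text:  # skip disabled/empty contexts entirely
            sections.append(f"## {name.title()} Context\n{text}")
    return "\n\n".join(sections)

prompt = build_system_prompt(
    "You are Illiana.",
    {"health": "database: ok, scheduler: ok", "rag": ""},
)
print(prompt)
```

Skipping empty sections keeps the prompt compact for small-context models, which matches the compact mode mentioned for the usage context.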
## Chat Modes

### Single Message

```bash
my-app ai chat "Explain the architecture of this project"
```

### Interactive Session

```bash
my-app ai chat
```

Features:

- Conversation memory (context maintained during session)
- Markdown rendering in terminal
- Streaming responses
- Slash commands
- Tab completion for models and collections

### With RAG

```bash
my-app ai chat --rag --collection code --top-k 20 --sources \
  "How does the scheduler component work?"
```

| Flag | Description |
|------|-------------|
| `--rag` | Enable RAG context |
| `--collection` | Collection to search |
| `--top-k` | Number of search results to include |
| `--sources` | Show source file references after response |

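The conversation-memory feature listed above can be sketched as an in-order message list with a turn budget. This is a hedged illustration — the class name, the message-dict shape, and the fixed-size trimming policy are assumptions, not the stack's actual Memory/SQLite/PostgreSQL backends.

```python
# Hypothetical sketch of session memory: append turns in order and
# trim the oldest once a budget is exceeded, so the provider always
# sees the most recent context.
class Conversation:
    def __init__(self, max_messages: int = 20):
        self.max_messages = max_messages
        self.messages: list[dict] = []

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        if len(self.messages) > self.max_messages:
            # keep only the newest turns
            self.messages = self.messages[-self.max_messages:]

conv = Conversation(max_messages=4)
for i in range(6):
    conv.add("user", f"question {i}")
print(len(conv.messages))  # 4
```

A persistent backend would replace the in-memory list with SQLite or PostgreSQL writes, but the windowing idea is the same.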
## API Access

Illiana is also accessible via the REST API:

```bash
# Chat
curl -X POST http://localhost:8000/ai/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the health of my system?"}'

# Stream
curl -X POST http://localhost:8000/ai/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain the auth service"}' \
  --no-buffer
```

See [API Reference](api.md) for full endpoint documentation.

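The same two requests can be built from Python. The endpoint paths and the `"message"` field come from the curl examples above; everything else here is an assumption — in particular, treating the stream as plain text chunks is a guess at the wire format, not a documented contract.

```python
import json

# Hypothetical client-side sketch mirroring the curl examples.
def chat_payload(message: str, stream: bool = False) -> tuple[str, bytes]:
    # Returns (endpoint path, JSON request body).
    path = "/ai/chat/stream" if stream else "/ai/chat"
    return path, json.dumps({"message": message}).encode()

def consume_stream(chunks) -> str:
    # Assumed format: the reply arrives as plain text chunks;
    # concatenate them into the full response.
    return "".join(chunk for chunk in chunks if chunk)

path, body = chat_payload("What is the health of my system?")
print(path)  # /ai/chat
```

Pair `chat_payload` with any HTTP client (e.g. `httpx` or `urllib.request`) by POSTing `body` to `http://localhost:8000` + `path` with a `Content-Type: application/json` header.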
## Configuration

Illiana uses the same configuration as the AI service:

```bash
# .env
AI_ENABLED=true
AI_PROVIDER=public   # or openai, anthropic, groq, ollama, etc.
AI_MODEL=auto
AI_TEMPERATURE=0.7
AI_MAX_TOKENS=1000
```

Switch models at any time:

```bash
my-app llm use gpt-4o
my-app llm use claude-sonnet-4-20250514
```

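A sketch of how the `AI_*` variables above might be read, using the same defaults shown in the `.env` example. The `load_ai_settings` function is hypothetical — a real project would more likely use a settings library, but the variable names and defaults match the block above.

```python
import os

# Hypothetical sketch: read the AI_* variables with the .env defaults.
def load_ai_settings(env=os.environ) -> dict:
    return {
        "enabled": env.get("AI_ENABLED", "true").lower() == "true",
        "provider": env.get("AI_PROVIDER", "public"),
        "model": env.get("AI_MODEL", "auto"),
        "temperature": float(env.get("AI_TEMPERATURE", "0.7")),
        "max_tokens": int(env.get("AI_MAX_TOKENS", "1000")),
    }

settings = load_ai_settings({"AI_PROVIDER": "ollama"})
print(settings["provider"], settings["model"])  # ollama auto
```

Passing a plain dict instead of `os.environ` makes the defaults easy to exercise in tests.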
---

**Next Steps:**

- **[RAG](rag.md)** - Index your codebase for Illiana to search
- **[LLM Catalog](llm-catalog.md)** - Browse and switch models
- **[Provider Setup](providers.md)** - Configure AI providers
- **[CLI Commands](cli.md)** - Complete CLI reference
