
feat: per-project envelope encryption, IDOR protection, material URL proxy, and E2E test suite #460

Merged
GenerQAQ merged 19 commits into dev from fix/encryption-review-issues on Mar 27, 2026

GenerQAQ commented Mar 19, 2026

Why we need this PR?

Add per-project S3 envelope encryption for data-at-rest protection using a zero-knowledge API key architecture. The encryption master key is embedded directly in the user's API key — the server never stores it.

This PR also includes:

  • Cross-project IDOR protection across all resource endpoints
  • Material URL proxy for serving encrypted (and non-encrypted) files via Redis-backed tokens
  • Redis cache encryption for session message parts
  • Comprehensive E2E test suite covering encryption, project isolation, and all major features
  • CI improvements with parallel Docker builds and .dockerignore optimization
  • OSS Dashboard encryption page, route handler proxies for file uploads, and i18n

Describe your solution

1. Encryption Architecture

Key Hierarchy

API Token: sk-ac-{base64url(0x01 | authSecret_16B | AES-KW(wrappingKey, masterKey))}
    │                         76-char base64url body (57 raw bytes)
    │
    ├─ authSecret (hex) → HMAC-SHA256(pepper) → DB lookup key
    ├─ authSecret (hex) → Argon2 → password verification
    └─ authSecret (hex) + pepper → HKDF-SHA256 → wrappingKey
                                                    │
                                    AES-KW-Unwrap(wrappingKey, wrappedMasterKey)
                                                    │
                                              masterKey (= user_kek)
                                                    │
                                    ┌───────────────┤
                                    ▼               ▼
                               S3 objects      DB/Redis content
                          (per-object DEK)   (per-value DEK)
                           AES-256-GCM       AES-256-GCM
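
The two derivations at the top of the hierarchy can be sketched with the standard library alone. This is a minimal illustration, not the code in service.go: the function names are hypothetical, and the choice of the pepper as HMAC key and HKDF salt, as well as the `info` label, are assumptions that the description above does not specify. The AES Key Wrap step is omitted because it needs a third-party AES implementation.

```python
import hashlib
import hmac


def lookup_key(auth_secret_hex: str, pepper: str) -> str:
    """HMAC-SHA256 of the auth secret keyed by the pepper (assumed DB lookup key)."""
    return hmac.new(pepper.encode(), auth_secret_hex.encode(), hashlib.sha256).hexdigest()


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF per RFC 5869 with SHA-256: extract a PRK, then expand it to `length` bytes."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_wrapping_key(auth_secret_hex: str, pepper: str) -> bytes:
    # Assumption: pepper used as salt and a fixed info label; both are illustrative only.
    return hkdf_sha256(auth_secret_hex.encode(), pepper.encode(), b"wrapping-key-sketch")
```

Because both values are derived from the auth secret and the pepper, the server can recompute them on every request without persisting the wrapping key.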

Token Format (Compact)

  • sk-ac-{base64url(0x01 | auth_secret_16B | aes_kw_40B)} — 76-char body, master key (KEK) generated once at project creation
  • Format: version byte (0x01) + 16-byte auth_secret + 40-byte AES Key Wrap (RFC 3394) of 32-byte master key
  • Legacy keys (shorter body, no 0x01 prefix) still authenticate but cannot use encryption
  • ParseProjectToken checks compact format (76 chars, valid base64url, version 0x01) first, falls back to legacy
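
The compact-then-legacy check can be sketched as follows. This is a hedged Python illustration of the rules listed above, not the Go ParseProjectToken itself: the `ParsedToken` field names are hypothetical, and treating the whole legacy body as the auth secret is an assumption.

```python
import base64
import re
from typing import NamedTuple, Optional

PREFIX = "sk-ac-"
COMPACT_BODY = re.compile(r"[A-Za-z0-9_-]{76}")  # 57 raw bytes, unpadded base64url
VERSION = 0x01


class ParsedToken(NamedTuple):
    auth_secret: str                      # hex for compact tokens
    wrapped_master_key: Optional[bytes]   # 40-byte AES-KW blob, None for legacy tokens


def parse_project_token(token: str) -> ParsedToken:
    if not token.startswith(PREFIX) or len(token) == len(PREFIX):
        raise ValueError("not a project token")
    body = token[len(PREFIX):]
    if COMPACT_BODY.fullmatch(body):
        raw = base64.urlsafe_b64decode(body)  # 76 chars decode to exactly 57 bytes
        if raw[0] == VERSION:
            # Layout: version (1 byte) | auth_secret (16 bytes) | AES-KW output (40 bytes)
            return ParsedToken(raw[1:17].hex(), raw[17:])
    # Assumption: a legacy token body is used as the auth secret directly.
    return ParsedToken(body, None)
```

A legacy result carries no wrapped master key, which is why such keys can authenticate but cannot use encryption.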

Envelope Encryption (AES-256-GCM)

  • Per-object random DEK for every S3 upload — limits blast radius of any single DEK compromise
  • DEK wrapped with user KEK and stored in S3 metadata (enc-algo, enc-dek-user)
  • Authenticated encryption — tampering with ciphertext or wrapped DEK is detected
  • Backward compatible: MetadataFromMap returns nil for unencrypted objects, so decryption passes them through unchanged

DB/Redis Content Encryption (content.go / artifact.py)

Wire format for inline encrypted text (asset_meta["content"], Redis parts cache):

base64( 0x01 | wrappedDEK_len [2 bytes BE] | wrappedDEK | ciphertext )

Legacy plaintext auto-detected (no prefix byte or 0x00 prefix).
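
The framing alone (without the AES-256-GCM step) can be sketched like this. The function names are hypothetical, the wrapped DEK and ciphertext are treated as opaque bytes, and the use of standard (not URL-safe) base64 is an assumption.

```python
import base64
import struct
from typing import Optional, Tuple

VERSION_ENCRYPTED = 0x01


def frame_content(wrapped_dek: bytes, ciphertext: bytes) -> str:
    """base64( 0x01 | wrappedDEK_len [2 bytes BE] | wrappedDEK | ciphertext )"""
    header = struct.pack(">BH", VERSION_ENCRYPTED, len(wrapped_dek))
    return base64.b64encode(header + wrapped_dek + ciphertext).decode("ascii")


def unframe_content(value: str) -> Optional[Tuple[bytes, bytes]]:
    """Return (wrapped_dek, ciphertext), or None when value looks like legacy plaintext."""
    try:
        raw = base64.b64decode(value, validate=True)
    except ValueError:  # not base64 at all, so treat as legacy plaintext
        return None
    if len(raw) < 3 or raw[0] != VERSION_ENCRYPTED:
        return None
    (dek_len,) = struct.unpack(">H", raw[1:3])
    if len(raw) < 3 + dek_len:
        return None
    return raw[3:3 + dek_len], raw[3 + dek_len:]
```

A caller that gets `None` back would return the stored value unchanged, which is how legacy plaintext rows keep working.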

Redis Parts Cache Encryption

Session message parts cached in Redis use a 1-byte prefix scheme:

  • 0x00 | json → plaintext
  • 0x01 | ciphertext → encrypted with user KEK
  • Legacy (raw JSON, starts with [/{) → backward compatible
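
A reader for this scheme might dispatch on the first byte as sketched below. The function names and the `decrypt` callback are hypothetical; the callback stands in for decryption with the user KEK.

```python
import json
from typing import Any, Callable, Optional


def classify_cached_parts(blob: bytes) -> str:
    """Map the first byte of a cache entry to its storage kind."""
    first = blob[:1]
    if first == b"\x00":
        return "plaintext"
    if first == b"\x01":
        return "encrypted"
    if first in (b"[", b"{"):
        return "legacy"
    raise ValueError("unrecognized cache entry")


def encode_plain_parts(parts: Any) -> bytes:
    return b"\x00" + json.dumps(parts).encode("utf-8")


def decode_parts(blob: bytes, decrypt: Optional[Callable[[bytes], bytes]] = None) -> Any:
    kind = classify_cached_parts(blob)
    if kind == "legacy":
        return json.loads(blob)          # raw JSON written before the prefix scheme
    if kind == "plaintext":
        return json.loads(blob[1:])
    if decrypt is None:
        raise ValueError("encrypted cache entry but no KEK available")
    return json.loads(decrypt(blob[1:]))
```

The scheme is unambiguous because valid JSON arrays and objects never start with byte 0x00 or 0x01.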

Key Rotation (Zero-Rewrap Design)

  • Rotation replaces only the auth_secret and re-wraps the same master key → no S3 re-encryption needed, O(1) regardless of data volume
  • RewrapDEK is idempotent for crash-safe retries — tries old KEK first, then tries new KEK to detect already-rotated state, returns skip signal if already rotated
  • Bearer route (PUT /admin/v1/project/secret_key): preserves master key from token context
  • JWT admin route (PUT /admin/v1/project/:id/secret_key): blocked for encrypted projects to prevent data loss
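
The idempotent retry logic described for RewrapDEK can be sketched as below. Only the control flow is the point: `toy_wrap` and `toy_unwrap` are deliberately simplistic stand-ins (XOR pad plus HMAC tag) for the real AES-GCM DEK wrapping, are not secure, and every name here is hypothetical.

```python
import hashlib
import hmac
from typing import Tuple


class WrongKey(Exception):
    """Raised when a wrapped DEK does not authenticate under the given KEK."""


def _pad(kek: bytes, n: int) -> bytes:
    return hashlib.sha256(b"pad" + kek).digest()[:n]


def toy_wrap(kek: bytes, dek: bytes) -> bytes:
    # Stand-in for AES-GCM wrapping, NOT real cryptography. DEK must be <= 32 bytes.
    body = bytes(a ^ b for a, b in zip(dek, _pad(kek, len(dek))))
    return body + hmac.new(kek, body, hashlib.sha256).digest()


def toy_unwrap(kek: bytes, wrapped: bytes) -> bytes:
    body, tag = wrapped[:-32], wrapped[-32:]
    if not hmac.compare_digest(tag, hmac.new(kek, body, hashlib.sha256).digest()):
        raise WrongKey()
    return bytes(a ^ b for a, b in zip(body, _pad(kek, len(body))))


def rewrap_dek(wrapped: bytes, old_kek: bytes, new_kek: bytes) -> Tuple[bytes, bool]:
    """Return (wrapped_dek, skipped). Safe to call again after a crash."""
    try:
        dek = toy_unwrap(old_kek, wrapped)
    except WrongKey:
        toy_unwrap(new_kek, wrapped)   # raises WrongKey if neither KEK fits
        return wrapped, True           # already rotated, nothing to do
    return toy_wrap(new_kek, dek), False
```

Running the rewrap twice over the same object is harmless: the second pass reports a skip instead of failing on an already-rotated DEK.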

Crash Safety

  • Encrypt: set encryption_enabled = true FIRST, then batch-encrypt objects. Safe because plaintext objects pass through decryption gracefully.
  • Decrypt: batch-decrypt objects FIRST, then set flag false. Safe because the KEK is still available until flag is cleared.
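
The flag-first ordering for enabling encryption can be simulated in memory. This is a sketch under stated assumptions: `Project`, `enable_encryption`, and the `crash_after` hook are hypothetical, and `toy_cipher` is a self-inverse placeholder rather than AES-256-GCM. Only the `enc-algo` metadata key comes from the description above.

```python
from typing import Dict, Optional, Tuple


def toy_cipher(data: bytes) -> bytes:
    """Placeholder for AES-256-GCM (NOT encryption): XOR with 0xFF, self-inverse."""
    return bytes(b ^ 0xFF for b in data)


class Project:
    def __init__(self, objects: Dict[str, bytes]):
        self.encryption_enabled = False
        # name -> (stored bytes, metadata); metadata gains "enc-algo" once encrypted
        self.objects: Dict[str, Tuple[bytes, dict]] = {k: (v, {}) for k, v in objects.items()}

    def read(self, name: str) -> bytes:
        data, meta = self.objects[name]
        # Objects without encryption metadata pass through untouched.
        return toy_cipher(data) if "enc-algo" in meta else data


def enable_encryption(project: Project, crash_after: Optional[int] = None) -> None:
    project.encryption_enabled = True                 # 1. flip the flag first
    for done, name in enumerate(sorted(project.objects)):
        if crash_after is not None and done >= crash_after:
            raise RuntimeError("simulated crash")
        data, meta = project.objects[name]
        if "enc-algo" in meta:
            continue                                  # 2. idempotent: skip finished objects
        project.objects[name] = (toy_cipher(data), {"enc-algo": "toy"})
```

After a crash midway, the project holds a mix of encrypted and plaintext objects, all of them still readable, and rerunning the batch finishes the job.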

Known Limitations (Documented)

  • Text search (grep) disabled for encrypted projects — regex operates on ciphertext
  • Deduplication disabled for encrypted uploads — different users produce different ciphertext for the same plaintext

2. Material URL Proxy

Replaces direct S3 presigned URLs with an API-proxied download mechanism for all asset downloads.

Why: Presigned S3 URLs return the raw encrypted bytes, which the client cannot decrypt. The proxy downloads the object, decrypts it in-process, and streams plaintext to the client.

  • MaterialService.CreateMaterialURL() → generates 32-byte random token, stores MaterialMeta (S3 key, base64 KEK, MIME, filename) in Redis with TTL
  • GET /api/v1/material/{token} → unauthenticated endpoint, token is the credential. Looks up Redis, downloads from S3 with KEK, streams decrypted bytes.
  • Used by: GetArtifact, GetMessages (asset URLs), GetAgentSkillFile
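
The token lifecycle can be sketched with an in-memory stand-in for Redis. Everything named here is hypothetical (`TokenStore`, the `material:` key prefix, the 300-second default TTL, the metadata fields); only the 32-byte random token and the URL path come from the description above.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class TokenStore:
    """In-memory stand-in for Redis SET with a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._items: Dict[str, Tuple[dict, float]] = {}

    def set(self, key: str, value: dict, ttl_seconds: float) -> None:
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[dict]:
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._items[key]          # expired entries behave as missing
            return None
        return value


def create_material_url(store: TokenStore, base_url: str, meta: dict,
                        ttl_seconds: float = 300) -> str:
    token = secrets.token_urlsafe(32)     # 32 random bytes; the token is the credential
    store.set("material:" + token, meta, ttl_seconds)
    return f"{base_url}/api/v1/material/{token}"


def resolve_material(store: TokenStore, token: str) -> Optional[dict]:
    return store.get("material:" + token)
```

Since the endpoint is unauthenticated, the protection rests on the token being unguessable and short-lived, so the TTL should be kept as small as the download flow allows.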

3. IDOR Protection (Cross-Project Ownership Validation)

Systematic project ownership validation at both handler and service layers. All violations return 403 Forbidden:

Resource        | Guard
----------------|------------------------------------------------------------
Sessions        | session.ProjectID != project.ID check on every operation
Disks/Artifacts | diskRepo.GetByProjectAndID(ctx, project.ID, diskID) (WHERE id=? AND project_id=?)
Tasks           | Session ownership verification before returning tasks
Agent Skills    | GetByID(ctx, project.ID, id) scoped by project
Session Assets  | S3 key prefix check (assets/{project_id}/) prevents cross-project download

Defense-in-depth: checks exist at both handler AND service layer.
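
Two of the guards above, the project-scoped lookup and the S3 key prefix check, can be illustrated with sqlite3. This is a sketch rather than the repository code: the table layout, function names, and `demo_connection` helper are hypothetical.

```python
import sqlite3
from typing import Optional, Tuple


def get_disk_by_project_and_id(conn: sqlite3.Connection, project_id: str,
                               disk_id: str) -> Optional[Tuple[str, str]]:
    """Scope the lookup by project so another project's disk is simply not found."""
    return conn.execute(
        "SELECT id, project_id FROM disks WHERE id = ? AND project_id = ?",
        (disk_id, project_id),
    ).fetchone()


def asset_key_in_project(s3_key: str, project_id: str) -> bool:
    # The trailing slash keeps project "1" from matching keys of project "10".
    return s3_key.startswith(f"assets/{project_id}/")


def demo_connection() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE disks (id TEXT PRIMARY KEY, project_id TEXT NOT NULL)")
    conn.executemany("INSERT INTO disks VALUES (?, ?)",
                     [("disk-a", "proj-1"), ("disk-b", "proj-2")])
    return conn
```

Putting the project ID into the query itself means a handler cannot forget the ownership comparison after loading the row.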


4. user_kek Threading Through CORE (Python)

  • KEK travels as base64 string in MQ messages (session.user_kek, learning.user_kek)
  • Consumers hard-fail on invalid base64 — never silently fall back to plaintext
  • Reaches: message fetching → artifact creation → skill learner tools → sandbox backends
  • Python encode_content/decode_content mirrors Go's binary framing format exactly
  • crypto.py reimplements Go's envelope primitives (same wire format, cross-service interoperable)
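
The hard-fail rule for consumers might look like the sketch below. The names are hypothetical, and two points are assumptions: that a missing `user_kek` means the project is not encrypted, and that the consumer also rejects keys that are not 32 bytes long.

```python
import base64
import binascii
from typing import Optional


class InvalidKEKError(ValueError):
    """The MQ message carried a user_kek that cannot be used."""


def decode_user_kek(user_kek_b64: Optional[str]) -> Optional[bytes]:
    if user_kek_b64 is None:
        return None                      # assumption: field absent for unencrypted projects
    try:
        kek = base64.b64decode(user_kek_b64, validate=True)
    except (binascii.Error, ValueError) as exc:
        # Never continue without the key: that would write plaintext for an encrypted project.
        raise InvalidKEKError("user_kek is not valid base64") from exc
    if len(kek) != 32:                   # assumption: KEK is the 32-byte master key
        raise InvalidKEKError("user_kek must decode to 32 bytes")
    return kek
```

Raising here lets the consumer mark the job as failed instead of silently storing unencrypted content.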

5. Dashboard UI

Commercial Dashboard (dashboard/)

  • API Key Storage: localStorage-based key management per project (useApiKeyStorage hook)
  • Encryption Settings: New "Encryption" tab in project settings with enable/disable toggles and confirmation dialogs
  • Encryption Status: Shield icons (green/red) in project selector dropdown
  • Key Rotation Routing: Uses Bearer route when key is saved (preserves master key), falls back to JWT admin route

OSS Dashboard (src/server/ui/)

  • Encryption page: Enable/disable encryption with API key input, status display, confirmation dialogs
  • Route handlers: Replaced server action file uploads with Next.js route handlers (api/disk/upload, api/agent_skills/upload, api/session/messages) to proxy FormData to the Go API
  • Sidebar: Added Shield icon + "Encryption" nav item
  • i18n: 22 encryption-related keys added to en.json and zh.json

6. CI & Infrastructure

  • Parallel Docker image builds (4 images with GHA layer cache, mode=max)
  • .dockerignore for both API and CORE to reduce build context
  • Dockerfile.e2e — thin test runner image with tests mounted as volume
  • docker-compose.test.yml — 8-service test stack with mock LLM, disabled Argon2
  • APP_EXTERNALURL default set in server docker-compose.yaml so Load Preview works out of the box
  • cancel-in-progress: true for CI concurrency

7. AGENTS.md Update

Added exception rule: encryption features (encryption settings, API key localStorage, key rotation UI) are commercial-only and do NOT need to be added to Dashboard OSS.


Implementation Tasks

Go API — Crypto Package (internal/infra/crypto/)

  • envelope.go: AES-256-GCM primitives — GenerateDEK, WrapDEK/UnwrapDEK, Encrypt/Decrypt, DeriveKEK (HKDF-SHA256)
  • keywrap.go: RFC 3394 AES Key Wrap — AESKeyWrap/AESKeyUnwrap for deterministic key wrapping in compact tokens
  • service.go: DeriveUserKEK, PackCompactToken/UnpackCompactToken, GenerateMasterKey, EncryptData/DecryptData, RewrapDEK (idempotent), S3 metadata helpers
  • content.go: EncodeContent/DecodeContent with binary framing for DB/Redis content encryption
  • envelope_test.go + content_test.go + keywrap_test.go: Full coverage including determinism, rotation, idempotency, legacy detection, RFC 3394 test vectors

Go API — Token Parsing (internal/pkg/utils/tokens/)

  • ParseProjectToken() with ParsedToken{AuthSecret, CompactRaw} — compact (76 chars, base64url) vs legacy
  • Full edge-case test coverage (empty, short tokens, compact format validation, legacy fallback)

Go API — S3 Blob Layer (internal/infra/blob/s3.go)

  • userKEK []byte parameter on all upload/download functions
  • encryptAndMergeMetadata() / decryptWithUserKEK() helpers
  • Dedup bypass for encrypted uploads
  • EncryptObject(), DecryptObject(), RewrapObjectDEK() for admin operations
  • Optional SSE (server-side encryption) via config

Go API — Material URL Proxy

  • service/material.go: CreateMaterialURL() + ServeMaterial() with Redis token TTL
  • handler/material.go: GET /api/v1/material/{token} (unauthenticated)
  • Unit tests with miniredis for token lifecycle and TTL expiry

Go API — Middleware (internal/middleware/auth.go)

  • ProjectAuth: Extract master key from compact tokens via UnpackCompactToken, set user_kek in gin context
  • Dedicated projectAuthCache struct to preserve secret fields (fixes stale cache from json:"-" on model.Project)
  • GetUserKEK, GetUserKEKIfEncrypted, GetUserKEKBase64IfEncrypted helpers
  • InvalidateProjectAuthCache for Redis auth cache invalidation after encryption state changes

Go API — Encryption + Admin Handlers

  • POST /api/v1/project/encrypt + POST /admin/v1/project/encrypt — enable encryption with idempotent batch re-encryption
  • POST /api/v1/project/decrypt + POST /admin/v1/project/decrypt — disable encryption with batch decryption
  • PUT /admin/v1/project/secret_key (Bearer auth) — preserves master key
  • PUT /admin/v1/project/:id/secret_key (JWT) — blocked for encrypted projects

Go API — Handlers (IDOR + Encryption Threading)

  • session.go: IDOR checks on all operations + userKEK threading to StoreMessage/GetMessages/Delete/DownloadAsset
  • artifact.go: diskRepo.GetByProjectAndID IDOR guard + material URL switchover + KEK threading
  • disk.go: Ownership validation on delete
  • task.go: Session ownership check before returning tasks
  • agent_skills.go: Project-scoped queries + KEK threading + material URLs

Go API — Services

  • session.go: Redis parts cache encryption (prefix byte scheme), material URLs for message assets, MQ KEK forwarding
  • artifact.go: KEK threading to S3 + EncodeContent for text content
  • project.go: Compact token generation with embedded encrypted master key, unified RotateSecretKey
  • agent_skills.go: Material URLs for binary skill files, KEK propagation in concurrent uploads
  • session_cache_test.go: 12 unit tests for Redis prefix byte scheme

Go API — Config & DI

  • secretPepper configurable via ROOT_SECRET_PEPPER env var
  • App.ExternalURL for material proxy URLs
  • MaterialService registered in DI container, injected into SessionService + AgentSkillsService

Go API — Model & Repo

  • Project.EncryptionEnabled bool field added (GORM)
  • disk.GetByProjectAndID — central IDOR guard
  • asset_reference.ListS3KeysByProject — for batch encrypt/decrypt
  • AssetRefBuffer — Redis-buffered asset reference writer (from merged dev)

Python CORE — Crypto & S3

  • infra/crypto.py: AES-256-GCM envelope encryption (Go-compatible wire format)
  • infra/s3.py: Auto-encrypt/decrypt with user_kek on upload/download
  • service/data/artifact.py: encode_content/decode_content with binary framing (mirrors Go)
  • service/data/message.py: KEK threading to S3 download for message parts
  • service/data/agent_skill.py: KEK threading for skill creation
  • service/data/sandbox.py: KEK threading for sandbox file operations

Python CORE — MQ Consumers & Controllers

  • service/session_message.py: Hard-fail on invalid KEK, forward to message controller
  • service/skill_learner.py: Hard-fail on invalid KEK for both pipeline stages
  • service/controller/message.py: Forward user_kek to data fetchers
  • service/controller/skill_learner.py: Re-encode KEK for MQ republication

Python CORE — Skill Learner Tools

  • SkillLearnerCtx.user_kek — KEK carrier into all LLM tools
  • get_skill_file.py: Decrypt before returning to LLM
  • create_skill.py, create_skill_file.py, str_replace_skill_file.py: Encrypt on write

Python CORE — Sandbox Backends

  • All 5 backends (E2B, Cloudflare, AWS AgentCore, Novita, base): user_kek parameter on upload/download

Python CORE — Schema & ORM

  • schema/mq/session.py + learning.py: user_kek field in MQ messages
  • schema/orm/project.py: encryption_enabled column (synced with Go GORM)
  • schema/api/request.py: SandboxUploadRequest.user_kek

Dashboard (Commercial)

  • use-api-key-storage.ts: localStorage hook for API key per project
  • api-keys-page-client.tsx: Save/show/copy/delete API key card; auto-save on rotation
  • general-page-client.tsx: Encryption toggle tab with confirmation dialogs
  • top-nav.tsx: ShieldCheck/ShieldX encryption status icons per project
  • operations/admin.ts: encryptProject, decryptProject, rotateProjectSecretKey (Bearer)
  • actions.ts: Server actions for encrypt/decrypt operations

Dashboard OSS (src/server/ui/)

  • encryption/page.tsx: Encryption settings page with enable/disable toggles, API key input, confirmation dialogs
  • encryption/actions.ts: Server actions for encryption status, enable, disable
  • api/disk/upload/route.ts, api/agent_skills/upload/route.ts, api/session/messages/route.ts: Route handler proxies replacing server action file uploads
  • components/app-sidebar.tsx: Shield icon + Encryption nav item
  • messages/en.json + zh.json: 22 encryption-related i18n keys

Documentation

  • encryption.mdx: User guide — architecture, enable/disable steps, key rotation, warnings
  • AGENTS.md: Encryption exception rule for Dashboard OSS

E2E Tests

  • test_encryption.py (946 lines): 24 tests — enable/disable, upload/download, material URLs, message store/retrieve, batch encrypt/decrypt, key rotation, session copy/delete, agent skill encryption, Redis cache inspection (verifies 0x01 prefix + no plaintext leakage), cross-project rejection
  • test_project_isolation.py (519 lines): 27 IDOR tests across all endpoints
  • test_disk_artifact.py (425 lines): Disk CRUD, artifact ls/grep/glob, session configs/copy/flush, message meta patch
  • test_agent_skills.py (253 lines): ZIP upload, pagination, get file, delete, invalid ZIP
  • test_learning_spaces.py (373 lines): CRUD, skill associations, meta filter
  • test_users.py (156 lines): User listing, resource counts, cascade delete
  • conftest.py: Shared fixtures (direct DB seeding, async polling, helper functions)

Unit Tests

  • envelope_test.go + content_test.go + keywrap_test.go: Full crypto primitive coverage + RFC 3394 vectors
  • tokens_test.go: Token parsing — compact vs legacy format
  • auth_test.go: Cache round-trip + old-format stale entry rejection
  • session_cache_test.go: Redis prefix byte scheme (12 tests)
  • material_test.go: Material URL lifecycle (5 tests)
  • test_artifact_data.py: 35+ Python unit tests including encode/decode encryption

CI & Infrastructure

  • .github/workflows/e2e-test.yaml: Parallel image builds with GHA layer cache
  • docker-compose.test.yml: 8-service test stack
  • docker-compose.yaml: APP_EXTERNALURL default for Load Preview
  • Dockerfile.e2e: Thin test runner image
  • .dockerignore for API and CORE

Impact Areas

  • API Server
  • Core Service
  • Dashboard
  • Documentation
  • Client SDK (Python)
  • Client SDK (TypeScript)
  • CLI Tool
  • Other: E2E test infrastructure, CI workflow

Checklist

  • Open your pull request against the dev branch.
  • All tests pass in available continuous integration systems (e.g., GitHub Actions).
  • Tests are added or modified as needed to cover code changes.

GenerQAQ requested a review from a team as a code owner March 19, 2026 10:24
GenerQAQ added a commit that referenced this pull request Mar 19, 2026
- Return 404 instead of 500 when GetByPath fails in DownloadArtifact
- Thread UserKEK through learning space creation so default skills are encrypted
- Update docs for encryption-aware SDK behavior (download fallback, raw_content)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GenerQAQ added a commit that referenced this pull request Mar 20, 2026
1. Thread userKEK through Delete and CopySession handlers/service/repo
   so encrypted projects can properly decrypt S3 parts during these ops
2. Fix SDK auth header: use Authorization Bearer instead of X-Encryption-Key
3. Preserve original S3 ContentType in EncryptObject/DecryptObject by
   returning it from downloadRaw instead of hardcoding octet-stream

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GenerQAQ added a commit that referenced this pull request Mar 20, 2026
- Thread user_kek through DownloadToSandbox pipeline (Go→Python) for encrypted projects
- Add disk ownership check in GetArtifact to prevent IDOR
- Fix atomicity: set encryption flag before encrypting, decrypt before clearing flag
- Add userKEK param to GetAllMessages so encrypted messages are decrypted
- Remove unused base64/secrets imports in admin.go
- Add AGENTS.md exception: encryption features are commercial-only (no OSS Dashboard)
- Add 5 e2e regression tests for cross-project access, copy/delete/skills encryption

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GenerQAQ added a commit that referenced this pull request Mar 20, 2026
1. Cache encrypted data in Redis: cachePartsInRedis/getPartsFromRedis now
   encrypt cached parts when userKEK is present, using prefix-byte framing
   (0x00=plaintext, 0x01=encrypted) with backward compatibility for legacy
   entries.

2. Add disk ownership check to GrepArtifacts/GlobArtifacts: both handlers
   now verify disk belongs to the authenticated project via
   diskRepo.GetByProjectAndID, closing an IDOR vulnerability.

3. Set SSE on RewrapObjectDEK CopyObject: CopyObjectInput now includes
   ServerSideEncryption when configured, matching all other write operations.

Includes Go unit tests (miniredis) for all cache encryption paths and e2e
tests that directly verify Redis cache contents for encrypted/plain projects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GenerQAQ added a commit that referenced this pull request Mar 20, 2026
Covers the disk ownership check added in PR #460 with:
- Unit tests for 404 on cross-project disk access and 400 on invalid body
- E2E test for cross-project upload_from_sandbox denial

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GenerQAQ force-pushed the fix/encryption-review-issues branch from 40d656e to 6c6cafb on March 20, 2026 14:04
GenerQAQ force-pushed the fix/encryption-review-issues branch 2 times, most recently from ae4db17 to 463a8aa on March 22, 2026 08:29
GenerQAQ force-pushed the fix/encryption-review-issues branch from 21e0307 to 7afd1ff on March 23, 2026 10:09
GenerQAQ changed the title from "feat: S3 envelope encryption for data-at-rest protection" to "feat: per-project envelope encryption, IDOR protection, material URL proxy, and E2E test suite" on Mar 24, 2026
GenerQAQ force-pushed the fix/encryption-review-issues branch 2 times, most recently from e3542b1 to e2fa91f on March 26, 2026 03:57
…DOR protection

Encryption architecture:
- AES-256-GCM envelope encryption with per-object random DEKs
- Zero-knowledge key hierarchy: master key embedded in API token
  (sk-ac-{authSecret}.{encryptedMasterKey}), never stored server-side
- HKDF-SHA256 key derivation from auth secret + pepper
- Idempotent RewrapDEK for crash-safe key rotation (O(1), no S3 re-encryption)
- Redis cache encryption with prefix byte scheme (0x00=plain, 0x01=encrypted)
- DB content encryption via EncodeContent/DecodeContent binary framing

Material URL proxy:
- Replaces presigned S3 URLs with Redis-backed decryption proxy
- GET /api/v1/material/{token} — unauthenticated, token is the credential
- Enables transparent download of encrypted S3 objects

IDOR protection:
- Cross-project ownership validation on all resource endpoints
- Defense-in-depth checks at both handler and service layers
- diskRepo.GetByProjectAndID, session.ProjectID guards, S3 key prefix checks

Admin encryption lifecycle:
- POST /admin/v1/project/encrypt — idempotent batch encryption
- POST /admin/v1/project/decrypt — batch decryption
- PUT /admin/v1/project/secret_key (Bearer) — preserves master key
- PUT /admin/v1/project/:id/secret_key (JWT) — blocked for encrypted projects
…tion

- Add crypto.py: AES-256-GCM envelope encryption (Go-compatible wire format)
- S3 client: auto-encrypt/decrypt with user_kek on upload/download
- MQ consumers: hard-fail on invalid KEK (never silent plaintext fallback)
- Skill learner: SkillLearnerCtx carries KEK to all LLM tool handlers
- Artifact data: encode_content/decode_content with binary framing (mirrors Go)
- Sandbox backends: user_kek parameter on all 5 backends (E2B, CF, AWS, Novita, base)
- ORM: encryption_enabled column synced with Go GORM model
- Unit tests: 35+ cases for artifact data including encryption round-trips
…atus indicators

- Encryption toggle in Project Settings with confirmation dialogs
- API key save/show/copy/delete via useApiKeyStorage localStorage hook
- ShieldCheck/ShieldX encryption status icons in project selector
- Bearer-authenticated encrypt/decrypt/rotate server actions
- Key rotation auto-saves new key when previous key is stored
- Envelope encryption architecture and API key format
- Enable/disable encryption steps
- Key rotation (zero-rewrap design) instructions
- Warnings: key loss, disabled text search, disabled dedup

E2E tests (6 new test files, 1864 lines):
- test_encryption.py: 17 tests — lifecycle, upload/download, material URLs,
  key rotation, Redis cache inspection (verifies no plaintext leakage)
- test_project_isolation.py: 20 IDOR regression tests across all endpoints
- test_disk_artifact.py: disk CRUD, artifact ls/grep/glob, session configs
- test_agent_skills.py: ZIP upload, pagination, file download, deletion
- test_learning_spaces.py: CRUD, skill associations, meta JSONB filtering
- test_users.py: user listing, resource counts, cascade delete

CI improvements:
- Parallel Docker image builds (4 images with GHA layer cache)
- .dockerignore for API and CORE to reduce build context
- Dockerfile.e2e: thin test runner image with tests as volume mount
- docker-compose.test.yml: 8-service stack with mock LLM, disabled Argon2
GenerQAQ force-pushed the fix/encryption-review-issues branch from e2fa91f to 53bc9ca on March 26, 2026 06:24
…S encryption page

- Scope Redis message parts cache key by project_id to prevent
  cross-project cache collisions (message:parts:{projectID}:{sha256})
- Add update_session_status("failed") in session_message.py on invalid
  KEK decode, matching skill_learner.py's hard-fail pattern
- Add encryption page to OSS dashboard (src/server/ui) with API key
  input and encrypt/decrypt toggle
- Include encryption_enabled in /api/v1/project/configs response
- Update docker-compose default ROOT_API_BEARER_TOKEN to new format
  (auth_secret.encrypted_master_key) so encryption works out of the box
- Add ROOT_SECRET_PEPPER default in docker-compose
- Update EnsureDefaultProjectExists to parse new token format (extract
  auth_secret before the dot for HMAC/PHC hashing)
- Use default pepper "your-secret-pepper" (viper default) for token
  generation instead of a separate pepper — no need for ROOT_SECRET_PEPPER
- Update viper default apiBearerToken to new format matching docker-compose
- Fix encryption page: hide card when initial status fetch fails (e.g.
  Unauthorized), only show error alert

config.yaml uses ${ROOT_SECRET_PEPPER} which os.ExpandEnv resolves to
empty string when the env var is unset, overriding the viper default.
This caused HMAC mismatch (token generated with "your-secret-pepper"
but server used "") resulting in 401 Unauthorized.

Compress API token from 130 to 76 chars (body) by:
- Reducing auth_secret from 32 to 16 bytes (128-bit, still secure)
- Using AES Key Wrap (RFC 3394) instead of AES-GCM for master key
  wrapping (40 bytes vs 60, no random nonce needed)
- Binary packing version byte + auth + wrapped_mk into single base64url

New format: sk-ac-{base64url(0x01 | auth_16B | aes_kw_40B)} = 82 chars
Old dot-separated and legacy formats remain fully supported.

Includes RFC 3394 test vectors (Section 4.1, 4.3, 4.6).

The dot-separated format (sk-ac-{auth}.{encrypted_mk}) was never
deployed. Remove it to simplify the codebase:

- Remove WrapMasterKey/UnwrapMasterKey (AES-GCM token wrapping)
- Remove EncryptedMasterKey field from ParsedToken
- Remove dot-parsing branch in ParseProjectToken
- Remove dot-format test cases
- Remove generateRandomSecret (unused after compact format)
- Update Python E2E tests to use compact format with AES Key Wrap

S3 envelope encryption (WrapDEK/UnwrapDEK via AES-GCM) is unchanged.
…dis keys

- test_key_rotation_plain_project: assert 76-char compact body instead of dot
- test_redis_cache_*: include project_id in Redis key lookup to match
  the project-scoped cache key format (message:parts:{project_id}:{sha256})

The uniqueIndex caused a harmless but noisy GORM AutoMigrate error on
every startup: DROP CONSTRAINT uni_learning_space_sessions_session_id
fails because the constraint doesn't exist in the database.

A session can be learned by multiple learning spaces, so uniqueIndex
was semantically wrong. Changed to plain index.
…rve secret fields

model.Project uses json:"-" on SecretKeyHMAC and SecretKeyHashPHC to
prevent API leakage, but this caused these fields to be silently dropped
when cached in Redis. First request (DB hit) succeeded, but subsequent
requests (Redis hit) failed with 401 because Argon2 verification ran
against empty strings.

Introduce projectAuthCache struct with explicit JSON tags for Redis
serialization. Add guard (SecretKeyHMAC != "") to reject stale entries
from the old format.
…ibility

Move encrypt/decrypt logic into shared package-level functions in
handler/encryption.go. Both AdminHandler and ProjectHandler delegate
to these functions, eliminating code duplication.

Register POST /api/v1/project/encrypt and POST /api/v1/project/decrypt
on the standard API router so OSS Docker deployments (which don't run
the admin binary) can use encryption features.

Admin binary retains /admin/v1/project/encrypt and /admin/v1/project/decrypt
for backward compatibility.

E2E tests updated to call the standard API endpoints.

Server actions cannot reliably serialize File objects in Next.js
standalone/docker builds, causing ERR_INCOMPLETE_CHUNKED_ENCODING 500.

- Add route handlers for disk, agent skills, and session message uploads
- Remove File parameters from server actions
- Fix encryption endpoints to use /api/v1 instead of /admin/v1
… the box

Without a default, buildURL() falls back to container-internal hostname,
producing material URLs unreachable from the browser. Default to
http://localhost:${API_EXPORT_PORT:-8029} for the CLI docker-compose.

Same fix as the CLI docker-compose: default to
http://localhost:${API_EXPORT_PORT:-8029} instead of empty string.
GenerQAQ merged commit ef3a0ff into dev on Mar 27, 2026
18 checks passed
GenerQAQ deleted the fix/encryption-review-issues branch on March 27, 2026 13:04