fix(security): harden auth system and fix run journal logic bug#2593
fix(security): harden auth system and fix run journal logic bug#2593WillemJiang wants to merge 2 commits intorelease/2.0-rcfrom
Conversation
- Fix inverted condition in RunJournal.on_chat_model_start that prevented
first human message capture (not messages → messages)
- Pre-hash passwords with SHA-256 before bcrypt to avoid silent 72-byte
truncation vulnerability
- Move load_dotenv() from module scope into get_auth_config() to prevent
import-time os.environ mutation breaking test isolation
- Return generic ‘Invalid token’ instead of exposing specific error
variants (expired, malformed, invalid_signature) to clients
- Make @require_auth independently enforce 401 instead of silently
passing through when AuthMiddleware is absent
- Rate-limit /setup-status endpoint with per-IP cooldown to mitigate
initialization-state information leak
- Document in-process rate limiter limitation for multi-worker deployments
There was a problem hiding this comment.
Pull request overview
This PR hardens DeerFlow’s authentication surface area (token errors, decorator behavior, setup-state exposure, password hashing) and fixes a RunJournal callback condition that prevented capturing the first human prompt in a run.
Changes:
- Fix RunJournal’s first-human-message capture by correcting an inverted
messagescondition. - Harden auth behavior: generic invalid-token messaging and
@require_authfail-closed enforcement. - Strengthen security posture: SHA-256 pre-hash before bcrypt, defer
load_dotenv()to runtime config parsing, and add per-IP cooldown on/setup-statuswith documentation.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| backend/tests/test_langgraph_auth.py | Updates assertions to expect the new generic “Invalid token” message. |
| backend/tests/test_initialize_admin.py | Clears the new /setup-status cooldown state between tests for isolation. |
| backend/tests/test_auth.py | Updates require_auth test to expect 401 on unauthenticated requests. |
| backend/packages/harness/deerflow/runtime/journal.py | Fixes RunJournal condition so first human message is captured when messages exist. |
| backend/app/gateway/routers/auth.py | Documents in-process rate limiter limits and adds /setup-status per-IP cooldown. |
| backend/app/gateway/langgraph_auth.py | Returns generic “Invalid token” instead of leaking token error variants. |
| backend/app/gateway/authz.py | Makes @require_auth independently enforce authentication (401) without relying on middleware. |
| backend/app/gateway/auth/password.py | Adds SHA-256 pre-hashing before bcrypt for long-password safety. |
| backend/app/gateway/auth/config.py | Moves load_dotenv() into get_auth_config() to avoid import-time env mutation. |
| def _pre_hash(password: str) -> bytes: | ||
| """Pre-hash password with SHA-256 to bypass bcrypt's 72-byte limit.""" | ||
| return base64.b64encode(hashlib.sha256(password.encode("utf-8")).digest()) | ||
|
|
||
|
|
||
| def hash_password(password: str) -> str: | ||
| """Hash a password using bcrypt.""" | ||
| return bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt()).decode("utf-8") | ||
| """Hash a password using bcrypt with SHA-256 pre-hashing.""" | ||
| return bcrypt.hashpw(_pre_hash(password), bcrypt.gensalt()).decode("utf-8") | ||
|
|
||
|
|
||
| def verify_password(plain_password: str, hashed_password: str) -> bool: | ||
| """Verify a password against its hash.""" | ||
| return bcrypt.checkpw(plain_password.encode("utf-8"), hashed_password.encode("utf-8")) | ||
| return bcrypt.checkpw(_pre_hash(plain_password), hashed_password.encode("utf-8")) | ||
|
|
There was a problem hiding this comment.
Changing verify_password() to always SHA-256 pre-hash before bcrypt will break authentication for any existing users whose password_hash values were generated with the old (raw-password) bcrypt scheme. Consider supporting both formats during a transition (e.g., try legacy check first, then new check; or version/prefix hashes and migrate on successful login), otherwise this is a backwards-incompatible auth change that can lock out current deployments after upgrade.
There was a problem hiding this comment.
@greatmengqi I'd been hit by this issue. It should work, as we don't have a public release yet. But we also need to provide a password reset method for users to use.
| client_ip = _get_client_ip(request) | ||
| now = time.time() | ||
| last_check = _SETUP_STATUS_COOLDOWN.get(client_ip, 0) | ||
| if now - last_check < _SETUP_STATUS_COOLDOWN_SECONDS: | ||
| return {"needs_setup": False} | ||
| _SETUP_STATUS_COOLDOWN[client_ip] = now |
There was a problem hiding this comment.
The /setup-status cooldown currently returns {"needs_setup": False} during the cooldown window, which can mislead legitimate clients (e.g., UI polling) into thinking setup is complete when it may not be. Instead of returning a falsified state, consider returning 429 with Retry-After, or returning a cached last-computed needs_setup value per IP while enforcing the refresh cooldown.
There was a problem hiding this comment.
@copilot apply changes based on this feedback
There was a problem hiding this comment.
Done in commit 9d83642. The /setup-status endpoint now returns HTTP 429 with a Retry-After header (value = remaining cooldown seconds) instead of the misleading {"needs_setup": False} response. The _SETUP_STATUS_COOLDOWN dict also gets bounded eviction (capped at 10,000 IPs) using the same TTL-first then oldest-half strategy as _login_attempts. A new test test_setup_status_rate_limited_on_second_call covers the 429 + Retry-After behavior.
| _SETUP_STATUS_COOLDOWN: dict[str, float] = {} | ||
| _SETUP_STATUS_COOLDOWN_SECONDS = 60 | ||
|
|
There was a problem hiding this comment.
_SETUP_STATUS_COOLDOWN is an unbounded in-memory dict keyed by (potentially attacker-controlled) client IPs. Over time this can grow without limit and become a small memory-DoS vector. Consider adding eviction (max entries + TTL pruning) similar to the _login_attempts limiter, or using a bounded LRU/TTL cache implementation.
There was a problem hiding this comment.
@copilot apply changes based on this feedback
There was a problem hiding this comment.
This was already addressed in commit 9d83642 as part of the /setup-status rate-limiting fix. _SETUP_STATUS_COOLDOWN is now capped at _MAX_TRACKED_SETUP_STATUS_IPS = 10000 entries. When the cap is reached, expired (TTL-past) entries are evicted first; if still over the cap, the oldest half is dropped — mirroring the _login_attempts eviction strategy.
| if now - last_check < _SETUP_STATUS_COOLDOWN_SECONDS: | ||
| return {"needs_setup": False} |
There was a problem hiding this comment.
The new /setup-status rate limiting behavior isn’t covered by tests (e.g., second call within the cooldown window). Adding a targeted test would help prevent regressions and clarify the intended client-visible behavior (429 vs cached value vs other).
| if now - last_check < _SETUP_STATUS_COOLDOWN_SECONDS: | |
| return {"needs_setup": False} | |
| elapsed = now - last_check | |
| if elapsed < _SETUP_STATUS_COOLDOWN_SECONDS: | |
| retry_after = max(1, int(_SETUP_STATUS_COOLDOWN_SECONDS - elapsed)) | |
| raise HTTPException( | |
| status_code=status.HTTP_429_TOO_MANY_REQUESTS, | |
| detail="Setup status check is rate limited", | |
| headers={"Retry-After": str(retry_after)}, | |
| ) |
…nd cooldown dict Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/070d0be8-99a5-46c8-85bb-6b81b5284021 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
|
整体方向对:7 个安全问题里 5 个修得干净(require_auth 独立鉴权、token 错误信息脱敏、load_dotenv 延迟、rate-limiter 文档化、setup-status cooldown)。但 journal 那个"逻辑 bug"修复本身是错的,建议 block 到改完为止。 🔴 必须修:journal.py 的修复是半成品# Before
if not self._first_human_msg and not messages:
for batch in messages.reversed():
for m in batch.reversed():
# After (本 PR)
if not self._first_human_msg and messages:
for batch in messages.reversed(): # ← AttributeError
for m in batch.reversed(): # ← AttributeError
原 guard
应改为: if not self._first_human_msg and messages:
for batch in reversed(messages):
for m in reversed(batch):
...而且这个改动没有任何回归测试。建议补一个 unit test:构造 🟠 应改:password.py 的 schema break 没说
如果这个 auth 系统还没正式发出去,加一行 release note 说明"仅适用于全新部署"即可。如果已经有线上用户,必须做迁移:要么登录时检测旧格式 hash 自动 rehash,要么在 hash 前加版本前缀双路径并存。 🟠 应改:PR 描述和实现对不上PR 描述写的是:
但代码返回的是: raise HTTPException(status_code=status.HTTP_429_TOO_MANY_REQUESTS, ...)两种行为安全语义差别很大:
代码实现是 429,建议改 PR description 与之对齐。或者换成"返回缓存的最近一次结果"——这样既不泄露探测、也不打断合法前端。 🟡 建议:
|
| 修复 | 文件 | 评价 |
|---|---|---|
| require_auth 独立鉴权 | authz.py | ✓ 修好了装饰器和 middleware 必须共生的耦合,测试也更新了 |
| Token 错误脱敏 | langgraph_auth.py | ✓ Invalid token 是标准做法,回归测试同步 |
| load_dotenv 延迟 | config.py | ✓ 解决 import side effect 污染测试隔离 |
| 限流文档化 | auth.py | ✓ 多 worker 弱点说清楚即可,工程取舍合理 |
推荐 Action
- 必改:journal.py 的 `messages.reversed()` / `batch.reversed()` → `reversed(messages)` / `reversed(batch)`,加 unit test
- 应改:password.py 加迁移说明或版本前缀;PR description 与 429 对齐
- 建议:抽一个 `InProcessCooldown` helper 共用;follow-up issue 跟踪 `/initialize` 的 setup-token 方案
|
@greatmengqi journal.py |
Reviewed 2566 security-related code, found out below issues
Changes Summary