Skip to content

Add finance-agent routing eval dataset and builder guidance#1625

Open
maxpetrusenko wants to merge 1 commit intoopenai:mainfrom
maxpetrusenko:feat/finance-agent-routing-eval
Open

Add finance-agent routing eval dataset and builder guidance#1625
maxpetrusenko wants to merge 1 commit intoopenai:mainfrom
maxpetrusenko:feat/finance-agent-routing-eval

Conversation

@maxpetrusenko
Copy link
Copy Markdown

@maxpetrusenko maxpetrusenko commented Feb 24, 2026

Summary

  • add a new template-compatible eval: finance-agent-routing
  • add a 24-sample JSONL dataset for deterministic tool-routing intent checks
  • add a short note to docs/build-eval.md describing route-label evals and multi-label ambiguity handling

Why

Tool-using agent systems usually fail first at planner routing before answer generation quality. This eval pattern provides a deterministic regression gate for that routing layer using existing Match template support (no custom eval code).

Files

  • evals/registry/evals/finance-agent-routing.yaml
  • evals/registry/data/finance_agent_routing/samples.jsonl
  • docs/build-eval.md

Validation

  • validated JSONL structure with local parsing (24 samples, all rows include input and ideal)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant