From 4ed268502cfa581c2b9b7bd5e2ae6ee334e28ffd Mon Sep 17 00:00:00 2001
From: himmi-01 <himmi-01@users.noreply.github.com>
Date: Wed, 3 Jun 2026 00:27:36 -0700
Subject: [PATCH] docs: add EvalMonkey Sonnet 4.5 benchmark badge

---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/README.md b/README.md
index 78d7dcaa..9017827a 100644
--- a/README.md
+++ b/README.md
@@ -74,6 +74,12 @@ flowchart TB
 - **Comprehensive Reports**: Produces detailed markdown reports with findings and sources
 - **Concurrent Processing**: Handles multiple searches and result processing in parallel for efficiency
 
+## 📊 EvalMonkey Benchmark Results (Claude Sonnet 4.5)
+
+[![EvalMonkey Reliability](https://img.shields.io/badge/Production%20Reliability-Score%3A46.2-orange)](https://github.com/Corbell-AI/evalmonkey)
+
+*[EvalMonkey](https://github.com/Corbell-AI/evalmonkey) benchmarked this agent on Claude Sonnet 4.5 across HotpotQA, TruthfulQA, and MMLU under adversarial chaos profiles (prompt injection and schema mutation); it scored **46.2/100** on Production Reliability.*
+
 ## Requirements
 
 - Node.js environment