Skip to content

Add AG2 integration for multi-agent tracing#2596

Open
faridun-ag2 wants to merge 2 commits intoconfident-ai:mainfrom
faridun-ag2:add-ag2-integration
Open

Add AG2 integration for multi-agent tracing#2596
faridun-ag2 wants to merge 2 commits intoconfident-ai:mainfrom
faridun-ag2:add-ag2-integration

Conversation

@faridun-ag2
Copy link
Copy Markdown

Summary

  • Adds deepeval/integrations/ag2/ with instrument_ag2() that patches AG2's ConversableAgent to automatically capture AgentSpan, LlmSpan, and ToolSpan for DeepEval evaluation
  • Follows the same pattern as the existing CrewAI and LangChain integrations (monkey-patching + Observer context managers)
  • Includes usage example in examples/tracing/ag2_tracing.py

What it captures

AG2 method DeepEval span Data captured
generate_reply / a_generate_reply AgentSpan Agent name, reply output
OpenAIWrapper.create LlmSpan Model name, messages, token counts
execute_function / a_execute_function ToolSpan Function name, arguments, result

Usage

from deepeval.integrations.ag2 import instrument_ag2
instrument_ag2()

# Run AG2 agents as usual — traces captured automatically
executor.run(assistant, message="...").process()

Why AG2

AG2 is an open-source multi-agent orchestration framework with 500K+ monthly PyPI downloads. It's a natural addition alongside the existing CrewAI, LangChain, and LlamaIndex integrations.

Files

deepeval/integrations/ag2/
├── __init__.py      # Public exports
├── handler.py       # instrument_ag2() / reset_ag2_instrumentation()
├── wrapper.py       # Method wrapping (generate_reply, execute_function, OpenAIWrapper.create)
examples/tracing/
├── ag2_tracing.py   # Usage example

No changes to existing code.

Test plan

  • Import verification: from deepeval.integrations.ag2 import instrument_ag2 works
  • Lifecycle: instrument → verify patched → reset → verify restored
  • AgentSpan: generate_reply creates span with correct agent name
  • ToolSpan: execute_function creates span with function name, args, output
  • Nested spans: tool span appears as child of agent span
  • Error handling: failed tools get ERRORED status
  • E2E: full conversation with gpt-4o-mini produces traces
  • Linting: black + ruff pass cleanly

Add `deepeval/integrations/ag2/` with `instrument_ag2()` that patches
AG2's ConversableAgent to capture agent, LLM, and tool spans for
DeepEval evaluation — following the same pattern as the existing
CrewAI and LangChain integrations.
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 3, 2026

@faridun-ag2 is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

These 20 files were already failing black --check on main branch
before this PR. Reformatting them here to pass CI.
@faridun-ag2
Copy link
Copy Markdown
Author

Hi @penguine-ip @kritinv!
We'd love your review on this when you get a chance!

This PR adds an AG2 multi-agent tracing integration following the same pattern as the existing CrewAI and LangChain integrations. All new files, no changes to existing code.

Note on CI: The test_core job fails on forked PRs due to a pre-existing issue — the ignore path for test_dataset_iterator.py is missing the test_integration/ subdirectory:

--ignore=tests/test_core/test_tracing/test_dataset_iterator.py            # current (wrong)
--ignore=tests/test_core/test_tracing/test_integration/test_dataset_iterator.py  # correct

This causes the test to run without an OPENAI_API_KEY and fail. The black formatting failures (20 files) were also pre-existing on main — I've included a fix for those in a separate commit.

Happy to address any feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant