OmniMCP is a UI automation system that enables Claude to control the computer through the Model Context Protocol (MCP). It combines OmniParser's visual understanding with Claude's natural language capabilities to automate UI interactions.
This standalone package provides OmniMCP with minimal dependencies, letting you use the core functionality without installing all of OpenAdapt's dependencies. It's part of a larger refactoring effort to make components more modular and easier to use.
- Python 3.10 or 3.11
- uv - Fast Python package installer and resolver
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the OpenAdapt repository
git clone https://github.com/OpenAdaptAI/OpenAdapt.git
cd OpenAdapt/omnimcp
# Run the installation script (creates a virtual environment using uv)
# For Unix/Mac:
./install.sh
# Note: If you get a permission error, run: chmod +x ./install.sh
# For Windows:
install.bat
This installation method:
- Creates an isolated virtual environment using uv
- Only installs the dependencies needed for OmniMCP
- Sets up Python to find the required OpenAdapt modules without installing the full package
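The last point, making OpenAdapt modules importable without installing the full package, amounts to a path-configuration step. The following is a minimal sketch of that idea, not the actual contents of `pathing.py` or `setup.py`. The helper name `add_repo_root_to_path` is hypothetical, and it assumes the OpenAdapt checkout is the parent of the `omnimcp` directory.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(package_dir: Path) -> Path:
    """Prepend the parent of `package_dir` to sys.path (hypothetical helper).

    Assumes the OpenAdapt checkout is the parent directory of `omnimcp`,
    matching the `cd OpenAdapt/omnimcp` step in the install instructions.
    """
    repo_root = package_dir.resolve().parent
    root_str = str(repo_root)
    if root_str not in sys.path:
        # Insert at the front so the checkout is found before any installed copy.
        sys.path.insert(0, root_str)
    return repo_root


if __name__ == "__main__":
    root = add_repo_root_to_path(Path.cwd())
    print(f"Added {root} to sys.path")
```

Run from inside the `omnimcp` directory, this would make `import openadapt` resolve against the checkout, provided the layout assumption holds.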
After installation, activate the virtual environment:
# For Unix/Mac
source .venv/bin/activate
# For Windows
.venv\Scripts\activate.bat
For development and testing, you can reset the environment with:
# Reset the virtual environment and reinstall dependencies
cd /path/to/OpenAdapt/omnimcp
rm -rf .venv && chmod +x install.sh && ./install.sh
# Run CLI mode (direct command input)
omnimcp cli
# Run MCP server (for Claude Desktop)
omnimcp server
# Run in debug mode to visualize screen elements
omnimcp debug
# Run Computer Use mode (Anthropic's official Computer Use integration)
computer-use
# Connect to a remote OmniParser server
omnimcp cli --server-url=https://your-omniparser-server.example.com
# Deploy OmniParser automatically without confirming
omnimcp cli --auto-deploy-parser --skip-confirmation
# Disable automatic OmniParser deployment attempt
omnimcp cli --auto-deploy-parser=False
# With additional options
omnimcp cli --use-normalized-coordinates
omnimcp debug --debug-dir=/path/to/debug/folder
# Computer Use with specific model
computer-use --model=claude-3-opus-20240229
# Computer Use with auto-deploy of OmniParser
computer-use --auto-deploy-parser --skip-confirmation
OmniMCP requires access to an OmniParser server for analyzing screenshots:
- Use a Remote OmniParser Server (Recommended)
  omnimcp cli --server-url=https://your-omniparser-server.example.com
- Auto-Deploy OmniParser (Convenient but requires AWS credentials)
  - By default, OmniMCP will offer to deploy OmniParser if not available
  - You can control this behavior with these flags:
    # Deploy without asking for confirmation
    omnimcp cli --auto-deploy-parser --skip-confirmation
    # Disable auto-deployment completely
    omnimcp cli --auto-deploy-parser=False
- Use the Default Local Server
  - OmniMCP will try to connect to http://localhost:8000 by default
  - This requires running an OmniParser server locally
- IMPORTANT: Always Use Auto-Deploy with Skip-Confirmation
  - For best results, always use these flags together:
    omnimcp cli --auto-deploy-parser --skip-confirmation
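The options above imply a decision among connecting to a server, deploying one, asking the user, or giving up. Here is a minimal sketch of how that decision could be modeled. The function `plan_parser_connection`, the `ParserPlan` type, and the precedence order are assumptions made for illustration, not OmniMCP's actual logic; only the default URL and the flag names come from the text.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_SERVER_URL = "http://localhost:8000"  # default stated in the list above


@dataclass
class ParserPlan:
    action: str  # one of "connect", "deploy", "ask", "fail" (hypothetical labels)
    server_url: Optional[str] = None


def plan_parser_connection(
    server_url: Optional[str],
    server_reachable: bool,
    auto_deploy: bool = True,
    skip_confirmation: bool = False,
) -> ParserPlan:
    """Decide what to do about the OmniParser server (illustrative only)."""
    url = server_url or DEFAULT_SERVER_URL
    if server_reachable:
        # A reachable server, remote or local, is used directly.
        return ParserPlan("connect", url)
    if not auto_deploy:
        # Mirrors --auto-deploy-parser=False: never attempt a deployment.
        return ParserPlan("fail", url)
    if skip_confirmation:
        # Mirrors --auto-deploy-parser --skip-confirmation.
        return ParserPlan("deploy")
    # Default behavior described above: offer to deploy.
    return ParserPlan("ask")


if __name__ == "__main__":
    print(plan_parser_connection(None, server_reachable=False))
```

Passing reachability in as a boolean keeps the sketch free of network calls; a real client would have to probe the server itself.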
OmniMCP and Anthropic's ComputerUse both enable Claude to control computers, but with different architectural approaches:
Integration Approach:
- OmniMCP uses OmniParser for understanding UI elements
- ComputerUse captures screenshots and provides them directly to Claude
Environment:
- OmniMCP runs directly on the host system with minimal dependencies
- ComputerUse operates in a containerized virtual desktop environment
MCP vs. Anthropic-defined Tools:
- OmniMCP uses the Model Context Protocol (MCP), a structured protocol for AI models to interact with tools
- ComputerUse uses Anthropic-defined tools (computer, text_editor, and bash) via Claude's tool use API
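To make the protocol difference concrete, the sketch below builds the two kinds of payload side by side: an MCP `tools/call` request in JSON-RPC 2.0 form, and an Anthropic-defined `computer` tool declaration. The MCP tool name `click_element` and its arguments are hypothetical, since this document does not list OmniMCP's tool names. The version string `computer_20241022` is one published beta identifier and may differ from what a given release uses.

```python
import json


def mcp_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request (a JSON-RPC 2.0 message)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def anthropic_computer_tool(
    width_px: int, height_px: int, tool_version: str = "computer_20241022"
) -> dict:
    """Build an Anthropic-defined computer tool declaration.

    The default tool_version is an assumption; check the current API docs.
    """
    return {
        "type": tool_version,
        "name": "computer",
        "display_width_px": width_px,
        "display_height_px": height_px,
    }


if __name__ == "__main__":
    # "click_element" is a hypothetical tool name used only for illustration.
    print(json.dumps(mcp_tool_call(1, "click_element", {"description": "OK button"}), indent=2))
    print(json.dumps(anthropic_computer_tool(1024, 768), indent=2))
```

In the MCP case the server defines and advertises its own tools. In the ComputerUse case the tool schema is fixed by Anthropic and the client only supplies display parameters.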
Future OmniMCP development could:
- Dual Protocol Support: Support both MCP and Anthropic-defined tools
- Container Option: Provide a containerized deployment similar to ComputerUse
- Unified Approach: Create a bridge between MCP and ComputerUse tools
- Feature Parity: Incorporate ComputerUse capabilities while maintaining MCP compatibility
Both approaches have merits, and integrating aspects of ComputerUse could enhance OmniMCP's capabilities while preserving its lightweight nature and existing MCP integration.
- Visual UI analysis with OmniParser
- Natural language understanding with Claude
- Keyboard and mouse control with pynput
- Model Context Protocol integration
- Debug visualizations
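The usage listing includes a `--use-normalized-coordinates` option, and mouse control goes through pynput, which works in screen pixels. A minimal sketch of the conversion this implies follows. It assumes normalized coordinates are fractions in the range 0 to 1 of screen width and height, which this document does not state, and the helper name `to_pixels` is hypothetical.

```python
from typing import Tuple


def to_pixels(nx: float, ny: float, width: int, height: int) -> Tuple[int, int]:
    """Map normalized (0..1) coordinates to integer pixel coordinates.

    Assumes (0, 0) is the top-left corner and (1, 1) the bottom-right.
    """
    # Clamp inputs so out-of-range values cannot leave the screen.
    nx = min(max(nx, 0.0), 1.0)
    ny = min(max(ny, 0.0), 1.0)
    # Cap at width-1 / height-1 so 1.0 maps to the last valid pixel.
    x = min(int(nx * width), width - 1)
    y = min(int(ny * height), height - 1)
    return x, y


if __name__ == "__main__":
    x, y = to_pixels(0.5, 0.5, 1920, 1080)
    print(x, y)
    # With pynput installed and a display available, the pointer could then
    # be moved with: pynput.mouse.Controller().position = (x, y)
```

Keeping the conversion a pure function makes it testable without a display; only the final pynput call touches the real pointer.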
OmniMCP uses code from the OpenAdapt repository but with a minimal set of dependencies. The key components are:
- omnimcp/pyproject.toml: Minimal dependency list
- omnimcp/setup.py: Setup script that adds OpenAdapt to the Python path
- omnimcp/omnimcp/ package:
  - omnimcp/omnimcp/omnimcp.py: Core OmniMCP functionality
  - omnimcp/omnimcp/run_omnimcp.py: CLI interface
  - omnimcp/omnimcp/computer_use.py: Computer Use integration
  - omnimcp/omnimcp/pathing.py: Python path configuration
  - omnimcp/omnimcp/adapters/omniparser.py: OmniParser client and provider
  - omnimcp/omnimcp/mcp/server.py: Model Context Protocol server implementation