Releases: askui/python-sdk
v0.35.0
v0.35.0
🎉 Overview
v0.35.0 adds support for OpenAI-compatible APIs as model providers, enabling the use of OpenAI, Ollama, vLLM, LM Studio, Together AI, RunPod, and any other service that exposes an OpenAI-compatible chat completions endpoint. Truncation strategies now preserve the first user message across summarization to retain the original task instructions, and the truncation headroom has been doubled to reduce the chance of hitting context limits immediately after truncation.
✨ New Features
OpenAIVlmProvider— VLM provider for any OpenAI-compatible API (OpenAI, vLLM, LM Studio, Together AI, etc.) by @philipph-askui in #268OpenAIImageQAProvider— image Q&A provider for any OpenAI-compatible API by @philipph-askui in #268OllamaVlmProvider— convenience wrapper for local Ollama instances with sensible defaults (base_url=http://localhost:11434/v1,model_id=qwen3.5) by @philipph-askui in #268OllamaImageQAProvider— image Q&A via local Ollama instances by @philipph-askui in #268OpenAICompatibleVlmProvider— VLM provider for endpoints that require an exact URL (e.g., RunPod, custom proxies) where the OpenAI SDK's automatic path appending would break the request by @philipph-askui in #268OpenAIMessagesApi— full translation layer between the internalMessageParamformat and OpenAI's chat completions API, handling tool calls, image content, thinking blocks, and role alternation by @philipph-askui in #268OpenAIGetModel—GetModelimplementation for OpenAI-compatible APIs with structured output support by @philipph-askui in #268- Built-in pricing data for
gpt-5.4,gpt-5.4-mini, andgpt-5.4-nanomodels by @philipph-askui in #268
🔧 Improvements
- Truncation strategies now preserve the first user message across summarization, ensuring the original task instructions are never lost when the conversation is truncated by @philipph-askui in #280
MAX_INPUT_TOKENSincreased from 100k to 200k andTRUNCATION_THRESHOLDlowered from 0.7 to 0.56, roughly doubling the headroom after truncation to reduce the chance of re-triggering truncation immediately by @philipph-askui in #280process_idparameter inlist_process_windowstool is now auto-converted toint, preventing tool errors when the agent passes it as a string by @philipph-askui in #279
🐛 Bug Fixes
AgentSpeakernow handles the case where the model returnsstop_reason='tool_use'but no actual tool call blocks in the content, preventing stopped executions by prompting the model to retry with a valid tool call by @philipph-askui in #278
Full Changelog: v0.34.0...v0.35.0
v0.34.0
v0.34.0
🎉 Overview
v0.34.0 adds new tools that let agents interact with the file system and display configuration on the automation target: ComputerGetFileTool reads files (text or image), ComputerGetFileNamesTool lists directory contents, and ComputerRemoveVirtualDisplaysTool tears down virtual displays. A new clean_virtual_displays controller setting auto-removes virtual displays on startup. The ComputerAgent docstring now documents per-call tool registration via act(..., tools=[...]).
✨ New Features
ComputerGetFileTool(experimental) — reads a file at an absolute path on the automation target, returning UTF-8 text as a string or decoded images asPIL.Image.Imageby @mlikasam-askui in #277ComputerGetFileNamesTool(experimental) — lists regular file names (not subdirectories) in a directory on the automation target by @mlikasam-askui in #277ComputerRemoveVirtualDisplaysTool(experimental) — removes all virtual displays from the controller, leaving only physical displays active by @mlikasam-askui in #277clean_virtual_displayssetting onAskUiControllerClientSettings— when enabled, automatically removes all virtual displays after the controller connects by @mlikasam-askui in #277
🔧 Improvements
ComputerAgentdocstring updated with examples for per-call tool registration viaact(..., tools=[...])by @mlikasam-askui in #277- Pinned
askui-agent-os>=26.4.1on macOS and>=26.5.1on other platforms to ensure gRPC compatibility with the new commands by @mlikasam-askui in #277
Full Changelog: v0.33.0...v0.34.0
v0.33.0
v0.33.0
🎉 Overview
v0.33.0 introduces AutomationError — a new exception type for unfixable errors that immediately terminate agent execution instead of being auto-corrected. The conversation control loop now properly cleans up via try/finally, ensuring reporters and teardown always run even when errors propagate. This release also corrects the typing speed unit documentation and fixes a bug where messages could be lost if the truncation strategy crashed.
✨ New Features
AutomationError— new exception type for unfixable errors (e.g., missing credentials, unreachable services) that propagates immediately to the caller, bypassing the agent's auto-correction retry loop. Regular exceptions remain fixable by the agent as before. by @philipph-askui in #271- Documentation for error handling in tools — added a new "Error Handling in Tools" section to the tools guide explaining the distinction between fixable errors (regular exceptions) and unfixable errors (
AutomationError) by @philipph-askui in #271
🔧 Improvements
- Conversation control loop now uses
try/finallyto guarantee_on_conversation_end()and_teardown_control_loop()execute even when anAutomationErroror other exception propagates, preventing resource leaks by @philipph-askui in #271 - Messages are now reported to the reporter before being passed to the truncation strategy, preventing data loss if truncation crashes by @philipph-askui in #274
- Truncation failures are now caught, logged, and reported to the reporter with the message
"Truncation Failed with error: {e}"before re-raising, improving observability of context-window management errors by @philipph-askui in #274
🐛 Bug Fixes
- Corrected typing speed unit in
ComputerTypeTooldescription andAgentOs.type()docstring from "characters per minute" to "characters per second" by @philipph-askui in #272
⚠️ Breaking Changes
AgentExceptionrenamed toAgentError— if you were catchingAgentExceptiondirectly, update your imports to useAgentErrorfromaskui.models.shared.tools
Full Changelog: v0.32.1...v0.33.0
v0.32.1
v0.32.1
🎉 Overview
v0.32.1 fixes a bug that led to a crash if the optional "web" dependency group was not installed.
🐛 Bug Fixes
- fix: add missing import guard for PlaywrightBaseTool by @philipph-askui in #270
Full Changelog: v0.32.0...v0.32.1
v0.32.0
v0.32.0
🎉 Overview
v0.32.0 introduces the new WebAgent, a browser automation agent with native Playwright tools for mouse, keyboard, and screenshot interactions. The release also adds numpad key support across the AgentOS keyboard abstraction.
✨ New Features
WebAgent— a new browser automation agent with a full suite of Playwright tools (screenshot,move_mouse,mouse_click,mouse_scroll,mouse_hold_down,mouse_release,type,keyboard_tap,keyboard_pressed,keyboard_release) in addition to the existing navigation tools by @philipph-askui in #267- Numpad key support — added
numpad_lock,numpad_0–numpad_9,numpad_+,numpad_-,numpad_*,numpad_/, andnumpad_.toPcKeywith corresponding Playwright key mappings by @mlikasam-askui in #269
🔧 Improvements
- Set
is_cacheableflag onlist_process_toolfor improved caching by @philipph-askui in #267
⚠️ Breaking Changes
WebVisionAgentis deprecated — useWebAgentinstead.WebVisionAgentstill works but emits aDeprecationWarningWebAgentnow extendsAgentdirectly instead ofComputerAgent, with a new constructor signature that acceptscallbacksandtruncation_strategyparameters- Playwright navigation tools (
PlaywrightGotoTool,PlaywrightBackTool, etc.) now inherit fromPlaywrightBaseToolinstead ofTooland require aPlaywrightAgentOs(or compatible) instance as their agent OS
Full Changelog: v0.31.0...v0.32.0
v0.31.0
v0.31.0
🎉 Overview
v0.31.0 substantially improves the memory efficiency of askui. The SimpleHtmlReporter has been rearchitected to stream message rows (including base64-encoded screenshots) to a temporary file on disk instead of accumulating them in memory, significantly reducing memory usage during long-running sessions. Further, reporters are now wrapped with automatic error handling so that a failure in one reporter no longer crashes the agent.
✨ New Features
ReporterErrorHandler— a decorator that wraps anyReporterwith try/except error handling; on first failure the reporter is disabled for the rest of the session, preventing reporting errors from interrupting agent execution by @mlikasam-askui in #258
🔧 Improvements
SimpleHtmlReporternow streams HTML message rows to a temporary file as they arrive instead of holding all base64 image data in memory, reducing peak memory usage for screenshot-heavy sessions by @mlikasam-askui in #258CompositeReporternow automatically wraps all reporters inReporterErrorHandler, making error resilience the default behavior by @mlikasam-askui in #258
Full Changelog: v0.30.0...v0.31.0
v0.30.0
v0.30.0
🎉 Overview
v0.30.0 introduces a new infrastructure-error handling prompt that prevents agents from entering unfixable retry loops when the underlying controller, session, or RPC connection fails. It also enriches the HTML report's conversation breakdown with per-conversation step counts, durations, and cache token statistics, and quiets noisy tool-failure logs by demoting them from WARNING to INFO.
✨ New Features
- Infrastructure / tool error prompt added to the computer, Android, and multi-device agent capabilities — instructs agents to retry infrastructure failures (connection lost, session expired, RPC errors, stream closed, service unavailable, controller timeouts) at most once and otherwise stop immediately with a
BROKENreport status instead of looping on unfixable errors by @philipph-askui in #265 - Step count,
cache_creation_input_tokens, andcache_read_input_tokensadded to the per-conversation usage breakdown inSimpleHtmlReporterby @philipph-askui in #264 - Per-conversation duration added to the HTML report breakdown —
started_at/ended_attimestamps are captured on conversation summaries and rendered in a human-readable elapsed-time format by @philipph-askui in #266
🔧 Improvements
Tool failedlogs inToolCollectiondemoted fromWARNINGtoINFOto reduce log noise during normal agent operation by @philipph-askui in #264
⚠️ Breaking Changes
UsageTrackingCallbackrenamed toConversationStatisticsCallback
Full Changelog: v0.29.0...v0.30.0
v0.29.0
v0.29.0
🎉 Overview
v0.29.0 replaces the simple message-dropping truncation strategy with a new VLM-based SummarizingTruncationStrategy that summarizes older conversation history to preserve context while staying within token limits. It also fixes mouse scroll coordinate scaling issues, improves scroll tool descriptions with OS-specific guidance, removes get and locate from the default agent tools, hardens the move_mouse tool against malformed coordinate inputs, and makes base64 image truncation in html reports more robust.
✨ New Features
SummarizingTruncationStrategy— new default truncation strategy that uses the VLM to summarize older conversation history instead of dropping messages, with prompt caching support during summarization for cost efficiency by @philipph-askui in #257SlidingImageWindowSummarizingTruncationStrategy(experimental) — extends summarization with dynamic image removal from older messages to reduce network traffic and latencies while staying compatible with prompt caching by @philipph-askui in #257truncation_strategyinit parameter onComputerAgent,AndroidAgent, andAgent— allows passing a custom truncation strategy with auto-injection of conversation dependencies (vlm_provider,reporter,callbacks) by @philipph-askui in #257
🔧 Improvements
- Mouse scroll tool description now includes OS-dependent scroll guidance (start with
dy=150/dy=-150, macOS direction info) by @programminx-askui in #260 truncate_contentin reporting replaced bytruncate_base64_images— only base64 image data is replaced with placeholders, leaving all other content (prompts, tool outputs) untouched by @philipph-askui in #259move_mousetool now robustly parses coordinates when the agent passes them as strings or comma-separated values, with clearer tool description and improved error messages by @philipph-askui in #262
🐛 Bug Fixes
- Fix incorrect coordinate scaling on mouse scroll deltas —
ComputerAgentOsFacade.mouse_scrollno longer applies display scaling to scroll amounts (SOLENG-332) by @programminx-askui in #260
⚠️ Breaking Changes
SimpleTruncationStrategyandSimpleTruncationStrategyFactoryremoved — replaced bySummarizingTruncationStrategyas the new defaultConversationconstructor parametertruncation_strategy_factoryreplaced bytruncation_strategy(a strategy instance instead of a factory)getandlocatetools removed fromAgent's default tool list — they are no longer auto-added when anagent_osis providedmouse_scrollparameters renamed fromx/ytodx/dyacross allAgentOsimplementations (AskUiControllerClient,PlaywrightAgentOs,ComputerAgentOsFacade,ComputerAgent)truncate_contentfunction inreporting.pyremoved — replaced bytruncate_base64_images
Full Changelog: v0.28.0...v0.29.0
v0.28.0
v0.28.0
🎉 Overview
v0.28.0 integrates AgentOS as a Python package dependency (no more manual installation), adds a UIAutomator hierarchy tool for Android agents, improves support for Anthropic prompt caching to reduce inference cost, introduces Tool.from_mcp_tool() for wrapping FastMCP tools, and overhauls usage tracking with per-step and per-conversation cost breakdowns including cache token costs in the HTML reports.
✨ New Features
- AgentOS shipped as Python package (
askui-agent-os) — no manual installation needed by @mlikasam-askui in #246 - Anthropic prompt caching (auto strategy) with
cache_controlparameter by @philipph-askui in #253 AndroidGetUIAutomatorHierarchyTool— accessibility hierarchy dump for Android agents, providing structured UI element data (text, resource IDs, tap centers) as an alternative to screenshot-based inference by @mlikasam-askui in #251- Hierarchical usage tracking with per-step, per-conversation, and aggregate cost breakdowns including cache token costs in HTML reports by @mlikasam-askui in #253
🔧 Improvements
Tool.from_mcp_tool()to wrap FastMCP tools as AskUI Tools by @mlikasam-askui in #250markitdownandbsonmoved to optional dependencies (office-document) andpure-python-adbpromoted to core to streamline the installation by @mlikasam-askui in #255- Documented optional install extras (
office-document,bedrock,vertex,otel,web) in README by @mlikasam-askui in #255 - Workspace ID (
askui.workspace.id) added to OTEL trace resource attributes by @philipph-askui in #256 - Improved tracing structure with
_get_next_message()span for better observability by @philipph-askui in #256
🐛 Bug Fixes
- Fix prompt caching breakpoints to improve prompt caching efficiency by @philipph-askui in #253
- Fix report formatting and cache statistics accumulation by @philipph-askui in #253
- Constrain
grpcio<1.80.0to avoid compatibility issues by @philipph-askui in #250 - Clean up OTEL tracing: remove stale
cluster_nameconfig and unnecessary SQLAlchemy instrumentation by @philipph-askui in #256
⚠️ Breaking Changes
ASKUI_COMPONENT_REGISTRY_FILE,ASKUI_INSTALLATION_DIRECTORY, andASKUI_CONTROLLER_PATHenvironment variables are no longer recognized — AgentOS is now auto-discovered via theaskui-agent-ospackageOtelSettings.cluster_namefield andASKUI__OTEL_CLUSTER_NAMEenv var removed; replaced byworkspace_id/ASKUI_WORKSPACE_ID- Minimum
anthropicSDK version bumped from>=0.72.0to>=0.86.0 androidoptional extra removed —pure-python-adbis now a core dependency; useoffice-documentextra for MarkItDown features previously bundled by defaultbsonandmarkitdownremoved from default dependencies — installaskui[office-document]if you need Office file conversion
Full Changelog: v0.27.0...v0.28.0
v0.27.0
v0.27.0
🎉 Overview
v0.27.0 adds a MultiDeviceAgent that can operate android and computer devices simultaneously, improves the SDK structure by introducing default tool lists for the ComputerAgent, and AndroidAgent, and fixes a bug with single-display handling on android.
✨ New Features
- Execution Cost in Html Reports by @philipph-askui in #243
- Otel tracing @philipph-askui in #249
- max_steps parameter to stop the execution after a predefined number of steps by @philipph-askui in #244
🔧 Improvements
- format execution time in html reports as
hh:mm:ssby @philipph-askui in #247
🐛 Bug Fixes
Full Changelog: v0.26.1...v0.27.0