
Alumnium v0.17 with MCP and reasoning model support

Published by Alex Rodionov
release notes

Alumnium v0.17 brings significant advancements in integration capabilities and AI model support. This release introduces the MCP Server for integration with general-purpose agents and comprehensive reasoning model support across all major providers.

This release is available as both PyPI and npm packages, along with a Docker image for Alumnium Server.

This release introduces the Model Context Protocol (MCP) Server, enabling general-purpose agents like Claude Code to leverage Alumnium’s web and mobile automation capabilities. This integration allows AI assistants to control browsers and mobile applications directly through standardized tooling. Alumnium is currently the only MCP server that lets AI assistants drive both browser and mobile applications with natural language while keeping execution token- and cost-efficient.

Check out the MCP guide for detailed setup instructions and a demo.
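For orientation, MCP clients are typically pointed at a server through a JSON configuration entry shaped like the sketch below. The exact command and arguments for launching Alumnium’s server are documented in the MCP guide; the values here are placeholders, not the real invocation.

```json
{
  "mcpServers": {
    "alumnium": {
      "command": "<launcher from the MCP guide>",
      "args": ["<server arguments>"]
    }
  }
}
```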

Alumnium now fully embraces reasoning models across all major AI providers, delivering improved accuracy and more sophisticated decision-making in test automation. The framework has been updated to support extended thinking capabilities and optimize prompts for reasoning-enabled models.

| Provider  | Before | After |
| --------- | ------ | ----- |
| Anthropic | Claude Haiku 4.5 ($1/1M input, $5/1M output) | Claude Haiku 4.5 with extended thinking ($1/1M input, $5/1M output) |
| DeepSeek  | DeepSeek V1 ($0.28/1M input, $0.42/1M output) | DeepSeek R1 ($0.28/1M input, $0.42/1M output) |
| Google    | Gemini 2.0 Flash ($0.10/1M input, $0.40/1M output) | Gemini 3 Flash ($0.50/1M input, $3/1M output) |
| OpenAI    | GPT-4o Mini ($0.15/1M input, $0.60/1M output) | GPT-5 Nano ($0.05/1M input, $0.40/1M output) |
| xAI       | Grok 4 Fast (Non Reasoning) ($0.20/1M input, $0.50/1M output) | Grok 4.1 Fast (Reasoning) ($0.20/1M input, $0.50/1M output) |

The framework automatically handles reasoning output across all providers, logging the model’s thought process and adapting prompts to leverage thinking capabilities where available.

This release maintains backwards compatibility, so you can keep using non-reasoning models (Mistral, Llama) or switch back to older models if needed.
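Switching providers is a configuration change rather than a code change. A minimal sketch, assuming the `ALUMNIUM_MODEL` environment variable from the configuration docs selects the provider (the variable name and the `"anthropic"` value should be verified against the docs):

```python
import os

# Select the AI provider before Alumnium is initialized.
# Hedged: ALUMNIUM_MODEL and the "anthropic" value are assumptions here;
# check the configuration docs for the exact names and accepted values.
os.environ["ALUMNIUM_MODEL"] = "anthropic"

print(os.environ["ALUMNIUM_MODEL"])  # confirms the provider setting
```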

Alumnium now supports navigating to URLs, uploading files, executing custom JavaScript, and scrolling to elements as part of its do command.
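A minimal sketch of how these actions might look through the do command, assuming the `Alumni` entry point with a Selenium driver; the instruction strings are free-form natural language, so the exact phrasing below is illustrative:

```python
def submit_with_attachment(url: str, file_path: str) -> None:
    # Imports are local so the sketch reads without the dependencies
    # installed; both packages are real requirements of this workflow.
    from selenium import webdriver
    from alumnium import Alumni

    al = Alumni(webdriver.Chrome())
    al.do(f"navigate to {url}")                       # URL navigation
    al.do(f"upload {file_path} to the resume field")  # file uploading
    al.do("scroll to the submit button")              # scrolling to an element
    al.do("click the submit button")
    al.quit()
```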

In addition, this release brings multiple improvements to the drivers:

  • full support for asynchronous Playwright driver in Python;
  • automated switching to new tabs in browsers;
  • automated scrolling to elements on Appium before interaction;
  • automated keyboard hiding after typing on Appium.
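The async Playwright support can be sketched as follows; whether do must be awaited on the async driver is an assumption here, so treat the coroutine body as illustrative:

```python
import asyncio

async def run_search(query: str) -> None:
    # Local imports keep the module importable without the dependencies.
    from playwright.async_api import async_playwright
    from alumnium import Alumni

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("https://duckduckgo.com")  # illustrative target site
        al = Alumni(page)                          # async Playwright page
        await al.do(f"search for {query}")         # assumed to be awaitable
        await browser.close()
```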

On top of that, the LangChain framework, which is used for all LLM communication, was upgraded to its first stable major release, v1.

With MCP server and comprehensive reasoning model support, Alumnium now provides a robust foundation for AI-powered automation across multiple platforms and agent types. The addition of file uploading and enhanced navigation tools expands the range of testing scenarios Alumnium can handle effectively.

Future development will focus on expanding MCP server capabilities to improve agent execution, and on exploring vision-based actions for applications that cannot be automated via an accessibility tree.