
MCP

Alumnium’s Model Context Protocol (MCP) server lets general-purpose AI agents such as Claude Code use Alumnium’s web and mobile automation capabilities through a standardized protocol, so that AI assistants can control browsers and mobile applications directly.

The MCP Server is included in the Alumnium package for Python.

We recommend using uv to install Python and manage virtual environments automatically. Follow the official uv installation instructions to set it up on your system, then install the Alumnium MCP server in your client of choice.

Claude Code:
claude mcp add alumnium --env OPENAI_API_KEY=... -- uvx --from alumnium alumnium-mcp

Codex:
codex mcp add alumnium --env OPENAI_API_KEY=... -- uvx --from alumnium alumnium-mcp

Add the following to mcp.json:

{
  "mcpServers": {
    "alumnium": {
      "command": "uvx",
      "args": ["--from", "alumnium", "alumnium-mcp"],
      "env": {
        "OPENAI_API_KEY": "..."
      }
    }
  }
}

Gemini CLI:
gemini mcp add alumnium --env OPENAI_API_KEY=... uvx --from alumnium alumnium-mcp

VS Code:
code --add-mcp '{
  "name": "alumnium",
  "command": "uvx",
  "args": [
    "--from",
    "alumnium",
    "alumnium-mcp"
  ],
  "env": {
    "OPENAI_API_KEY": "..."
  }
}'
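
The JSON entries above hard-code the API key. As a minimal sketch, the same mcp.json structure can be generated from Python so that the key is read from the environment instead. The helper name `build_mcp_config` is an illustrative assumption and is not part of Alumnium; only the keys and values are taken from the configuration shown above.

```python
import json
import os


def build_mcp_config(api_key: str) -> dict:
    """Return the mcp.json structure for the Alumnium MCP server.

    Hypothetical helper: it mirrors the documented JSON entry one-to-one.
    """
    return {
        "mcpServers": {
            "alumnium": {
                "command": "uvx",
                "args": ["--from", "alumnium", "alumnium-mcp"],
                "env": {"OPENAI_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Read the key from the environment; fall back to the "..." placeholder.
    config = build_mcp_config(os.environ.get("OPENAI_API_KEY", "..."))
    print(json.dumps(config, indent=2))
```

The printed output can be redirected to mcp.json, which keeps the secret out of version-controlled files.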

The MCP Server exposes Alumnium’s core automation capabilities:

Tool                      Description
start_driver              Initialize browser/mobile drivers with Appium/Selenium/Playwright capabilities
stop_driver               Clean up resources and retrieve token usage statistics
do                        Execute natural language automation commands
check                     Verify statements about the current page state with optional vision support
get                       Extract data from pages using natural language descriptions
fetch_accessibility_tree  Debug page structure with the raw accessibility tree
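
MCP clients normally issue tool calls for you, but it can help to see what an invocation looks like on the wire. The sketch below builds the JSON-RPC 2.0 `tools/list` and `tools/call` requests defined by the Model Context Protocol. The helper names and the argument key `goal` are hypothetical placeholders; the real input schema of each tool should be read from the server's `tools/list` response.

```python
import json
from itertools import count

# JSON-RPC request ids must not be reused within a session.
_ids = count(1)


def list_tools_request() -> dict:
    """Build a request asking the server which tools it exposes."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": "tools/list"}


def call_tool_request(name: str, arguments: dict) -> dict:
    """Build a request invoking one tool by name with the given arguments."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    print(json.dumps(list_tools_request()))
    # "goal" is an assumed argument name, used only for illustration.
    print(json.dumps(call_tool_request("do", {"goal": "log in as a test user"})))
```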

The start_driver tool initializes the browser or mobile driver session from JSON capabilities. It supports all drivers: Appium, Selenium, and Playwright.

The platformName capability selects the driver to use. Supported values are chrome, ios, and android.

Selenium and Playwright only. Sets predefined cookies before the session starts, which is useful for authentication.

Selenium and Playwright only. Sets custom headers for all browser requests in the session.

Pass alumnium:options in capabilities to configure Alumnium behavior for the session:

{
  "platformName": "chrome",
  "alumnium:options": {
    "changeAnalysis": true,
    "planner": false,
    "excludedAttributes": ["url"],
    "driverSettings": {
      "autoswitchToNewTab": false
    }
  }
}

Option              Description
changeAnalysis      Enable UI change analysis after each do() call. Default is true.
planner             Enable or disable the planning step in do(). Default is true.
excludedAttributes  Array of accessibility tree attributes to exclude. Reduces tree size on large pages.
driverSettings      Key-value pairs applied directly to the underlying driver (e.g. autoswitchToNewTab).
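
To make the documented defaults concrete, the following sketch merges user overrides onto the two defaults stated above (changeAnalysis and planner are both true) and assembles a capabilities object. The names `DOCUMENTED_DEFAULTS` and `build_capabilities` are hypothetical; how Alumnium resolves options internally is not shown in this page, so treat this as an illustration of the semantics only.

```python
import json

# Defaults as stated in the options table; other options have no documented default.
DOCUMENTED_DEFAULTS = {"changeAnalysis": True, "planner": True}


def build_capabilities(platform_name: str, **overrides) -> dict:
    """Assemble capabilities with alumnium:options (hypothetical helper)."""
    options = {**DOCUMENTED_DEFAULTS, **overrides}
    return {"platformName": platform_name, "alumnium:options": options}


if __name__ == "__main__":
    caps = build_capabilities(
        "chrome",
        planner=False,
        excludedAttributes=["url"],
        driverSettings={"autoswitchToNewTab": False},
    )
    # JSON text that could be passed to start_driver as capabilities.
    print(json.dumps(caps, indent=2))
```

Spelling out the defaults is redundant from the server's point of view, but it makes the effective configuration visible when sessions are compared.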

For iOS and Android sessions, pass appium:settings in capabilities to configure Appium settings that are applied to the driver after it is created:

{
  "platformName": "ios",
  "appium:settings": {
    "allowInvisibleElements": true,
    "ignoreUnimportantViews": true
  }
}
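
Because appium:settings applies only to iOS and Android sessions, a small client-side check can catch mismatched capabilities before they are sent. This is a sketch under stated assumptions: `validate_capabilities` is a hypothetical helper, and whether Alumnium itself rejects or ignores such input is not specified here.

```python
# Platform values documented for platformName.
SUPPORTED_PLATFORMS = {"chrome", "ios", "android"}
MOBILE_PLATFORMS = {"ios", "android"}


def validate_capabilities(caps: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    platform = caps.get("platformName")
    if platform not in SUPPORTED_PLATFORMS:
        problems.append(f"unsupported platformName: {platform!r}")
    if "appium:settings" in caps and platform not in MOBILE_PLATFORMS:
        problems.append("appium:settings only applies to ios and android sessions")
    return problems


if __name__ == "__main__":
    ios_caps = {
        "platformName": "ios",
        "appium:settings": {"allowInvisibleElements": True},
    }
    print(validate_capabilities(ios_caps))
    print(validate_capabilities({"platformName": "chrome", "appium:settings": {}}))
```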

The stop_driver tool stops the running driver session and cleans up resources. It returns the path to the artifacts directory and the token usage statistics for the session, and optionally saves the execution cache.

The do tool performs actions in the application from natural language commands and returns a summary of the performed steps. Alumnium automatically captures a screenshot upon completion and stores it in the artifacts directory.

The check tool verifies application state and runs assertions from natural language statements. It returns the result of the check along with an explanation of how the verification was evaluated. Alumnium automatically captures a screenshot upon completion and stores it in the artifacts directory.

The get tool extracts data from the application based on natural language descriptions. If the data is not found, it returns an explanation of why it cannot be retrieved. Alumnium automatically captures a screenshot upon completion and stores it in the artifacts directory.

The fetch_accessibility_tree tool returns the raw accessibility tree of the current page as XML. It is useful for debugging when do, check, or get behave unexpectedly: inspect the tree to verify element visibility, roles, and attributes.
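
Since the tree is returned as XML, it can be inspected with a standard XML parser. The sketch below counts elements by tag name to give a quick overview of a large tree. The sample document, including its element and attribute names, is invented for illustration; the actual shape of the tree depends on the driver and is not specified here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample; a real tree from fetch_accessibility_tree will differ.
SAMPLE_TREE = """
<root>
  <button name="Submit"/>
  <textbox name="Email"/>
  <button name="Cancel"/>
</root>
"""


def summarize_tree(xml_text: str) -> Counter:
    """Count how often each element tag occurs in the XML document."""
    root = ET.fromstring(xml_text.strip())
    return Counter(element.tag for element in root.iter())


if __name__ == "__main__":
    for tag, occurrences in summarize_tree(SAMPLE_TREE).most_common():
        print(f"{tag}: {occurrences}")
```

A summary like this makes it easier to spot when an expected element type is missing entirely, which is one common reason for a command not finding its target.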