
Reference

Alumnium works by building an accessibility tree of the webpage. Unfortunately, there is no standard API in browsers to provide this tree. Due to this limitation, the current version of Alumnium only works in Chromium-based browsers such as Google Chrome, Microsoft Edge, Opera, and others.

The Playwright driver supports both headful and headless modes, while the Selenium driver supports only headful mode.

Alumnium currently supports Appium with XCUITest driver for iOS automation and UiAutomator2 driver for Android automation.

The following environment variables can be used to control the behavior of Alumnium.

Sets the cache provider used by Alumnium. Supported values are:

  • filesystem (default)
  • none or false

Sets the directory where the filesystem cache is stored. Default is .alumnium/cache.

Set to true to enable analysis of UI changes made by do(). When enabled, Alumnium captures the accessibility tree before and after each action and returns a description of what changed. Default is false when using Alumnium as a library and true when running Alumnium MCP server.

Delay in seconds between retries when an action fails. Default is 0.5.

Comma-separated list of accessibility tree attributes to exclude (e.g. focusable,url). Useful for reducing accessibility tree size on large pages.

Set to true to capture full-page screenshots instead of viewport-only screenshots. Default is false.

Sets the level used by Alumnium logger. Supported values are:

  • debug
  • info
  • warning (default)
  • error
  • critical

Sets the output location used by Alumnium logger. Supported values are:

  • a path to a file (e.g. alumnium.log);
  • stdout to print logs to the standard output.

Selects the AI provider and model to use.

| Value | LLM | Notes |
| --- | --- | --- |
| `anthropic` | claude-haiku-4-5-20251001 | Anthropic API. |
| `azure_foundry` | gpt-5-nano | Azure AI Foundry API. |
| `azure_openai` | gpt-5-nano | Self-hosted Azure OpenAI API. Recommended model version is 2025-08-07. |
| `aws_anthropic` | us.anthropic.claude-haiku-4-5-20251001-v1:0 | Serverless Amazon Bedrock API. |
| `aws_meta` | us.meta.llama4-maverick-17b-instruct-v1:0 | Serverless Amazon Bedrock API. |
| `deepseek` | deepseek-reasoner | DeepSeek Platform. |
| `github` | gpt-4o-mini | GitHub Models API. |
| `google` | gemini-3.1-flash-lite-preview | Google AI Studio API. |
| `mistralai` | mistral-medium-2505 | Mistral AI Studio API. |
| `ollama` | mistral-small3.1:24b | Local model inference with Ollama. |
| `openai` | gpt-5-nano-2025-08-07 | OpenAI API. |
| `xai` | grok-4-1-fast-reasoning | xAI API. |
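When only the provider name is given, Alumnium appears to fall back to that provider's default model listed in the table above. For example, to use Google AI Studio with its default model:

```shell
# Select the Google provider; the model column above shows its default
export ALUMNIUM_MODEL="google"
```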

You can also override the LLM for each provider by passing it after /.

Custom OpenAI model:

```shell
export ALUMNIUM_MODEL="openai/gpt-5"
```

Timeout in seconds for AI model requests. Default is 90.

Sets the URL for Ollama models if you host them on a remote server.

Set to false to disable the planning step. When disabled, the actor’s own reasoning is used as the explanation. Default is true.

Set to false to start Playwright in headed mode. Only used in the MCP server. Default is true.

Timeout in milliseconds when waiting for a new tab to open after interacting with elements using Playwright driver. Increase when Alumnium fails to detect a new tab. Default is 200.

API key used when ALUMNIUM_MODEL is set to azure_foundry.

API version used when ALUMNIUM_MODEL is set to azure_foundry.

Endpoint URL used when ALUMNIUM_MODEL is set to azure_foundry.

Sets the URL for OpenAI models if you access them via a custom endpoint.