Alumnium v0.14 with a standalone server and an agent to find elements

Published by Alex Rodionov
release notes

Alumnium v0.14 establishes the foundation for multi-language support and improves developer experience. Highlights include a standalone server with RESTful API, Docker containerization, and a natural language agent for finding UI elements.

This release introduces a client-server architecture that centralizes LLM interactions on a server component. The server provides REST API endpoints for all core Alumnium capabilities including session management, area identification, step execution, verification, data retrieval, and planning.

Moving LLM interactions to the server leaves a single implementation of Alumnium prompts, agents, and supported models. Clients in languages other than Python then only need to talk to the server, making Alumnium accessible to a broader developer ecosystem.
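To make the split concrete, here is a minimal sketch of what a thin client could look like once all LLM logic lives on the server. It is an illustration under assumptions, not the actual Alumnium client: the `ThinClient` class, the `/sessions` and `/sessions/<id>/steps` paths, and the `session_id` and `goal` fields are hypothetical placeholders. Only the port, 8013, comes from the Docker command in this post.

```python
import json
from typing import Any, Callable, Dict
from urllib import request

# Signature of the function that actually performs the HTTP call.
Transport = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def http_transport(method: str, url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Send a JSON request with the standard library and decode the JSON reply."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


class ThinClient:
    """Hypothetical client: holds no prompts or model logic, only forwards calls."""

    def __init__(
        self,
        base_url: str = "http://localhost:8013",
        transport: Transport = http_transport,
    ) -> None:
        self.base_url = base_url.rstrip("/")
        self.transport = transport

    def start_session(self) -> str:
        # "/sessions" and "session_id" are placeholders, not the documented API.
        reply = self.transport("POST", f"{self.base_url}/sessions", {})
        return reply["session_id"]

    def run_step(self, session_id: str, goal: str) -> Dict[str, Any]:
        # "/sessions/<id>/steps" and the "goal" field are placeholders as well.
        url = f"{self.base_url}/sessions/{session_id}/steps"
        return self.transport("POST", url, {"goal": goal})
```

Because the client only serializes requests and reads replies, the same few dozen lines could be ported to TypeScript or Java without reimplementing any prompts. The transport is injected so the class can be exercised without a running server.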

The server is now available at alumnium/alumnium on Docker Hub, enabling containerized deployments across different platforms. The image packages the Alumnium server for use by clients implemented in other languages.

You can launch the container with:

docker run --rm -p 8013:8013 \
--volume "$(pwd)/.alumnium/cache:/app/.alumnium/cache" \
--env-file .env \
alumnium/alumnium

A new agent for finding elements enables developers to locate UI components using natural language queries and get native framework instances:

submit_button = al.find("submit button")
submit_button.click()

The agent accepts a text description and returns the matching element as the native type of the underlying automation framework: either Appium/Selenium’s WebElement or Playwright’s Locator. This provides a simpler alternative to complex selector chains while maintaining full compatibility with existing automation code.
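Since the return type depends on the driver in use, shared test helpers may need to handle both cases. The sketch below shows one way to do that with duck typing. The `type_into` helper is a hypothetical name invented for this example; it relies only on `Locator.fill()` from Playwright and `WebElement.clear()` / `WebElement.send_keys()` from Selenium and Appium.

```python
from typing import Any


def type_into(element: Any, text: str) -> None:
    """Type text into an element returned by al.find(), whichever driver backs it."""
    if hasattr(element, "fill"):
        # Playwright Locator: fill() clears the field and enters the text.
        element.fill(text)
    else:
        # Selenium/Appium WebElement: clear the field, then send the keystrokes.
        element.clear()
        element.send_keys(text)
```

With an existing `al` instance, a call such as `type_into(al.find("search field"), "alumnium")` would then work the same way in a Selenium suite and a Playwright suite.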

The client-server architecture opens the door to TypeScript, Java, and other language implementations. We’re excited to see the Alumnium ecosystem expand to support more development workflows and testing frameworks.