Alumnium v0.13 is a big release focused on improving accuracy and stability. You can see the whole changelog on GitHub. Highlights include a new structured planner, a filesystem cache, and migration to Llama 4 with vision support.
Structured Planner
This release introduces a structured planner that improves accuracy by requiring the LLM to return a JSON object containing both its reasoning and an executable plan. The JSON includes an explanation field where the model must outline its step-by-step thinking before selecting an action. This enforced self-explanation reduces errors and produces more consistent outcomes. The planner is implemented across DeepSeek, MistralAI, Google, Llama, and OpenAI models.
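To make this concrete, here is a minimal sketch of how such a structured reply might be validated before execution. The field names (`explanation`, `steps`) and the action shapes are illustrative assumptions, not Alumnium's actual schema:

```python
import json

# Hypothetical structured-planner reply; field names are assumptions
# for illustration, not Alumnium's exact schema.
raw = """
{
  "explanation": "Both fields must be filled before the login button is clicked.",
  "steps": [
    {"action": "type", "target": "username", "value": "admin"},
    {"action": "type", "target": "password", "value": "secret"},
    {"action": "click", "target": "login-button"}
  ]
}
"""

def parse_plan(payload: str) -> list[dict]:
    """Reject replies that skip the reasoning field before returning steps."""
    plan = json.loads(payload)
    if not plan.get("explanation"):
        raise ValueError("model must explain its reasoning before acting")
    return plan["steps"]

steps = parse_plan(raw)
print(len(steps))  # → 3
```

Requiring the explanation field to be non-empty is what forces the model to "show its work" before any step is executed.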
Filesystem Cache
A new filesystem-backed cache is now enabled by default, improving performance and eliminating errors across parallel test runs. It supports safe concurrent access. To switch back to the previous implementation, set `ALUMNIUM_CACHE=sqlite`. To disable caching entirely, set `ALUMNIUM_CACHE=false`.
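For example, the backend can be selected via the environment before tests start. The env-var values come from this release; setting them from Python (rather than the shell) is just one way to do it:

```python
import os

# Select the cache backend before Alumnium initializes.
# "sqlite" falls back to the previous implementation;
# "false" disables caching entirely.
os.environ["ALUMNIUM_CACHE"] = "sqlite"

print(os.environ["ALUMNIUM_CACHE"])  # → sqlite
```

Equivalently, the variable can be exported in the shell or set in CI configuration before the test run.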
Llama 4
This version migrates from Llama 3.2 to Llama 4 using the Maverick 17B model, delivering better performance, accuracy, and reasoning. Llama 4 also adds vision checks, allowing the model to retrieve data and perform validations using application screenshots.
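Mechanically, a vision check boils down to pairing a verification question with a screenshot in a multimodal prompt. The sketch below shows one common way to package that (base64 data URL in an OpenAI-style message); it is a generic illustration of the technique, not Alumnium's internal code, and the helper name is hypothetical:

```python
import base64

def build_vision_message(question: str, screenshot_png: bytes) -> dict:
    """Hypothetical sketch: pair a verification question with a
    screenshot in a multimodal chat message for a vision model."""
    encoded = base64.b64encode(screenshot_png).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text",
             "text": f"Verify against the screenshot: {question}"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{encoded}"}},
        ],
    }

# Illustrative call; real screenshot bytes would come from the driver.
msg = build_vision_message("the cart shows 3 items", b"\x89PNG...")
print(msg["content"][0]["text"])
```

The model then answers from the rendered pixels rather than the DOM, which is what lets it validate content that is hard to reach through element queries.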
Coming Next
The structured planner paves the way for integrating more advanced “thinking” models.
Work is also underway to extract the core logic into a standalone Alumnium Server, the first step toward client bindings that enable test authoring in additional languages. Stay tuned!