Self-hosting LLMs
Using third-party AI providers such as Anthropic, Google AI Studio, and OpenAI is the easiest way to use Alumnium. However, you might prefer self-hosted LLMs for security, privacy, or cost reasons.
Alumnium provides several options for using self-hosted LLMs:
- Serverless models on Amazon Bedrock.
- OpenAI service on Azure.
- Local model inference with Ollama.
Amazon Bedrock
Alumnium supports Anthropic Claude and Meta Llama models on Amazon Bedrock.
Please follow the respective documentation on how to enable access to these models on Bedrock. Once access is enabled, configure Alumnium to use them by exporting the following environment variables:
export ALUMNIUM_MODEL="aws_anthropic" # for Claudeexport ALUMNIUM_MODEL="aws_meta" # for Llama
export AWS_ACCESS_KEY="..."export AWS_SECRET_KEY="..."export AWS_REGION_NAME="us-west-1" # default: us-east-1
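If you want to double-check that your credentials and region are set up correctly, a quick sanity check (assuming the AWS CLI is installed and configured with the same credentials) is to list the model IDs Bedrock exposes in your region:
# List Bedrock foundation model IDs available in the chosen region
aws bedrock list-foundation-models --region us-east-1 --query "modelSummaries[].modelId"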
Azure
Alumnium supports the GPT-4o Mini model on the Azure OpenAI Service.
Please follow the respective documentation on how to deploy the model to Azure. Once deployed, configure Alumnium to use it by exporting the following environment variables:
export ALUMNIUM_MODEL="azure_openai"export AZURE_OPENAI_API_KEY="..."# Change as neededexport AZURE_OPENAI_API_VERSION="2024-08-01-preview"export AZURE_OPENAI_ENDPOINT="https://my-model.openai.azure.com"
Ollama
Ollama provides fully local model inference. You can use it to power test execution on your own machine, or deploy it to a server and access it via API.
Please follow the respective documentation on how to deploy Ollama to the cloud. Once deployed, download the necessary model and configure Alumnium to use it:
ollama pull mistral-small3.1:24b
export ALUMNIUM_MODEL="ollama"
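Before pointing Alumnium at Ollama, it can help to confirm that the model is downloaded and the server responds. A minimal check, assuming Ollama is running on its default port 11434:
# Confirm the model is present locally
ollama list
# Send a one-off prompt through the local Ollama API
curl http://localhost:11434/api/generate \
  -d '{"model": "mistral-small3.1:24b", "prompt": "Hello", "stream": false}'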