Jitera Self-Hosted uses AI/LLM providers for code generation, AI chat, documentation generation, and code understanding. This guide covers the complete configuration for all supported providers.
The third-party service procedures in this guide (Azure OpenAI, OpenAI, AWS Bedrock, Anthropic, Google AI) are provided as examples. Refer to the official documentation for each provider for the most up-to-date instructions:
LiteLLM alias — must match the SuperAdmin name for non-Azure providers (e.g. claude-3.5-sonnet, gemini-2.5-pro)
For Boost, the SuperAdmin name field is the routing key and must match either the Azure deployment name (extracted from the URL) or the LiteLLM model_name. For Ultron, the modelKey field determines provider routing.
Ultron and Boost each access LLM providers differently. The tables below list the required and available models for each provider, organized by which service uses them.
Ultron routes requests by pattern-matching the modelKey field (see Provider Routing Reference). Any model ID that matches a supported provider pattern will work — the lists below cover pre-configured and commonly used models.
Boost dynamically discovers models from configured endpoints at runtime. Any model available through a configured endpoint (Azure deployment, LiteLLM proxy, or OpenAI-compatible API) can be used.
Additional Azure models can be registered dynamically using AZURE_DEVELOPMENT_NAME_* environment variables in values.yaml. The env var value becomes both the deployment lookup key and the deployment name. For example, setting AZURE_DEVELOPMENT_NAME_GPT_41=gpt-4.1 registers gpt-4.1 as a known deployment.
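For example, a minimal values.yaml sketch using keys that also appear in the full Ultron configuration later in this guide:

```yaml
openai:
  secretKeys:
    azure:
      # Each AZURE_DEVELOPMENT_NAME_* value becomes both the lookup key and the deployment name
      AZURE_DEVELOPMENT_NAME_GPT_41: gpt-4.1
      AZURE_DEVELOPMENT_NAME_O3: o3
```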
Azure OpenAI models require specific deployment SKUs. Using the wrong SKU results in 400 InvalidResourceProperties or 400 ServiceModelDeprecated errors.
Standard: Deploys in a specific region. Use for data residency requirements (e.g., japaneast for Japan-local processing). Available for gpt-4.1, gpt-4.1-mini, gpt-4o, text-embedding-ada-002. Note that Standard SKUs are being retired on a per-model, per-region schedule — verify availability before deploying.
GlobalStandard: Deploys on Azure’s global infrastructure (requests are routed to the nearest available region). Required for gpt-4.1-nano, o1, o3, o3-mini, o4-mini, and gpt-5 series — these models do not support Standard.
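As a sketch, deploying a GlobalStandard-only model via the Azure CLI might look like the following (resource and group names are placeholders; verify flags and the model version against current Azure CLI documentation):

```bash
# Deploys o3 with the GlobalStandard SKU; Standard would fail for this model
az cognitiveservices account deployment create \
  --name my-openai-resource \
  --resource-group my-rg \
  --deployment-name o3 \
  --model-name o3 \
  --model-format OpenAI \
  --model-version "<model-version>" \
  --sku-name GlobalStandard \
  --sku-capacity 1
```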
text-embedding-ada-002 is still GA (no retirement scheduled before 2027-04-15), but Microsoft recommends text-embedding-3-small or text-embedding-3-large for new deployments.
Even when a user selects a specific model, background operations use default models. These defaults must be available from a configured provider.

Ultron defaults:
| Role | Default Model | Environment Variable |
| --- | --- | --- |
| Main background model | gpt-4.1 | OPENAI_MAIN_MODEL_NAME |
| Small model | gpt-4o-mini | (code default) |
| Vision model | gpt-4o | (code default) |
OPENAI_MAIN_MODEL_NAME is required for both AI_MODE: azure and AI_MODE: open_ai. For Azure, set this to a deployment name that exists in your Azure OpenAI resource.

Boost defaults:
Boost does not maintain a hardcoded model list. It discovers available models dynamically:
| Endpoint Type | How Models Are Discovered |
| --- | --- |
| Azure OpenAI | Deployment name extracted from the URL path |
| OpenAI-compatible (incl. LiteLLM) | Calls GET /v1/models on the endpoint |
| Internal workflows | Registered from Boost’s workflow registry |
| Local audio | Hardcoded: jitera/tts and jitera/stt (via sherpa-onnx) |
To make a model available in Boost, configure an endpoint that serves it (Azure deployment, LiteLLM proxy entry, or OpenAI-compatible API) and register it in SuperAdmin with a name that matches the discovered model ID.
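To check which model IDs Boost will discover from an OpenAI-compatible endpoint, you can query it directly. A sketch, assuming the in-cluster LiteLLM service address and master key shown in the LiteLLM connection step later in this guide:

```bash
# Lists the model_name entries served by the LiteLLM proxy; run from inside the cluster
curl -s http://jitera-litellm:80/v1/models \
  -H "Authorization: Bearer <litellm-master-key>"
```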
AI_MODE determines Ultron’s primary provider only. Other providers (Bedrock, Anthropic, Google, vLLM) can be added alongside either mode. Boost is provider-agnostic — it connects to any OpenAI-compatible endpoint.
Create an Azure OpenAI resource and obtain your endpoint URL and API key. For detailed and up-to-date instructions, see the Azure OpenAI documentation.
The AZURE_OPENAI_INSTANCE_NAMES value must be a custom subdomain name (e.g., my-instance), not a regional endpoint. Ultron constructs URLs as https://{instance}.openai.azure.com. If your Azure OpenAI resource uses a regional endpoint (e.g., eastus2.api.cognitive.microsoft.com), enable a custom subdomain:
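For example, a sketch with placeholder resource names (confirm the flag against current Azure CLI documentation):

```bash
# Assigns the custom subdomain "my-instance" to an existing Azure OpenAI resource
az cognitiveservices account update \
  --name my-openai-resource \
  --resource-group my-rg \
  --custom-domain my-instance
```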
You can verify your endpoint format in the Azure Portal under your OpenAI resource > Keys and Endpoint. The endpoint must be https://<subdomain>.openai.azure.com/.
Jitera’s backend services (Ultron and Boost) reference Azure deployments by name through environment variables and endpoint URLs. Each deployment you create here must match the deployment name configured in Jitera’s Helm values — otherwise the services cannot route requests to the correct model. For the full list of deployment-to-environment-variable mappings, see the Azure Model-to-Deployment Mapping section.

Deploy the following models in the Azure OpenAI Studio or via the Azure CLI. For deployment instructions, see the Azure OpenAI deployment guide.
Use the same string for both Model deployment name and Model name (e.g. deploy gpt-4.1 with deployment name gpt-4.1). This simplifies configuration since Jitera uses the deployment name as the routing key.
Ultron reads Azure OpenAI settings from environment variables injected via openai.secretKeys.azure:
```yaml
openai:
  AI_MODE: azure  # Required: set to "azure" for Azure OpenAI
  secretKeys:
    azure:
      # === Main Region (e.g. Japan East) ===
      AZURE_OPENAI_KEY: "<your-api-key>"
      AZURE_OPENAI_KEYS: '["<key1>", "<key2>"]'  # JSON array for load balancing
      AZURE_OPENAI_INSTANCE_NAME: "<your-instance-name>"
      AZURE_OPENAI_INSTANCE_NAMES: '["<instance1>", "<instance2>"]'
      AZURE_OPENAI_VERSION: "2024-10-21"
      # Deployment names (must match Azure portal)
      AZURE_OPENAI_DEVELOPMENT_NAME: gpt-4.1  # Default/fallback model
      AZURE_OPENAI_EMBEDDING_DEVELOPMENT_NAME: text-embedding-ada-002
      AZURE_OPENAI_VISION_DEVELOPMENT_NAME: gpt-4o
      AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME: gpt-4o
      AZURE_OPENAI_GPT_4O_MINI_DEVELOPMENT_NAME: gpt-4o-mini
      AZURE_DEVELOPMENT_NAME_GPT_41: gpt-4.1
      # === Global Region (e.g. Sweden Central or US East) ===
      # Required for O1, O3, and GPT-5 models
      AZURE_OPENAI_GLOBAL_KEYS: '["<global-key1>", "<global-key2>"]'
      AZURE_OPENAI_GLOBAL_INSTANCE_NAMES: '["<instance-swedencentral>"]'
      AZURE_OPENAI_GLOBAL_VERSION: "2024-12-01-preview"
      # O1/O3 models (auto-routed to Global region)
      AZURE_OPENAI_GPT_O1_DEVELOPMENT_NAME: o1
      AZURE_OPENAI_GPT_O1_MINI_DEVELOPMENT_NAME: o1-mini
      AZURE_OPENAI_GPT_O3_MINI_DEVELOPMENT_NAME: o3-mini
      AZURE_DEVELOPMENT_NAME_O3: o3
      # GPT-5 models (requires azure_global: prefix in SuperAdmin modelKey)
      AZURE_OPENAI_GPT_5_DEVELOPMENT_NAME: gpt-5
      AZURE_OPENAI_GPT_5_MINI_DEVELOPMENT_NAME: gpt-5-mini
      AZURE_OPENAI_GPT_5_NANO_DEVELOPMENT_NAME: gpt-5-nano
      AZURE_OPENAI_GPT_5_CHAT_DEVELOPMENT_NAME: gpt-5-chat
      AZURE_OPENAI_GPT_51_DEVELOPMENT_NAME: gpt-5.1
      AZURE_OPENAI_GPT_52_DEVELOPMENT_NAME: gpt-5.2
    openai:
      # Main model name — used by Ultron for background tasks regardless of AI_MODE
      OPENAI_MAIN_MODEL_NAME: gpt-4.1
```
OPENAI_MAIN_MODEL_NAME is required even when using Azure mode. Despite being under the openai key, this value is injected into Ultron unconditionally and determines the model used for background processing tasks. For Azure deployments, set this to a deployment name that exists in your Azure OpenAI resource (e.g., gpt-4.1).
Boost reads Azure OpenAI settings from credentials.boost.JITERA_BOOST_API_CONFIG_AZURE_* variables. Each variable encodes one Azure deployment endpoint.

Format:
All JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_* keys defined in the chart’s values.yaml must be explicitly overridden in your values file. Keys left with the default placeholder value (<REPLACE_WITH_YOUR_AZURE_CONFIG>) will crash Boost on startup with a Pydantic validation error. Set models you have not deployed to an empty string ("").
How Boost discovers model names from Azure: Boost extracts the deployment name from the last path segment of the URL:

```
URL: https://instance.openai.azure.com/openai/deployments/gpt-4.1
                                                          ↓
Discovered model name: gpt-4.1
```

This name must match the SuperAdmin LLM name field exactly.

Endpoint format parameters:
O1, O1-mini, and O3-mini are automatically routed to the Azure Global region by Ultron. GPT-5 models are not auto-routed — use the azure_global: prefix in the SuperAdmin modelKey (e.g. azure_global:gpt-5) to route them to the Global region.
Enable the required Claude models in the AWS Bedrock console. For detailed instructions, see the AWS Bedrock documentation.

Common models used with Jitera:
Claude 3.5 Sonnet v2 (anthropic.claude-3-5-sonnet-20241022-v2:0)
Claude 3.7 Sonnet (anthropic.claude-3-7-sonnet-20250219-v1:0)
Claude Sonnet 4 (anthropic.claude-sonnet-4-20250514-v1:0)
Claude Sonnet 4.5 (anthropic.claude-sonnet-4-5-20250929-v1:0)
Claude Sonnet 4.6 (anthropic.claude-sonnet-4-6)
Claude Opus 4 (anthropic.claude-opus-4-20250514-v1:0)
Claude Opus 4.1 (anthropic.claude-opus-4-1-20250805-v1:0)
Claude Opus 4.5 (anthropic.claude-opus-4-5-20251101-v1:0)
Ultron calls Bedrock directly using credentials from openai.secretKeys.bedrock:
```yaml
openai:
  secretKeys:
    bedrock:
      # Main region (e.g. ap-northeast-1 for APAC)
      BEDROCK_CONVERSE_REGION: ap-northeast-1
      BEDROCK_CONVERSE_ACCESS_KEY_ID: "<aws-access-key>"
      BEDROCK_CONVERSE_SECRET_ACCESS_KEY: "<aws-secret-key>"
      # Global region — required for Claude 3.7 and Claude 4 Opus
      BEDROCK_CONVERSE_GLOBAL_REGION: us-east-1
      BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID: "<aws-access-key>"
      BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY: "<aws-secret-key>"
```
Boost accesses Claude through the LiteLLM proxy, reusing the same Bedrock credentials.

Step 4a — LiteLLM credentials are injected automatically from openai.secretKeys.bedrock:
Step 4b — Add Claude models to litellm-proxy-config.yaml:

For each model, set model to bedrock/ followed by the Bedrock model ID from the model ID table below. The model_name must match the SuperAdmin name field. AWS credentials are inherited from the environment (Step 4a), so only aws_region_name is required.
```yaml
# charts/jitera/extra_config/litellm-proxy-config.yaml
model_list:
  - model_name: claude-3.5-sonnet  # Must match SuperAdmin name
    litellm_params:
      model: bedrock/apac.anthropic.claude-3-5-sonnet-20241022-v2:0  # bedrock/ + model ID
      aws_region_name: ap-northeast-1
  - model_name: claude-sonnet-4
    litellm_params:
      model: bedrock/apac.anthropic.claude-sonnet-4-20250514-v1:0
      aws_region_name: ap-northeast-1
  - model_name: claude-sonnet-4.6
    litellm_params:
      model: bedrock/us.anthropic.claude-sonnet-4-6
      aws_region_name: us-east-1
  - model_name: claude-opus-4.6
    litellm_params:
      model: bedrock/global.anthropic.claude-opus-4-6-v1
      aws_region_name: us-east-1
  # Add additional models following the same pattern

general_settings:
  master_key: os.environ/PROXY_MASTER_KEY
```
Step 4c — Configure the Boost connection to LiteLLM (in credentials.boost):
```yaml
credentials:
  boost:
    JITERA_BOOST_OPENAI_KEY_LITELLM: "<litellm-master-key>"
    # JITERA_BOOST_OPENAI_URL_LITELLM is set automatically to http://jitera-litellm:80
```
Provider selection — the ARN contains anthropic.claude, which triggers the Bedrock Converse provider (see routing rules)
Region selection — the {region} in the ARN determines which Bedrock credentials to use. If it matches BEDROCK_CONVERSE_REGION (e.g. ap-northeast-1), main region credentials are used. If it matches BEDROCK_CONVERSE_GLOBAL_REGION (e.g. us-east-1), global region credentials are used.
Cross-region inference profile IDs: Use these AWS Bedrock cross-region inference profile IDs as the {profile-id} in the ARN for the SuperAdmin modelKey, and in the LiteLLM config with a bedrock/ prefix.
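For illustration, assuming the standard Bedrock inference-profile ARN format (the account ID below is a placeholder):

```
SuperAdmin modelKey (Ultron):
  arn:aws:bedrock:ap-northeast-1:123456789012:inference-profile/apac.anthropic.claude-3-5-sonnet-20241022-v2:0

LiteLLM model entry (Boost):
  bedrock/apac.anthropic.claude-3-5-sonnet-20241022-v2:0
```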
Gemini preview model IDs (those with -preview suffix) change when Google promotes a model to GA or releases a new preview. Verify current model IDs in the Google AI documentation when configuring.
Ultron modelKey format for OpenAI Direct: Use the openai: prefix to explicitly route to OpenAI’s API. The prefix is stripped and the remainder is used as the model ID (for example, openai:gpt-4.1 is sent to OpenAI as gpt-4.1).
Ultron modelKey requirements for Anthropic Direct: The modelKey must contain claude but must not contain anthropic.claude — otherwise Bedrock is used instead. For example, claude-sonnet-4-20250514 routes to Anthropic Direct, while anthropic.claude-sonnet-4-20250514-v1:0 routes to Bedrock.
The proxy model list is defined in charts/jitera/extra_config/litellm-proxy-config.yaml. See the AWS Bedrock and Google Gemini sections for model configuration examples.
Boost includes a Web Search Agent that provides web search, URL reading, and deep research capabilities. These features require additional API keys and firewall rules beyond the core LLM configuration.
Each capability has multiple backend options with a fallback chain:
| Capability | Tool | Default Backend | Fallback | Trigger |
| --- | --- | --- | --- | --- |
| Web Search | boost__web_search | Tavily (if API key set) | SearXNG (if Tavily not configured) | Agent explicitly calls the tool |
| Google Search | boost__google_search | Google (Agno scraping) | N/A | Legacy — registered globally but not used by any current workflow |
| URL Reading | boost__read_webpage | Jina Reader (r.jina.ai) | None | Agent explicitly calls the tool (e.g., deep-research skill) |
| URL Reading | read-urls middleware | MarkItDown (local conversion) | Jina Reader (if MarkItDown returns empty) | Automatically processes URLs in user messages |
The boost__read_webpage tool depends exclusively on Jina Reader with no fallback. If Jina Reader is unreachable, this tool will fail. Skills that rely on it (e.g., deep-research) will not function.
If neither Tavily nor SearXNG is configured, the boost__web_search tool will not be registered. Skills that depend on it (e.g., deep-research) will fail. boost__google_search exists in the global tool registry but is not used by any current workflow — it is a legacy tool from Document Agent v0.1.5.
The deep-research skill requires both of the following:
| Requirement | Minimum Configuration |
| --- | --- |
| boost__web_search | Tavily API key or SearXNG URL must be configured |
| boost__read_webpage | Jina Reader (r.jina.ai) must be accessible |
Jina’s free tier (20 RPM) may hit rate limits during deep research sessions that read 20+ URLs. Consider setting JITERA_BOOST_JINA_READER_API_KEY for higher limits in production.
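A minimal sketch, assuming the key is set under credentials.boost alongside the other Boost variables shown in this guide:

```yaml
credentials:
  boost:
    # Optional: raises Jina Reader rate limits above the free tier's 20 RPM
    JITERA_BOOST_JINA_READER_API_KEY: "<jina-api-key>"
```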
Ultron determines the LLM provider by pattern-matching the modelKey field from the SuperAdmin LLM record. Patterns are evaluated in the following priority order:
Priority order matters. For example, a Bedrock ARN containing anthropic.claude-3-5-sonnet matches rule 4 (contains anthropic.claude) before rule 5 (contains claude). Construct modelKey values carefully to avoid ambiguous matches.
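For illustration, here is how a few modelKey values would route under the rules cited in this guide (the Bedrock account ID is a placeholder):

```
azure_global:gpt-5        → Azure OpenAI, Global region (explicit azure_global: prefix)
openai:gpt-4.1            → OpenAI Direct (prefix stripped; sent as gpt-4.1)
arn:aws:bedrock:ap-northeast-1:123456789012:inference-profile/apac.anthropic.claude-3-5-sonnet-20241022-v2:0
                          → Bedrock Converse (contains anthropic.claude)
claude-sonnet-4-20250514  → Anthropic Direct (contains claude but not anthropic.claude)
```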
GPT-5 is not automatically routed to the Azure Global region. Set the SuperAdmin modelKey to azure_global:gpt-5 to explicitly route it to the Global region.