

Jitera Self-Hosted uses AI/LLM providers for code generation, AI chat, documentation generation, and code understanding. This guide covers the complete configuration for all supported providers.
The third-party service procedures in this guide (Azure OpenAI, OpenAI, AWS Bedrock, Anthropic, Google AI) are provided as examples. Refer to the official documentation for each provider for the most up-to-date instructions.

Architecture Overview

Jitera routes LLM requests through two internal services — Ultron and Boost — each with its own configuration path.
| Service | Role | Configuration | Provider Access |
| --- | --- | --- | --- |
| Ultron | AI agent processing, background tasks | openai.secretKeys.* in values.yaml | Azure OR OpenAI Direct (via AI_MODE) + optional: Bedrock, Anthropic, Google |
| Boost | Workflow engine, chat, custom agents | credentials.boost.* + litellm-proxy-config.yaml | Azure or OpenAI-compatible endpoints + LiteLLM proxy |
| LiteLLM | Model proxy for Boost | extra_config/litellm-proxy-config.yaml | Routes Claude/Gemini for Boost |

Configuration Files

charts/jitera/values.yaml                             # Main configuration
charts/jitera/extra_config/litellm-proxy-config.yaml  # Claude/Gemini models for Boost

Key Concepts

| Term | Description | Example |
| --- | --- | --- |
| name | Display name in SuperAdmin — also the routing key for Boost | gpt-4.1, claude-3.5-sonnet |
| modelKey | Routing key for Ultron — pattern-matched to select the provider and credentials | arn:aws:bedrock:ap-northeast-1:123456789012:inference-profile/apac.anthropic.claude-3-5-sonnet-20241022-v2:0 |
| model_name | LiteLLM alias — must match SuperAdmin name for non-Azure providers | claude-3.5-sonnet, gemini-2.5-pro |
For Boost, the SuperAdmin name field is the routing key and must match either the Azure deployment name (extracted from the URL) or the LiteLLM model_name. For Ultron, the modelKey field determines provider routing.

Required Models by Provider

Ultron and Boost each access LLM providers differently. The tables below list the required and available models for each provider, organized by which service uses them.
  • Ultron routes requests by pattern-matching the modelKey field (see Provider Routing Reference). Any model ID that matches a supported provider pattern will work — the lists below cover pre-configured and commonly used models.
  • Boost dynamically discovers models from configured endpoints at runtime. Any model available through a configured endpoint (Azure deployment, LiteLLM proxy, or OpenAI-compatible API) can be used.

Azure OpenAI

| Model | Service | Role | Region |
| --- | --- | --- | --- |
| gpt-4.1 | Ultron, Boost | Default chat/completion; Ultron background model; Boost expert default | Main |
| gpt-4.1-mini | Ultron, Boost | Fast chat; Boost versatile default | Main |
| gpt-4.1-nano | Ultron, Boost | Lightweight tasks; Boost base/direct-tasks default | Main |
| gpt-4o | Ultron, Boost | Vision / Multimodal; Ultron vision default | Main |
| gpt-4o-mini | Ultron, Boost | Fast completions; Ultron small-model default | Main |
| text-embedding-ada-002 | Ultron, Boost | Embeddings | Main |
| gpt-4o-transcribe | Ultron | Audio transcription | Global |
| gpt-4o-mini-transcribe | Ultron | Audio transcription (efficient) | Global |
| o1 | Ultron, Boost | Advanced reasoning | Global (auto-routed) |
| o3 | Ultron, Boost | Advanced reasoning | Global (auto-routed) |
| o3-mini | Ultron, Boost | Efficient reasoning | Global (auto-routed) |
| o4-mini | Ultron, Boost | Efficient reasoning | Global |
| gpt-5 | Ultron, Boost | Next-gen | Global (azure_global: prefix) |
| gpt-5-mini | Ultron, Boost | Next-gen efficient | Global (azure_global: prefix) |
| gpt-5-nano | Ultron, Boost | Next-gen lightweight | Global (azure_global: prefix) |
| gpt-5.1 | Ultron, Boost | Next-gen | Global (azure_global: prefix) |
| gpt-5.2 | Ultron, Boost | Next-gen | Global (azure_global: prefix) |
Additional Azure models can be registered dynamically using AZURE_DEVELOPMENT_NAME_* environment variables in values.yaml. The env var value becomes both the deployment lookup key and the deployment name. For example, setting AZURE_DEVELOPMENT_NAME_GPT_41=gpt-4.1 registers gpt-4.1 as a known deployment.
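As a minimal sketch, such variables sit alongside the other Azure settings in values.yaml (the variable suffix is free-form; the value must match a deployment name that exists in your Azure resource):

```yaml
openai:
  secretKeys:
    azure:
      # Each AZURE_DEVELOPMENT_NAME_* value serves as both the lookup key
      # and the Azure deployment name
      AZURE_DEVELOPMENT_NAME_GPT_41: gpt-4.1
      AZURE_DEVELOPMENT_NAME_O3: o3
```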
Azure OpenAI models require specific deployment SKUs. Using the wrong SKU results in 400 InvalidResourceProperties or 400 ServiceModelDeprecated errors.
  • Standard: Deploys in a specific region. Use for data residency requirements (e.g., japaneast for Japan-local processing). Available for gpt-4.1, gpt-4.1-mini, gpt-4o, text-embedding-ada-002. Note that Standard SKUs are being retired on a per-model, per-region schedule — verify availability before deploying.
  • GlobalStandard: Deploys on Azure’s global infrastructure (requests are routed to the nearest available region). Required for gpt-4.1-nano, o1, o3, o3-mini, o4-mini, and gpt-5 series — these models do not support Standard.
Check model and SKU availability for your region in the Azure OpenAI model matrix.
Some models in this table are approaching retirement. Plan migration to the listed replacements before these dates:
  • gpt-4o — Standard retired 2026-03-31; other SKUs retire 2026-10-01 (replacement: gpt-5.1)
  • gpt-4o-mini — Standard retired 2026-03-31; other SKUs retire 2026-10-01 (replacement: gpt-4.1-mini)
  • o1 — retires 2026-07-15 (replacement: o3)
  • o3-mini — retires 2026-08-02 (replacement: o4-mini)
  • gpt-4o-transcribe — retires 2026-06-01
Verify current dates on the Azure OpenAI model retirements page.
text-embedding-ada-002 is still GA (no retirement scheduled before 2027-04-15), but Microsoft recommends text-embedding-3-small or text-embedding-3-large for new deployments.

AWS Bedrock (Claude)

Ultron routes any modelKey containing anthropic.claude to AWS Bedrock Converse. Boost accesses Claude through the LiteLLM proxy.
The following Bedrock model IDs are approaching retirement. Plan migration to the listed replacements before these dates:
  • anthropic.claude-3-7-sonnet-20250219-v1:0 — EOL 2026-04-28 (replacement: anthropic.claude-sonnet-4-6)
  • anthropic.claude-opus-4-20250514-v1:0 — EOL 2026-05-31 (replacement: anthropic.claude-opus-4-6-v1)
  • anthropic.claude-3-5-haiku-20241022-v1:0 — EOL 2026-06-19 (replacement: anthropic.claude-haiku-4-5-20251001-v1:0)
  • anthropic.claude-3-5-sonnet-20240620-v1:0, anthropic.claude-3-5-sonnet-20241022-v2:0 — APAC EOL 2026-07-30
  • anthropic.claude-3-haiku-20240307-v1:0 — Bedrock EOL 2026-09-10 (already retired on Anthropic API)
  • anthropic.claude-sonnet-4-20250514-v1:0 — moved to Legacy 2026-04-14, Bedrock EOL 2026-10-14
Verify current dates on the AWS Bedrock model lifecycle page and Anthropic model deprecations.
| Model | Service | Bedrock Model ID | Region |
| --- | --- | --- | --- |
| Claude 3 Haiku | Ultron | anthropic.claude-3-haiku-20240307-v1:0 | APAC |
| Claude 3.5 Haiku | Ultron | anthropic.claude-3-5-haiku-20241022-v1:0 | US |
| Claude Haiku 4.5 | Ultron | anthropic.claude-haiku-4-5-20251001-v1:0 | US / APAC |
| Claude 3.5 Sonnet v1 | Ultron | anthropic.claude-3-5-sonnet-20240620-v1:0 | APAC |
| Claude 3.5 Sonnet v2 | Ultron, Boost (via LiteLLM) | anthropic.claude-3-5-sonnet-20241022-v2:0 | APAC |
| Claude 3.7 Sonnet | Ultron, Boost (via LiteLLM) | anthropic.claude-3-7-sonnet-20250219-v1:0 | APAC |
| Claude Sonnet 4 | Ultron, Boost (via LiteLLM) | anthropic.claude-sonnet-4-20250514-v1:0 | APAC |
| Claude Sonnet 4.5 | Ultron, Boost (via LiteLLM) | anthropic.claude-sonnet-4-5-20250929-v1:0 | US / APAC |
| Claude Sonnet 4.6 | Ultron, Boost (via LiteLLM) | anthropic.claude-sonnet-4-6 | US |
| Claude Opus 4 | Ultron | anthropic.claude-opus-4-20250514-v1:0 | US |
| Claude Opus 4.1 | Ultron | anthropic.claude-opus-4-1-20250805-v1:0 | US |
| Claude Opus 4.5 | Ultron | anthropic.claude-opus-4-5-20251101-v1:0 | US |
| Claude Opus 4.6 | Ultron, Boost (via LiteLLM) | anthropic.claude-opus-4-6-v1 | US |

Anthropic Direct API

Ultron routes any modelKey containing claude (but not anthropic.claude) to the Anthropic API directly.
The following Claude models are deprecated on the Anthropic API and retire on 2026-06-15:
  • claude-sonnet-4-20250514 (replacement: claude-sonnet-4-6)
  • claude-opus-4-20250514 (replacement: claude-opus-4-6)
Verify current dates on the Anthropic model deprecations page.
| Model | Service | modelKey |
| --- | --- | --- |
| Claude Sonnet 4 | Ultron | claude-sonnet-4-20250514 |
| Claude Sonnet 4.5 | Ultron | claude-sonnet-4-5-20250929 |
| Claude Sonnet 4.6 | Ultron | claude-sonnet-4-6 |
| Claude Opus 4 | Ultron | claude-opus-4-20250514 |
| Claude Opus 4.1 | Ultron | claude-opus-4-1-20250805 |
| Claude Opus 4.5 | Ultron | claude-opus-4-5-20251101 |
| Claude Opus 4.6 | Ultron | claude-opus-4-6 |

Google Gemini

Ultron routes any modelKey containing gemini to Google Generative AI. Boost accesses Gemini through the LiteLLM proxy.
| Model | Service | modelKey |
| --- | --- | --- |
| Gemini 3.1 Pro | Ultron, Boost (via LiteLLM) | gemini-3.1-pro |
| Gemini 3 Pro Image | Ultron, Boost (via LiteLLM) | gemini-3-pro-image |
| Gemini 3 Flash | Ultron, Boost (via LiteLLM) | gemini-3-flash |
| Gemini 2.5 Pro | Ultron, Boost (via LiteLLM) | gemini-2.5-pro |
| Gemini 2.5 Flash | Ultron, Boost (via LiteLLM) | gemini-2.5-flash |
| Gemini 2.5 Flash Image | Ultron, Boost (via LiteLLM) | gemini-2.5-flash-image |
| Gemini 2.0 Flash | Ultron, Boost (via LiteLLM) | gemini-2.0-flash |
Models matching gemini-2.0-flash-thinking*, gemini-2.5*, or gemini-3* automatically have thinking/reasoning enabled.
Vertex AI support is not included in v26.02.16. It will be available in a future release.

Other Providers

| Provider | Service | Pattern Match | modelKey Example |
| --- | --- | --- | --- |
| OpenAI Direct | Ultron, Boost | Starts with openai: | openai:gpt-4o |
| Groq | Ultron | Contains deepseek-r1-distill | deepseek-r1-distill-llama-70b |
| Qwen (vLLM) | Ultron | Contains qwen | qwen-2.5-72b |
| Ollama | Ultron | Configured via OLLAMA_BASE_URL | Any model name |
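For the Ollama case, a hedged sketch of the base-URL setting; note this guide does not specify where OLLAMA_BASE_URL lives in values.yaml, so both its placement under openai.secretKeys and the URL below are illustrative assumptions:

```yaml
openai:
  secretKeys:
    openai:
      # Placement and URL are assumptions -- point this at your Ollama server
      OLLAMA_BASE_URL: "http://ollama.internal:11434"
```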

Default Models for Background Tasks

Even when a user selects a specific model, background operations use default models. These defaults must be available from a configured provider. Ultron defaults:
| Role | Default Model | Environment Variable |
| --- | --- | --- |
| Main background model | gpt-4.1 | OPENAI_MAIN_MODEL_NAME |
| Small model | gpt-4o-mini | (code default) |
| Vision model | gpt-4o | (code default) |
OPENAI_MAIN_MODEL_NAME is required for both AI_MODE: azure and AI_MODE: open_ai. For Azure, set this to a deployment name that exists in your Azure OpenAI resource.

Boost defaults:
| Role | Default Model | Environment Variable |
| --- | --- | --- |
| Base (simple tasks) | gpt-4.1-nano | JITERA_BOOST_DEFAULT_BASE_MODEL |
| Direct tasks (titles, tags) | gpt-4.1-nano | JITERA_BOOST_DIRECT_TASKS_MODEL |
| Versatile (balanced) | gpt-4.1-mini | JITERA_BOOST_DEFAULT_VERSATILE_MODEL |
| Expert (complex reasoning) | gpt-4.1 | JITERA_BOOST_DEFAULT_EXPERT_MODEL |
| Vision | gpt-4.1 | JITERA_BOOST_DEFAULT_VISION_MODEL |
| Embeddings | text-embedding-ada-002 | JITERA_BOOST_DEFAULT_EMBEDDING_MODEL |
| Audio (speech-to-text) | jitera/stt | JITERA_BOOST_DEFAULT_AUDIO_MODEL |
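A hedged sketch of overriding some Boost defaults, assuming (like the other JITERA_BOOST_* variables in this guide) they are set under credentials.boost; the values shown are the documented defaults:

```yaml
credentials:
  boost:
    # Override Boost background-task defaults; every name here must resolve
    # to a model discoverable from a configured endpoint
    JITERA_BOOST_DEFAULT_EXPERT_MODEL: gpt-4.1
    JITERA_BOOST_DEFAULT_VERSATILE_MODEL: gpt-4.1-mini
    JITERA_BOOST_DEFAULT_EMBEDDING_MODEL: text-embedding-ada-002
```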

Model Discovery (Boost)

Boost does not maintain a hardcoded model list. It discovers available models dynamically:
| Endpoint Type | How Models Are Discovered |
| --- | --- |
| Azure OpenAI | Deployment name extracted from the URL path |
| OpenAI-compatible (incl. LiteLLM) | Calls GET /v1/models on the endpoint |
| Internal workflows | Registered from Boost’s workflow registry |
| Local audio | Hardcoded: jitera/tts and jitera/stt (via sherpa-onnx) |
To make a model available in Boost, configure an endpoint that serves it (Azure deployment, LiteLLM proxy entry, or OpenAI-compatible API) and register it in SuperAdmin with a name that matches the discovered model ID.
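For instance, a generic OpenAI-compatible endpoint could be registered with a behavior=openai entry. The variable name JITERA_BOOST_API_CONFIG_CUSTOM_1 and the URL below are hypothetical, following the pattern of the Azure endpoint variables shown later in this guide:

```yaml
credentials:
  boost:
    # Hypothetical endpoint; Boost calls GET /v1/models on it to discover models,
    # and each discovered ID must match a SuperAdmin name to be usable
    JITERA_BOOST_API_CONFIG_CUSTOM_1: 'behavior=openai,url=https://llm.example.com/v1,headers={"Authorization": "Bearer <key>"}'
```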

Choosing a Primary Provider (AI_MODE)

Ultron’s primary LLM provider is set via AI_MODE in values.yaml. Choose one:
| Mode | Provider | Required Environment Variables |
| --- | --- | --- |
| open_ai (default) | OpenAI Direct API | OPENAI_API_KEYS, OPENAI_API_KEY, OPENAI_EMBEDDING_KEY, OPENAI_VISION_KEY |
| azure | Azure OpenAI | AZURE_OPENAI_KEYS, AZURE_OPENAI_INSTANCE_NAMES, AZURE_OPENAI_VERSION, AZURE_OPENAI_DEVELOPMENT_NAME, AZURE_OPENAI_EMBEDDING_DEVELOPMENT_NAME |
openai:
  AI_MODE: open_ai  # or azure
This determines Ultron’s primary provider only. Other providers (Bedrock, Anthropic, Google, vLLM) can be added alongside either mode. Boost is provider-agnostic — it connects to any OpenAI-compatible endpoint.

Azure OpenAI Configuration

Configure Azure OpenAI as Ultron’s primary provider by setting AI_MODE: azure.

Step 1: Create Azure OpenAI Resource

Create an Azure OpenAI resource and obtain your endpoint URL and API key. For detailed and up-to-date instructions, see the Azure OpenAI documentation.
# Create resource
az cognitiveservices account create \
  --name jitera-openai \
  --resource-group jitera-rg \
  --kind OpenAI \
  --sku S0 \
  --location japaneast \
  --custom-domain jitera-openai

# Get endpoint
az cognitiveservices account show \
  --name jitera-openai \
  --resource-group jitera-rg \
  --query "properties.endpoint"

# Get key
az cognitiveservices account keys list \
  --name jitera-openai \
  --resource-group jitera-rg
The AZURE_OPENAI_INSTANCE_NAMES value must be a custom subdomain name (e.g., my-instance), not a regional endpoint. Ultron constructs URLs as https://{instance}.openai.azure.com. If your Azure OpenAI resource uses a regional endpoint (e.g., eastus2.api.cognitive.microsoft.com), enable a custom subdomain:
az cognitiveservices account update \
  --name <resource-name> \
  --resource-group <rg> \
  --custom-domain <desired-subdomain>
You can verify your endpoint format in the Azure Portal under your OpenAI resource > Keys and Endpoint. The endpoint must be https://<subdomain>.openai.azure.com/.

Step 2: Deploy Models

Jitera’s backend services (Ultron and Boost) reference Azure deployments by name through environment variables and endpoint URLs. Each deployment you create here must match the deployment name configured in Jitera’s Helm values — otherwise the services cannot route requests to the correct model. For the full list of deployment-to-environment-variable mappings, see the Azure Model-to-Deployment Mapping section.

Deploy the following models in the Azure OpenAI Studio or via the Azure CLI. For deployment instructions, see the Azure OpenAI deployment guide.
Use the same string for both Model deployment name and Model name (e.g. deploy gpt-4.1 with deployment name gpt-4.1). This simplifies configuration since Jitera uses the deployment name as the routing key.
Recommended minimum deployments:
| Model | Deployment Name | Purpose |
| --- | --- | --- |
| gpt-4.1 | gpt-4.1 | Main chat/completion, default fallback |
| gpt-4o | gpt-4o | Vision, multimodal tasks |
| gpt-4o-mini | gpt-4o-mini | Fast completions |
| text-embedding-ada-002 | text-embedding-ada-002 | Embeddings |
| o1 | o1 | Advanced reasoning (global region) |
| o3-mini | o3-mini | Efficient reasoning (global region) |

Step 3: Configure Ultron (values.yaml)

Ultron reads Azure OpenAI settings from environment variables injected via openai.secretKeys.azure:
openai:
  AI_MODE: azure  # Required: set to "azure" for Azure OpenAI
  secretKeys:
    azure:
      # === Main Region (e.g. Japan East) ===
      AZURE_OPENAI_KEY: "<your-api-key>"
      AZURE_OPENAI_KEYS: '["<key1>", "<key2>"]'              # JSON array for load balancing
      AZURE_OPENAI_INSTANCE_NAME: "<your-instance-name>"
      AZURE_OPENAI_INSTANCE_NAMES: '["<instance1>", "<instance2>"]'
      AZURE_OPENAI_VERSION: "2024-10-21"

      # Deployment names (must match Azure portal)
      AZURE_OPENAI_DEVELOPMENT_NAME: gpt-4.1                  # Default/fallback model
      AZURE_OPENAI_EMBEDDING_DEVELOPMENT_NAME: text-embedding-ada-002
      AZURE_OPENAI_VISION_DEVELOPMENT_NAME: gpt-4o
      AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME: gpt-4o
      AZURE_OPENAI_GPT_4O_MINI_DEVELOPMENT_NAME: gpt-4o-mini
      AZURE_DEVELOPMENT_NAME_GPT_41: gpt-4.1

      # === Global Region (e.g. Sweden Central or US East) ===
      # Required for O1, O3, and GPT-5 models
      AZURE_OPENAI_GLOBAL_KEYS: '["<global-key1>", "<global-key2>"]'
      AZURE_OPENAI_GLOBAL_INSTANCE_NAMES: '["<instance-swedencentral>"]'
      AZURE_OPENAI_GLOBAL_VERSION: "2024-12-01-preview"

      # O1/O3 models (auto-routed to Global region)
      AZURE_OPENAI_GPT_O1_DEVELOPMENT_NAME: o1
      AZURE_OPENAI_GPT_O1_MINI_DEVELOPMENT_NAME: o1-mini
      AZURE_OPENAI_GPT_O3_MINI_DEVELOPMENT_NAME: o3-mini
      AZURE_DEVELOPMENT_NAME_O3: o3

      # GPT-5 models (requires azure_global: prefix in SuperAdmin modelKey)
      AZURE_OPENAI_GPT_5_DEVELOPMENT_NAME: gpt-5
      AZURE_OPENAI_GPT_5_MINI_DEVELOPMENT_NAME: gpt-5-mini
      AZURE_OPENAI_GPT_5_NANO_DEVELOPMENT_NAME: gpt-5-nano
      AZURE_OPENAI_GPT_5_CHAT_DEVELOPMENT_NAME: gpt-5-chat
      AZURE_OPENAI_GPT_51_DEVELOPMENT_NAME: gpt-5.1
      AZURE_OPENAI_GPT_52_DEVELOPMENT_NAME: gpt-5.2

    openai:
      # Main model name — used by Ultron for background tasks regardless of AI_MODE
      OPENAI_MAIN_MODEL_NAME: gpt-4.1
OPENAI_MAIN_MODEL_NAME is required even when using Azure mode. Despite being under the openai key, this value is injected into Ultron unconditionally and determines the model used for background processing tasks. For Azure deployments, set this to a deployment name that exists in your Azure OpenAI resource (e.g., gpt-4.1).

Step 4: Configure Boost (values.yaml)

Boost reads Azure OpenAI settings from credentials.boost.JITERA_BOOST_API_CONFIG_AZURE_* variables. Each variable encodes one Azure deployment endpoint. Format:
behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/<deployment>,headers={"api-key": "<key>"},query_params={"api-version": "<version>"}
credentials:
  boost:
    # === Main region models ===
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_41: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4.1,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_41_MINI: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4.1-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_41_NANO: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4.1-nano,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_ADA: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/text-embedding-ada-002,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_4O: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4o,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_4O_MINI: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4o-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'

    # === Reasoning models (global region) ===
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O1: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/o1,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O3: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/o3,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O3_MINI: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/o3-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O4_MINI: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/o4-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'

    # === GPT-5 family (global region — Sweden/US) ===
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_5: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_5_MINI: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_5_NANO: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5-nano,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_5_CHAT: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5-chat,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_51: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5.1,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_52: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5.2,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
All JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_* keys defined in the chart’s values.yaml must be explicitly overridden in your values file. Keys left with the default placeholder value (<REPLACE_WITH_YOUR_AZURE_CONFIG>) will crash Boost on startup with a Pydantic validation error. Set models you have not deployed to an empty string ("").
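For example, if you have not created the reasoning-model deployments, a minimal override keeping Boost startable might look like:

```yaml
credentials:
  boost:
    # No global-region deployments yet: replace the chart's placeholder values
    # with empty strings so Boost's config validation passes at startup
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O1: ""
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O3: ""
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O3_MINI: ""
```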
How Boost discovers model names from Azure: Boost extracts the deployment name from the last path segment of the URL:
URL: https://instance.openai.azure.com/openai/deployments/gpt-4.1

Discovered model name: gpt-4.1
This name must match the SuperAdmin LLM name field exactly.

Endpoint format parameters:
| Parameter | Description | Example |
| --- | --- | --- |
| behavior | Provider type | azure or openai |
| url | Full API endpoint URL | https://instance.openai.azure.com/openai/deployments/gpt-4.1 |
| headers | JSON object with request headers | {"api-key": "xxx"} |
| query_params | JSON object with query parameters | {"api-version": "2024-12-01-preview"} |
| weight | Load balancing weight (optional) | 1.0 |
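A sketch of an endpoint entry carrying an explicit weight, appended to the standard format (how weights interact across multiple entries is not specified here; verify against your chart version):

```yaml
credentials:
  boost:
    # Same endpoint format as above, with the optional weight parameter appended
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_41: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4.1,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"},weight=1.0'
```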

SuperAdmin Registration for Azure Models

| Field | Value |
| --- | --- |
| name | Must match Azure deployment name (e.g. gpt-4.1) |
| modelKey | Same as name (e.g. gpt-4.1) |
| provider | Azure OpenAI |
O1, O1-mini, and O3-mini are automatically routed to the Azure Global region by Ultron. GPT-5 models are not auto-routed — use the azure_global: prefix in the SuperAdmin modelKey (e.g. azure_global:gpt-5) to route them to the Global region.

AWS Bedrock Configuration (Claude)

Step 1: Enable Bedrock Models

Enable the required Claude models in the AWS Bedrock console. For detailed instructions, see the AWS Bedrock documentation. Common models used with Jitera:
  • Claude 3.5 Sonnet v2 (anthropic.claude-3-5-sonnet-20241022-v2:0)
  • Claude 3.7 Sonnet (anthropic.claude-3-7-sonnet-20250219-v1:0)
  • Claude Sonnet 4 (anthropic.claude-sonnet-4-20250514-v1:0)
  • Claude Sonnet 4.5 (anthropic.claude-sonnet-4-5-20250929-v1:0)
  • Claude Sonnet 4.6 (anthropic.claude-sonnet-4-6)
  • Claude Opus 4 (anthropic.claude-opus-4-20250514-v1:0)
  • Claude Opus 4.1 (anthropic.claude-opus-4-1-20250805-v1:0)
  • Claude Opus 4.5 (anthropic.claude-opus-4-5-20251101-v1:0)
  • Claude Opus 4.6 (anthropic.claude-opus-4-6-v1)

Step 2: Create IAM Policy

Create an IAM policy that grants Bedrock model invocation permissions. For IAM best practices, see the AWS IAM documentation.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}

Step 3: Configure Ultron (values.yaml)

Ultron calls Bedrock directly using credentials from openai.secretKeys.bedrock:
openai:
  secretKeys:
    bedrock:
      # Main region (e.g. ap-northeast-1 for APAC)
      BEDROCK_CONVERSE_REGION: ap-northeast-1
      BEDROCK_CONVERSE_ACCESS_KEY_ID: "<aws-access-key>"
      BEDROCK_CONVERSE_SECRET_ACCESS_KEY: "<aws-secret-key>"

      # Global region — required for Claude 3.7 and Claude 4 Opus
      BEDROCK_CONVERSE_GLOBAL_REGION: us-east-1
      BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID: "<aws-access-key>"
      BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY: "<aws-secret-key>"
Environment variables injected into Ultron:
| Variable | Purpose |
| --- | --- |
| BEDROCK_CONVERSE_REGION | AWS region for main Bedrock access |
| BEDROCK_CONVERSE_ACCESS_KEY_ID | AWS access key for main region |
| BEDROCK_CONVERSE_SECRET_ACCESS_KEY | AWS secret key for main region |
| BEDROCK_CONVERSE_GLOBAL_REGION | Secondary AWS region for newer models |
| BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID | AWS access key for global region |
| BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY | AWS secret key for global region |

Step 4: Configure Boost for Claude (via LiteLLM)

Boost accesses Claude through the LiteLLM proxy, reusing the same Bedrock credentials.

Step 4a — LiteLLM credentials are injected automatically from openai.secretKeys.bedrock:
AWS_ACCESS_KEY_ID     ← BEDROCK_CONVERSE_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY ← BEDROCK_CONVERSE_SECRET_ACCESS_KEY
Step 4b — Add Claude models to litellm-proxy-config.yaml:

For each model, set model to bedrock/ followed by the Bedrock model ID from the model ID table below. The model_name must match the SuperAdmin name field. AWS credentials are inherited from the environment (Step 4a), so only aws_region_name is required.
# charts/jitera/extra_config/litellm-proxy-config.yaml
model_list:
  - model_name: claude-3.5-sonnet            # Must match SuperAdmin name
    litellm_params:
      model: bedrock/apac.anthropic.claude-3-5-sonnet-20241022-v2:0  # bedrock/ + model ID
      aws_region_name: ap-northeast-1

  - model_name: claude-sonnet-4
    litellm_params:
      model: bedrock/apac.anthropic.claude-sonnet-4-20250514-v1:0
      aws_region_name: ap-northeast-1

  - model_name: claude-sonnet-4.6
    litellm_params:
      model: bedrock/us.anthropic.claude-sonnet-4-6
      aws_region_name: us-east-1

  - model_name: claude-opus-4.6
    litellm_params:
      model: bedrock/global.anthropic.claude-opus-4-6-v1
      aws_region_name: us-east-1

  # Add additional models following the same pattern

general_settings:
  master_key: os.environ/PROXY_MASTER_KEY
Step 4c — Configure the Boost connection to LiteLLM (in credentials.boost):
credentials:
  boost:
    JITERA_BOOST_OPENAI_KEY_LITELLM: "<litellm-master-key>"
    # JITERA_BOOST_OPENAI_URL_LITELLM is set automatically to http://jitera-litellm:80

SuperAdmin Registration for Claude

| Field | Value |
| --- | --- |
| name | Must match LiteLLM model_name (e.g. claude-3.5-sonnet) — used by Boost for routing |
| modelKey | Full ARN of the Bedrock inference profile (see format and examples below) — used by Ultron |
| provider | AWS Bedrock |
modelKey requirements: The modelKey must be a full AWS Bedrock inference profile ARN in the following format:
arn:aws:bedrock:{region}:{account-id}:inference-profile/{profile-id}
| Component | Description | Example |
| --- | --- | --- |
| {region} | AWS region where the inference profile is available | us-east-1, ap-northeast-1 |
| {account-id} | Your AWS account ID | 123456789012 |
| {profile-id} | Cross-region inference profile ID (see table below) | us.anthropic.claude-sonnet-4-5-20250929-v1:0 |
For example:
arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0
Ultron uses this value for two routing decisions:
  1. Provider selection — the ARN contains anthropic.claude, which triggers the Bedrock Converse provider (see routing rules)
  2. Region selection — the {region} in the ARN determines which Bedrock credentials to use. If it matches BEDROCK_CONVERSE_REGION (e.g. ap-northeast-1), main region credentials are used. If it matches BEDROCK_CONVERSE_GLOBAL_REGION (e.g. us-east-1), global region credentials are used.
Cross-region inference profile IDs: Use these AWS Bedrock cross-region inference profile IDs as the {profile-id} in the ARN for the SuperAdmin modelKey, and in the LiteLLM config with a bedrock/ prefix.

| Profile ID | Region | Model |
| --- | --- | --- |
| apac.anthropic.claude-3-5-sonnet-20241022-v2:0 | Main (ap-northeast-1) | Claude 3.5 Sonnet v2 |
| apac.anthropic.claude-3-7-sonnet-20250219-v1:0 | Main (ap-northeast-1) | Claude 3.7 Sonnet |
| apac.anthropic.claude-sonnet-4-20250514-v1:0 | Main (ap-northeast-1) | Claude Sonnet 4 |
| apac.anthropic.claude-sonnet-4-5-20250929-v1:0 | Main (ap-northeast-1) | Claude Sonnet 4.5 |
| us.anthropic.claude-sonnet-4-5-20250929-v1:0 | Global (us-east-1) | Claude Sonnet 4.5 (US) |
| us.anthropic.claude-sonnet-4-6 | Global (us-east-1) | Claude Sonnet 4.6 |
| us.anthropic.claude-opus-4-20250514-v1:0 | Global (us-east-1) | Claude Opus 4 |
| us.anthropic.claude-opus-4-1-20250805-v1:0 | Global (us-east-1) | Claude Opus 4.1 |
| global.anthropic.claude-opus-4-5-20251101-v1:0 | Global (us-east-1) | Claude Opus 4.5 |
| global.anthropic.claude-opus-4-6-v1 | Global (us-east-1) | Claude Opus 4.6 |

Google Gemini Configuration

Step 1: Get API Key

Obtain a Gemini API key from Google AI Studio. For detailed instructions, see the Gemini API documentation.

Step 2: Configure Ultron (values.yaml)

openai:
  secretKeys:
    google:
      GOOGLE_GENERATIVE_API_KEY: "<your-gemini-api-key>"

Step 3: Configure Boost for Gemini (via LiteLLM)

The Gemini API key is injected into the LiteLLM container automatically as GEMINI_API_KEY. Add Gemini models to litellm-proxy-config.yaml:
model_list:
  - model_name: gemini-3.1-pro            # Must match SuperAdmin name
    litellm_params:
      model: gemini/gemini-3.1-pro-preview
      api_key: os.environ/GEMINI_API_KEY

  - model_name: gemini-3-flash
    litellm_params:
      model: gemini/gemini-3-flash-preview
      api_key: os.environ/GEMINI_API_KEY

  - model_name: gemini-2.5-pro
    litellm_params:
      model: gemini/gemini-2.5-pro
      api_key: os.environ/GEMINI_API_KEY

  - model_name: gemini-2.5-flash
    litellm_params:
      model: gemini/gemini-2.5-flash
      api_key: os.environ/GEMINI_API_KEY

  - model_name: gemini-2.0-flash
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GEMINI_API_KEY
Gemini preview model IDs (those with -preview suffix) change when Google promotes a model to GA or releases a new preview. Verify current model IDs in the Google AI documentation when configuring.

SuperAdmin Registration for Gemini

| Field | Value |
| --- | --- |
| name | Must match LiteLLM model_name (e.g. gemini-2.0-flash) |
| modelKey | Same as name |
| provider | Google |

OpenAI Direct Configuration

Configure OpenAI Direct as Ultron’s primary provider by setting AI_MODE: open_ai (the default).

Step 1: Get API Key

Create an API key with GPT-4 access from the OpenAI platform. For details, see the OpenAI API documentation.

Step 2: Configure Ultron (values.yaml)

openai:
  AI_MODE: open_ai  # Default
  secretKeys:
    openai:
      OPENAI_API_KEY: "<your-openai-api-key>"
      OPENAI_API_KEYS: '["<key1>", "<key2>"]'   # JSON array for load balancing
      OPENAI_EMBEDDING_KEY: "<your-embedding-api-key>"
      OPENAI_VISION_KEY: "<your-vision-api-key>"
      OPENAI_MAIN_MODEL_NAME: "gpt-4.1"

Step 3: Configure Boost (values.yaml)

Configure Boost endpoints pointing to the OpenAI API:
credentials:
  boost:
    JITERA_BOOST_OPENAI_URL_OPENAI: "https://api.openai.com/v1"
    JITERA_BOOST_OPENAI_KEY_OPENAI: "<your-openai-api-key>"
Ultron modelKey format for OpenAI Direct: Use the openai: prefix to explicitly route to OpenAI’s API. The prefix is stripped and the remainder is used as the model ID.
| modelKey | Actual Model Used |
| --- | --- |
| openai:gpt-4.1 | gpt-4.1 |
| openai:gpt-4o | gpt-4o |

Anthropic Direct API Configuration

Direct access to Anthropic’s API, bypassing AWS Bedrock.

Step 1: Get API Key

Create an API key from the Anthropic Console. For details, see the Anthropic API documentation.

Step 2: Configure Ultron (values.yaml)

openai:
  secretKeys:
    anthropic:
      ANTHROPIC_API_KEY: "<your-anthropic-api-key>"
Ultron modelKey requirements for Anthropic Direct: The modelKey must contain claude but must not contain anthropic.claude — otherwise Bedrock is used instead.
| modelKey | Description |
| --- | --- |
| claude-3-haiku-20240307 | Claude 3 Haiku |
| claude-3-5-sonnet-20241022 | Claude 3.5 Sonnet v2 |
| claude-sonnet-4-20250514 | Claude Sonnet 4 |
| claude-sonnet-4-5-20250929 | Claude Sonnet 4.5 |
| claude-sonnet-4-6 | Claude Sonnet 4.6 |
| claude-opus-4-20250514 | Claude Opus 4 |
| claude-opus-4-1-20250805 | Claude Opus 4.1 |
| claude-opus-4-6 | Claude Opus 4.6 |

SuperAdmin Registration

| Field | Value |
| --- | --- |
| name | claude-sonnet-4-20250514 |
| modelKey | claude-sonnet-4-20250514 |
| provider | Anthropic |

vLLM Configuration

For air-gapped deployments or local model hosting.
vLLM requires GPU nodes with NVIDIA CUDA support.

Step 1: Enable vLLM

vllm:
  enabled: true
  replicaCount: 1
  args:
    - "vllm serve Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ --trust-remote-code --enable-prefix-caching --disable-log-requests --dtype=float16"
  resources:
    requests:
      memory: "4Gi"
      cpu: "2000m"
    limits:
      memory: "32Gi"
      cpu: "8000m"
      nvidia.com/gpu: 1
  nodeSelector:
    accelerator: nvidia-gpu

Step 2: Configure Credentials

credentials:
  vllm:
    HUGGING_FACE_HUB_TOKEN: "<your-hf-token>"

LiteLLM Proxy Configuration

LiteLLM provides a unified API proxy for Claude (Bedrock) and Gemini models used by Boost.
litellm:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      memory: "512Mi"
      cpu: "250m"
The proxy model list is defined in charts/jitera/extra_config/litellm-proxy-config.yaml. See the AWS Bedrock and Google Gemini sections for model configuration examples.
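
For orientation, a model entry in that file generally follows LiteLLM's standard model_list schema; the sketch below is illustrative (the model ID and region are placeholders), and the model configuration examples in the AWS Bedrock and Google Gemini sections are authoritative.

```yaml
model_list:
  - model_name: claude-3.5-sonnet   # must match the SuperAdmin name field
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0
      aws_region_name: ap-northeast-1
```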

Background Model Configuration

Default models for background tasks are documented in the Default Models for Background Tasks section above.

Web Search Agent Configuration

Boost includes a Web Search Agent that provides web search, URL reading, and deep research capabilities. These features require additional API keys and firewall rules beyond the core LLM configuration.

Architecture

The Web Search Agent has two core capabilities:
  1. Web Search — Finding information on the internet
  2. URL Reading — Extracting content from web pages
Each capability has multiple backend options with a fallback chain:
| Capability | Tool | Default Backend | Fallback | Trigger |
| --- | --- | --- | --- | --- |
| Web Search | boost__web_search | Tavily (if API key set) | SearXNG (if Tavily not configured) | Agent explicitly calls the tool |
| Google Search | boost__google_search | Google (Agno scraping) | N/A | Legacy; registered globally but not used by any current workflow |
| URL Reading | boost__read_webpage | Jina Reader (r.jina.ai) | None | Agent explicitly calls the tool (e.g., deep-research skill) |
| URL Reading | read-urls middleware | MarkItDown (local conversion) | Jina Reader (if MarkItDown returns empty) | Automatically processes URLs in user messages |
The boost__read_webpage tool depends exclusively on Jina Reader with no fallback. If Jina Reader is unreachable, this tool will fail. Skills that rely on it (e.g., deep-research) will not function.
If neither Tavily nor SearXNG is configured, the boost__web_search tool will not be registered. Skills that depend on it (e.g., deep-research) will fail. boost__google_search exists in the global tool registry but is not used by any current workflow — it is a legacy tool from Document Agent v0.1.5.

Web Search Backend

Configure one of the two supported search backends: Tavily (JITERA_BOOST_TAVILY_API_KEY) or SearXNG (JITERA_BOOST_SEARXNG_URL).
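
The selection logic can be sketched as follows (the function is a hypothetical helper; the two environment variables are the real Boost settings):

```shell
# Tavily wins when its API key is set; otherwise SearXNG; otherwise the
# boost__web_search tool is not registered at all.
select_web_search_backend() {
  if [ -n "$JITERA_BOOST_TAVILY_API_KEY" ]; then
    echo "tavily"
  elif [ -n "$JITERA_BOOST_SEARXNG_URL" ]; then
    echo "searxng"
  else
    echo "none"
  fi
}
JITERA_BOOST_SEARXNG_URL="https://searxng.internal.example"   # hypothetical URL
select_web_search_backend
```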

URL Reading (Jina Reader)

Jina Reader converts web pages to text for the boost__read_webpage tool.
credentials:
  boost:
    # JITERA_BOOST_JINA_READER_API_URL: "https://r.jina.ai"  # Default
    JITERA_BOOST_JINA_READER_API_KEY: "<your-jina-api-key>"   # Optional — free tier allows 20 RPM
| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| JITERA_BOOST_JINA_READER_API_URL | No | https://r.jina.ai | Jina Reader API URL |
| JITERA_BOOST_JINA_READER_API_KEY | No | "" | Jina API key for higher rate limits |

Reranking (Optional)

Reranking improves search result quality for Document Agent and Code Agent RAG workflows. It is not required for the Web Search Agent to function.
| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| JITERA_BOOST_JINA_BASE_API_URL | No | https://api.jina.ai | Jina Rerank API base URL |
| JITERA_BOOST_CO_API_URL | No | https://api.cohere.ai | Cohere Rerank API URL (alternative to Jina) |
| JITERA_BOOST_CO_API_KEY | No | "" | Cohere API key |

Deep Research Requirements

The deep-research skill requires both of the following:
| Requirement | Minimum Configuration |
| --- | --- |
| boost__web_search | Tavily API key or SearXNG URL must be configured |
| boost__read_webpage | Jina Reader (r.jina.ai) must be accessible |
Jina’s free tier (20 RPM) may hit rate limits during deep research sessions that read 20+ URLs. Consider setting JITERA_BOOST_JINA_READER_API_KEY for higher limits in production.

Minimum Viable Configuration

credentials:
  boost:
    JITERA_BOOST_TAVILY_API_KEY: "tvly-xxxxxxxxxxxxx"
    # Jina Reader uses defaults (https://r.jina.ai, no API key, 20 RPM free tier)
    # MarkItDown requires no configuration (local library)
Required firewall rules: api.tavily.com:443, r.jina.ai:443
For the full list of required firewall rules, see Network and Firewall.

Ultron Provider Routing Reference

Ultron determines the LLM provider by pattern-matching the modelKey field from the SuperAdmin LLM record. Patterns are evaluated in the following priority order:
| Priority | Pattern | Provider | modelKey Example |
| --- | --- | --- | --- |
| 1 | Starts with openai: | OpenAI Direct | openai:gpt-4o |
| 2 | Starts with azure: | Azure (main region) | azure:gpt-4.1 |
| 3 | Starts with azure_global: | Azure (global region) | azure_global:gpt-5 |
| 4 | Contains anthropic.claude | AWS Bedrock Converse | arn:aws:bedrock:ap-northeast-1:123456789012:inference-profile/apac.anthropic.claude-3-5-sonnet-20241022-v2:0 |
| 5 | Contains claude | Anthropic Direct API | claude-3-opus-20240229 |
| 6 | Contains gemini | Google Generative AI | gemini-2.0-flash |
| 7 | Contains deepseek-r1-distill | Groq | deepseek-r1-distill-llama-70b |
| 8 | Equals o1, o1-mini, or o3-mini | Azure Global (auto-routed) | o1 |
| 9 | Contains qwen | OpenAI-compatible endpoint | qwen-2.5-72b |
| 10 | Default | Azure main/default | |
Priority order matters. For example, a Bedrock ARN containing anthropic.claude-3-5-sonnet matches rule 4 (contains anthropic.claude) before rule 5 (contains claude). Construct modelKey values carefully to avoid ambiguous matches.
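
The priority table translates naturally to a first-match-wins shell case statement; the sketch below is an illustrative reimplementation, not the actual Ultron source:

```shell
ultron_provider_for() {
  case "$1" in
    openai:*)              echo "openai-direct" ;;
    azure:*)               echo "azure-main" ;;
    azure_global:*)        echo "azure-global" ;;
    *anthropic.claude*)    echo "bedrock-converse" ;;
    *claude*)              echo "anthropic-direct" ;;
    *gemini*)              echo "google-generative-ai" ;;
    *deepseek-r1-distill*) echo "groq" ;;
    o1|o1-mini|o3-mini)    echo "azure-global" ;;
    *qwen*)                echo "openai-compatible" ;;
    *)                     echo "azure-main" ;;  # rule 10: default
  esac
}
# A Bedrock ARN hits rule 4 before the broader "claude" rule 5:
ultron_provider_for "apac.anthropic.claude-3-5-sonnet-20241022-v2:0"  # prints: bedrock-converse
ultron_provider_for "gemini-2.0-flash"                                # prints: google-generative-ai
```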

Azure Model-to-Deployment Mapping (Ultron)

Ultron maps requested model names to Azure deployment names using these environment variables:
| Requested Model | Environment Variable | Example Value |
| --- | --- | --- |
| gpt-4o | AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME | gpt-4o |
| gpt-4o-mini | AZURE_OPENAI_GPT_4O_MINI_DEVELOPMENT_NAME | gpt-4o-mini |
| gpt-4.1 | AZURE_DEVELOPMENT_NAME_GPT_41 | gpt-4.1 |
| gpt-3.5-instruct | AZURE_OPENAI_GPT_35_INSTRUCT_DEVELOPMENT_NAME | gpt-3.5-instruct |
| text-embedding-ada-002 | AZURE_OPENAI_EMBEDDING_DEVELOPMENT_NAME | text-embedding-ada-002 |
| Vision model | AZURE_OPENAI_VISION_DEVELOPMENT_NAME | gpt-4o |
| o1 | AZURE_OPENAI_GPT_O1_DEVELOPMENT_NAME | o1 |
| o1-mini | AZURE_OPENAI_GPT_O1_MINI_DEVELOPMENT_NAME | o1-mini |
| o3-mini | AZURE_OPENAI_GPT_O3_MINI_DEVELOPMENT_NAME | o3-mini |
| o3 | AZURE_DEVELOPMENT_NAME_O3 | o3 |
| gpt-5 | AZURE_OPENAI_GPT_5_DEVELOPMENT_NAME | gpt-5 |
| gpt-5-mini | AZURE_OPENAI_GPT_5_MINI_DEVELOPMENT_NAME | gpt-5-mini |
| gpt-5-nano | AZURE_OPENAI_GPT_5_NANO_DEVELOPMENT_NAME | gpt-5-nano |
| gpt-5-chat | AZURE_OPENAI_GPT_5_CHAT_DEVELOPMENT_NAME | gpt-5-chat |
| gpt-5.1 | AZURE_OPENAI_GPT_51_DEVELOPMENT_NAME | gpt-5.1 |
| gpt-5.2 | AZURE_OPENAI_GPT_52_DEVELOPMENT_NAME | gpt-5.2 |
| Default | AZURE_OPENAI_DEVELOPMENT_NAME | gpt-4.1 |
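
The mapping behaves like a lookup with a fallback to the default deployment. The sketch below uses a hypothetical helper; the :- defaults stand in for the example values in the table and are not a claim about actual Ultron behavior:

```shell
# Resolve a requested model name to its Azure deployment name via the
# environment variables above, falling back to the default deployment.
azure_deployment_for() {
  case "$1" in
    gpt-4o)  echo "${AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME:-gpt-4o}" ;;
    gpt-4.1) echo "${AZURE_DEVELOPMENT_NAME_GPT_41:-gpt-4.1}" ;;
    o1)      echo "${AZURE_OPENAI_GPT_O1_DEVELOPMENT_NAME:-o1}" ;;
    *)       echo "${AZURE_OPENAI_DEVELOPMENT_NAME:-gpt-4.1}" ;;
  esac
}
AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME="my-gpt4o-deployment"
azure_deployment_for "gpt-4o"   # prints: my-gpt4o-deployment
```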

Verification

Check AI Service Health

# Check Ultron
kubectl get pods -n jitera -l app=jitera-ultron
kubectl logs -n jitera -l app=jitera-ultron --tail=100

# Check Boost
kubectl get pods -n jitera -l app=jitera-boost

# Check LiteLLM
kubectl get pods -n jitera -l app=jitera-litellm

Check Environment Variables

# Verify Ultron credentials
kubectl exec -it deploy/jitera-ultron -n jitera -- \
  env | grep -E "(AI_MODE|AZURE_|OPENAI_|BEDROCK_)"

# Verify Boost credentials
kubectl exec -it deploy/jitera-boost -n jitera -- \
  env | grep JITERA_BOOST

# Verify LiteLLM credentials
kubectl exec -it deploy/jitera-litellm -n jitera -- \
  env | grep -E "(AWS_|GEMINI_)"

Test Provider Connectivity

# Test Azure OpenAI
kubectl exec -it deploy/jitera-ultron -n jitera -- \
  curl -X POST "https://<INSTANCE>.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
  -H "api-key: <KEY>" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'

# Test LiteLLM proxy models endpoint
kubectl exec -it deploy/jitera-boost -n jitera -- \
  curl -H "Authorization: Bearer $JITERA_BOOST_OPENAI_KEY_LITELLM" \
  http://jitera-litellm/v1/models

# Check LiteLLM proxy config
kubectl exec -it deploy/jitera-litellm -n jitera -- cat /app/config.yaml

Troubleshooting

Model Not Appearing in GUI Dropdown

  1. Verify the LLM record is enabled in SuperAdmin:
    SELECT * FROM llms WHERE name = 'your-model-name';
    UPDATE llms SET enabled = true WHERE name = 'your-model-name';
    
  2. Confirm the LLM is assigned to the organization.

"Deployment not found" Error (Azure)

  1. Confirm the Azure deployment name matches the SuperAdmin name field exactly.
  2. Verify JITERA_BOOST_API_CONFIG_AZURE_* URL contains the correct deployment name.
  3. Check Ultron has the correct deployment name env var:
    kubectl exec -it deploy/jitera-ultron -n jitera -- env | grep AZURE
    kubectl exec -it deploy/jitera-boost -n jitera -- env | grep JITERA_BOOST_API_CONFIG
    

LiteLLM Model Not Working (Claude/Gemini)

  1. Confirm the model entry exists in litellm-proxy-config.yaml.
  2. Verify model_name matches the SuperAdmin name field exactly.
  3. Confirm credentials are set correctly:
    kubectl exec -it deploy/jitera-litellm -n jitera -- cat /app/config.yaml
    kubectl exec -it deploy/jitera-litellm -n jitera -- env | grep -E "(AWS_|GEMINI_)"
    

GPT-5 Falls Back to GPT-4.1

GPT-5 is not automatically routed to the Azure Global region. Set the SuperAdmin modelKey to azure_global:gpt-5 to explicitly route it to the Global region.

API Key Errors

# Check secret values
kubectl get secret jitera-openai -n jitera -o yaml

# Decode a base64 value copied from the secret output above
echo "<BASE64-VALUE>" | base64 -d

Rate Limiting

  1. Add multiple API keys using JSON arrays (AZURE_OPENAI_KEYS, AZURE_OPENAI_INSTANCE_NAMES).
  2. Add additional Boost endpoint variables (JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_2_*).
  3. Increase quotas with your AI provider.
  4. Consider deploying to multiple regions.
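
The JSON-array form in step 1 looks like this in values.yaml (keys and instance names are placeholders):

```yaml
openai:
  secretKeys:
    azure:
      AZURE_OPENAI_KEYS: '["<key-for-instance-1>", "<key-for-instance-2>"]'
      AZURE_OPENAI_INSTANCE_NAMES: '["<instance-1>", "<instance-2>"]'
```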

vLLM GPU Issues

# Check GPU nodes
kubectl get nodes -l accelerator=nvidia-gpu

# Check GPU availability
kubectl describe node <node-name> | grep nvidia

# Check vLLM pod status
kubectl describe pod -n jitera -l app=jitera-vllm

Appendix: Environment Variable Reference

Ultron

General (applies to both modes):
| Variable | values.yaml Source | Purpose |
| --- | --- | --- |
| AI_MODE | openai.AI_MODE | open_ai (default) or azure |
| OPENAI_MAIN_MODEL_NAME | openai.secretKeys.openai.OPENAI_MAIN_MODEL_NAME | Main background model (required for both modes) |
OpenAI Direct mode (AI_MODE: open_ai):
| Variable | values.yaml Source | Purpose |
| --- | --- | --- |
| OPENAI_API_KEYS | openai.secretKeys.openai.OPENAI_API_KEYS | JSON array of OpenAI keys |
| OPENAI_API_KEY | openai.secretKeys.openai.OPENAI_API_KEY | Single OpenAI key |
| OPENAI_EMBEDDING_KEY | openai.secretKeys.openai.OPENAI_EMBEDDING_KEY | Embeddings API key |
| OPENAI_VISION_KEY | openai.secretKeys.openai.OPENAI_VISION_KEY | Vision model API key |
Azure mode (AI_MODE: azure):
| Variable | values.yaml Source | Purpose |
| --- | --- | --- |
| AZURE_OPENAI_KEYS | openai.secretKeys.azure.AZURE_OPENAI_KEYS | JSON array of Azure keys |
| AZURE_OPENAI_INSTANCE_NAMES | openai.secretKeys.azure.AZURE_OPENAI_INSTANCE_NAMES | JSON array of instance names |
| AZURE_OPENAI_VERSION | openai.secretKeys.azure.AZURE_OPENAI_VERSION | API version |
| AZURE_OPENAI_DEVELOPMENT_NAME | openai.secretKeys.azure.AZURE_OPENAI_DEVELOPMENT_NAME | Default deployment |
| AZURE_OPENAI_GLOBAL_KEYS | openai.secretKeys.azure.AZURE_OPENAI_GLOBAL_KEYS | Global region keys |
| AZURE_OPENAI_GLOBAL_INSTANCE_NAMES | openai.secretKeys.azure.AZURE_OPENAI_GLOBAL_INSTANCE_NAMES | Global instances |
Additional providers (optional, available in either mode):
| Variable | values.yaml Source | Purpose |
| --- | --- | --- |
| BEDROCK_CONVERSE_REGION | openai.secretKeys.bedrock.BEDROCK_CONVERSE_REGION | AWS Bedrock main region |
| BEDROCK_CONVERSE_ACCESS_KEY_ID | openai.secretKeys.bedrock.BEDROCK_CONVERSE_ACCESS_KEY_ID | AWS access key (main) |
| BEDROCK_CONVERSE_SECRET_ACCESS_KEY | openai.secretKeys.bedrock.BEDROCK_CONVERSE_SECRET_ACCESS_KEY | AWS secret key (main) |
| BEDROCK_CONVERSE_GLOBAL_REGION | openai.secretKeys.bedrock.BEDROCK_CONVERSE_GLOBAL_REGION | AWS Bedrock global region |
| BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID | openai.secretKeys.bedrock.BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID | AWS access key (global) |
| BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY | openai.secretKeys.bedrock.BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY | AWS secret key (global) |
| GOOGLE_GENERATIVE_API_KEY | openai.secretKeys.google.GOOGLE_GENERATIVE_API_KEY | Gemini API key |
| ANTHROPIC_API_KEY | openai.secretKeys.anthropic.ANTHROPIC_API_KEY | Anthropic direct API key |

Boost

| Variable | Source | Purpose |
| --- | --- | --- |
| JITERA_BOOST_API_KEY_MAIN | credentials.boost.JITERA_BOOST_API_KEY_MAIN | Main Boost API key |
| JITERA_BOOST_OPENAI_URL_LITELLM | Auto-generated | LiteLLM proxy URL |
| JITERA_BOOST_OPENAI_KEY_LITELLM | credentials.boost.JITERA_BOOST_OPENAI_KEY_LITELLM | LiteLLM master key |
| JITERA_BOOST_API_CONFIG_AZURE_* | credentials.boost.JITERA_BOOST_API_CONFIG_AZURE_* | Azure endpoint configs |
| JITERA_BOOST_DEFAULT_BASE_MODEL | boost.env.* | Background base model |
| JITERA_BOOST_DEFAULT_EXPERT_MODEL | boost.env.* | Background expert model |

LiteLLM

| Variable | Source | Purpose |
| --- | --- | --- |
| PROXY_MASTER_KEY | credentials.boost.JITERA_BOOST_OPENAI_KEY_LITELLM | Proxy authentication |
| AWS_ACCESS_KEY_ID | openai.secretKeys.bedrock.BEDROCK_CONVERSE_ACCESS_KEY_ID | AWS Bedrock auth |
| AWS_SECRET_ACCESS_KEY | openai.secretKeys.bedrock.BEDROCK_CONVERSE_SECRET_ACCESS_KEY | AWS Bedrock auth |
| GEMINI_API_KEY | openai.secretKeys.google.GOOGLE_GENERATIVE_API_KEY | Google Gemini auth |

Appendix: Configuration Checklists

Azure OpenAI Only

  • Set openai.AI_MODE: azure
  • Set OPENAI_MAIN_MODEL_NAME under openai.secretKeys.openai (e.g., gpt-4.1)
  • Configure AZURE_OPENAI_KEYS and AZURE_OPENAI_INSTANCE_NAMES
  • Set AZURE_OPENAI_VERSION
  • Configure deployment name env vars for each model
  • Configure Boost Azure endpoints in credentials.boost.JITERA_BOOST_API_CONFIG_AZURE_*
  • Register models in SuperAdmin with name matching Azure deployment names

Azure OpenAI + Claude (Bedrock)

  • Complete Azure setup above
  • Configure openai.secretKeys.bedrock.* (main and global regions)
  • Add Claude models to litellm-proxy-config.yaml
  • Set JITERA_BOOST_OPENAI_KEY_LITELLM in credentials.boost
  • Register Claude models in SuperAdmin with name matching LiteLLM model_name

Azure OpenAI + Claude + Gemini

  • Complete Azure + Claude setup above
  • Configure openai.secretKeys.google.GOOGLE_GENERATIVE_API_KEY
  • Add Gemini models to litellm-proxy-config.yaml
  • Register Gemini models in SuperAdmin with name matching LiteLLM model_name

Web Search Agent (Optional)

  • Configure web search backend: JITERA_BOOST_TAVILY_API_KEY (Tavily) or JITERA_BOOST_SEARXNG_URL (SearXNG)
  • Verify Jina Reader reachability (r.jina.ai:443)
  • (Optional) Set JITERA_BOOST_JINA_READER_API_KEY for higher rate limits
  • Add required domains to firewall allow-list (see Network and Firewall)

Related Pages

  • Helm Values: complete configuration reference
  • Values Reference: all configuration parameters
  • Architecture: service architecture overview
  • Troubleshooting: common issues and solutions