

Jitera Self-Hosted uses AI/LLM providers for code generation, AI chat, documentation generation, and code understanding. This guide covers the complete configuration for all supported providers.
The third-party service procedures in this guide (Azure OpenAI, OpenAI, AWS Bedrock, Anthropic, Google AI) are provided as examples. Refer to the official documentation for each provider for the most up-to-date instructions.

Architecture Overview

Jitera routes LLM requests through two internal services — Ultron and Boost — each with its own configuration path.
| Service | Role | Configuration | Provider Access |
| --- | --- | --- | --- |
| Ultron | AI agent processing, background tasks | openai.secretKeys.* in values.yaml | Azure OR OpenAI Direct (via AI_MODE) + optional: Bedrock, Anthropic, Google |
| Boost | Workflow engine, chat, custom agents | credentials.boost.* + litellm-proxy-config.yaml | Azure or OpenAI-compatible endpoints + LiteLLM proxy |
| LiteLLM | Model proxy for Boost | extra_config/litellm-proxy-config.yaml | Routes Claude/Gemini for Boost |

Configuration Files

charts/jitera/values.yaml                             # Main configuration
charts/jitera/extra_config/litellm-proxy-config.yaml  # Claude/Gemini models for Boost

Key Concepts

| Term | Description | Example |
| --- | --- | --- |
| name | Display name in SuperAdmin — also the routing key for Boost | gpt-4.1, claude-3.5-sonnet |
| modelKey | Routing key for Ultron — pattern-matched to select the provider and credentials | arn:aws:bedrock:ap-northeast-1:123456789012:inference-profile/apac.anthropic.claude-3-5-sonnet-20241022-v2:0 |
| model_name | LiteLLM alias — must match SuperAdmin name for non-Azure providers | claude-3.5-sonnet, gemini-2.5-pro |
For Boost, the SuperAdmin name field is the routing key and must match either the Azure deployment name (extracted from the URL) or the LiteLLM model_name. For Ultron, the modelKey field determines provider routing.

Required Models by Provider

Ultron and Boost each access LLM providers differently. The tables below list the required and available models for each provider, organized by which service uses them.
  • Ultron routes requests by pattern-matching the modelKey field (see Provider Routing Reference). Any model ID that matches a supported provider pattern will work — the lists below cover pre-configured and commonly used models.
  • Boost dynamically discovers models from configured endpoints at runtime. Any model available through a configured endpoint (Azure deployment, LiteLLM proxy, or OpenAI-compatible API) can be used.

Azure OpenAI

| Model | Service | Role | Region |
| --- | --- | --- | --- |
| gpt-4.1 | Ultron, Boost | Default chat/completion; Ultron background model; Boost expert default | Main |
| gpt-4.1-mini | Ultron, Boost | Fast chat; Boost versatile default | Main |
| gpt-4.1-nano | Ultron, Boost | Lightweight tasks; Boost base/direct-tasks default | Main |
| gpt-4o | Ultron, Boost | Vision / Multimodal; Ultron vision default | Main |
| gpt-4o-mini | Ultron, Boost | Fast completions; Ultron small-model default | Main |
| text-embedding-ada-002 | Ultron, Boost | Embeddings | Main |
| gpt-4o-transcribe | Ultron | Audio transcription | Global |
| gpt-4o-mini-transcribe | Ultron | Audio transcription (efficient) | Global |
| o1 | Ultron, Boost | Advanced reasoning | Global (auto-routed) |
| o3 | Ultron, Boost | Advanced reasoning | Global (auto-routed) |
| o3-mini | Ultron, Boost | Efficient reasoning | Global (auto-routed) |
| o4-mini | Ultron, Boost | Efficient reasoning | Global |
| gpt-5 | Ultron, Boost | Next-gen | Global (azure_global: prefix) |
| gpt-5-mini | Ultron, Boost | Next-gen efficient | Global (azure_global: prefix) |
| gpt-5-nano | Ultron, Boost | Next-gen lightweight | Global (azure_global: prefix) |
| gpt-5.1 | Ultron, Boost | Next-gen | Global (azure_global: prefix) |
| gpt-5.2 | Ultron, Boost | Next-gen | Global (azure_global: prefix) |
Additional Azure models can be registered dynamically using AZURE_DEVELOPMENT_NAME_* environment variables in values.yaml. The env var value becomes both the deployment lookup key and the deployment name. For example, setting AZURE_DEVELOPMENT_NAME_GPT_41=gpt-4.1 registers gpt-4.1 as a known deployment.
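As a minimal sketch, such variables sit alongside the other Azure settings in values.yaml (the variable suffix is free-form; the value must match a deployment name that exists in your Azure resource):

```yaml
openai:
  secretKeys:
    azure:
      # Each AZURE_DEVELOPMENT_NAME_* value serves as both the lookup key
      # and the Azure deployment name
      AZURE_DEVELOPMENT_NAME_GPT_41: gpt-4.1
      AZURE_DEVELOPMENT_NAME_O3: o3
```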
Azure OpenAI models require specific deployment SKUs. Using the wrong SKU results in 400 InvalidResourceProperties or 400 ServiceModelDeprecated errors.
  • Standard: Deploys in a specific region. Use for data residency requirements (e.g., japaneast for Japan-local processing). Available for gpt-4.1, gpt-4.1-mini, gpt-4o, text-embedding-ada-002. Note that Standard SKUs are being retired on a per-model, per-region schedule — verify availability before deploying.
  • GlobalStandard: Deploys on Azure’s global infrastructure (requests are routed to the nearest available region). Required for gpt-4.1-nano, o1, o3, o3-mini, o4-mini, and gpt-5 series — these models do not support Standard.
Check model and SKU availability for your region in the Azure OpenAI model matrix.
Some models in this table are approaching retirement. Plan migration to the listed replacements before these dates:
  • gpt-4o — Standard retired 2026-03-31; other SKUs retire 2026-10-01 (replacement: gpt-5.1)
  • gpt-4o-mini — Standard retired 2026-03-31; other SKUs retire 2026-10-01 (replacement: gpt-4.1-mini)
  • o1 — retires 2026-07-15 (replacement: o3)
  • o3-mini — retires 2026-08-02 (replacement: o4-mini)
  • gpt-4o-transcribe — retires 2026-06-01
Verify current dates on the Azure OpenAI model retirements page.
text-embedding-ada-002 is still GA (no retirement scheduled before 2027-04-15), but Microsoft recommends text-embedding-3-small or text-embedding-3-large for new deployments.

AWS Bedrock (Claude)

Ultron routes any modelKey containing anthropic.claude to AWS Bedrock Converse. Boost accesses Claude through the LiteLLM proxy.
The following Bedrock model IDs are approaching retirement. Plan migration to the listed replacements before these dates:
  • anthropic.claude-3-7-sonnet-20250219-v1:0 — EOL 2026-04-28 (replacement: anthropic.claude-sonnet-4-6)
  • anthropic.claude-opus-4-20250514-v1:0 — EOL 2026-05-31 (replacement: anthropic.claude-opus-4-6-v1)
  • anthropic.claude-3-5-haiku-20241022-v1:0 — EOL 2026-06-19 (replacement: anthropic.claude-haiku-4-5-20251001-v1:0)
  • anthropic.claude-3-5-sonnet-20240620-v1:0, anthropic.claude-3-5-sonnet-20241022-v2:0 — APAC EOL 2026-07-30
  • anthropic.claude-3-haiku-20240307-v1:0 — Bedrock EOL 2026-09-10 (already retired on Anthropic API)
  • anthropic.claude-sonnet-4-20250514-v1:0 — moved to Legacy 2026-04-14, Bedrock EOL 2026-10-14
Verify current dates on the AWS Bedrock model lifecycle page and Anthropic model deprecations.
| Model | Service | Bedrock Model ID | Region |
| --- | --- | --- | --- |
| Claude 3 Haiku | Ultron | anthropic.claude-3-haiku-20240307-v1:0 | APAC |
| Claude 3.5 Haiku | Ultron | anthropic.claude-3-5-haiku-20241022-v1:0 | US |
| Claude Haiku 4.5 | Ultron | anthropic.claude-haiku-4-5-20251001-v1:0 | US / APAC |
| Claude 3.5 Sonnet v1 | Ultron | anthropic.claude-3-5-sonnet-20240620-v1:0 | APAC |
| Claude 3.5 Sonnet v2 | Ultron, Boost (via LiteLLM) | anthropic.claude-3-5-sonnet-20241022-v2:0 | APAC |
| Claude 3.7 Sonnet | Ultron, Boost (via LiteLLM) | anthropic.claude-3-7-sonnet-20250219-v1:0 | APAC |
| Claude Sonnet 4 | Ultron, Boost (via LiteLLM) | anthropic.claude-sonnet-4-20250514-v1:0 | APAC |
| Claude Sonnet 4.5 | Ultron, Boost (via LiteLLM) | anthropic.claude-sonnet-4-5-20250929-v1:0 | US / APAC |
| Claude Sonnet 4.6 | Ultron, Boost (via LiteLLM) | anthropic.claude-sonnet-4-6 | US |
| Claude Opus 4 | Ultron | anthropic.claude-opus-4-20250514-v1:0 | US |
| Claude Opus 4.1 | Ultron | anthropic.claude-opus-4-1-20250805-v1:0 | US |
| Claude Opus 4.5 | Ultron | anthropic.claude-opus-4-5-20251101-v1:0 | US |
| Claude Opus 4.6 | Ultron, Boost (via LiteLLM) | anthropic.claude-opus-4-6-v1 | US |

Anthropic Direct API

Ultron routes any modelKey containing claude (but not anthropic.claude) to the Anthropic API directly.
The following Claude models are deprecated on the Anthropic API and retire on 2026-06-15:
  • claude-sonnet-4-20250514 (replacement: claude-sonnet-4-6)
  • claude-opus-4-20250514 (replacement: claude-opus-4-6)
Verify current dates on the Anthropic model deprecations page.
| Model | Service | modelKey |
| --- | --- | --- |
| Claude Sonnet 4 | Ultron | claude-sonnet-4-20250514 |
| Claude Sonnet 4.5 | Ultron | claude-sonnet-4-5-20250929 |
| Claude Sonnet 4.6 | Ultron | claude-sonnet-4-6 |
| Claude Opus 4 | Ultron | claude-opus-4-20250514 |
| Claude Opus 4.1 | Ultron | claude-opus-4-1-20250805 |
| Claude Opus 4.5 | Ultron | claude-opus-4-5-20251101 |
| Claude Opus 4.6 | Ultron | claude-opus-4-6 |

Google Gemini

Ultron routes any modelKey containing gemini to Google Generative AI. Boost accesses Gemini through the LiteLLM proxy.
| Model | Service | modelKey |
| --- | --- | --- |
| Gemini 3.1 Pro | Ultron, Boost (via LiteLLM) | gemini-3.1-pro |
| Gemini 3 Pro Image | Ultron, Boost (via LiteLLM) | gemini-3-pro-image |
| Gemini 3 Flash | Ultron, Boost (via LiteLLM) | gemini-3-flash |
| Gemini 2.5 Pro | Ultron, Boost (via LiteLLM) | gemini-2.5-pro |
| Gemini 2.5 Flash | Ultron, Boost (via LiteLLM) | gemini-2.5-flash |
| Gemini 2.5 Flash Image | Ultron, Boost (via LiteLLM) | gemini-2.5-flash-image |
| Gemini 2.0 Flash | Ultron, Boost (via LiteLLM) | gemini-2.0-flash |
Models matching gemini-2.0-flash-thinking*, gemini-2.5*, or gemini-3* automatically have thinking/reasoning enabled.
Vertex AI support is not included in v26.02.16. It will be available in a future release.

Other Providers

| Provider | Service | Pattern Match | modelKey Example |
| --- | --- | --- | --- |
| OpenAI Direct | Ultron, Boost | Starts with openai: | openai:gpt-4o |
| Groq | Ultron | Contains deepseek-r1-distill | deepseek-r1-distill-llama-70b |
| Qwen (vLLM) | Ultron | Contains qwen | qwen-2.5-72b |
| Ollama | Ultron | Configured via OLLAMA_BASE_URL | Any model name |
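For the Ollama case, a hedged sketch of the base-URL setting; note this guide does not specify where OLLAMA_BASE_URL lives in values.yaml, so both its placement under openai.secretKeys and the URL below are illustrative assumptions:

```yaml
openai:
  secretKeys:
    openai:
      # Placement and URL are assumptions -- point this at your Ollama server
      OLLAMA_BASE_URL: "http://ollama.internal:11434"
```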

Default Models for Background Tasks

Even when a user selects a specific model, background operations use default models. These defaults must be available from a configured provider. Ultron defaults:
| Role | Default Model | Environment Variable |
| --- | --- | --- |
| Main background model | gpt-4.1 | OPENAI_MAIN_MODEL_NAME |
| Small model | gpt-4o-mini | (code default) |
| Vision model | gpt-4o | (code default) |
OPENAI_MAIN_MODEL_NAME is required for both AI_MODE: azure and AI_MODE: open_ai. For Azure, set this to a deployment name that exists in your Azure OpenAI resource.

Boost defaults:
| Role | Default Model | Environment Variable |
| --- | --- | --- |
| Base (simple tasks) | gpt-4.1-nano | JITERA_BOOST_DEFAULT_BASE_MODEL |
| Direct tasks (titles, tags) | gpt-4.1-nano | JITERA_BOOST_DIRECT_TASKS_MODEL |
| Versatile (balanced) | gpt-4.1-mini | JITERA_BOOST_DEFAULT_VERSATILE_MODEL |
| Expert (complex reasoning) | gpt-4.1 | JITERA_BOOST_DEFAULT_EXPERT_MODEL |
| Vision | gpt-4.1 | JITERA_BOOST_DEFAULT_VISION_MODEL |
| Embeddings | text-embedding-ada-002 | JITERA_BOOST_DEFAULT_EMBEDDING_MODEL |
| Audio (speech-to-text) | jitera/stt | JITERA_BOOST_DEFAULT_AUDIO_MODEL |
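A hedged sketch of overriding some Boost defaults, assuming (like the other JITERA_BOOST_* variables in this guide) they are set under credentials.boost; the values shown are the documented defaults:

```yaml
credentials:
  boost:
    # Override Boost background-task defaults; every name here must resolve
    # to a model discoverable from a configured endpoint
    JITERA_BOOST_DEFAULT_EXPERT_MODEL: gpt-4.1
    JITERA_BOOST_DEFAULT_VERSATILE_MODEL: gpt-4.1-mini
    JITERA_BOOST_DEFAULT_EMBEDDING_MODEL: text-embedding-ada-002
```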

Model Discovery (Boost)

Boost does not maintain a hardcoded model list. It discovers available models dynamically:
| Endpoint Type | How Models Are Discovered |
| --- | --- |
| Azure OpenAI | Deployment name extracted from the URL path |
| OpenAI-compatible (incl. LiteLLM) | Calls GET /v1/models on the endpoint |
| Internal workflows | Registered from Boost’s workflow registry |
| Local audio | Hardcoded: jitera/tts and jitera/stt (via sherpa-onnx) |
To make a model available in Boost, configure an endpoint that serves it (Azure deployment, LiteLLM proxy entry, or OpenAI-compatible API) and register it in SuperAdmin with a name that matches the discovered model ID.
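For instance, a generic OpenAI-compatible endpoint could be registered with a behavior=openai entry. The variable name JITERA_BOOST_API_CONFIG_CUSTOM_1 and the URL below are hypothetical, following the pattern of the Azure endpoint variables shown later in this guide:

```yaml
credentials:
  boost:
    # Hypothetical endpoint; Boost calls GET /v1/models on it to discover models,
    # and each discovered ID must match a SuperAdmin name to be usable
    JITERA_BOOST_API_CONFIG_CUSTOM_1: 'behavior=openai,url=https://llm.example.com/v1,headers={"Authorization": "Bearer <key>"}'
```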

Choosing a Primary Provider (AI_MODE)

Ultron’s primary LLM provider is set via AI_MODE in values.yaml. Choose one:
| Mode | Provider | Required Environment Variables |
| --- | --- | --- |
| open_ai (default) | OpenAI Direct API | OPENAI_API_KEYS, OPENAI_API_KEY, OPENAI_EMBEDDING_KEY, OPENAI_VISION_KEY |
| azure | Azure OpenAI | AZURE_OPENAI_KEYS, AZURE_OPENAI_INSTANCE_NAMES, AZURE_OPENAI_VERSION, AZURE_OPENAI_DEVELOPMENT_NAME, AZURE_OPENAI_EMBEDDING_DEVELOPMENT_NAME |
openai:
  AI_MODE: open_ai  # or azure
This determines Ultron’s primary provider only. Other providers (Bedrock, Anthropic, Google, vLLM) can be added alongside either mode. Boost is provider-agnostic — it connects to any OpenAI-compatible endpoint.

Azure OpenAI Configuration

Configure Azure OpenAI as Ultron’s primary provider by setting AI_MODE: azure.

Step 1: Create Azure OpenAI Resource

Create an Azure OpenAI resource and obtain your endpoint URL and API key. For detailed and up-to-date instructions, see the Azure OpenAI documentation.
# Create resource
az cognitiveservices account create \
  --name jitera-openai \
  --resource-group jitera-rg \
  --kind OpenAI \
  --sku S0 \
  --location japaneast \
  --custom-domain jitera-openai

# Get endpoint
az cognitiveservices account show \
  --name jitera-openai \
  --resource-group jitera-rg \
  --query "properties.endpoint"

# Get key
az cognitiveservices account keys list \
  --name jitera-openai \
  --resource-group jitera-rg
The AZURE_OPENAI_INSTANCE_NAMES value must be a custom subdomain name (e.g., my-instance), not a regional endpoint. Ultron constructs URLs as https://{instance}.openai.azure.com. If your Azure OpenAI resource uses a regional endpoint (e.g., eastus2.api.cognitive.microsoft.com), enable a custom subdomain:
az cognitiveservices account update \
  --name <resource-name> \
  --resource-group <rg> \
  --custom-domain <desired-subdomain>
You can verify your endpoint format in the Azure Portal under your OpenAI resource > Keys and Endpoint. The endpoint must be https://<subdomain>.openai.azure.com/.

Step 2: Deploy Models

Jitera’s backend services (Ultron and Boost) reference Azure deployments by name through environment variables and endpoint URLs. Each deployment you create here must match the deployment name configured in Jitera’s Helm values — otherwise the services cannot route requests to the correct model. For the full list of deployment-to-environment-variable mappings, see the Azure Model-to-Deployment Mapping section.

Deploy the following models in the Azure OpenAI Studio or via the Azure CLI. For deployment instructions, see the Azure OpenAI deployment guide.
Use the same string for both Model deployment name and Model name (e.g. deploy gpt-4.1 with deployment name gpt-4.1). This simplifies configuration since Jitera uses the deployment name as the routing key.
Recommended minimum deployments:
| Model | Deployment Name | Purpose |
| --- | --- | --- |
| gpt-4.1 | gpt-4.1 | Main chat/completion, default fallback |
| gpt-4o | gpt-4o | Vision, multimodal tasks |
| gpt-4o-mini | gpt-4o-mini | Fast completions |
| text-embedding-ada-002 | text-embedding-ada-002 | Embeddings |
| o1 | o1 | Advanced reasoning (global region) |
| o3-mini | o3-mini | Efficient reasoning (global region) |

Step 3: Configure Ultron (values.yaml)

Ultron reads Azure OpenAI settings from environment variables injected via openai.secretKeys.azure:
openai:
  AI_MODE: azure  # Required: set to "azure" for Azure OpenAI
  secretKeys:
    azure:
      # === Main Region (e.g. Japan East) ===
      AZURE_OPENAI_KEY: "<your-api-key>"
      AZURE_OPENAI_KEYS: '["<key1>", "<key2>"]'              # JSON array for load balancing
      AZURE_OPENAI_INSTANCE_NAME: "<your-instance-name>"
      AZURE_OPENAI_INSTANCE_NAMES: '["<instance1>", "<instance2>"]'
      AZURE_OPENAI_VERSION: "2024-10-21"

      # Deployment names (must match Azure portal)
      AZURE_OPENAI_DEVELOPMENT_NAME: gpt-4.1                  # Default/fallback model
      AZURE_OPENAI_EMBEDDING_DEVELOPMENT_NAME: text-embedding-ada-002
      AZURE_OPENAI_VISION_DEVELOPMENT_NAME: gpt-4o
      AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME: gpt-4o
      AZURE_OPENAI_GPT_4O_MINI_DEVELOPMENT_NAME: gpt-4o-mini
      AZURE_DEVELOPMENT_NAME_GPT_41: gpt-4.1

      # === Global Region (e.g. Sweden Central or US East) ===
      # Required for O1, O3, and GPT-5 models
      AZURE_OPENAI_GLOBAL_KEYS: '["<global-key1>", "<global-key2>"]'
      AZURE_OPENAI_GLOBAL_INSTANCE_NAMES: '["<instance-swedencentral>"]'
      AZURE_OPENAI_GLOBAL_VERSION: "2024-12-01-preview"

      # O1/O3 models (auto-routed to Global region)
      AZURE_OPENAI_GPT_O1_DEVELOPMENT_NAME: o1
      AZURE_OPENAI_GPT_O1_MINI_DEVELOPMENT_NAME: o1-mini
      AZURE_OPENAI_GPT_O3_MINI_DEVELOPMENT_NAME: o3-mini
      AZURE_DEVELOPMENT_NAME_O3: o3

      # GPT-5 models (requires azure_global: prefix in SuperAdmin modelKey)
      AZURE_OPENAI_GPT_5_DEVELOPMENT_NAME: gpt-5
      AZURE_OPENAI_GPT_5_MINI_DEVELOPMENT_NAME: gpt-5-mini
      AZURE_OPENAI_GPT_5_NANO_DEVELOPMENT_NAME: gpt-5-nano
      AZURE_OPENAI_GPT_5_CHAT_DEVELOPMENT_NAME: gpt-5-chat
      AZURE_OPENAI_GPT_51_DEVELOPMENT_NAME: gpt-5.1
      AZURE_OPENAI_GPT_52_DEVELOPMENT_NAME: gpt-5.2

    openai:
      # Main model name — used by Ultron for background tasks regardless of AI_MODE
      OPENAI_MAIN_MODEL_NAME: gpt-4.1
OPENAI_MAIN_MODEL_NAME is required even when using Azure mode. Despite being under the openai key, this value is injected into Ultron unconditionally and determines the model used for background processing tasks. For Azure deployments, set this to a deployment name that exists in your Azure OpenAI resource (e.g., gpt-4.1).

Step 4: Configure Boost (values.yaml)

Boost reads Azure OpenAI settings from credentials.boost.JITERA_BOOST_API_CONFIG_AZURE_* variables. Each variable encodes one Azure deployment endpoint. Format:
behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/<deployment>,headers={"api-key": "<key>"},query_params={"api-version": "<version>"}
credentials:
  boost:
    # === Main region models ===
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_41: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4.1,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_41_MINI: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4.1-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_41_NANO: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4.1-nano,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_ADA: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/text-embedding-ada-002,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_4O: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4o,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_4O_MINI: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4o-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'

    # === Reasoning models (global region) ===
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O1: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/o1,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O3: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/o3,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O3_MINI: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/o3-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O4_MINI: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/o4-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'

    # === GPT-5 family (global region — Sweden/US) ===
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_5: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_5_MINI: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5-mini,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_5_NANO: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5-nano,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_5_CHAT: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5-chat,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_51: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5.1,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_GPT5_52: 'behavior=azure,url=https://<global-instance>.openai.azure.com/openai/deployments/gpt-5.2,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"}'
All JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_* keys defined in the chart’s values.yaml must be explicitly overridden in your values file. Keys left with the default placeholder value (<REPLACE_WITH_YOUR_AZURE_CONFIG>) will crash Boost on startup with a Pydantic validation error. Set models you have not deployed to an empty string ("").
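For example, if you have not created the reasoning-model deployments, a minimal override keeping Boost startable might look like:

```yaml
credentials:
  boost:
    # No global-region deployments yet: replace the chart's placeholder values
    # with empty strings so Boost's config validation passes at startup
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O1: ""
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O3: ""
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_O3_MINI: ""
```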
How Boost discovers model names from Azure: Boost extracts the deployment name from the last path segment of the URL:
URL: https://instance.openai.azure.com/openai/deployments/gpt-4.1

Discovered model name: gpt-4.1
This name must match the SuperAdmin LLM name field exactly.

Endpoint format parameters:
| Parameter | Description | Example |
| --- | --- | --- |
| behavior | Provider type | azure or openai |
| url | Full API endpoint URL | https://instance.openai.azure.com/openai/deployments/gpt-4.1 |
| headers | JSON object with request headers | {"api-key": "xxx"} |
| query_params | JSON object with query parameters | {"api-version": "2024-12-01-preview"} |
| weight | Load balancing weight (optional) | 1.0 |
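A sketch of an endpoint entry carrying an explicit weight, appended to the standard format (how weights interact across multiple entries is not specified here; verify against your chart version):

```yaml
credentials:
  boost:
    # Same endpoint format as above, with the optional weight parameter appended
    JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_1_41: 'behavior=azure,url=https://<instance>.openai.azure.com/openai/deployments/gpt-4.1,headers={"api-key": "<key>"},query_params={"api-version": "2024-12-01-preview"},weight=1.0'
```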

SuperAdmin Registration for Azure Models

| Field | Value |
| --- | --- |
| name | Must match Azure deployment name (e.g. gpt-4.1) |
| modelKey | Same as name (e.g. gpt-4.1) |
| provider | Azure OpenAI |
O1, O1-mini, and O3-mini are automatically routed to the Azure Global region by Ultron. GPT-5 models are not auto-routed — use the azure_global: prefix in the SuperAdmin modelKey (e.g. azure_global:gpt-5) to route them to the Global region.

AWS Bedrock Configuration (Claude)

Step 1: Enable Bedrock Models

Enable the required Claude models in the AWS Bedrock console. For detailed instructions, see the AWS Bedrock documentation. Common models used with Jitera:
  • Claude 3.5 Sonnet v2 (anthropic.claude-3-5-sonnet-20241022-v2:0)
  • Claude 3.7 Sonnet (anthropic.claude-3-7-sonnet-20250219-v1:0)
  • Claude Sonnet 4 (anthropic.claude-sonnet-4-20250514-v1:0)
  • Claude Sonnet 4.5 (anthropic.claude-sonnet-4-5-20250929-v1:0)
  • Claude Sonnet 4.6 (anthropic.claude-sonnet-4-6)
  • Claude Opus 4 (anthropic.claude-opus-4-20250514-v1:0)
  • Claude Opus 4.1 (anthropic.claude-opus-4-1-20250805-v1:0)
  • Claude Opus 4.5 (anthropic.claude-opus-4-5-20251101-v1:0)
  • Claude Opus 4.6 (anthropic.claude-opus-4-6-v1)

Step 2: Create IAM Policy

Create an IAM policy that grants Bedrock model invocation permissions. For IAM best practices, see the AWS IAM documentation.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}

Step 3: Configure Ultron (values.yaml)

Ultron calls Bedrock directly using credentials from openai.secretKeys.bedrock:
openai:
  secretKeys:
    bedrock:
      # Main region (e.g. ap-northeast-1 for APAC)
      BEDROCK_CONVERSE_REGION: ap-northeast-1
      BEDROCK_CONVERSE_ACCESS_KEY_ID: "<aws-access-key>"
      BEDROCK_CONVERSE_SECRET_ACCESS_KEY: "<aws-secret-key>"

      # Global region — required for Claude 3.7 and Claude 4 Opus
      BEDROCK_CONVERSE_GLOBAL_REGION: us-east-1
      BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID: "<aws-access-key>"
      BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY: "<aws-secret-key>"
Environment variables injected into Ultron:
| Variable | Purpose |
| --- | --- |
| BEDROCK_CONVERSE_REGION | AWS region for main Bedrock access |
| BEDROCK_CONVERSE_ACCESS_KEY_ID | AWS access key for main region |
| BEDROCK_CONVERSE_SECRET_ACCESS_KEY | AWS secret key for main region |
| BEDROCK_CONVERSE_GLOBAL_REGION | Secondary AWS region for newer models |
| BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID | AWS access key for global region |
| BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY | AWS secret key for global region |

Step 4: Configure Boost for Claude (via LiteLLM)

Boost accesses Claude through the LiteLLM proxy, reusing the same Bedrock credentials.

Step 4a — LiteLLM credentials are injected automatically from openai.secretKeys.bedrock:
AWS_ACCESS_KEY_ID     ← BEDROCK_CONVERSE_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY ← BEDROCK_CONVERSE_SECRET_ACCESS_KEY
Step 4b — Add Claude models to litellm-proxy-config.yaml:

For each model, set model to bedrock/ followed by the Bedrock model ID from the model ID table below. The model_name must match the SuperAdmin name field. AWS credentials are inherited from the environment (Step 4a), so only aws_region_name is required.
# charts/jitera/extra_config/litellm-proxy-config.yaml
model_list:
  - model_name: claude-3.5-sonnet            # Must match SuperAdmin name
    litellm_params:
      model: bedrock/apac.anthropic.claude-3-5-sonnet-20241022-v2:0  # bedrock/ + model ID
      aws_region_name: ap-northeast-1

  - model_name: claude-sonnet-4
    litellm_params:
      model: bedrock/apac.anthropic.claude-sonnet-4-20250514-v1:0
      aws_region_name: ap-northeast-1

  - model_name: claude-sonnet-4.6
    litellm_params:
      model: bedrock/us.anthropic.claude-sonnet-4-6
      aws_region_name: us-east-1

  - model_name: claude-opus-4.6
    litellm_params:
      model: bedrock/global.anthropic.claude-opus-4-6-v1
      aws_region_name: us-east-1

  # Add additional models following the same pattern

general_settings:
  master_key: os.environ/PROXY_MASTER_KEY
Step 4c — Configure the Boost connection to LiteLLM (in credentials.boost):
credentials:
  boost:
    JITERA_BOOST_OPENAI_KEY_LITELLM: "<litellm-master-key>"
    # JITERA_BOOST_OPENAI_URL_LITELLM is set automatically to http://jitera-litellm:80

SuperAdmin Registration for Claude

| Field | Value |
| --- | --- |
| name | Must match LiteLLM model_name (e.g. claude-3.5-sonnet) — used by Boost for routing |
| modelKey | Full ARN of the Bedrock inference profile (see format and examples below) — used by Ultron |
| provider | AWS Bedrock |
modelKey requirements: The modelKey must be a full AWS Bedrock inference profile ARN in the following format:
arn:aws:bedrock:{region}:{account-id}:inference-profile/{profile-id}
| Component | Description | Example |
| --- | --- | --- |
| {region} | AWS region where the inference profile is available | us-east-1, ap-northeast-1 |
| {account-id} | Your AWS account ID | 123456789012 |
| {profile-id} | Cross-region inference profile ID (see table below) | us.anthropic.claude-sonnet-4-5-20250929-v1:0 |
For example:
arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0
Ultron uses this value for two routing decisions:
  1. Provider selection — the ARN contains anthropic.claude, which triggers the Bedrock Converse provider (see routing rules)
  2. Region selection — the {region} in the ARN determines which Bedrock credentials to use. If it matches BEDROCK_CONVERSE_REGION (e.g. ap-northeast-1), main region credentials are used. If it matches BEDROCK_CONVERSE_GLOBAL_REGION (e.g. us-east-1), global region credentials are used.
Cross-region inference profile IDs: Use these AWS Bedrock cross-region inference profile IDs as the {profile-id} in the ARN for the SuperAdmin modelKey, and in the LiteLLM config with a bedrock/ prefix.

| Profile ID | Region | Model |
| --- | --- | --- |
| apac.anthropic.claude-3-5-sonnet-20241022-v2:0 | Main (ap-northeast-1) | Claude 3.5 Sonnet v2 |
| apac.anthropic.claude-3-7-sonnet-20250219-v1:0 | Main (ap-northeast-1) | Claude 3.7 Sonnet |
| apac.anthropic.claude-sonnet-4-20250514-v1:0 | Main (ap-northeast-1) | Claude Sonnet 4 |
| apac.anthropic.claude-sonnet-4-5-20250929-v1:0 | Main (ap-northeast-1) | Claude Sonnet 4.5 |
| us.anthropic.claude-sonnet-4-5-20250929-v1:0 | Global (us-east-1) | Claude Sonnet 4.5 (US) |
| us.anthropic.claude-sonnet-4-6 | Global (us-east-1) | Claude Sonnet 4.6 |
| us.anthropic.claude-opus-4-20250514-v1:0 | Global (us-east-1) | Claude Opus 4 |
| us.anthropic.claude-opus-4-1-20250805-v1:0 | Global (us-east-1) | Claude Opus 4.1 |
| global.anthropic.claude-opus-4-5-20251101-v1:0 | Global (us-east-1) | Claude Opus 4.5 |
| global.anthropic.claude-opus-4-6-v1 | Global (us-east-1) | Claude Opus 4.6 |

Google Gemini Configuration

Step 1: Get API Key

Obtain a Gemini API key from Google AI Studio. For detailed instructions, see the Gemini API documentation.

Step 2: Configure Ultron (values.yaml)

openai:
  secretKeys:
    google:
      GOOGLE_GENERATIVE_API_KEY: "<your-gemini-api-key>"

Step 3: Configure Boost for Gemini (via LiteLLM)

The Gemini API key is injected into the LiteLLM container automatically as GEMINI_API_KEY. Add Gemini models to litellm-proxy-config.yaml:
model_list:
  - model_name: gemini-3.1-pro            # Must match SuperAdmin name
    litellm_params:
      model: gemini/gemini-3.1-pro-preview
      api_key: os.environ/GEMINI_API_KEY

  - model_name: gemini-3-flash
    litellm_params:
      model: gemini/gemini-3-flash-preview
      api_key: os.environ/GEMINI_API_KEY

  - model_name: gemini-2.5-pro
    litellm_params:
      model: gemini/gemini-2.5-pro
      api_key: os.environ/GEMINI_API_KEY

  - model_name: gemini-2.5-flash
    litellm_params:
      model: gemini/gemini-2.5-flash
      api_key: os.environ/GEMINI_API_KEY

  - model_name: gemini-2.0-flash
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GEMINI_API_KEY
Gemini preview model IDs (those with -preview suffix) change when Google promotes a model to GA or releases a new preview. Verify current model IDs in the Google AI documentation when configuring.

SuperAdmin Registration for Gemini

| Field | Value |
| --- | --- |
| name | Must match LiteLLM model_name (e.g. gemini-2.0-flash) |
| modelKey | Same as name |
| provider | Google |

OpenAI Direct Configuration

Configure OpenAI Direct as Ultron’s primary provider by setting AI_MODE: open_ai (the default).

Step 1: Get API Key

Create an API key with GPT-4 access from the OpenAI platform. For details, see the OpenAI API documentation.

Step 2: Configure Ultron (values.yaml)

openai:
  AI_MODE: open_ai  # Default
  secretKeys:
    openai:
      OPENAI_API_KEY: "<your-openai-api-key>"
      OPENAI_API_KEYS: '["<key1>", "<key2>"]'   # JSON array for load balancing
      OPENAI_EMBEDDING_KEY: "<your-embedding-api-key>"
      OPENAI_VISION_KEY: "<your-vision-api-key>"
      OPENAI_MAIN_MODEL_NAME: "gpt-4.1"

Step 3: Configure Boost (values.yaml)

Configure Boost endpoints pointing to the OpenAI API:
credentials:
  boost:
    JITERA_BOOST_OPENAI_URL_OPENAI: "https://api.openai.com/v1"
    JITERA_BOOST_OPENAI_KEY_OPENAI: "<your-openai-api-key>"
Ultron modelKey format for OpenAI Direct: Use the openai: prefix to explicitly route to OpenAI’s API. The prefix is stripped and the remainder is used as the model ID.
| modelKey | Actual Model Used |
| --- | --- |
| openai:gpt-4.1 | gpt-4.1 |
| openai:gpt-4o | gpt-4o |

Anthropic Direct API Configuration

Direct access to Anthropic’s API, bypassing AWS Bedrock.

Step 1: Get API Key

Create an API key from the Anthropic Console. For details, see the Anthropic API documentation.

Step 2: Configure Ultron (values.yaml)

openai:
  secretKeys:
    anthropic:
      ANTHROPIC_API_KEY: "<your-anthropic-api-key>"
Ultron modelKey requirements for Anthropic Direct: The modelKey must contain claude but must not contain anthropic.claude — otherwise Bedrock is used instead.
| modelKey | Description |
| --- | --- |
| claude-3-haiku-20240307 | Claude 3 Haiku |
| claude-3-5-sonnet-20241022 | Claude 3.5 Sonnet v2 |
| claude-sonnet-4-20250514 | Claude Sonnet 4 |
| claude-sonnet-4-5-20250929 | Claude Sonnet 4.5 |
| claude-sonnet-4-6 | Claude Sonnet 4.6 |
| claude-opus-4-20250514 | Claude Opus 4 |
| claude-opus-4-1-20250805 | Claude Opus 4.1 |
| claude-opus-4-6 | Claude Opus 4.6 |

SuperAdmin Registration

| Field | Value |
| --- | --- |
| name | claude-sonnet-4-20250514 |
| modelKey | claude-sonnet-4-20250514 |
| provider | Anthropic |

vLLM Configuration

For air-gapped deployments or local model hosting.
vLLM requires GPU nodes with NVIDIA CUDA support.

Step 1: Enable vLLM

vllm:
  enabled: true
  replicaCount: 1
  args:
    - "vllm serve Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ --trust-remote-code --enable-prefix-caching --disable-log-requests --dtype=float16"
  resources:
    requests:
      memory: "4Gi"
      cpu: "2000m"
    limits:
      memory: "32Gi"
      cpu: "8000m"
      nvidia.com/gpu: 1
  nodeSelector:
    accelerator: nvidia-gpu

Step 2: Configure Credentials

credentials:
  vllm:
    HUGGING_FACE_HUB_TOKEN: "<your-hf-token>"

LiteLLM Proxy Configuration

LiteLLM provides a unified API proxy for Claude (Bedrock) and Gemini models used by Boost.
litellm:
  enabled: true
  replicaCount: 1
  resources:
    requests:
      memory: "512Mi"
      cpu: "250m"
The proxy model list is defined in charts/jitera/extra_config/litellm-proxy-config.yaml. See the AWS Bedrock and Google Gemini sections for model configuration examples.
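
For orientation, a model entry in that file generally follows LiteLLM's standard model_list schema; the sketch below is illustrative (the model ID and region are placeholders), and the model configuration examples in the AWS Bedrock and Google Gemini sections are authoritative.

```yaml
model_list:
  - model_name: claude-3.5-sonnet   # must match the SuperAdmin name field
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0
      aws_region_name: ap-northeast-1
```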

Background Model Configuration

Default models for background tasks are documented in the Default Models for Background Tasks section above.

Web Search Agent Configuration

Boost includes a Web Search Agent that provides web search, URL reading, and deep research capabilities. These features require additional API keys and firewall rules beyond the core LLM configuration.

Architecture

The Web Search Agent has two core capabilities:
  1. Web Search — Finding information on the internet
  2. URL Reading — Extracting content from web pages
Each capability has multiple backend options with a fallback chain:
| Capability | Tool | Default Backend | Fallback | Trigger |
| --- | --- | --- | --- | --- |
| Web Search | boost__web_search | Tavily (if API key set) | SearXNG (if Tavily not configured) | Agent explicitly calls the tool |
| Google Search | boost__google_search | Google (Agno scraping) | N/A | Legacy; registered globally but not used by any current workflow |
| URL Reading | boost__read_webpage | Jina Reader (r.jina.ai) | None | Agent explicitly calls the tool (e.g., deep-research skill) |
| URL Reading | read-urls middleware | MarkItDown (local conversion) | Jina Reader (if MarkItDown returns empty) | Automatically processes URLs in user messages |
The boost__read_webpage tool depends exclusively on Jina Reader with no fallback. If Jina Reader is unreachable, this tool will fail. Skills that rely on it (e.g., deep-research) will not function.
If neither Tavily nor SearXNG is configured, the boost__web_search tool will not be registered. Skills that depend on it (e.g., deep-research) will fail. boost__google_search exists in the global tool registry but is not used by any current workflow — it is a legacy tool from Document Agent v0.1.5.

Web Search Backend

Configure one of the two supported search backends: Tavily (JITERA_BOOST_TAVILY_API_KEY) or SearXNG (JITERA_BOOST_SEARXNG_URL).
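
The selection logic can be sketched as follows (the function is a hypothetical helper; the two environment variables are the real Boost settings):

```shell
# Tavily wins when its API key is set; otherwise SearXNG; otherwise the
# boost__web_search tool is not registered at all.
select_web_search_backend() {
  if [ -n "$JITERA_BOOST_TAVILY_API_KEY" ]; then
    echo "tavily"
  elif [ -n "$JITERA_BOOST_SEARXNG_URL" ]; then
    echo "searxng"
  else
    echo "none"
  fi
}
JITERA_BOOST_SEARXNG_URL="https://searxng.internal.example"   # hypothetical URL
select_web_search_backend
```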

URL Reading (Jina Reader)

Jina Reader converts web pages to text for the boost__read_webpage tool.
credentials:
  boost:
    # JITERA_BOOST_JINA_READER_API_URL: "https://r.jina.ai"  # Default
    JITERA_BOOST_JINA_READER_API_KEY: "<your-jina-api-key>"   # Optional — free tier allows 20 RPM
| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| JITERA_BOOST_JINA_READER_API_URL | No | https://r.jina.ai | Jina Reader API URL |
| JITERA_BOOST_JINA_READER_API_KEY | No | "" | Jina API key for higher rate limits |

Reranking (Optional)

Reranking improves search result quality for Document Agent and Code Agent RAG workflows. It is not required for the Web Search Agent to function.
| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| JITERA_BOOST_JINA_BASE_API_URL | No | https://api.jina.ai | Jina Rerank API base URL |
| JITERA_BOOST_CO_API_URL | No | https://api.cohere.ai | Cohere Rerank API URL (alternative to Jina) |
| JITERA_BOOST_CO_API_KEY | No | "" | Cohere API key |

Deep Research Requirements

The deep-research skill requires both of the following:
| Requirement | Minimum Configuration |
| --- | --- |
| boost__web_search | Tavily API key or SearXNG URL must be configured |
| boost__read_webpage | Jina Reader (r.jina.ai) must be accessible |
Jina’s free tier (20 RPM) may hit rate limits during deep research sessions that read 20+ URLs. Consider setting JITERA_BOOST_JINA_READER_API_KEY for higher limits in production.

Minimum Viable Configuration

credentials:
  boost:
    JITERA_BOOST_TAVILY_API_KEY: "tvly-xxxxxxxxxxxxx"
    # Jina Reader uses defaults (https://r.jina.ai, no API key, 20 RPM free tier)
    # MarkItDown requires no configuration (local library)
Required firewall rules: api.tavily.com:443, r.jina.ai:443
For the full list of required firewall rules, see Network and Firewall.

Ultron Provider Routing Reference

Ultron determines the LLM provider by pattern-matching the modelKey field from the SuperAdmin LLM record. Patterns are evaluated in the following priority order:
| Priority | Pattern | Provider | modelKey Example |
| --- | --- | --- | --- |
| 1 | Starts with openai: | OpenAI Direct | openai:gpt-4o |
| 2 | Starts with azure: | Azure (main region) | azure:gpt-4.1 |
| 3 | Starts with azure_global: | Azure (global region) | azure_global:gpt-5 |
| 4 | Contains anthropic.claude | AWS Bedrock Converse | arn:aws:bedrock:ap-northeast-1:123456789012:inference-profile/apac.anthropic.claude-3-5-sonnet-20241022-v2:0 |
| 5 | Contains claude | Anthropic Direct API | claude-3-opus-20240229 |
| 6 | Contains gemini | Google Generative AI | gemini-2.0-flash |
| 7 | Contains deepseek-r1-distill | Groq | deepseek-r1-distill-llama-70b |
| 8 | Equals o1, o1-mini, or o3-mini | Azure Global (auto-routed) | o1 |
| 9 | Contains qwen | OpenAI-compatible endpoint | qwen-2.5-72b |
| 10 | Default | Azure main/default | |
Priority order matters. For example, a Bedrock ARN containing anthropic.claude-3-5-sonnet matches rule 4 (contains anthropic.claude) before rule 5 (contains claude). Construct modelKey values carefully to avoid ambiguous matches.
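
The priority table translates naturally to a first-match-wins shell case statement; the sketch below is an illustrative reimplementation, not the actual Ultron source:

```shell
ultron_provider_for() {
  case "$1" in
    openai:*)              echo "openai-direct" ;;
    azure:*)               echo "azure-main" ;;
    azure_global:*)        echo "azure-global" ;;
    *anthropic.claude*)    echo "bedrock-converse" ;;
    *claude*)              echo "anthropic-direct" ;;
    *gemini*)              echo "google-generative-ai" ;;
    *deepseek-r1-distill*) echo "groq" ;;
    o1|o1-mini|o3-mini)    echo "azure-global" ;;
    *qwen*)                echo "openai-compatible" ;;
    *)                     echo "azure-main" ;;  # rule 10: default
  esac
}
# A Bedrock ARN hits rule 4 before the broader "claude" rule 5:
ultron_provider_for "apac.anthropic.claude-3-5-sonnet-20241022-v2:0"  # prints: bedrock-converse
ultron_provider_for "gemini-2.0-flash"                                # prints: google-generative-ai
```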

Azure Model-to-Deployment Mapping (Ultron)

Ultron maps requested model names to Azure deployment names using these environment variables:
| Requested Model | Environment Variable | Example Value |
| --- | --- | --- |
| gpt-4o | AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME | gpt-4o |
| gpt-4o-mini | AZURE_OPENAI_GPT_4O_MINI_DEVELOPMENT_NAME | gpt-4o-mini |
| gpt-4.1 | AZURE_DEVELOPMENT_NAME_GPT_41 | gpt-4.1 |
| gpt-3.5-instruct | AZURE_OPENAI_GPT_35_INSTRUCT_DEVELOPMENT_NAME | gpt-3.5-instruct |
| text-embedding-ada-002 | AZURE_OPENAI_EMBEDDING_DEVELOPMENT_NAME | text-embedding-ada-002 |
| Vision model | AZURE_OPENAI_VISION_DEVELOPMENT_NAME | gpt-4o |
| o1 | AZURE_OPENAI_GPT_O1_DEVELOPMENT_NAME | o1 |
| o1-mini | AZURE_OPENAI_GPT_O1_MINI_DEVELOPMENT_NAME | o1-mini |
| o3-mini | AZURE_OPENAI_GPT_O3_MINI_DEVELOPMENT_NAME | o3-mini |
| o3 | AZURE_DEVELOPMENT_NAME_O3 | o3 |
| gpt-5 | AZURE_OPENAI_GPT_5_DEVELOPMENT_NAME | gpt-5 |
| gpt-5-mini | AZURE_OPENAI_GPT_5_MINI_DEVELOPMENT_NAME | gpt-5-mini |
| gpt-5-nano | AZURE_OPENAI_GPT_5_NANO_DEVELOPMENT_NAME | gpt-5-nano |
| gpt-5-chat | AZURE_OPENAI_GPT_5_CHAT_DEVELOPMENT_NAME | gpt-5-chat |
| gpt-5.1 | AZURE_OPENAI_GPT_51_DEVELOPMENT_NAME | gpt-5.1 |
| gpt-5.2 | AZURE_OPENAI_GPT_52_DEVELOPMENT_NAME | gpt-5.2 |
| Default | AZURE_OPENAI_DEVELOPMENT_NAME | gpt-4.1 |
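
The mapping behaves like a lookup with a fallback to the default deployment. The sketch below uses a hypothetical helper; the :- defaults stand in for the example values in the table and are not a claim about actual Ultron behavior:

```shell
# Resolve a requested model name to its Azure deployment name via the
# environment variables above, falling back to the default deployment.
azure_deployment_for() {
  case "$1" in
    gpt-4o)  echo "${AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME:-gpt-4o}" ;;
    gpt-4.1) echo "${AZURE_DEVELOPMENT_NAME_GPT_41:-gpt-4.1}" ;;
    o1)      echo "${AZURE_OPENAI_GPT_O1_DEVELOPMENT_NAME:-o1}" ;;
    *)       echo "${AZURE_OPENAI_DEVELOPMENT_NAME:-gpt-4.1}" ;;
  esac
}
AZURE_OPENAI_GPT_4O_DEVELOPMENT_NAME="my-gpt4o-deployment"
azure_deployment_for "gpt-4o"   # prints: my-gpt4o-deployment
```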

Verification

Check AI Service Health

# Check Ultron
kubectl get pods -n jitera -l app=jitera-ultron
kubectl logs -n jitera -l app=jitera-ultron --tail=100

# Check Boost
kubectl get pods -n jitera -l app=jitera-boost

# Check LiteLLM
kubectl get pods -n jitera -l app=jitera-litellm

Check Environment Variables

# Verify Ultron credentials
kubectl exec -it deploy/jitera-ultron -n jitera -- \
  env | grep -E "(AI_MODE|AZURE_|OPENAI_|BEDROCK_)"

# Verify Boost credentials
kubectl exec -it deploy/jitera-boost -n jitera -- \
  env | grep JITERA_BOOST

# Verify LiteLLM credentials
kubectl exec -it deploy/jitera-litellm -n jitera -- \
  env | grep -E "(AWS_|GEMINI_)"

Test Provider Connectivity

# Test Azure OpenAI
kubectl exec -it deploy/jitera-ultron -n jitera -- \
  curl -X POST "https://<INSTANCE>.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21" \
  -H "api-key: <KEY>" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'

# Test LiteLLM proxy models endpoint
kubectl exec -it deploy/jitera-boost -n jitera -- \
  curl -H "Authorization: Bearer $JITERA_BOOST_OPENAI_KEY_LITELLM" \
  http://jitera-litellm/v1/models

# Check LiteLLM proxy config
kubectl exec -it deploy/jitera-litellm -n jitera -- cat /app/config.yaml

Troubleshooting

Model Not Appearing in GUI Dropdown

  1. Verify the LLM record is enabled in SuperAdmin:
    SELECT * FROM llms WHERE name = 'your-model-name';
    UPDATE llms SET enabled = true WHERE name = 'your-model-name';
    
  2. Confirm the LLM is assigned to the organization.

"Deployment not found" Error (Azure)

  1. Confirm the Azure deployment name matches the SuperAdmin name field exactly.
  2. Verify JITERA_BOOST_API_CONFIG_AZURE_* URL contains the correct deployment name.
  3. Check Ultron has the correct deployment name env var:
    kubectl exec -it deploy/jitera-ultron -n jitera -- env | grep AZURE
    kubectl exec -it deploy/jitera-boost -n jitera -- env | grep JITERA_BOOST_API_CONFIG
    

LiteLLM Model Not Working (Claude/Gemini)

  1. Confirm the model entry exists in litellm-proxy-config.yaml.
  2. Verify model_name matches the SuperAdmin name field exactly.
  3. Confirm credentials are set correctly:
    kubectl exec -it deploy/jitera-litellm -n jitera -- cat /app/config.yaml
    kubectl exec -it deploy/jitera-litellm -n jitera -- env | grep -E "(AWS_|GEMINI_)"
    

GPT-5 Falls Back to GPT-4.1

GPT-5 is not automatically routed to the Azure Global region. Set the SuperAdmin modelKey to azure_global:gpt-5 to explicitly route it to the Global region.

API Key Errors

# Check secret values
kubectl get secret jitera-openai -n jitera -o yaml

# Decode a base64 value copied from the secret output above
echo "<BASE64-VALUE>" | base64 -d

Rate Limiting

  1. Add multiple API keys using JSON arrays (AZURE_OPENAI_KEYS, AZURE_OPENAI_INSTANCE_NAMES).
  2. Add additional Boost endpoint variables (JITERA_BOOST_API_CONFIG_AZURE_INSTANCE_2_*).
  3. Increase quotas with your AI provider.
  4. Consider deploying to multiple regions.
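
The JSON-array form in step 1 looks like this in values.yaml (keys and instance names are placeholders):

```yaml
openai:
  secretKeys:
    azure:
      AZURE_OPENAI_KEYS: '["<key-for-instance-1>", "<key-for-instance-2>"]'
      AZURE_OPENAI_INSTANCE_NAMES: '["<instance-1>", "<instance-2>"]'
```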

vLLM GPU Issues

# Check GPU nodes
kubectl get nodes -l accelerator=nvidia-gpu

# Check GPU availability
kubectl describe node <node-name> | grep nvidia

# Check vLLM pod status
kubectl describe pod -n jitera -l app=jitera-vllm

Appendix: Environment Variable Reference

Ultron

General (applies to both modes):
| Variable | values.yaml Source | Purpose |
| --- | --- | --- |
| AI_MODE | openai.AI_MODE | open_ai (default) or azure |
| OPENAI_MAIN_MODEL_NAME | openai.secretKeys.openai.OPENAI_MAIN_MODEL_NAME | Main background model (required for both modes) |
OpenAI Direct mode (AI_MODE: open_ai):
| Variable | values.yaml Source | Purpose |
| --- | --- | --- |
| OPENAI_API_KEYS | openai.secretKeys.openai.OPENAI_API_KEYS | JSON array of OpenAI keys |
| OPENAI_API_KEY | openai.secretKeys.openai.OPENAI_API_KEY | Single OpenAI key |
| OPENAI_EMBEDDING_KEY | openai.secretKeys.openai.OPENAI_EMBEDDING_KEY | Embeddings API key |
| OPENAI_VISION_KEY | openai.secretKeys.openai.OPENAI_VISION_KEY | Vision model API key |
Azure mode (AI_MODE: azure):
| Variable | values.yaml Source | Purpose |
| --- | --- | --- |
| AZURE_OPENAI_KEYS | openai.secretKeys.azure.AZURE_OPENAI_KEYS | JSON array of Azure keys |
| AZURE_OPENAI_INSTANCE_NAMES | openai.secretKeys.azure.AZURE_OPENAI_INSTANCE_NAMES | JSON array of instance names |
| AZURE_OPENAI_VERSION | openai.secretKeys.azure.AZURE_OPENAI_VERSION | API version |
| AZURE_OPENAI_DEVELOPMENT_NAME | openai.secretKeys.azure.AZURE_OPENAI_DEVELOPMENT_NAME | Default deployment |
| AZURE_OPENAI_GLOBAL_KEYS | openai.secretKeys.azure.AZURE_OPENAI_GLOBAL_KEYS | Global region keys |
| AZURE_OPENAI_GLOBAL_INSTANCE_NAMES | openai.secretKeys.azure.AZURE_OPENAI_GLOBAL_INSTANCE_NAMES | Global instances |
Additional providers (optional, available in either mode):
| Variable | values.yaml Source | Purpose |
| --- | --- | --- |
| BEDROCK_CONVERSE_REGION | openai.secretKeys.bedrock.BEDROCK_CONVERSE_REGION | AWS Bedrock main region |
| BEDROCK_CONVERSE_ACCESS_KEY_ID | openai.secretKeys.bedrock.BEDROCK_CONVERSE_ACCESS_KEY_ID | AWS access key (main) |
| BEDROCK_CONVERSE_SECRET_ACCESS_KEY | openai.secretKeys.bedrock.BEDROCK_CONVERSE_SECRET_ACCESS_KEY | AWS secret key (main) |
| BEDROCK_CONVERSE_GLOBAL_REGION | openai.secretKeys.bedrock.BEDROCK_CONVERSE_GLOBAL_REGION | AWS Bedrock global region |
| BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID | openai.secretKeys.bedrock.BEDROCK_CONVERSE_GLOBAL_ACCESS_KEY_ID | AWS access key (global) |
| BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY | openai.secretKeys.bedrock.BEDROCK_CONVERSE_GLOBAL_SECRET_ACCESS_KEY | AWS secret key (global) |
| GOOGLE_GENERATIVE_API_KEY | openai.secretKeys.google.GOOGLE_GENERATIVE_API_KEY | Gemini API key |
| ANTHROPIC_API_KEY | openai.secretKeys.anthropic.ANTHROPIC_API_KEY | Anthropic direct API key |

Boost

| Variable | Source | Purpose |
| --- | --- | --- |
| JITERA_BOOST_API_KEY_MAIN | credentials.boost.JITERA_BOOST_API_KEY_MAIN | Main Boost API key |
| JITERA_BOOST_OPENAI_URL_LITELLM | Auto-generated | LiteLLM proxy URL |
| JITERA_BOOST_OPENAI_KEY_LITELLM | credentials.boost.JITERA_BOOST_OPENAI_KEY_LITELLM | LiteLLM master key |
| JITERA_BOOST_API_CONFIG_AZURE_* | credentials.boost.JITERA_BOOST_API_CONFIG_AZURE_* | Azure endpoint configs |
| JITERA_BOOST_DEFAULT_BASE_MODEL | boost.env.* | Background base model |
| JITERA_BOOST_DEFAULT_EXPERT_MODEL | boost.env.* | Background expert model |

LiteLLM

| Variable | Source | Purpose |
| --- | --- | --- |
| PROXY_MASTER_KEY | credentials.boost.JITERA_BOOST_OPENAI_KEY_LITELLM | Proxy authentication |
| AWS_ACCESS_KEY_ID | openai.secretKeys.bedrock.BEDROCK_CONVERSE_ACCESS_KEY_ID | AWS Bedrock auth |
| AWS_SECRET_ACCESS_KEY | openai.secretKeys.bedrock.BEDROCK_CONVERSE_SECRET_ACCESS_KEY | AWS Bedrock auth |
| GEMINI_API_KEY | openai.secretKeys.google.GOOGLE_GENERATIVE_API_KEY | Google Gemini auth |

Appendix: Configuration Checklists

Azure OpenAI Only

  • Set openai.AI_MODE: azure
  • Set OPENAI_MAIN_MODEL_NAME under openai.secretKeys.openai (e.g., gpt-4.1)
  • Configure AZURE_OPENAI_KEYS and AZURE_OPENAI_INSTANCE_NAMES
  • Set AZURE_OPENAI_VERSION
  • Configure deployment name env vars for each model
  • Configure Boost Azure endpoints in credentials.boost.JITERA_BOOST_API_CONFIG_AZURE_*
  • Register models in SuperAdmin with name matching Azure deployment names

Azure OpenAI + Claude (Bedrock)

  • Complete Azure setup above
  • Configure openai.secretKeys.bedrock.* (main and global regions)
  • Add Claude models to litellm-proxy-config.yaml
  • Set JITERA_BOOST_OPENAI_KEY_LITELLM in credentials.boost
  • Register Claude models in SuperAdmin with name matching LiteLLM model_name

Azure OpenAI + Claude + Gemini

  • Complete Azure + Claude setup above
  • Configure openai.secretKeys.google.GOOGLE_GENERATIVE_API_KEY
  • Add Gemini models to litellm-proxy-config.yaml
  • Register Gemini models in SuperAdmin with name matching LiteLLM model_name

Web Search Agent (Optional)

  • Configure web search backend: JITERA_BOOST_TAVILY_API_KEY (Tavily) or JITERA_BOOST_SEARXNG_URL (SearXNG)
  • Verify Jina Reader reachability (r.jina.ai:443)
  • (Optional) Set JITERA_BOOST_JINA_READER_API_KEY for higher rate limits
  • Add required domains to firewall allow-list (see Network and Firewall)

Related Pages

  • Helm Values: complete configuration reference
  • Values Reference: all configuration parameters
  • Architecture: service architecture overview
  • Troubleshooting: common issues and solutions