

Jitera Self-Hosted is a Kubernetes-based platform consisting of application microservices, infrastructure services, and data stores. This page covers supported deployment configurations and the high-level architecture.

Supported Configurations

Jitera supports four deployment models, defined by two dimensions: where the platform runs (Public Cloud or On-Premises) and how the LLM is accessed (Public or Private).
| Configuration | Network Isolation | Data Management Control | Infrastructure Difficulty | Installation Guide |
| --- | --- | --- | --- | --- |
| Public Cloud + Public LLM | Low | Low | Low | AWS EKS, Azure AKS |
| Public Cloud + Private LLM | Medium–High | Medium–High | Low–Medium | AWS EKS, Azure AKS |
| On-Premises + Public LLM | Medium | Medium | High | On-premises |
| On-Premises + Private LLM | High | High | Very High | On-premises |
  • Public Cloud refers to standard multi-tenant IaaS/PaaS environments (AWS, Azure). Dedicated hardware (e.g., AWS Dedicated Hosts) and hybrid extensions (e.g., AWS Outposts) are excluded.
  • On-Premises refers to infrastructure owned and physically maintained within the organization’s facilities. Hosted Private Cloud services managed by IaaS providers are excluded.
  • Public LLM refers to models hosted by external vendors (e.g., OpenAI, Google Gemini) accessed via API. Private LLM refers to open-source or licensed models (e.g., Qwen, Llama, Mistral) deployed on user-controlled infrastructure.

Public Cloud + Public LLM

The simplest deployment model. Jitera runs on a managed Kubernetes service (EKS / AKS) and connects to an external LLM provider over the internet.
  • Network Isolation — Low: Infrastructure can be isolated via VPC / Virtual Network, but LLM API calls traverse the public internet.
  • Data Management Control — Low: Both infrastructure and AI model are managed by third parties.
  • Infrastructure Difficulty — Low: Managed Kubernetes eliminates control plane management. No physical hardware required.
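For orientation, the LiteLLM service listed under Application Services below is the natural place where this provider connection is configured. A minimal sketch of a LiteLLM config.yaml entry for a public provider; the model alias and how Jitera mounts this file are assumptions, not confirmed chart details:

```yaml
# Sketch: route chat requests to a public LLM provider over the internet.
# The API key is read from an environment variable, not stored in the file.
model_list:
  - model_name: gpt-4o                  # hypothetical alias exposed to Jitera services
    litellm_params:
      model: openai/gpt-4o              # provider/model identifier
      api_key: os.environ/OPENAI_API_KEY
```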

Public Cloud + Private LLM

Jitera runs on a managed Kubernetes service, and the LLM is deployed within the user’s own cloud environment (e.g., GPU instances in the same VPC).
  • Network Isolation — Medium to High: Access to the LLM can be secured through a closed network (within the VPC).
  • Data Management Control — Medium to High: The LLM execution environment is built and managed within the user’s own cloud tenant. Data stays within the user’s cloud environment.
  • Infrastructure Difficulty — Low to Medium: Adds GPU node resource management (auto-scaling configuration, GPU quota requests) on top of the Public Cloud + Public LLM model.
See AI Configuration — vLLM for GPU requirements.
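As an illustration of the closed-network path, a vLLM server exposing an OpenAI-compatible API inside the VPC can be registered with LiteLLM through an api_base override. The hostname, model, and alias below are hypothetical:

```yaml
# Sketch: point LiteLLM at a vLLM server inside the same VPC/cluster,
# so prompts never traverse the public internet.
model_list:
  - model_name: private-llm                       # hypothetical alias used by Jitera services
    litellm_params:
      model: openai/Qwen2.5-7B-Instruct           # vLLM serves an OpenAI-compatible API
      api_base: http://vllm.llm.svc.cluster.local:8000/v1   # hypothetical in-cluster endpoint
      api_key: dummy                              # placeholder; vLLM does not require a key by default
```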

On-Premises + Public LLM

Jitera runs on user-owned hardware, but connects to an external LLM provider over the internet.
  • Network Isolation — Medium: Infrastructure is completely isolated, but LLM API calls traverse the public internet.
  • Data Management Control — Medium: Data usage and retention policies depend on the LLM provider. On-premises data is transmitted externally.
  • Infrastructure Difficulty — High: You must build the Kubernetes cluster from scratch (HA control planes, etcd backups, load balancers, storage classes). Requires hardware procurement, installation, and physical maintenance.
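To make the isolation boundary concrete, cluster egress can be narrowed so that only the LLM-proxy pods may reach the internet, and only over HTTPS. A sketch using a standard Kubernetes NetworkPolicy; the namespace and pod labels are hypothetical, and NetworkPolicy filters by CIDR rather than domain, so port 443 is allowed broadly here:

```yaml
# Sketch: restrict the LLM-proxy pods to outbound HTTPS plus DNS.
# With policyTypes Egress, all other outbound traffic from these pods is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: litellm-egress-https-only
  namespace: jitera              # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: litellm               # hypothetical pod label
  policyTypes: ["Egress"]
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443              # HTTPS to the LLM provider
    - ports:
        - protocol: UDP
          port: 53               # DNS resolution
```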

On-Premises + Private LLM

The most isolated deployment model. Both Jitera and the LLM run entirely on user-owned hardware within the organization’s network.
  • Network Isolation — High: Access to the LLM occurs entirely within the organization’s closed network.
  • Data Management Control — High: Both the LLM execution environment and all data are managed within the organization without external exposure.
  • Infrastructure Difficulty — Very High: In addition to on-premises Kubernetes management, Kubernetes upgrades require rigorous testing against NVIDIA drivers and the Container Toolkit. GPU servers are expensive, have long lead times, and consume significant power, often necessitating facility upgrades (power, cooling).
See AI Configuration — vLLM for GPU requirements.
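As a small illustration of the extra moving parts, a GPU inference workload must tolerate GPU node taints and request the nvidia.com/gpu resource exposed by the NVIDIA device plugin. The names and image below are hypothetical choices for the sketch:

```yaml
# Sketch: schedule a vLLM pod onto a dedicated, tainted GPU node pool.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm                       # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels: { app: vllm }
  template:
    metadata:
      labels: { app: vllm }
    spec:
      tolerations:
        - key: nvidia.com/gpu      # GPU nodes are commonly tainted with this key
          operator: Exists
          effect: NoSchedule
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args: ["--model", "Qwen/Qwen2.5-7B-Instruct"]
          resources:
            limits:
              nvidia.com/gpu: 1    # exposed by the NVIDIA device plugin
```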

Application Services

| Component | Description | Technology |
| --- | --- | --- |
| Frontend | User-facing web application | React, TypeScript |
| Frontend Core | AI chat, documentation, knowledge graph | React 19, TypeScript |
| SWEF | Code generation interface | React, TypeScript |
| Automation | Business logic, API endpoints, background jobs | Ruby on Rails, GraphQL |
| Ultron | AI chat completions, code generation | NestJS, LangChain |
| Boost | LLM routing, workflow orchestration | Python, FastAPI |
| LiteLLM | LLM provider abstraction | Python |

Infrastructure Services

| Component | Description |
| --- | --- |
| Kong Gateway | API gateway, SSL termination, request routing |
| Document Converter | File format conversion |
| HTML Conversion | HTML document processing |
| Playwright | Browser automation for testing and screenshots |
| Hasura | Real-time GraphQL API for database access |
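To illustrate the gateway's role, here is a Kong declarative configuration sketch routing two paths to in-cluster services; the upstream names and paths are hypothetical, not Jitera's actual routing table:

```yaml
# Sketch: Kong declarative config mapping URL paths to backend services.
_format_version: "3.0"
services:
  - name: automation
    url: http://automation.jitera.svc.cluster.local:3000   # hypothetical upstream
    routes:
      - name: automation-api
        paths: ["/api"]
  - name: frontend
    url: http://frontend.jitera.svc.cluster.local:80       # hypothetical upstream
    routes:
      - name: frontend-root
        paths: ["/"]
```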

Data Stores

All data stores are bundled in the Helm chart and deployed in-cluster by default. For production high-availability deployments, consider externalizing them to managed services, as sketched after the table below.
| Service | Purpose |
| --- | --- |
| PostgreSQL | Primary relational database (users, projects, permissions) |
| PGVector | Vector similarity search for AI embeddings |
| MongoDB | Document storage (generated code, design specs) |
| Redis | Caching, sessions, job queues |
| RabbitMQ | Asynchronous message processing |
See External Services for supported versions and externalization options.
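As a hedged example of externalization, a Helm values override typically disables a bundled store and supplies a managed endpoint instead. The keys below are hypothetical placeholders; the actual chart values are covered in External Services:

```yaml
# Sketch: values override replacing the in-cluster PostgreSQL with a
# managed database. All keys shown are hypothetical placeholders.
postgresql:
  enabled: false                            # skip the bundled in-cluster instance
externalDatabase:
  host: mydb.example.rds.amazonaws.com      # hypothetical managed endpoint
  port: 5432
  existingSecret: jitera-db-credentials     # credentials supplied via a Kubernetes Secret
```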

Sizing and Scaling

For resource sizing by deployment scale (small, medium, large), see the Sizing Guide. For scaling strategies and auto-scaling configuration, see the Scaling Guide. For mandatory and optional deployment requirements, see Requirements.
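For orientation only, auto-scaling in Kubernetes is commonly expressed as a HorizontalPodAutoscaler. Below is a generic sketch against a hypothetical Deployment; the Scaling Guide documents Jitera's actual recommendations:

```yaml
# Sketch: scale a stateless service between 2 and 10 replicas on CPU load.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ultron-hpa                 # hypothetical target
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ultron
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas above 70% average CPU
```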
