Jitera Self-Hosted is a Kubernetes-based platform consisting of application microservices, infrastructure services, and data stores. This page covers supported deployment configurations and the high-level architecture.

Documentation Index
Fetch the complete documentation index at: https://docs.jitera.ai/llms.txt
Use this file to discover all available pages before exploring further.
Supported Configurations
Jitera supports four deployment models, defined by two dimensions: where the platform runs (Public Cloud or On-Premises) and how the LLM is accessed (Public or Private).

| Configuration | Network Isolation | Data Management Control | Infrastructure Difficulty | Installation Guide |
|---|---|---|---|---|
| Public Cloud + Public LLM | Low | Low | Low | AWS EKS, Azure AKS |
| Public Cloud + Private LLM | Medium–High | Medium–High | Low–Medium | AWS EKS, Azure AKS |
| On-Premises + Public LLM | Medium | Medium | High | On-premises |
| On-Premises + Private LLM | High | High | Very High | On-premises |
Public Cloud refers to standard multi-tenant IaaS/PaaS environments (AWS, Azure). Dedicated hardware (e.g., AWS Dedicated Hosts) and hybrid extensions (e.g., AWS Outposts) are excluded.

On-Premises refers to infrastructure owned and physically maintained within the organization’s facilities. Hosted Private Cloud services managed by IaaS providers are excluded.

Public LLM refers to models hosted by external vendors (e.g., OpenAI, Google Gemini) accessed via API. Private LLM refers to open-source or licensed models (e.g., Qwen, Llama, Mistral) deployed on user-controlled infrastructure.
Public Cloud + Public LLM
The simplest deployment model. Jitera runs on a managed Kubernetes service (EKS / AKS) and connects to an external LLM provider over the internet.

- Network Isolation — Low: Infrastructure can be isolated via VPC / Virtual Network, but LLM API calls traverse the public internet.
- Data Management Control — Low: Both infrastructure and AI model are managed by third parties.
- Infrastructure Difficulty — Low: Managed Kubernetes eliminates control plane management. No physical hardware required.
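
Since the stack already includes LiteLLM as its provider abstraction layer (see Application Services below), connecting to a public provider typically amounts to pointing LiteLLM at the vendor API. A minimal sketch using LiteLLM's standard `model_list` proxy config, with an illustrative model name and the API key read from the environment (not Jitera-specific defaults):

```yaml
# Illustrative LiteLLM proxy config for a public LLM provider.
# The model name and provider choice are examples, not Jitera defaults.
model_list:
  - model_name: default-chat
    litellm_params:
      model: openai/gpt-4o                  # provider/model identifier
      api_key: os.environ/OPENAI_API_KEY    # read the key from the environment
```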
Public Cloud + Private LLM
Jitera runs on a managed Kubernetes service, and the LLM is deployed within the user’s own cloud environment (e.g., GPU instances in the same VPC).

- Network Isolation — Medium to High: Access to the LLM can be secured through a closed network (within the VPC).
- Data Management Control — Medium to High: The LLM execution environment is built and managed within the user’s own cloud tenant. Data stays within the user’s cloud environment.
- Infrastructure Difficulty — Low to Medium: Adds GPU node resource management (auto-scaling configuration, GPU quota requests) on top of the public LLM model.
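
In this model, the same LiteLLM layer can route to an OpenAI-compatible endpoint served inside the VPC (for example, a vLLM deployment) instead of a vendor API. A sketch, where the in-cluster service name and model are hypothetical examples:

```yaml
# Illustrative LiteLLM proxy config for a private LLM served inside the VPC.
# The service DNS name and model are hypothetical; adapt to your deployment.
model_list:
  - model_name: private-chat
    litellm_params:
      model: openai/Qwen2.5-72B-Instruct    # served via an OpenAI-compatible API
      api_base: http://vllm.llm-serving.svc.cluster.local:8000/v1
      api_key: "unused"                     # local servers often ignore the key
```

Because traffic stays on the cluster/VPC network, no LLM request leaves the user's cloud tenant.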
On-Premises + Public LLM
Jitera runs on user-owned hardware, but connects to an external LLM provider over the internet.

- Network Isolation — Medium: Infrastructure is completely isolated, but LLM API calls traverse the public internet.
- Data Management Control — Medium: Data usage and retention policies depend on the LLM provider. On-premises data is transmitted externally.
- Infrastructure Difficulty — High: You must build the Kubernetes cluster from scratch (HA control planes, etcd backups, load balancers, storage classes). Requires hardware procurement, installation, and physical maintenance.
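
As a rough illustration of the "from scratch" work involved, one common approach is bootstrapping an HA cluster with kubeadm behind a load-balanced control-plane endpoint. A sketch, where the endpoint, version, and CIDR are placeholders for your environment (not values Jitera prescribes):

```yaml
# Illustrative kubeadm ClusterConfiguration for an HA on-premises cluster.
# All values here are placeholders; adapt them to your environment.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0
controlPlaneEndpoint: "k8s-api.example.internal:6443"  # LB in front of control planes
networking:
  podSubnet: "10.244.0.0/16"
etcd:
  local:
    dataDir: /var/lib/etcd   # back this directory up regularly
```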
On-Premises + Private LLM
The most isolated deployment model. Both Jitera and the LLM run entirely on user-owned hardware within the organization’s network.

- Network Isolation — High: Access to the LLM occurs entirely within the organization’s closed network.
- Data Management Control — High: Both the LLM execution environment and all data are managed within the organization without external exposure.
- Infrastructure Difficulty — Very High: In addition to on-premises Kubernetes management, every Kubernetes upgrade requires rigorous compatibility testing against the installed NVIDIA drivers and the NVIDIA Container Toolkit. GPU servers are expensive, have long procurement lead times, and consume significant power — often necessitating facility upgrades (power, cooling).
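
Running the LLM on these GPU servers means the serving workload must be scheduled onto GPU nodes. A minimal sketch of the scheduling side, assuming the NVIDIA device plugin is installed on the cluster; the deployment name and serving image are examples only:

```yaml
# Illustrative GPU scheduling for a private LLM serving pod.
# Assumes the NVIDIA device plugin is installed; names are examples only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-server
spec:
  replicas: 1
  selector:
    matchLabels: { app: llm-server }
  template:
    metadata:
      labels: { app: llm-server }
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest   # example serving image
          resources:
            limits:
              nvidia.com/gpu: 1            # one GPU, allocated via the device plugin
```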
Application Services
| Component | Description | Technology |
|---|---|---|
| Frontend | User-facing web application | React, TypeScript |
| Frontend Core | AI chat, documentation, knowledge graph | React 19, TypeScript |
| SWEF | Code generation interface | React, TypeScript |
| Automation | Business logic, API endpoints, background jobs | Ruby on Rails, GraphQL |
| Ultron | AI chat completions, code generation | NestJS, LangChain |
| Boost | LLM routing, workflow orchestration | Python, FastAPI |
| LiteLLM | LLM provider abstraction | Python |
Infrastructure Services
| Component | Description |
|---|---|
| Kong Gateway | API gateway, SSL termination, request routing |
| Document Converter | File format conversion |
| HTML Conversion | HTML document processing |
| Playwright | Browser automation for testing and screenshots |
| Hasura | Real-time GraphQL API for database access |
Data Stores
All data stores are bundled in the Helm chart and deployed in-cluster by default. For production high-availability deployments, consider externalizing to managed services.

| Service | Purpose |
|---|---|
| PostgreSQL | Primary relational database (users, projects, permissions) |
| PGVector | Vector similarity search for AI embeddings |
| MongoDB | Document storage (generated code, design specs) |
| Redis | Caching, sessions, job queues |
| RabbitMQ | Asynchronous message processing |
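
Externalizing typically follows the common Helm pattern of disabling the bundled store and supplying a managed endpoint via a values override. A sketch for PostgreSQL — the exact key names depend on the Jitera chart and are hypothetical here:

```yaml
# Illustrative values-override pattern for externalizing a bundled data store.
# Key names are hypothetical; consult the chart's values reference.
postgresql:
  enabled: false                  # skip the in-cluster PostgreSQL
externalDatabase:
  host: my-db.example.internal    # managed service endpoint (e.g., RDS)
  port: 5432
  existingSecret: jitera-db-credentials
```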
Sizing and Scaling
For resource sizing by deployment scale (small, medium, large), see the Sizing Guide. For scaling strategies and auto-scaling configuration, see the Scaling Guide.

Related Documentation
Sizing Guide
Resource sizing by deployment scale
Scaling Guide
Scaling strategies and auto-scaling
Requirements
Mandatory and optional deployment requirements

