

Jitera Self-Hosted is a Kubernetes-based platform consisting of application microservices, infrastructure services, and data stores. This page covers supported deployment configurations and the high-level architecture.

Supported Configurations

Jitera supports four deployment models, defined by two dimensions: where the platform runs (Public Cloud or On-Premises) and how the LLM is accessed (Public or Private).
| Configuration | Network Isolation | Data Management Control | Infrastructure Difficulty | Installation Guide |
| --- | --- | --- | --- | --- |
| Public Cloud + Public LLM | Low | Low | Low | AWS EKS, Azure AKS |
| Public Cloud + Private LLM | Medium–High | Medium–High | Low–Medium | AWS EKS, Azure AKS |
| On-Premises + Public LLM | Medium | Medium | High | On-premises |
| On-Premises + Private LLM | High | High | Very High | On-premises |
  • Public Cloud refers to standard multi-tenant IaaS/PaaS environments (AWS, Azure). Dedicated hardware (e.g., AWS Dedicated Hosts) and hybrid extensions (e.g., AWS Outposts) are excluded.
  • On-Premises refers to infrastructure owned and physically maintained within the organization’s facilities. Hosted Private Cloud services managed by IaaS providers are excluded.
  • Public LLM refers to models hosted by external vendors (e.g., OpenAI, Google Gemini) accessed via API. Private LLM refers to open-source or licensed models (e.g., Qwen, Llama, Mistral) deployed on user-controlled infrastructure.

Public Cloud + Public LLM

The simplest deployment model. Jitera runs on a managed Kubernetes service (EKS / AKS) and connects to an external LLM provider over the internet.
  • Network Isolation — Low: Infrastructure can be isolated via VPC / Virtual Network, but LLM API calls traverse the public internet.
  • Data Management Control — Low: Both infrastructure and AI model are managed by third parties.
  • Infrastructure Difficulty — Low: Managed Kubernetes eliminates control plane management. No physical hardware required.
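For orientation, the LiteLLM service listed under Application Services below is the natural place where this provider connection is configured. A minimal sketch of a LiteLLM config.yaml entry for a public provider; the model alias and how Jitera mounts this file are assumptions, not confirmed chart details:

```yaml
# Sketch: route chat requests to a public LLM provider over the internet.
# The API key is read from an environment variable, not stored in the file.
model_list:
  - model_name: gpt-4o                  # hypothetical alias exposed to Jitera services
    litellm_params:
      model: openai/gpt-4o              # provider/model identifier
      api_key: os.environ/OPENAI_API_KEY
```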

Public Cloud + Private LLM

Jitera runs on a managed Kubernetes service, and the LLM is deployed within the user’s own cloud environment (e.g., GPU instances in the same VPC).
  • Network Isolation — Medium to High: Access to the LLM can be secured through a closed network (within the VPC).
  • Data Management Control — Medium to High: The LLM execution environment is built and managed within the user’s own cloud tenant. Data stays within the user’s cloud environment.
  • Infrastructure Difficulty — Low to Medium: Adds GPU node resource management (auto-scaling configuration, GPU quota requests) on top of the Public Cloud + Public LLM model.
See AI Configuration — vLLM for GPU requirements.
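As an illustration of the closed-network path, a vLLM server exposing an OpenAI-compatible API inside the VPC can be registered with LiteLLM through an api_base override. The hostname, model, and alias below are hypothetical:

```yaml
# Sketch: point LiteLLM at a vLLM server inside the same VPC/cluster,
# so prompts never traverse the public internet.
model_list:
  - model_name: private-llm                       # hypothetical alias used by Jitera services
    litellm_params:
      model: openai/Qwen2.5-7B-Instruct           # vLLM serves an OpenAI-compatible API
      api_base: http://vllm.llm.svc.cluster.local:8000/v1   # hypothetical in-cluster endpoint
      api_key: dummy                              # placeholder; vLLM does not require a key by default
```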

On-Premises + Public LLM

Jitera runs on user-owned hardware, but connects to an external LLM provider over the internet.
  • Network Isolation — Medium: Infrastructure is completely isolated, but LLM API calls traverse the public internet.
  • Data Management Control — Medium: Data usage and retention policies depend on the LLM provider. On-premises data is transmitted externally.
  • Infrastructure Difficulty — High: You must build the Kubernetes cluster from scratch (HA control planes, etcd backups, load balancers, storage classes). Requires hardware procurement, installation, and physical maintenance.
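To make the isolation boundary concrete, cluster egress can be narrowed so that only the LLM-proxy pods may reach the internet, and only over HTTPS. A sketch using a standard Kubernetes NetworkPolicy; the namespace and pod labels are hypothetical, and NetworkPolicy filters by CIDR rather than domain, so port 443 is allowed broadly here:

```yaml
# Sketch: restrict the LLM-proxy pods to outbound HTTPS plus DNS.
# With policyTypes Egress, all other outbound traffic from these pods is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: litellm-egress-https-only
  namespace: jitera              # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: litellm               # hypothetical pod label
  policyTypes: ["Egress"]
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443              # HTTPS to the LLM provider
    - ports:
        - protocol: UDP
          port: 53               # DNS resolution
```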

On-Premises + Private LLM

The most isolated deployment model. Both Jitera and the LLM run entirely on user-owned hardware within the organization’s network.
  • Network Isolation — High: Access to the LLM occurs entirely within the organization’s closed network.
  • Data Management Control — High: Both the LLM execution environment and all data are managed within the organization without external exposure.
  • Infrastructure Difficulty — Very High: In addition to on-premises Kubernetes management, Kubernetes upgrades require rigorous testing against NVIDIA drivers and the Container Toolkit. GPU servers are expensive, have long lead times, and consume significant power, often necessitating facility upgrades (power, cooling).
See AI Configuration — vLLM for GPU requirements.
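As a small illustration of the extra moving parts, a GPU inference workload must tolerate GPU node taints and request the nvidia.com/gpu resource exposed by the NVIDIA device plugin. The names and image below are hypothetical choices for the sketch:

```yaml
# Sketch: schedule a vLLM pod onto a dedicated, tainted GPU node pool.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm                       # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels: { app: vllm }
  template:
    metadata:
      labels: { app: vllm }
    spec:
      tolerations:
        - key: nvidia.com/gpu      # GPU nodes are commonly tainted with this key
          operator: Exists
          effect: NoSchedule
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args: ["--model", "Qwen/Qwen2.5-7B-Instruct"]
          resources:
            limits:
              nvidia.com/gpu: 1    # exposed by the NVIDIA device plugin
```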

Application Services

| Component | Description | Technology |
| --- | --- | --- |
| Frontend | User-facing web application | React, TypeScript |
| Frontend Core | AI chat, documentation, knowledge graph | React 19, TypeScript |
| SWEF | Code generation interface | React, TypeScript |
| Automation | Business logic, API endpoints, background jobs | Ruby on Rails, GraphQL |
| Ultron | AI chat completions, code generation | NestJS, LangChain |
| Boost | LLM routing, workflow orchestration | Python, FastAPI |
| LiteLLM | LLM provider abstraction | Python |

Infrastructure Services

| Component | Description |
| --- | --- |
| Kong Gateway | API gateway, SSL termination, request routing |
| Document Converter | File format conversion |
| HTML Conversion | HTML document processing |
| Playwright | Browser automation for testing and screenshots |
| Hasura | Real-time GraphQL API for database access |
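To illustrate the gateway's role, here is a Kong declarative configuration sketch routing two paths to in-cluster services; the upstream names and paths are hypothetical, not Jitera's actual routing table:

```yaml
# Sketch: Kong declarative config mapping URL paths to backend services.
_format_version: "3.0"
services:
  - name: automation
    url: http://automation.jitera.svc.cluster.local:3000   # hypothetical upstream
    routes:
      - name: automation-api
        paths: ["/api"]
  - name: frontend
    url: http://frontend.jitera.svc.cluster.local:80       # hypothetical upstream
    routes:
      - name: frontend-root
        paths: ["/"]
```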

Data Stores

All data stores are bundled in the Helm chart and deployed in-cluster by default. For production high-availability deployments, consider externalizing them to managed services, as sketched after the table below.
| Service | Purpose |
| --- | --- |
| PostgreSQL | Primary relational database (users, projects, permissions) |
| PGVector | Vector similarity search for AI embeddings |
| MongoDB | Document storage (generated code, design specs) |
| Redis | Caching, sessions, job queues |
| RabbitMQ | Asynchronous message processing |
See External Services for supported versions and externalization options.
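As a hedged example of externalization, a Helm values override typically disables a bundled store and supplies a managed endpoint instead. The keys below are hypothetical placeholders; the actual chart values are covered in External Services:

```yaml
# Sketch: values override replacing the in-cluster PostgreSQL with a
# managed database. All keys shown are hypothetical placeholders.
postgresql:
  enabled: false                            # skip the bundled in-cluster instance
externalDatabase:
  host: mydb.example.rds.amazonaws.com      # hypothetical managed endpoint
  port: 5432
  existingSecret: jitera-db-credentials     # credentials supplied via a Kubernetes Secret
```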

Sizing and Scaling

For resource sizing by deployment scale (small, medium, large), see the Sizing Guide. For scaling strategies and auto-scaling configuration, see the Scaling Guide. For mandatory and optional deployment requirements, see Requirements.
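For orientation only, auto-scaling in Kubernetes is commonly expressed as a HorizontalPodAutoscaler. Below is a generic sketch against a hypothetical Deployment; the Scaling Guide documents Jitera's actual recommendations:

```yaml
# Sketch: scale a stateless service between 2 and 10 replicas on CPU load.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ultron-hpa                 # hypothetical target
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ultron
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas above 70% average CPU
```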
