Generative AI that runs inside your network.
Three components — AI Server, AI Client, and AI Admin Console — deployed entirely inside your environment. Your prompts, your documents, your model outputs, and your audit logs never leave the perimeter. Built for organisations in compliance-heavy industries that cannot use cloud LLM providers.
On-premises end to end
AI Server hosts the generative model on hardware you control (on-prem GPU, private cloud, or air-gapped lab). AI Client runs on each workstation and connects to AI Server over your network. AI Admin Console gives IT a single pane of glass for licences, enrollments, policy, and audit. No component ever needs to reach the public internet at run-time.
Cloud AI vs. Local AI Suite
| Concern | Public cloud LLM | Local AI Suite |
|---|---|---|
| Where your prompts go | Out to the vendor's API. | To AI Server on your network. |
| Where your documents go | Uploaded to the vendor (or to a vector DB they manage). | Stay on your storage; retrieval runs locally. |
| Data residency | Wherever the vendor's region runs — re-check on every contract. | Wherever you put the server. Period. |
| Per-token cost | Per call, scales with usage. | Capex on hardware; usage is free at the margin. |
| Air-gapped or offline use | Not possible. | Supported by default. |
| Audit & compliance evidence | You depend on vendor reports. | Logs and policy live in your IT environment. |
Three pieces, one suite
AI Server
Hosts the generative model on your hardware. Supports OpenAI GPT-OSS 20B and 120B and other open-weight models. Exposes an OpenAI-compatible API to clients on your network.
Learn more →AI Client
Desktop app that talks to AI Server from each workstation. Chat, document Q&A, drafting, summarisation — all served by the model running inside your network.
Learn more →AI Admin Console
For IT administrators. Manage members, license assignments, AI Server enrollments, policy, and audit logs. Sign-in via Microsoft Entra ID or Google Workspace.
Learn more →Built for regulated environments
Where it runs
- On-prem GPU servers (Linux or Windows).
- Your private cloud tenant — AWS, Azure, GCP, or any other.
- Air-gapped or offline labs (no run-time internet required).
Why customers pick it
- Data residency: data physically stays where you put the server.
- Audit: logs live in your IT environment, not the vendor's.
- Cost: predictable hardware capex instead of per-token billing.
- Custom models: you can run open-weight models we have not even shipped yet.
Talk through your deployment
Start with a free 1-week pilot sprint. We will walk through your hardware, security, and compliance constraints during the scoping call, then ship a working deliverable by Friday.