Tabby

Tabby launched in 2023 as a self-hosted alternative to GitHub Copilot, addressing the core concern that enterprise teams have about sending proprietary code to external APIs. The entire inference pipeline runs on the organisation’s own infrastructure — a Docker container with a Rust-built server, requiring no external database or cloud dependency.

Key capabilities

Self-hosted by default — Tabby runs as a Docker image or standalone binary on the team’s own hardware or private cloud. All code, completions, and chat messages stay within the organisation’s network perimeter.

Repository-level RAG context — Tabby builds a code understanding layer from the team’s own repositories, giving completions and answers that reflect the team’s actual patterns, frameworks, and naming conventions rather than generic training data.

Team Answer Engine — A shared Q&A layer that lets team members ask questions about the codebase and receive answers grounded in the team’s actual code, with past questions and answers kept in a searchable history.

Pochi agentic layer — TabbyML’s Pochi project (github.com/TabbyML/pochi) adds a full agentic coding loop on top of Tabby’s platform, with autonomous task execution in isolated git worktrees and parallel agent support.
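To make the repository-level context idea concrete, here is a minimal sketch of retrieval-augmented prompting. It is not Tabby’s implementation: the token-overlap scoring, the function names (`tokenize`, `retrieve`, `build_prompt`), and the sample file paths are all assumptions chosen for illustration. A production system would more likely use a proper code index or embeddings.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Count lowercase identifier-like tokens in a piece of source text."""
    return Counter(t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))


def overlap_score(query: Counter, snippet: Counter) -> int:
    """Number of tokens shared by query and snippet, counted with multiplicity."""
    return sum((query & snippet).values())


def retrieve(prefix: str, snippets: dict[str, str], k: int = 2) -> list[str]:
    """Return up to k snippet paths that share at least one token with the prefix."""
    query = tokenize(prefix)
    scored = [(overlap_score(query, tokenize(body)), path)
              for path, body in snippets.items()]
    # Highest score first; ties are broken by path so the order is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for score, path in scored[:k] if score > 0]


def build_prompt(prefix: str, snippets: dict[str, str], k: int = 2) -> str:
    """Prepend the retrieved repository snippets to the text being completed."""
    parts = [f"# From {path}\n{snippets[path]}" for path in retrieve(prefix, snippets, k)]
    parts.append(prefix)
    return "\n\n".join(parts)


# Hypothetical repository contents, keyed by file path.
REPO = {
    "billing/invoice.py": "def compute_invoice_total(items): return sum(i.price for i in items)",
    "auth/login.py": "def login_user(name, password): ...",
}

if __name__ == "__main__":
    print(build_prompt("total = compute_invoice_total(items", REPO))
```

The point of the sketch is the shape of the pipeline: the text under the cursor acts as a query, matching snippets from the team’s own code are selected, and the model sees them ahead of the prefix, so a suggestion can reuse helpers that already exist in the repository.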

Autonomy level

Level 2 — Assisted. Tabby’s core product is code completion and chat with codebase context. The Pochi agentic layer extends this to Level 3-4 autonomous task execution, but is a separate product.
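As a rough illustration of what assisted completion looks like on the wire, the sketch below builds a completion request for a self-hosted server and reads back the first suggestion. The endpoint path `/v1/completions`, the `language`/`segments` payload shape, the bearer-token header, and the `choices[].text` response field are assumptions that should be checked against the API documentation served by your own Tabby instance; the host name `tabby.internal` is hypothetical.

```python
import json
import urllib.request


def build_completion_request(base_url, token, language, prefix, suffix=""):
    """Build (but do not send) a POST request asking for a code completion.

    The URL path and payload layout are assumptions, not a confirmed API contract.
    """
    payload = {
        "language": language,
        "segments": {"prefix": prefix, "suffix": suffix},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def first_suggestion(response_body: bytes) -> str:
    """Extract the first suggested text from a JSON response, or '' if none."""
    data = json.loads(response_body)
    choices = data.get("choices", [])
    return choices[0]["text"] if choices else ""


if __name__ == "__main__":
    # Hypothetical internal host; the request never leaves the private network.
    req = build_completion_request(
        "http://tabby.internal:8080", "example-token", "python", "def add(a, b):\n    "
    )
    print(req.get_method(), req.full_url)
```

Sending the request is a single call such as `urllib.request.urlopen(req, timeout=5)`, with the response body passed to `first_suggestion`. The developer still reviews and accepts each suggestion, which is what keeps the core product at the assisted level.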

Strengths

Complete data sovereignty — all inference runs in your own environment
Apache 2.0 licence; about 33,600 GitHub stars suggest strong adoption
No external cloud dependency or monthly per-seat fees
Supports local model backends (Ollama, llama.cpp) for fully air-gapped deployments
VS Code and JetBrains plugins; enterprise features available

Limitations

Requires infrastructure setup and maintenance (Docker, GPU for best performance)
Self-hosting shifts operational burden to the team
Core product is autocomplete and chat; agentic capabilities require Pochi separately
Completion quality depends on hardware available for inference
Without a GPU server, the practical choice of models is smaller than what cloud-based tools offer

Sources