Dev Tools / mini-SWE-agent

mini-SWE-agent

by Princeton NLP / Stanford

cli active free

A radically minimal autonomous software engineering agent — approximately 100 lines of core Python — that scores above 74% on SWE-bench Verified using only bash commands in a sandboxed environment.

mini-SWE-agent is the lightweight successor to SWE-agent, built by the same Princeton NLP and Stanford team to demonstrate that near-state-of-the-art autonomous software engineering performance can be achieved with roughly 100 lines of core Python and a plain bash interface. It installs in seconds via pip or uvx, works on all major operating systems, and uses LiteLLM to route requests to any model provider. The SWE-agent team explicitly recommends mini-SWE-agent for most practical use cases over the original full-featured agent.

Key capabilities

Minimal design with high benchmark performance — The entire core agent fits in approximately 100 lines of Python, yet achieves above 74% resolution rate on SWE-bench Verified, outperforming many more complex systems by relying on a clean bash loop rather than a custom interaction layer.

Universal LLM support via LiteLLM — By routing all model calls through LiteLLM, mini-SWE-agent works with Claude, GPT-4o, Gemini, DeepSeek, locally hosted Llama models, and any other provider LiteLLM supports, with a single configuration change to switch between them.

Sandboxed execution — The agent runs its bash commands inside a sandboxed environment, reducing the risk of unintended file system changes on the host machine and making it safer to use in automated pipelines than tools that operate directly on the host.

Trivial installation and cross-platform support — A single pip install mini-swe-agent or uvx mini-swe-agent command is all that is needed to get started, and the tool runs natively on Windows, macOS, and Linux without requiring WSL or containerisation.

Autonomy level

mini-SWE-agent operates at autonomy level 5. It receives a task description or GitHub issue, then plans and executes file edits, bash commands, and git operations autonomously inside its sandbox until the task is resolved or the budget is exhausted. No human checkpoints are built into the default loop, and version 2.4.1 (released June 11, 2026) further refined the tool’s reliability in unattended operation.

Strengths

  • Recommended by the SWE-agent authors as the default choice for most users
  • Installs with a single pip or uvx command; no complex configuration required
  • Cross-platform: works natively on Windows, macOS, and Linux
  • Achieves above 74% on SWE-bench Verified despite the minimal codebase
  • LiteLLM integration means any model provider can be used, including fully local models at zero API cost
  • Sandboxed execution reduces risk compared to agents that run directly on the host

Limitations

  • Primarily designed for isolated, self-contained task resolution rather than ongoing interactive development
  • Sandboxed environment can complicate tasks that require integrating with external services or databases on the host
  • Minimal design means fewer configuration knobs for teams that need custom behaviour
  • Best results require a well-specified task description; vague inputs produce inconsistent outcomes
  • Like all autonomous agents, it may attempt plausible but incorrect fixes that pass surface-level checks without deeper understanding of the codebase

Sources

Last verified June 12, 2026