Web dashboard — real-time task monitoring, cost tracking, agent status
Homepage | Documentation | Getting Started | Known Limitations
If you're running one agent at a time, you're leaving performance on the table. Bernstein takes a goal, breaks it into tasks, assigns them to AI coding agents running in parallel, verifies the output, and commits the results. You come back to working code, passing tests, and a clean git history.
No framework to learn. No vendor lock-in. Works with Claude Code, Codex, Gemini CLI, Cursor, Aider, Amp, Roo Code, Goose, Qwen, and any CLI tool that accepts a prompt flag.
Think of it as what Kubernetes did for containers, but for AI coding agents. You declare a goal. The control plane decomposes it into tasks. Short-lived agents execute them in isolated git worktrees -- like pods. A janitor verifies the output before anything lands.
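The isolation model is ordinary git worktrees. A self-contained sketch of what that looks like with plain git (these are standard git commands, not Bernstein's internal ones):

```shell
# Each agent gets its own worktree and branch; main never sees work-in-progress.
base=$(mktemp -d)
git init -q -b main "$base/repo"
cd "$base/repo"
git -c user.email=ci@example.com -c user.name=ci commit -q --allow-empty -m "init"

# One worktree (own branch, own checkout) per agent role:
git worktree add -q "$base/agent-backend" -b agent/backend
git worktree add -q "$base/agent-qa" -b agent/qa

git worktree list   # main checkout plus two agent worktrees
```

Each agent commits on its own branch; only janitor-verified work is merged back.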
pip install bernstein # any platform
# or
pipx install bernstein # isolated install
# or
uv tool install bernstein # fastest (Rust-based)
# or
brew tap chernistry/bernstein && brew install bernstein # macOS / Linux
# or
sudo dnf copr enable alexchernysh/bernstein && sudo dnf install bernstein # Fedora / RHEL
# or
npx bernstein-orchestrator # npm wrapper (requires Python 3.12+)
# Run:
bernstein -g "Add JWT auth with refresh tokens, tests, and API docs"

1.78× faster than single-agent execution, verified on internal benchmarks. See benchmarks for methodology and reproduction steps.
Bernstein is a deterministic orchestrator for CLI coding agents. It schedules tasks in parallel across any installed agent — Claude Code, Codex, Cursor, Gemini, Aider, and more — with git worktree isolation, janitor-verified output, and file-based state you can inspect, back up, and recover from. No vendor lock-in. No framework to learn. Your agents, your models, your backlog.
# 1. Install (pick one — full list in the install block above)
pipx install bernstein
# 2. Init your project (creates .sdd/ workspace + bernstein.yaml)
cd your-project
bernstein init
# 3. Run — pass a goal inline or let bernstein.yaml guide the run
bernstein -g "Add rate limiting and improve test coverage"
# 4. Stop when done
bernstein stop

That's it. Your agents spawn, work in parallel, verify their output, and exit. Watch progress in the terminal dashboard.
Bernstein ships with adapters for 12 CLI agents. If you have any of these installed, Bernstein uses them — no API key plumbing required:
| Agent | Models | Install |
|---|---|---|
| Aider | Any OpenAI/Anthropic-compatible model | pip install aider-chat |
| Amp | opus 4.6, gpt-5.4 | brew install amp |
| Claude Code | opus 4.6, sonnet 4.6, haiku 4.5 | npm install -g @anthropic-ai/claude-code |
| Codex CLI | gpt-5.4, o3, o4-mini | npm install -g @openai/codex |
| Cursor | sonnet 4.6, opus 4.6, gpt-5.4 | Cursor app (sign in via app) |
| Gemini CLI | gemini-3-pro, 3-flash | npm install -g @google/gemini-cli |
| Goose | Any provider | Install Goose CLI |
| Kilo | Configurable | npm install -g kilo |
| Kiro | Multi-provider | Install Kiro CLI |
| OpenCode | Multi-provider | Install OpenCode CLI |
| Qwen | qwen3-coder, qwen-max | npm install -g qwen-code |
| Roo Code | opus 4.6, sonnet 4.6, gpt-4o | VS Code extension (headless CLI) |
Prefer a different agent? Bring your own -- the generic adapter accepts any CLI tool with a --prompt-flag interface. Mix models in the same run: cheap free-tier agents for boilerplate, heavy models for architecture.
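A hedged sketch of what a generic-adapter entry might look like. The key names below (`adapters`, `command`, `prompt_flag`, `roles`) are illustrative assumptions, not the documented schema — check the Bernstein docs for the real field names:

```yaml
# bernstein.yaml (illustrative shape only)
adapters:
  - name: my-agent
    command: my-agent-cli
    prompt_flag: --prompt        # any CLI that accepts its prompt via a flag
    roles: [docs, boilerplate]   # route cheap agents to low-stakes work
```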
Tip
Run bernstein --headless for CI pipelines -- no TUI, structured JSON output, non-zero exit on failure.
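For example, a minimal GitHub Actions job might look like this. The workflow shape is a sketch; only `pipx install bernstein`, `--headless`, and `-g` come from this README:

```yaml
# .github/workflows/bernstein.yml (hedged sketch)
jobs:
  orchestrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pipx install bernstein
      # Non-zero exit on failure fails the job; output is structured JSON.
      - run: bernstein --headless -g "Fix failing tests"
```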
Only capabilities that ship with v1.4.11. Full matrix at FEATURE_MATRIX.md.
- Deterministic scheduling — zero LLM tokens on coordination. The orchestrator is plain Python.
- Parallel execution — spawn multiple agents across roles (backend, qa, docs, security) simultaneously.
- Git worktree isolation — every agent works in its own branch. Your main branch stays clean.
- Janitor verification — concrete signals (tests pass, files exist, no regressions) before anything lands.
- Quality gates — lint, type-check, PII scan, and mutation testing run automatically after completion.
- Plan files — multi-stage YAML with stages and steps, like Ansible playbooks (`bernstein run plan.yaml`).
- Cost tracking — per-model spend, tokens, and duration (`bernstein cost`).
- Live dashboards — terminal TUI (`bernstein live`) and browser UI (`bernstein dashboard`).
- Self-evolution — analyze metrics, propose improvements, sandbox-test, and auto-apply what passes (`--evolve`).
- CI autofix — parse failing CI logs, create fix tasks, route to the right agent (`bernstein ci fix <url>`).
- Circuit breaker — halt agents that repeatedly violate purpose or crash.
- Token growth monitor — detect runaway token consumption and intervene automatically.
- Cross-model verification — route completed task diffs to a different model for review.
- Audit trail — HMAC-chained tamper-evident logs with Merkle seal verification.
- Pluggy plugin system — hook into any lifecycle event.
- Multi-repo workspaces — orchestrate across multiple git repositories as one workspace.
- Cluster mode — central server + remote worker nodes for distributed execution.
- MCP server mode — run Bernstein as an MCP tool server for other agents.
- 12 agent adapters — Claude, Codex, Cursor, Gemini, Aider, Amp, Roo Code, Kiro, Kilo, OpenCode, Qwen, Goose, plus a generic catch-all.
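The audit-trail idea above can be illustrated with stdlib crypto. This is a toy sketch of HMAC chaining, not Bernstein's implementation (its key management, daily rotation, and Merkle sealing are more involved):

```python
import hmac, hashlib, json

KEY = b"audit-secret"  # hypothetical key for illustration only

def append(log, event):
    """Chain each entry to the previous MAC: editing any record breaks every later MAC."""
    prev = log[-1]["mac"] if log else "genesis"
    payload = json.dumps(event, sort_keys=True) + prev
    mac = hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()
    log.append({"event": event, "mac": mac})
    return log

def verify(log):
    """Recompute the chain; any tampered entry (or reordering) fails verification."""
    prev = "genesis"
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True) + prev
        expected = hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev = entry["mac"]
    return True

log = []
append(log, {"task": "t1", "status": "done"})
append(log, {"task": "t2", "status": "done"})
print(verify(log))   # True
log[0]["event"]["status"] = "failed"
print(verify(log))   # False, tampering detected
```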
All methods install the same bernstein CLI.
| Method | Command |
|---|---|
| pip | pip install bernstein |
| pipx | pipx install bernstein |
| uv | uv tool install bernstein |
| Homebrew | brew tap chernistry/bernstein && brew install bernstein |
| Fedora / RHEL | sudo dnf copr enable alexchernysh/bernstein && sudo dnf install bernstein |
| npm (thin wrapper) | npx bernstein-orchestrator or npm i -g bernstein-orchestrator |
The npm wrapper requires Python 3.12+ on the system -- it delegates to pipx/uvx/python under the hood.
COPR targets: Fedora 41, 42 (x86_64, aarch64), EPEL 9, 10.
| Editor | Install |
|---|---|
| VS Code | code --install-extension alex-chernysh.bernstein or search "Bernstein" in Extensions |
| Cursor | Search "Bernstein" in Extensions, or install from Open VSX |
| Cursor (skills) | 8 built-in skills in packages/cursor-plugin/ |
bernstein live # interactive TUI dashboard (3 columns)
bernstein dashboard # open web dashboard in browser
bernstein status # task summary and agent health
bernstein ps # running agent processes
bernstein cost # spend breakdown by model and task
bernstein doctor # pre-flight: adapters, API keys, ports
bernstein recap # post-run: tasks, pass/fail, cost
bernstein retro # detailed retrospective report
bernstein trace <ID> # step-by-step agent decision trace
bernstein logs -f # tail live agent output

Agents appear in Activity Monitor / ps as `bernstein: <role> [<session>]` — no more hunting for mystery Python processes.
Hidden aliases kept for backward compatibility — not shown in --help:
| Alias | Canonical command | Notes |
|---|---|---|
| overture | init | Init workspace (legacy name) |
| downbeat | start | Start server and spawn manager (legacy name) |
| score | status | Task summary and agent health (legacy name) |
For multi-stage projects, define stages and steps in a YAML plan file:
bernstein run plan.yaml

The plan skips manager decomposition and goes straight to execution. See templates/plan.yaml for the format and examples/plans/flask-api.yaml for a working example.
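A hedged sketch of the plan shape. "Stages" and "steps" are named in this README; the field names inside each step (`role`, `goal`) are assumptions — see templates/plan.yaml for the actual schema:

```yaml
# plan.yaml (illustrative shape only)
stages:
  - name: build
    steps:
      - role: backend
        goal: "Add rate limiting middleware"
      - role: qa
        goal: "Cover the middleware with tests"
  - name: docs
    steps:
      - role: docs
        goal: "Document the new endpoints"
```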
Prometheus metrics at /metrics — wire up Grafana, set alerts, monitor cost. OTLP telemetry initialization supports distributed tracing.
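A minimal Prometheus scrape fragment for that endpoint. The target host and port are placeholders; point them at wherever your Bernstein dashboard is listening:

```yaml
# prometheus.yml fragment (only /metrics comes from this README)
scrape_configs:
  - job_name: bernstein
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:8080"]  # hypothetical host:port
```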
Pluggy-based plugin system. Hook into any lifecycle event:
from bernstein.plugins import hookimpl
class SlackNotifier:
    @hookimpl
    def on_task_completed(self, task_id, role, result_summary):
        slack.post(f"#{role} finished {task_id}: {result_summary}")

Install a GitHub App on your repository to automatically convert GitHub events into Bernstein tasks. Issues become backlog items, PR review comments become fix tasks, and pushes trigger QA verification.
bernstein github setup # print setup instructions
bernstein github test-webhook # verify configuration

Hire specialist agents from Agency (100+ agents) or define your own:
# bernstein.yaml
catalogs:
  - name: agency
    type: agency
    enabled: true

The spawner matches the best agent for each role using keyword-based role inference and affinity scoring.
| | Bernstein | Paperclip | CrewAI | AutoGen | LangGraph |
|---|---|---|---|---|---|
| Orchestrator type | Deterministic code | LLM-driven | LLM-driven | LLM-driven | Graph + LLM |
| Agent model | Any CLI agent | OpenClaw, Claude, Codex | Python classes | Python agents | Nodes + edges |
| Parallel execution | Native | Task-based | Sequential | Async | Graph-based |
| Git isolation | Worktrees | None | None | None | None |
| Verification | Janitor + quality gates | None built-in | None built-in | None built-in | Conditional edges |
| Cost tracking | Built-in | Built-in | Manual | Manual | Manual |
| State persistence | File-based (.sdd/) | Database | In-memory | In-memory | Checkpointer |
| Self-evolution | Built-in | No | No | No | No |
| Plan files | YAML stages + steps | No | Python code | Python code | Python code |
| Org/business modeling | No | Yes (org charts, goals) | No | No | No |
Full comparison pages — detailed feature matrices, benchmark data, and "when to use X instead" guides for Paperclip, GitHub Agent HQ, Conductor, Crystal, Stoneforge, and single-agent workflows.
Built during a 47-hour sprint: 12 AI agents on a single laptop, 737 tickets closed (15.7/hour), 826 commits. Full write-up. Every design decision here is a direct response to those findings.
Bernstein's roadmap is public. Near-term work focuses on adoption and the governance moat; longer-term work on enterprise standards and distribution.
| Area | What | Status |
|---|---|---|
| Governance | Lifecycle governance kernel — guarded state transitions, typed events | Done |
| Governance | Governed workflow mode — deterministic phases, hashable definitions | Done |
| Governance | Model routing policy — provider allow/deny lists | Done |
| Governance | Immutable HMAC-chained audit log — tamper-evident, daily rotation | Done |
| Governance | Execution WAL — hash-chained write-ahead log, crash recovery, determinism fingerprinting | Done |
| Adoption | CI autofix pipeline — `bernstein ci fix <url>` and `bernstein ci watch` | Done |
| Adoption | Comparative benchmark suite — orchestrated vs. single-agent proof | Done |
| Adoption | Agent run manifest — hashable workflow spec for SOC2 evidence | Done |
| Adoption | `bernstein demo` — zero-config first-run experience | Done |
| Adoption | `bernstein doctor` — pre-flight health check | Done |
| Area | What | Target |
|---|---|---|
| Enterprise | SSO/SAML/OIDC auth for multi-tenant deployments | H2 2026 |
| Governance | Time-based model policy constraints ("deny expensive providers during peak hours") | H2 2026 |
| Adoption | Verified SWE-Bench eval publication | In progress |
| Area | What | Target |
|---|---|---|
| Enterprise | Dynamic policy hot-reload without restart | 2026 |
| Adoption | JetBrains IDE extension | 2026 |
| Governance | Task-specific model constraints ("role=security must use opus-only") | 2026 |
Bernstein is free and open-source. If it saves you time, consider sponsoring:
All sponsorship proceeds fund development, infrastructure, and open-source sustainability.
PRs welcome. See CONTRIBUTING.md for setup, testing, and code style. Open an issue for bugs and feature requests.
Don't babysit agents. Set a goal, walk away, come back to working code.
What will your agents build first?

