Merged
2 changes: 1 addition & 1 deletion src/content/docs/becoming-productive/going-10x.mdx
Original file line number Diff line number Diff line change
@@ -14,7 +14,7 @@ Agentic engineering is like multicore CPUs. Agents aren’t always faster than h
- A Git worktree is an additional working directory attached to the same repository, so you can have multiple branches checked out side by side without cloning the repo again.
- For this to be efficient, your app needs to be easily bootable with a blank or seeded state from a fresh Git checkout.
- Raw Git worktrees can feel annoying when you want to quickly jump into an agent's changes, switch editor context, or test uncommitted work on a simulator. There is also a Git limitation: each worktree of a single checkout must operate on a different `HEAD` (branch/commit/ref).
-- Some harnesses offer built-in worktree management, such as <ExternalLink href="https://docs.conductor.build/guides/spotlight-testing">Conductor Spotlight Testing</ExternalLink> and <ExternalLink href="https://developers.openai.com/codex/app/worktrees/#working-between-local-and-worktree">Codex Handoff</ExternalLink>, which reduce a lot of the plumbing involved in moving between an isolated agent workspace and the place where you actually run or test the app.
+- Some coding agents offer built-in worktree management, such as <ExternalLink href="https://docs.conductor.build/guides/spotlight-testing">Conductor Spotlight Testing</ExternalLink> and <ExternalLink href="https://developers.openai.com/codex/app/worktrees/#working-between-local-and-worktree">Codex Handoff</ExternalLink>, which reduce a lot of the plumbing involved in moving between an isolated agent workspace and the place where you actually run or test the app.
- Models easily handle prompts such as `do this in a /tmp worktree`.
2. You can also try keeping several separate full Git checkouts.
3. Or, you can go YOLO and actually have multiple agents operate on a single codebase.
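The worktree setup these options describe can be sketched end to end. A minimal sketch, assuming `git` is installed and using throwaway temp paths (all names here are illustrative):

```python
# Sketch: spin up a throwaway repo, then attach a second working directory
# to the same repository on its own branch, mirroring `git worktree add`.
import os
import subprocess
import tempfile

def git(*args, cwd):
    # Identity flags are only so `commit` works in a clean environment.
    subprocess.run(
        ["git", "-c", "user.name=a", "-c", "user.email=a@example.com", *args],
        cwd=cwd, check=True, capture_output=True,
    )

repo = tempfile.mkdtemp()
git("init", cwd=repo)
git("commit", "--allow-empty", "-m", "init", cwd=repo)

# One extra working directory, same repository, different branch:
agent_dir = repo + "-agent"
git("worktree", "add", agent_dir, "-b", "agent-task", cwd=repo)

# A linked worktree has a `.git` *file* pointing back at the main checkout.
print(os.path.exists(os.path.join(agent_dir, ".git")))
```

An agent can now work in `agent_dir` while your main checkout stays on its own branch, which is exactly the `HEAD`-per-worktree constraint noted above.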
8 changes: 4 additions & 4 deletions src/content/docs/becoming-productive/harness-engineering.mdx
@@ -9,10 +9,10 @@ description: >-
import ExternalLink from "../../../components/ExternalLink.astro";

Harness engineering is the practice of improving a coding agent's output
-quality and reliability by shaping its harness, not just the prompt you type
-into the chat.
-Instead of treating the agent as a black box, you shape the configuration
-around the model, including instructions, tools, context, and integrations, so
+quality and reliability by shaping the software around it, not just the prompt
+you type into the chat.
+Instead of treating the coding agent as a black box, you shape its harness:
+the instructions, tools, context, hooks, and integrations around the model, so
that correct behavior becomes easier and more repeatable.

In practice, that means shaping the scaffolding around the model so the right
@@ -153,7 +153,7 @@ Like augmented manual QA: the agent scripts the scenario, and the human does the
### Just talk to it

If typing is tiring, you can use your voice instead.
-Some agent harnesses have built-in voice dictation features.
+Some coding agents have built-in voice dictation features.
Similar things are also provided by accessibility features of operating systems,
like <ExternalLink href="https://support.apple.com/guide/mac-help/mh40584/mac">Dictation</ExternalLink> on macOS,
or <ExternalLink href="https://support.microsoft.com/en-us/windows/use-voice-typing-to-talk-instead-of-type-on-your-pc-fec94565-c4bd-329d-e59a-af033fa5689f">Voice Typing</ExternalLink> on Windows,
12 changes: 6 additions & 6 deletions src/content/docs/becoming-productive/the-workflow.mdx
@@ -44,7 +44,7 @@ It also knows nothing about your private codebase until you put that information
### Agents cannot learn (yet)

Models do not get better at your project just because you told them yesterday that a task should be done in a better way.
-What looks like memory is usually the harness replaying prior context, which means preferences disappear when the session resets or earlier instructions fade out.
+What looks like memory is usually the coding agent replaying prior context, which means preferences disappear when the session resets or earlier instructions fade out.
That means the same mistake can reappear, so durable improvement has to live in prompts, docs, lints, tests, and tooling rather than in the model itself.

### Context rot
@@ -90,15 +90,15 @@ Depending on your intent, some phases can be merged, shortened, or skipped.
Or try _plan_ mode and work out a solid implementation outline with an agent.
Talk to it and refine the plan until it’s&nbsp;👌.
2. **Execute** - when your agent (not you!) knows how the task should be done, tell it to do it!
-3. **Agent review** - many harnesses have built-in auto-review features.
+3. **Agent review** - many coding agents have built-in auto-review features.
Try using them in the background so the agent spends time finding all the stupid mistakes,
not you (you should be doing more valuable work in the meantime).
We will discuss this in more detail [later on](../going-10x/#automated-code-reviews).
4. **Human review** - ultimately, you (a human) are responsible for the code.
Invest some time in reviewing it so that (a) you know what's happening and (b) you won't waste reviewers' time.

:::tip
-- Most GUI or in-IDE harnesses provide rich diffs computed from model changes.
+- Most GUI or in-IDE coding agents provide rich diffs computed from model changes.
- You can use the Git staging area to mark already reviewed changes.
:::

@@ -109,7 +109,7 @@

:::tip
- If you use Claude Code, try the `/insights` command.
-- Use your harness's memory feature (e.g., `/memory` in Claude Code, Memories in Cursor) to persist lessons learned across sessions.
+- Use your coding agent's memory feature (e.g., `/memory` in Claude Code, Memories in Cursor) to persist lessons learned across sessions.
- Try post-mortem diffs: ask the agent to compare its first attempt vs the final version and explain what it got wrong. Great for spotting recurring antipatterns.
:::

@@ -123,11 +123,11 @@ Yes.
Try to think of threads as composable entities.
You can, and should, split work into multiple threads.

-For example, most harnesses assign unique IDs to threads.
+For example, most coding agents assign unique IDs to threads.
You can use these IDs to resume past threads or to fork them or hand them off into new threads.
Such forks can even be run in parallel.
Some tools allow you to refer to past threads directly (for example `@Past Chat` in Cursor or `@@` in Amp).

-If your harness of choice does not provide such niceties, you can always fall back to sharing context through plain Markdown files.
+If your coding agent of choice does not provide such niceties, you can always fall back to sharing context through plain Markdown files.
For example, you can run several brainstorming sessions, each on a different topic, and produce summary `*.md` files as outcomes.
Then, in the execution thread, you _join_ all of that information when priming the context.
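The plain-Markdown fallback in this hunk can be sketched in a few lines; the directory layout and file names are illustrative assumptions:

```python
# Sketch: join per-topic brainstorm summaries into one context-priming blob
# that an execution thread can ingest at the start.
import tempfile
from pathlib import Path

def prime_context(summary_dir):
    """Concatenate every *.md summary, each under its own heading."""
    parts = [f"## {md.name}\n{md.read_text()}"
             for md in sorted(Path(summary_dir).glob("*.md"))]
    return "\n\n".join(parts)

# Toy demo: two brainstorm outcomes feed one execution thread.
d = tempfile.mkdtemp()
Path(d, "api-design.md").write_text("Use cursor-based pagination.")
Path(d, "storage.md").write_text("Start with SQLite.")
print(prime_context(d))
```

The _join_ here is deliberately dumb string concatenation: the point is that thread composition needs no special tooling, just files the agent can read.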
3 changes: 2 additions & 1 deletion src/content/docs/expanding-horizons/model-pricing.mdx
@@ -29,7 +29,8 @@ If you're integrating directly with an LLM API, lowering cost per request/sessio
- Cap output length when possible (`max_tokens` / equivalent).
- Keep threads compact. Good cache hit rates help, but each turn still adds some uncached tail tokens, and cache entries can expire/prune over long sessions.

-If you're using an agent harness, many of these optimizations are handled internally (prompt layout, caching, compaction). Your main cost levers are usually model choice and keeping tasks/threads scoped.
+If you're using a coding agent, many of these optimizations are handled internally (prompt layout, caching, compaction).
+Your main cost levers are usually model choice and keeping tasks/threads scoped.
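The cost levers above can be made concrete with back-of-the-envelope arithmetic. The per-million-token prices below are illustrative assumptions, not any provider's real price list:

```python
# Back-of-the-envelope session cost with discounted cached input tokens.
def session_cost(input_tokens, cached_tokens, output_tokens,
                 in_price=3.0, cached_price=0.3, out_price=15.0):
    """Prices are USD per million tokens; cached input is billed cheaper."""
    uncached = input_tokens - cached_tokens
    return (uncached * in_price
            + cached_tokens * cached_price
            + output_tokens * out_price) / 1_000_000

# A long thread with a 90% cache hit rate still pays for the uncached tail
# and for every output token:
print(round(session_cost(200_000, 180_000, 4_000), 4))  # → 0.174
```

Note how the 4,000 output tokens cost as much as the 20,000 uncached input tokens here, which is why capping output length and keeping threads compact both matter.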

## Subscriptions

@@ -9,11 +9,16 @@ This page explains how agent threads are transformed into model input and why th

## From thread to model input

-Models are still just glorified autocomplete. What agent harnesses do is structure agent threads as model inputs (i.e., input strings). Almost every existing harness roughly follows the same framework. It’s dead simple:
+Models are still just glorified autocomplete.
+What coding agents do is structure agent threads as model inputs (i.e., input strings).
+Almost every existing coding agent roughly follows the same framework.
+It’s dead simple:

![](../../../assets/thread-diagram.svg)

-You might ask: what is the context window, then? It is just the maximum length of the input string that the model can ingest. Agent harnesses often reduce this window slightly from actual model limits to reserve space for some tail system messages or a compaction prompt.
+You might ask: what is the context window, then?
+It is just the maximum length of the input string that the model can ingest.
+Coding agents often reduce this window slightly from actual model limits to reserve space for some tail system messages or a compaction prompt.
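A minimal sketch of that framework, with illustrative names, a crude ~4-characters-per-token estimate, and an assumed headroom reservation:

```python
# Sketch: flatten a thread into one model input string and check it against
# a budget that sits below the real context limit.
MODEL_CONTEXT_TOKENS = 200_000
RESERVED_FOR_TAIL = 20_000  # tail system messages, compaction prompt

def render_thread(messages):
    """messages: list of (role, text) tuples -> single input string."""
    return "\n".join(f"[{role}] {text}" for role, text in messages)

def fits(messages, count_tokens=lambda s: len(s) // 4):
    """Crude estimate: ~4 chars per token; real agents use a tokenizer."""
    budget = MODEL_CONTEXT_TOKENS - RESERVED_FOR_TAIL
    return count_tokens(render_thread(messages)) <= budget

thread = [("system", "You are a coding agent."),
          ("user", "Fix the failing test."),
          ("assistant", "Running the test suite...")]
print(render_thread(thread))
print(fits(thread))  # True
```

Everything the agent "remembers" is just whatever survives in this one string, which is also why the context-rot advice below boils down to keeping threads short.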

Now (especially if you’re going to work with Anthropic models), go read this article. The author wrote a great overview of context peculiarities and how models behave as context usage grows. TL;DR: Keep your threads short:

@@ -176,7 +176,7 @@ First and foremost, always check whether legal or ethical constraints prevent ag
- The project may have regulatory constraints that forbid uploading the codebase to inference providers.
- It can also be the other way around: legal teams may forbid incorporating LLM-generated code (which may reproduce
open-source code from GitHub) into a proprietary product.
-- There may be a corporate policy mandating the use of a specific agent harness.
+- There may be a corporate policy mandating the use of a specific coding agent.
- Maintainers may simply not be comfortable with agent usage.

If that is clear, you can use agents to support your work. In external collaboration, though, behave as a human owner.
9 changes: 6 additions & 3 deletions src/content/docs/getting-started/glossary.mdx
@@ -16,12 +16,15 @@ Before we start, let us introduce a few terms that we will use a lot:

- **Model**: A particular LLM, like Claude-4.6-Opus, GPT-5.3-Codex, Gemini-3.1-Pro.

-- **Agent**: An LLM that runs tools in a loop to achieve a goal.
+- **Agent**: Software that calls an LLM, gives it tools, and runs those tools in a loop to achieve a goal.

-- **Agent Harness**: A piece of software that implements an actual agent, like Claude Code or Cursor Agent.
+- **Coding agent**: An agent specialized for software work, like Claude Code or Cursor Agent.
+In practice, coding agents can inspect repositories, use developer tools, propose changes, and often write and run code as part of a task loop.

- **Vibe coding**: When agents write code and you're in the flow, building PoCs and MVPs, or just having fun.

- **Agentic engineering**: When you use agents to develop codebases professionally, and you are responsible for the output.

-- **Gate**: A check that work must pass before it can move forward. The term predates AI agents, but agentic workflows made it much more common because they require explicit checkpoints for reviewing and verifying autonomous work. In practice, a gate can be a lint run, typecheck, test suite, code review, preview deployment, or any other requirement that blocks changes until they meet your quality bar.
+- **Gate**: A check that work must pass before it can move forward.
+The term predates AI agents, but agentic workflows made it much more common because they require explicit checkpoints for reviewing and verifying autonomous work.
+In practice, a gate can be a lint run, typecheck, test suite, code review, preview deployment, or any other requirement that blocks changes until they meet your quality bar.
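The glossary's "tools in a loop" definition can itself be sketched in a few lines. `call_model`, the action format, and the tool names are illustrative stand-ins for a real LLM API, not any product's actual interface:

```python
# Minimal agent loop: feed the model the goal plus history, run whatever
# tool it requests, append the result, repeat until it says it is done.
def run_agent(goal, call_model, tools, max_steps=10):
    history = [("user", goal)]
    for _ in range(max_steps):
        action = call_model(history)  # e.g. {"tool": "add", "args": {...}}
        if action.get("done"):
            return action["answer"]
        result = tools[action["tool"]](**action.get("args", {}))
        history.append(("tool", str(result)))  # result goes back into context
    return None  # gave up: step budget exhausted

# Toy run with a scripted "model" that calls one tool, then finishes:
script = iter([{"tool": "add", "args": {"a": 2, "b": 3}},
               {"done": True, "answer": "5"}])
print(run_agent("add 2 and 3", lambda h: next(script),
                {"add": lambda a, b: a + b}))  # prints 5
```

A coding agent is this same loop with repository-shaped tools (read file, edit file, run tests) instead of the toy `add`.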
5 changes: 3 additions & 2 deletions src/content/docs/getting-started/how-to-set-up-a-new-repo.mdx
@@ -50,7 +50,8 @@ Here is an <ExternalLink href="https://ampcode.com/threads/T-019c999d-a6a6-73db-
## A first bug fixed in 7 steps

:::note
-This section is Cursor-specific. Under the hood, it’s just Cursor’s magic prompting, so you can achieve the same in other harnesses by trying to replicate it.
+This section is Cursor-specific.
+Under the hood, it’s just Cursor’s magic prompting, so you can achieve the same in other coding agents by trying to replicate it.
:::

Before Debug Mode existed, the agent tried to guess the most probable root cause and immediately fix it—sometimes resulting in quick patches or even ugly workarounds 🙁
@@ -84,7 +85,7 @@ With the right tools, agents are excellent at turning Figma designs into rough c
- <ExternalLink href="https://help.figma.com/hc/en-us/articles/32132100833559-Guide-to-the-Figma-MCP-server" />
- <ExternalLink href="https://help.figma.com/hc/en-us/articles/35281350665623-Figma-MCP-collection-How-to-set-up-the-Figma-remote-MCP-server" />
3. In Figma, copy a link to a frame or layer.
-4. In your agent harness, prompt the agent to `implement this button component [url] in ui/`
+4. In your coding agent, prompt the agent to `implement this button component [url] in ui/`

</Steps>

6 changes: 3 additions & 3 deletions src/content/docs/getting-started/towards-self-improvement.mdx
@@ -1,6 +1,6 @@
---
title: Towards self-improvement
-description: Start adapting your repository to agents with a minimal `AGENTS.md` and a few simple harness conventions.
+description: Start adapting your repository to coding agents with a minimal `AGENTS.md` and a few simple setup conventions.
---

import ExternalLink from "../../../components/ExternalLink.astro";
@@ -19,8 +19,8 @@ One way to steer agents toward doing the right things is to have an `AGENTS.md`
file in the repository root.
This file is read by the agent in each thread.

-This file is standardized and supported by most agent harnesses, except Claude Code.
-There are also harness-specific mechanisms, like `CLAUDE.md`
+This file is standardized and supported by most coding agents, except Claude Code.
+There are also tool-specific mechanisms, like `CLAUDE.md`
or <ExternalLink href="https://cursor.com/docs/context/rules">Cursor Rules</ExternalLink>.

<Steps>