What makes up a background agent

Jonathan Low
Figure: Workflow diagram showing a bounded task entering a sandboxed background agent and becoming a reviewable artifact.

A background agent takes a bounded task, works on it asynchronously in an environment suited to that task, and returns a reviewable artifact.

Specifically:

  1. Bounded task - the agent has a defined scope and sufficient context to understand the intended work.
  2. Execution environment - the agent has access to the code, systems, tools, dependencies, documentation, and other resources needed to perform the task.
  3. Asynchronous execution - the agent runs independently outside a developer’s active editor session or local machine.
  4. Reviewable artifact - the agent returns work in a form that can be inspected, verified, and acted on within the team’s existing workflows and systems.

Background agents are best suited to tasks with clear objectives and stable constraints. They may still require human input during execution, but should generally require minimal supervision.

How the task is set for the agent matters

A background agent is only as useful as the task boundary it is given. A bounded task reduces variability and uncertainty by giving the agent a specific surface area, constraints, expected output, and review criteria. Often, a task includes a human-in-the-loop review point before the agent moves from investigation to action.
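As a rough illustration, the task boundary and its review point could be captured in a small data structure. This is a sketch under assumptions, not a prescribed format: the `BoundedTask` class, its field names, and the `Phase` values are all hypothetical. It only shows that the agent cannot move from investigation to action until a human approves.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """Hypothetical phases separated by a human review point."""
    INVESTIGATING = "investigating"
    AWAITING_REVIEW = "awaiting_review"
    ACTING = "acting"


@dataclass
class BoundedTask:
    """Hypothetical task record; field names mirror the boundary described above."""
    surface_area: str
    constraints: list[str]
    expected_output: str
    review_criteria: list[str]
    phase: Phase = Phase.INVESTIGATING

    def submit_findings(self) -> None:
        """The agent finishes investigating and hands its findings to a human."""
        if self.phase is not Phase.INVESTIGATING:
            raise RuntimeError("findings can only be submitted while investigating")
        self.phase = Phase.AWAITING_REVIEW

    def approve(self) -> None:
        """A human reviewer lets the agent move from investigation to action."""
        if self.phase is not Phase.AWAITING_REVIEW:
            raise RuntimeError("nothing is awaiting review")
        self.phase = Phase.ACTING

    def can_act(self) -> bool:
        """True only after a human has approved the findings."""
        return self.phase is Phase.ACTING
```

Keeping the approval step as an explicit state, rather than a line in the prompt, makes it harder for the agent to skip the review by misreading the instructions.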

Early on, the agent will sometimes misinterpret the tasks we set. That’s okay. Adjust the instructions with each iteration. Once a task proves repeatable, it can become a skill.

| Unbounded | Bounded | Skill version |
| --- | --- | --- |
| Fix this production error. | Investigate the latest checkout timeout error, inspect logs and recent deploys, and return a diagnosis with evidence and proposed next steps for human review. After review, implement the approved mitigation or prepare the follow-up issue. | diagnose-issue |
| File these issues. | Turn these audit findings into issue drafts. For each one, include source context, current behavior, expected behavior, scope, acceptance criteria, and open questions for human review before creating anything. | file-issues |
| Improve our agent workflow. | Inspect this repository’s agent instructions, validation commands, and repeated failure patterns. Present the highest-leverage harness gaps for human review before changing docs, tests, or workflow defaults. | improve-harness-engineering |

The task and environment need to line up

A bounded task only works when the agent has an environment suited to that task. If the agent is unable to run the project, inspect the relevant systems, or gather the evidence needed, it will fail to produce good work.

The environment should be designed around the work the agent is expected to perform. A production investigation agent may need access to logs, traces, and runbooks. A dependency update agent may need a repository checkout, package managers, and test runners. A code review agent may need pull requests, repository history, and issue context.

When the task and environment are aligned, the agent can inspect the system, make a bounded change, run the relevant checks, and return a result. When they are not, the agent either gets stuck or is forced to operate outside the intended scope of the work. Equally, the environment should not grant tools or access beyond what the task requires.
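One way to picture this alignment is a check that compares what an environment grants against what a task kind needs. The sketch below is illustrative only: the resource names come from the examples above, while the task-kind keys and the `check_environment` helper are assumptions rather than part of any real tool.

```python
# Resource names follow the examples above; the task-kind keys are hypothetical.
REQUIRED_RESOURCES: dict[str, frozenset[str]] = {
    "production-investigation": frozenset({"logs", "traces", "runbooks"}),
    "dependency-update": frozenset(
        {"repository checkout", "package managers", "test runners"}
    ),
    "code-review": frozenset(
        {"pull requests", "repository history", "issue context"}
    ),
}


def check_environment(task_kind: str, granted: set[str]) -> tuple[set[str], set[str]]:
    """Return (missing, excess) resources for a task kind.

    Missing resources are the ones that would leave the agent stuck;
    excess resources are access unrelated to the task.
    """
    required = REQUIRED_RESOURCES[task_kind]
    missing = set(required - granted)
    excess = set(granted - required)
    return missing, excess
```

An environment is aligned with its task when both returned sets are empty.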

Asynchronous execution removes local constraints and reduces context switching

Running asynchronously removes constraints imposed by a developer’s machine and attention. Agents can run in environments tailored to the task, execute for as long as needed, and work independently in parallel.

This allows teams to delegate work across many agents simultaneously. Investigations, code reviews, dependency updates, and operational checks can all progress concurrently while engineers focus on higher-priority work.

Asynchronous execution does not mean the work is entirely hands-off. During execution, an agent may surface questions, request clarification, or present intermediate results for review. Teams can provide feedback and course corrections without needing to participate in every step of the process. The agent does the work independently, while humans remain involved when judgment, context, or direction is required.
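The pattern can be sketched with Python's `asyncio`, purely as an illustration. The agent names, the question text, and the `run_agent` and `answer_questions` helpers are hypothetical, and `asyncio.sleep(0)` stands in for real work. Several agents run concurrently, and only the one that needs judgment waits for a human reply.

```python
import asyncio


async def run_agent(name: str, needs_input: bool, questions: asyncio.Queue) -> str:
    """Hypothetical agent: does some work and may pause to ask a human."""
    await asyncio.sleep(0)  # stand-in for real work
    if needs_input:
        reply = asyncio.get_running_loop().create_future()
        await questions.put((name, "Which mitigation should I apply?", reply))
        answer = await reply  # only this agent waits; the others keep going
        return f"{name}: finished with guidance: {answer}"
    return f"{name}: finished"


async def answer_questions(questions: asyncio.Queue, expected: int) -> None:
    """Stand-in for a human who replies when a question is surfaced."""
    for _ in range(expected):
        _name, _question, reply = await questions.get()
        reply.set_result("roll back the latest deploy")


async def main() -> list[str]:
    questions: asyncio.Queue = asyncio.Queue()
    # gather returns results in argument order; the last entry is the
    # human stand-in, which returns None and is dropped.
    *results, _ = await asyncio.gather(
        run_agent("investigation", True, questions),
        run_agent("dependency-update", False, questions),
        run_agent("code-review", False, questions),
        answer_questions(questions, expected=1),
    )
    return results
```

Calling `asyncio.run(main())` returns one status line per agent. In a real system the human reply would arrive through the team's own channels rather than an in-process queue.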

Reviewable artifact

It is essential that the agent returns an artifact that lives in the same workflows, processes, and systems that engineering teams already use:

  • a pull request in GitHub
  • a diagnosis in Slack
  • a set of issues in Linear
  • a question that requires human judgment

This enables agent work to become collaborative. It can be reviewed, discussed, handed off, and built upon by others. Teams can provide feedback, answer questions, or request changes.
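A minimal sketch of this idea is a lookup from artifact kind to the system the team already uses. The destinations follow the list above. The `Artifact` class, the kind labels, and `destination_for` are hypothetical. Questions for a human are left out because the text does not name a destination for them.

```python
from dataclasses import dataclass

# Destinations follow the list above; the kind names are hypothetical labels.
DESTINATIONS = {
    "pull_request": "GitHub",
    "diagnosis": "Slack",
    "issues": "Linear",
}


@dataclass(frozen=True)
class Artifact:
    kind: str
    summary: str


def destination_for(artifact: Artifact) -> str:
    """Pick the existing team system where this artifact should land."""
    try:
        return DESTINATIONS[artifact.kind]
    except KeyError:
        # An artifact with no home in the team's workflow cannot be reviewed.
        raise ValueError(
            f"no team workflow configured for {artifact.kind!r}"
        ) from None
```

Failing loudly on an unknown kind is a deliberate choice here: output that lands outside the team's systems is hard to review, discuss, or hand off.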

Where to start with background agents

Start with smaller pieces of engineering work that already have clear boundaries, produce evidence, and can be reviewed by a human with little friction.

Good starting tasks tend to be:

  • not on a critical path
  • repeatable, so they can be improved over time
  • clearly bounded, so the agent knows where to stop
  • self-verifiable by the agent
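The criteria above could be turned into a simple screen for candidate tasks. This is a sketch only: the `Candidate` class and the equal weighting of the four criteria are assumptions, and a team would need to judge each flag for itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate task, assessed against the four criteria above."""
    name: str
    off_critical_path: bool
    repeatable: bool
    clearly_bounded: bool
    self_verifiable: bool

    def score(self) -> int:
        """Count how many of the four criteria the task meets (0 to 4)."""
        return sum(
            (
                self.off_critical_path,
                self.repeatable,
                self.clearly_bounded,
                self.self_verifiable,
            )
        )


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from most to least suitable as a starting task."""
    return sorted(candidates, key=lambda c: c.score(), reverse=True)
```

Tasks that meet all four criteria rise to the top of the list, which gives a team a defensible order in which to try delegating work.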

Examples include:

  • investigating flaky or slow tests
  • reviewing a pull request for one specific concern
  • checking documentation against the current codebase
  • triaging a bug report and proposing next steps
  • investigating a production error
  • preparing a migration plan for review
  • generating tests for a narrow code path
  • refactoring duplicated logic in a single module

These tasks work well because success is observable. The agent can return a diagnosis, pull request, report, or set of findings that a human can inspect before deciding what should happen next.

Starting with this class of work helps teams validate the operational model around background agents: how tasks are delegated, how results are reviewed, and how feedback is incorporated. Once those processes are established, it becomes easier to expand into more complex workflows composed of multiple agents, tools, and review steps.