Sandbox

A sandbox is an isolated environment where an agent can work without being able to reach anything that matters. Its file access is walled off, its network may be cut or restricted, and whatever it breaks stays inside the box.

Why you want one

An agent's safety has two layers. Permission mode decides what the agent is allowed to attempt. The sandbox decides how much damage a mistake can do once it slips past that check. Belt and braces.

A good sandbox limits:

The [filesystem](/ai-coding-dictionary/filesystem): the agent sees a copy or a scoped directory, not your whole disk.
The network: outbound calls are either blocked entirely or limited to an allowlist.
Credentials: production keys and secrets stay out of reach.
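
The three limits above can be sketched as a small policy object. This is only an illustration: `SandboxPolicy` and its methods are hypothetical names, not the API of any real tool, and checks like these run inside the agent's own process, so they complement real isolation from the operating system or a container rather than replace it.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical policy object, for illustration only."""
    root: Path                               # the only directory the agent may touch
    allowed_hosts: frozenset = frozenset()   # an empty set means no network at all
    secret_markers: tuple = ("KEY", "TOKEN", "SECRET", "PASSWORD")

    def allows_path(self, candidate: str) -> bool:
        # Resolve ".." and symlinks first, so "root/../etc" is rejected.
        resolved = (self.root / candidate).resolve()
        return resolved.is_relative_to(self.root.resolve())  # Python 3.9+

    def allows_host(self, host: str) -> bool:
        # Allowlist only: anything not named explicitly is refused.
        return host.lower() in self.allowed_hosts

    def scrub_env(self, env: dict) -> dict:
        # Drop every variable whose name looks like a credential.
        return {
            name: value
            for name, value in env.items()
            if not any(marker in name.upper() for marker in self.secret_markers)
        }


policy = SandboxPolicy(
    root=Path("/tmp/agent-work"),
    allowed_hosts=frozenset({"pypi.org"}),
)
print(policy.allows_path("src/main.py"))       # inside the scoped directory
print(policy.allows_path("../../etc/passwd"))  # escapes the scoped directory
print(policy.allows_host("example.com"))       # not on the allowlist
```

Matching credentials by variable name is a rough heuristic chosen for brevity; a stricter setup would start from an empty environment and pass in only the variables the task needs.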

The tradeoff

The tighter the box, the safer, but also the less the agent can genuinely accomplish. If it cannot reach the real database or install a package, some tasks become impossible. The craft is scoping the sandbox so it is roomy enough for the work and tight enough that the worst case is "throw away the container," not "restore from backup." That containment is what makes it tolerable to let an agent run unattended.
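
One way to picture the dial between tight and roomy is a helper that builds a `docker run` command line. This is a sketch under assumptions: the function name, the `/work` mount point, and the image and paths in the usage lines are made up for illustration, and Docker is only one of several ways to build a sandbox. The flags themselves (`--rm`, `--network none`, `--cap-drop ALL`, `--read-only`, `--volume`, `--workdir`) are standard Docker options.

```python
def docker_run_args(image, workdir, command, allow_network=False, writable=True):
    """Build an argv list for a throwaway container (hypothetical helper)."""
    args = ["docker", "run", "--rm"]       # --rm: the container is deleted on exit
    if not allow_network:
        args += ["--network", "none"]      # tight: no network access at all
    args += ["--cap-drop", "ALL"]          # drop Linux capabilities
    args += ["--read-only"]                # container root filesystem is read-only
    mode = "rw" if writable else "ro"
    # Only the scoped working directory is mounted, not the whole disk.
    args += ["--volume", f"{workdir}:/work:{mode}", "--workdir", "/work"]
    args += [image, *command]
    return args


# Tight: no network, agent can only read the project.
tight = docker_run_args("python:3.12-slim", "/tmp/agent-work",
                        ["python", "-m", "pytest"], writable=False)

# Roomier: network on so packages can be installed, project directory writable.
roomy = docker_run_args("python:3.12-slim", "/tmp/agent-work",
                        ["pip", "install", "requests"], allow_network=True)

print(" ".join(tight))
print(" ".join(roomy))
```

Each relaxed setting makes more tasks possible and widens what a mistake can touch, which is the tradeoff described above. The list can be handed to `subprocess.run` on a machine where Docker is installed.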

Watch out

A sandbox limits blast radius; it does not make the agent correct. Inside the box it can still produce confident, broken code. Sandboxing protects your system, not the quality of what comes out, so you still have to review the result.

Related terms

Environment

Permission mode

Filesystem
