Permissions & safety

Sandbox

A sandbox is an isolated environment that limits what an agent can touch, such as the filesystem and network, so a mistake stays contained and cannot damage the real system.

James Phoenix
Understanding Data Updated July 2, 2026

A sandbox is an isolated environment where an agent can work without being able to reach anything that matters. Its file access is walled off, its network may be cut or restricted, and whatever it breaks stays inside the box.

Why you want one

An agent's safety has two layers. Permission mode decides what the agent is allowed to attempt. The sandbox decides how much damage a mistake can actually do if one slips past those permissions. Belt and braces.

A good sandbox limits:

  • The [filesystem](/ai-coding-dictionary/filesystem): the agent sees a copy or a scoped directory, not your whole disk.
  • The network: either no outbound access at all, or access limited to an allowlist, so there are no surprise calls out to the internet.
  • Credentials: no reach into production keys or secrets.
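The three limits above can be sketched as a small policy object. This is an illustration only: `SandboxPolicy`, its method names, and the secret markers are hypothetical, not part of any real agent tool. Checks like these run inside the same process as the agent, so a real sandbox would still rely on a container or OS-level isolation to enforce them.

```python
from pathlib import Path
from urllib.parse import urlparse


class SandboxPolicy:
    """Hypothetical policy describing what a sandboxed agent may touch."""

    def __init__(self, root, allowed_hosts=(),
                 secret_markers=("KEY", "SECRET", "TOKEN", "PASSWORD")):
        self.root = Path(root).resolve()
        self.allowed_hosts = frozenset(h.lower() for h in allowed_hosts)
        self.secret_markers = tuple(secret_markers)

    def path_allowed(self, path):
        # Filesystem: resolve the path against the sandbox root (collapsing
        # ".." and symlinks), then require it to stay inside that root.
        candidate = (self.root / path).resolve()
        return candidate == self.root or self.root in candidate.parents

    def host_allowed(self, url):
        # Network: only hosts on the allowlist are reachable.
        host = urlparse(url).hostname
        return host is not None and host.lower() in self.allowed_hosts

    def scrub_env(self, env):
        # Credentials: drop environment variables whose names look like secrets.
        return {
            name: value
            for name, value in env.items()
            if not any(marker in name.upper() for marker in self.secret_markers)
        }
```

A wrapper around the agent's tools could call `path_allowed` before each file operation, `host_allowed` before each request, and pass `scrub_env(os.environ)` to any subprocess it starts.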

The tradeoff

The tighter the box, the safer it is, but the less the agent can actually accomplish. If it cannot reach the real database or install a package, some tasks become impossible. The craft is scoping the sandbox so it is roomy enough for the work and tight enough that the worst case is "throw away the container," not "restore from backup." That containment is what makes it tolerable to let an agent run unattended.
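The "throw it away" idea can be illustrated with a disposable working copy. This is a minimal sketch, assuming the agent's work can be represented as a callable: `run_in_disposable_copy` and `task` are hypothetical names, and a temporary directory on its own is not real isolation, since the process can still reach the rest of the disk and the network. A container or OS-level sandbox would be needed for actual enforcement.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_disposable_copy(project_dir, task):
    """Run task(workspace) against a throwaway copy of project_dir.

    `task` is a hypothetical callable standing in for the agent's work.
    The copy is deleted when the function returns, so anything the task
    breaks inside the workspace never touches the original directory.
    """
    with tempfile.TemporaryDirectory(prefix="agent-sandbox-") as tmp:
        workspace = Path(tmp) / "workspace"
        shutil.copytree(project_dir, workspace)
        return task(workspace)
```

Whatever the task returns, such as a diff or a test report, is the only thing that leaves the workspace, which keeps the worst case at "discard the copy."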

Watch out
A sandbox limits blast radius; it does not make the agent correct. Inside the box it can still produce confident, broken code. Sandboxing protects your system, not the quality of what comes out, so you still have to review the result.
