A sandbox is an isolated environment where an agent can work without being able to reach anything that matters. The agent's file access is walled off, its network access may be cut or restricted, and whatever it breaks stays inside the box.
Why you want one
An agent's safety has two layers. Permission mode decides what the agent is allowed to attempt. The sandbox decides how much damage a mistake can do once it gets past that check. Belt and braces.
A good sandbox limits:
- The [filesystem](/ai-coding-dictionary/filesystem): the agent sees a copy or a scoped directory, not your whole disk.
- The network: outbound access is cut entirely or limited to an allowlist, so there are no surprise calls out to the internet.
- Credentials: no reach into production keys or secrets.
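As a rough illustration, the three limits above can be modelled as a small policy object. This is only a sketch: `SandboxPolicy`, its fields, and the list of secret markers are hypothetical names invented for this example, and a real sandbox enforces these rules at the operating-system or container level rather than in application code. It assumes Python 3.9 or later for `Path.is_relative_to`.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical model of the three limits; not a real enforcement layer."""

    root: Path                              # the only directory the agent may touch
    allowed_hosts: frozenset = frozenset()  # an empty set means no network at all
    secret_markers: tuple = ("KEY", "TOKEN", "SECRET", "PASSWORD")

    def allows_path(self, path: str) -> bool:
        # Resolve ".." and symlinks first so "root/../etc/passwd" cannot escape.
        target = (self.root / path).resolve()
        return target.is_relative_to(self.root.resolve())

    def allows_host(self, host: str) -> bool:
        # Allowlist only: anything not named explicitly is refused.
        return host.lower() in self.allowed_hosts

    def scrub_env(self, env: Mapping[str, str]) -> dict:
        # Drop any variable whose name looks like a credential.
        return {
            name: value
            for name, value in env.items()
            if not any(marker in name.upper() for marker in self.secret_markers)
        }


policy = SandboxPolicy(root=Path("/tmp/agent-work"), allowed_hosts=frozenset({"pypi.org"}))
print(policy.allows_path("src/main.py"))       # a path inside the scoped directory
print(policy.allows_path("../../etc/passwd"))  # a path that tries to climb out
```

Name-based scrubbing is a blunt heuristic; starting from an empty environment and adding back only the variables a task needs is usually the tighter choice.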
The tradeoff
The tighter the box, the safer, but also the less the agent can genuinely accomplish. If it cannot reach the real database or install a package, some tasks become impossible. The craft is scoping the sandbox so it is roomy enough for the work and tight enough that the worst case is "throw away the container," not "restore from backup." That containment is what makes it tolerable to let an agent run unattended.
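The "throw away the container" worst case can be approximated in a few lines by running a command against a disposable copy of a project. This is a minimal sketch, not real isolation: the helper name `run_in_disposable_copy` is made up for this example, and a process started this way can still read absolute paths and use the network. Containers or virtual machines are what provide actual containment; the sketch only shows the working-copy and stripped-environment idea.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_disposable_copy(project: Path, command: list) -> int:
    """Run `command` against a throwaway copy of `project`; return its exit code."""
    with tempfile.TemporaryDirectory() as scratch:
        workdir = Path(scratch) / "work"
        shutil.copytree(project, workdir)  # the command only works on this copy
        result = subprocess.run(
            command,
            cwd=workdir,
            # Pass PATH only, so credentials in the parent environment are not inherited.
            env={"PATH": os.environ.get("PATH", "")},
            check=False,
        )
    # Leaving the `with` block deletes the copy, whatever the command did to it.
    return result.returncode


if __name__ == "__main__":
    project = Path(tempfile.mkdtemp())
    (project / "data.txt").write_text("important")
    # A destructive command: it deletes data.txt, but only inside the copy.
    run_in_disposable_copy(project, [sys.executable, "-c", "import os; os.remove('data.txt')"])
    print((project / "data.txt").exists())
```

The tradeoff shows up directly: because the copy is discarded, any useful changes the command made are discarded too unless they are deliberately copied back out.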
Related terms
Environment
The environment is the surroundings an agent acts in: working directory, files, shell, environment variables, and network. It defines what the agent's tools can actually reach.
Read definition →
Permission mode
Permission mode is the policy that decides which actions an agent can take on its own and which ones need your approval, ranging from ask-every-time to full auto. It trades safety for flow.
Read definition →
Filesystem
The filesystem is the set of files an agent can read and write. It is its main source of truth and its main way to make durable changes.
Read definition →