Using sandboxes effectively
This guide covers practical patterns for working with sandboxed agents.
Basic workflow
Create a sandbox for your project:
$ cd ~/my-project
$ docker sandbox run AGENT
Replace AGENT with your preferred agent (claude, codex, copilot, etc.).
The workspace defaults to your current directory when omitted. You can also
specify an explicit path:
$ docker sandbox run AGENT ~/my-project
The docker sandbox run command is idempotent. Running the same command
multiple times reuses the existing sandbox instead of creating a new one:
$ docker sandbox run AGENT ~/my-project # Creates sandbox
$ docker sandbox run AGENT ~/my-project # Reuses same sandbox
This works with workspace path (absolute or relative) or omitted workspace. The sandbox persists. Stop and restart it without losing installed packages or configuration:
$ docker sandbox run <sandbox-name> # Reconnect by name
When using the --name flag, the behavior is also idempotent based on the
name:
$ docker sandbox run --name dev AGENT # Creates sandbox named "dev"
$ docker sandbox run --name dev AGENT # Reuses sandbox "dev"
Installing dependencies
Ask the agent to install what's needed:
You: "Install pytest and black"
Agent: [Installs packages via pip]
You: "Install build-essential"
Agent: [Installs via apt]The agent has sudo access. Installed packages persist for the sandbox lifetime. This works for system packages, language packages, and development tools.
For teams or repeated setups, use Custom templates to pre-install tools.
Docker inside sandboxes
Agents can build images, run containers, and use Docker Compose. Everything runs inside the sandbox's private Docker daemon.
Testing containerized apps
You: "Build the Docker image and run the tests"
Agent: *runs*
docker build -t myapp:test .
docker run myapp:test npm testContainers started by the agent run inside the sandbox, not on your host. They
don't appear in your host's docker ps.
Multi-container stacks
You: "Start the application with docker-compose and run integration tests"
Agent: *runs*
docker-compose up -d
docker-compose exec api pytest tests/integration
docker-compose downRemove the sandbox, and all images, containers, and volumes are deleted.
What persists
While a sandbox exists:
- Installed packages (apt, pip, npm, etc.)
- Docker images and containers inside the sandbox
- Configuration changes
- Command history
When you remove a sandbox:
- Everything inside is deleted
- Your workspace files remain on your host (synced back)
To preserve a configured environment, create a Custom template.
Security considerations
Credential security
Set API keys as environment variables on the host rather than authenticating interactively inside a sandbox. When you set credentials on the host, Docker Sandboxes proxies API calls from the sandbox through the host daemon, so the agent never has direct access to the raw key.
When you authenticate interactively, credentials are stored inside the sandbox where the agent can read them directly. This creates a risk of credential exfiltration if the agent is compromised or behaves unexpectedly.
Interactive authentication also requires you to re-authenticate for each workspace separately.
Workspace trust
Agents running in sandboxes automatically trust the workspace directory without prompting. This enables agents to work freely within the isolated environment.
Agents can create and modify any files in your mounted workspace, including scripts, configuration files, and hidden files.
After an agent works in a workspace, review changes before performing actions on your host that might execute code:
- Committing changes (executes Git hooks)
- Opening the workspace in an IDE (may auto-run scripts or extensions)
- Running scripts or executables the agent created or modified
Review what changed:
$ git status # See modified and new files
$ git diff # Review changes to tracked files
Check for untracked files and be aware that some changes, like Git hooks in
.git/hooks/, won't appear in standard diffs.
This is the same trust model used by editors like Visual Studio Code, which warn when opening new workspaces for similar reasons.
Managing multiple projects
Create sandboxes for different projects:
$ docker sandbox create claude ~/project-a
$ docker sandbox create codex ~/project-b
$ docker sandbox create copilot ~/work/client-project
Each sandbox is completely isolated. Switch between them by running the appropriate sandbox name.
Remove unused sandboxes to reclaim disk space:
$ docker sandbox rm <sandbox-name>
Named sandboxes
Docker automatically generates sandbox names based on the agent and workspace
directory (for example, claude-my-project). You can also specify custom names
using the --name flag:
$ docker sandbox run --name myproject AGENT ~/project
Create multiple sandboxes for the same workspace:
$ docker sandbox create --name dev claude ~/project
$ docker sandbox create --name staging codex ~/project
$ docker sandbox run dev
Each maintains separate packages, Docker images, and state, but share the workspace files.
Multiple workspaces
Mount multiple directories into a single sandbox for working with related projects or when the agent needs access to documentation and shared libraries.
$ docker sandbox run AGENT ~/my-project ~/shared-docs
The primary workspace (first argument) is always mounted read-write. Additional workspaces are mounted read-write by default.
Read-only mounts
Mount additional workspaces as read-only by appending :ro or :readonly:
$ docker sandbox run AGENT . /path/to/docs:ro /path/to/lib:readonly
The primary workspace remains fully writable while read-only workspaces are protected from changes.
Path resolution
Workspaces are mounted at their absolute paths inside the sandbox. Relative paths are resolved to absolute paths before mounting.
Example:
$ cd /Users/bob/projects
$ docker sandbox run AGENT ./app ~/docs:ro
Inside the sandbox:
/Users/bob/projects/app- Primary workspace (read-write)/Users/bob/docs- Additional workspace (read-only)
Changes to /Users/bob/projects/app sync back to your host, while
/Users/bob/docs remains read-only.
Sharing workspaces across sandboxes
A single path can be included in multiple sandboxes simultaneously:
$ docker sandbox create --name sb1 claude ./project-a
$ docker sandbox create --name sb2 claude ./project-a ./project-b
$ docker sandbox create --name sb3 cagent ./project-a
$ docker sandbox ls
SANDBOX AGENT STATUS WORKSPACE
sb1 claude running /Users/bob/src/project-a
sb2 claude running /Users/bob/src/project-a, /Users/bob/src/project-b
sb3 cagent running /Users/bob/src/project-a
Each sandbox runs in isolation with separate configurations while sharing the same workspace files.
Resetting state
If you encounter issues with sandbox state, use the reset command to clean up all VMs and registries:
$ docker sandbox reset
This command:
- Stops all running sandbox VMs
- Deletes all VM state and registries
- Continues running the sandbox daemon (does not shut it down)
- Warns about directories it cannot remove
After reset, you can create fresh sandboxes. Use this when troubleshooting persistent issues or reclaiming disk space from all sandboxes at once.
Debugging
Access the sandbox directly with an interactive shell:
$ docker sandbox exec -it <sandbox-name> bash
Inside the shell, you can inspect the environment, manually install packages, or check Docker containers:
agent@sandbox:~$ docker ps
agent@sandbox:~$ docker images
List all sandboxes:
$ docker sandbox ls