Give your AI agent a safe place to run code. This guide shows how to wrap a Tensorlake Sandbox as a callable tool for both Claude (Anthropic SDK) and OpenAI agents, so the LLM can write and execute code on demand, all inside an isolated container with network restrictions and resource limits.

How it works

The pattern is the same regardless of which LLM you use:
  1. Define a run_code tool — tell the LLM it can call a function that accepts a code string and returns stdout/stderr.
  2. Create a sandbox once and keep it alive across the agent loop — reusing one sandbox preserves state between tool calls (installed packages, files written to disk). Each run_code call is a fresh Python process, so variables and imports must be redefined in each call.
  3. Execute the tool call — when the LLM invokes run_code, pass the code into sandbox.run() and return the result.
  4. Clean up — terminate the sandbox when the agent session ends.
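
The result-formatting part of step 3 is identical in every framework, so it can be sketched once as a small pure helper (the name format_result is ours, not part of any SDK; it mirrors the convention used in the examples below):

```python
# Illustrative helper (our own naming, not an SDK function): turn a sandbox
# run's stdout/stderr/exit code into the single string handed back to the
# LLM as the tool result.
def format_result(stdout: str, stderr: str, exit_code: int) -> str:
    output = stdout or ""
    if stderr:
        output += f"\n[stderr]\n{stderr}"
    if exit_code != 0:
        output += f"\n[exit code: {exit_code}]"
    return output or "(no output)"
```

Returning stderr and the exit code alongside stdout matters: it lets the LLM see tracebacks and self-correct in the next tool call instead of silently receiving an empty string.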

Claude (Anthropic SDK)

Prerequisites

pip install tensorlake anthropic

Full example

import anthropic
from tensorlake.sandbox import SandboxClient

SYSTEM_PROMPT = """You are a data analysis assistant. You have access to a Python sandbox.
Use the run_code tool whenever you need to compute something, analyze data, or verify your
reasoning with code. Each run_code call is a fresh Python process — include all imports and
redefine any variables you need. Installed packages and files written to disk persist across calls."""

# Define the tool schema Claude will use
RUN_CODE_TOOL = {
    "name": "run_code",
    "description": (
        "Execute Python code in a secure sandbox. "
        "Each call is a fresh Python process — include all imports and redefine any variables you need. "
        "Installed packages and files written to disk persist across calls. "
        "Returns stdout and stderr."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "code": {
                "type": "string",
                "description": "Python code to execute.",
            }
        },
        "required": ["code"],
    },
}


def run_agent(user_message: str) -> str:
    anthropic_client = anthropic.Anthropic()
    tl_client = SandboxClient()

    # Create one sandbox for the entire agent session
    sandbox = tl_client.create_and_connect(
        cpus=1.0,
        memory_mb=512,
        timeout_secs=600,
        allow_internet_access=False,  # lock down network for untrusted code
    )

    messages = [{"role": "user", "content": user_message}]

    try:
        while True:
            response = anthropic_client.messages.create(
                model="claude-opus-4-5",
                max_tokens=4096,
                system=SYSTEM_PROMPT,
                tools=[RUN_CODE_TOOL],
                messages=messages,
            )

            # Append assistant's response to history
            messages.append({"role": "assistant", "content": response.content})

            # If no tool use, we're done: return the final text.
            # Joining all text blocks avoids falling through (and looping
            # forever) if the response contains no text block.
            if response.stop_reason == "end_turn":
                return "".join(
                    block.text for block in response.content if hasattr(block, "text")
                )

            # Process all tool calls in this response
            tool_results = []
            for block in response.content:
                if block.type != "tool_use":
                    continue

                code = block.input["code"]
                print(f"\n[sandbox] executing:\n{code}\n")

                result = sandbox.run("python", ["-c", code])

                output = result.stdout or ""
                if result.stderr:
                    output += f"\n[stderr]\n{result.stderr}"
                if result.exit_code != 0:
                    output += f"\n[exit code: {result.exit_code}]"

                print(f"[sandbox] output:\n{output}")

                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output or "(no output)",
                })

            # Feed all results back to Claude in one message
            messages.append({"role": "user", "content": tool_results})

    finally:
        sandbox.close()  # always clean up


if __name__ == "__main__":
    answer = run_agent(
        "I have a list of numbers: [4, 8, 15, 16, 23, 42]. "
        "What is the mean, median, and standard deviation? "
        "Also plot a histogram and tell me if the distribution looks normal."
    )
    print("\n=== Final answer ===")
    print(answer)

What happens step by step

  1. Claude reads the user question; your code sends it to Claude with the run_code tool available.
  2. Claude decides it needs to compute something and emits a tool_use block; your code detects stop_reason == "tool_use".
  3. Your code calls sandbox.run() with the generated code.
  4. Your code appends the result as a tool_result and calls Claude again.
  5. Claude reads the output and continues reasoning or calls the tool again; your code loops until stop_reason == "end_turn".
  6. Claude returns its final text answer; your code returns it to the caller and closes the sandbox.

OpenAI (function calling)

Prerequisites

pip install tensorlake openai

Full example

import json
import openai
from tensorlake.sandbox import SandboxClient

SYSTEM_PROMPT = """You are a data analysis assistant with access to a Python sandbox.
Always use the run_code function to execute code — never compute or guess answers yourself.
Each run_code call is a fresh Python process — include all imports and redefine any variables
you need. Installed packages and files written to disk are available across calls."""

# Define the function schema OpenAI will use
RUN_CODE_FUNCTION = {
    "type": "function",
    "function": {
        "name": "run_code",
        "description": (
            "Execute Python code in a secure isolated sandbox. "
            "Each call runs in a fresh Python process — include all imports and redefine "
            "any variables you need. Installed packages and files written to disk persist across calls. "
            "Returns stdout and stderr as a string."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "Python code to execute.",
                }
            },
            "required": ["code"],
        },
    },
}


def run_agent(user_message: str) -> str:
    openai_client = openai.OpenAI()
    tl_client = SandboxClient()

    # Create one sandbox for the entire agent session
    sandbox = tl_client.create_and_connect(
        cpus=1.0,
        memory_mb=512,
        timeout_secs=600,
        allow_internet_access=False,
    )

    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ]

    try:
        while True:
            response = openai_client.chat.completions.create(
                model="gpt-5.1",
                messages=messages,
                tools=[RUN_CODE_FUNCTION],
                tool_choice="auto",
            )

            msg = response.choices[0].message
            messages.append(msg)

            # No tool calls — agent is done
            if not msg.tool_calls:
                return msg.content

            # Process all tool calls
            for tool_call in msg.tool_calls:
                args = json.loads(tool_call.function.arguments)
                code = args["code"]

                print(f"\n[sandbox] executing:\n{code}\n")

                result = sandbox.run("python", ["-c", code])

                output = result.stdout or ""
                if result.stderr:
                    output += f"\n[stderr]\n{result.stderr}"
                if result.exit_code != 0:
                    output += f"\n[exit code: {result.exit_code}]"

                print(f"[sandbox] output:\n{output}")

                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": output or "(no output)",
                })

    finally:
        sandbox.close()


if __name__ == "__main__":
    answer = run_agent(
        "Using only Python stdlib (random, datetime), generate sample monthly revenue and cost "
        "data for the last 12 months (seed 42). Print a table showing each month, profit, and "
        "profit margin. Then print which month had the best and worst margin."
    )
    print("\n=== Final answer ===")
    print(answer)

Using OpenAI Agents SDK

If you are using the newer OpenAI Agents SDK, you can wrap the sandbox as a FunctionTool directly:

Prerequisites

pip install tensorlake openai-agents

Full example

from agents import Agent, ModelSettings, Runner, function_tool
from tensorlake.sandbox import SandboxClient

# Keep one sandbox alive for the agent's lifetime
_client = SandboxClient()
_sandbox = _client.create_and_connect(
    cpus=1.0,
    memory_mb=512,
    timeout_secs=600,
    allow_internet_access=False,
)


@function_tool
def run_code(code: str) -> str:
    """Execute Python code in a secure sandbox. Each call is a fresh Python process —
    include all imports and redefine any variables you need. Installed packages and
    files written to disk persist across calls."""
    result = _sandbox.run("python", ["-c", code])
    output = result.stdout or ""
    if result.stderr:
        output += f"\n[stderr]\n{result.stderr}"
    if result.exit_code != 0:
        output += f"\n[exit code: {result.exit_code}]"
    return output or "(no output)"


agent = Agent(
    name="Data Analyst",
    instructions="You are a data analysis assistant. Always use run_code to compute answers — never calculate or guess yourself.",
    tools=[run_code],
    model_settings=ModelSettings(tool_choice="required"),
)

try:
    result = Runner.run_sync(
        agent,
        "Write and run Python code to calculate the compound annual growth rate "
        "if revenue grew from $1M to $3.2M over 5 years. Print the result.",
    )
    print(result.final_output)
finally:
    _sandbox.close()  # clean up even if the run fails

Production tips

Reuse one sandbox per session, not per call

Creating a new sandbox on every tool call adds cold-start latency and loses any state (installed packages, files written to disk) from prior calls. Create the sandbox before the agent loop and close it afterward.
# ✅ Create once, reuse across all tool calls
sandbox = tl_client.create_and_connect(...)
try:
    run_agent_loop(sandbox)
finally:
    sandbox.close()

# ❌ Don't do this — loses state and adds latency every call
def run_code_tool(code):
    with tl_client.create_and_connect() as sandbox:  # new sandbox every call
        return sandbox.run("python", ["-c", code])

Pre-install dependencies with Snapshots

If your agent always needs the same libraries (pandas, numpy, matplotlib, etc.), install them once, snapshot the sandbox, and boot future sandboxes from that snapshot. This avoids re-running pip install on every session.
# One-time setup: build a snapshot with dependencies pre-installed
setup_sandbox = tl_client.create_and_connect()
setup_sandbox.run("pip", ["install", "pandas", "numpy", "matplotlib", "scipy"])
snapshot = tl_client.snapshot_and_wait(setup_sandbox.sandbox_id)
setup_sandbox.close()
print(f"Snapshot ready: {snapshot.snapshot_id}")

# Every future session starts with packages already installed
sandbox = tl_client.create_and_connect(snapshot_id=snapshot.snapshot_id)
See the Snapshots guide for details.

Let the agent install packages on demand

If your agent may need arbitrary or unknown packages, tell it in the system prompt that it can install them with pip. Because sandbox state persists across tool calls, a package installed in one call is available in all subsequent calls.
SYSTEM_PROMPT = """You are a data analysis assistant. You have access to a Python sandbox.
Use the run_code tool whenever you need to compute something, analyze data, or verify your
reasoning with code. Each run_code call is a fresh Python process — include all imports and
redefine any variables you need. Installed packages and files written to disk persist across calls.
If a required package is missing, install it before using it:
  import subprocess; subprocess.run(["pip", "install", "--break-system-packages", "<package>"], check=True)"""
Use this approach when dependencies are unpredictable. For a known set of dependencies, pre-installing via Snapshots is faster since it avoids repeating pip install on every session.
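
The "fresh process, persistent disk" model behind this tip can be demonstrated locally with plain subprocess calls; this is an analogy for the sandbox semantics, not the Tensorlake API:

```python
import os
import subprocess
import sys
import tempfile

# Analogy: each run_code call is a fresh Python process (in-memory state
# is lost), but the filesystem persists across calls.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "state.txt")

# Call 1: define a variable and write a file.
subprocess.run(
    [sys.executable, "-c", f"x = 42\nopen({path!r}, 'w').write(str(x))"],
    check=True,
)

# Call 2: the variable is gone, but the file is still there.
probe = subprocess.run(
    [sys.executable, "-c", f"print('x' in dir())\nprint(open({path!r}).read())"],
    capture_output=True, text=True, check=True,
)
print(probe.stdout)  # "False" then "42": fresh process, persistent disk
```

The same reasoning explains why a pip install in one tool call benefits all later calls: the installed files live on disk, not in the interpreter's memory.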

Use Warm Pools for low-latency responses

For user-facing agents where cold-start time is noticeable, create a Warm Pool so containers are ready before the request arrives.
pool = tl_client.create_pool(warm_containers=3, max_containers=20, cpus=1.0, memory_mb=512)

# Later, claim a pre-warmed container instantly
response = tl_client.claim(pool.pool_id)
sandbox = tl_client.connect(response.sandbox_id)

Lock down the network

By default, sandboxes have internet access. For agents executing untrusted or LLM-generated code, disable it:
sandbox = tl_client.create_and_connect(allow_internet_access=False)
If the agent needs to call specific external APIs, use the networking controls to whitelist only those destinations. See the Networking guide.

What to build next

  • Data Analysis — spin up sandboxes with data science libraries to analyze complex datasets and stream results back in real time.
  • Snapshots — pre-install dependencies so agent sessions start instantly.
  • Pools — keep containers warm for sub-second sandbox creation.