If your agent is going to run code written by an LLM, you need a sandbox. Podflare ships small adapters for the three frameworks most teams use. Each one is ~30 lines of glue; you don’t write the glue yourself.

Claude — code_execution tool

Anthropic’s Messages API exposes a hosted code_execution tool. You can replace the hosted execution with your own sandbox — same tool spec, your compute. The handle_code_execution_tool_use helper consumes a tool_use block and returns the matching tool_result block.
from anthropic import Anthropic
from podflare import Sandbox
from podflare.integrations.anthropic import handle_code_execution_tool_use

client = Anthropic()

with Sandbox() as sbx:
    messages = [
        {"role": "user",
         "content": "Download this CSV and summarize: https://ex.com/sales.csv"}
    ]
    while True:
        resp = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=2048,
            tools=[{"type": "code_execution_20250825",
                    "name": "code_execution"}],
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})
        if resp.stop_reason != "tool_use":
            break
        results = [
            handle_code_execution_tool_use(b, sbx)
            for b in resp.content
            if b.type == "tool_use" and b.name == "code_execution"
        ]
        messages.append({"role": "user", "content": results})

    print(resp.content[0].text)

When to reach for this

  • You already use Anthropic and want the native tool contract.
  • You need data-plane isolation — Anthropic’s hosted code_execution runs on their infra; with Podflare it runs on yours.
  • Per-customer quota / billing attribution lives on your API key, not Anthropic’s.
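The helper itself is ordinary glue: pull the code out of the model's tool_use block, run it on your sandbox, and hand back a tool_result block with the matching tool_use_id. A hypothetical sketch of that glue — field names and the run_code result shape (.stdout / .stderr) are assumptions based on the examples on this page, shown over plain dicts for brevity:

```python
def handle_code_execution_tool_use(block, sbx):
    code = block["input"]["code"]        # the code the model asked to run
    r = sbx.run_code(code)               # execute on your sandbox, not Anthropic's
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],      # ties the result back to the tool_use call
        "content": (r.stdout or "") + (r.stderr or ""),
    }
```

The real helper ships with podflare; the point is that the tool contract stays identical to Anthropic's hosted tool, only the execution happens on your compute.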

OpenAI Agents SDK

The Agents SDK accepts any function tool. podflare_code_interpreter() (Python) and podflareCodeInterpreter() (TypeScript) return one.
from agents import Agent, Runner
from podflare.integrations.openai_agents import podflare_code_interpreter

agent = Agent(
    name="assistant",
    instructions="Use the run_code tool to answer data questions.",
    tools=[podflare_code_interpreter()],
)

result = Runner.run_sync(agent, "what is 111 * 111?")
print(result.final_output)

Vercel AI SDK

podflareRunCode() returns a shape compatible with tool() from the ai package.
TypeScript
import { generateText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
import { podflareRunCode } from "podflare/ai-sdk";

const pf = podflareRunCode({ template: "python-datasci" });

const result = await generateText({
  model: openai("gpt-4o"),
  tools: {
    runCode: tool({
      description: pf.description,
      parameters: z.object({
        code: z.string(),
        language: z.enum(["python", "bash"]).optional(),
      }),
      execute: pf.execute,
    }),
  },
  prompt: "Fetch the top 5 HN stories and chart their scores.",
  maxSteps: 10,
});

await pf.close();
console.log(result.text);

Persistent REPL pattern

State survives between run_code calls — filesystem and Python variables. This is how you build a “chat with a notebook” UX:
from podflare import Sandbox

sbx = Sandbox(template="python-datasci", idle_timeout_seconds=1800)

# Turn 1 — load data
sbx.run_code("""
import pandas as pd
df = pd.read_parquet('https://ex.com/events.parquet')
print(df.shape)
""")

# Turn 2 — df is still there
r = sbx.run_code("print(df.head().to_string())")
print(r.stdout)

# Turn 3 — even a file is still there
sbx.run_code("df.head(100).to_csv('/tmp/sample.csv')")
sbx.run_code("import os; print(os.listdir('/tmp'))")

sbx.close()
Set a generous idle_timeout_seconds so the sandbox survives between user messages. Paid tiers allow idle timeouts of up to two hours — enough for long interactive sessions.

Pitfalls

  • Don’t share a sandbox across users. One per session is the right granularity. Fork if you need parallel branches within one user’s session.
  • Install packages once, reuse many times. A pip install is cheap but not free (~100–500 ms). Cache the sandbox and install on first run; subsequent turns hit the already-installed version.
  • Close sandboxes when done. The idle timeout catches forgotten ones, but you’ll burn budget until it fires. The SDKs’ context-manager semantics (with in Python, using in TypeScript) close them automatically.
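The first two pitfalls combine naturally into a small session store: one sandbox per session, created lazily, closed explicitly. A sketch — the factory-injection shape here is our own pattern, not a podflare API; in practice new_sandbox would be something like lambda: Sandbox(template="python-datasci", idle_timeout_seconds=1800):

```python
def make_session_store(new_sandbox):
    sessions = {}

    def get(session_id):
        # Lazily create exactly one sandbox per session; the first call
        # is where you'd run any one-time pip installs.
        if session_id not in sessions:
            sessions[session_id] = new_sandbox()
        return sessions[session_id]

    def end(session_id):
        # Close eagerly instead of waiting for the idle timeout to fire.
        sbx = sessions.pop(session_id, None)
        if sbx is not None:
            sbx.close()

    return get, end
```

Because the store never shares a sandbox across session ids, it also enforces the one-sandbox-per-user granularity from the first pitfall.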