TL;DR — One Podflare sandbox is a real Linux box. Root filesystem, 1–16 GB RAM, full internet, pip install anything, persistent Python REPL, sub-100 ms fork(), freeze-to-disk for later resume. Not a container. Not a serverless function. A real VM your agent owns.

The shape of one sandbox

from podflare import Sandbox

with Sandbox() as sb:
    # This is not a container. This is a Linux box.
    print(sb.run_code("uname -a", language="bash").stdout)
    # Linux ... 6.1.x ... x86_64 ... GNU/Linux
| What you get | Per sandbox |
| --- | --- |
| Hardware isolation | Dedicated Podflare Pod microVM. Own kernel. Own page tables. Can’t see other tenants. |
| CPU | 2 vCPUs (configurable) |
| RAM | 1 GB default, up to 16 GB on Scale tier |
| Rootfs | 4 GB default, up to 64 GB on Scale tier. Writable. Ubuntu 24.04 minimal. |
| Network | Full outbound. DHCP-leased IP, NAT to public internet. No config. |
| Python REPL | Persistent across run_code calls. Variables, imports, open files all survive. |
| Bash | Same sandbox serves language="bash". Run any shell command. |
| Filesystem | Upload bytes in, download artifacts out. Mount tmpfs, write files, run git clone, build Docker images, you name it. |
| Lifecycle | create → unlimited run_code / fork / upload → close. Or freeze with persistent=True and resume later. |
| Latency | Create served from the warm pool takes ~187 ms p50 end-to-end (laptop → CF edge → nearest origin → VM → back, SDK 0.0.20). Hot run_code p50 is ~46 ms on the same connection. p95 stays under 210 ms and p99 under 240 ms; see "vs E2B/Daytona/Blaxel" for the full distribution. |

Real capabilities, real code

Every snippet below runs, end-to-end, today.

1. Install any package you want

No allowlist. No proxy. Real pip with real internet.
with Sandbox() as sb:
    sb.run_code("pip install scikit-learn pandas requests", language="bash")
    # Now use them:
    out = sb.run_code("""
        import pandas as pd
        from sklearn.linear_model import LinearRegression
        print("pandas:", pd.__version__, "sklearn:", LinearRegression().__class__.__name__)
    """)
    print(out.stdout)
    # pandas: 2.x  sklearn: LinearRegression
Same deal for npm, cargo, apt, whatever the base image has.
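
The snippets on this page pass indented triple-quoted strings to run_code. This page does not say whether the SDK strips that shared indentation, so here is a minimal sketch for the case where it does not: plain Python rejects top-level source that starts indented, and the standard library's textwrap.dedent removes the common indent before the string is sent. `prepare` is a hypothetical helper name, not part of the Podflare SDK.

```python
import textwrap

def prepare(source: str) -> str:
    """Strip the common leading indentation and surrounding blank lines."""
    return textwrap.dedent(source).strip() + "\n"

snippet = """
    import math
    print(math.sqrt(16))
"""

# Raw, the indented block is not valid top-level Python.
try:
    compile(snippet, "<snippet>", "exec")
    raw_ok = True
except IndentationError:
    raw_ok = False

# After dedent it compiles and runs like any module.
cleaned = prepare(snippet)
exec(compile(cleaned, "<snippet>", "exec"))  # prints 4.0
```

With a live sandbox the same helper would wrap the argument, for example `sb.run_code(prepare("""..."""))`.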

2. Call any external API

with Sandbox() as sb:
    sb.run_code("pip install openai", language="bash")
    out = sb.run_code("""
        import openai
        client = openai.OpenAI(api_key='sk-...')
        r = client.chat.completions.create(
            model="gpt-5",
            messages=[{"role": "user", "content": "one liner pun about Linux"}]
        )
        print(r.choices[0].message.content)
    """)
    print(out.stdout)
The API sees the sandbox region’s IP as the source address, because the host NATs outbound traffic (MASQUERADE).

3. Persistent Python REPL

State carries across run_code calls. This is the feature that makes agent loops actually cheap — no re-parsing CSVs, no re-loading models, no re-importing pandas on every tool call.
with Sandbox() as sb:
    # Expensive setup once
    sb.run_code("""
        import pandas as pd
        df = pd.read_csv('https://raw.githubusercontent.com/pandas-dev/pandas/main/pandas/tests/io/data/csv/tips.csv')
        print(f'loaded {len(df)} rows')
    """)

    # Agent turn 2: `df` is still here, no re-loading
    out = sb.run_code("print(df.groupby('sex')['tip'].mean())")
    print(out.stdout)

    # Turn 3: import something new, use it, state sticks
    sb.run_code("import numpy as np")
    out = sb.run_code("print(np.array([1,2,3]).sum())")
    print(out.stdout)  # 6
Under the hood each exec hits the same running Python process over a vsock control channel. globals() persists.
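
The persistence described above can be modeled locally. This is a toy sketch, assuming the guest does something equivalent to exec into one long-lived namespace; the page only states that every call hits the same Python process and that globals() persists. `MiniRepl` is a hypothetical name and involves no vsock or VM.

```python
import contextlib
import io

class MiniRepl:
    """Toy stand-in for a persistent REPL: one namespace shared by every call."""

    def __init__(self):
        self.namespace = {"__name__": "__main__"}

    def run_code(self, source: str) -> str:
        buffer = io.StringIO()
        # Capture stdout so each call returns only what it printed.
        with contextlib.redirect_stdout(buffer):
            exec(source, self.namespace)
        return buffer.getvalue()

repl = MiniRepl()
repl.run_code("rows = [3, 1, 2]")                       # turn 1: define state
repl.run_code("import statistics")                      # turn 2: imports persist too
out = repl.run_code("print(statistics.median(rows))")   # turn 3: reuse both
```

Because all three calls share `repl.namespace`, the third call can see both `rows` and the `statistics` import, which is the behavior an agent loop relies on between tool calls.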

4. Full filesystem — write files, run commands, keep artifacts

with Sandbox() as sb:
    # Write a file from your agent loop
    sb.upload(b"hello from outside\n", "/tmp/greeting.txt")

    # Guest code reads + writes files like any Linux box
    sb.run_code("""
        cat /tmp/greeting.txt
        mkdir -p /app
        cd /app
        git clone --depth 1 https://github.com/psf/requests.git
        ls requests/
    """, language="bash")

    # Pull an artifact back out
    rendered = sb.download("/tmp/greeting.txt")
    print(rendered.decode())
upload / download go over the control channel (fast, no egress billing). git clone goes out to the internet directly.

5. fork() — tree-search without state hell

Snapshot the parent sandbox mid-flight, then spawn N children that each start from the parent’s exact state. Copy-on-write memory and a reflinked rootfs mean a fork costs ~80 ms end-to-end with near-zero per-child memory overhead.
with Sandbox() as parent:
    # Expensive setup: load a big DataFrame
    parent.run_code("""
        import pandas as pd
        df = pd.read_parquet('/data/big.parquet')  # 500 MB in memory
    """)

    # Fork 5 ways — each child inherits df without re-loading
    children = parent.fork(n=5)
    plans = [
        "df.groupby('region').revenue.sum()",
        "df.describe()",
        "df.corr(numeric_only=True)",
        "df['signup_date'].dt.year.value_counts()",
        "df.sample(100).to_json()",
    ]
    results = [c.run_code(p) for c, p in zip(children, plans)]

    # Pick the best branch, commit it as the parent's new state.
    # pick_best is your own function: it returns the index of the winning result.
    winner = children[pick_best(results)]
    parent.merge_into(winner)

    # The parent now IS winner's world. Losers get destroyed:
    for c in children:
        if c is not winner:
            c.close()
This is the primitive that agent tree-search, hypothesis testing, and “try 5 refactors and take the one that compiles” workflows are built on.
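
The fork example calls pick_best without defining it, since the scoring rule is up to you. One possible shape is sketched below, assuming only that each result exposes a `.stdout` string as in the snippets on this page. The default heuristic (prefer the branch that printed the most) is an illustration, not a recommendation.

```python
from types import SimpleNamespace

def pick_best(results, score=None):
    """Return the index of the highest-scoring result (first one wins ties)."""
    if score is None:
        # Assumed default heuristic: prefer the branch that printed the most.
        score = lambda r: len(r.stdout.strip())
    return max(range(len(results)), key=lambda i: score(results[i]))

# Stand-ins for run_code results; only the .stdout attribute is assumed.
fake_results = [
    SimpleNamespace(stdout=""),
    SimpleNamespace(stdout="region  revenue\nwest    10\n"),
    SimpleNamespace(stdout="ok\n"),
]
best = pick_best(fake_results)
```

In a real tree search the `score` callable is where the domain logic goes, for instance parsing a test summary out of stdout or asking a model to rank the branches.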

6. Persistent Spaces — freeze now, resume later

A sandbox created with persistent=True is frozen into a Space when its idle timeout fires — memory, running processes, Python REPL state, filesystem, all preserved on disk. Resume it later and pick up exactly where you left off.
from podflare import Sandbox

# Create a long-running sandbox
sb = Sandbox(persistent=True)
sb.run_code("""
    import pandas as pd
    model = train_giant_model()   # placeholder for your own code; 10 minutes
    df = load_big_dataset()       # placeholder for your own code; 3 minutes
""")
# ... idle for 30 minutes, you go to lunch ...
# Reaper freezes it. Your dashboard shows a new Space.

# Later — same or different day — resume.
space_id = "<id from dashboard>"
sb = Sandbox.resume(space_id, region="us-west")

# `model` and `df` are still in memory. No re-training, no re-loading.
sb.run_code("print(model.predict(df.sample(10)))")
Single-shot resume: each Space is consumed by one resume and is then gone. To freeze again, the resumed sandbox must itself be created with persistent=True.
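
The single-shot rule is easy to trip over in bookkeeping code, so here is a toy in-memory model of it. Everything in it is hypothetical (`SpaceStore`, its methods, and the idea that a re-freeze yields a new id are assumptions for illustration); it mirrors only the stated semantics that one resume consumes the Space.

```python
import uuid

class SpaceStore:
    """Toy model: a frozen state can be resumed exactly once."""

    def __init__(self):
        self._spaces = {}

    def freeze(self, state: dict) -> str:
        space_id = uuid.uuid4().hex
        self._spaces[space_id] = dict(state)  # copy so later edits don't leak in
        return space_id

    def resume(self, space_id: str) -> dict:
        # pop() both returns the state and consumes the Space.
        try:
            return self._spaces.pop(space_id)
        except KeyError:
            raise LookupError(f"Space {space_id} is gone or never existed") from None

store = SpaceStore()
space_id = store.freeze({"model": "trained", "rows": 1000})
state = store.resume(space_id)

try:
    store.resume(space_id)
    second_resume_failed = False
except LookupError:
    second_resume_failed = True

# To keep the state around, freeze it again and track the new id.
new_space_id = store.freeze(state)
```

The practical consequence for an agent framework: store the Space id somewhere it can be replaced, and treat a stale id as an expected error rather than a crash.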

7. Multi-region, automatically routed

api.podflare.ai is a Cloudflare Worker that geo-routes your requests to the nearest region, with automatic failover on 5xx. SDK 0.0.20 uses this endpoint by default. To pin a region explicitly:
sb_usw = Sandbox(region="us-west")   # Latitude SJC
sb_eu  = Sandbox(region="eu")        # Hetzner Helsinki

# Go direct-to-origin (skip the edge). Only faster when your caller
# is in the same DC as the region; from residential wifi the edge
# is usually faster because CF's PoP is closer than the origin.
sb_fast = Sandbox(host="https://usw1.podflare.ai")
Measured from California residential wifi, 100 iterations, SDK 0.0.20: edge-routed p99 = 221 ms, direct-to-usw1 p99 = 483 ms. The “extra hop” through Cloudflare takes less wall-clock time than the public-internet route to the origin. Direct is reserved for in-cloud callers that are already co-located with a specific region.
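
To reproduce this kind of comparison from your own network, a small harness is enough. The sketch below is an assumption about method, not the benchmark used for the numbers above: it times a callable with time.perf_counter and reports nearest-rank percentiles. `percentile` and `measure_ms` are hypothetical helper names.

```python
import math
import time

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def measure_ms(operation, iterations=100):
    """Time `operation` repeatedly and return wall-clock latencies in milliseconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies

# Stand-in workload; swap in e.g. `lambda: sb.run_code("1")` against a live sandbox.
latencies = measure_ms(lambda: sum(range(1000)), iterations=100)
summary = {p: percentile(latencies, p) for p in (50, 95, 99)}
```

Running it once against the default endpoint and once against a sandbox created with an explicit `host=` gives two summaries you can compare directly.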

What a sandbox is not

  • Not a container. Each sandbox is its own kernel, its own page tables, its own virtualized hardware. Container escapes don’t apply.
  • Not a serverless function. No 15-second timeout, no cold-start penalty per invocation. State persists across run_code calls; the VM stays alive for the idle / max-lifetime window you pick.
  • Not a shared REPL service. Each sandbox belongs to exactly one caller. Cross-tenant visibility at the network, memory, or filesystem layer is not possible by construction.

Tier caps (silent clamps, no errors)

If your API key’s tier doesn’t allow the value you requested, we silently clamp instead of erroring — your code still runs, just with the tier-appropriate ceiling. See pricing.
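| Capability | Free | Pro | Scale | Enterprise |
| --- | --- | --- | --- | --- |
| Max RAM per sandbox | 1 GB | 4 GB | 16 GB | 64 GB |
| Max rootfs | 4 GB | 16 GB | 64 GB | 512 GB |
| Max idle timeout | 5 min | 30 min | 2 h | 24 h |
| Max sandbox lifetime | 30 min | 8 h | 24 h | 7 d |
| Concurrent sandboxes | 10 | 50 | 500 | 10 000 |
| Persistent Spaces | | | | |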
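
Silent clamping amounts to taking the minimum of the requested value and the tier ceiling. A minimal sketch of that rule, using the RAM and rootfs caps from the tier table above; the dictionary layout and the `clamp_request` name are hypothetical, and the real enforcement happens server-side.

```python
# Caps copied from the tier table above (RAM and rootfs in GB).
TIER_CAPS = {
    "free":       {"ram_gb": 1,  "rootfs_gb": 4},
    "pro":        {"ram_gb": 4,  "rootfs_gb": 16},
    "scale":      {"ram_gb": 16, "rootfs_gb": 64},
    "enterprise": {"ram_gb": 64, "rootfs_gb": 512},
}

def clamp_request(tier: str, **requested) -> dict:
    """Lower each requested value to the tier ceiling instead of raising."""
    caps = TIER_CAPS[tier]
    return {name: min(value, caps[name]) for name, value in requested.items()}

effective = clamp_request("pro", ram_gb=16, rootfs_gb=8)
# effective == {"ram_gb": 4, "rootfs_gb": 8}
```

Since no error is raised, it can be worth confirming what you actually got from inside the sandbox, for example with `sb.run_code("grep MemTotal /proc/meminfo", language="bash")`.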

The minimum program

If all you want is to prove to yourself that the whole thing works:
pip install podflare
import os
from podflare import Sandbox

# Create a key at https://dashboard.podflare.ai/keys
os.environ["PODFLARE_API_KEY"] = "pf_live_..."

with Sandbox() as sb:
    sb.run_code("pip install cowsay", language="bash")
    out = sb.run_code("import cowsay; cowsay.cow('podflare')")
    print(out.stdout)
That’s the whole onboarding. Real compute, real network, real state, in one with block.
