fork(n) is Podflare’s core primitive for agent tree search. Call it on any running sandbox and you get n independent copies, each starting from the parent’s exact state: the same Python variables, the same imported modules, and the same files on disk. In the timings measured below, a fork of up to five children completes in roughly 100 ms. From that point, every child diverges in isolation. Memory writes and filesystem writes are copy-on-write, so nothing one child does affects another.

The practical result: you load expensive data once into a parent sandbox, fork n ways to explore n plans in parallel, compare results, and promote the winner, without reloading data or re-running setup code in each branch.

How to use fork

from podflare import Sandbox

with Sandbox() as parent:
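    # Load data once. The snippet is written without leading indentation
    # so it parses as top-level Python inside the sandbox.
    parent.run_code(
        "import pandas as pd\n"
        "df = pd.read_csv('/data/medium.csv')\n"
    )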

    # Explore 5 hypotheses in parallel
    plans = [
        "print(df['col_a'].mean())",
        "print(df['col_b'].mean())",
        "print(df[df.col_a > 0].shape)",
        "print(df.describe().loc['mean'])",
        "print(df.corr().iloc[0])",
    ]

    children = parent.fork(n=len(plans))
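    try:
        # Run one plan per child.
        results = [c.run_code(p) for c, p in zip(children, plans)]
        # pick_best_index is a placeholder for your own scoring logic;
        # it is not provided by podflare.
        winner = children[pick_best_index(results)]
        parent.merge_into(winner)
    finally:
        # Closing the winner after merge_into is a no-op.
        for c in children:
            c.close()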
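
The example above calls pick_best_index without defining it. One way to fill that gap is sketched below. It is a hypothetical helper, not part of podflare, and it makes no assumption about what run_code returns: you supply a score function that turns one result into a number, and the helper returns the index of the highest score. The strings in the usage lines are stand-ins for real results.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def pick_best_index(results: Sequence[T], score: Callable[[T], float]) -> int:
    """Return the index of the highest-scoring result (hypothetical helper).

    `score` is caller-supplied because the shape of a run_code result is
    not specified here. Ties go to the earliest index.
    """
    if not results:
        raise ValueError("no results to choose from")
    return max(range(len(results)), key=lambda i: score(results[i]))


# Stand-in results: plain strings that happen to parse as numbers.
fake_results = ["0.42", "1.7", "0.9"]
best = pick_best_index(fake_results, score=float)
print(best)
```

To keep the one-argument call shape used in the example, bind the score function first, for instance with functools.partial(pick_best_index, score=my_score).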

What each child inherits

Every child starts from a snapshot taken at the moment you called fork():
| Inherited | Not shared after fork |
| --- | --- |
| Python REPL state (variables, imports, open file handles, in-memory objects) | Subsequent memory writes (copy-on-write) |
| Filesystem state (all files written to the parent’s rootfs) | Subsequent filesystem writes (copy-on-write) |
| Full process tree (every running process in the VM) | Each child’s vsock connection (independent control channel) |
Children run their REPL and filesystem fully independently. A crash in one child never affects siblings or the parent.
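
The inherit-then-diverge behavior described above resembles what POSIX fork does for a single process. The sketch below is only an analogy and does not use Podflare: it calls os.fork from the standard library (Unix only), lets the child modify a variable it inherited, and shows that the parent’s copy is unchanged. The function name fork_and_diverge is made up for this illustration.

```python
import os
from typing import Tuple


def fork_and_diverge() -> Tuple[int, int]:
    """Return (value seen by the parent, value seen by the child).

    Analogy only: this forks one OS process, not a sandbox VM.
    """
    value = 1                      # state that exists before the fork
    read_fd, write_fd = os.pipe()  # lets the child report its value back

    pid = os.fork()
    if pid == 0:
        # Child: inherits value == 1, then writes to its own copy.
        try:
            os.close(read_fd)
            value += 41
            os.write(write_fd, str(value).encode())
            os.close(write_fd)
        finally:
            os._exit(0)            # exit without running the parent's code

    # Parent: the child's write is not visible here.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_value = int(reader.read())
    os.waitpid(pid, 0)
    return value, child_value


if __name__ == "__main__":
    print(fork_and_diverge())  # (1, 42)
```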

Fork timing

Fork is fast because children boot in parallel and share memory pages until they diverge. Measured on production hardware:
| n children | snapshot | rebase | spawn (parallel) | total |
| --- | --- | --- | --- | --- |
| 1 | 72 ms | 7 ms | 13 ms | 92 ms |
| 2 | 79 ms | 9 ms | 14 ms | 102 ms |
| 5 | 75 ms | 9 ms | 17 ms | 101 ms |
Spawn cost is nearly flat in n (13 ms for one child, 17 ms for five) because all children restore concurrently, so total time stays around 100 ms instead of growing in proportion to the number of children.
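
To make the amortization concrete, the short calculation below reuses the measurements above. It checks that the three phases add up to the reported total in each row (an observation about these numbers, not a statement of how the total is defined) and divides the total by n to get a per-child cost. The names FORK_TIMINGS_MS and per_child_ms are invented for this illustration.

```python
# Measurements copied from the timing table, in milliseconds.
FORK_TIMINGS_MS = {
    1: {"snapshot": 72, "rebase": 7, "spawn": 13, "total": 92},
    2: {"snapshot": 79, "rebase": 9, "spawn": 14, "total": 102},
    5: {"snapshot": 75, "rebase": 9, "spawn": 17, "total": 101},
}


def phases_sum_ms(n: int) -> int:
    """Sum of the snapshot, rebase, and spawn phases for a fork of n children."""
    row = FORK_TIMINGS_MS[n]
    return row["snapshot"] + row["rebase"] + row["spawn"]


def per_child_ms(n: int) -> float:
    """Total fork time divided by the number of children."""
    return FORK_TIMINGS_MS[n]["total"] / n


for n, row in FORK_TIMINGS_MS.items():
    assert phases_sum_ms(n) == row["total"]
    print(f"n={n}: total {row['total']} ms, {per_child_ms(n):.1f} ms per child")
# n=1: total 92 ms, 92.0 ms per child
# n=2: total 102 ms, 51.0 ms per child
# n=5: total 101 ms, 20.2 ms per child
```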

Comparing forks with diff()

Use diff(other) to compare filesystem state between two sandboxes forked from the same parent:
a, b = parent.fork(n=2)
a.run_code("with open('/root/a.txt', 'w') as f: f.write('from a')")
b.run_code("with open('/root/b.txt', 'w') as f: f.write('from b')")

d = a.diff(b)
# {"added": ["/root/b.txt"], "removed": ["/root/a.txt"], "modified": []}
By default, diff() compares /root and /tmp. Pass paths=[...] to compare other directories.
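
The result shape in the example suggests that a.diff(b) reports paths from a’s point of view: a file present only in b is "added", and a file present only in a is "removed". The sketch below reproduces that semantics locally on plain dictionaries so you can reason about the three buckets. It is an illustration under that assumption, not Podflare’s implementation, and diff_trees is a made-up name.

```python
from typing import Dict, List


def diff_trees(mine: Dict[str, str], other: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two path -> content mappings from the point of view of `mine`."""
    return {
        # Present in the other tree but not in mine.
        "added": sorted(p for p in other if p not in mine),
        # Present in mine but not in the other tree.
        "removed": sorted(p for p in mine if p not in other),
        # Present in both, with different content.
        "modified": sorted(p for p in mine if p in other and mine[p] != other[p]),
    }


tree_a = {"/root/a.txt": "from a", "/root/shared.txt": "v1"}
tree_b = {"/root/b.txt": "from b", "/root/shared.txt": "v2"}

print(diff_trees(tree_a, tree_b))
# {'added': ['/root/b.txt'], 'removed': ['/root/a.txt'], 'modified': ['/root/shared.txt']}
```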

Promoting a winner with merge_into()

Once you’ve identified the best fork, promote it as the parent’s new state:
parent.merge_into(winner)
# parent.id is still valid — it now drives winner's VM.
# Calling winner.close() after merge_into is a no-op.
# Close any other children you no longer need.
for c in children:
    if c is not winner:
        c.close()
After merge_into(winner), the parent sandbox continues with the winner’s memory and filesystem state. The winner’s sandbox ID becomes defunct — use the parent ID going forward.

Limits

  • fork(n) accepts n from 1 to 32 children per call. Larger fanouts are not yet supported.
  • fork() requires a sandbox created through the API (pool-backed, or a child of a previous fork). Sandboxes created in local dev mode without pool support are not forkable.
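
If you have more than 32 plans, one possible workaround is to split them into batches that each fit within the limit. The sketch below only does the batching, in plain Python. It assumes, without confirmation from this page, that the same parent can be forked once per batch; MAX_FORK_N and batch_plans are names invented here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_FORK_N = 32  # documented upper bound on n for a single fork() call


def batch_plans(plans: Sequence[T], limit: int = MAX_FORK_N) -> List[List[T]]:
    """Split plans into consecutive batches of at most `limit` items."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [list(plans[i:i + limit]) for i in range(0, len(plans), limit)]


plans = [f"print({i})" for i in range(70)]
batches = batch_plans(plans)
print([len(b) for b in batches])  # [32, 32, 6]

# Assumed usage, one fork per batch (not verified against the API):
# for batch in batches:
#     children = parent.fork(n=len(batch))
```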
