Support Distributed Execution for BranchContext #1

Description

@congwang-mk

BranchContext is single-node. To scale parallelism (BestOfN with n=100, wide BeamSearch), we need distributed task dispatch. Most AI agents are lightweight processes making LLM API calls, and they already run as containers in Kubernetes. There is no need for Ray, Modal, or a custom gRPC layer: plain K8s Jobs are enough.

Deployment Model

COORDINATOR (pod or bare metal) WORKER PODS (K8s Jobs)
┌──────────────────────────┐ ┌──────────────────────────┐
│ BranchFS FUSE daemon │ RWX PVC │ No BranchFS daemon │
│ │ ─────────────→ │ /mnt/branchfs/ (PVC) │
│ Pattern (Speculate, etc) │ │ read/write only │
│ workspace.branch() │ │ │
│ → ioctl create │ │ branching-worker: │
│ K8s: ConfigMap(task) │ │ read task from ConfigMap│
│ K8s: create Job ───────┼── K8s API ───→ │ fn(*args) │
│ K8s: watch completion │ │ print result to stdout │
│ ← pod stdout (JSON) │ │ exit │
│ b.commit() or abort() │ │ │
│ → ioctl │ │ │
└──────────────────────────┘ └──────────────────────────┘
│ │
└──── ReadWriteMany PVC (EFS/Filestore/CephFS/NFS) ────┘

Storage-agnostic via PVC: The shared filesystem is a K8s PersistentVolumeClaim with ReadWriteMany access mode. The admin provisions it with whatever backend their cluster supports: AWS EFS, GCP Filestore, Azure Files, CephFS (Rook), or plain NFS. The executor only needs the PVC name, not storage-specific details.
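The executor can assemble the worker Job spec directly with the kubernetes Python client. The sketch below is illustrative: the worker image, the `branching-worker` entrypoint, and the `worker_job()` helper are assumptions, and only the PVC name and ConfigMap name are wired in.

```python
from kubernetes import client

def worker_job(job_name: str, image: str, pvc_name: str, configmap_name: str) -> client.V1Job:
    # Worker container: no BranchFS daemon, just the shared PVC and the task file.
    container = client.V1Container(
        name="branching-worker",
        image=image,                                   # hypothetical worker image
        command=["branching-worker"],                  # hypothetical entrypoint
        volume_mounts=[
            client.V1VolumeMount(name="branchfs", mount_path="/mnt/branchfs"),
            client.V1VolumeMount(name="task", mount_path="/etc/branching", read_only=True),
        ],
    )
    pod_spec = client.V1PodSpec(
        restart_policy="Never",
        containers=[container],
        volumes=[
            # The only storage detail needed here is the claim name; EFS/Filestore/
            # CephFS/NFS is whatever the admin bound behind the RWX PVC.
            client.V1Volume(
                name="branchfs",
                persistent_volume_claim=client.V1PersistentVolumeClaimVolumeSource(
                    claim_name=pvc_name),
            ),
            client.V1Volume(
                name="task",
                config_map=client.V1ConfigMapVolumeSource(name=configmap_name),
            ),
        ],
    )
    return client.V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata=client.V1ObjectMeta(name=job_name),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(spec=pod_spec),
            backoff_limit=0,        # fail fast; the pattern on the coordinator decides what to retry
        ),
    )
```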

Coordinator owns all branch decisions: Workers never call create/commit/abort (those are ioctls local to
the BranchFS FUSE daemon). Workers only read/write files in the branch directory via the shared PVC.
The pattern thread on the coordinator receives the worker's return value and decides commit/abort. This
is already how the code works — run_in_process() returns a value, the pattern acts on it.
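For concreteness, a BestOfN-style pattern on the coordinator might look roughly like this. The `executor.map()` dispatcher, the `score()` callable, and the `b.path` attribute are assumptions made for the sketch, not existing APIs:

```python
def best_of_n(workspace, fn, args, n, score, executor):
    """Sketch only: `executor`, `score`, and `b.path` are assumed interfaces."""
    branches = [workspace.branch() for _ in range(n)]      # ioctl create, local to the coordinator
    # One K8s Job per branch; each worker sees its branch directory via the shared PVC.
    results = executor.map(fn, [(b.path, *args) for b in branches])
    best_branch, best_score = None, float("-inf")
    for b, result in zip(branches, results):
        s = score(result)                                  # result came back as JSON via pod stdout
        if s > best_score:
            best_branch, best_score = b, s
    for b in branches:
        if b is best_branch:
            b.commit()                                     # ioctl, coordinator-local
        else:
            b.abort()                                      # ioctl, coordinator-local
    return best_branch, best_score
```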

Task input via ConfigMap: The serialized callable (cloudpickle) is stored in a K8s ConfigMap and mounted
into the worker pod at /etc/branching/task.pkl. The branch directory stays clean — no ._task.pkl
pollution. ConfigMaps are etcd-backed K8s objects with RBAC, lifecycle management, and garbage
collection via ownerReferences (auto-deleted when the owning Job is deleted). Typical agent callables (<1 KB) fit well within the 1 MiB ConfigMap size limit.
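A sketch of that packaging with the kubernetes Python client; the `task_configmap()` helper and the `(fn, args)` payload layout are assumptions, and `binary_data` values must be base64-encoded strings:

```python
import base64
import cloudpickle
from kubernetes import client

def task_configmap(name: str, fn, args, owner_job: client.V1Job) -> client.V1ConfigMap:
    # Serialize the callable and its arguments; typical agent callables are tiny.
    payload = cloudpickle.dumps((fn, args))
    return client.V1ConfigMap(
        metadata=client.V1ObjectMeta(
            name=name,
            # Owned by the Job, so deleting the Job garbage-collects the ConfigMap.
            # (One possible ordering: create the Job first so its uid is known; the
            # kubelet retries the volume mount until the ConfigMap shows up.)
            owner_references=[client.V1OwnerReference(
                api_version="batch/v1",
                kind="Job",
                name=owner_job.metadata.name,
                uid=owner_job.metadata.uid,
            )],
        ),
        # Mounted into the worker pod as /etc/branching/task.pkl.
        binary_data={"task.pkl": base64.b64encode(payload).decode("ascii")},
    )
```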

Result passing via pod stdout: the worker prints its result as JSON to stdout, and the coordinator reads it via the K8s read_namespaced_pod_log() API after the Job completes. No filesystem IPC for results and no polling: the K8s watch API is event-driven.
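Roughly, the coordinator's collection path could look like the sketch below: watch the Job until it reports success, then take the last stdout line of the Job's pod as the JSON result. The `collect_result()` helper is hypothetical.

```python
import json
from kubernetes import client, watch

def collect_result(job_name: str, namespace: str = "default"):
    batch, core = client.BatchV1Api(), client.CoreV1Api()
    # Event-driven: block on the Job watch instead of polling its status.
    w = watch.Watch()
    for event in w.stream(batch.list_namespaced_job, namespace=namespace,
                          field_selector=f"metadata.name={job_name}"):
        status = event["object"].status
        if status.succeeded:
            w.stop()
            break
        if status.failed:
            w.stop()
            raise RuntimeError(f"Job {job_name} failed")
    # The Job controller labels its pods with job-name=<job>; read that pod's log.
    pod = core.list_namespaced_pod(namespace,
                                   label_selector=f"job-name={job_name}").items[0]
    log = core.read_namespaced_pod_log(pod.metadata.name, namespace)
    return json.loads(log.strip().splitlines()[-1])     # last stdout line is the JSON result
```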

Data flow through K8s API objects (all etcd-backed):

Task callable → ConfigMap (etcd write)
Task dispatch → Job (etcd write)
Completion → Job watch (etcd watch — event-driven, no polling)
Result → Pod logs (k8s API read)
Cleanup → ownerRef (ConfigMap auto-deleted with Job)
Cancellation → Job delete (etcd delete → pod killed)
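The worker half of this flow is small. A hypothetical branching-worker entrypoint (the name comes from the diagram; the `(fn, args)` pickle layout matches the ConfigMap sketch above) would be roughly:

```python
import json
import sys
import cloudpickle

def main() -> int:
    # Task arrives via the ConfigMap mount, not via the branch directory.
    with open("/etc/branching/task.pkl", "rb") as f:
        fn, args = cloudpickle.load(f)
    # The branch's files live under /mnt/branchfs/ on the shared PVC; the worker
    # only reads and writes there and never touches branch lifecycle ioctls.
    result = fn(*args)
    print(json.dumps(result))   # assumes a JSON-serializable result; coordinator reads the pod log
    return 0

if __name__ == "__main__":
    sys.exit(main())
```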
