Brainstorm · operating modes

How containerised coding agents actually run

Eight operating modes for running claude / codex / copilot inside a container (HolyClaude or alt). For each: container topology, what works, what doesn't, when to use, agent compat. Then the cross-cutting capability matrix, Stevie's direct questions answered, and an honest list of what can't be done at all.

Date: 2026-05-15 · For: ainb bossmode refactor · Backend: HolyClaude (pinned :v1.2.2) · Pre-design · for discussion
TL;DR · five things to internalise
  • Container ≠ session. One HolyClaude container can host many short-lived claude -p processes (or hold one long-lived claude REPL). The container is the sandbox; the process is the agent.
  • CloudCLI UI is optional. Each container ships one CloudCLI on internal :3001. If you spawn N containers you have N CloudCLI instances. Map host ports as you see fit — or map none and use docker exec only.
  • Session resume works if session files persist. claude -c / --resume read ~/.claude/projects/<hash>/<uuid>.jsonl. Bind-mount the auth tree → resume works across container restarts.
  • Cross-agent context sharing is impossible without a translation layer. claude and codex have different session formats and don't read each other's files.
  • "Bossmode" is one mode among many. Today it's mode 1 (fire-and-forget). The refactor is a chance to enable modes 2–8 without breaking mode 1.
01

Eight operating modes

Each card: a topology sketch (host · containers · workspace mount · who-talks-to-who) · what works · what doesn't · when to reach for it · which agents support it. Read them as a spectrum from ephemeral & cheap (mode 1) to long-lived & complex (modes 5–7).

01 · Fire-and-forget

Ephemeral container per task. claude -p "$PROMPT" runs, JSON streams out, process & container exit. Current ainb bossmode.

Topology: ainb host (TUI) → spawn → HolyClaude · ephemeral (claude -p "$P") → JSON stream back to host · /workspace ← bind mount
Works
  • Single-prompt task with structured JSON output
  • Strong per-task isolation — crash one, others unaffected
  • No state leakage between tasks
  • Easy to test (deterministic lifecycle)
Doesn't work
  • No follow-up — container gone before you can attach
  • ~3GB image pull on cold start hurts first-task latency
  • Can't resume a previous session unless the session jsonl is persisted outside the container
  • Wastes the CloudCLI UI (gone before you'd open it)
Use when: scripted one-shot tasks · CI · "explain this file" / "fix this bug" / "write tests for X"
Agents: claude · codex · copilot
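
The mode 1 lifecycle can be sketched as a single docker run argv. This is a minimal sketch, not ainb's actual spawn code: the image reference and the helper name are hypothetical, the auth mount path follows the host and container paths mentioned elsewhere in this document, and it assumes the image lets the caller override the command.

```python
from pathlib import Path
from typing import List

# Hypothetical image reference; the source only says HolyClaude is pinned at :v1.2.2.
IMAGE = "holyclaude:v1.2.2"


def fire_and_forget_argv(prompt: str, worktree: Path, auth_dir: Path,
                         image: str = IMAGE) -> List[str]:
    """Build a `docker run` argv for one ephemeral `claude -p` task."""
    return [
        "docker", "run",
        "--rm",                                    # remove the container once the process exits
        "-v", f"{worktree}:/workspace",            # worktree bind mount
        "-v", f"{auth_dir}:/home/claude/.claude",  # auth + session jsonl survive the container
        "-w", "/workspace",
        image,
        "claude", "-p", prompt,
        "--output-format", "stream-json",
        "--verbose",  # assumption: stream-json in print mode needs --verbose
    ]
```

Passing the argv as a list (rather than a shell string) keeps the prompt from being re-parsed by a shell, which matters when prompts contain quotes.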
02 · Sticky single-task

Fire-and-forget, but the container survives after claude -p exits. The user can run docker exec -it <id> claude -c to attach a TTY to the same session, or just inspect the logs by hand.

Topology: ainb host (TUI) → spawn → HolyClaude · sticky (claude -p, exited 0; claude -c on attach) · docker exec -it later
Works
  • Read logs / status long after task finished
  • Attach interactive REPL post-hoc (claude -c)
  • Open CloudCLI UI for a forensic browse
  • Resume the exact session that ran (jsonl is still there)
Doesn't work
  • Containers accumulate — need a reaper or explicit kill
  • Disk + memory cost grows with each kept-alive task
  • Auth tree shared if multiple sticky containers live concurrently — one bad token can cascade
Use when: you might want to follow up · forensic debugging · "what did it do exactly?"
Agents: claude · codex · copilot
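
The "containers accumulate" problem needs some reaping policy. The following is one possible sketch, assuming a hypothetical record type and two hypothetical knobs (an idle timeout and a cap on kept containers); none of these names exist in ainb today.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class StickyContainer:
    container_id: str
    exited_at: float      # unix seconds at which claude -p finished
    pinned: bool = False  # user explicitly asked to keep this one


def select_for_reaping(containers: Iterable[StickyContainer], now: float,
                       max_idle_s: float = 3600.0, max_kept: int = 5) -> List[str]:
    """Return ids to remove: unpinned containers idle longer than max_idle_s,
    plus the oldest remaining unpinned ones beyond max_kept."""
    unpinned = sorted((c for c in containers if not c.pinned),
                      key=lambda c: c.exited_at)
    expired = [c for c in unpinned if now - c.exited_at > max_idle_s]
    survivors = [c for c in unpinned if c not in expired]
    overflow = survivors[: max(0, len(survivors) - max_kept)]
    return [c.container_id for c in expired + overflow]
```

Keeping the policy a pure function makes it easy to test without Docker; the caller would pass the returned ids to docker rm -f.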
03 · Interactive REPL

Long-lived claude (no -p) in TTY. User drives via docker exec -it or via CloudCLI browser UI on :3001. Replaces ainb's current tmux Interactive mode.

Topology: user (browser on :3001, or tmux) → stdio → HolyClaude · long-lived (CloudCLI :3001; claude REPL on a TTY) · mounts: /workspace, /home/claude/.claude
Works
  • Native multi-turn conversation
  • User can interrupt / steer mid-response
  • Tool approval prompts (no --dangerously-skip-permissions needed)
  • CloudCLI UI in browser if the user wants it
Doesn't work
  • Hard to programmatically inject prompts (TTY semantics)
  • No structured JSON output — ainb can't parse events live
  • Container locks one host port (per-container :3001 mapping)
Use when: exploratory work · pair programming · agent assists you in real-time
Agents: claude · codex · copilot
04 · Multi-turn programmatic

Long-lived container; ainb drives it via repeated docker exec <id> claude -p calls, and each follow-up prompt passes --resume <session-id> to keep context. Structured output, no TTY.

Topology: ainb (orchestrator) → exec prompt N → HolyClaude · long-lived (claude -p --resume <sid>) · turn 1: -p "init" · turn 2: -p --resume "follow up" · turn N: -p --resume "..."
Works
  • Multi-turn conversation driven by ainb (orchestrate the chat)
  • Structured JSON output preserved — live event parsing
  • Cheap turns (no container restart) once container is up
  • Session jsonl persists, restartable across machine reboots
Doesn't work
  • No mid-turn interrupt — each claude -p is atomic
  • Context window still bounded — long sessions need compaction strategy
  • Hardcoded prompt prefixes (current bossmode) bloat every turn
Use when: autonomous task with planned multi-step flow · ainb-driven dialog · agent-as-API
Agents: claude · codex · copilot
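
The turn-by-turn pattern above can be sketched as one argv builder that adds --resume only on follow-up turns. The helper name is hypothetical, and the output flags assume ainb keeps parsing stream-json as it does today.

```python
from typing import List, Optional


def turn_argv(container_id: str, prompt: str,
              session_id: Optional[str] = None) -> List[str]:
    """Build a `docker exec` argv for one programmatic turn.

    The first turn has no session id yet; later turns pass the id
    captured from the first turn's output.
    """
    argv = [
        "docker", "exec", container_id,
        "claude", "-p", prompt,
        "--output-format", "stream-json",
        "--verbose",  # assumption: stream-json in print mode needs --verbose
    ]
    if session_id is not None:
        argv += ["--resume", session_id]
    return argv
```

A driver would call turn_argv(cid, "init") once, record the session id from the output, then call turn_argv(cid, "follow up", sid) for each later turn.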
05 · Babysitter / autonomous loop

Agent receives a goal, self-loops with periodic status reports to ainb. Maps to /loop + cloud-coding-agent patterns. Human can interrupt; agent owns the next-action decision.

Topology: ainb (babysitter; poll · interrupt) → goal → HolyClaude · self-driving (claude with goal loop: plan → act (tool calls) → report status, loop until done) · heartbeat → ainb
Works
  • Long-running goals without human babysitting
  • Status visibility via periodic JSON heartbeats
  • Interrupt-able from ainb (kill, pause, send new instructions)
  • Cost-cap-able (ainb stops when budget exceeded)
Doesn't work
  • No native loop primitive in claude CLI — ainb has to define the protocol
  • Stall detection is hard (agent thinking vs agent stuck)
  • Cost runaway without a hard cap; /loop guardrails apply
  • Reproducibility low — same goal can produce different paths
Use when: fix-this-issue-end-to-end · long refactor · "build me a goal" / set-and-forget
Agents: claude · codex · copilot
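
Since ainb has to define the heartbeat protocol itself, an external cost cap could look roughly like this. The event shape (a JSON object with "type": "heartbeat" and an incremental "cost_usd" field) is purely hypothetical and stands in for whatever protocol ainb ends up defining.

```python
import json
from typing import Iterable, Optional


def first_over_budget(lines: Iterable[str], budget_usd: float) -> Optional[int]:
    """Return the index of the line at which cumulative heartbeat cost
    first exceeds budget_usd, or None if the budget is never exceeded."""
    spent = 0.0
    for i, line in enumerate(lines):
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate blank lines and non-JSON log noise
        if not isinstance(event, dict) or event.get("type") != "heartbeat":
            continue
        spent += float(event.get("cost_usd", 0.0))
        if spent > budget_usd:
            return i
    return None
```

When the function returns an index, the babysitter would stop reading and kill the container; a None result means the run stayed within budget.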
06 · Parallel tasks · one container

N concurrent docker exec claude -p calls inside the same HolyClaude container. Matches HolyClaude's own design. Shared auth + workspace tree.

Topology: ainb (fan out) → N prompts in parallel → HolyClaude · shared (claude -p task A · claude -p task B · claude -p task C) · shared /workspace · shared auth
Works
  • Spawn extra tasks for free (no container cold start)
  • Lower memory footprint than N containers
  • Matches HolyClaude's own one-container-per-user model
Doesn't work
  • Workspace race conditions if two tasks edit the same file
  • Shared CLAUDE.md memory — tasks pollute each other
  • Auth rate-limits collapse onto one Max plan account
  • Kill one task ≠ kill the others — careful PID management
Use when: multiple read-only tasks · "review these 5 files in parallel" · independent sandboxes within one project
Agents: claude · codex · copilot
07 · Multi-agent comparison

Same prompt → claude + codex + copilot in parallel (separate processes or separate containers). Diff the outputs. Useful for "which agent solves this better" + cross-validation.

Topology: ainb (fan / diff) → same prompt → 3 agents, 3 sandboxes · container A: claude -p · container B: codex exec · container C: gh copilot suggest
Works
  • Side-by-side agent quality comparison
  • Pick best output without committing upfront to one agent
  • True isolation — separate containers, no cross-contamination
Doesn't work
  • 3× cost (3 LLM calls + 3 containers)
  • 3× workspace state — diff merge is a UX problem
  • No native cross-agent context sharing (each starts fresh)
  • Voting / consensus needs an orchestration layer ainb doesn't have
Use when: high-stakes change · benchmark / eval runs · "let's see what each thinks"
Agents: claude · codex · copilot
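
The "diff the outputs" step can be sketched with the standard library alone. This assumes each agent's result has already been collected as plain text keyed by agent name; the function name is hypothetical, and it does not attempt the voting or merge step the card lists as missing.

```python
import difflib
from itertools import combinations
from typing import Dict, Tuple


def pairwise_diffs(outputs: Dict[str, str]) -> Dict[Tuple[str, str], str]:
    """Return a unified diff for every pair of agents, keyed by (agent_a, agent_b).

    An empty string means the two agents produced identical text.
    """
    diffs = {}
    for a, b in combinations(sorted(outputs), 2):
        diff = difflib.unified_diff(
            outputs[a].splitlines(), outputs[b].splitlines(),
            fromfile=a, tofile=b, lineterm="",
        )
        diffs[(a, b)] = "\n".join(diff)
    return diffs
```

Three agents yield three pairs, which is the "3-way diff" the capability matrix refers to; presenting those three diffs usefully is still the UX problem noted above.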
08 · Headless CI

No CloudCLI UI exposed, no TTY, no human in loop. claude -p runs · exit code is the signal · stdout is the artifact. Built for build agents.

Topology: CI runner (GHA etc., no UI) → spawn → HolyClaude · headless (claude -p, no --verbose; no :3001 mapped) · exit code → CI verdict
Works
  • Reproducible, exit-code-driven automation
  • API-key auth via env (no browser OAuth needed)
  • No port mapping = smaller attack surface
  • Pin to a Claude model version for deterministic-ish CI
Doesn't work
  • Most CI runners can't grant SYS_ADMIN / seccomp=unconfined — HolyClaude's Chromium will fail
  • No recovery path if auth lapses mid-run
  • Stream parsing for live dashboards harder (no --verbose for log volume)
Use when: PR-triggered tasks · scheduled rollouts · code-burn audits · machine workflows
Agents: claude · codex · copilot
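
"Exit code is the signal, stdout is the artifact" can be sketched as a thin wrapper around a subprocess. The helper name and timeout default are hypothetical; the argv passed in would be whatever docker run command the CI job uses.

```python
import subprocess
from typing import List, Tuple


def run_headless(argv: List[str], timeout_s: float = 600.0) -> Tuple[bool, str]:
    """Run a command with no TTY and return (passed, stdout).

    passed is True only for exit code 0; a timeout counts as a failure.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, ""
    return proc.returncode == 0, proc.stdout
```

The CI step would save the returned stdout as a build artifact and fail the job when passed is False.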
02

Capability matrix

What each mode supports across the dimensions Stevie's questions touch. Yes · Partial · No · N/A.

| Mode | Resume previous conversation | Switch agent mid-flight | Live JSON stream | File-diff preview during exec | Port-forward dev server | CloudCLI UI accessible | Multi-prompt continuity | Survive container restart | Cost trackable |
|---|---|---|---|---|---|---|---|---|---|
| 01 · Fire-and-forget (claude -p, ephemeral) | No | No | Yes | Partial | No | No | No | If jsonl persisted | Yes |
| 02 · Sticky single-task (survive after exit) | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 03 · Interactive REPL (tty, long-lived) | Native | No | No (TTY) | Yes | Yes | Yes | Native | Yes | Partial |
| 04 · Multi-turn programmatic (repeated -p --resume) | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 05 · Babysitter loop (self-driving) | Yes | No | Partial | Yes | Yes | Yes | Yes | Yes | Required (cap) |
| 06 · Parallel · one container (shared workspace) | Per-task | No | Yes | Race risk | Single port | One UI | Per-task | Yes | Yes |
| 07 · Multi-agent comparison (claude+codex+copilot) | Per-agent | Per-container | Yes | 3-way diff | 3 ports | 3 UIs | Per-agent | Yes | 3× cost |
| 08 · Headless CI (no UI, exit-code) | If jsonl in artifact | No | Yes | No | No | N/A | No | Cache-dependent | Yes |
03

Stevie's questions answered

Direct answers, with which modes apply. No ducking.

If we have multiple containers, do we get multiple CloudCLI UIs?

Yes — one CloudCLI per HolyClaude container. Each container ships a CloudCLI server bound to its internal :3001. To reach it from the host, you must map a unique host port per container — :3001, :3002, :3003, … — or rotate which container holds :3001 at any moment, or skip CloudCLI entirely and use docker exec.

For ainb: the cleanest default is no CloudCLI exposure (use docker exec + JSON streaming), with an opt-in flag (ainb container ui) that picks an unused host port and tells the user where to point their browser. CloudCLI becomes a forensic / interactive escape hatch, not the primary UX.

Modes affected: 02, 03, 06, 07 (UI-relevant). Modes 01, 04, 05, 08 typically skip the UI.
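
The opt-in "pick an unused host port" step could be sketched as follows. Both helper names are hypothetical. Asking the OS for port 0 is a common way to find a free port, but there is an unavoidable window between picking the port and Docker binding it, so the caller should be prepared to retry.

```python
import socket
from typing import List


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 lets the OS choose
        return s.getsockname()[1]


def ui_port_mapping(host_port: int, container_port: int = 3001) -> List[str]:
    """docker run flags that publish CloudCLI on localhost only."""
    return ["-p", f"127.0.0.1:{host_port}:{container_port}"]
```

Binding the published port to 127.0.0.1 keeps the UI off the network by default, in line with treating CloudCLI as an escape hatch rather than the primary UX.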

Can I issue something else in that worktree while a task runs?

Yes, though how depends on the mode. In mode 6 (parallel-in-one-container), just docker exec a second claude -p. In mode 7 (multi-container), spawn a second container with the same workspace bind mount. In mode 1 (fire-and-forget ephemeral), you'd have to wait or spawn a peer container in parallel. Mode 3 (REPL) blocks because the user is mid-interaction with one process.

Warning: concurrent tasks against the same worktree files will race. If both tasks edit src/foo.rs, last-writer-wins. Mitigation: per-task worktrees (git worktree branch-per-task) or read-only mounts for non-mutating tasks.

Modes that support it cleanly: 06 (parallel-in-one), 07 (multi-container). Mode 03 blocks (one REPL = one user attention).
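
The two mitigations named in the warning (a worktree per task, and read-only mounts for non-mutating tasks) can be sketched as argv builders. The branch naming scheme, the .worktrees directory, and both helper names are hypothetical choices for illustration.

```python
from pathlib import PurePosixPath
from typing import List


def worktree_argv(repo_root: str, task_id: str, base: str = "HEAD") -> List[str]:
    """Build `git worktree add` so each task gets its own branch and directory."""
    branch = f"task/{task_id}"
    path = PurePosixPath(repo_root) / ".worktrees" / task_id
    return ["git", "-C", repo_root, "worktree", "add", "-b", branch, str(path), base]


def workspace_mount(path: str, read_only: bool) -> List[str]:
    """docker run flags for the /workspace bind mount, optionally read-only."""
    suffix = ":ro" if read_only else ""
    return ["-v", f"{path}:/workspace{suffix}"]
```

With one worktree per task, two tasks that both touch src/foo.rs produce two branches to merge instead of a silent last-writer-wins overwrite.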

How do I look through status / result / whole execution after the fact?

Three layers, depending on retention policy.

  • Live: the ainb TUI tails the container's stdout (current bossmode pattern, parses stream-json).
  • Post-hoc, container alive: docker logs <id>, or open the CloudCLI UI on the container's :3001 mapping.
  • Post-hoc, container gone: persisted logs at ~/.agents-in-a-box/holyclaude/logs/<session-id>.jsonl, if ainb saves them on container teardown. The session jsonl files at ~/.agents-in-a-box/holyclaude/claude/projects/<hash>/<uuid>.jsonl are always recoverable via the bind mount.

Interactive review: docker exec -it <id> claude -c opens the exact same session in a TTY for follow-up questions — works as long as the container is alive. Mode 2 (sticky) keeps this option open by design.

Best supported by: 02 (sticky) for live forensic access · 08 (CI) for artifact-driven review.

Can I continue conversation from a previous session?

Yes, with the right plumbing. claude CLI supports claude -c (continue most recent in cwd) and claude --resume <session-id> (specific session). Both read jsonl from ~/.claude/projects/<project-hash>/<uuid>.jsonl. Because HolyClaude bind-mounts ~/.claude from ./data/claude/ (in ainb's case: ~/.agents-in-a-box/holyclaude/claude/), these files persist across container restarts.

For ainb: surface a "resume this session" affordance in the TUI's session list. The trick is mapping a session UUID back to a human-readable label (current bossmode uses worktree branch + first prompt as the label — keep that).

Best supported by: 02 (sticky), 03 (REPL), 04 (programmatic), 05 (babysitter). Mode 01 can do it too if jsonl persisted.
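
To populate a "resume this session" list, ainb would need to enumerate the persisted session files. Here is a minimal sketch, assuming only the projects/<hash>/<uuid>.jsonl layout described above; the function name is hypothetical and the file contents are not inspected.

```python
from pathlib import Path
from typing import Optional


def latest_session(claude_dir: Path) -> Optional[str]:
    """Return the UUID (file stem) of the most recently modified session
    jsonl under claude_dir/projects/<hash>/, or None if there are none."""
    files = list((claude_dir / "projects").glob("*/*.jsonl"))
    if not files:
        return None
    newest = max(files, key=lambda p: p.stat().st_mtime)
    return newest.stem
```

Here claude_dir would be the host side of the bind mount; the returned UUID is what gets passed to claude --resume.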

Can I resume the same context after fire-and-forget exits?

Yes, as long as the jsonl survives. In mode 1, the container is gone but the bind-mounted jsonl is on host disk. Spawn a new container with the same bind mount, claude --resume <uuid>, and you pick up exactly where the previous run left off.

This is the bridge from mode 1 → mode 4 (multi-turn programmatic). The very same session can be "fire-and-forget today, follow up tomorrow" without changing mode operationally — just spawn a fresh container against the persisted state.

Architectural impact: ainb must record the session-id from each run (it's in the stream-json output's init event) so it can pass it back on resume.
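
Recording the session id could be sketched as a scan over the stream-json lines for the init event. The exact event shape used here ("type": "system", "subtype": "init", with a "session_id" field) is an assumption about the stream-json format and should be checked against the pinned claude version.

```python
import json
from typing import Iterable, Optional


def find_session_id(lines: Iterable[str]) -> Optional[str]:
    """Return the session id from the first init event, or None if absent."""
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank lines and non-JSON noise
        if (isinstance(event, dict)
                and event.get("type") == "system"
                and event.get("subtype") == "init"):
            sid = event.get("session_id")
            if isinstance(sid, str):
                return sid
    return None
```

Storing this id alongside the worktree branch and first prompt gives ainb what it needs to offer a resume later.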

After fire-and-forget, can I start an interactive session on top to give more instructions?

Yes — two clean paths.

Path A (recommended for ainb): Spawn a fresh container with the persisted bind mount and run claude -c on a TTY via docker exec -it (mode 3 layered on top of mode 1). Context resumes, the user types a follow-up, the agent responds. The interactive session ends when the user exits the TTY.

Path B: Use mode 2 (sticky) from the start — never let the original container exit. Then docker exec -it <id> claude -c on the still-running container. Faster (no cold start) but commits to the "container keeps running" cost.

UX hook: "Continue this task" button in the TUI's session list — defaults to Path A, falls back to Path B if the container is still alive.

04

What we can't do at all

Honest constraints. These aren't trade-offs to optimise — they're walls. Don't promise these in the UI.

  • True conversation forking — branching a mid-conversation agent state into two divergent paths needs CRIU-style process snapshotting of the claude CLI's in-memory state. Docker checkpoints exist but are flaky and don't preserve LLM connection state. Closest available: clone the jsonl, restart in two parallel containers, accept that the two paths see the world from slightly different starting points.
  • Cross-agent session sharing: claude can't read codex's session file (and vice versa). Different schema, different storage location, different mental model. To use claude's context in codex, ainb would need a translation layer that exports claude's jsonl to plain markdown / chat-format and feeds it as a fresh prompt to codex. Lossy by definition.
  • Hot-swap container mid-task without losing state — Docker has no live-migrate for arbitrary processes. If you want to upgrade the HolyClaude image while a task runs, the task dies. Wait for idle, then swap.
  • Multi-agent voting / consensus — mode 7 fans out, but the "pick the best" or "merge into one" step needs an orchestrator ainb doesn't have. Could be a future plugin (mode 7 + a judge LLM); not free.
  • True streaming pause / resume mid-prompt: claude -p has no pause primitive. The closest: kill the process, lose the partial output, resume via --resume from the last committed turn. Interactive mode 3 lets the user interrupt with Ctrl-C, but that's a kill, not a pause.
  • Native cost cap inside claude CLI — claude doesn't expose a "stop after N tokens" hard limit. Ainb has to enforce caps externally by watching usage events from stream-json and killing the container if budget exceeded. Crude but workable.
  • Headless OAuth refresh inside a CI runner — if the bind-mounted token expires mid-run, the user (a human with a browser) is required to re-auth. CI uses ANTHROPIC_API_KEY for exactly this reason. Don't try to OAuth in CI.
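
The lossy translation layer mentioned under cross-agent session sharing might look roughly like this. The record shape is an assumption about claude's session jsonl (each line carrying a "message" object with a "role" and a "content" that is either a string or a list of typed blocks); tool calls and anything else that is not plain text are dropped, which is exactly the lossiness the bullet warns about.

```python
import json
from typing import Any, Iterable


def _text_of(content: Any) -> str:
    """Flatten message content to plain text, keeping only text blocks."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = [b.get("text", "") for b in content
                 if isinstance(b, dict) and b.get("type") == "text"]
        return "\n".join(p for p in parts if p)
    return ""


def transcript_markdown(lines: Iterable[str]) -> str:
    """Render user/assistant turns from session jsonl lines as markdown."""
    out = []
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue
        msg = rec.get("message") if isinstance(rec, dict) else None
        if not isinstance(msg, dict) or msg.get("role") not in ("user", "assistant"):
            continue
        text = _text_of(msg.get("content")).strip()
        if text:
            out.append(f"**{msg['role']}:** {text}")
    return "\n\n".join(out)
```

The resulting markdown would be prepended to a fresh prompt for the other agent, which then starts a new session of its own rather than sharing claude's.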