ainb × Multica

From local CLI to managed-agents control plane

Implementation plan · May 2026 · 6 research reports · 3,220 lines of findings
31.2k Multica GitHub stars · ~45% ainb primitives reusable · 8 phases, sequenced delivery · ~16 wks estimated build
01

What is Multica

Multica is an open-source managed-agents platform: you assign GitHub issues to AI agents through a web Kanban, a daemon on your machine claims tasks, spawns agent CLI subprocesses, and the PR ships while you watch live transcript updates.

Figure: Multica end-to-end architecture. Clients to Go server to optional cloud-runtime fleet · per-user daemon spawns agent CLIs.

CLIENTS
• Next.js Web (apps/web): Kanban
• Electron Desktop (apps/desktop)
• Expo Mobile, iOS (apps/mobile): read-only
• multica CLI: Go binary, same as daemon
• GitHub App: PR mirror + webhook

TRANSPORT
HTTPS REST · WSS /ws (user hub) · WSS /api/daemon/ws (daemon hub) · JWT · PAT mul_… · daemon mdt_…

GO SERVER (single binary)
• Chi router + middleware: Auth · WS upgrade · RBAC
• service.* layer: TaskSvc · AutopilotSvc · Email
• realtime.Hub: user fan-out WS
• daemonws.Hub: daemon wakeup WS
• events.Bus: in-proc pub/sub
• Background sweepers: queued/dispatched/running TTLs
• Analytics + Metrics: PostHog · Prometheus
• sqlc queries: agent_task_queue · issue · skill
• Cloud-runtime client (proxy stub): to closed-source fleet, paid upsell

STORAGE
• PostgreSQL 17 + pgvector: 98 migrations · sqlc · pg_cron
• Redis (optional): PAT cache · empty-claim · relay
• Cloud-runtime fleet (external, closed): MULTICA_CLOUD_FLEET_URL, paid

PER-USER DAEMON (lives on user machine, same Go binary)
• Heartbeat loop: WS push · HTTP fallback
• Poll loop (3s): claim queued tasks
• Workspace sync: membership refresh
• GC loop: .gc_meta.json driven
• Auto-update: drain-wait
• Per-task isolated env directory: ~/multica_workspaces/[ws]/[task]/{workdir,output,logs,.gc_meta.json}
• Agent CLI subprocess: claude · codex · openclaw · hermes · kimi · cursor · copilot · gemini · pi · opencode · kiro

Daemon ↔ server: REST claim/start/complete · WSS task_available wakeup

Server never sees the LLM call. It sees only: result, session_id, work_dir, usage. Daemon owns the data plane.
02

Task lifecycle state machine

Every agent assignment becomes a task row. State transitions are driven by the daemon (claim/start/complete/fail) and by server-side TTL sweepers. Retries create new rows with parent_task_id — old row stays for audit.
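The lifecycle can be sketched as a small transition table plus a sweeper check. This is a minimal sketch, not Multica's code: the event names and TTLs come from the lifecycle diagram in this section, while the source states assumed for Complete, Fail, and Cancel (and the `apply` and `sweep` helper names) are assumptions made for illustration.

```python
from enum import Enum


class State(str, Enum):
    QUEUED = "queued"
    DISPATCHED = "dispatched"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# (current state, event) -> next state. Claim/Start edges follow the diagram;
# the source states for Complete, Fail, and Cancel are assumptions.
TRANSITIONS = {
    (State.QUEUED, "ClaimTask"): State.DISPATCHED,
    (State.DISPATCHED, "StartTask"): State.RUNNING,
    (State.RUNNING, "Complete"): State.COMPLETED,
    (State.RUNNING, "Fail"): State.FAILED,
    (State.QUEUED, "Cancel"): State.CANCELLED,
    (State.DISPATCHED, "Cancel"): State.CANCELLED,
    (State.RUNNING, "Cancel"): State.CANCELLED,
}

# Sweeper TTLs in seconds: queued 2h, dispatched 5m, running 2.5h.
TTL_SECONDS = {
    State.QUEUED: 2 * 3600,
    State.DISPATCHED: 5 * 60,
    State.RUNNING: int(2.5 * 3600),
}


def apply(state: State, event: str) -> State:
    """Return the next state, or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state.value} on {event}") from None


def sweep(state: State, age_seconds: float) -> State:
    """Server-side sweeper: expire a non-terminal task that outlived its TTL."""
    ttl = TTL_SECONDS.get(state)
    if ttl is not None and age_seconds > ttl:
        return State.FAILED  # the diagram records this as failed(timeout)
    return state
```

Keeping the table as data means every illegal move (for example StartTask on a queued task) fails loudly in one place, which is the property the daemon and the sweepers both rely on.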

Figure: Multica task lifecycle state machine. queued to dispatched to running to terminal · with TTL sweepers and bounded retry.

States: queued · dispatched · running · completed · failed · cancelled

Transitions:
• enqueue → queued
• ClaimTask: queued → dispatched
• StartTask: dispatched → running
• Complete → completed
• Fail → failed
• Cancel → cancelled
• ReclaimStaleDispatchedTask (90s)

TTL sweepers:
• queued 2h TTL → failed(timeout)
• dispatched 5m TTL → failed(timeout)
• running 2.5h TTL → failed(timeout)

Retry: auto-rerun if attempt < max + retryable failure_reason

INVARIANTS & SOURCE
• Partial unique index idx_one_pending_task_per_issue WHERE status IN (queued,dispatched): at most one pending per issue (migration 022)
• Idempotent finalize: if UPDATE matches 0 rows but row already in terminal state, return success (task.go:1010); defeats WS-vs-HTTP race
• Retry creates new row with parent_task_id pointing at old; only failure_reason in (runtime_offline, runtime_recovery) is retryable (migration 055)
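The three invariants can be tried out against an in-memory SQLite database, which also supports partial unique indexes. This is a rough sketch under stated assumptions: the `task` table layout, the `enqueue`, `finalize`, and `maybe_retry` helpers, and the `MAX_ATTEMPTS` bound are all hypothetical, and Multica itself runs on PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (
    id             INTEGER PRIMARY KEY,
    issue_id       INTEGER NOT NULL,
    status         TEXT NOT NULL DEFAULT 'queued',
    attempt        INTEGER NOT NULL DEFAULT 1,
    failure_reason TEXT,
    parent_task_id INTEGER REFERENCES task(id)
);
-- At most one pending (queued or dispatched) task per issue.
CREATE UNIQUE INDEX idx_one_pending_task_per_issue
    ON task (issue_id) WHERE status IN ('queued', 'dispatched');
""")

TERMINAL = ("completed", "failed", "cancelled")
RETRYABLE = ("runtime_offline", "runtime_recovery")
MAX_ATTEMPTS = 3  # hypothetical bound; the source only says "attempt < max"


def enqueue(issue_id, parent_task_id=None, attempt=1):
    """Insert a queued task; the partial index rejects a second pending row."""
    cur = conn.execute(
        "INSERT INTO task (issue_id, parent_task_id, attempt) VALUES (?, ?, ?)",
        (issue_id, parent_task_id, attempt),
    )
    return cur.lastrowid


def finalize(task_id, status, failure_reason=None):
    """Idempotent finalize: a repeat call on an already-terminal row succeeds."""
    cur = conn.execute(
        "UPDATE task SET status = ?, failure_reason = ? "
        "WHERE id = ? AND status = 'running'",
        (status, failure_reason, task_id),
    )
    if cur.rowcount == 1:
        return True
    # Zero rows matched: succeed only if another path already finished the task.
    row = conn.execute("SELECT status FROM task WHERE id = ?", (task_id,)).fetchone()
    return row is not None and row[0] in TERMINAL


def maybe_retry(task_id):
    """Insert a new row pointing at the failed one; the old row is untouched."""
    issue_id, status, attempt, reason = conn.execute(
        "SELECT issue_id, status, attempt, failure_reason FROM task WHERE id = ?",
        (task_id,),
    ).fetchone()
    if status != "failed" or reason not in RETRYABLE or attempt >= MAX_ATTEMPTS:
        return None
    return enqueue(issue_id, parent_task_id=task_id, attempt=attempt + 1)
```

Because the uniqueness rule lives in the index rather than in application code, a duplicate enqueue fails with an integrity error no matter which code path attempts it.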
03

User journey — assign issue, agent ships PR

The complete path from human action to merged PR. The WS-driven UI watching task lifecycle and comments is what makes the experience feel like a teammate rather than a CLI tool.

Figure: Multica user journey, assign issue to agent, agent ships PR. Sequence diagram · the path Stevie wants to replicate verbatim.

Participants: Human (Web) · Multica Server · Daemon · Agent CLI · Git/GitHub

Messages, in order:
• Human → Server: POST /api/issues (assignee=agent_alice)
• Server: INSERT issue + EnqueueTaskForIssue
• Server: events.Bus.Publish
• Server → Human: WSS issue:created broadcast
• Server → Daemon: WSS daemon:task_available
• Daemon → Server: POST /api/daemon/runtimes/{id}/tasks/claim
• Server → Daemon: task + agent_skills + brief
• Daemon: prepare per-task env (write skills · clone repo · .gc_meta.json)
• Daemon → Server: POST /tasks/{id}/start
• Daemon → Agent CLI: exec() agent CLI subprocess
• Agent CLI: LLM loop · tool calls · edit files · run tests · progress comments
• Agent CLI → Daemon: progress JSON (CLI hook)
• Daemon → Server: POST /tasks/{id}/progress (comment)
• Server → Human: WSS comment:created
• Agent CLI → Git/GitHub: git push branch
• Git/GitHub → Agent CLI: remote ref created
• Agent CLI → Git/GitHub: gh pr create
• Git/GitHub → Agent CLI: PR url
• Git/GitHub → Server: GitHub App webhook pull_request.opened
• Server → Human: WSS issue:pr_linked
• Agent CLI → Daemon: exit 0 + result JSON + session_id + usage
• Daemon → Server: POST /tasks/{id}/complete
• Server → Human: WSS task:completed (UI updates)

Net effect: the human did one action (assign issue), the agent went off and shipped a PR, and the UI never blanked and never needed a manual refresh.
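The daemon's side of this journey reduces to four calls in a fixed order: claim, start, progress, complete. The sketch below replays that order against an in-memory stand-in. `FakeServer`, `run_one_task`, and `stub_agent` are hypothetical names, and the real endpoints are HTTP and WebSocket calls rather than method calls.

```python
class FakeServer:
    """In-memory stand-in for the server; `events` mimics user-hub broadcasts."""

    def __init__(self):
        self.next_id = 1
        self.queued = []
        self.events = []

    def create_issue(self, title, assignee):
        task = {"id": self.next_id, "title": title, "assignee": assignee}
        self.next_id += 1
        self.queued.append(task)
        self.events.append("issue:created")
        return task["id"]

    def claim(self):
        return self.queued.pop(0) if self.queued else None

    def start(self, task_id):
        pass  # the sequence diagram shows no broadcast for start

    def progress(self, task_id, comment):
        self.events.append("comment:created")

    def complete(self, task_id, result):
        self.events.append("task:completed")


def run_one_task(server, run_agent):
    """Daemon side: claim, start, stream progress, then complete."""
    task = server.claim()
    if task is None:
        return None  # nothing queued; a real daemon would wait for a wakeup
    server.start(task["id"])
    result = run_agent(task, lambda msg: server.progress(task["id"], msg))
    server.complete(task["id"], result)
    return result


def stub_agent(task, on_progress):
    """Placeholder for the agent CLI subprocess."""
    on_progress(f"working on {task['title']}")
    return {"result": "ok", "session_id": "s-1", "usage": {"tokens": 0}}
```

Note that the server only ever receives the final result, session id, and usage; the agent's LLM traffic stays inside `run_agent`, which matches the data-plane split described in section 01.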
04

Patterns worth copying wholesale

Multica made eight architectural bets that consistently reduced blast radius. Replicate them verbatim in ainb v2: the invariants are enforced by DB constraints and middleware, not by convention.

05

ainb today — what exists

ainb is a local-first Rust workspace with a TUI, plugin host, toolkit, and knowledge layer. It has no server, no web UI, and no persistent daemon — all state lives on the filesystem. The gap to Multica is real but largely additive.

Figure: ainb today, current architecture (May 2026). Filesystem-centric · local-first · zero server · TUI-only UI · plugin host · swarm orchestration.

USER SURFACE
• Ratatui TUI: burndown · agents · usage · width-aware
• ainb CLI: subcommands via clap
• Claude Code (host harness): skills · slash commands · MCP
• Subagent fleet: 100+ agents in toolkit/

RUST CORE (workspace)
• ainb-core: burndown · usage analyzer
• Plugin host v2: JSON-RPC subprocess · cap-gated
• Multi-provider abstraction: Claude · Codex · Copilot · Gemini
• Plugin manifest + SDK: TOML · capability tokens · CTS
• reflect-kb (Python + GraphRAG): QMD store · entity sidecars · vector + graph
• beads CLI: git-backed issue tracker
• swarm-lib.sh: topological dispatch
• claude-peers MCP: inter-instance messaging

EXECUTION (tmux + filesystem)
• tmux sessions: watchdog daemon · auto-recovery
• Agent worktrees (volatile): git worktree per task · short-lived
• Claude Code subprocesses, or Codex/Copilot/Gemini
• Skills materialized: toolkit/packages/skills

STATE (filesystem only, no server)
• Beads SQLite DB: per worktree, BEADS_DIR
• JSON files + session.yaml: ad-hoc state per script
• Git commits + branches: PR history is the audit log
• QMD knowledge base: reflect-kb global archive

WHAT AINB ALREADY HAS THAT MULTICA DOESN'T
• reflect-kb GraphRAG knowledge compounding · 170+ learnings indexed · cross-session retrieval (Multica has nothing)
• Plugin host v2 with capability-gated JSON-RPC + CTS (Multica has no plugin layer at all)
• Topological swarm dispatch with watchdog (Multica's Squad routing is simpler) · 9-tool portability via single skill format
06

Capability gap — ainb vs Multica

Coverage is the share of each Multica feature that ainb already provides today. Effort is the engineering cost to close the gap, sized S to XL. ~45% of Multica primitives map directly onto existing ainb components.

Feature | Coverage | Effort | ainb mapping / notes
Persistent daemon (heartbeat + poll + GC) | 25% | L | tmux watchdog exists; needs WS heartbeat + HTTP fallback + per-task GC loop
CLI auth (JWT + PAT) | 40% | M | no auth today; PAT pattern (ainb_ prefix) is ~200 LoC
AgentRuntime registry (per-daemon provider config) | 5% | L | multi-provider abstraction covers CLI selection; no runtime registration API
Issue tracker / task queue (Beads bridge) | 55% | M | beads SQLite is the leverage point; two-way sync adapter is the key P2 deliverable
Agent-as-assignee (polymorphic actor) | 10% | M | assignee_type/assignee_id schema pattern; ainb has agent profiles but no DB row
Agent profiles / templates | 20% | M | toolkit/agents/ has 37 agents; need DB rows + embedded JSON templates
Squads (agent groups) | 35% | M | swarm teams are the analogue; need squad DB entity + routing
Multi-workspace / multi-tenant | 10% | L | BEADS_DIR per-worktree is single-tenant; need workspace table + membership
Web dashboard (Kanban + transcript) | 0% | XL | zero web today; Next.js app is P3 (biggest build item by surface area)
Desktop app | 0% | XL | Tauri preferred over Electron (Rust-native); P8 optional
WebSocket realtime (dual hub) | 0% | L | tokio-tungstenite + axum-ws; split user hub / daemon hub from day 1
Skill server-side storage + dispatch | 15% | M | toolkit/packages/skills exists; need DB rows + materialization at claim time
Skills compounding (reflect-kb) | 30% | S | reflect-kb GraphRAG already works; wire learnings into skill templates
Autopilots / cron triggers | 0% | M | pg_cron + cron expression on issue row; P7
Webhook triggers (GitHub App) | 0% | M | GitHub App + webhook handler; P7
@mention routing | 15% | M | claude-peers MCP is point-to-point; need @mention parser + polymorphic routing
Task lifecycle FSM (all invariants) | 25% | M | P1 core deliverable; partial unique index + sweepers + idempotent finalize
Usage analytics (token + cost) | 70% | S | ainb usage analyzer already tracks per-session cost; expose via API
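Two rows of the table, agent-as-assignee and @mention routing, share one idea: a comment mention resolves to an (assignee_type, assignee_id) pair regardless of whether the target is a person or an agent. A minimal sketch of that idea follows. The `Actor` type, the `DIRECTORY` contents, the "member" type value, and the handle syntax are assumptions made for illustration; a real build would resolve handles from the database.

```python
import re
from typing import NamedTuple


class Actor(NamedTuple):
    assignee_type: str  # "member" or "agent" (type names are assumptions)
    assignee_id: str


# Hypothetical directory; a real build would look these up in the database.
DIRECTORY = {
    "alice": Actor("agent", "agent_alice"),
    "stevie": Actor("member", "user_stevie"),
}

# An @handle that is not preceded by a word character, so that
# email-like text such as "a@alice.dev" is not treated as a mention.
MENTION = re.compile(r"(?<![\w@])@([A-Za-z0-9_-]+)")


def route_mentions(comment: str) -> list[Actor]:
    """Resolve each known @handle to an actor, in order, without duplicates."""
    seen, actors = set(), []
    for handle in MENTION.findall(comment):
        actor = DIRECTORY.get(handle.lower())
        if actor and actor not in seen:
            seen.add(actor)
            actors.append(actor)
    return actors
```

Because the router returns typed actors rather than bare names, the same code path can enqueue a task for an agent or send a notification to a human.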
07

ainb v2 target architecture

The target collapses CLI + daemon + server into one Rust binary (same trick Multica uses). New modules are marked ★. Existing ainb primitives are reused as-is. The Beads sync adapter is the leverage move — Stevie keeps using Beads UX while the server backfills the task table.

Figure: ainb v2, proposed target architecture (multica-shaped, ainb-flavored). ★ = new build · plain = reuse existing ainb primitive · ringed clay = adapted/extended.

USER SURFACE
• ainb TUI (Ratatui): extended w/ task board
• ainb Web (Next.js): Kanban · transcript · skills
• ainb Desktop (Tauri): Rust-native vs Electron
• ainb CLI: extended w/ task verbs
• Subagents · Claude Code · slash /skills: harness-side workflows

TRANSPORT (★ all new)
★ HTTPS REST · ★ WSS /ws (user hub) · ★ WSS /api/daemon/ws (daemon hub) · cookie JWT · PAT ainb_ · daemon adt_ tokens

★ AINB SERVER (Rust · Axum · single binary, same as CLI/daemon)
• Axum router + tower middleware: auth · rate-limit · RBAC
• service layer (Rust): TaskSvc · BeadsSync · SkillSvc
• realtime hub (WS): user fan-out
• daemon hub (WS): wakeup
• events bus (tokio mpsc): in-proc pub/sub
• Background sweepers: queued/dispatched/running TTLs
• Observability: tracing + OTEL from day one
• sqlx queries (split by aggregate): issue · task · skill · agent
• Beads sync adapter: two-way, beads ⇄ task table

STORAGE
• PostgreSQL (or SQLite for solo): split schemas issue/task/skill/agent
• Redis (optional): PAT cache · WS relay
• Beads SQLite (existing): synced two-way via adapter
• reflect-kb (existing): GraphRAG learnings store

DAEMON (★ new, but wraps existing tmux/swarm/worktree primitives)
• Heartbeat (WS+HTTP): bimodal
• Poll + WS wakeup: claim queued
• Per-task env: worktree + skills
• tmux session adapter: reuse swarm-lib.sh
• Plugin host v2 calls: reuse existing
• Per-task isolated env (existing worktree pattern): git worktree add · ainb skills written under provider paths
• Agent CLI subprocess (existing multi-provider): claude · codex · copilot · gemini · cursor · openclaw, via plugin host

Daemon ↔ server: REST claim/start/complete · WSS task_available wakeup

LEGEND: existing ainb (reuse as-is) · ★ new build
Total ★ count: ~24 new modules / ~12 reused; see roadmap diagram for sequencing.
Daemon, CLI, and server collapse into ONE Rust binary with subcommands. It is the same trick Multica uses and hugely simplifies release.
08

Execution roadmap — 8 phases, ~16 weeks

Sequenced so a usable artifact ships at the end of every phase. No big-bang. Beads sync adapter (P2) is the leverage move — Stevie keeps the existing Beads UX while the server backfills the task table underneath.

Figure: ainb v2 execution roadmap, 8 phases over ~16 weeks (timeline W1 to W16, one bar per phase P0 to P8; the phases are detailed in the list that follows).
TODAY marker: 2026-05-22
Milestones: M1: daemon claims a task · M2: web Kanban live · M3: multi-tenant + skills · M4: v1.0 ainb-as-control-plane
P0 · W1-2 · Schema + server skeleton
Axum binary compiles, health endpoint returns 200, Postgres migrations run, sqlx types generated
Tags: axum · sqlx · postgres

P1 · W2-5 · Task lifecycle + daemon v1
queued→dispatched→running→terminal FSM; per-task env dir; single-provider claude subprocess; sweepers running
Tags: daemon · tokio · fsm

P2 · W4-5 · Beads sync adapter
Two-way bridge: beads issue ↔ task row; ID mapping table; Stevie keeps beads UX
Tags: beads · bridge

P3 · W4-7 · Web UI minimal
Next.js: issue list, transcript view, agent picker, auth screen, basic Kanban
Tags: nextjs · kanban

P4 · W7-8 · WebSocket realtime
Dual-hub split (user fan-out + daemon wakeup), scope subscribe, idempotent finalize
Tags: websocket · tokio-tungstenite

P5 · W8-10 · Multi-tenant + auth
Workspace table, membership, PAT (ainb_), daemon token (adt_), polymorphic actor assignee
Tags: auth · rbac · multi-tenant

P6 · W10-12 · Skills server-side
Skill aggregate in DB, dispatch-time materialization, embedded templates in binary
Tags: skills · embed

P7 · W12-14 · Polish: PR mirror, autopilots, observability
GitHub App + webhook, cron + trigger rules, OTEL traces, Prometheus metrics
Tags: github · cron · otel

P8 · W14-16 · Cloud runtime + billing (optional)
Proxy stub mirroring Multica's open-core boundary; monetize fleet later
Tags: optional · cloud
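The P5 token work can start from a very small core: mint a prefixed token, show it once, and store only a digest. The sketch below is one possible shape, not a description of how Multica stores its tokens. The `issue_pat` and `verify_pat` names and the choice of SHA-256 are assumptions; only the `ainb_` prefix comes from the plan.

```python
import hashlib
import hmac
import secrets

PAT_PREFIX = "ainb_"


def issue_pat() -> tuple[str, str]:
    """Return (token shown once to the user, SHA-256 digest to store)."""
    token = PAT_PREFIX + secrets.token_urlsafe(32)
    return token, hashlib.sha256(token.encode()).hexdigest()


def verify_pat(presented: str, stored_digest: str) -> bool:
    """Check the prefix first, then compare digests in constant time."""
    if not presented.startswith(PAT_PREFIX):
        return False
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(digest, stored_digest)
```

The prefix makes leaked tokens easy to spot in logs and secret scanners, and lets the server reject a daemon token (`adt_`) presented where a PAT is expected before touching the database.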
09

Security posture + distinguished engineer warnings

Multica has three security issues that must not be replicated. The DE audit also flagged five further risks, ranging from scaling cliffs to licensing. All eight should be addressed explicitly in the ainb v2 design, not deferred.

Issue | Severity | ainb v2 mitigation
isBlockedEnvKey blocklist misses LD_PRELOAD, DYLD_INSERT_LIBRARIES, NODE_OPTIONS | high | Use an allowlist (permitted keys only), not a blocklist. Deny unknown env keys by default.
macOS Codex danger-full-access sandbox mode: agent can exfiltrate anything | high | Explicit runtime warning + consent gate at daemon startup. Document in README.
LLM API keys stored as plaintext JSONB in custom_env column | high | Column-level encryption (pgcrypto / KMS) from day 1 in schema. No plaintext secrets in DB.
agent_task_queue overloaded with 5 different task kinds | med | Split per task kind from the start: task, autopilot_run, webhook_event, comment_job, as separate tables or a discriminated union.
Postgres-as-queue scaling cliff at ~1k concurrent tasks | med | Partial unique index + sweepers from P0. Add SKIP LOCKED on claim query. Redis relay if needed.
Cloud-runtime open-core boundary unclear: when to monetize? | med | Ship proxy stub in P8 but keep it optional. Monetize fleet compute, not core features.
License SaaS restriction (Multica's Modified Apache 2.0) | med | ainb is MIT today. If cloud offering planned, decide license before any public launch. Apache 2.0 + Commons Clause is the Multica pattern.
No OTEL tracing from day one, so task-stuck issues are impossible to diagnose | low | Add tracing crate + otlp exporter at P0 skeleton. Zero cost to add early; expensive to retrofit.
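The first mitigation, allowlist instead of blocklist, is small enough to sketch in full. The key set and the `build_agent_env` helper below are hypothetical; the point is only that an unknown key such as LD_PRELOAD is dropped by default rather than needing to be anticipated.

```python
# Hypothetical allowlist; a real deployment would define its own key set.
ALLOWED_ENV_KEYS = frozenset({"PATH", "HOME", "LANG", "TERM", "ANTHROPIC_API_KEY"})


def build_agent_env(parent_env, custom_env):
    """Merge parent and per-agent env, keeping only allowlisted keys.

    Returns (env, rejected) so the daemon can log what it refused to pass on.
    """
    env, rejected = {}, []
    for source in (parent_env, custom_env):  # custom_env overrides parent_env
        for key, value in source.items():
            if key in ALLOWED_ENV_KEYS:
                env[key] = value
            else:
                rejected.append(key)
    return env, sorted(set(rejected))
```

The resulting dictionary can be passed as the `env` argument when spawning the agent subprocess, so the child never inherits the daemon's full environment.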
10

Open decisions before build starts

Three decisions have non-trivial schema cost if changed mid-build. Lock them before P0 migrations are written.

Single-tenant vs multi-tenant from day 1? Recommendation: multi-tenant. The schema cost is one-time (workspace_id FK on every table). Retrofitting later requires a migration across all existing rows and every query. ainb already has the BEADS_DIR per-worktree pattern — mapping that to workspace_id is straightforward. Single-tenant saves ~2 days in P0/P5 but costs 2 weeks if you ever want to share a server instance.
BYO-LLM-keys vs platform-managed keys? Recommendation: BYO for the OSS build. Store per-workspace, column-encrypted. Platform-managed keys require a secrets manager (Vault, AWS SM) and billing integration — that is P8+ territory. Multica ships BYO-keys for self-hosted; platform keys are cloud-only.
Trust-the-user vs container-sandboxed execution? Recommendation: trust-the-user with documented warnings, same as Multica self-hosted. Container sandboxing (gVisor, Firecracker) is a correctness win but adds ~3 weeks and breaks the git worktree isolation pattern that already works. Ship the warning gate (see §09 Codex danger-full-access item), document what the agent can access, and treat sandboxing as a future P9 hardening option. Do not block v1.0 on it.
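To make the first decision concrete, the "workspace_id FK on every table" cost looks roughly like this. The sketch uses in-memory SQLite (the plan's solo option) with hypothetical `workspace` and `issue` tables and helper names; the real schema would carry the same column on task, skill, and agent as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE workspace (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE issue (
    id           INTEGER PRIMARY KEY,
    workspace_id INTEGER NOT NULL REFERENCES workspace(id),
    title        TEXT NOT NULL
);
""")


def create_workspace(name):
    return conn.execute("INSERT INTO workspace (name) VALUES (?)", (name,)).lastrowid


def create_issue(workspace_id, title):
    cur = conn.execute(
        "INSERT INTO issue (workspace_id, title) VALUES (?, ?)",
        (workspace_id, title),
    )
    return cur.lastrowid


def list_issues(workspace_id):
    """Every read takes the workspace as a parameter; there is no unscoped query."""
    rows = conn.execute(
        "SELECT title FROM issue WHERE workspace_id = ? ORDER BY id",
        (workspace_id,),
    )
    return [row[0] for row in rows]
```

A solo install simply creates one workspace row, so the multi-tenant schema costs one extra column and one extra query parameter rather than a second code path.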
ainb × Multica · implementation plan · May 2026
Research: 6 reports · 3,220 lines · code-archaeologist + community + DE critique + UX analysis