ainb × Multica

From local CLI to managed-agents control plane

Implementation plan · May 2026 · 6 research reports · 3,220 lines of findings

31.2k Multica GitHub stars · ~45% ainb primitives reusable · 8 phases, sequenced delivery · ~16 wks estimated build

What is Multica

Multica is an open-source managed-agents platform. You assign GitHub issues to AI agents through a web Kanban; a daemon on your machine claims each task and spawns an agent CLI subprocess; the PR ships while you watch live transcript updates.

Repo topology: Go monolith (internal/) + Next.js (apps/web) + Electron (apps/desktop) + Expo mobile + single CLI/daemon binary
Scale: 31.2k stars, 107 DB migrations, 11 agent CLIs supported (claude, codex, openclaw, hermes, kimi, cursor, copilot, gemini, pi, opencode, kiro)
License: Modified Apache 2.0 with SaaS hosting restriction — open-core boundary around the cloud-runtime fleet
Self-host: docker-compose up — Postgres + Redis + single Go binary; daemon runs on the user's laptop
Key insight: the server never sees LLM calls — it sees result, session_id, work_dir, usage. Daemon owns the data plane.

Task lifecycle state machine

Every agent assignment becomes a task row. State transitions are driven by the daemon (claim/start/complete/fail) and by server-side TTL sweepers. Retries create new rows with parent_task_id — old row stays for audit.

Unique constraint: idx_one_pending_task_per_issue WHERE status IN ('queued','dispatched') — migration 022 — prevents double-dispatch
Idempotent finalize: if UPDATE matches 0 rows but row is already terminal, return success (task.go:1010) — defeats WS-vs-HTTP race
Retry: only runtime_offline and runtime_recovery failure reasons are retryable; new row created, attempt counter incremented — migration 055
TTLs: queued → 2h, dispatched → 5min, running → 2.5h — all enforced by background sweeper goroutines
Reclaim: stale dispatched tasks (daemon went offline) re-queued via ReclaimStaleDispatchedTask after 90s
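
The lifecycle rules above can be sketched as a small in-memory model. This is a hedged illustration, not Multica's code: the terminal state names (Completed, Failed), the Task struct, and the rule that only a running task can be finalized are assumptions; the TTL values and the two retryable reasons come from the list above.

```rust
use std::time::Duration;

/// Lifecycle states. The terminal variant names are assumptions; the plan
/// only names the transitions (claim/start/complete/fail).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Queued,
    Dispatched,
    Running,
    Completed,
    Failed,
}

impl Status {
    pub fn is_terminal(self) -> bool {
        matches!(self, Status::Completed | Status::Failed)
    }

    /// TTL a sweeper would enforce for this state (2h, 5min, 2.5h).
    pub fn ttl(self) -> Option<Duration> {
        match self {
            Status::Queued => Some(Duration::from_secs(2 * 60 * 60)),
            Status::Dispatched => Some(Duration::from_secs(5 * 60)),
            Status::Running => Some(Duration::from_secs(150 * 60)),
            Status::Completed | Status::Failed => None,
        }
    }
}

/// Hypothetical in-memory stand-in for a task row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Task {
    pub id: u64,
    pub parent_task_id: Option<u64>,
    pub attempt: u32,
    pub status: Status,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Finalize {
    Applied,
    AlreadyTerminal, // reported to the caller as success
    Rejected,
}

/// Idempotent finalize: a second completion report (for example the HTTP
/// call losing the race to the WS message) succeeds instead of erroring.
pub fn finalize(task: &mut Task, outcome: Status) -> Finalize {
    if !outcome.is_terminal() {
        return Finalize::Rejected;
    }
    if task.status.is_terminal() {
        return Finalize::AlreadyTerminal;
    }
    if task.status == Status::Running {
        task.status = outcome;
        return Finalize::Applied;
    }
    Finalize::Rejected
}

/// Only these two failure reasons are retryable.
pub fn is_retryable(reason: &str) -> bool {
    matches!(reason, "runtime_offline" | "runtime_recovery")
}

/// A retry is a new row pointing at the old one; the old row is untouched.
pub fn retry(failed: &Task, reason: &str, new_id: u64) -> Option<Task> {
    if failed.status != Status::Failed || !is_retryable(reason) {
        return None;
    }
    Some(Task {
        id: new_id,
        parent_task_id: Some(failed.id),
        attempt: failed.attempt + 1,
        status: Status::Queued,
    })
}
```

In a real server the same checks would live in the UPDATE statement and the handler around it; the model only makes the invariants easy to unit-test before the schema exists.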

User journey — assign issue, agent ships PR

The complete path from human action to merged PR. The WS-driven UI watching task lifecycle and comments is what makes the experience feel like a teammate rather than a CLI tool.

Patterns worth copying wholesale

Multica made eight architectural bets that consistently reduced blast radius. These should be replicated verbatim in ainb v2 — the invariants are locked in DB constraints and middleware, not conventions.

Dual-WS hub split: realtime.Hub (user fan-out, survives daemon reconnect) + daemonws.Hub (daemon wakeup, per-runtime). Never merge these — different retry semantics.
Polymorphic actor: assignee_type + assignee_id columns on issue row — member, agent, or squad without schema change. Picker is a single component that renders all three.
Per-task env isolation: ~/ainb_workspaces/{ws}/{task}/{workdir,output,logs} — each task gets fresh git worktree + materialized skills; no shared state between tasks.
Skills materialized at dispatch time: DB stores skill template rows; daemon writes provider-native paths into the per-task env at claim time. Server never touches the filesystem.
Curated agent templates embedded in binary: //go:embed templates/*.json for agent profiles; these are the main onboarding surface — easy to extend, hard to corrupt.
Single binary for CLI + daemon + server: subcommand routing means one release artifact, trivial daemonize (ainb server start), zero coordination between binaries on auto-update.
Idempotent finalize pattern: any task completion/failure endpoint checks existing terminal state before returning error — defeats the WS-vs-HTTP race without distributed locks.
Partial unique index for pending tasks: one DB constraint eliminates an entire class of double-dispatch bugs without application-level locking.
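
As one illustration of these patterns, the polymorphic actor maps naturally onto a Rust enum on the ainb side. A minimal sketch, assuming the assignee_type column holds the strings "member", "agent", and "squad" and that IDs are opaque strings; the Assignee type and both method names are hypothetical.

```rust
/// Hypothetical Rust view of the (assignee_type, assignee_id) column pair.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Assignee {
    Member(String),
    Agent(String),
    Squad(String),
}

impl Assignee {
    /// Decode the two columns; unknown types are rejected rather than guessed.
    pub fn from_columns(assignee_type: &str, assignee_id: &str) -> Result<Self, String> {
        if assignee_id.is_empty() {
            return Err("assignee_id must not be empty".to_string());
        }
        match assignee_type {
            "member" => Ok(Assignee::Member(assignee_id.to_string())),
            "agent" => Ok(Assignee::Agent(assignee_id.to_string())),
            "squad" => Ok(Assignee::Squad(assignee_id.to_string())),
            other => Err(format!("unknown assignee_type: {other}")),
        }
    }

    /// Encode back into the column pair for an INSERT or UPDATE.
    pub fn to_columns(&self) -> (&'static str, &str) {
        match self {
            Assignee::Member(id) => ("member", id.as_str()),
            Assignee::Agent(id) => ("agent", id.as_str()),
            Assignee::Squad(id) => ("squad", id.as_str()),
        }
    }
}
```

Adding a fourth actor kind then means one new variant and one new string, with the compiler flagging every match that needs updating; the table itself does not change.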

ainb today — what exists

ainb is a local-first Rust workspace with a TUI, plugin host, toolkit, and knowledge layer. It has no server, no web UI, and no persistent daemon — all state lives on the filesystem. The gap to Multica is real but largely additive.

Capability gap — ainb vs Multica

Coverage is how much of each Multica feature ainb already provides. Effort is the engineering cost to close the gap. ~45% of Multica primitives map directly onto existing ainb components.

| Feature | Coverage | Effort | ainb mapping / notes |
|---|---|---|---|
| Persistent daemon (heartbeat + poll + GC) | 25% | | tmux watchdog exists; needs WS heartbeat + HTTP fallback + per-task GC loop |
| CLI auth (JWT + PAT) | 40% | | no auth today; PAT pattern (ainb_ prefix) is ~200 LoC |
| AgentRuntime registry (per-daemon provider config) | | | multi-provider abstraction covers CLI selection; no runtime registration API |
| Issue tracker / task queue (Beads bridge) | 55% | | beads SQLite is the leverage point; two-way sync adapter is the key P2 deliverable |
| Agent-as-assignee (polymorphic actor) | 10% | | assignee_type/assignee_id schema pattern; ainb has agent profiles but no DB row |
| Agent profiles / templates | 20% | | toolkit/agents/ has 37 agents; need DB rows + embedded JSON templates |
| Squads (agent groups) | 35% | | swarm teams are the analogue; need squad DB entity + routing |
| Multi-workspace / multi-tenant | 10% | | BEADS_DIR per-worktree is single-tenant; need workspace table + membership |
| Web dashboard (Kanban + transcript) | | | zero web today; Next.js app is P3, the biggest build item by surface area |
| Desktop app | | | Tauri preferred over Electron (Rust-native); P8 optional |
| WebSocket realtime (dual hub) | | | tokio-tungstenite + axum-ws; split user hub / daemon hub from day 1 |
| Skill server-side storage + dispatch | 15% | | toolkit/packages/skills exists; need DB rows + materialization at claim time |
| Skills compounding (reflect-kb) | 30% | | reflect-kb GraphRAG already works; wire learnings into skill templates |
| Autopilots / cron triggers | | | pg_cron + cron expression on issue row; P7 |
| Webhook triggers (GitHub App) | | | GitHub App + webhook handler; P7 |
| @mention routing | 15% | | claude-peers MCP is point-to-point; need @mention parser + polymorphic routing |
| Task lifecycle FSM (all invariants) | 25% | | P1 core deliverable; partial unique index + sweepers + idempotent finalize |
| Usage analytics (token + cost) | 70% | | ainb usage analyzer already tracks per-session cost; expose via API |

ainb v2 target architecture

The target collapses CLI + daemon + server into one Rust binary (same trick Multica uses). New modules are marked ★. Existing ainb primitives are reused as-is. The Beads sync adapter is the leverage move — Stevie keeps using Beads UX while the server backfills the task table.
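
The single-binary trick comes down to routing on the first argument. A minimal std-only sketch, assuming the subcommand words "server" and "daemon" (only "ainb server start" appears in this plan; "daemon" and the Mode type are hypothetical, and a real build would more likely use an argument-parsing crate):

```rust
/// Which role this process should take; each variant carries the
/// remaining arguments for that role's own parser.
#[derive(Debug, PartialEq, Eq)]
pub enum Mode {
    Server(Vec<String>),
    Daemon(Vec<String>),
    Cli(Vec<String>),
}

/// Route on the first argument after the program name. Anything that is
/// not a known long-running role falls through to the existing CLI/TUI.
pub fn route(args: &[String]) -> Mode {
    match args.split_first() {
        Some((first, rest)) if first.as_str() == "server" => Mode::Server(rest.to_vec()),
        Some((first, rest)) if first.as_str() == "daemon" => Mode::Daemon(rest.to_vec()),
        _ => Mode::Cli(args.to_vec()),
    }
}
```

A main function would call it as route(&std::env::args().skip(1).collect::<Vec<_>>()) and dispatch on the result, so existing ainb commands keep working unchanged while the new roles are added.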

Execution roadmap — 8 phases, ~16 weeks

Sequenced so a usable artifact ships at the end of every phase, with no big-bang cutover.

P0 · W1-2 · Schema + server skeleton
Axum binary compiles, health endpoint returns 200, Postgres migrations run, sqlx types generated
Tags: axum, sqlx, postgres

P1 · W2-5 · Task lifecycle + daemon v1
queued→dispatched→running→terminal FSM; per-task env dir; single-provider claude subprocess; sweepers running
Tags: daemon, tokio, fsm

P2 · W4-5 · Beads sync adapter
Two-way bridge: beads issue ↔ task row; ID mapping table; Stevie keeps beads UX
Tags: beads, bridge

P3 · W4-7 · Web UI minimal
Next.js: issue list, transcript view, agent picker, auth screen, basic Kanban
Tags: nextjs, kanban

P4 · W7-8 · WebSocket realtime
Dual-hub split (user fan-out + daemon wakeup), scope subscribe, idempotent finalize
Tags: websocket, tokio-tungstenite

P5 · W8-10 · Multi-tenant + auth
Workspace table, membership, PAT (ainb_), daemon token (adt_), polymorphic actor assignee
Tags: auth, rbac, multi-tenant

P6 · W10-12 · Skills server-side
Skill aggregate in DB, dispatch-time materialization, embedded templates in binary
Tags: skills, embed

P7 · W12-14 · Polish: PR mirror, autopilots, observability
GitHub App + webhook, cron + trigger rules, OTEL traces, Prometheus metrics
Tags: github, cron, otel

P8 · W14-16 · Cloud runtime + billing (optional)
Proxy stub mirroring Multica's open-core boundary; monetize fleet later
Tags: optional, cloud
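
The two token prefixes planned for P5 let auth middleware pick a lookup path before touching the database. A minimal sketch, assuming only the prefixes named in this plan (ainb_ for PATs, adt_ for daemon tokens); TokenKind and classify_token are hypothetical names, and hashing or verifying the secret is out of scope here.

```rust
/// Which credential table a bearer token should be checked against.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    PersonalAccess, // "ainb_" prefix
    Daemon,         // "adt_" prefix
}

/// Split a raw bearer token into its kind and secret part.
/// Returns None for unknown prefixes or an empty secret.
pub fn classify_token(raw: &str) -> Option<(TokenKind, &str)> {
    let token = raw.trim();
    let (kind, secret) = if let Some(rest) = token.strip_prefix("ainb_") {
        (TokenKind::PersonalAccess, rest)
    } else if let Some(rest) = token.strip_prefix("adt_") {
        (TokenKind::Daemon, rest)
    } else {
        return None;
    };
    if secret.is_empty() {
        None
    } else {
        Some((kind, secret))
    }
}
```

Rejecting unknown prefixes up front also keeps malformed or foreign tokens out of the credential queries entirely.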

Security posture + distinguished engineer warnings

Multica has three security issues that must not be replicated. The DE audit also flagged five scaling cliffs. All eight should be addressed explicitly in the ainb v2 design — not deferred.

| Issue | Severity | ainb v2 mitigation |
|---|---|---|
| isBlockedEnvKey blocklist misses LD_PRELOAD, DYLD_INSERT_LIBRARIES, NODE_OPTIONS | high | Use an allowlist (permitted keys only), not a blocklist. Deny unknown env keys by default. |
| macOS Codex danger-full-access sandbox mode: agent can exfiltrate anything | high | Explicit runtime warning + consent gate at daemon startup. Document in README. |
| LLM API keys stored as plaintext JSONB in custom_env column | high | Column-level encryption (pgcrypto / KMS) from day 1 in schema. No plaintext secrets in DB. |
| agent_task_queue overloaded with 5 different task kinds | med | Split per task kind from the start: task, autopilot_run, webhook_event, comment_job, as separate tables or a discriminated union. |
| Postgres-as-queue scaling cliff at ~1k concurrent tasks | med | Partial unique index + sweepers from P0. Add SKIP LOCKED on claim query. Redis relay if needed. |
| Cloud-runtime open-core boundary unclear: when to monetize? | med | Ship proxy stub in P8 but keep it optional. Monetize fleet compute, not core features. |
| License SaaS restriction (Multica's Modified Apache 2.0) | med | ainb is MIT today. If cloud offering planned, decide license before any public launch. Apache 2.0 + Commons Clause is the Multica pattern. |
| No OTEL tracing from day one: impossible to diagnose task stuck issues | low | Add tracing crate + otlp exporter at P0 skeleton. Zero cost to add early; expensive to retrofit. |
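
The allowlist mitigation for the first row is small enough to sketch. This is a hedged example: the keys in ALLOWED_ENV_KEYS are placeholders, not a vetted list, and filter_custom_env is a hypothetical name. The point is the default-deny shape, where LD_PRELOAD, DYLD_INSERT_LIBRARIES, and NODE_OPTIONS are rejected without ever being named.

```rust
use std::collections::BTreeMap;

/// Placeholder allowlist; the real set of permitted keys is a design decision.
const ALLOWED_ENV_KEYS: &[&str] = &["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "HTTPS_PROXY"];

/// Split user-supplied env into (accepted, rejected keys).
/// Anything not explicitly allowed is denied by default.
pub fn filter_custom_env(
    requested: &BTreeMap<String, String>,
) -> (BTreeMap<String, String>, Vec<String>) {
    let mut accepted = BTreeMap::new();
    let mut rejected = Vec::new();
    for (key, value) in requested {
        if ALLOWED_ENV_KEYS.contains(&key.as_str()) {
            accepted.insert(key.clone(), value.clone());
        } else {
            rejected.push(key.clone());
        }
    }
    (accepted, rejected)
}
```

Returning the rejected keys lets the daemon log or surface them to the user instead of dropping them silently, which makes a too-strict allowlist easy to diagnose.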

Open decisions before build starts

Three decisions have non-trivial schema cost if changed mid-build. Lock them before P0 migrations are written.

Single-tenant vs multi-tenant from day 1? Recommendation: multi-tenant. The schema cost is one-time (workspace_id FK on every table). Retrofitting later requires a migration across all existing rows and every query. ainb already has the BEADS_DIR per-worktree pattern — mapping that to workspace_id is straightforward. Single-tenant saves ~2 days in P0/P5 but costs 2 weeks if you ever want to share a server instance.

BYO-LLM-keys vs platform-managed keys? Recommendation: BYO for the OSS build. Store per-workspace, column-encrypted. Platform-managed keys require a secrets manager (Vault, AWS SM) and billing integration — that is P8+ territory. Multica ships BYO-keys for self-hosted; platform keys are cloud-only.

Trust-the-user vs container-sandboxed execution? Recommendation: trust-the-user with documented warnings, same as Multica self-hosted. Container sandboxing (gVisor, Firecracker) is a correctness win but adds ~3 weeks and breaks the git worktree isolation pattern that already works. Ship the warning gate (see the Codex danger-full-access item under Security posture), document what the agent can access, and treat sandboxing as a future P9 hardening option. Do not block v1.0 on it.

ainb × Multica — implementation plan · May 2026 Research: 6 reports · 3,220 lines · code-archaeologist + community + DE critique + UX analysis