agents-in-a-box · feat/multica

Hangar

A TUI-first managed-agents control plane — a feature replica of Multica built natively inside ainb. No web UI: the terminal is the control plane. Issues, autopilots, skills, a kanban board, and a managed-agent fleet, driven by a loosely-coupled daemon + plugin over a unix-socket JSON-RPC contract.

4 Rust crates + 1 plugin | 9 TUI screens | 10 CLI noun-groups | 17 JSON-RPC methods | 10 migrations | 35 features · 22 e2e-tripwired | Phases P0–P9

01 What Hangar is

Hangar gives ainb a managed-agent control plane: a place to file work as issues, assign them to agents (Claude / Codex / Gemini), watch tasks march through a lifecycle on a kanban board, schedule recurring work with autopilots, curate reusable skills & agent templates, and observe the whole fleet's health — all from the terminal.

It is deliberately loosely coupled. A standalone ainb-hangar-daemon owns the data plane (SQLite, the task FSM, the cron scheduler, the agent runner). The TUI is a plugin (hangar-tui) that the host ainb binary loads and that talks to the daemon over a unix-socket JSON-RPC contract. The plugin holds zero domain logic — it subscribes, pulls snapshots, renders, and forwards key intents. This means the control plane runs (autopilots fire, tasks dispatch) whether or not a TUI is attached.

Issues → Tasks → Agents

File an issue, assign an agent, and the daemon dispatches a task that runs the provider in an isolated git worktree, streaming its transcript back.

Autopilots

Cron-scheduled autopilots fire tasks on a schedule, with a concurrency guard that skips a tick if the prior run is still in flight.

Skills & Templates

Import skills from the toolkit, bundle them into 10 curated agent templates, and materialise them into each task's provider-native skill directory at dispatch.

Observability

Structured JSONL tracing (+ optional OTLP), instrumented service spans, a live daemon-health sparkline, and a logs tail on both CLI and TUI.

02 System architecture

Five Rust components in three planes: the host + plugin (presentation), the daemon (control), and the store + SQLite (data). Two cross-cutting crates — ainb-hangar-core (IO-free domain types) and ainb-hangar-proto (wire types) — are shared by everything.

Hangar layered system architecture
Fig 1 — Layered architecture: host ainb + plugin-runtime v2 → hangar-tui plugin → unix-socket JSON-RPC → ainb-hangar-daemon → ainb-hangar-store (sqlx) → SQLite.

The components

ainb (host)

The ratatui TUI binary (folder crates/ainb-core, package ainb). Embeds plugin-runtime v2, which discovers, spawns, and supervises plugin subprocesses and brokers their host-capability calls. Reaches Hangar from the home screen via g.

hangar-tui (plugin)

Package ainb-plugin-hangar. A native subprocess speaking JSON-RPC 2.0 over stdio (Content-Length framing). Renders 9 screens, dials the daemon socket, subscribes to a workspace, pulls snapshots, folds events. No DB, no domain logic.

ainb-hangar-daemon

Standalone binary. Hosts the unix-socket JSON-RPC server, the task claim loop, the autopilot scheduler, the provider runner, the beads sync, and the observability subscriber. The control plane proper.

ainb-hangar-store

sqlx repositories + the task-FSM services (claim / start / complete / fail / cancel / retry / sweep). Owns the schema (migrations 0001–0010) over a single SQLite file at ~/.ainb/hangar/hangar.db.

ainb-hangar-core

IO-free domain layer: typed ids, the HangarClock/IdGen traits, the task-status FSM table, the cron parser, env-allowlist policy, skill + autopilot service traits, PR-URL parser, token mint/verify, TaskResult.

ainb-hangar-proto

The JSON-RPC wire types + method-name constants shared by daemon and plugin. Plus the plugin SDK (ainb-plugin-protocol / ainb-plugin-sdk-rust) that defines the host-call contract and the stdio server loop.

The transport

The plugin dials ~/.ainb/hangar/hangar.sock via the host unix_socket_dial capability and speaks the same JSON-RPC framing the host uses over stdio. The daemon resolves a workspace identifier (slug or id) to the real row before scoping any query — the guard that closed the cross-tenant IDOR. Plugin subprocesses are spawned with kill_on_drop(true) plus an OS leak-guard (PR_SET_PDEATHSIG on Linux, setpgid + kill(-pgid) on macOS).
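The Content-Length framing mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual codec: the function names `write_frame` and `read_frame` are hypothetical, and the sketch assumes the common header form `Content-Length: N`, a blank line, then exactly N body bytes.

```rust
use std::io::{self, BufRead, Read, Write};

/// Write one frame: a `Content-Length` header, a blank line, then the body.
/// (Hypothetical helper; the real codec lives in the plugin SDK.)
pub fn write_frame<W: Write>(out: &mut W, body: &[u8]) -> io::Result<()> {
    write!(out, "Content-Length: {}\r\n\r\n", body.len())?;
    out.write_all(body)?;
    out.flush()
}

/// Read one frame. Returns `Ok(None)` on a clean EOF before any header byte.
pub fn read_frame<R: BufRead>(input: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut content_length: Option<usize> = None;
    let mut saw_header_bytes = false;
    loop {
        let mut line = String::new();
        let n = input.read_line(&mut line)?;
        if n == 0 {
            return if saw_header_bytes {
                Err(io::Error::new(io::ErrorKind::UnexpectedEof, "eof inside headers"))
            } else {
                Ok(None)
            };
        }
        saw_header_bytes = true;
        let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
        if line.is_empty() {
            break; // a blank line ends the header block
        }
        if let Some((name, value)) = line.split_once(':') {
            if name.trim().eq_ignore_ascii_case("content-length") {
                let len = value
                    .trim()
                    .parse::<usize>()
                    .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
                content_length = Some(len);
            }
        }
    }
    let len = content_length
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "missing Content-Length"))?;
    // Read exactly `len` body bytes; the body is opaque JSON at this layer.
    let mut body = vec![0u8; len];
    input.read_exact(&mut body)?;
    Ok(Some(body))
}
```

Because the framing layer treats the body as opaque bytes, the same pair of functions would serve both the stdio link to the host and the unix-socket link to the daemon; only the reader and writer types differ.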

03 Dependency graph

Internal crate dependencies

                  ainb-hangar-core
        (foundation: IO-free; no internal deps)
           ▲             ▲              ▲
           │             │              │
   ainb-hangar-store     │      ainb-hangar-proto
     (sqlx + FSM)        │        (wire types)
           ▲             │              ▲
           └─────────────┴──────────────┘
                         │
                ainb-hangar-daemon
           (deps: core + store + proto)
                         │ unix-socket JSON-RPC
                         ▼
                 hangar-tui plugin
            (deps: proto + plugin-sdk)  ◀── loaded by ── ainb (host + plugin-runtime v2)

core is the root and depends on nothing internal (so it stays IO-free + trivially testable). store and proto both build on core; the daemon ties all three together. The plugin depends only on proto + the SDK — never on the daemon or store crates — so it cannot smuggle in domain logic.

Key external dependencies

Crate | Purpose
tokio | Async runtime: daemon server, claim loop, scheduler, runner, plugin stdio.
sqlx (SQLite) | Async, runtime-checked queries; migrations 0001–0010. Postgres-compatible schema for a future backend.
ratatui + crossterm | TUI rendering (host + plugin screens).
cron v0.12 | Cron expression parsing (6-field; 5-field POSIX normalised by prepending "0 ").
chrono | Time math for next-tick calc (bridged to epoch-millis storage via millis_to_utc/utc_to_millis).
security-framework (macOS) | OS keychain backend for the secret store (ainb-hangar-secrets).
sha2 + subtle | PAT/daemon-token hashing (sha256, stored hash-only) + constant-time verify.
tracing-subscriber + tracing-appender | Structured JSONL sink with daily rotation (daemon.<date>).
opentelemetry / opentelemetry-otlp (optional otlp feature) | OTLP span export; zero crates linked in the default build.
zeroize | SecretBytes wiped on drop.
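
The 5-field to 6-field cron normalisation noted for the cron crate can be illustrated with a small pure function. This is a sketch under assumptions: `normalise_cron` is a hypothetical name, the prepended `0` is taken to be the seconds field, and anything other than 5 or 6 fields is rejected here even though the real parser may accept more forms.

```rust
/// Normalise a cron expression to the 6-field form (seconds first).
/// A 5-field POSIX expression gets "0 " prepended; a 6-field one passes through.
/// (Hypothetical helper; field-count rules are an assumption of this sketch.)
pub fn normalise_cron(expr: &str) -> Result<String, String> {
    let fields: Vec<&str> = expr.split_whitespace().collect();
    match fields.len() {
        5 => Ok(format!("0 {}", fields.join(" "))),
        6 => Ok(fields.join(" ")),
        n => Err(format!("expected 5 or 6 cron fields, got {n}")),
    }
}
```

Normalising before parsing means the rest of the scheduler only ever sees one expression shape, and rejecting odd field counts at this step is one way to surface an invalid cron string at create time.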

04 Data & control flow

Two loops run continuously and independently: the dispatch loop (control flow: turns issues into running agent tasks) and the render loop (data flow: turns daemon state into TUI pixels).

Hangar control and data flow
Fig 2 — Dispatch (orange/control) vs render (blue/data + dashed events). The daemon is the single source of truth; the plugin self-heals on the next snapshot.

Dispatch loop (control)

  1. ainb hangar issue create --assign <agent> (or an autopilot tick) enqueues an agent_task_queue row.
  2. The daemon claim loop atomically claims the oldest queued task for an idle runtime (queued → dispatched), respecting per-agent max_concurrent_tasks.
  3. It materialises skills into the task's per-task directory at the provider-native path (.claude/skills/, .codex/skills/, .agent_context/skills/ …) — copied, scripts chmod 0755, kept outside the worktree git root so git status stays clean.
  4. It spawns the provider in an isolated git worktree (dispatched → running), streaming the transcript.
  5. On a terminal transition the FSM finalize path runs idempotently: it stamps done/failed/cancelled, cascades autopilot_run.completed_at when the task belongs to an autopilot run, and captures any gh pr create URL into result.pr_url.
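
Step 2 of the dispatch loop, claiming the oldest queued task while respecting the per-agent cap, can be sketched in memory. This is an illustration only: the real claim is an atomic store operation, and the `Task` fields, the `claim_next` name, and the default cap of 1 for an unknown agent are all assumptions made for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status { Queued, Dispatched, Running, Done, Failed, Cancelled }

/// Hypothetical in-memory stand-in for an agent_task_queue row.
#[derive(Debug, Clone)]
pub struct Task {
    pub id: u64,
    pub agent_id: u64,
    pub created_at_ms: i64,
    pub status: Status,
}

/// Claim the oldest queued task whose agent is below its concurrency cap.
/// Returns the claimed task id, or None when nothing is claimable.
pub fn claim_next(tasks: &mut [Task], max_concurrent: &HashMap<u64, usize>) -> Option<u64> {
    // Count in-flight (dispatched or running) tasks per agent.
    let mut in_flight: HashMap<u64, usize> = HashMap::new();
    for t in tasks.iter() {
        if matches!(t.status, Status::Dispatched | Status::Running) {
            *in_flight.entry(t.agent_id).or_insert(0) += 1;
        }
    }
    let candidate = tasks
        .iter_mut()
        .filter(|t| t.status == Status::Queued)
        .filter(|t| {
            // Assumption: an agent with no configured cap runs one task at a time.
            let cap = max_concurrent.get(&t.agent_id).copied().unwrap_or(1);
            in_flight.get(&t.agent_id).copied().unwrap_or(0) < cap
        })
        // Oldest first; the id breaks ties deterministically.
        .min_by_key(|t| (t.created_at_ms, t.id))?;
    candidate.status = Status::Dispatched; // queued -> dispatched
    Some(candidate.id)
}
```

Note that an older queued task is passed over when its agent is saturated, so a busy agent does not block work destined for an idle one.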

Render loop (data)

  1. The plugin sends workspace/subscribe for the active workspace.
  2. On the ack it fires snapshot RPCs — hangar/issues_list, tasks_list, agents_list, skills_list, autopilots_list, daemon_health — which the daemon answers from the store (resolving slug→id, scoping by workspace).
  3. The plugin folds the wire rows into screen state and renders.
  4. Async events (TaskStarted/TaskFinished, autopilot.tick_skipped, skill updates) stream back over the subscription for instant feedback; the next snapshot reconciles authoritatively, so a dropped event self-heals.
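
The self-healing behaviour in step 4 follows from treating events as optimistic updates and snapshots as authoritative replacements. A minimal sketch, assuming a hypothetical `ScreenState` and invented event payload fields (only the `TaskStarted` and `TaskFinished` names come from the text above):

```rust
use std::collections::BTreeMap;

/// Event names follow the text; the payload fields are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    TaskStarted { task_id: u64 },
    TaskFinished { task_id: u64, ok: bool },
}

/// Hypothetical screen state: task id -> status label shown on screen.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ScreenState {
    pub tasks: BTreeMap<u64, String>,
}

impl ScreenState {
    /// Optimistic update from a streamed event (instant feedback).
    pub fn fold_event(&mut self, ev: &Event) {
        match ev {
            Event::TaskStarted { task_id } => {
                self.tasks.insert(*task_id, "running".to_string());
            }
            Event::TaskFinished { task_id, ok } => {
                let label = if *ok { "done" } else { "failed" };
                self.tasks.insert(*task_id, label.to_string());
            }
        }
    }

    /// Authoritative update: the snapshot replaces whatever events produced,
    /// so a dropped or reordered event is corrected on the next pull.
    pub fn apply_snapshot(&mut self, rows: &[(u64, &str)]) {
        self.tasks = rows.iter().map(|(id, st)| (*id, st.to_string())).collect();
    }
}
```

If a `TaskFinished` event is lost, the screen shows "running" only until the next snapshot arrives, at which point the daemon's row wins.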

05 Task lifecycle (FSM)

Every unit of agent work is an agent_task_queue row walking a strict finite-state machine. The transition table is exhaustively defined in ainb-hangar-core and enforced by the store's finalize services.

Task FSM state machine
Fig 3 — queued → dispatched → running → done | failed | cancelled, with idempotent finalize, retry via parent_task_id, and TTL sweepers.
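
A transition table of this shape can be sketched as a pure function plus an idempotent finalize. The happy path (queued → dispatched → running → terminal) comes from the figure; the extra edges are assumptions of this sketch: cancellation is allowed from any non-terminal state, and a stale dispatched task may be failed directly by the sweeper. The `Finalize` result type is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus { Queued, Dispatched, Running, Done, Failed, Cancelled }

impl TaskStatus {
    pub fn is_terminal(self) -> bool {
        matches!(self, TaskStatus::Done | TaskStatus::Failed | TaskStatus::Cancelled)
    }
}

/// Exhaustive allow-list of legal transitions; everything else is illegal.
pub fn can_transition(from: TaskStatus, to: TaskStatus) -> bool {
    use TaskStatus::*;
    matches!(
        (from, to),
        (Queued, Dispatched)
            | (Dispatched, Running)
            | (Running, Done)
            | (Running, Failed)
            | (Dispatched, Failed) // assumption: TTL sweep of a stale dispatch
            | (Queued, Cancelled)  // assumption: cancel from any non-terminal state
            | (Dispatched, Cancelled)
            | (Running, Cancelled)
    )
}

#[derive(Debug, PartialEq, Eq)]
pub enum Finalize { Applied, AlreadyFinal, Rejected }

/// Idempotent finalize: repeating the same terminal transition is a no-op,
/// and a conflicting second finalize is rejected (first writer wins).
pub fn finalize(current: &mut TaskStatus, target: TaskStatus) -> Finalize {
    if !target.is_terminal() {
        return Finalize::Rejected;
    }
    if current.is_terminal() {
        return if *current == target { Finalize::AlreadyFinal } else { Finalize::Rejected };
    }
    if can_transition(*current, target) {
        *current = target;
        Finalize::Applied
    } else {
        Finalize::Rejected
    }
}
```

Keeping the table free of IO is what lets it be tested exhaustively on its own, while the store layer only has to apply or refuse the transitions it is handed.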

06 Data model

A single SQLite database, multi-tenant by workspace since migration 0001. Every row is scoped to a workspace; every by-id query carries the workspace guard (the IDOR fix). The schema is kept Postgres-compatible for a future server backend.

Hangar SQLite schema ER diagram
Fig 4 — Entities & foreign keys across migrations 0001–0010 (16 tables): tenancy, actors, work queue, skills, auth, autopilots.

Tenancy

workspace (slug unique), user (email unique), member (role).

Actors

agent_runtime (status), agent (runtime, visibility, owner).

Work

issue + comment; agent_task_queue (status, attempt, parent_task_id, result JSON, autopilot_run_id) with a partial unique index that enforces one pending task per issue.

Skills

skill (unique per workspace/name), skill_file, agent_skill M:N junction.

Auth

pat + daemon_token (sha256 only), beads_mapping (hangar↔bd).

Autopilots

autopilot (cron_expr, max_concurrent_runs, next_tick_at, enabled), autopilot_run (status, completed_at).
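
The workspace guard on by-id queries can be illustrated with an in-memory stand-in for the store. This is a sketch, not the sqlx repositories: the `Store`, `Workspace`, and `Issue` types and the `u64` ids are hypothetical, and only the `resolve_workspace_id` name is taken from the document.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Workspace { pub id: u64, pub slug: String }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Issue { pub id: u64, pub workspace_id: u64, pub title: String }

/// Hypothetical in-memory stand-in for the SQLite-backed store.
pub struct Store {
    pub workspaces: Vec<Workspace>,
    pub issues: Vec<Issue>,
}

impl Store {
    /// Resolve a workspace identifier (slug or id) to the real row id.
    pub fn resolve_workspace_id(&self, ident: &str) -> Option<u64> {
        self.workspaces
            .iter()
            .find(|w| w.slug == ident || w.id.to_string() == ident)
            .map(|w| w.id)
    }

    /// By-id lookup that always carries the workspace guard: an issue id from
    /// another workspace is indistinguishable from a missing row.
    pub fn issue_by_id(&self, workspace_ident: &str, issue_id: u64) -> Option<&Issue> {
        let ws = self.resolve_workspace_id(workspace_ident)?;
        self.issues
            .iter()
            .find(|i| i.id == issue_id && i.workspace_id == ws)
    }
}
```

Returning the same "not found" result for a foreign id and a nonexistent id avoids leaking whether a row exists in another workspace.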

07 Plugin host capabilities & security

The plugin runs as a separate process and can only reach the host through declared, gated capabilities in its manifest.toml. Each capability is a Bool or an allow-List; the runtime enforces the grant before any privileged action.

Plugin host capability and security model
Fig 5 — Capability gating: ungranted → -32001; ambiguous form → -32003; daemon-side workspace scoping closes IDOR.
Capability | Host call | Enforcement
event_stream_subscribe | subscribe to event topics | topic-prefix allow-list
spawn_managed_subprocess | spawn a tracked child (e.g. the daemon) | list-form mandatory; bool-true rejected -32003; reaped on teardown
unix_socket_dial | dial the daemon socket | path allow-list, canonicalised; bool-true rejected -32003
secrets:read | host/secret_store_get → OS keychain | key allow-list; {scope, key}; read-only (no write path)
workspace:write | set active / default workspace | bool-only; list-form rejected -32003

An ungranted capability returns -32001 CAPABILITY_DENIED; an ambiguous/unsupported grant form returns -32003 MANIFEST_VALIDATION. Separately, every daemon-side by-id query is workspace-scoped (the resolve_workspace_id guard) so a leaked id from one workspace cannot read or mutate another's data.
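
The two error codes can be sketched as a single grant check. This is an illustration under assumptions: the `Grant`, `Form`, and `check` names are hypothetical, a `false` grant or an item missing from an allow-list is treated as denied, and matching is a plain string comparison rather than the topic-prefix or canonicalised-path matching described in the table.

```rust
/// A manifest grant is either a Bool or an allow-List.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Grant { Bool(bool), List(Vec<String>) }

/// Which grant form a capability accepts (hypothetical classification).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Form { BoolOnly, ListOnly }

pub const CAPABILITY_DENIED: i32 = -32001;
pub const MANIFEST_VALIDATION: i32 = -32003;

/// Check one request against the manifest grant for its capability.
pub fn check(grant: Option<&Grant>, form: Form, requested: &str) -> Result<(), i32> {
    match (grant, form) {
        // Ungranted (absent or explicitly false): denied.
        (None, _) => Err(CAPABILITY_DENIED),
        (Some(Grant::Bool(false)), _) => Err(CAPABILITY_DENIED),
        // The grant form must match what the capability supports.
        (Some(Grant::Bool(true)), Form::BoolOnly) => Ok(()),
        (Some(Grant::Bool(true)), Form::ListOnly) => Err(MANIFEST_VALIDATION),
        (Some(Grant::List(_)), Form::BoolOnly) => Err(MANIFEST_VALIDATION),
        // Allow-list: the requested item must be listed (exact match here).
        (Some(Grant::List(items)), Form::ListOnly) => {
            if items.iter().any(|i| i == requested) {
                Ok(())
            } else {
                Err(CAPABILITY_DENIED)
            }
        }
    }
}
```

Rejecting `true` for list-form capabilities means a manifest cannot grant "dial any socket" by accident; the plugin author has to enumerate what is allowed.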

08 Autopilots & the cron scheduler

An autopilot is a cron expression + an agent + instructions. A single daemon task drives all of them.

Autopilot cron scheduler loop
Fig 6 — The scheduler loop: pick earliest tick → sleep/​wake/​shutdown select → concurrency check → fire (single-tx) → reschedule.
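
One iteration of the loop in Fig 6 can be sketched as a synchronous step function. This is a simplification under stated assumptions: the real scheduler is an async daemon task with a sleep/wake/shutdown select and a single-transaction fire, whereas here `step`, `TickOutcome`, the `runs_in_flight` counter, and the `next_after` callback (standing in for the cron next-occurrence calculation) are all hypothetical.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Autopilot {
    pub id: u64,
    pub enabled: bool,
    pub next_tick_at_ms: i64,
    pub max_concurrent_runs: u32,
    pub runs_in_flight: u32, // assumption: tracked in memory for the sketch
}

#[derive(Debug, PartialEq, Eq)]
pub enum TickOutcome {
    /// No enabled autopilots at all.
    Idle,
    /// Earliest tick is in the future; sleep this many milliseconds.
    Sleep(i64),
    /// The autopilot fired a new run.
    Fired(u64),
    /// The tick was skipped because prior runs are still in flight.
    Skipped(u64),
}

/// One scheduler iteration: pick the earliest tick, then sleep, fire, or skip.
/// `next_after` stands in for the cron "next occurrence after now" calculation.
pub fn step(
    autopilots: &mut [Autopilot],
    now_ms: i64,
    next_after: impl Fn(&Autopilot, i64) -> i64,
) -> TickOutcome {
    let ap = match autopilots
        .iter_mut()
        .filter(|a| a.enabled)
        .min_by_key(|a| a.next_tick_at_ms)
    {
        Some(ap) => ap,
        None => return TickOutcome::Idle,
    };
    if ap.next_tick_at_ms > now_ms {
        return TickOutcome::Sleep(ap.next_tick_at_ms - now_ms);
    }
    // Concurrency guard: skip the tick if the cap is already reached.
    let outcome = if ap.runs_in_flight >= ap.max_concurrent_runs {
        TickOutcome::Skipped(ap.id)
    } else {
        ap.runs_in_flight += 1;
        TickOutcome::Fired(ap.id)
    };
    // Reschedule whether the tick fired or was skipped.
    let next = next_after(&*ap, now_ms);
    ap.next_tick_at_ms = next;
    outcome
}
```

Rescheduling on a skip as well as on a fire keeps a long-running job from causing a backlog of missed ticks to fire all at once when it finishes.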

09 Feature catalogue — what a user does

Every Hangar feature is reachable two ways: a TUI screen (open Hangar with g, then a hotkey) and/or the ainb hangar <noun> CLI.

TUI screens & hotkeys

Hotkey | Screen | What you do
1 | Issues | Browse/filter issues (All/Members/Agents/Mine chips), c create, a assign agent, Enter open task detail.
2 | Task detail | Live transcript (5-colour stream), PR badge + o open-in-browser, r retry / x cancel.
K | Kanban | 4 columns (queued/running/done/failed); Shift+←/→ moves a card → fires a task transition.
4 | Skills | s sync from toolkit, i/d attach/detach to selected agent, Enter view body.
, | Settings | Provider keys (keychain write), workspace switching (s active/d default/n new/r rename).
5 | Autopilots | List + recent runs; a/e create/edit, r run-now, d enable/disable.
D | Daemon health | Runtimes, claim-cache, concurrent tasks, dual-dim throughput sparkline (green success / red failure).
L | Logs | Tail the daemon's structured JSONL, level-filter chips, colour-by-level.
(modal) | Agent picker | Pick a human or agent to assign (presence dots, / filter, recents pinned).

CLI

ainb hangar issue      create | list | show
ainb hangar task       list | cancel | retry
ainb hangar autopilot  create | list | disable | enable | run
ainb hangar skills     sync | list
ainb hangar templates  list | show | use
ainb hangar logs       tail [-f] [--lines N] [--level L]
ainb hangar auth       token create | list | revoke (daemon-token create, hidden)
ainb hangar config     env.allow list|add|remove · warnings reset
ainb hangar beads      reconcile
ainb hangar daemon     status

10 Feature × test coverage

Built test-first: every feature carries an acceptance test (unit/integration), and most carry an e2e tripwire — a real test that drives ainb tui in a tmux pane (or the daemon over its real socket) and asserts the rendered/persisted result, per the tmux-ui-tripwire discipline.

Legend: has acceptance + e2e tripwire | (acc.) acceptance only | partial

Feature (user action) | Layer | Acceptance | e2e tripwire

Issues
Create / list / show issue | CLI | hangar_cli_integration.rs (4) + cli::hangar parse (3) | tripwire_hangar_issue_roundtrip.rs
Persist issue + assignee | store | repo_issue.rs (4) | — (via roundtrip)
Issue list screen (nav/filter/create) | TUI | issue_list_reducer_test.rs (7) | tripwire_p4_issue_list_renders.rs
Kanban board (4 cols, card move) | TUI | kanban_reducer (10) + rpc_over_socket + snapshot (5) | tripwire_kanban_columns_render.rs

Tasks
Task FSM (claim/start/complete/fail/cancel) | store+core | finalize_idempotency (22) + claim_task_integration + task_state_transitions | tripwire_task_happy_path_claude_provider.rs
Retry chain (parent/child, max-attempts) | store | retry_chain.rs (8) | (acc.)
TTL sweep (stale → fail) | daemon | sweeper_ttls.rs (10) | tripwire_ttl_sweeper_fails_stale_dispatched.rs
Task detail + transcript screen | TUI | transcript_reducer (10) + render_snapshot (2) | tripwire_p4_task_detail_streams.rs
Task CLI (list/cancel/retry) | CLI | hangar_cli_integration + cli::hangar parse | (acc.)
Task-started banner | TUI | banner_reducer_test.rs (6) | (acc.)

Agents
Agent picker (assign agent) | TUI | agent_picker_reducer_test.rs (8) | tripwire_p4_agent_picker_opens.rs
agents_list snapshot | daemon/store | repo_agent.rs + rpc_server.rs | — (in picker tripwire)

Skills
Skill repo CRUD (scoping, cascade) | store+core | skill_repo_tests (9) + skill_service inline | (acc.)
Skills sync importer (idempotent) | daemon/CLI | tripwire_skills_sync_idempotent.rs (5) + cli parse | screens_render_from_daemon (sync RPC)
Skill manager screen (attach/detach/sync) | TUI | skill_manager_reducer (9) + snapshot (2) | tripwire_p4_skill_manager_lists.rs
Dispatch-time materialisation | daemon | materialise_skills_tests.rs (8) | tripwire_skill_import_and_dispatch.rs

Templates
10 curated templates (embedded, resolve) | core | template_registry_tests.rs (5) | (acc.)
templates list / show / use | CLI+daemon | template_use_tests (6) + cli parse | (acc.)

Autopilots
Cron CRUD (reject invalid cron) | store+core | repo_autopilot (14) + cron.rs inline (12) | (acc.)
Scheduler fires on schedule | daemon | scheduler_loop + repo_autopilot_enqueue | tripwire_autopilot_fires_on_schedule.rs
Scheduler skips when in-flight | daemon | scheduler_loop::skip_when_prior_run_in_flight | tripwire_autopilot_skips_when_running.rs
Autopilots manager screen | TUI | autopilots_reducer (6) + snapshot (4) + rpc_over_socket | — (real-socket, no tmux)
autopilot CLI (create/list/disable/run) | CLI | hangar_autopilot_cli.rs (2) + cli parse |

Auth / Secrets
OS keychain store/get/delete | secrets | backend.rs (7) | tripwire_keychain_roundtrip.rs (#[ignore], dev-mac)
secret_store_get cap gating | runtime | secret_store_cap.rs (5) | (acc.)
PAT / daemon tokens (hash-only) | store+core | repo_token (11) + token.rs inline (3) + cli | (acc.)
Env allowlist (block LD_PRELOAD) | core+daemon | env_policy (5) + env_allow_config (3) + runner | tripwire_env_allowlist_blocks_ld_preload / _passes_home
danger-full-access first-run warning | core+daemon | warnings.rs inline (4) | tripwire_warning_shown_on_first_provider_use.rs
Workspace switching in Settings | TUI+runtime | settings_reducer + workspace_cap.rs (7) | tripwire_workspace_switch_e2e.rs
Settings screen (sections, key entry) | TUI | settings_reducer_test.rs | tripwire_p4_settings_renders.rs

Observability
Tracing JSONL sink | daemon | it_subscriber_writes_jsonl.rs | (acc.)
OTLP exporter (otlp feature) | daemon | it_otlp_export_when_endpoint_set.rs (--features otlp) | tripwire_otel_export_when_endpoint_set.rs
Instrumented service spans (8 methods) | store+daemon | service_spans_emit + beads_sync_spans_emit | (acc.)
Daemon health pane + sparkline | TUI+daemon | snapshot_daemon_health.rs (5) | tripwire_daemon_health_sparkline.rs
logs tail CLI + logs screen | CLI+TUI | logs.rs inline (8) + cli + snapshot_logs_screen (3) | — (no tmux for logs screen) (acc.)

gh integration
PR-URL capture into task result | core+daemon | pr_url_parse (10) + result inline + issues_list_pr_url (3) | tripwire_pr_capture.rs
PR badge + o open-in-browser | TUI | pr_badge_snapshot (5) + pr_open_keybinding (3) | tripwire_pr_badge.rs

Daemon / Transport
Daemon boot + migrations apply | daemon+store | tripwire_migrations_apply.rs (16 tables) | tripwire_daemon_boots.rs
Unix-socket JSON-RPC + snapshots | daemon+proto | wire_types (6) + rpc inline + rpc_server.rs | tripwire_hangar_plugin_connects.rs
workspace/subscribe + event stream | proto+plugin | event_roundtrip (6) + stream_decode (8) + daemon_dial | tripwire_detects_daemon_drop
Cross-screen navigation | TUI | screen_router_test.rs (5) | tripwire_p4_cross_screen_navigation.rs
Beads bidirectional sync | daemon | beads_adapter/reconcile/inbound/outbound/cli (50+) | tripwire_beads_roundtrip.rs
Claude runner exec (env/exit/stream/timeout) | daemon | runner_claude.rs (6) | — (in happy-path)
Full-suite e2e guard (no shrink) | daemon | | tripwire_full_e2e.rs

Coverage summary

22 — full e2e tripwire

Real tmux drive of ainb tui (11 screens/flows) or daemon-over-real-socket (11 control-plane flows).

12 — acceptance only

Strong unit/integration coverage; no tmux tripwire — task retry, task/templates/token CLIs, skill CRUD, autopilot CRUD, JSONL sink, service spans, logs screen.

0 — untested

Every feature has at least an acceptance test.

Honest gaps (not regressions — coverage shape):
  • The logs screen and autopilots manager screen have reducer + snapshot + real-socket coverage but no tmux capture-pane proof of the rendered screen.
  • The task / templates / token CLIs are acceptance-only (parse + handler), versus the issue path which has a full tmux roundtrip.
  • The keychain roundtrip tripwire is #[ignore] by default (needs a real dev-mac keychain prompt); the in-memory + cfg-gated backend tests are the authoritative proof.

11 How it was built — phases P0–P9

Per-bead TDD (RED → GREEN → review → scoped gate → close), each phase capped by e2e tripwires.

P0 ✅ Schema + store
P1 ✅ Task FSM + dispatch
P2 ✅ Daemon
P3 ✅ Plugin host caps
P4 ✅ 5 core screens
P5 ✅ Auth + workspace
P6 ✅ Skills + templates
P7 ✅ Autopilots + cron
P8 ✅ Kanban + health + obs
P9 ◐ gh + release

P0–P8 are complete and verified; P9 (gh integration ✅, PR badge ✅, CI matrix ✅) is in the release-prep stretch: draft PR feat/multica → main (#179) is open, with target tag v2.0.0.