agents-in-a-box · feat/multica · parity review

Hangar ⇆ Multica — feature-parity review

Hangar is a faithful, often cleaner, re-architecture of Multica’s task FSM, but it is not yet at feature parity. Of 113 mapped Multica features, 19 are at parity, 36 are partial, 45 are gaps, and 13 are out of scope by design. Three design-level holes sit underneath the feature deltas: an unauthenticated unix socket, a producer-absent event push channel, and a per-issue concurrency regression.

parity 19 (16.8%) · partial 36 (31.9%) · gap 45 (39.8%) · out-of-scope by design 13 (11.5%)
113 Multica features mapped · 162 Multica features extracted · 44 Hangar features catalogued · 87 claims adversarially re-checked · 35 beads proposed
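The percentage shares in the summary can be re-derived from the four status counts. The sketch below is illustrative only; `share_pct` is a hypothetical helper name and the counts are copied from the summary line.

```rust
/// Percentage share of `count` in `total`, rounded to one decimal place.
/// Hypothetical helper, used only to re-check the summary arithmetic.
fn share_pct(count: u32, total: u32) -> f64 {
    (f64::from(count) / f64::from(total) * 1000.0).round() / 10.0
}

fn main() {
    // Status counts as reported in the summary line.
    let statuses = [("parity", 19u32), ("partial", 36), ("gap", 45), ("out-of-scope", 13)];
    let total: u32 = statuses.iter().map(|(_, n)| n).sum();
    for (name, n) in statuses {
        println!("{name}: {n} of {total} ({:.1}%)", share_pct(n, total));
    }
}
```

The four counts sum to 113, the number of mapped rows, so the shares cover the whole matrix.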

01 Methodology: how this review was produced

This review was produced by a 96-agent workflow. Five parallel extractors catalogued 102 code-level features plus 60 doc/site-level features on the Multica side (162 total) and 44 features on the Hangar side. An independent mapping pass produced the 113-row parity matrix below; every gap and partial claim was then adversarially re-checked against the Hangar codebase by separate verifier agents.

54 confirmed

The verifier reproduced the claim against Hangar source with file:line evidence.

7 adjusted

The claim was directionally right but the status or rationale changed on re-check (e.g. gap → partial).

2 refuted

The original claim was wrong — the feature exists (agent concurrency limit; appearance/theme). Statuses corrected.

26 unchallenged

Parity / out-of-scope rows that were not sent through the adversarial pass.

Honest disclosure.
  • 24 of 113 rows are flagged unverified in the matrix: their adversarial verdicts could not be matched back to a row name (verdict→row matching failed), so they kept their pre-verification status. Treat those statuses as the mapper’s opinion, not a checked fact.
  • 26 parity / out-of-scope rows were never challenged — the adversarial pass targeted only gap/partial claims.
  • The two most severe architecture claims were independently re-verified by hand: rg over ainb-hangar-daemon/src finds zero HangarEvent/EVENT_METHOD emission sites and zero auth-token use on the RPC path; bind() at rpc/mod.rs:104 creates the socket with default permissions.
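To make the socket-permission finding concrete, here is a minimal sketch of one way a unix socket can be restricted to its owner after binding. It is not Hangar's code: `bind_owner_only` and the socket path are hypothetical, and the mode 0600 is an assumed hardening target rather than something the review prescribes.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Bind a unix socket, then restrict it to the owning user (mode 0600).
/// Hypothetical helper for illustration only.
fn bind_owner_only(path: &Path) -> io::Result<UnixListener> {
    // Remove a stale socket file left by a previous run, if any.
    if path.exists() {
        fs::remove_file(path)?;
    }
    let listener = UnixListener::bind(path)?;
    // bind() leaves the mode to the process umask; tighten it explicitly.
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
    Ok(listener)
}

fn main() -> io::Result<()> {
    // Hypothetical path, chosen only so the sketch runs anywhere.
    let path = std::env::temp_dir().join(format!("hangar-sketch-{}.sock", std::process::id()));
    let _listener = bind_owner_only(&path)?;
    let mode = fs::metadata(&path)?.permissions().mode() & 0o777;
    println!("socket mode: {mode:o}");
    fs::remove_file(&path)?;
    Ok(())
}
```

A chmod after bind still leaves a short window in which the socket has its default mode; placing the socket inside a directory that is already owner-only would close that window. File permissions also do not replace an auth token on the RPC path.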

02 Full parity matrix: all 113 rows

Every Multica feature mapped to its Hangar equivalent, grouped by area. Status is one of parity, partial, gap, or oos (out of scope by design). Each row carries a verification badge; gap and partial rows also carry a priority chip. The evidence & judgment entry under a row holds the verifier’s file:line evidence and the web-only-feature judgment.

Multica feature | Hangar equivalent | Status | Verified | Rationale
Issues & projects · 26 rows
Create issue (title, desc, status, priority, assignee, dates, labels) | issue create CLI/screen (F01/F03) | partial P1 | confirmed | Create supports title/description/state/assignee only; no priority, due dates, or labels in the issue model.
evidence & judgment

Web-only judgment — Issue creation is fully TUI/CLI-native and present, but the data model lacks priority/dates/labels fields, so partial with real gaps in those attributes.

Verification evidence — store NewIssue/Issue ainb-hangar-store/src/repo/issue.rs:26-66 + wire IssueRow ainb-hangar-proto/src/events.rs:203-227 + CLI IssueCreateArgs ainb-core/src/cli/hangar/mod.rs:425-443 all = title/desc/state/assignee only; no priority/due/labels. Labels/priority live only in beads adapter (beads_adapter/mod.rs:99 discards priority).

Quick-create issue | issue create (F01) with c on issue list (F03) | partial P3 | unverified | 'c' on the issue-list screen creates an issue inline; a distinct origin-tagged quick-create lane is not modeled.
evidence & judgment

Web-only judgment — Inline quick-create is TUI-native and present; the origin-tracking nuance is the only missing piece.

Board & list views | Issue list (F03) + Kanban board (F04) | parity | unchallenged | Issue-list screen plus 4-column kanban with card-move RPC mirror Multica's list+board.
Grouped issue views | Kanban columns by task status (F04) | partial P3 | confirmed | Kanban groups by task status (queued/running/done/failed); grouping by arbitrary fields (status/priority/assignee) is not offered.
evidence & judgment

Web-only judgment — Status-grouped columns exist; richer group-by is TUI-expressible, hence partial.

Verification evidence — Status grouping: issue_list.rs:37-76 (IssueColumn Todo/InProgress/Done), kanban.rs:56-112; tested in issue_list_reducer_test.rs:109-151. Assignee is filter not group: issue_list.rs:84-126. No priority on IssueRow (events.rs:203-215) or store; no group-by axis toggle in plugin.

Issue detail page (timeline, comments, subscribers, runs) | Task detail + live transcript (F08); task runs | partial P1 | confirmed | Hangar's detail shows the live agent transcript and PR badge but has no comments, subscribers, or full activity timeline.
evidence & judgment

Web-only judgment — Detail/transcript is TUI-native and present; missing comments/timeline/subscribers are TUI-expressible gaps, so partial.

Verification evidence — task_detail.rs:508/519 (PR badge+transcript), :424-433 interleave CommentAdded, sidebar.rs:32-46 (status/assignee/project, no subscribers). CommentRow/table exist (events.rs:357, migration 0003) but NO daemon producer, no comment repo, no compose key (keymap :338-348 only j/k/R/X). Arch doc:155 scopes detail to transcript+badge+retry/cancel. No subscribers/separate timeline anywhere.

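Edit / update issue (status, priority, assignee, project, dates) | task transition (F04 card move); assign (F11) | partial P1 | adjusted | On re-check the mapped paths do not edit issues: kanban moves act on tasks, the assignee-picker intent is dropped, and there is no issue-write RPC, so issue state changes only via beads sync. Priority/project/dates are not in the model.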
evidence & judgment

Web-only judgment — Field-edit is TUI-native; the missing fields (priority/project/dates) are real model gaps, so partial.

Verification evidence — Rationale refuted: assignee picker intent dropped (app_screens.rs:646-655, route ignores out.intent); kanban moves tasks not issues (kanban.rs:1-10, methods.rs:119); no issue-write RPC (methods.rs:155-174); issue state only via beads (inbound.rs:184). Create ignored (app_screens.rs:557). Model lacks priority/project/dates (events.rs:203-227).

Batch update / delete issues | none | gap P2 | unverified | No multi-select batch update/delete in issue-list screen or CLI.
evidence & judgment

Web-only judgment — Multi-select batch ops are common in TUIs; absent, so gap not oos.

Parent / child sub-issues | none (issues); task retry parent/child (F06) | gap P2 | unverified | Tasks have parent/child for retries, but issues have no parent_id/sub-issue hierarchy.
evidence & judgment

Web-only judgment — Sub-issue nesting is a TUI-expressible data relationship; not present on issues, so gap.

Label issues (create/attach/detach labels) | none | gap P2 | confirmed | No label table or issue-label join; 'label' hits are bd-sync flags, not issue labels.
evidence & judgment

Web-only judgment — Issue labels are simple CRUD+chips, very TUI-native; entirely missing, so gap.

Verification evidence — migrations/0003_issue_comment.sql:19-43 (no label table/join); store/src/repo/issue.rs:84-99 (insert has no label col); proto/events.rs:203-227 (IssueRow no labels); proto/methods.rs ALL_METHODS + daemon rpc/mod.rs:260-261 (only skill_attach/detach, no issue-label verb); beads_sync/inbound.rs:43 SYNC_LABEL="hangar-v1" is bd-filter flag only.

Custom issue metadata (KV get/set/delete) | none | gap P2 | confirmed | No per-issue key/value metadata store or CLI; pipeline-state tracking unavailable.
evidence & judgment

Web-only judgment — KV metadata is a CLI-native feature (Multica's own is CLI form-factor); absent in Hangar, so gap.

Verification evidence — migrations/0003_issue_comment.sql:23-33 issue table has fixed cols, no metadata/KV col; proto/src/methods.rs:155-174 ALL_METHODS has no meta/kv method; store/src/repo/issue.rs:81,112,135,156 only insert/get/update_state/list — no set/get/delete_metadata; no issue_meta/pipeline_state anywhere.

Subscribe / unsubscribe to issues | none | gap P3 | confirmed | workspace/subscribe is an event-stream subscribe, not per-issue notification subscription; no issue-subscriber model.
evidence & judgment

Web-only judgment — Per-issue subscription driving notifications is TUI-expressible; missing, so gap.

Verification evidence — ainb-hangar-proto/src/methods.rs:155 ALL_METHODS has only workspace/subscribe (event stream, line 19) + hangar/issues_list; no issue/subscribe. issue.rs:49 Issue struct lacks subscribers/watchers field; no subscription migration (0001-0010); no daemon handler; notification hits = notifyd/tracing only.

React to issues (emoji reactions) | none | gap P3 | confirmed | No reaction table or surface for issues.
evidence & judgment

Web-only judgment — Emoji reactions render fine in a TUI; absent, so gap (marginal).

Verification evidence: No reaction table: migrations/0003_issue_comment.sql:19-43 has only issue+comment tables, no reaction col. IssueRow has no reaction field (proto/events.rs:203). issue_list.rs reducer keys are j, k, /, c (create), a (assign), and Enter only (issue_list.rs:338-352); no reaction RPC/event anywhere.

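Search issues (full-text) | issue list `/` title filter (F03) | partial P2 | adjusted | No server-side full-text search index or search RPC/CLI over issues; the issue list offers only a client-side, title-only `/` filter alongside its filter chips.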
evidence & judgment

Web-only judgment: Issue full-text search is CLI/TUI-expressible (Multica ships a CLI search); Hangar has only a title-only client-side filter, so partial.

Verification evidence — issue_list.rs:152,171-173,231-234 has `/` free-text query: `r.title.to_lowercase().contains(&q)` (title-only, client-side). Claim's "filter chips only" is wrong. But no server search: rpc/mod.rs:203 + snapshots.rs:46 only *_LIST; no FTS/LIKE in migrations/0003 (only idx_issue_workspace_state) or repo/issue.rs:164. Narrower than Multica's ranked title+desc+comment LIKE search → partial.
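The gap between the title-only filter and a wider match can be sketched as follows. This is a hedged illustration, not the plugin's code: `IssueRow` here is a cut-down stand-in, and `matches_title_or_description` is a hypothetical widening that still falls short of a ranked server-side search.

```rust
/// Cut-down stand-in for the wire row; the real IssueRow carries more fields.
struct IssueRow {
    title: String,
    description: String,
}

/// Title-only match, mirroring the expression quoted in the evidence.
fn matches_title(row: &IssueRow, query: &str) -> bool {
    row.title.to_lowercase().contains(&query.to_lowercase())
}

/// Hypothetical widening: match the description as well as the title.
fn matches_title_or_description(row: &IssueRow, query: &str) -> bool {
    let q = query.to_lowercase();
    row.title.to_lowercase().contains(&q) || row.description.to_lowercase().contains(&q)
}

fn main() {
    let row = IssueRow {
        title: "Fix login".to_string(),
        description: "Token refresh fails after sleep".to_string(),
    };
    println!("title-only: {}", matches_title(&row, "token"));
    println!("title+description: {}", matches_title_or_description(&row, "token"));
}
```

A query that appears only in the description is missed by the first function and found by the second, which is the narrowing the verifier describes.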

My Issues view | issue list 'Mine' filter chip (F03) | partial P3 | unverified | Issue-list has a 'Mine' chip approximating My Issues, but no dedicated assigned/created/agent-scoped view.
evidence & judgment

Web-only judgment — A Mine filter exists; richer scoped views are TUI-expressible, hence partial.

Comment on issues | comment table (schema only); CommentRow event | gap P1 | confirmed | comment table and CommentRow event exist but there is no write path (no RPC/CLI/screen to post a comment).
evidence & judgment

Web-only judgment — Commenting is fully TUI-native (text input + list); schema present but no create surface, so gap and core-workflow important.

Verification evidence — No write path: comment table only created (migrations/0003_issue_comment.sql:36), never inserted; no comment method in proto methods.rs ALL_METHODS (drift-guarded); CommentRow is display-only via fold at task_detail.rs:424; intents are RetryTask/CancelTask only (task_detail.rs:300-305).

Threaded comment replies | none | gap P3 | confirmed | Comment table has no parent_id; no threading and no comment write path at all.
evidence & judgment

Web-only judgment — Threaded replies are TUI-expressible but depend on the absent comment feature; gap.

Verification evidence — migrations/0003_issue_comment.sql comment table has no parent_id (only id,issue_id,author_*,body,created_at). Zero "comment" mentions in ainb-hangar-daemon/src; no INSERT INTO comment / add_comment / comment RPC. CommentRow/CommentAdded (proto events.rs:356) only built in proto event_roundtrip_test.rs sample_comment — dead wire type, no write path.

Edit / delete comments | none | gap P3 | unverified | No comment mutation surface; comments cannot even be created.
evidence & judgment

Web-only judgment — Edit/delete are TUI-native list actions; gap because comments are absent end-to-end.

Resolve / unresolve comments | none | gap P3 | unverified | No resolved_at field or resolve surface on comments.
evidence & judgment

Web-only judgment — Resolve toggle is TUI-expressible; gap, contingent on comments existing.

React to comments | none | gap P3 | unverified | No comment reaction surface.
evidence & judgment

Web-only judgment — Reactions render in a TUI; gap, marginal.

Projects (status, priority, lead, grouped issues) | none | gap P2 | confirmed | No project model, project_id on issues, or project screen/CLI.
evidence & judgment

Web-only judgment — Projects are data+list views, fully TUI/CLI-expressible (Multica has a project CLI); absent in Hangar, so gap.

Verification evidence — No project table in migrations 0001-0010; issue.rs:49-66 Issue has only workspace_id (no project_id/priority/lead); events.rs:203 IssueRow same; methods.rs lists 15 RPCs, none project; cli/hangar/mod.rs zero project tokens; sidebar.rs:45 "Project" label aliases issue.workspace_id.

Project resources | none | gap P3 | unverified | No projects, so no resource links/docs attachment.
evidence & judgment

Web-only judgment — Resource pointers are CLI-expressible; gap, contingent on projects.

Search projects | none | gap P3 | unverified | No projects and no project search.
evidence & judgment

Web-only judgment — Project search is CLI/TUI-expressible; gap, depends on projects.

Upload / view attachments | none | gap P3 | unverified | No file upload, issue-attachment listing, or preview surface.
evidence & judgment

Web-only judgment — Attaching files by path and listing them is CLI/TUI-expressible; image preview is oos but listing/attach is a gap, so overall gap.

Link PRs to issues with CI / conflict status | PR-URL capture (F36) + PR badge / open-in-browser (F37) | partial P2 | confirmed | Hangar parses a gh PR URL from the transcript and shows a badge with open-in-browser, but does not surface CI check results or merge-conflict status, nor auto-move-to-Done.
evidence & judgment

Web-only judgment — PR linking is TUI-expressible and partly present (URL+badge); CI/conflict status and auto-done are the gaps, so partial.

Verification evidence — pr_url.rs:47 scrapes one PR URL via regex; result.rs:42 + events.rs:220 carry only pr_url:Option<String>; task_detail.rs:541 renders "▶ PR <url> [o] open". No ci_status/check_run/mergeable/conflict anywhere (grep all crates). FSM done stamps on task exit not PR merge (architecture.md:83). P9.md:40 scopes out webhooks.
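The URL-capture half of this row can be illustrated with a small std-only sketch. The evidence says the real code uses a regex; the version below swaps in plain token matching so it needs no crates. `first_pr_url` and `is_pr_url` are hypothetical names, and the `https://github.com/<owner>/<repo>/pull/<number>` shape is an assumption about what counts as a PR URL.

```rust
/// True when `tok` looks like https://github.com/<owner>/<repo>/pull/<number>.
/// The accepted shape is an assumption for this sketch.
fn is_pr_url(tok: &str) -> bool {
    let Some(path) = tok.strip_prefix("https://github.com/") else {
        return false;
    };
    let parts: Vec<&str> = path.split('/').collect();
    matches!(
        parts.as_slice(),
        [owner, repo, "pull", num]
            if !owner.is_empty()
                && !repo.is_empty()
                && !num.is_empty()
                && num.chars().all(|c| c.is_ascii_digit())
    )
}

/// Return the first PR URL in `text`, scanning whitespace-separated tokens
/// and ignoring trailing punctuation such as a full stop.
fn first_pr_url(text: &str) -> Option<&str> {
    text.split_whitespace()
        .map(|tok| tok.trim_end_matches(|c: char| !c.is_ascii_alphanumeric()))
        .find(|tok| is_pr_url(tok))
}

fn main() {
    let transcript = "Opened https://github.com/acme/widgets/pull/42. Waiting on CI.";
    println!("{:?}", first_pr_url(transcript));
}
```

Only the URL string is recovered this way, which matches the row's point: CI and merge-conflict status would need a separate data source.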

Global search / Command palette (Cmd+K) | per-screen filter chips | gap P2 | confirmed | No global cross-entity search or command palette; only per-screen filter chips exist.
evidence & judgment

Web-only judgment — A fuzzy command palette is a classic TUI pattern (very TUI-native); absent, so gap.

Verification evidence — router.rs:34-62 global keys are tab/help/quit only, no Cmd+K/palette; proto methods.rs:19-155 ALL_METHODS closed catalogue has no search/find RPC; issue_list.rs:152-234 `/` filters only current-screen issue titles; agent_picker.rs:7,92-103 `/` narrows assign-modal actors only. No cross-entity search anywhere.

Assignee frequency suggestions | agent picker recents pinned (F11) | partial P3 | confirmed | Agent picker pins recents, approximating frequency suggestions, but not a true frequency-ranked assignee model.
evidence & judgment

Web-only judgment — Recency/frequency ordering is TUI-expressible and partly present (recents), so partial.

Verification evidence — No frequency model exists. Picker pins by recency only (agent_picker.rs:117-131 sort_recent_then_alpha; events.rs:248-250 recent_rank). Daemon hardcodes recent_rank: None for every actor (snapshots.rs:113-116,138,149), so even recency pinning is inert; only alpha+filter is live.
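A recent-then-alphabetical ordering of the kind the evidence names can be sketched as below. The `Actor` struct is a hypothetical stand-in, and treating a lower `recent_rank` as more recent is an assumption. The sketch also shows why a rank that is always `None` leaves only alphabetical order.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a picker entry.
#[derive(Debug, Clone, PartialEq)]
struct Actor {
    name: String,
    recent_rank: Option<u32>,
}

/// Ranked actors first (lower rank first, assumed to mean more recent),
/// then unranked actors alphabetically.
fn sort_recent_then_alpha(actors: &mut [Actor]) {
    actors.sort_by(|a, b| match (a.recent_rank, b.recent_rank) {
        (Some(x), Some(y)) => x.cmp(&y).then_with(|| a.name.cmp(&b.name)),
        (Some(_), None) => Ordering::Less,
        (None, Some(_)) => Ordering::Greater,
        (None, None) => a.name.cmp(&b.name),
    });
}

fn actor(name: &str, recent_rank: Option<u32>) -> Actor {
    Actor { name: name.to_string(), recent_rank }
}

fn main() {
    let mut actors = vec![actor("zed", Some(0)), actor("amy", None), actor("bob", Some(1))];
    sort_recent_then_alpha(&mut actors);
    let names: Vec<&str> = actors.iter().map(|a| a.name.as_str()).collect();
    println!("{names:?}");
}
```

When every `recent_rank` is `None`, only the last match arm ever runs, so the result is a plain alphabetical list and the pinning has no visible effect.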

Agents, tasks & runtimes · 19 rows
Assign issue to agent or member | Agent picker (F11) + --assign enqueue (F01) | parity | unchallenged | Agent picker assigns human or agent; agent assignment enqueues a task, the core assign-and-queue flow.
Rerun issue task | task retry (F06/F09); task detail r | parity | unchallenged | 'r' in task detail and CLI task retry spawn a fresh execution, matching rerun.
Cancel running task | task cancel (F08 x / F09 CLI) | parity | unchallenged | Task detail 'x' (confirm modal) and CLI task cancel match cancel-in-progress.
View task runs & run messages | Task detail live transcript (F08); execution history | partial P2 | unverified | Single live transcript per task is shown; a list of historical runs per issue with per-run message logs is not fully surfaced.
evidence & judgment

Web-only judgment — Transcript view is present and TUI-native; the per-issue run-history list is the gap, so partial.

Mention agent in comment to trigger it | none | gap P1 | confirmed | No comment mention parser or comment-triggered task path; comments can be rendered but have no write path.
evidence & judgment

Web-only judgment — @-mention-to-trigger is a core agent-collaboration loop and TUI-expressible; absent, so an important gap.

Verification evidence: No mention parser/regex over comment bodies anywhere; no comment-write RPC (methods.rs ALL_METHODS lacks any comment method); zero comment refs in ainb-hangar-store. CommentAdded (events.rs:97) only rendered read-only at task_detail.rs:424, produced solely by test fixtures. No comment-triggered task-spawn path. (The original rationale's "comments unreachable" is wrong: they render, but the feature is absent.)

Agent posts progress / blocker comments | live task transcript stream (F08/F40) | partial P2 | confirmed | Running agents stream progress to the task transcript, but they do not write durable system-authored comments on the issue.
evidence & judgment

Web-only judgment — Progress reporting exists via transcript (TUI-native); persisting it as issue comments is the gap, so partial.

Verification evidence — Progress streams: runner.rs:294 stream_stdout; docs/hangar/architecture.md:82. But comment table (migrations/0003_issue_comment.sql:36) is unwired: no SQL touches it, no repo/comment.rs, no hangar/comment RPC in proto/src/methods.rs; CommentRow/CommentAdded (events.rs:96,355) constructed only in tests, never by daemon/runner.

Create / edit / archive agents | templates use materialises agent (F18); agent repo | partial P1 | confirmed | Agents can be created by materialising a template (CLI), but there is no general create/edit/archive/restore agent CRUD surface or archived flag.
evidence & judgment

Web-only judgment — Agent CRUD is fully CLI/TUI-expressible; only template-instantiation exists, no edit/archive, so partial.

Verification evidence — agent.rs:55/79/94 AgentRepo=insert/get/list only (no update/delete); migration 0002 agent table has no archived col; methods.rs ALL_METHODS has only hangar/agents_list; zero `archiv` matches in all hangar crates+CLI; create only via `templates use` (cli/hangar/mod.rs) + seed.rs.

Create agent from template | templates list/show/use (F17/F18) | parity | unchallenged | 10 curated embedded templates with list/show/use that materialises a live agent + skills.
Configure agent runtime, model, args, env, MCP, thinking | agent table (instructions/runtime/visibility only); env allowlist (F27) | partial P1 | confirmed | Runtime binding and env allowlist exist, but model, thinking level, custom CLI args, MCP config, and per-agent env vars are not in the agent model.
evidence & judgment

Web-only judgment — Agent config is CLI/TUI-expressible; most knobs (model/args/MCP/thinking/env) are absent from the schema, so partial with substantial gaps.

Verification evidence — agent model has only id/workspace_id/name/runtime_id/instructions/visibility/owner_id (migrations/0002_agent_runtime_skill.sql:31-39, repo/agent.rs:26-42). Runtime binding present (runtime_id FK + provider). Env is global EnvPolicy allowlist, not per-agent (env_policy.rs:84). Zero matches for mcp/model_id/thinking_level across all hangar crates. Thinking refs are transcript display only (task_detail.rs:51).

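Agent concurrency limit | agent.max_concurrent_tasks enforced at claim; daemon claim-slot cache (F34) | parity | refuted | The original claim (no configurable per-agent max-concurrency field) was refuted on re-check: agent.max_concurrent_tasks exists with default 1 and is enforced atomically at claim time.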
evidence & judgment

Web-only judgment: Per-agent concurrency is a config field, TUI-expressible; the mapper judged it absent (gap), which verification refuted, so parity.

Verification evidence — migrations/0006_claim_dispatch_concurrency.sql:24 adds `agent.max_concurrent_tasks INTEGER NOT NULL DEFAULT 1`; store/src/service/claim.rs:108-111 enforces it atomically in CLAIM_SQL (cites Multica task.go:761 CountRunningTasks); tests/claim_task_integration.rs:245,275 cover caps of 1 and 2.
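The per-agent cap described in the evidence can be sketched in memory as follows. This is not the store's CLAIM_SQL: `Task`, `TaskStatus`, and `claim_next` are hypothetical, and the real check is described as atomic in SQL, which a single-threaded sketch cannot show. The fallback cap of 1 mirrors the column default cited above.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum TaskStatus {
    Queued,
    Running,
}

/// Hypothetical in-memory task record.
struct Task {
    agent_id: u32,
    status: TaskStatus,
}

/// Claim the first queued task whose agent is below its cap and return its index.
/// Agents missing from `caps` fall back to a cap of 1.
fn claim_next(tasks: &mut [Task], caps: &HashMap<u32, usize>) -> Option<usize> {
    // Count running tasks per agent.
    let mut running: HashMap<u32, usize> = HashMap::new();
    for t in tasks.iter().filter(|t| t.status == TaskStatus::Running) {
        *running.entry(t.agent_id).or_insert(0) += 1;
    }
    // Find a queued task whose agent still has a free slot.
    let idx = tasks.iter().position(|t| {
        t.status == TaskStatus::Queued
            && running.get(&t.agent_id).copied().unwrap_or(0)
                < caps.get(&t.agent_id).copied().unwrap_or(1)
    })?;
    tasks[idx].status = TaskStatus::Running;
    Some(idx)
}

fn main() {
    let mut tasks = vec![
        Task { agent_id: 1, status: TaskStatus::Queued },
        Task { agent_id: 1, status: TaskStatus::Queued },
    ];
    let caps = HashMap::new(); // no explicit caps, so every agent gets 1
    println!("first claim: {:?}", claim_next(&mut tasks, &caps));
    println!("second claim: {:?}", claim_next(&mut tasks, &caps));
}
```

With the default cap the second claim is refused while the first task is still running; raising the agent's cap to 2 lets it through, which is the behaviour the integration tests for caps of 1 and 2 are cited as covering.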

Cancel all agent tasks | per-task cancel (F08/F09) | partial P3 | confirmed | Individual tasks can be cancelled, but there is no single 'cancel all tasks for this agent' operation.
evidence & judgment

Web-only judgment — Bulk-cancel-by-agent is a CLI verb, TUI-expressible; only per-task exists, so partial.

Verification evidence — Per-task cancel only: cancel.rs:41 cancel(task_id); methods.rs:121 task_transition params {task_id}; cli/hangar/mod.rs:466,482 Cancel(TaskIdArgs{id}); kanban.rs:301 single-card transition; task_detail.rs:304 CancelTask(TaskId). No by_agent/bulk cancel query in store or daemon.

Set agent avatar | none | oos | unchallenged | Avatar is an uploaded image with no meaningful terminal rendering.
evidence & judgment

Web-only judgment — Image avatars have no TUI rendering surface; oos. (Presence dots in the picker are the TUI-native identity cue.)

Live task progress streaming | task detail live transcript (F08); task message stream (F40) | parity | unchallenged | 5-colour live transcript streams task messages as events to the UI, matching task:message/updated streaming.
Manage agent runtimes / Unified runtimes dashboard | Daemon health pane (F34) lists registered runtimes; daemon status CLI | partial P2 | confirmed | Daemon health lists registered runtimes with status, but there is no full manage surface (update, set visibility, delete, archive-and-delete).
evidence & judgment

Web-only judgment — Runtime listing+status is present and TUI-native; the management verbs (delete/visibility/update) are the gap, so partial.

Verification evidence — methods.rs:155-173 ALL_METHODS has only hangar/agents_list + hangar/daemon_health (read); no update/delete/visibility/archive verbs. agent_runtime.rs:37-94 + agent.rs:55-94 repos are insert/get/list only despite agent carrying visibility/owner (architecture.md:112). daemon_health.rs screen has zero key handlers.

Inspect runtime models / skills / update CLI | none | gap P3 | confirmed | No query of a runtime's available models or local skills, and no runtime CLI-update trigger.
evidence & judgment

Web-only judgment — Querying runtime models/skills is CLI-expressible; absent, so gap.

Verification evidence — agent_runtime.rs:17-32 + migration 0002_agent_runtime_skill.sql:16-24 store only provider/mode/status (no models/version); settings.rs:113-120 RuntimeHealthRow exposes only provider/connected/pid; methods.rs lists 15 RPCs, none query runtime models or trigger CLI update; runner.rs:1 spawns claude only to execute tasks.

Cloud runtime node control | none (runtime_mode cloud enum only) | oos | adjusted | Schema models a 'cloud' runtime_mode but there is no create/start/stop/reboot/exec on cloud nodes.
evidence & judgment

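Web-only judgment: Node lifecycle commands are CLI-expressible; the enum exists but no control surface. The mapper judged this a gap; verification re-classified it as oos because cloud runtime is a deliberate scope-out (and Multica itself marks cloud as waitlist).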

Verification evidence — No cloud-node verbs exist (methods.rs:155-174 ALL_METHODS test-guarded; agent_runtime.rs:45-94 only insert/get/list; seed.rs:171 local-only). But it's a deliberate scope-out: build-plan.md:56 "Cloud runtime: skip entirely"; Multica's cloud control is a proxy to a closed SaaS fleet (research/01:136, research/04:43).

Multiple coding-agent backends (12 providers) | claude runner only (F43); Provider trait stub | partial P1 | confirmed | Only the claude provider ships; a Provider trait exists for future codex/copilot/gemini but no other backend is implemented.
evidence & judgment

Web-only judgment — Multi-provider execution is daemon-native and core to the product; only 1 of 12 implemented, so partial with a major gap.

Verification evidence — runner.rs:137-152 only Provider impl=Runner("claude"); run_loop.rs:284 unconditionally run_claude (no codex/copilot/gemini exec path); cfg only claude_path/HANGAR_CLAUDE_PATH. Scaffolding real+tested: materialise.rs:57-122 ProviderSkillLayout (claude/codex/cursor/copilot/gemini) + materialise_skills_tests.rs. Core registry (4 providers) unused by hangar crates.

Task lifecycle & state machine (queued/running/completed/failed/cancelled, timeouts, retries, orphan reclaim) | Task FSM (F05) + retry chain (F06) + TTL sweepers (F07) + runner timeout (F43) | parity | unchallenged | Strict task FSM, parent/child retry with max-attempts, TTL orphan/stale sweeps, and runner timeout cover the full lifecycle.
Session resumption (reuse session_id + work_dir) | runner pins first session_id (F43); ExecEnv workdir | partial P2 | confirmed | Runner captures and persists the claude session_id and uses an isolated ExecEnv workdir per task, but the session_id is never fed back on a later run, so context reuse across the same (agent,issue) pair is not implemented.
evidence & judgment

Web-only judgment — Session reuse is daemon-internal and TUI-invisible; session_id pinning present, cross-task reuse is the unverified gap, so partial.

Verification evidence — runner.rs:192-205 spawns claude with ZERO args (no .arg/.args); grep for --resume/-r/--session/--continue across daemon+core = 0 hits. session_id only captured outbound (runner.rs:302-319), persisted (complete.rs:63-68, run_loop.rs:329) but never fed back. claim.rs:37-40 exposes prior_session_id but execute_claimed (run_loop.rs:228-284) never reads it. Workdir keyed per-task-id (execenv.rs:144; worktree.rs:46 branch hangar/task/{shortID}); resume = same-task recovery only (worktree.rs:81-86). retry.rs:64,108-116 drops session_id (NULL), only work_dir inherited. No (agent,issue) context reuse exists.
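The missing feedback step can be sketched as a function that turns a prior session id into extra runner arguments. This is a sketch under stated assumptions: `resume_args` is a hypothetical helper, and the `--resume` flag is taken from the grep list in the evidence and treated here as an assumption about the claude CLI rather than a confirmed interface. Nothing is spawned.

```rust
use std::process::Command;

/// Extra CLI arguments for a run, given the session id of a prior run (if any).
/// Hypothetical helper; the `--resume` flag name is an assumption.
fn resume_args(prior_session_id: Option<&str>) -> Vec<String> {
    match prior_session_id {
        Some(id) if !id.is_empty() => vec!["--resume".to_string(), id.to_string()],
        _ => Vec::new(),
    }
}

fn main() {
    let mut cmd = Command::new("claude");
    cmd.args(resume_args(Some("sess-123")));
    // Inspect the assembled command line without running it.
    println!("{:?}", cmd);
}
```

The evidence says the claim path already exposes `prior_session_id`; a step like this between claim and spawn is the part it reports as absent.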

Autopilots · 8 rows
Create / edit / delete autopilots | Autopilot cron CRUD (F19); Autopilots screen (F22) | parity | unchallenged | CLI create/list/enable/disable and Autopilots screen a/e/d cover autopilot CRUD with prompt+assignee.
Schedule autopilot via cron | Autopilot cron + scheduler (F19/F20) | parity | unchallenged | Cron validation, next_tick scheduling, and a firing scheduler match cron scheduling.
Manually trigger autopilot | autopilot run / fire_now (F20/F23); screen r | parity | unchallenged | CLI 'autopilot run' and screen 'r' run-now both fire immediately via fire_now RPC.
Webhook-triggered autopilots | none | gap P2 | confirmed | Only cron + manual triggers; no webhook URL, signing secret, or event filters.
evidence & judgment

Web-only judgment — Webhook ingestion needs an HTTP listener; a local daemon can expose one and it is configurable via CLI, so this is a gap not oos (capability is server-side, not browser-bound).

Verification evidence — autopilot/mod.rs:1 "cron-scheduled"; service.rs:38-99 Autopilot/CreateAutopilot carry only cron_expr/next_tick_at, no webhook/secret/filter fields; proto methods.rs:83-106 only autopilots_list/runs/fire_now/set_enabled (cron + manual); pr_url.rs:3 "no webhook"; daemon lib.rs:74 only UnixListener, no HTTP ingress.

Rotate webhook token / set signing secret | none | gap P3 | unverified | No webhook trigger exists, so no token rotation or HMAC secret management.
evidence & judgment

Web-only judgment — Token/secret management is CLI-expressible; gap, contingent on webhook triggers.

View autopilot runs & deliveries | Autopilots screen run list (F22); autopilot_runs RPC | partial P3 | confirmed | Autopilot run history is listed, but there are no webhook deliveries to inspect/replay (no webhooks).
evidence & judgment

Web-only judgment — Run history is present and TUI-native; the webhook-delivery inspection half is gap, so partial.

Verification evidence — Runs at parity: methods.rs:91 HANGAR_AUTOPILOT_RUNS; autopilots.rs:44,82,350 run-history pane; snapshots.rs:380 autopilot_runs; tested autopilots_reducer_test.rs:37. Deliveries absent: zero webhook_deliveries/replay in code; P7.md:6 "Out of scope: webhook autopilots"; build-plan.md:55 "no webhook ingress at v1".

Autopilot modes & concurrency (create_issue/run_only; skip/queue/replace) | scheduler skip when in-flight (F21) | partial P2 | confirmed | Concurrency skip-when-running is implemented, but explicit create_issue vs run_only modes and queue/replace policies are not exposed.
evidence & judgment

Web-only judgment — Modes/policies are config enums, TUI-expressible; only skip-concurrency present, so partial.

Verification evidence — Skip-only: scheduler.rs:38-42 + architecture.md:143; queue/replace enum collapsed to int (P7.md:17). No create_issue/run_only mode — fire path hardcodes issue_id=NULL (autopilot_run.rs:133); grep finds no ExecutionMode/create_issue/run_only/queue/replace in hangar src. Multica had all three (01-codebase-archaeology.md:269,493).
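The three concurrency policies named in this row can be sketched as an enum plus a decision function. The names `ConcurrencyPolicy`, `FireDecision`, and `decide` are hypothetical, and the meanings given to queue and replace are inferred from their names, not taken from Multica's or Hangar's source.

```rust
/// Hypothetical policy enum covering the three modes named in the row.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConcurrencyPolicy {
    Skip,
    Queue,
    Replace,
}

/// What the scheduler should do when an autopilot tick fires.
#[derive(Debug, PartialEq)]
enum FireDecision {
    Start,
    SkipTick,
    Enqueue,
    CancelThenStart,
}

/// Decide how to handle a tick given whether a previous run is still in flight.
fn decide(policy: ConcurrencyPolicy, run_in_flight: bool) -> FireDecision {
    if !run_in_flight {
        return FireDecision::Start;
    }
    match policy {
        ConcurrencyPolicy::Skip => FireDecision::SkipTick,
        ConcurrencyPolicy::Queue => FireDecision::Enqueue,
        ConcurrencyPolicy::Replace => FireDecision::CancelThenStart,
    }
}

fn main() {
    for policy in [ConcurrencyPolicy::Skip, ConcurrencyPolicy::Queue, ConcurrencyPolicy::Replace] {
        println!("{policy:?} with a run in flight -> {:?}", decide(policy, true));
    }
}
```

Per the evidence, Hangar's scheduler behaves like the `Skip` arm only; the other two arms are the unexposed part.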

Autopilot templates (one-click presets) | none | gap P3 | confirmed | No built-in autopilot preset templates (news digest, PR review, bug triage, etc.).
evidence & judgment

Web-only judgment — Preset autopilots are embedded data + a picker, TUI-expressible; absent, so gap.

Verification evidence — No autopilot preset surface: cli/hangar/mod.rs:142-176 AutopilotCommand=create/list/disable/enable/run (freeform name+cron+agent), autopilots.rs:127-131 Add/Edit open blank create flow, rpc/mod.rs:262-265 no create RPC. template/mod.rs:59-94 embeds 10 AGENT templates (no cron); all templates/*.json lack autopilot/schedule keys.

Squads · 5 rows
Assign issue to squad | none | gap P2 | unverified | No squad model, leader-routing, or squad assignment anywhere in Hangar.
evidence & judgment

Web-only judgment — Squad assignment is data+orchestration, fully TUI/CLI-expressible; entirely absent, so gap.

Create / edit / archive squads | none | gap P2 | confirmed | No squad model, CLI, or screen.
evidence & judgment

Web-only judgment — Squad CRUD is data+TUI-expressible; entirely absent, so gap.

Verification evidence — ainb-hangar-core/src/actor.rs:24-40 ActorKind only Member|Agent (no squad). Zero `squad` hits in any hangar crate/plugin/sql. methods.rs lists workspace/agents/autopilots/issues/skills/tasks only. 16 migration tables, none a squad/group. hangar-tui has no squad screen.

Manage squad members & roles | none | gap P3 | unverified | No squad membership surface at all.
evidence & judgment

Web-only judgment — Squad-member management is TUI-expressible; gap, depends on absent squads.

Squad leader routing | none | gap P3 | confirmed | No leader-agent briefing/routing logic in the daemon.
evidence & judgment

Web-only judgment — Leader routing is server orchestration with a TUI status view; absent, so gap.

Verification evidence — actor.rs:24-29 ActorKind={Member,Agent} only; migrations/0003_issue_comment.sql:25 assignee_type CHECK IN('member','agent'); no squad/leader_id table in any migration; daemon "leader" refs (runner.rs:207,352) are POSIX process-group, not agent routing. Plan admits "need squad DB entity + routing".

Squad member status | presence dots in agent picker (F11) | gap P3 | confirmed | Presence dots exist for agents generally, but there is no squad construct to show per-member squad status.
evidence & judgment

Web-only judgment — Live status rendering exists (presence dots), but squad scoping is absent; gap.

Verification evidence — No squad construct in hangar: migrations/0003_issue_comment.sql:25 enforces assignee_type IN ('member','agent') (no 'squad'); no squad/squad_member table in any migration; rg 'squad' over crates/ exits 1 (zero hits). actor.rs:20-26 only Member/Agent; PresenceState (events.rs:187) is per-agent, not per-squad-member.

Skills · 5 rows
Assign skills to agents | Skill manager attach/detach (F15); agent_skill M:N (F13) | parity | unchallenged | Skill manager i/d attach/detach to selected agent over agent_skill join matches skill assignment.
Create / edit / delete skills | Skill repo CRUD (F13); skills sync (F14); skill manager (F15) | partial P2 | unverified | Skills are imported/synced and attached/detached and viewed, but authoring/editing/deleting individual skills from the TUI is not the primary surface (sync-driven).
evidence & judgment

Web-only judgment — Skill repo CRUD exists at the store layer and is TUI-expressible; in-TUI authoring/edit/delete is limited, so partial.

Import skills | skills sync importer (F14) | partial P2 | confirmed | Imports the local toolkit skills tree idempotently, but importing from arbitrary URLs (clawhub/skills.sh) is not supported.
evidence & judgment

Web-only judgment — Local-tree import is present and TUI/CLI-native; URL import is a gap, so partial.

Verification evidence — Local toolkit-dir import only: skills_sync.rs:148-219 ToolkitDirImporter+skills_sync_from (idempotent upsert_by_name); CLI source is Option<PathBuf> (cli/hangar/mod.rs:256), no URL arg. UrlImporter (skills.sh/GitHub) is explicitly "future" (skills_sync.rs:24-29); no reqwest/http-fetch in import path. Plugin exposes only `s sync` (skill_manager.rs:486). Tested in tripwire_skill_import_and_dispatch.rs.
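Idempotent import by name, as the evidence describes it, can be sketched with a map keyed on the skill name. This is an in-memory illustration only: `Skill`, `Upsert`, and this `upsert_by_name` signature are hypothetical and not the store's API.

```rust
use std::collections::HashMap;

/// Hypothetical skill record.
#[derive(Debug, Clone, PartialEq)]
struct Skill {
    name: String,
    body: String,
}

/// Outcome of a single upsert.
#[derive(Debug, PartialEq)]
enum Upsert {
    Inserted,
    Updated,
    Unchanged,
}

/// Insert or update a skill keyed by name; re-importing identical content is a no-op.
fn upsert_by_name(store: &mut HashMap<String, Skill>, skill: Skill) -> Upsert {
    let outcome = match store.get(&skill.name) {
        Some(existing) if *existing == skill => return Upsert::Unchanged,
        Some(_) => Upsert::Updated,
        None => Upsert::Inserted,
    };
    store.insert(skill.name.clone(), skill);
    outcome
}

fn main() {
    let mut store = HashMap::new();
    let skill = Skill { name: "review".to_string(), body: "v1".to_string() };
    println!("{:?}", upsert_by_name(&mut store, skill.clone()));
    println!("{:?}", upsert_by_name(&mut store, skill));
}
```

Running the same import twice leaves one row per name, which is the property that makes a repeated sync safe.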

Search skills · skill manager filter chips (F15) · parity · adjusted · Skill manager has All/Used/Unused/Mine filter chips but no free-text skill search query.
evidence & judgment

Web-only judgment — Free-text skill search is TUI-expressible; only chip filters present, so partial.

Verification evidence — skill_manager.rs:36-45 All/Used/Unused/Mine chips; reduce_key:260-271 j/k/s/i/d/Enter/r (no '/' text mode); snapshots.rs:165 skills_list(workspace) no query param. Upstream multica skill.go:217 only ListSkills/GetSkill (no search); UX research 05:106 skills UI = chips only. Hangar fully replicates multica's chip-based skill filtering.

Manage skill files (list/upsert/delete) · skill_file model + skill manager file tree (F13/F15) · partial P3 · confirmed · skill_file CRUD exists in the store and a file tree is rendered/viewable, but in-TUI upsert/delete of individual files is not surfaced.
evidence & judgment

Web-only judgment — File-list view present (TUI-native); per-file edit/delete is the gap, so partial.

Verification evidence — skill.rs:151,212,292,470 have full skill_file CRUD; skill_manager.rs:260-271 binds only s/i/d/Enter/r (no file create/edit/delete); editor_pane.rs:1 read-only; SkillManagerIntent (209-225) lacks any file-mutation variant; methods.rs:155-174 exposes no upsert/delete RPC.

Chat · 4 rows
Chat with an agent · none · gap P2 · confirmed · No 1:1 agent chat sessions, messages, or chat screen; transcript is task-scoped only.
evidence & judgment

Web-only judgment — Streaming agent chat is fully TUI-native (a message pane); entirely absent, so gap.

Verification evidence — No chat_session/chat_message tables (migrations 0001-0010, grep NONE). No Screen::Chat (screen/mod.rs:44 enum). Messages task-scoped: HangarEvent::TaskMessage{task_id,...} (events.rs:79-86). "Chat task" is only worktree-skip (worktree.rs:49-73; 0004:6), no enqueue/chat path. Multica uses chat_session+chat_message (research/01:265).

Attach files in chat · none · gap P3 · unverified · No chat and no file-attachment binding to a chat session.
evidence & judgment

Web-only judgment — File attach by path is CLI/TUI-expressible; gap, depends on absent chat.

Chat unread tracking · none · gap P3 · confirmed · No chat sessions, so no unread tracking.
evidence & judgment

Web-only judgment — Unread badges are TUI-expressible; gap, contingent on chat.

Verification evidence — No chat conversation surface in hangar: proto RPC catalogue has zero chat/messages/inbox/unread methods (ainb-hangar-proto/src/events.rs:23, lib.rs); no chat/inbox screen in plugins/hangar-tui/src/screen/app_screens.rs. "chat" is only a no-issue_id task origin (worktree.rs:12, task.rs:16), not sessions+messages. notifyd's unread_by_session (ainb-plugin-notifyd/src/store.rs:248) tracks agent-hook notifications, unrelated to hangar, no chat ref.

Pending chat task indicator · task-started banner (F10) for task flow · gap P3 · unverified · Task banners exist for the issue/task flow but there is no chat surface to show pending chat tasks.
evidence & judgment

Web-only judgment — Pending-task indicators are TUI-native; absent in a chat context that does not exist, so gap.

Notifications · 3 rows
Notification inbox · none · gap P2 · confirmed · No activity inbox aggregating issue/comment/task notifications with unread counts.
evidence & judgment

Web-only judgment — A notification inbox is fully TUI-native (a list screen with unread counts); absent, so gap.

Verification evidence — No Inbox screen in Screen enum (plugins/hangar-tui/src/screen/mod.rs:44-69). HangarEvent (ainb-hangar-proto/src/events.rs:39) is live stream, not persisted/aggregated. "notification" = JSON-RPC only (events.rs:4-35). No unread/read_at/mark_read in core/daemon/proto/store. TUI-viable (Logs/Kanban prove it), so not oos.

Bulk inbox actions · none · partial P3 · adjusted · No inbox, so no mark-all-read / archive-all actions.
evidence & judgment

Web-only judgment — Bulk list actions are TUI-native; gap, depends on absent inbox.

Verification evidence — Rationale "no inbox" is false: host TUI has a notification Inbox with bulk archive. ainb-plugin-notifyd/src/store.rs:431 dismiss_visible() archives every visible row; inbox.rs:244 + events.rs:207 bind it to Shift+C; store.rs:411 mark_read. Hangar's own RPC (proto/methods.rs:155) lacks it, but the Inbox is a host builtin screen (events.rs:894) in the same binary. No mark-all-read sweep -> partial, not gap.

Notification preferences · none · gap P3 · confirmed · No notification preference configuration in Settings or CLI.
evidence & judgment

Web-only judgment — Preference toggles are TUI-native settings; absent, so gap.

Verification evidence — os_notifications hardcoded true (listener.rs:49); only test sets false (listener.rs:252). No config/env/CLI toggle (bin/ainb-notifyd.rs:30 = Run/Stop/Install/Uninstall/Status). Settings categories lack Notifications (state.rs:1189-1197). Daemon Settings = Daemon/Providers only (tripwire_p4_settings_renders.rs:30-37).

Settings, workspaces & onboarding · 21 rows
Onboarding questionnaire / 5-step onboarding flow · danger-full-access first-run warning (F28); firstrun.rs wizard · partial P2 · unverified · Hangar has a first-run flow (provider danger warning) but no source/role/use-case questionnaire or runtime-provisioning onboarding wizard.
evidence & judgment

Web-only judgment — A guided first-run is fully TUI-expressible (Hangar already has firstrun.rs), so the missing questionnaire/runtime-bootstrap steps are a real gap, not oos.

Bootstrap runtime during onboarding · daemon boot + runtime auto-register (F12/F38) · partial P3 · confirmed · Daemon registers runtimes on boot but there is no guided onboarding step to provision/opt-out/join-waitlist a starter runtime.
evidence & judgment

Web-only judgment — Runtime provisioning is CLI/daemon-shaped and partly present; the onboarding-wizard wrapper is the gap, TUI-expressible.

Verification evidence — No runtime onboarding step: setup_menu.rs:24-30 (wizard = deps/git/auth/editor/reset only); onboarding.rs has no runtime field. Multica's CLI-detect Runtime step is TUI-able (05-ux-design-analysis.md:37-48). Rationale's "registers on boot" is wrong — boot() (daemon lib.rs:173-232) never inserts; all agent_runtime inserts are test-only (seed.rs + tests/).

Create workspace · Settings new workspace (F29 n key); workspace repo · parity · unchallenged · Settings screen 'n' creates a workspace via workspace:write capability; reserved-slug blocking not verified but core create exists.
Switch / list workspaces · Settings workspace switch (F29); workspace/list RPC · parity · unchallenged · Settings 's' sets active, 'd' default; workspace/list RPC enumerates; mirrors switch/list.
Edit workspace name & avatar · Settings rename workspace (F29 r key) · partial P3 · unverified · Rename exists; avatar is an image upload with no TUI rendering surface so only name-edit reaches parity.
evidence & judgment

Web-only judgment — Name-rename is TUI-native (present). Avatar image upload/display has no terminal rendering equivalent, so the avatar half is oos and the feature overall is partial.

Manage workspace members & roles · member table with roles (schema only) · gap P2 · confirmed · member table carries owner/admin/member roles but there is no RPC/CLI/screen to list members, change roles, or remove members.
evidence & judgment

Web-only judgment — Member/role management is plain CRUD fully expressible as CLI verbs or a TUI screen; schema exists but no surface, so gap.

Verification evidence — member(role owner/admin/member) is data-only: migrations/0001_init_workspace_user_member.sql:23-28; seeded once as owner seed.rs:75-78, cli/hangar/mod.rs:1511. Test-enforced method catalogue methods.rs:155-174 has zero member-mutation RPCs. members_of folds members read-only into agents_list snapshots.rs:142-217. No member screen (settings = keys+workspace switch, architecture.md:159); no set_role/remove_member fn or test anywhere. TUI equivalent is feasible, not web-only.

Invite members / revoke invitations · none · gap P2 · unverified · No invitation table, RPC, or CLI; single-user local daemon, but multi-member workspaces are modeled so invite is a missing CRUD surface.
evidence & judgment

Web-only judgment — Invitation issuance is data-CRUD expressible in CLI/TUI; absence is a gap, though low urgency for a local single-operator tool.

Accept / decline invitations · none · gap P3 · unverified · No invitation surface at all; depends on invite feature which is absent.
evidence & judgment

Web-only judgment — Accept/decline is a TUI-expressible list action; gap because no invite pipeline exists.

Leave workspace · none · gap P3 · unverified · No leave-membership operation in RPC/CLI/Settings; member rows cannot be self-removed.
evidence & judgment

Web-only judgment — Self-removal from a workspace is a simple CLI/TUI action; gap not oos.

Pin items to sidebar · none · gap P3 · unverified · No pin model or pin/reorder surface for issues/projects/agents.
evidence & judgment

Web-only judgment — Pinning favourites is TUI-expressible (a pinned list); absent, so gap (marginal).

Usage dashboard · Daemon health pane + sparkline (F34); daemon status · partial P2 · confirmed · Daemon-health shows runtimes, concurrent tasks, and a throughput/failure sparkline, but there is no token-usage or per-agent usage rollup dashboard.
evidence & judgment

Web-only judgment — Usage charts are TUI-expressible (sparkline already present); token/per-agent usage rollup is the gap, so partial.

Verification evidence — daemon_health.rs:1-9,37-48 renders only runtimes+claim_cache+concurrent_tasks+throughput sparkline; proto settings.rs:131-141 DaemonHealthSnapshot has no token/cost/per-agent fields ("never persists it"); methods.rs has no usage method. Real token-usage burndown plugin (provides analytics, /usage) is ainb-tui host, zero refs in hangar-tui/daemon/proto/core. Multica usage/daily+by-agent rollup absent.

Connect GitHub (App install/manage) · none · oos · unchallenged · GitHub App installation is an OAuth/web App-install handshake; no terminal install flow.
evidence & judgment

Web-only judgment — App installation requires the GitHub web consent flow; oos. (PR-URL capture below is the TUI-native slice Hangar does provide.)

Lark/Feishu integration · none · oos · unchallenged · Lark bot install/bind and chat-to-issue is a hosted third-party chat webhook integration with no local-TUI equivalent.
evidence & judgment

Web-only judgment — Lark binding is a SaaS OAuth/webhook flow tied to Lark's cloud; no terminal-equivalent capability, so oos.

Personal access tokens (create/list/renew/revoke) · PAT / daemon tokens (F26) · partial P3 · confirmed · Create/list/revoke PATs (hash-only, shown once) plus hidden daemon-token; renew/rotate is not listed.
evidence & judgment

Web-only judgment — PAT management is CLI-native and largely present; the renew verb is the only apparent gap, so partial.

Verification evidence — cli/hangar/mod.rs:342-349 TokenCommand=Create/List/Revoke only (no Renew); :1143 mint_pat, :1170 list_by_user, :1179 revoke; :336 daemon-token hide=true. store/repo/token.rs:94 stores sha256 only. Zero renew/rotate hits anywhere.

Edit user profile & preferences (name/lang/timezone) · none · gap P3 · confirmed · No user profile/description/language/timezone editing in Settings or CLI.
evidence & judgment

Web-only judgment — Profile/preference editing is TUI-native settings; absent, so gap.

Verification evidence — Settings has only Daemon/Providers/Keys/Workspaces sections (plugins/hangar-tui/src/screen/settings.rs:61-70); RPC catalogue lacks any user/profile/prefs method (ainb-hangar-proto/src/methods.rs); user table = id/email/created_at only, no name/lang/tz (ainb-hangar-store/migrations/0001_init_workspace_user_member.sql:17-21). No timezone/language/locale hits anywhere.

Submit feedback · none · gap P3 · unverified · No in-product feedback submission path.
evidence & judgment

Web-only judgment — A feedback submit (text + POST) is CLI/TUI-expressible; absent, so gap (marginal).

Contact sales / enterprise inquiry · none · oos · unchallenged · Landing-site sales inquiry form is a marketing web surface, irrelevant to a local TUI tool.
evidence & judgment

Web-only judgment — Contact-sales is a public marketing-site lead form; no TUI-equivalent purpose, so oos.

Multi-language UI / localized docs & CLI · none · gap P3 · confirmed · No i18n/localization of the TUI or CLI messages (English only).
evidence & judgment

Web-only judgment — Localization (string tables, locale switch) is TUI-expressible; absent, so gap (marginal for a dev tool).

Verification evidence — No i18n crate in any ainb-tui/hangar Cargo.toml. settings.rs:95-100 sections = Daemon/Providers/Keys/Workspaces only, no locale. runner.rs:47-48 LANG/LC_ALL are subprocess env pass-through, not UI localization. No translation/message-catalog anywhere.

Workspace context prompt (per-workspace system prompt) · none · gap P2 · confirmed · No workspace-level context/system prompt injected into every agent.
evidence & judgment

Web-only judgment — A per-workspace prompt is a config field injected at dispatch, fully TUI/CLI-expressible; absent, so gap.

Verification evidence — workspace table has no prompt col (migrations/0001_init_workspace_user_member.sql:11-16); WorkspaceRow lacks one (ainb-hangar-proto/src/settings.rs:60-72); execenv writes no CLAUDE.md (execenv.rs:8-13); run_claude passes zero prompt args (runner.rs:192-205); synonym sweep + daemon tests empty.

Multi-workspace isolation (context prompt, repo whitelist, issue prefix) · workspace-scoped store + Settings switch (F29) · partial P2 · confirmed · All store queries are workspace-scoped and workspaces can be created/switched, but per-workspace context prompt, repo whitelist, and issue prefix are not modeled.
evidence & judgment

Web-only judgment — Workspace isolation is present at the data layer; the per-workspace config extras (prompt/whitelist/prefix) are TUI-expressible gaps, so partial.

Verification evidence — workspace table only id/slug/name/created_at (0001_init_workspace_user_member.sql:10-15); WorkspaceRow wire = id/slug/name/current/default (proto/settings.rs:55-68); scoping+switch present (repo/issue.rs:164, tripwire_workspace_switch_e2e.rs:56-150); zero context_prompt/repo-whitelist/issue_prefix hits anywhere; no workspace.rs repo.

Appearance / theme (light/dark/system) · none (TUI style guide / fixed palette) · parity · refuted · No light/dark/system theme selection in Settings.
evidence & judgment

Web-only judgment — Terminal colour themes are TUI-expressible (palette switch); no toggle present, so gap (marginal).

Verification evidence — Host Settings has live Appearance category with Dark/Light/System Choice: state.rs:1462-1475 (seed), 1741-1755 (load), 1875-1882 (apply); persisted UiPreferences.theme config/mod.rs:505 (default "dark"); rendered config_screen.rs:152; roundtrip tests config/mod.rs:1057,1107. Note: lives in ainb host shell, not hangar-tui plugin's own settings pane.

Auth · 7 rows
Email verification-code login (auth) · PAT / daemon tokens (F26); CLI auth token · oos · unchallenged · Email magic-code is a browser/email round-trip; Hangar is a local single-user daemon authed via tokens/keychain, no email server or session cookies.
evidence & judgment

Web-only judgment — Email-code login presupposes a multi-tenant web server emailing codes. A local TUI has no email infra; the TUI-equivalent capability (proving identity to the daemon) is served by PAT/daemon tokens, so no email-login gap.

Google OAuth login · none · oos · unchallenged · OAuth requires a browser redirect/callback and hosted client. Local daemon has no OAuth flow and no need for federated identity.
evidence & judgment

Web-only judgment — OAuth callback is intrinsically browser-bound; token-based local auth (F26) is the TUI-equivalent, so OAuth itself is oos not gap.

Logout · token revoke (F26 auth token revoke) · parity · adjusted · No session concept to log out of, but tokens can be revoked via CLI, which is the equivalent credential-invalidation capability.
evidence & judgment

Web-only judgment — Logout = invalidate credential. TUI-equivalent (revoke token) exists, so this is partial rather than oos; a one-shot 'auth logout' convenience verb is the only gap.

Verification evidence — PatRepo::revoke hard-DELETEs token (token.rs:255-261), CLI `hangar auth token revoke` (mod.rs:347,1177-1188), tested revoke_pat_removes_row_and_verify_returns_none (repo_token.rs:209) + daemon-token (:328). No session/login table exists (migrations 0001-0010); auth is stateless PAT bearer (architecture.md:115,243), so revoke IS the complete credential-invalidation equivalent.

Auth route protection / redirect guard · capability gating + token auth on RPC · partial P3 · confirmed · No web routes to guard, but daemon RPC enforces workspace/secret capabilities and token auth, the equivalent access-control surface.
evidence & judgment

Web-only judgment — Route-redirect is a web-router concept; the underlying capability (deny unauthenticated access) is met by RPC capability gating, so partial not oos.

Verification evidence — Real enforced access control: capability gate ainb-plugin-runtime/src/secret_store.rs:174-180 (-32001) and workspace-scope IDOR guard docs/hangar/architecture.md:35,133. But token auth is unwired: PatRepo::verify (repo/token.rs:215) called only in tests/repo_token.rs; daemon listener serve() rpc/mod.rs:119-131 + handle() rpc/mod.rs:187-293 authenticate no requests. No web routes to guard.

Mobile email login + verify · none · oos · unchallenged · Mobile app login screens have no TUI form factor.
evidence & judgment

Web-only judgment — Mobile-native screen, no terminal equivalent.

PAT & daemon tokens (browser JWT, mul_ PAT, mdt_ daemon) · PAT + daemon tokens (F26) · oos · adjusted · User PATs and hidden daemon tokens exist (hash-only), but there is no browser JWT tier (no web sessions).
evidence & judgment

Web-only judgment — The CLI-relevant token tiers (PAT + daemon) are present; the browser-JWT tier is oos (no web), so the feature is partial.

Verification evidence — token.rs:46-63 TokenKind has only Pat(ainb_)/Daemon(mdt_); repo/token.rs pat+daemon_token tables sha256-only; cli/hangar/mod.rs:329-407 exposes auth token + daemon-token CLI. Zero jwt/cookie/web-session matches. Missing tier = browser cookie session, pure web form-factor with no TUI equivalent.

OS keychain secret store · keychain secret store (F24) + capability gating (F25) · parity · unchallenged · macOS/Linux keychain-backed secret store with zeroized bytes and secrets:read capability gating; a TUI-native superset of secret handling.

CLI & daemon · 9 rows
CLI issue management · ainb hangar issue / task CLI (F01/F09) · partial P1 · confirmed · CLI covers create/list/show/assign and task list/cancel/retry, but not comment, label, subscribe, or search verbs (those features are absent).
evidence & judgment

Web-only judgment — Core issue CLI is present and native; missing verbs map to missing features (comment/label/search), so partial.

Verification evidence — cli/hangar/mod.rs:413-421 IssueCommand={Create,List,Show} + assign flag :442; no comment/label/subscribe/search verb. But comment IS a tested layer: migration 0003_issue_comment.sql, tripwire_migrations_apply.rs:247, CommentRow/CommentAdded event_roundtrip_test.rs:46 (no CLI verb/RPC). label/search genuinely absent.

CLI agent / squad / skill / project / label / autopilot / runtime management · templates, skills, autopilot CLI (F18/F14/F23) · partial P1 · confirmed · Templates, skills, and autopilot CLIs exist; squad, project, label, and full agent/runtime management CLIs are absent.
evidence & judgment

Web-only judgment — CLI management is the native surface and partly present; the missing entity CLIs track missing features, so partial.

Verification evidence — cli/hangar/mod.rs:67-99 HangarCommand enum has skills/templates/autopilot (dispatch 686/826/897) but no squad/project/label/agent CRUD; agent only via templates_use (869-894); AgentRuntimeRepo is read-helper for task list (1331-1351), not mgmt; squad marked unbuilt in hangar-plan.html:606-609

CLI login (browser/token) · auth token CLI (F26) · partial P2 · confirmed · Token-based auth exists via CLI; a browser-login flow and status/logout convenience verbs are not present.
evidence & judgment

Web-only judgment — Token CLI auth is native and present; browser-login is oos but a 'login --token/status/logout' UX is a TUI-expressible gap, so partial.

Verification evidence — cli/hangar/mod.rs:331-349 AuthCommand=Token{Create,List,Revoke}+DaemonToken; dispatch_auth:1118-1126 routes only those. No login/logout/status verb; rg for Login/Logout/browser-open/OAuth across hangar CLI+daemon+proto+core = none. Multica login=browser-OAuth+status (research 05:226, 06:192).

One-command setup · none (daemon start exists) · gap P2 · confirmed · No single 'setup' command that configures, authenticates, and starts the daemon; daemon must be started separately.
evidence & judgment

Web-only judgment — A setup wrapper command is pure CLI orchestration; absent, so gap.

Verification evidence — cli/hangar/mod.rs:24-29 lists init+daemon-start as NOT-wired; mod.rs:516-519 DaemonCommand has only Status. plugins/hangar-tui/src/plugin.rs:20-21 says spawn_managed_subprocess (auto-start daemon) lands later; on_init:843-851 only dials pre-existing socket. tripwire_p4_common.rs:186-195 spawns daemon separately. No init/setup/onboard verb exists.

Self-update CLI · none in hangar (ainb has self-update elsewhere) · gap P3 · confirmed · No 'hangar update' self-update verb in the hangar CLI tree.
evidence & judgment

Web-only judgment — Binary self-update is CLI-native; not in the hangar surface, so gap.

Verification evidence — hangar/mod.rs:68-99 verb tree (issue/task/beads/daemon/auth/config/skills/templates/autopilot/logs) has no update/upgrade; DaemonCommand only Status (mod.rs:516-519). registry.rs:928 update is `ainb plugin update`. 03-architecture-review.md:50 deliberately defers self-update to OS package manager.

Agent daemon (start/stop/status/restart/logs/disk) · daemon boot (F38) + daemon CLI + logs (F35) · partial P2 · confirmed · Daemon boots with migrations, has status and logs tail; start/stop/restart/disk-usage subcommands are not all confirmed present.
evidence & judgment

Web-only judgment — Daemon lifecycle is CLI-native; status+logs present, full start/stop/restart/disk set is partial.

Verification evidence — hangar/mod.rs:514-519 DaemonCommand has only Status; mod.rs:26 defers daemon start|stop; logs is top-level `hangar logs tail` mod.rs:108-111; P9.md:281 `daemon start` unchecked [ ]; disk/du = 0 code hits (docs-only 05-ux-design-analysis.md:214). status proven live in proofs/cli-hangar-174.11.cast. 2/6 verbs present.

Workspace garbage collection · TTL sweepers for tasks (F07) · partial P3 · confirmed · TTL sweepers fail stale tasks, but there is no disk-reclaiming GC of done/orphaned workspace dirs and build artifacts.
evidence & judgment

Web-only judgment — Disk GC is a daemon loop (TUI/CLI-invisible but configurable); task-sweep present, dir/artifact GC is the gap, so partial.

Verification evidence — execenv.rs:213 cleanup() = Full/ArtifactOnly(output build-artifacts)/OrphanScan(no .gc_meta.json, 72h grace execenv.rs:39); worktree.rs:114 cleanup_worktree rm -rf; tested execenv_layout.rs:134,147,165,188. But NO runtime wiring: run_loop.rs:137,181 spawn_sweepers runs only TTL row-sweepers; no scheduled OrphanScan/Full call, no RPC GC. So claim's status right, rationale ("no disk-reclaiming GC exists") wrong: it exists+tested, just unscheduled.

CLI repo checkout / Repository whitelist · none · gap P2 · confirmed · No repo checkout into managed workdir and no workspace repo-whitelist management.
evidence & judgment

Web-only judgment — Repo checkout + whitelist are CLI-native (Multica's are CLI); absent in Hangar, so gap.

Verification evidence — No repos table (migrations 0001-0010); no repo/* or checkout RPC (proto/src/methods.rs full catalogue); hangar's own inventory 06-ainb-capability-inventory.md:307 "No workspace-level repo policy". worktree.rs prepare_worktree exists but has zero call sites in daemon run_loop/dispatch (unwired, test-only).

Daemon profiles (multiple isolated daemons) · none confirmed · gap P3 · confirmed · No profile mechanism for running multiple isolated daemons (separate config/state/port/workspace root) on one machine.
evidence & judgment

Web-only judgment — Daemon profiles are pure CLI/config, TUI-expressible; not present, so gap.

Verification evidence — No --profile/registry/launcher anywhere. Daemon is single-instance-per-home: rpc/mod.rs:107 "only one daemon owns a given hangar home"; lib.rs:20 O_EXCL pidfile; snapshots.rs:520 "single-daemon". $AINB_HANGAR_HOME (store.rs:104, rpc/mod.rs:90) is a test seam, not a profile. Unix socket, no port; daemon start|stop deferred (mod.rs:26).

Infra & platform surfaces · 5 rows
Live realtime updates (WebSocket feed) · workspace/subscribe event stream (F40); live streaming · parity · unchallenged · Plugin subscribes to a workspace and async events (TaskStarted/Finished, skill/autopilot updates) stream over the socket, an equivalent realtime fabric.
Desktop app with embedded daemon · none · oos · unchallenged · Electron desktop shell with embedded web UI and auto-update has no TUI form factor.
evidence & judgment

Web-only judgment — A native Electron app is intrinsically a GUI shell; Hangar's whole premise is the TUI itself, so oos.

Mobile app · none · oos · unchallenged · Expo iOS app is a mobile-native client with no terminal equivalent.
evidence & judgment

Web-only judgment — Mobile-native client; no TUI form factor, oos.

Self-host via Docker Compose / Kubernetes · local daemon binary (F38) · oos · unchallenged · Hangar is a single local daemon binary by design; there is no multi-service web stack to containerize/orchestrate.
evidence & judgment

Web-only judgment — Compose/Helm exist to deploy a hosted web+Postgres stack; Hangar's local-daemon model needs no such deployment, so oos rather than a missing capability.

Product analytics (PostHog) / Prometheus metrics · Tracing JSONL sink (F31) + OTLP exporter (F32) + service spans (F33) · partial P3 · confirmed · Structured JSONL tracing, optional OTLP span export, and instrumented spans provide operational telemetry, but there is no PostHog product-analytics funnel or Prometheus counter set.
evidence & judgment

Web-only judgment — Operational telemetry is present via tracing/OTLP; product-analytics and Prometheus counters are TUI-adjacent gaps but low value for a local tool, so partial.

Verification evidence — Tracing present: observability.rs:54-123 (OTLP seam), claim.rs:65 #[tracing::instrument name="task.claim"], architecture.md:67-68,253-255 (JSONL+OTLP+8 spans tested by it_otlp_export_when_endpoint_set.rs). ABSENT: zero posthog/prometheus/metrics-exporter deps in any hangar Cargo.toml/Cargo.lock; no .capture()/track_event/counter/gauge code — all grep hits are test/seed noise. OTLP is traces-only, no meter API.

Billing · 1 row
Cloud billing & top-ups · none · oos · unchallenged · Stripe checkout and billing portal are hosted web payment flows with no terminal equivalent.
evidence & judgment

Web-only judgment — Payment checkout/portal is intrinsically a browser PCI flow; no TUI-equivalent capability, so oos.

03 Architecture critique

Hangar is a faithful, often cleaner re-architecture of Multica's task FSM: idempotent single-UPDATE finalize, atomic per-agent-capped claim, parent_task_id retry chains gated to infra-only failures, cadence-preserving scheduler with catch-up-not-replay, and a TTL sweeper with a 90s reclaim band — all verified in code and matching Multica's load-bearing SQL invariants. The IO-free core/proto split and capability-gated host calls are genuinely better-factored than Multica's app+SQL blend. But three design-level gaps are real, not feature gaps: (1) the unix-socket RPC server has NO authentication and NO socket-permission hardening — the token mint/verify in core/token.rs is unwired, so any local process reaching the socket gets full multi-workspace control-plane access; (2) the "dual-channel event stream" is half-built — the plugin fully decodes hangar/event frames the daemon never emits (zero emission sites), so live UI is snapshot-pull-only despite the architecture claim; (3) hangar's idx_one_pending_task_per_issue serializes ONE pending task per issue globally, silently dropping Multica's deliberate per-(issue,agent) model that lets different agents work one issue in parallel. Hangar also has no OS-level agent sandbox (env-allowlist only) vs Multica's Codex Seatbelt config.
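The idempotent single-UPDATE finalize mentioned above can be modelled in memory. This is a minimal sketch, not Hangar's service/finalize.rs: the `Status` and `Finalize` enums are hypothetical, the guarded assignment stands in for an `UPDATE ... WHERE status = :from`, and the meaning given to each 0-row outcome is an assumption inferred from the variant names the review cites.

```rust
// Hypothetical task states; the real FSM table lives in ainb-hangar-core.
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Status { Queued, Running, Completed, Failed, Cancelled }

impl Status {
    fn is_terminal(self) -> bool {
        matches!(self, Status::Completed | Status::Failed | Status::Cancelled)
    }
}

// Outcome names follow the review; their exact semantics are assumed here.
#[derive(PartialEq, Eq, Debug)]
enum Finalize { Applied, AlreadyTerminal, TerminalMismatch, IllegalState }

/// Models `UPDATE task SET status = :to WHERE id = :id AND status = :from`.
/// When the guard matches, one row changes. Otherwise (the 0-row path) the
/// current status is inspected to classify why nothing changed.
fn finalize_idempotent(row: &mut Status, from: Status, to: Status) -> Finalize {
    if *row == from {
        *row = to; // one row affected
        return Finalize::Applied;
    }
    if *row == to {
        Finalize::AlreadyTerminal // a retried call: safe to treat as success
    } else if row.is_terminal() {
        Finalize::TerminalMismatch // finished, but with a different outcome
    } else {
        Finalize::IllegalState // not in the state this transition requires
    }
}

fn main() {
    let mut row = Status::Running;
    println!("{:?}", finalize_idempotent(&mut row, Status::Running, Status::Completed));
    println!("{:?}", finalize_idempotent(&mut row, Status::Running, Status::Completed));
    println!("{:?}", finalize_idempotent(&mut row, Status::Running, Status::Failed));
}
```

Because the second call changes nothing and reports a distinct outcome, a caller can retry a finalize after a lost response without double-applying it.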

Where Hangar is stronger

  • Cleaner layering. IO-free ainb-hangar-core (FSM table, cron, env policy, token) and ainb-hangar-proto (wire types) have zero internal deps, so the FSM and transition legality are unit-tested without a DB or runtime. Multica's equivalents live inside a 2370-line task.go intertwined with service wiring.
  • Idempotent finalize is a single reusable primitive (service/finalize.rs finalize_idempotent) shared verbatim by start/complete/fail/cancel, with explicit TerminalMismatch-vs-AlreadyTerminal-vs-IllegalState classification on the 0-row path. Multica re-expresses the same idea per-mutation in SQL; hangar factors it once and tests the classifier directly.
  • Capability security model is a real architectural seam Multica lacks entirely. Manifest-declared caps are enforced at runtime with -32001/-32003, with grant-form discipline (list-form vs bool) and read-only secret key allow-lists. Multica's daemon simply runs with whatever the user's shell grants.
  • Secret posture at the storage layer is strong and explicit. SecretBytes zeroize-on-drop, sha256 hash-only token storage, subtle::ConstantTimeEq verify, OS-keychain bridge, and keychain keys injected AFTER the env-policy pass so a code-injection var (LD_PRELOAD) in the deny family can never reach a child even if allowlisted.
  • Scheduler is a tighter design than a bare ticker. Recompute-from-fired-tick preserves cadence, a far-behind daemon fires one catch-up tick (not a replay storm), skip-when-at-concurrency-limit still reschedules, and the WakeHandle is a clean test-only injected-clock seam that is provably inert in production.
  • Loose coupling is genuinely cleaner. The TUI is a plugin holding zero domain logic, so autopilots fire and tasks dispatch with no UI attached. Multica's web app is the primary surface and the daemon/server split carries more shared assumptions.
  • Single SQLite file + sqlx with a Postgres-compatible schema is the right call for a local-first single-node control plane: no Postgres+Redis+pgvector operational surface to run, while keeping a clean migration path to a server backend.
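The scheduler bullet's "recompute from the fired tick" and "one catch-up tick, not a replay storm" can be made concrete with a small calculation. This is a sketch under assumptions: `next_fire` is a hypothetical helper, times are plain integer seconds, and Hangar's actual scheduler (cron parsing, concurrency-limit skipping, WakeHandle) is not reproduced.

```rust
/// Hypothetical helper: after a tick fires, return the next fire time.
/// `fired_tick` is the time the tick was scheduled for (not when it ran),
/// so the result always stays on the grid fired_tick + k * interval.
fn next_fire(fired_tick: u64, interval: u64, now: u64) -> u64 {
    assert!(interval > 0, "interval must be positive");
    // Whole intervals that elapsed since the scheduled time; zero when on time.
    let elapsed_slots = now.saturating_sub(fired_tick) / interval;
    // Jump straight to the first grid slot strictly after `now`. Missed slots
    // are skipped, so a late daemon fires once and then resumes its cadence.
    fired_tick + (elapsed_slots + 1) * interval
}

fn main() {
    // On time: the tick scheduled for t=100 ran at t=101.
    println!("on time     -> next at {}", next_fire(100, 60, 101));
    // Far behind: the same tick only ran at t=400, after five missed slots.
    println!("far behind  -> next at {}", next_fire(100, 60, 400));
}
```

Computing from the scheduled time rather than from `now` is what keeps the cadence from drifting by the lateness of each run.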

Where Multica is stronger

  • Concurrency model is richer and intentional. ClaimAgentTask's NOT EXISTS guard serializes per-(issue,agent) AND per-(chat_session,agent) AND per-(quick-create-shape,agent), explicitly allowing different agents to work one issue in parallel. Hangar collapses this to one-pending-task-per-issue-globally, a real semantic regression the parity matrix scores as 'has dedupe index = yes'.
  • Real distributed claim correctness. FOR UPDATE SKIP LOCKED + priority DESC ordering lets multiple daemons/runtimes claim concurrently with priority semantics. Hangar's claim has no priority column and relies on SQLite's single-writer serialization (correct for one daemon, but not a multi-node story).
  • Lost-response recovery is a first-class, claimable transition (ReclaimStaleDispatchedTaskForRuntime is a targeted RETURNING re-delivery per runtime). Hangar's reclaim is a batch sweeper UPDATE — functionally similar but coarser and not runtime-targeted at claim time.
  • Agent sandboxing is real OS-level isolation. The Codex execenv writes a per-task config.toml with an OS/version decision matrix (workspace-write + network_access on Linux/Landlock; danger-full-access fallback on macOS Seatbelt around openai/codex#10390, gated by a version constant that auto-tightens). Hangar has only an env-var allowlist: no Seatbelt/seccomp/Landlock, no filesystem confinement of the agent process.
  • Real-time is actually push. Scope-keyed WS rooms with an optional Redis stream relay for horizontal fanout, plus daemonws wakeup hints layered on HTTP-claim-for-correctness. Daemon WS clients authenticate via Authorization headers before upgrade. Hangar's push channel is defined-but-unemitted, and the socket is unauthenticated.
  • Richer poisoned-state handling on resume. GetLastTaskSession explicitly excludes iteration_limit / api_invalid_request / codex_semantic_inactivity / agent_fallback terminal states so auto-retry never inherits a stuck conversation, and distinguishes manual rerun (force_fresh_session) from auto-retry. Hangar's retry inherits work_dir/session via parent linkage but has no poisoned-terminal exclusion taxonomy.
  • Real auth/tenancy posture. JWT + PATs (mul_ prefix) + daemon tokens + workspace owner/admin/member RBAC + Redis membership cache, enforced in middleware. Hangar mints/hashes tokens but does not enforce them at the RPC boundary.
  • Provider breadth is shipped, not planned. 13 agent CLIs behind one Backend interface today. Hangar ships only claude; codex/copilot are a P5 trait-impl plan.
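The claim-ordering difference noted in the list above (strict FIFO versus priority DESC) can be illustrated with a small in-memory sketch. This is not either project's claim query: the `Queued` struct, its field names, and both function names are hypothetical, and using `id` as the final tiebreak in the priority ordering is an assumption.

```rust
use std::cmp::Reverse;

/// Hypothetical in-memory stand-in for a queued task row.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Queued {
    pub id: u64,
    pub priority: i32,   // higher means more urgent (assumption)
    pub created_at: u64, // any monotonically increasing timestamp
}

/// Mirrors ORDER BY created_at, id: the oldest task always wins.
pub fn next_fifo(queue: &[Queued]) -> Option<&Queued> {
    queue.iter().min_by_key(|t| (t.created_at, t.id))
}

/// Mirrors ORDER BY priority DESC, created_at: urgent work jumps ahead,
/// and FIFO order is kept within one priority level.
pub fn next_by_priority(queue: &[Queued]) -> Option<&Queued> {
    queue
        .iter()
        .min_by_key(|t| (Reverse(t.priority), t.created_at, t.id))
}
```

With three queued tasks where the oldest has priority 0 and two newer ones have priority 5, `next_fifo` picks the oldest task while `next_by_priority` picks the older of the two priority-5 tasks.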

Design-level gaps

These are not missing features — they are holes in the design as built. The three headline items first, with code evidence; the remaining five below.

design gap 01

Unauthenticated unix-socket RPC server

Unix-socket RPC server is unauthenticated and unhardened: bind() removes the stale file and binds with default perms; serve_conn dispatches every request with no token/peer-cred check. The token mint/verify in ainb-hangar-core/src/token.rs (sha256 + ConstantTimeEq) is never invoked on the request path. Any local process that can open the socket gets full control-plane access to every workspace. A feature-parity matrix scores 'tokens: implemented' and misses that they gate nothing.

Code evidence
rg over ainb-hangar-daemon/src: zero auth-token use on the RPC path; bind() at rpc/mod.rs:104 removes the stale file and binds with default socket perms; serve()/handle() (rpc/mod.rs:119–293) dispatch every framed request with no token/peer-cred check. core/token.rs mint/verify (sha256 + ConstantTimeEq) is invoked only in tests. Independently re-verified by hand.
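As a rough illustration of what closing this hole could look like, the sketch below binds a socket with owner-only permissions and gates a connection on a first-frame token. It is a minimal sketch under stated assumptions, not the daemon's actual bind() or serve path: `bind_private`, `constant_time_eq`, and `first_frame_authorized` are hypothetical names, the byte comparison is a hand-rolled stand-in for the existing hash-and-compare verify path, and a peer-credential check is omitted because it needs platform-specific calls outside the standard library.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Bind a unix socket and restrict it to the owning user (mode 0600).
/// Hypothetical helper, not the daemon's real bind().
pub fn bind_private(path: &Path) -> io::Result<UnixListener> {
    // Remove a stale socket file left by a previous run, if any.
    match fs::remove_file(path) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    let listener = UnixListener::bind(path)?;
    // There is a short window between bind and chmod; placing the socket
    // inside a directory that is already mode 0700 would close it.
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
    Ok(listener)
}

/// Compare two byte strings without returning at the first mismatch.
/// Illustrative only; production code would reuse the existing verify path.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// A connection is authorized only if its first frame carried the expected token.
pub fn first_frame_authorized(first_frame_token: Option<&[u8]>, expected: &[u8]) -> bool {
    match first_frame_token {
        Some(token) => constant_time_eq(token, expected),
        None => false,
    }
}
```

The design point is ordering: the permission change happens inside the bind helper, and the token gate runs before any request is dispatched, so a request handler never sees an unauthenticated frame.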
design gap 02

Producer-absent “dual-channel” event push

The 'dual-channel event stream' is producer-absent. ainb-hangar-proto defines HangarEvent + the hangar/event notification method, and plugins/hangar-tui/src/stream.rs fully decodes those frames, but rg for EVENT_METHOD / HangarEvent:: / any notification write in ainb-hangar-daemon/src returns nothing. workspace/subscribe acks with an empty snapshot and comments that push is 'the stream client's concern'. Live updates therefore depend on snapshot re-pulls; the instant-feedback half of the design does not run.

Code evidence
rg over ainb-hangar-daemon/src: zero HangarEvent / EVENT_METHOD emission sites. ainb-hangar-proto defines HangarEvent + the hangar/event notification and plugins/hangar-tui/src/stream.rs fully decodes those frames — but no daemon code ever sends one. Live UI is snapshot-pull-only. Independently re-verified by hand.
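The missing producer side amounts to a per-connection subscriber fanout. The sketch below shows the shape of one using only standard-library channels so that it stays self-contained; a tokio-based daemon would more plausibly use an async channel. `Event` and `Fanout` are hypothetical names and do not correspond to HangarEvent or to any existing daemon type.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};

/// Hypothetical event type standing in for the real wire event.
#[derive(Clone, Debug, PartialEq)]
pub enum Event {
    TaskStarted { task_id: u64 },
    TaskFinished { task_id: u64 },
}

/// Shared registry of per-connection senders.
#[derive(Clone, Default)]
pub struct Fanout {
    subscribers: Arc<Mutex<Vec<Sender<Event>>>>,
}

impl Fanout {
    /// Called once per subscribing connection; the connection task would
    /// drain the receiver and write each event as a notification frame.
    pub fn subscribe(&self) -> Receiver<Event> {
        let (tx, rx) = channel();
        self.subscribers.lock().unwrap().push(tx);
        rx
    }

    /// Called at each state transition. Delivers to every live subscriber,
    /// prunes the ones whose receiver was dropped, and returns the number
    /// of subscribers that remain.
    pub fn emit(&self, event: Event) -> usize {
        let mut subs = self.subscribers.lock().unwrap();
        subs.retain(|tx| tx.send(event.clone()).is_ok());
        subs.len()
    }
}
```

Pruning on send failure keeps the registry from growing as connections come and go, which matters for a long-lived daemon process.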
design gap 03

Per-issue concurrency regression

Per-issue concurrency is a silent semantic regression. idx_one_pending_task_per_issue enforces one pending task per issue across ALL agents; Multica's ClaimAgentTask deliberately serializes per-(issue,agent) to let different agents work one issue in parallel. The parity matrix sees a dedupe index in both and marks parity, hiding that Hangar forbids a concurrency pattern Multica designed for.

Code evidence
idx_one_pending_task_per_issue (partial unique index) enforces one pending task per issue across ALL agents. Multica’s ClaimAgentTask (pkg/db/queries/agent.sql) deliberately serializes per-(issue, agent) via a NOT EXISTS active-set guard, allowing different agents to work one issue in parallel.
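The semantic difference between the two uniqueness scopes can be modelled in memory. The sketch below is not SQL and not either codebase: `Scope` and `PendingSet` are hypothetical names, and a set of keys stands in for the partial unique index or the NOT EXISTS guard.

```rust
use std::collections::HashSet;

/// Which key the pending-task uniqueness rule is scoped to.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Scope {
    /// One pending task per issue, across all agents.
    PerIssue,
    /// One pending task per (issue, agent) pair.
    PerIssueAgent,
}

/// In-memory model of the set of pending tasks under one scope.
pub struct PendingSet {
    scope: Scope,
    keys: HashSet<(u64, Option<u64>)>,
}

impl PendingSet {
    pub fn new(scope: Scope) -> Self {
        PendingSet { scope, keys: HashSet::new() }
    }

    fn key(&self, issue: u64, agent: u64) -> (u64, Option<u64>) {
        match self.scope {
            Scope::PerIssue => (issue, None),
            Scope::PerIssueAgent => (issue, Some(agent)),
        }
    }

    /// Returns true if the task was accepted, false if the rule rejected it.
    pub fn try_enqueue(&mut self, issue: u64, agent: u64) -> bool {
        let key = self.key(issue, agent);
        self.keys.insert(key)
    }

    /// Marks the pending task as finished, freeing its key.
    pub fn finish(&mut self, issue: u64, agent: u64) {
        let key = self.key(issue, agent);
        self.keys.remove(&key);
    }
}
```

Under `Scope::PerIssue` a second agent is rejected while the first agent's task on the same issue is pending; under `Scope::PerIssueAgent` it is accepted, and only a duplicate for the same agent is rejected.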

The remaining design gaps

Recommendations

  1. Wire the event push channel or stop claiming it: the daemon emits zero hangar/event frames while the plugin's stream.rs fully decodes them. Either add emission sites at every finalize/claim/scheduler transition (the tx.send seam already exists in the scheduler's SchedulerEvent sink — extend it to a per-connection subscriber fanout in serve_conn), or rename the architecture doc to 'snapshot-poll with a reserved event channel' so reviewers aren't misled into thinking live push works.
  2. Authenticate the unix socket. Right now serve_conn dispatches every framed request with no token check. At minimum enforce SO_PEERCRED (same-uid only) and chmod the socket to 0600 in bind(); ideally require the minted daemon_token on the first frame and verify it with the existing core::token::verify path. The crypto is built — it is just not on the request path.
  3. Decide the per-issue concurrency model deliberately rather than inheriting it by accident. Multica allows N agents per issue in parallel on purpose; Hangar's global one-pending-per-issue index forbids it. If single-agent-per-issue is the intended v1 product decision, document it as a divergence; if not, change the partial index to (issue_id, agent_id) and port the NOT EXISTS active-set guard into the claim SELECT.
  4. Add an OS-level agent sandbox before shipping codex/copilot. The env-allowlist stops secret leakage into the child but does nothing to confine filesystem or network access of the agent process. On macOS port Multica's Seatbelt/danger-full-access matrix; on Linux reach for Landlock/seccomp. Today a misbehaving agent has the daemon user's full ambient filesystem access inside workdir's parent tree.
  5. Give claim a priority column and ordering parity with Multica (ORDER BY priority DESC, created_at). The current ORDER BY created_at, id is strict FIFO with no way to jump urgent work ahead — a feature Multica relies on and a cheap schema add now versus a migration later.
  6. Plan the multi-node story explicitly or commit to single-node. Hangar's claim correctness rests on SQLite single-writer serialization; that is fine for one daemon but the architecture doc gestures at a future Postgres backend. If multi-daemon is a goal, the claim/sweeper need FOR-UPDATE-SKIP-LOCKED-equivalent semantics that SQLite cannot provide — that is a storage-engine decision, not a migration.
  7. Add a poisoned-terminal taxonomy to retry/resume. parent_task_id chaining preserves session/work_dir, but RetryService has no equivalent of Multica's exclusion of iteration_limit/api_invalid_request states. Without it, an auto-retry that resumes a session can inherit a conversation the model already wedged on.
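The poisoned-terminal taxonomy in the last recommendation could take roughly the following shape. This is a sketch under assumptions, not Multica's query or Hangar's RetryService: the four poisoned variants are named after the states listed in this review, `Completed` and `Timeout` are hypothetical non-poisoned variants, and picking the most recent non-poisoned attempt is one reading of "excludes".

```rust
/// Terminal states of a prior attempt. The last four are the states this
/// review lists as excluded from resume; the first two are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Terminal {
    Completed,
    Timeout,
    IterationLimit,
    ApiInvalidRequest,
    CodexSemanticInactivity,
    AgentFallback,
}

impl Terminal {
    /// A poisoned state means the conversation itself is wedged,
    /// so resuming its session would inherit the problem.
    pub fn is_poisoned(self) -> bool {
        matches!(
            self,
            Terminal::IterationLimit
                | Terminal::ApiInvalidRequest
                | Terminal::CodexSemanticInactivity
                | Terminal::AgentFallback
        )
    }
}

/// One prior attempt in a retry chain, oldest first in the history slice.
pub struct Attempt {
    pub terminal: Terminal,
    pub session_id: String,
}

/// Picks the session an auto-retry may resume. A manual rerun passes
/// force_fresh_session = true and always starts clean.
pub fn session_to_resume(history: &[Attempt], force_fresh_session: bool) -> Option<&str> {
    if force_fresh_session {
        return None;
    }
    history
        .iter()
        .rev()
        .find(|a| !a.terminal.is_poisoned())
        .map(|a| a.session_id.as_str())
}
```

Returning `None` means "start a fresh session", which keeps the caller's decision to a single match on the result.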

04Verification & testing

Coverage is strong but not yet "trust without manual testing." The tripwire harness is genuinely high-grade: real daemon binary + real SQLite WAL + real tmux + real ainb CLI, positive AND negative assertions (no substring-OR), skip-not-fail gates, exact-name kills, byte-level materialise asserts, poll-with-deadline (no bare sleep), run in CI on mac+ubuntu via run_all_tripwires.sh, and a meta-guard against silent deletion. Store-layer concurrency is real (tokio::join! races for both claim and finalize). What's missing is resilience and isolation: no daemon restart/recovery, no plugin/host crash recovery, no migration upgrade-from-populated-DB, no proof of multi-workspace DATA isolation (only the active-marker move is asserted), no daemon-level concurrent dispatch, no long-session soak, and every "happy path" uses fake-claude so no real provider is ever exercised. 44 features each have a path, but the unhappy/recovery quadrant is thin.

Current coverage strengths

Real-everything tripwires

Real daemon binary + real SQLite WAL + real tmux + real ainb CLI; run in CI on mac+ubuntu via run_all_tripwires.sh, with a meta-guard against silently deleted legs.

Disciplined assertions

Positive AND negative assertions (no substring-OR), skip-not-fail gates, exact-name kills, byte-level materialise asserts, poll-with-deadline (no bare sleep).

Real concurrency at the store layer

tokio::join! races for both claim and finalize prove the atomic-claim and idempotent-finalize invariants in-process.

Every feature has a path

All 44 catalogued Hangar features carry at least an acceptance test; most carry an e2e tripwire (see the full feature × coverage list below).

Hangar feature catalogue with test coverage (44 features)
ID | Feature | Description | Area | Acceptance | e2e tripwire
F01 | Create / list / show issue (CLI) | Create, list, and show Hangar issues; --assign enqueues a task for an agent runtime. | Issues | hangar_cli_integration.rs + cli::hangar parse | tripwire_hangar_issue_roundtrip (referenced in doc; issue roundtrip)
F02 | Persist issue + assignee | Store layer persists issue rows with assignee, workspace-scoped, partial-unique one-pending-task-per-issue index. | Issues | repo_issue.rs | none (covered via issue roundtrip)
F03 | Issue list screen (nav / filter / create) | Browse and filter issues (All/Members/Agents/Mine chips), c create, a assign agent, Enter open task detail. | Issues | issue_list_reducer_test.rs (7) | tripwire_p4_issue_list_renders.rs (tmux)
F04 | Kanban board (4 columns, card move) | Four columns queued/running/done/failed; Shift+Left/Right drags focused card, firing a task transition RPC. | Issues | kanban_reducer_test.rs (9) + kanban_rpc_over_socket + snapshot_kanban_layout | tripwire_kanban_columns_render.rs (tmux)
F05 | Task FSM (claim / start / complete / fail / cancel) | Strict task finite-state machine in core, enforced by store finalize services; idempotent terminal transitions. | Tasks | finalize_idempotency (22) + claim_task_integration + task_state_transitions + finalize_concurrent_complete_vs_cancel | tripwire_task_happy_path_claude_provider.rs (tmux)
F06 | Retry chain (parent/child, max-attempts) | Retryable failure spawns child task linked by parent_task_id, capped by max_attempts; agent_error does not retry. | Tasks | retry_chain.rs (8) | none
F07 | TTL sweepers (stale queued/dispatched/running to failed) | Stale queued (2h)/dispatched (5min)/running (2.5h) rows swept to failed in idempotent batches capped at 500. | Tasks | sweeper_ttls.rs (10) | tripwire_ttl_sweeper_fails_stale_dispatched.rs (real daemon socket)
F08 | Task detail + live transcript screen | Live 5-colour transcript stream, PR badge, o open-in-browser, r retry / x cancel (confirm modal). | Tasks | transcript_reducer_test.rs (10) + transcript_render_snapshot (2) | tripwire_p4_task_detail_streams.rs (tmux)
F09 | Task CLI (list / cancel / retry) | List pending tasks, cancel a task, spawn retry child for a retryable failed task. | Tasks | hangar_cli_integration + cli::hangar parse | none
F10 | Task-started banner | Transient banner notifies when a task starts running, folded from TaskStarted event. | Tasks | banner_reducer_test.rs (6) | none
F11 | Agent picker (assign agent) | Modal to pick a human or agent to assign; presence dots, / filter, recents pinned, Enter assign. | Agents | agent_picker_reducer_test.rs (8) | tripwire_p4_agent_picker_opens.rs (tmux)
F12 | agents_list snapshot | Daemon answers hangar/agents_list from store, workspace-scoped, feeding the agent picker. | Agents | repo_agent.rs + rpc_server.rs | none (exercised in picker tripwire)
F13 | Skill repo CRUD (scoping, cascade) | Workspace-scoped skill + skill_file + agent_skill M:N CRUD with cascade delete and cross-tenant guard. | Skills | skill_repo_tests.rs (9) + skill_service inline | none
F14 | Skills sync importer (idempotent) | Imports toolkit/packages/skills tree into a workspace, idempotent on (workspace, name); dry-run supported. | Skills | tripwire_skills_sync_idempotent.rs (5, acceptance-style) + cli parse | screens_render_from_daemon (sync RPC); tripwire_skills_sync_idempotent (daemon-level, not tmux)
F15 | Skill manager screen (attach / detach / sync / view) | Three-pane skill list/file-tree/detail; s sync, i/d attach/detach to selected agent, Enter view SKILL.md body, filter chips. | Skills | skill_manager_reducer_test.rs (9) + skill_screen_snapshot (2) | tripwire_p4_skill_manager_lists.rs (tmux)
F16 | Dispatch-time skill materialisation | On claim, copies agent's attached skills to provider-native path (.claude/skills etc), chmod 0755, outside git root. | Skills | materialise_skills_tests.rs (8) | tripwire_skill_import_and_dispatch.rs (real daemon socket)
F17 | Curated agent templates (embedded, resolve) | 10 curated agent templates baked into the binary; resolve instructions + bundled skill list from no database. | Templates | template_registry_tests.rs (5) | none
F18 | Templates list / show / use CLI | List embedded templates, show one in full, use materialises a template into a live agent + skill attachments. | Templates | template_use_tests.rs (6) + cli parse | none
F19 | Autopilot cron CRUD (reject invalid cron) | Create/list/enable/disable cron autopilots; invalid cron rejected before any row written (6-field, 5-field normalised). | Autopilots | repo_autopilot.rs (14) + cron.rs inline (12) | none
F20 | Autopilot scheduler fires on schedule | Single daemon task sleeps until earliest next_tick_at; at fire time inserts run + enqueues task in one transaction. | Autopilots | scheduler_loop (5) + repo_autopilot_enqueue | tripwire_autopilot_fires_on_schedule.rs (real daemon socket)
F21 | Autopilot scheduler skips when in-flight | At fire time, if concurrent runs >= max_concurrent_runs, tick is skipped and autopilot.tick_skipped emitted. | Autopilots | scheduler_loop::skip_when_prior_run_in_flight | tripwire_autopilot_skips_when_running.rs (real daemon socket)
F22 | Autopilots manager screen | Lists autopilots + recent runs; a/e create/edit, r run-now, d enable/disable, selection loads runs. | Autopilots | autopilots_reducer_test.rs (6) + autopilots_screen_snapshot (4) + autopilot_rpc_over_socket (2) | tripwire_autopilots_screen_renders.rs (tmux; NEW, postdates architecture.md which lists this as a gap)
F23 | Autopilot CLI (create/list/disable/enable/run) | Full CLI surface for cron autopilots including immediate manual tick via run. | Autopilots | hangar_autopilot_cli.rs (2) + parse | none
F24 | OS keychain secret store (get/set/delete) | Secret store backed by macOS Security framework / Linux backend; zeroized SecretBytes; in-memory backend for tests. | Auth / Secrets | backend.rs (7) | tripwire_keychain_roundtrip.rs (#[ignore], dev-mac only)
F25 | secret_store_get capability gating | Plugin host capability secrets:read gates host/secret_store_get; key allow-list; read-only, ungranted returns -32001. | Auth / Secrets | secret_store_cap.rs (5) | none
F26 | PAT / daemon tokens (hash-only) | Create/list/revoke personal access tokens (sha256 stored, plaintext shown once); hidden daemon-token create. | Auth / Secrets | repo_token (11) + token.rs inline (3) + cli | none
F27 | Env allowlist (block LD_PRELOAD family) | Allowlist governs ambient env vars passed to provider subprocess; hardcoded code-injection deny family always overrides. | Auth / Secrets | env_policy (5) + env_allow_config (3) + runner | tripwire_env_allowlist_blocks_ld_preload.rs + tripwire_env_allowlist_passes_home.rs (daemon-level)
F28 | danger-full-access first-run warning | Warns about danger-full-access first time each provider invoked per session; y dismisses + records ack in state.toml. | Auth / Secrets | warnings.rs inline (4) | tripwire_warning_shown_on_first_provider_use.rs (tmux)
F29 | Workspace switching in Settings | Set active / toggle default / create / rename workspaces via Settings; workspace:write capability gates host calls. | Auth / Secrets | settings_reducer_test + workspace_cap.rs (7) | tripwire_workspace_switch_e2e.rs (tmux)
F30 | Settings screen (sections, provider key entry) | Four-section settings: provider keys (keychain write via n key entry) + workspace switching; section/row nav. | Auth / Secrets | settings_reducer_test.rs (9) + settings_workspace_render_snapshot | tripwire_p4_settings_renders.rs (tmux)
F31 | Tracing JSONL sink | Daemon tracing wired to structured JSONL sink with daily rotation (daemon.<date>). | Observability | it_subscriber_writes_jsonl.rs (1) | none
F32 | OTLP exporter (otlp feature) | Optional OTLP span export when endpoint set; zero crates linked in default build. | Observability | it_otlp_export_when_endpoint_set.rs (--features otlp) | tripwire_otel_export_when_endpoint_set.rs (daemon-level, feature-gated)
F33 | Instrumented service spans | Eight task/service methods plus beads sync emit tracing spans for observability. | Observability | service_spans_emit + beads_sync_spans_emit | none
F34 | Daemon health pane + dual-dim sparkline | Runtimes, claim-cache, concurrent tasks, throughput sparkline encoding height=total and red proportion=failure rate. | Observability | snapshot_daemon_health.rs (5) | tripwire_daemon_health_sparkline.rs (tmux)
F35 | Logs tail CLI + logs screen | Tail daemon structured JSONL: CLI tail -f/--lines/--level; TUI screen with level-filter chips, colour-by-level. | Observability | logs.rs inline (8) + cli + snapshot_logs_screen (3) | tripwire_logs_screen_renders.rs (tmux; NEW, postdates architecture.md which lists this as a gap)
F36 | PR-URL capture into task result | FSM finalize parses any gh pr create URL from the transcript into result.pr_url, surfaced in issues_list. | gh integration | pr_url_parse (10) + result inline + issues_list_pr_url (3) | tripwire_pr_capture.rs (real daemon socket)
F37 | PR badge + o open-in-browser | Task detail shows a gold PR badge when result.pr_url is set; o raises OpenPrUrl only when a PR exists. | gh integration | pr_badge_snapshot (5) + pr_open_keybinding (3) | tripwire_pr_badge.rs (tmux)
F38 | Daemon boot + migrations apply | Daemon binary boots, applies migrations 0001-0010 (16 tables) against the SQLite file. | Daemon / Transport | tripwire_migrations_apply.rs (16 tables) | tripwire_daemon_boots.rs (real daemon binary)
F39 | Unix-socket JSON-RPC + snapshots | Unix-socket JSON-RPC 2.0 server answers 17 methods; snapshot RPCs scoped by resolved workspace id. | Daemon / Transport | wire_types (6) + rpc inline + rpc_server.rs | tripwire_hangar_plugin_connects.rs (plugin connects to real socket)
F40 | workspace/subscribe + event stream | Plugin subscribes to a workspace; async events (TaskStarted/Finished, tick_skipped, skill updates) stream back. | Daemon / Transport | event_roundtrip (6) + stream_decode (8) + daemon_dial | tripwire_detects_daemon_drop (referenced; detects socket drop)
F41 | Cross-screen navigation | Tab routing across screens (1 issues, 4 skills, , settings, etc) handled before any screen reducer. | Daemon / Transport | screen_router_test.rs (5) | tripwire_p4_cross_screen_navigation.rs (tmux)
F42 | Beads bidirectional sync | Reconcile Hangar issues with bd tracker: inbound + outbound sync, mapping table, drift repair, dry-run/json. | Daemon / Transport | beads_adapter/reconcile/inbound/outbound/cli (50+) | tripwire_beads_roundtrip.rs (real daemon socket)
F43 | Claude runner exec (env / exit / stream / timeout) | Spawns claude binary in isolated ExecEnv, deny-by-default env, streams JSONL stdout, pins session_id, enforces timeout. | Daemon / Transport | runner_claude.rs (6) + execenv_layout | none (exercised in tripwire_task_happy_path_claude_provider)
F44 | Full-suite e2e meta-guard | Meta-tripwire guards the Hangar tripwire suite against silent deletion / rename (count baseline). | Daemon / Transport | none | tripwire_full_e2e.rs (meta-count baseline)

Blind spots — the unhappy/recovery quadrant

Area | What’s missing | Suggested check
Daemon restart / crash recovery | No tripwire kills the running daemon mid-task and restarts it. A task in 'dispatched'/'running' when the daemon dies should be reclaimed or swept on restart; the WAL DB should survive an unclean kill. Today only TTL sweeps (F07) cover stale rows, and the daemon process is only ever torn down at end-of-test by exact-pid kill, never restarted against the same DB. | Spawn daemon, enqueue+claim a task, kill -9 the daemon pid mid-run, respawn a fresh daemon against the same $HOME/hangar.db, assert (a) DB opens cleanly post-unclean-WAL-kill and (b) the orphaned running/dispatched row reaches a terminal state (reclaimed or swept-to-failed) within budget.
Plugin crash / host reconnect | F40 references tripwire_detects_daemon_drop (socket drop seen by plugin) but there is no test for the inverse: plugin subprocess crashing and the host re-spawning/re-dialing, nor the macOS parent-death watcher added in commit 8494ad0f (orphaned-plugin prevention). A crashed plugin that leaves the host hung or a zombie plugin after host exit is untested end-to-end. | In a real tmux ainb-tui session on the Hangar screen, kill the hangar-tui plugin child pid, assert the host either re-spawns it and re-renders seeded rows OR shows a clean disconnected state (not a frozen pane). Separately: kill the ainb host and assert no orphaned hangar-tui process survives (validates the parent-death watcher).
Multi-workspace DATA isolation | tripwire_workspace_switch_e2e only proves the active ▶ marker moves to the second workspace slug; it seeds the second workspace EMPTY and never asserts that switching changes the visible issues/tasks/agents. Cross-tenant leakage (workspace A's issues showing under B) would pass today. reference_hangar_workspace_slug_vs_id explicitly warns that fixtures where id==slug mask the resolution bug. | Seed workspace A with issue 'Refactor API' and workspace B (acme) with a DISTINCT issue 'Acme Login Bug'. After switching to B, assert the issue list shows 'Acme Login Bug' and does NOT contain 'Refactor API' (and vice-versa). Add a store-layer test asserting agents_list/issues_list for ws A never returns ws B rows.
Schema migration upgrade-from-populated | tripwire_migrations_apply only proves migrations 0001-0010 apply to a FRESH SQLite DB (16 tables). There is no test that applies an older migration set, seeds real rows, then applies a newer migration on top, which is the actual upgrade path every existing user hits. A destructive or non-idempotent later migration against populated data is uncaught. | Build a DB at migration N, INSERT representative rows (workspace/issue/task/agent/skill/autopilot/token), then run apply_migrations to head and assert (a) no error, (b) seeded rows survive/transform correctly, (c) re-running apply_migrations a second time is a no-op (idempotent).
Daemon-level concurrent dispatch | Concurrency is proven only at the store layer (tokio::join! on claim/finalize in-process). No test runs a real daemon with N queued tasks and a high agent concurrency cap and asserts the claim loop dispatches them without double-claiming, exceeding max_concurrent_tasks, or deadlocking the WAL. The daemon-health tripwire drains a queue but doesn't assert the concurrency CAP is respected under contention. | Spawn one claim-enabled daemon, set agent max_concurrent_tasks=3, enqueue 10 tasks with a fake-claude that sleeps; poll and assert running count never exceeds 3 at any sample and all 10 reach terminal. Optionally spawn two daemons bound to the same runtime to prove only one claims each row.
Real provider + failure/timeout/retry chain end-to-end | Every dispatch tripwire uses fake-claude.sh; the real claude binary is never exercised (acceptable for CI determinism, but disclose it). More importantly, the retry chain (F06) and timeout enforcement (F43 timeout) have store-layer tests but NO end-to-end tripwire: a real daemon failing a task, spawning the parent_task_id child, capping at max_attempts, and an agent_error NOT retrying. | Spawn claim-enabled daemon with fake-claude that exits non-zero (retryable) then succeeds; assert a child task with parent_task_id is created, runs, and succeeds; assert agent_error variant does NOT spawn a child; assert a fake-claude that sleeps past HANGAR timeout is killed and failed with reason=timeout.
Create-flow through keystrokes (vs SQL-seeded state) | Almost all TUI tripwires seed state via direct SQL then assert the render. The actual user CREATE paths (issue create via 'c', autopilot create via 'a/e', key entry via 'n', card move via Shift+arrows firing the transition RPC) are covered by reducer unit tests but the keystroke->RPC->DB->re-render round trip is largely unproven in tmux. Multica drives every create through real UI clicks/forms and asserts the resulting toast + persisted row. | In tmux, press 'c' on the issue list, type a title, submit, and assert (a) a success indicator renders and (b) the row appears in the daemon DB. Same for Kanban Shift+Right: assert the focused card's task status actually transitioned in the DB, not just that the card moved on screen.
Long-session soak / leak / stream backpressure | No test runs the daemon+plugin for a sustained period under a steady event stream (TaskStarted/Finished/transcript chunks). Transcript streaming (F08), the event stream (F40), and the throughput ring (F34) could leak memory, drop events under backpressure, or unbounded-grow the transcript buffer over a long session; none is sampled beyond a few seconds. | Drive 500+ transcript/event messages over ~2 min through a live plugin session; assert no dropped sequence ids, bounded RSS growth, and the screen still responds to a nav key at the end.

Lessons from Multica’s test suite

Autonomous verify-hangar harness

A runnable goal-file ships at docs/hangar/verify-hangar-goal.md. Dropped into a fresh agent session at the repo root, it builds and stages the binaries + plugin, seeds two non-empty workspaces, walks all 44 features (F01–F44) through real CLI / tmux / daemon legs, runs the eight resilience legs above (R01–R08), and emits a per-feature PASS/SKIP/FAIL table with exit code = failed legs. Provider execution uses fake-claude (mocked), not the real binary — disclosed per the mocked-vs-live rule.
Harness requirements (what the goal-file encodes)
  • Environment setup: build ainb + ainb-hangar-daemon + ainb-hangar-secrets binaries and stage the hangar-tui plugin (scripts/build-plugins.sh into dist/plugins/hangar-tui/<bin>), signed/executable; verify tmux present (tmux -V). Each feature run uses an isolated $HOME tempdir with seeded onboarding.toml + first_run ack so the wizard/danger modal never intercept keystrokes. Resolve the workspace VERSION from [workspace.package] (not the daemon crate's 0.x) or the onboarding wizard re-fires.
  • Seeding: per-feature deterministic fixtures via ainb_hangar_store directly (seed_p4_fixture) PLUS, for isolation/create-flow features, drive state through the real ainb CLI / keystrokes rather than only SQL so the create path itself is verified. Seed at least two non-empty workspaces with DISTINCT issues for the isolation leg. Use greppable, collision-proof markers (e.g. LOGS_TRIPWIRE_MARKER_42, #<short_id> suffixes) never chrome strings.
  • Walk every one of the 44 features: for CLI features run the real ainb hangar <verb> and assert stdout + resulting DB row; for TUI screens spawn ainb tui in tmux, press the documented open key, poll_capture for a POSITIVE seeded marker AND a NEGATIVE placeholder (not Loading/empty, not the prior screen bleeding through), and add a return-navigation leg so a one-way key swallow can't pass; for control-plane features (dispatch/materialise/sweep/scheduler/retry) run a real daemon and assert on-disk bytes / terminal DB state.
  • Cover the resilience quadrant the current suite omits: daemon kill-9-and-restart recovery (orphaned running/dispatched row reaches terminal), plugin crash + host re-dial / clean disconnect, host-exit-leaves-no-orphan-plugin (parent-death watcher), migration upgrade-from-populated + double-apply idempotency, daemon-level concurrent dispatch respecting max_concurrent_tasks, and end-to-end retry chain + timeout with a real claim loop.
  • Pass/fail criteria: a feature is GREEN only when both a positive assertion (seeded data rendered / persisted) and a negative assertion (no leak, no stuck state, no cross-workspace bleed) hold within a deadline-bounded poll. Byte-level equality for materialised files (trim_end_matches('\n') per the insta trailing-newline trap). Aggregate exit code = number of failed feature legs; print the failing feature id + last 20 lines on failure.
  • Flake handling: no bare sleep before any capture — always poll_capture(deadline, predicate) with ~200ms gaps; re-send single-char nav keys every ~1.5s until the target marker appears (first frames race session discovery and drop keystrokes); --test-threads=1 across the whole suite (cross-process socket binds + ENV_LOCK-guarded env mutation); SKIP-not-fail (with a greppable SKIP: line) when tmux/binaries/plugin are absent so a thin CI image never reds.
  • Teardown: kill the daemon by its exact child pid and the tmux session by its exact unique name (hangar-verify-<pid>-<nanos>) — NEVER tmux kill-server / pkill / wildcard. Drop the per-feature $HOME tempdir. Assert no orphaned ainb-hangar-daemon or hangar-tui process survives the run (doubles as the parent-death-watcher check).
  • Reporting + meta-guard: emit a per-feature PASS/SKIP/FAIL table mapped to the F01-F44 ids plus the new resilience legs, and keep the existing tripwire_full_e2e meta-count baseline so a silently deleted/renamed leg reds the run. Disclose explicitly that provider execution uses fake-claude (mocked), not the real claude binary, per the mocked-vs-live disclosure rule.
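The flake-handling rule above (poll with a deadline, never a bare sleep) reduces to a small helper. The version below is a minimal sketch: `poll_until` is a hypothetical name and signature, not the harness's actual poll_capture, and the probe closure stands in for whatever captures the tmux pane or queries the DB.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Repeatedly runs `probe` until it yields a value or the deadline passes.
/// Returns None on timeout so the caller can fail with its own message.
pub fn poll_until<T>(
    deadline: Duration,
    gap: Duration,
    mut probe: impl FnMut() -> Option<T>,
) -> Option<T> {
    let start = Instant::now();
    loop {
        // Probe first, so a condition that already holds returns immediately.
        if let Some(value) = probe() {
            return Some(value);
        }
        if start.elapsed() >= deadline {
            return None;
        }
        thread::sleep(gap);
    }
}
```

A caller would pass a gap of roughly 200ms, as the harness requirements suggest, and re-send any navigation key from inside the probe when the target marker has not appeared yet.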

05Proposed beads — 35 items

Proposals, not yet created and pending approval. None of these beads exist in the tracker yet. Distribution: 1 P0 · 12 P1 · 20 P2 · 2 P3 · feature-gap 21 · architecture 6 · verification 8 · sizes S 1 / M 25 / L 9.

P0 — ship-blocker (1)

P0 · M · architecture

Authenticate the unix-socket RPC server (token verify + peer-cred)

serve()/handle() (rpc/mod.rs:119-293) dispatch every framed request with zero auth; core::token::verify (sha256+ConstantTimeEq) is invoked only in tests. Enforce SO_PEERCRED same-uid, chmod socket 0600 in bind(), require minted daemon_token on first frame via existing verify path. Any local process currently gets full control-plane access.

source: design_gap: unauthenticated unix socket + confirmed token-unwired (rpc/mod.rs:119-293, core/token.rs)

P1 — high (12)

P1 · L · architecture

Wire daemon event-push emission or rename 'dual-channel' design

Plugin stream.rs fully decodes hangar/event frames but daemon emits zero — no HangarEvent::/EVENT_METHOD write anywhere in daemon/src; workspace/subscribe acks an empty snapshot. Live updates silently depend on snapshot re-polls. Add per-connection subscriber fanout in serve_conn at every finalize/claim/scheduler transition (SchedulerEvent tx.send seam exists), or document as snapshot-poll.

source: design_gap+recommendation: producer-absent event stream (stream.rs vs daemon/src)
P1 · M · architecture

Decide per-issue concurrency model: global vs (issue,agent)

idx_one_pending_task_per_issue (migration 0006) enforces one pending task per issue across ALL agents, which forbids parallel agents on one issue; Multica serializes per-(issue,agent) to allow them. Either document single-agent-per-issue as a deliberate v1 divergence, or change the partial index to (issue_id,agent_id) and port the NOT EXISTS active-set guard into the claim SELECT.

source: design_gap: silent per-issue concurrency regression (claim/dispatch migration 0006)
P1 · M · feature-gap

Add priority column + claim ordering parity with Multica

CLAIM_SQL orders by created_at,id only — strict FIFO with no expedite path. Issue/task model has no priority field (events.rs:203-227, issue.rs:26-66). Add priority to schema + ORDER BY priority DESC, created_at. Cheap schema add now vs a migration on the hottest table later. Unblocks issue-create priority and group-by-priority work.

source: design_gap+confirmed: FIFO claim, no priority on issue model
P1 · M · feature-gap

Wire issue comment write path (RPC + store + compose key)

comment table (migration 0003:36) is never inserted; no repo/comment.rs; no comment method in ALL_METHODS (drift-guarded, verified); CommentRow/CommentAdded built only by test fixtures. task_detail.rs:424 renders them read-only with no compose key. Add a hangar/comment_add RPC, store insert, and a compose keybind so the core comment workflow is reachable end-to-end.

source: confirmed: comment-on-issues gap (migration 0003, methods.rs, task_detail.rs:424)
P1 · M · feature-gap

@-mention an agent in a comment to trigger a task

No mention parser over comment bodies and no comment-triggered task-spawn path; the core agent-collaboration loop is absent. After the comment write path lands, parse @agent mentions in new comments and enqueue a task for the mentioned agent. Depends on comment write-path bead.

source: confirmed: mention-to-trigger gap (no parser, no comment-spawn path)
P1 · L · feature-gap

Issue write RPCs: edit state/assignee/priority/project/dates

No issue-mutation RPC in ALL_METHODS (verified); assignee picker intent is dropped (app_screens.rs:646-655 ignores out.intent), create is ignored (app_screens.rs:557), issue state only changes via beads sync. Add hangar/issue_update RPC + store update, route the picker intent, and persist create. Kanban moves tasks not issues, so issue field-edit has no surface today.

source: confirmed(adjusted): edit/update issue + quick-create intents dropped
P1Lfeature-gap

Extend issue model + create flow: priority, due dates, labels

NewIssue/Issue (issue.rs:26-66), IssueRow (events.rs:203-227), and IssueCreateArgs (cli/hangar/mod.rs:425-443) carry title/desc/state/assignee only. Add priority/due-date/label fields to the store schema, wire type, CLI create, and 'c' inline create so created issues persist these attributes. Pairs with the priority/claim bead.

source: confirmed: create-issue partial — no priority/dates/labels
P1Lfeature-gap

Agent CRUD: edit/archive + config knobs (model/args/MCP/thinking/env)

AgentRepo is insert/get/list only (agent.rs:55-94), no archived col, agents created only via `templates use`. agent model lacks model/thinking/CLI-args/MCP/per-agent-env (migration 0002:31-39). Add update/archive RPCs, an archived flag, and schema+config for model/args/MCP/thinking/per-agent env. Currently no general agent edit surface exists.

source: confirmed: create/edit/archive agents + configure runtime/model/args partial
P1Lfeature-gap

Multi-provider runner: codex/copilot/gemini exec paths

runner.rs:137-152 has one Provider impl (claude); run_loop.rs:284 unconditionally calls run_claude; the config carries only claude_path. ProviderSkillLayout scaffolding exists and is tested (materialise.rs:57-122) for codex/cursor/copilot/gemini. Wire real exec paths for at least one additional provider behind the Provider trait so the multi-backend product promise is met (1 of 12 today).

source: confirmed: 12-provider backends partial (runner.rs:137, run_loop.rs:284)
P1Larchitecture

Add OS-level agent sandbox before shipping non-claude providers

dispatch/runner give a 12-var env allowlist + deny family + process-group kill (secret-leak boundary) but no Seatbelt/seccomp/Landlock and no FS confinement — the agent subprocess has the daemon user's ambient FS/network access. Port a Seatbelt profile on macOS and Landlock/seccomp on Linux before codex/copilot land. Gates the multi-provider bead.

source: design_gap+recommendation: no OS-level agent sandbox
P1Mverification

Tripwire: daemon kill-9 mid-task crash recovery against same DB

No test kills the running daemon mid-task and restarts against the same hangar.db. Spawn daemon, enqueue+claim a task, kill -9 mid-run, respawn against the same $HOME db, assert (a) WAL DB opens cleanly post-unclean-kill and (b) the orphaned running/dispatched row reaches a terminal state (reclaimed or swept-to-failed) within budget. Today only TTL sweeps cover stale rows; restart is never exercised.

source: blind_spot: daemon restart/crash recovery untested
P1Mverification

Tripwire: multi-workspace DATA isolation (not just active marker)

workspace_switch_e2e only proves the ▶ marker moves and seeds workspace B empty. Seed A with 'Refactor API' and B with 'Acme Login Bug'; after switching to B assert the list shows 'Acme Login Bug' and NOT 'Refactor API' (and vice-versa). Add a store-layer test that issues_list/agents_list for ws A never returns ws B rows. Cross-tenant leakage would go undetected by today's tests; id==slug fixtures mask the resolution bug.

source: blind_spot: multi-workspace data isolation untested

P2 — medium (20)

P2Mfeature-gap

Agent posts durable progress/blocker comments to issue

Running agents stream to the live transcript (runner.rs:294) but write no durable issue comments — comment table untouched by daemon/runner. Once the comment write path exists, emit system-authored progress/blocker comments at runner checkpoints so the activity survives beyond the transcript buffer. Depends on the comment write-path bead.

source: confirmed: agent progress comments partial (runner.rs:294, migration 0003 unwired)
P2Mfeature-gap

Issue labels: table, attach/detach RPC, chips

No label table or issue-label join (migration 0003:19-43); IssueRow has no labels; only skill_attach/detach in RPC catalogue; 'label' hits are bd-sync flags. Add a label table + join, hangar/issue_label_attach|detach RPCs, and render label chips on the issue list/detail. Simple CRUD, very TUI-native.

source: confirmed: label issues gap (migration 0003, methods.rs, beads_sync inbound.rs:43)
P2Lfeature-gap

Member & role management: list/set-role/remove RPCs + screen

member table carries owner/admin/member (migration 0001:23-28) but is data-only — seeded once as owner, zero member-mutation RPCs in catalogue, no member screen (settings = keys+workspace switch). Add hangar/members_list, set_role, remove_member RPCs and a Settings members pane. Covers manage-members, invite/accept/leave as the same CRUD surface where feasible.

source: confirmed: manage members/roles gap + invite/leave family (migration 0001, methods.rs)
P2Mfeature-gap

Issue full-text search RPC (title+desc+comment ranked)

issue_list.rs:152-234 has a client-side `/` title-only contains filter; no server search RPC, no FTS/LIKE in migrations (only idx_issue_workspace_state). Add a hangar/issues_search RPC doing ranked title+desc+comment LIKE matching so search works beyond the loaded page and across description/comment bodies. Hangar's search is narrower than Multica's today.

source: confirmed(adjusted): search issues partial (issue_list.rs:152, repo/issue.rs:164)
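A sketch of what "ranked title+desc+comment" matching could mean, modelled in memory instead of SQL. The weights (title 3, description 2, any comment 1) and the `IssueDoc` shape are assumptions for illustration; the matching is a case-insensitive substring test, roughly what a `LIKE '%q%'` would give.

```rust
/// Hypothetical searchable view of an issue.
pub struct IssueDoc<'a> {
    pub id: u64,
    pub title: &'a str,
    pub description: &'a str,
    pub comments: &'a [&'a str],
}

/// Returns matching issue ids, best match first.
///
/// Assumed weights: title hit = 3, description hit = 2, any comment hit = 1.
/// Ties are broken by ascending id so the order is deterministic.
pub fn search_issues(issues: &[IssueDoc<'_>], query: &str) -> Vec<u64> {
    let needle = query.to_lowercase();
    if needle.is_empty() {
        return Vec::new();
    }
    let hit = |text: &str| text.to_lowercase().contains(&needle);

    let mut scored: Vec<(u32, u64)> = issues
        .iter()
        .filter_map(|issue| {
            let mut score = 0;
            if hit(issue.title) {
                score += 3;
            }
            if hit(issue.description) {
                score += 2;
            }
            if issue.comments.iter().any(|c| hit(*c)) {
                score += 1;
            }
            (score > 0).then_some((score, issue.id))
        })
        .collect();

    scored.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(&b.1)));
    scored.into_iter().map(|(_, id)| id).collect()
}
```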
P2Mfeature-gap

Global command palette / cross-entity search (Cmd+K)

router.rs:34-62 global keys are tab/help/quit only; per-screen `/` filters one screen's loaded rows; no search RPC in the closed catalogue. Add a fuzzy palette overlay that searches issues/agents/skills/autopilots cross-entity via a search RPC and jumps to the selected entity. Classic TUI pattern, currently absent.

source: confirmed: global search/command palette gap (router.rs:34-62, methods.rs)
P2Mfeature-gap

Notification inbox screen with unread aggregation

No Inbox in Screen enum (screen/mod.rs:44-69); HangarEvent is a live stream, not persisted/aggregated; no unread/read_at/mark_read in store. Add an aggregated inbox of issue/comment/task events with unread counts and a mark-read sweep (the host notifyd Inbox proves the pattern). TUI-viable like Logs/Kanban.

source: confirmed: notification inbox gap + bulk-actions (screen/mod.rs:44, events.rs:39)
P2Lfeature-gap

Squad entity, membership, and leader routing

Zero squad construct: ActorKind={Member,Agent} (actor.rs:24-29), assignee_type CHECK IN('member','agent'), no squad table in any migration, no squad screen, daemon 'leader' refs are POSIX process-groups. Add a squad DB entity + members + leader-routing in the dispatcher and a squad status view. Plan already admits this is unbuilt. Covers assign-to-squad/CRUD/leader-routing/member-status family.

source: confirmed: squads family gap (actor.rs, migration 0003:25, plan)
P2Lfeature-gap

Webhook-triggered autopilots: HTTP ingress + signing secret

Autopilots are cron+manual only; daemon binds UnixListener only (lib.rs:74), no HTTP ingress; CreateAutopilot has no webhook/secret/filter fields (service.rs:38-99). Add a local HTTP ingress, per-autopilot webhook URL + HMAC signing secret + event filters, and a deliveries pane. Capability is server-side, not browser-bound. Covers token-rotation + delivery-inspection sub-gaps.

source: confirmed: webhook autopilots family gap (autopilot/service.rs, lib.rs:74)
P2Mfeature-gap

Autopilot execution modes + concurrency policies

Only skip-when-running is implemented (scheduler.rs:38-42); queue/replace collapsed to an int; fire path hardcodes issue_id=NULL so create_issue vs run_only mode is absent (autopilot_run.rs:133). Add ExecutionMode {create_issue,run_only} and concurrency policy {skip,queue,replace} as config enums surfaced in the autopilot create/edit flow. Multica shipped all three.

source: confirmed: autopilot modes/concurrency partial (scheduler.rs, autopilot_run.rs:133)
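The two config enums and the fire-time decision might look like the sketch below. The enum names follow the bead text; `FireDecision`, `decide_fire`, and `issue_for_fire` are hypothetical, and the meaning given to queue and replace (enqueue behind the running task, cancel the running task then start) is an assumption about the intended semantics.

```rust
/// Whether a fire creates an issue for the task or runs without one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    CreateIssue,
    RunOnly,
}

/// What to do when an autopilot fires while a previous run is still active.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConcurrencyPolicy {
    Skip,
    Queue,
    Replace,
}

/// Hypothetical outcome of evaluating one fire.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FireDecision {
    Start,
    Skip,
    Enqueue,
    CancelRunningThenStart,
}

/// Decides what a fire does, given the policy and whether a run is in flight.
pub fn decide_fire(policy: ConcurrencyPolicy, run_in_flight: bool) -> FireDecision {
    if !run_in_flight {
        return FireDecision::Start;
    }
    match policy {
        ConcurrencyPolicy::Skip => FireDecision::Skip,
        ConcurrencyPolicy::Queue => FireDecision::Enqueue,
        ConcurrencyPolicy::Replace => FireDecision::CancelRunningThenStart,
    }
}

/// The issue id a fired task should carry: `None` reproduces today's
/// hardcoded issue_id=NULL behaviour for run-only autopilots.
pub fn issue_for_fire(mode: ExecutionMode, new_issue_id: u64) -> Option<u64> {
    match mode {
        ExecutionMode::CreateIssue => Some(new_issue_id),
        ExecutionMode::RunOnly => None,
    }
}
```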
P2Mfeature-gap

Daemon lifecycle CLI: start/stop/restart + runtime auto-register

DaemonCommand has only Status (mod.rs:514-519); start/stop deferred; no one-command setup; boot() never inserts runtimes (all agent_runtime inserts are test-only). Add daemon start/stop/restart verbs, a setup wrapper that configures+auths+starts, and runtime auto-register on boot. Covers one-command-setup + agent-daemon + bootstrap-runtime + onboarding-runtime-step family.

source: confirmed: daemon CLI partial + one-command setup gap + runtime auto-register
P2Mfeature-gap

Per-workspace context prompt + repo whitelist + issue prefix

workspace table is id/slug/name/created_at only (migration 0001:10-15); no context_prompt/repo-whitelist/issue_prefix; execenv writes no CLAUDE.md and run_claude passes zero prompt args (runner.rs:192-205). Add per-workspace config fields injected at dispatch. Repo whitelist also gates the absent repo-checkout flow. TUI/CLI-expressible config.

source: confirmed: workspace context prompt + multi-workspace isolation extras gap
P2Marchitecture

Retry/resume poisoned-terminal taxonomy

RetryService gates on infra-failure reasons but has no exclusion set for conversation-poisoning terminals (iteration_limit, api_invalid_request, semantic_inactivity); an auto-retry that resumes the parent session can inherit a wedged conversation. Add an exclusion taxonomy mirroring Multica's GetLastTaskSession so poisoned sessions start fresh instead of resuming.

source: design_gap+recommendation: retry/resume lacks poisoned-terminal taxonomy
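The exclusion set can be sketched as a small pure function. The three reason strings come from the bead above; `RetrySession` and `session_for_retry` are hypothetical names, and this is not a port of Multica's GetLastTaskSession, only the decision it would feed.

```rust
/// Terminal reasons that leave the conversation itself in a bad state.
/// Resuming such a session would inherit the wedge.
pub const POISONED_TERMINALS: [&str; 3] = [
    "iteration_limit",
    "api_invalid_request",
    "semantic_inactivity",
];

/// Which session an auto-retry should run in.
#[derive(Debug, PartialEq, Eq)]
pub enum RetrySession {
    /// Resume the parent's session id.
    Resume(String),
    /// Start a brand-new session.
    Fresh,
}

/// Picks the session for a retry child. A poisoned parent, or a parent
/// with no recorded session, always yields a fresh session.
pub fn session_for_retry(failure_reason: &str, parent_session: Option<&str>) -> RetrySession {
    let poisoned = POISONED_TERMINALS.iter().any(|r| *r == failure_reason);
    match parent_session {
        Some(id) if !poisoned => RetrySession::Resume(id.to_string()),
        _ => RetrySession::Fresh,
    }
}
```

Whether a given reason is retryable at all remains RetryService's decision; this only chooses resume versus fresh once a retry has been approved.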
P2Mverification

Tripwire: daemon-level concurrent dispatch respects cap, no double-claim

Concurrency is proven only at the store layer (in-process tokio::join!). Spawn one claim-enabled daemon, set agent max_concurrent_tasks=3, enqueue 10 sleep-tasks via fake-claude; poll and assert running count never exceeds 3 at any sample and all 10 reach terminal. Optionally spawn two daemons on the same runtime to prove only one claims each row. The daemon-health tripwire drains but never asserts the cap under contention.

source: blind_spot: daemon-level concurrent dispatch untested
P2Mverification

Tripwire: end-to-end retry chain + timeout enforcement

Retry chain (F06) and timeout (F43) have store-layer tests but no e2e tripwire. With a claim-enabled daemon: fake-claude exits non-zero (retryable) then succeeds — assert a child task with parent_task_id is created and succeeds; assert agent_error does NOT spawn a child; assert a fake-claude sleeping past HANGAR timeout is killed and failed with reason=timeout.

source: blind_spot: real failure/timeout/retry chain e2e untested
P2Mverification

Tripwire: create-flow keystroke->RPC->DB->re-render round trip

Almost all TUI tripwires SQL-seed state then assert render; real create paths are only reducer-tested. In tmux: press 'c' on the issue list, type a title, submit, assert (a) a success indicator renders and (b) the row lands in the daemon DB. Same for Kanban Shift+Right: assert the focused card's task status actually transitioned in the DB, not just on screen.

source: blind_spot: create-flow keystroke round trip unproven in tmux
P2Mverification

Tripwire: migration upgrade-from-populated DB + idempotency

migrations_apply only proves 0001-0010 against a FRESH db. Build a db at migration N, INSERT representative rows (workspace/issue/task/agent/skill/autopilot/token), apply_migrations to head and assert (a) no error, (b) seeded rows survive/transform, (c) re-running apply_migrations is a no-op. This is the real upgrade path every existing user hits; a destructive later migration is uncaught today.

source: blind_spot: schema migration upgrade-from-populated untested
P2Mverification

Tripwire: plugin crash / host reconnect + parent-death watcher

F40 covers daemon-drop seen by plugin but not the inverse. In a real tmux ainb-tui Hangar session, kill the hangar-tui plugin child pid; assert the host re-spawns+re-renders seeded rows OR shows a clean disconnected state (not a frozen pane). Separately kill the ainb host and assert no orphaned hangar-tui process survives (validates the macOS parent-death watcher from commit 8494ad0f).

source: blind_spot: plugin crash/host reconnect + parent-death watcher untested
P2Mfeature-gap

First-run onboarding wizard: source/role/use-case questionnaire

firstrun.rs has a danger-full-access warning and setup_menu.rs:24-30 covers deps/git/auth/editor/reset, but no source/role/use-case questionnaire or runtime-provisioning step. Add the guided questionnaire steps (TUI-expressible, wizard already exists) so first-run matches Multica's 5-step flow. The runtime auto-register half is covered by the daemon-lifecycle bead.

source: confirmed/partial: onboarding questionnaire gap (firstrun.rs, setup_menu.rs:24)
P2Mfeature-gap

PR badge: surface CI check + merge-conflict status, auto-move-to-Done

pr_url.rs:47 scrapes one PR URL; the result carries only pr_url; task_detail renders a badge with open-in-browser. No ci_status/check_run/mergeable/conflict anywhere; Done is stamped on task exit, not on PR merge. Query gh for CI checks + mergeable state, render status on the badge, and optionally auto-transition to Done on merge. Webhook path is out of scope here.

source: confirmed: PR link CI/conflict status partial (pr_url.rs:47, task_detail.rs:541)
P2Mfeature-gap

Usage dashboard: token/cost + per-agent rollup

daemon_health renders runtimes+claim_cache+concurrent_tasks+throughput sparkline only; DaemonHealthSnapshot has no token/cost/per-agent fields and never persists them; no usage method in the catalogue. Persist per-task token/cost, add a usage RPC, and render a daily + per-agent rollup pane (sparkline infra already exists). The host burndown plugin is unrelated to hangar.

source: confirmed: usage dashboard partial (daemon_health.rs, settings.rs:131)
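Once per-task token/cost rows are persisted, the daily and per-agent rollups are the same aggregation over a different key. A sketch under assumed names: `TaskUsage`, `UsageTotal`, `GroupBy`, and the micro-unit cost field are all hypothetical, not the Hangar schema.

```rust
use std::collections::BTreeMap;

/// Hypothetical persisted usage row, one per finished task.
#[derive(Debug, Clone)]
pub struct TaskUsage {
    pub agent: String,
    /// Assumption: day bucket as an ISO date string.
    pub day: String,
    pub tokens: u64,
    /// Assumption: cost stored as integer micro-units to avoid float drift.
    pub cost_micros: u64,
}

/// Aggregated totals for one rollup bucket.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct UsageTotal {
    pub tasks: u64,
    pub tokens: u64,
    pub cost_micros: u64,
}

/// Which column the rollup groups on.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GroupBy {
    Agent,
    Day,
}

/// Sums usage rows into buckets, sorted by bucket key.
pub fn rollup(rows: &[TaskUsage], group_by: GroupBy) -> BTreeMap<String, UsageTotal> {
    let mut out: BTreeMap<String, UsageTotal> = BTreeMap::new();
    for row in rows {
        let key = match group_by {
            GroupBy::Agent => &row.agent,
            GroupBy::Day => &row.day,
        };
        let total = out.entry(key.clone()).or_default();
        total.tasks += 1;
        total.tokens += row.tokens;
        total.cost_micros += row.cost_micros;
    }
    out
}
```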

P3 — low (2)

P3Sarchitecture

Schedule OrphanScan/Full workspace GC loop

execenv.rs:213 cleanup() (Full/ArtifactOnly/OrphanScan, 72h grace) and worktree cleanup are implemented+tested but never scheduled — spawn_sweepers (run_loop.rs:137,181) runs only TTL row-sweepers, no OrphanScan/Full call and no GC RPC. Wire a periodic GC tick so done/orphaned workspace dirs and build artifacts actually get reclaimed. Code exists; just unscheduled.

source: confirmed(rationale-corrected): workspace GC unscheduled (execenv.rs:213, run_loop.rs:137)
P3Mverification

Soak test: long-session leak + stream backpressure

No test runs daemon+plugin under a sustained event stream. Drive 500+ transcript/event messages over ~2 min through a live plugin session; assert no dropped sequence ids, bounded RSS growth, and the screen still responds to a nav key at the end. Transcript buffer, event stream, and throughput ring could leak or drop under backpressure — none is sampled beyond seconds.

source: blind_spot: long-session soak/leak/backpressure untested
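The "no dropped sequence ids" assertion could be backed by a helper like the one below. It is a sketch that assumes ids are meant to be contiguous and ascending from a known first id, and that duplicates or replays are tolerated; `missing_seq_ids` is a hypothetical name.

```rust
/// Returns every sequence id that never arrived, given the ids in the
/// order they were received.
///
/// Assumptions: ids should be contiguous and ascending starting at `first`;
/// an id lower than the next expected one is treated as a duplicate or
/// replay and ignored. Ids dropped after the last received message cannot
/// be detected here, so the test should also compare against the number
/// of messages it sent.
pub fn missing_seq_ids(first: u64, received: &[u64]) -> Vec<u64> {
    let mut expected = first;
    let mut missing = Vec::new();
    for &seq in received {
        if seq < expected {
            continue;
        }
        // Everything between the expected id and the one that arrived was dropped.
        missing.extend(expected..seq);
        expected = seq + 1;
    }
    missing
}
```

In the soak test, an empty result plus a received count equal to the sent count would satisfy the no-drop half of the assertion; RSS growth and key responsiveness need separate probes.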