TDD workflow · gaps then v1.3 · every test validated e2e
ainb skill-manager — TDD status
10/10
All green: 7 coverage gaps closed and 3 v1.3 features built test-first. A skeptical reviewer agent validated every test as genuinely end-to-end. Zero SQLite · clippy clean · zero v1.3 test failures.
Verified by re-running every test in isolation rather than trusting the workflow's claims: catalog 8 · library 5 · provenance 8 · browse 4 deterministic tests, plus 8 live tmux tripwires and the [m] refresh tripwire (3/3; it had flaked under 41-agent load). Branch feat/skill-manager-v1.3 is pushed. The 34 suite-wide failures are all pre-existing (18 docker, 16 session/analytics flakes untouched since base); none are in skill-manager.
Phase 1 — coverage gaps closed
Each gap is covered by a live tmux tripwire (or a bash test) that asserts the user-visible effect of a feature that previously had no coverage.
| Gap | Status | Genuine e2e? | Test |
| --- | --- | --- | --- |
| [c] check keybind | PASS | ✓ asserts "drift check running" toast | check_live.rs (+ fixed the keys_wired over-claim) |
| [u] update keybind | PASS | ✓ asserts update result notification | update_live.rs |
| [r] remove keybind | PASS | ✓ unit leaves the Units table | remove_live.rs |
| [m] refresh discovery | PASS | ✓ skip-marker planted → banner re-appears | refresh_discovery_live.rs |
| arrow / j-k / g-G nav | PASS | ✓ highlight moves + Detail updates | nav_keys_live.rs |
| banner [d] details + [s] skip | PASS | ✓ [s] writes the .skip-banner marker | discovery_details_skip.rs |
| sandbox bash safety guards | PASS | ✓ refuses / and $HOME; marker-gated teardown | sandbox_script_safety_guards.rs (+ CI wiring) |
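As an illustration of what a live tmux tripwire might look like, here is a minimal sketch of the polling assertion at its core. It is not taken from the repository: `wait_for_text`, `capture_pane`, the 50 ms poll interval, and the closure-based design are all assumptions made for this example. The only external command used is `tmux capture-pane -p -t <target>`, which prints the pane contents to stdout.

```rust
use std::process::Command;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: returns the visible text of a tmux pane.
/// Shells out to `tmux capture-pane -p -t <target>`.
#[allow(dead_code)]
fn capture_pane(target: &str) -> std::io::Result<String> {
    let output = Command::new("tmux")
        .args(["capture-pane", "-p", "-t", target])
        .output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Hypothetical helper: polls `capture` until the screen contains `needle`
/// or `timeout` elapses. On timeout the last screen is returned in the error
/// so a failing tripwire shows what the user would have seen.
fn wait_for_text<F>(mut capture: F, needle: &str, timeout: Duration) -> Result<String, String>
where
    F: FnMut() -> String,
{
    let deadline = Instant::now() + timeout;
    loop {
        let screen = capture();
        if screen.contains(needle) {
            return Ok(screen);
        }
        if Instant::now() >= deadline {
            return Err(format!(
                "timed out waiting for {needle:?}; last screen:\n{screen}"
            ));
        }
        // Assumed poll interval; short enough to keep tests fast.
        sleep(Duration::from_millis(50));
    }
}
```

A tripwire for the [c] keybind could then send the key to the pane and call something like `wait_for_text(|| capture_pane("ainb-test").unwrap_or_default(), "drift check running", Duration::from_secs(5))`, where the session name `ainb-test` is made up for this sketch. Passing the capture step as a closure keeps the polling logic testable without a running tmux server.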
Phase 2 — v1.3 features (TDD, no SQLite)
The three "steal from skills-manager" items, each with a CLI/pure test ladder + a live tmux tripwire.
| Feature (bead) | Status | Coverage | Test |
| --- | --- | --- | --- |
| Provenance matcher (ai-ya9) | PASS | 5 attribute variants + synth_uri + scan CLI + live tmux (external clone shows gh: not local:). Pure, no SQLite. | provenance_tests.rs · provenance_live.rs |
| Own-skill library (ai-lgk) | PASS | YAML library.yaml roundtrip + add/new/list CLI + reject-outside-home + live tmux [l] view. No SQLite. | library_tests.rs · skill_library_tests.rs · library_live.rs |
| Catalog browse (ai-a20) | PASS | CatalogBackend trait + MockCatalogBackend; rank/empty/api-error/url-builder + CLI browse + live tmux [b]→install. Zero network (mock at trait boundary). | catalog_tests.rs · skill_browse_tests.rs · browse_live.rs |
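To make "mock at trait boundary" concrete, here is a minimal sketch of the pattern. Only the names `CatalogBackend` and `MockCatalogBackend` come from this report; the `search` method, the `CatalogEntry` fields, the `CatalogError` enum, the `browse` function, and the star-based ranking rule are assumptions made for illustration and may not match the real code.

```rust
/// One catalog entry. Field names are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
struct CatalogEntry {
    name: String,
    stars: u32,
}

/// Hypothetical error type for a failed catalog call.
#[derive(Debug, Clone, PartialEq)]
enum CatalogError {
    Api(String),
}

/// The seam: calling code talks to this trait, never to HTTP directly.
/// The `search` signature is an assumption.
trait CatalogBackend {
    fn search(&self, query: &str) -> Result<Vec<CatalogEntry>, CatalogError>;
}

/// Canned backend for tests; no network is involved.
struct MockCatalogBackend {
    response: Result<Vec<CatalogEntry>, CatalogError>,
}

impl CatalogBackend for MockCatalogBackend {
    fn search(&self, _query: &str) -> Result<Vec<CatalogEntry>, CatalogError> {
        self.response.clone()
    }
}

/// Logic under test: query the backend, then rank the results.
/// Ranking by stars (descending) with name as tie-breaker is an assumed rule.
fn browse(
    backend: &dyn CatalogBackend,
    query: &str,
) -> Result<Vec<CatalogEntry>, CatalogError> {
    let mut entries = backend.search(query)?;
    entries.sort_by(|a, b| b.stars.cmp(&a.stars).then_with(|| a.name.cmp(&b.name)));
    Ok(entries)
}
```

With this shape, the rank, empty, and api-error cases each need only a different canned `response`, while `browse` itself runs unchanged. A production backend would implement the same trait over real HTTP.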
Validator verdict
A skeptical code-reviewer agent checked every test against four criteria: it drives the real binary/CLI, asserts user-visible output, would fail if the feature were removed, and is specific to the feature.
10/10 genuine. No tautologies, no over-mocking. The catalog mock sits at the HTTP trait boundary: the production code path runs and only the network call is canned. Every CLI subcommand and keybind has at least one assertion on its real effect.