TDD workflow · gaps then v1.3 · every test validated e2e

ainb skill-manager — TDD status

10/10
All green: 7 coverage gaps closed and 3 v1.3 features built test-first. A skeptical reviewer agent validated every test as genuinely end-to-end. Zero SQLite · clippy clean · zero v1.3 test failures.
Verified by re-running every test in isolation rather than trusting the workflow's claims: catalog 8 · library 5 · provenance 8 · browse 4 deterministic tests, plus 8 live tmux tripwires and the [m] refresh tripwire (3/3; it had flaked under 41-agent load). Branch feat/skill-manager-v1.3 is pushed. The 34 suite-wide failures are all pre-existing (18 docker, 16 session/analytics flakes untouched since base); none are in skill-manager.

Phase 1 — coverage gaps closed

Each gap is closed by a live tmux tripwire (or a bash test) that asserts the user-visible effect of a feature that previously had no coverage.
| Gap | Status | Genuine e2e? | Test |
| --- | --- | --- | --- |
| [c] check keybind | PASS | ✓ asserts "drift check running" toast | check_live.rs (+ fixed the keys_wired over-claim) |
| [u] update keybind | PASS | ✓ asserts update result notification | update_live.rs |
| [r] remove keybind | PASS | ✓ unit leaves the Units table | remove_live.rs |
| [m] refresh discovery | PASS | ✓ skip-marker planted → banner re-appears | refresh_discovery_live.rs |
| arrow / j-k / g-G nav | PASS | ✓ highlight moves + Detail updates | nav_keys_live.rs |
| banner [d] details + [s] skip | PASS | ✓ [s] writes the .skip-banner marker | discovery_details_skip.rs |
| sandbox bash safety guards | PASS | ✓ refuses / and $HOME; marker-gated teardown | sandbox_script_safety_guards.rs (+ CI wiring) |
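
The live tmux tripwires listed above follow a common shape: start the real binary in a detached tmux session, send a key, capture the pane, and assert on what the user would see. The following is a minimal sketch of that shape, not the project's actual harness. The helper names (`send_keys_args`, `capture_pane_args`, `pane_shows`, `press_and_capture`) and the fixed sleeps are assumptions made for illustration; only the tmux subcommands themselves (`new-session`, `send-keys`, `capture-pane`, `kill-session`) are real.

```rust
use std::process::Command;
use std::thread::sleep;
use std::time::Duration;

/// Argv for `tmux send-keys`, targeting a named session (hypothetical helper).
fn send_keys_args(session: &str, keys: &str) -> Vec<String> {
    vec!["send-keys".into(), "-t".into(), session.into(), keys.into()]
}

/// Argv for `tmux capture-pane -p`, which prints the pane contents to stdout.
fn capture_pane_args(session: &str) -> Vec<String> {
    vec!["capture-pane".into(), "-p".into(), "-t".into(), session.into()]
}

/// The tripwire assertion itself: the expected text must be visible on screen.
fn pane_shows(pane: &str, expected: &str) -> bool {
    pane.lines().any(|line| line.contains(expected))
}

/// Drive a real binary inside tmux and return the captured screen text.
/// Requires tmux on PATH. `binary` and `session` are supplied by the caller;
/// a sturdier harness would poll with a timeout instead of sleeping.
fn press_and_capture(binary: &str, session: &str, keys: &str) -> std::io::Result<String> {
    Command::new("tmux")
        .args(["new-session", "-d", "-s", session, binary])
        .status()?;
    sleep(Duration::from_millis(500)); // give the TUI time to draw
    Command::new("tmux").args(send_keys_args(session, keys)).status()?;
    sleep(Duration::from_millis(500)); // give the keypress time to take effect
    let out = Command::new("tmux").args(capture_pane_args(session)).output()?;
    // Best-effort teardown; ignore errors if the session already exited.
    let _ = Command::new("tmux").args(["kill-session", "-t", session]).status();
    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
}
```

A test for the [c] keybind would then call `press_and_capture` with the binary path and assert `pane_shows(&pane, "drift check running")`. Keeping the assertion on captured screen text is what makes the test fail if the feature is removed.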

Phase 2 — v1.3 features (TDD, no SQLite)

The three "steal from skills-manager" items, each with a CLI/pure test ladder + a live tmux tripwire.
| Feature (bead) | Status | Coverage | Test |
| --- | --- | --- | --- |
| Provenance matcher (ai-ya9) | PASS | 5 attribute variants + synth_uri + scan CLI + live tmux (external clone shows gh: not local:). Pure, no SQLite. | provenance_tests.rs · provenance_live.rs |
| Own-skill library (ai-lgk) | PASS | YAML library.yaml roundtrip + add/new/list CLI + reject-outside-home + live tmux [l] view. No SQLite. | library_tests.rs · skill_library_tests.rs · library_live.rs |
| Catalog browse (ai-a20) | PASS | CatalogBackend trait + MockCatalogBackend; rank/empty/api-error/url-builder + CLI browse + live tmux [b]→install. Zero network (mock at trait boundary). | catalog_tests.rs · skill_browse_tests.rs · browse_live.rs |
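
To illustrate what the provenance row means by an external clone showing `gh:` rather than `local:`, here is a rough sketch of a pure matcher. It is not the project's code: the `Provenance` enum, `parse_github_remote`, `classify`, and the exact `gh:owner/repo` and `local:<path>` URI shapes are all assumptions. Only the `synth_uri` name and the `gh:` / `local:` prefixes come from the report.

```rust
/// Hypothetical provenance variants; the report mentions five attribute
/// variants but does not list them, so only two are sketched here.
#[derive(Debug, PartialEq)]
enum Provenance {
    GitHub { owner: String, repo: String },
    Local { path: String },
}

/// Extract (owner, repo) from common GitHub remote forms (HTTPS and SSH).
/// Returns None for anything that does not look like a GitHub remote.
fn parse_github_remote(remote: &str) -> Option<(String, String)> {
    let rest = remote
        .strip_prefix("https://github.com/")
        .or_else(|| remote.strip_prefix("git@github.com:"))?;
    let rest = rest.trim_end_matches('/');
    let rest = rest.strip_suffix(".git").unwrap_or(rest);
    let (owner, repo) = rest.split_once('/')?;
    if owner.is_empty() || repo.is_empty() || repo.contains('/') {
        return None;
    }
    Some((owner.to_string(), repo.to_string()))
}

/// Pure classification: no filesystem, no database, just the inputs given.
fn classify(path: &str, remote: Option<&str>) -> Provenance {
    match remote.and_then(parse_github_remote) {
        Some((owner, repo)) => Provenance::GitHub { owner, repo },
        None => Provenance::Local { path: path.to_string() },
    }
}

/// Render a provenance value as a short URI (assumed format).
fn synth_uri(p: &Provenance) -> String {
    match p {
        Provenance::GitHub { owner, repo } => format!("gh:{}/{}", owner, repo),
        Provenance::Local { path } => format!("local:{}", path),
    }
}
```

Because every function takes plain strings and returns plain values, this kind of matcher can be tested without a database or a network, which matches the "Pure, no SQLite" note in the table.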

Validator verdict

A skeptical code-reviewer agent checked every test against four criteria: it drives the real binary/CLI, asserts user-visible output, would fail if the feature were removed, and is specific to the feature.

10/10 genuine. No tautologies, no over-mocking. The catalog mock sits at the HTTP trait boundary — the production code path runs, only the network call is canned. Every CLI subcommand and keybind has at least one assertion on its real effect.
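
The point about mocking at the trait boundary can be sketched as follows. The `CatalogBackend` and `MockCatalogBackend` names come from the report; everything else (the `search` method, the `CatalogEntry` fields, the `CatalogError` type, the `browse` function, and ranking by install count) is an assumption chosen to keep the example small.

```rust
/// One catalog hit; the fields are illustrative, not the project's schema.
#[derive(Debug, Clone, PartialEq)]
struct CatalogEntry {
    name: String,
    installs: u32,
}

/// Hypothetical error type for a failed catalog call.
#[derive(Debug, Clone, PartialEq)]
enum CatalogError {
    Api(String),
}

/// The seam: a production implementation would perform the HTTP request here.
trait CatalogBackend {
    fn search(&self, query: &str) -> Result<Vec<CatalogEntry>, CatalogError>;
}

/// Test double that returns a canned response and never touches the network.
struct MockCatalogBackend {
    response: Result<Vec<CatalogEntry>, CatalogError>,
}

impl CatalogBackend for MockCatalogBackend {
    fn search(&self, _query: &str) -> Result<Vec<CatalogEntry>, CatalogError> {
        self.response.clone()
    }
}

/// Stand-in for the production code path: it runs unchanged against either
/// backend. Ranking by installs (descending, then name) is an assumption.
fn browse(backend: &dyn CatalogBackend, query: &str) -> Result<Vec<String>, CatalogError> {
    let mut hits = backend.search(query)?;
    hits.sort_by(|a, b| b.installs.cmp(&a.installs).then_with(|| a.name.cmp(&b.name)));
    Ok(hits.into_iter().map(|e| e.name).collect())
}
```

With this layout, the rank, empty, and api-error cases each become a `MockCatalogBackend` with a different canned `response`, while `browse` itself is exercised exactly as it would be in production.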