TDD workflow · gaps then v1.3 · every test validated e2e

ainb skill-manager — TDD status

10/10

All green. 7 coverage gaps closed + 3 v1.3 features built test-first. Every test was validated as genuinely end-to-end by a skeptical reviewer agent. Zero SQLite · clippy clean · zero v1.3 test failures.

Verified by re-running every test in isolation rather than trusting the workflow's claims: catalog 8 · library 5 · provenance 8 · browse 4 deterministic tests, plus 8 live tmux tripwires and the [m] refresh tripwire (3/3; it had flaked under 41-agent load). Branch feat/skill-manager-v1.3 is pushed. The 34 suite-wide failures are all pre-existing (18 docker, 16 session/analytics flakes, untouched since base); none are in skill-manager.

Phase 1 — coverage gaps closed

Each gap is closed by a live tmux tripwire (or a bash test) that asserts the user-visible effect of a feature that previously had no coverage.

Gap	Status	Genuine e2e?	Test
[c] check keybind	PASS	✓ asserts "drift check running" toast	check_live.rs (+ fixed the keys_wired over-claim)
[u] update keybind	PASS	✓ asserts update result notification	update_live.rs
[r] remove keybind	PASS	✓ unit leaves the Units table	remove_live.rs
[m] refresh discovery	PASS	✓ skip-marker planted → banner re-appears	refresh_discovery_live.rs
arrow / j-k / g-G nav	PASS	✓ highlight moves + Detail updates	nav_keys_live.rs
banner [d] details + [s] skip	PASS	✓ [s] writes the .skip-banner marker	discovery_details_skip.rs
sandbox bash safety guards	PASS	✓ refuses / and $HOME; marker-gated teardown	sandbox_script_safety_guards.rs (+ CI wiring)
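
The tripwires above share a drive-then-assert shape: send a key to the running TUI, then poll the pane until the expected text shows up. Below is a minimal sketch of the polling half only. `wait_for_text` and the `capture` closure are hypothetical names, not taken from the ainb test suite; in a real tripwire the closure would presumably shell out to something like `tmux capture-pane -p`.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `capture` until its output contains `needle` or `timeout` elapses.
/// Returns the matching pane text, or the last capture on timeout.
/// Hypothetical helper for illustration; not the repository's implementation.
fn wait_for_text<F>(mut capture: F, needle: &str, timeout: Duration) -> Result<String, String>
where
    F: FnMut() -> String,
{
    let deadline = Instant::now() + timeout;
    loop {
        let pane = capture();
        if pane.contains(needle) {
            return Ok(pane);
        }
        if Instant::now() >= deadline {
            // Hand back the last frame so a failing test can print what it saw.
            return Err(pane);
        }
        sleep(Duration::from_millis(10));
    }
}

#[test]
fn toast_appears_after_keypress() {
    // Stand-in for the tmux pane: the toast shows up on the third capture.
    let mut frames = vec!["Units", "Units", "drift check running"].into_iter();
    let capture = move || frames.next().unwrap_or("").to_string();
    let pane = wait_for_text(capture, "drift check running", Duration::from_secs(1)).unwrap();
    assert!(pane.contains("drift check running"));
}
```

Polling with a deadline, rather than sleeping a fixed interval, is one common way to keep this kind of test from flaking under load.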

Phase 2 — v1.3 features (TDD, no SQLite)

The three "steal from skills-manager" items, each with a CLI/pure test ladder + a live tmux tripwire.

Feature (bead)	Status	Coverage	Test
Provenance matcher `ai-ya9`	PASS	5 attribute variants + synth_uri + scan CLI + live tmux (external clone shows `gh:` not `local:`). Pure, no SQLite.	provenance_tests.rs · provenance_live.rs
Own-skill library `ai-lgk`	PASS	YAML `library.yaml` roundtrip + add/new/list CLI + reject-outside-home + live tmux [l] view. No SQLite.	library_tests.rs · skill_library_tests.rs · library_live.rs
Catalog browse `ai-a20`	PASS	CatalogBackend trait + MockCatalogBackend; rank/empty/api-error/url-builder + CLI browse + live tmux [b]→install. Zero network (mock at trait boundary).	catalog_tests.rs · skill_browse_tests.rs · browse_live.rs
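
To illustrate the `gh:` versus `local:` distinction that the provenance tripwire asserts, here is a rough sketch of how a URI could be synthesized from a clone's remote. Only the two prefixes and the name `synth_uri` come from this report; the signature, the `owner/repo` slug format, and the URL shapes handled are assumptions made for the example.

```rust
/// Hypothetical provenance URI builder.
/// Assumption: a GitHub remote maps to `gh:owner/repo`; anything else falls
/// back to `local:<path>`. The real matcher may use different rules.
fn synth_uri(remote: Option<&str>, path: &str) -> String {
    if let Some(url) = remote {
        // Accept both HTTPS and SSH-style GitHub remotes.
        let rest = url
            .strip_prefix("https://github.com/")
            .or_else(|| url.strip_prefix("git@github.com:"));
        if let Some(rest) = rest {
            let rest = rest.trim_end_matches('/');
            let slug = rest.strip_suffix(".git").unwrap_or(rest);
            return format!("gh:{slug}");
        }
    }
    // No recognizable remote: treat the skill as a local directory.
    format!("local:{path}")
}

#[test]
fn external_clone_is_not_local() {
    let uri = synth_uri(Some("https://github.com/acme/skills.git"), "/tmp/skills");
    assert!(uri.starts_with("gh:"));
}
```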

Validator verdict

A skeptical code-reviewer agent checked every test against four criteria: it drives the real binary/CLI, asserts user-visible output, would fail if the feature were removed, and is specific to the feature.

10/10 genuine. No tautologies, no over-mocking. The catalog mock sits at the HTTP trait boundary — the production code path runs, only the network call is canned. Every CLI subcommand and keybind has at least one assertion on its real effect.
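
The "mock at the trait boundary" claim can be pictured with a small sketch. The names `CatalogBackend` and `MockCatalogBackend` appear in this report; the `fetch` method, the `CatalogEntry` fields, the `browse` function, and ranking by star count are assumptions made for illustration, not the actual API.

```rust
/// One catalog entry; the fields are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct CatalogEntry {
    name: String,
    stars: u32,
}

/// The network seam. Production would implement this with an HTTP client.
/// The method signature is assumed.
trait CatalogBackend {
    fn fetch(&self, query: &str) -> Result<Vec<CatalogEntry>, String>;
}

/// Canned backend: returns a fixed response and never touches the network.
struct MockCatalogBackend {
    response: Result<Vec<CatalogEntry>, String>,
}

impl CatalogBackend for MockCatalogBackend {
    fn fetch(&self, _query: &str) -> Result<Vec<CatalogEntry>, String> {
        self.response.clone()
    }
}

/// Production-side logic: fetch, then rank by stars (descending).
/// This code runs unchanged whether the backend is real or mocked.
fn browse(backend: &dyn CatalogBackend, query: &str) -> Result<Vec<CatalogEntry>, String> {
    let mut entries = backend.fetch(query)?;
    entries.sort_by(|a, b| b.stars.cmp(&a.stars));
    Ok(entries)
}

#[test]
fn ranks_results_and_surfaces_api_errors() {
    let entry = |name: &str, stars: u32| CatalogEntry { name: name.to_string(), stars };
    let ok = MockCatalogBackend { response: Ok(vec![entry("a", 1), entry("b", 9)]) };
    assert_eq!(browse(&ok, "skill").unwrap()[0].name, "b");

    let failing = MockCatalogBackend { response: Err("api error".to_string()) };
    assert!(browse(&failing, "skill").is_err());
}
```

Because only `fetch` is canned, removing the ranking logic from `browse` would make the first assertion fail, which is the property the reviewer was checking for.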