How reflect captures + recalls across Claude Code and Codex CLI
Both harnesses run the same hook scripts (wired through different config files:
~/.claude/settings.json vs ~/.codex/hooks.json) and share one on-disk knowledge base
(~/.reflect/ queue + ~/.learnings/ documents + GraphRAG index). SessionStart
fires recall (inject the top-3 prior learnings) and the bg-drainer (process any queued
transcripts). PreCompact fires precompact_reflect (enqueue the current transcript before
the harness throws it away). Because the queue is harness-agnostic, a Codex session can enqueue a reflection
that a later Claude session drains, and vice versa.
Architecture at a glance
Two harnesses, three hook scripts, one shared knowledge base. Numbered circles on the diagram match the steps in Recall loop and Capture loop below.
The recall loop · SessionStart → context
Fires on every session start in both harnesses. Always exits 0 — never blocks startup even when the knowledge base is empty or the reflect CLI is missing.
1 · Hook fires SessionStart
Both Claude and Codex serialize the same JSON envelope onto the hook's stdin —
{session_id, transcript_path, cwd, hook_event_name, source, ...}. The recall script
is the same in both cases; the only thing that differs is which config file pointed the harness at it.
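# the hook script reads the JSON envelope from stdin
import json
import sys

payload = json.load(sys.stdin)   # named payload to avoid shadowing the input() builtin
cwd = payload.get('cwd')         # project being worked on
source = payload.get('source')   # startup | resume | clear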
2 · Query context from cwd · branch · git log session_start_recall.py:174
The script doesn't ask the model "what do you need?" — it builds a context query itself from cheap signals: current working directory, current git branch, recent commit messages on the branch. That string becomes the GraphRAG query.
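A minimal sketch of how such a query string might be assembled from those signals. The function names (`git_output`, `compose_query`, `build_context_query`), the five-commit window, the 2-second timeout, and the string layout are all assumptions for illustration; they are not taken from session_start_recall.py.

```python
import subprocess
from pathlib import Path


def git_output(cwd: str, *args: str) -> str:
    """Run a git command in cwd; return stripped stdout, or '' on any failure."""
    try:
        result = subprocess.run(
            ["git", *args],
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=2,      # hypothetical budget so a slow repo cannot stall startup
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return ""           # git missing, cwd missing, or timeout
    return result.stdout.strip() if result.returncode == 0 else ""


def compose_query(cwd: str, branch: str, commits: list[str]) -> str:
    """Join the cheap signals into one query string, skipping empty parts."""
    parts = [f"project: {Path(cwd).name}"]
    if branch:
        parts.append(f"branch: {branch}")
    if commits:
        parts.append("recent commits: " + "; ".join(commits))
    return " | ".join(parts)


def build_context_query(cwd: str, max_commits: int = 5) -> str:
    """Collect branch name and recent commit subjects, then compose the query."""
    branch = git_output(cwd, "rev-parse", "--abbrev-ref", "HEAD")
    log = git_output(cwd, "log", f"-{max_commits}", "--pretty=format:%s")
    commits = [line for line in log.splitlines() if line]
    return compose_query(cwd, branch, commits)
```

Outside a git repository both git calls return an empty string, so the query degrades to the project directory name alone instead of raising.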
3 · Hybrid vector + graph search; rerank to top-3 GraphRAG (nano-graphrag) + hnswlib
Two signals are blended: dense vector similarity over learning content, and graph proximity via
entity sidecars (the .entities.yaml files written alongside each learning). Reranking
weights recency, confidence tag, and tag overlap with the query. Output is capped at the top three.
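The blend-then-rerank idea can be sketched as follows. Every number here (the 0.6/0.4 blend, the recency decay, the confidence bonuses, the per-tag bonus) and the `Candidate` fields are hypothetical; the text only states which signals are used, not how they are weighted.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    doc_id: str
    vector_sim: float           # dense similarity, assumed normalised to 0..1
    graph_score: float          # graph proximity, assumed normalised to 0..1
    age_days: float             # days since the learning was written
    confidence: str = "medium"  # hypothetical tag values: low | medium | high
    tags: set[str] = field(default_factory=set)


# Illustrative bonuses only; the real weighting is not documented here.
CONFIDENCE_BONUS = {"low": 0.0, "medium": 0.05, "high": 0.1}


def rerank(candidates: list[Candidate], query_tags: set[str], top_k: int = 3) -> list[Candidate]:
    """Blend vector and graph scores, add rerank bonuses, keep the top_k."""

    def score(c: Candidate) -> float:
        base = 0.6 * c.vector_sim + 0.4 * c.graph_score  # blend the two signals
        recency = 0.1 / (1.0 + c.age_days / 30.0)        # newer learnings score higher
        overlap = 0.05 * len(c.tags & query_tags)        # tags shared with the query
        return base + recency + CONFIDENCE_BONUS.get(c.confidence, 0.0) + overlap

    return sorted(candidates, key=score, reverse=True)[:top_k]
```

With these weights a fresh, high-confidence learning with strong graph proximity can outrank an old one that only wins on raw vector similarity, which is the behaviour the rerank step is meant to provide.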
4 · Inject as additionalContext session_start_recall.py:158
The script emits a single JSON object on stdout containing hookSpecificOutput.additionalContext.
The harness reads that envelope and silently prepends the learnings to the session's developer-message
context — the user never sees the JSON, the model sees the learnings as if they were instructions.
{
"hookSpecificOutput": {
"hookEventName": "SessionStart",
"additionalContext": "## Prior learnings relevant here\n- [lrn-...] Skill X must use ..."
}
}
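As an illustration, the envelope above could be produced by a helper like this one. `build_recall_envelope` and the `(id, summary)` pair shape are assumptions; only the two `hookSpecificOutput` keys and the heading text come from the example above.

```python
import json


def build_recall_envelope(learnings: list[tuple[str, str]]) -> dict:
    """Wrap (id, summary) pairs in the SessionStart hook output shape."""
    if not learnings:
        context = ""  # nothing to inject; the harness gets an empty context
    else:
        lines = ["## Prior learnings relevant here"]
        lines += [f"- [{lid}] {summary}" for lid, summary in learnings[:3]]
        context = "\n".join(lines)
    return {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": context,
        }
    }


def render(learnings: list[tuple[str, str]]) -> str:
    """Serialise the envelope as the single JSON object written to stdout."""
    return json.dumps(build_recall_envelope(learnings))
```

A hook script would print `render(...)` once and exit; the cap of three items mirrors the top-3 limit described in the previous step.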
The capture loop · PreCompact → queue → drain
Reflection itself is not done synchronously during PreCompact — the compaction can't wait for a 30-second LLM run. Instead, PreCompact just enqueues the transcript path. The actual /reflect runs asynchronously on the next session start (in any harness) and never blocks the user.
1 · PreCompact fires before compaction PreCompact
When the harness is about to compress history (because context is filling up), it serializes
{session_id, transcript_path, trigger, ...} to the precompact hook. trigger
is either auto (harness decided) or manual (user ran /compact).
2 · Enqueue transcript to ~/.reflect/pending_reflections.jsonl precompact_reflect.py:137
The script appends a single line to ~/.reflect/pending_reflections.jsonl and returns
immediately. No LLM call happens here. The queue file is shared across all harnesses; nothing
about it is Claude-specific.
{"transcript_path": "...","session_id":"...","trigger":"auto","queued_at":"..."}
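The enqueue step amounts to a single append. A minimal sketch, assuming an ISO 8601 UTC timestamp for `queued_at` (the actual timestamp format is not shown above) and a hypothetical function name `enqueue_reflection`:

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def enqueue_reflection(queue_path: Path, transcript_path: str,
                       session_id: str, trigger: str) -> dict:
    """Append one JSONL entry describing a transcript awaiting reflection."""
    entry = {
        "transcript_path": transcript_path,
        "session_id": session_id,
        "trigger": trigger,  # "auto" or "manual", as sent by the harness
        "queued_at": datetime.now(timezone.utc).isoformat(),  # format is an assumption
    }
    queue_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps existing entries from other sessions and harnesses.
    with queue_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

Because each entry is one self-contained line, a reader can process the file line by line without caring which harness wrote a given entry.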
3 · Next SessionStart (any harness) fires the drainer SessionStart
The next session that starts on this machine — Claude or Codex — fires
reflect-drain-bg.sh as a detached background process
((nohup ... &) >/dev/null 2>&1) with a 5-second start budget. It's PID-locked so two
concurrent drainers can't trample each other, and daily-capped via cost events so a runaway loop
can't spend unlimited tokens.
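The two guards can be sketched in Python, although the real drainer is a shell script. This is a POSIX-only illustration: `acquire_lock`, `pid_alive`, and `under_daily_cap` are hypothetical names, and how the daily count is derived from cost events is not described here, so it is passed in as a plain integer. The default of 20 for `REFLECT_DRAIN_DAILY_MAX` is the one stated in the FAQ.

```python
import os
from pathlib import Path


def pid_alive(pid: int) -> bool:
    """POSIX only: signal 0 tests for existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def acquire_lock(lock_path: Path) -> bool:
    """Create the lockfile exclusively; reclaim it only if its owner is dead."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        try:
            holder = int(lock_path.read_text().strip())
        except (OSError, ValueError):
            return False  # unreadable or half-written lock: treat as held
        if pid_alive(holder):
            return False  # another drainer is running
        lock_path.unlink(missing_ok=True)  # stale lock left by a crashed drainer
        return acquire_lock(lock_path)
    with os.fdopen(fd, "w") as fh:
        fh.write(str(os.getpid()))
    return True


def under_daily_cap(runs_today: int, env=os.environ) -> bool:
    """True while today's drain count is below REFLECT_DRAIN_DAILY_MAX."""
    cap = int(env.get("REFLECT_DRAIN_DAILY_MAX", "20"))
    return runs_today < cap
```

A drainer following this pattern would exit early when either check fails and remove the lockfile when it finishes.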
4 · Drain shells out to a headless claude -p run reflect-drain-bg.sh:210
For each queue entry, the drainer spawns claude -p "/reflect <transcript>"
with --output-format json, --max-turns 25, and
--permission-mode bypassPermissions. The /reflect skill scans the
transcript, classifies corrections vs noteworthy patterns, and writes the resulting learning
documents to ~/.learnings/documents/.
This subprocess is always claude, even when the queue entry was written by a
codex session. Codex is the trigger, Claude is the worker. (Configurable via
REFLECT_DRAIN_CLAUDE_BIN in environments where claude isn't on PATH.)
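For illustration, the command line described above could be assembled like this. The flags and the `REFLECT_DRAIN_CLAUDE_BIN` override come from the text; the function names and the 600-second timeout are assumptions, and the real drainer builds this command in shell rather than Python.

```python
import os
import subprocess


def build_reflect_command(transcript_path: str, env=os.environ) -> list[str]:
    """Assemble the headless reflect invocation for one queue entry."""
    claude_bin = env.get("REFLECT_DRAIN_CLAUDE_BIN", "claude")
    return [
        claude_bin,
        "-p", f"/reflect {transcript_path}",
        "--output-format", "json",
        "--max-turns", "25",
        "--permission-mode", "bypassPermissions",
    ]


def run_reflect(transcript_path: str, timeout_s: int = 600) -> bool:
    """Run the command; report success as a boolean instead of raising."""
    try:
        result = subprocess.run(
            build_reflect_command(transcript_path),
            capture_output=True,
            text=True,
            timeout=timeout_s,  # hypothetical per-entry budget
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # binary not found, not executable, or run took too long
    return result.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems when a transcript path contains spaces.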
5 · Each successful drain triggers reflect reindex reflect-drain-bg.sh:end-of-main
If at least one entry processed cleanly, the drainer runs reflect reindex (with a
5-minute timeout) so the GraphRAG index picks up the new .md + .entities.yaml
files. Without this, learnings are still on disk — they just won't appear in future
/recall results until a manual reindex.
Successful entries are removed from the queue; transient failures stay (with a retry counter);
permanently broken entries (missing transcript, >3 retries) are moved to
~/.reflect/poison-reflections.jsonl.
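The three outcomes can be expressed as a small decision function. This is a sketch: `Outcome`, `triage`, and the exact boundary (poison once the failure count exceeds three) are assumptions drawn from the ">3 retries" wording above.

```python
import os
from enum import Enum

MAX_RETRIES = 3  # taken from the ">3 retries" wording; exact boundary is assumed


class Outcome(Enum):
    REMOVE = "remove"  # processed cleanly: drop from the queue
    RETRY = "retry"    # transient failure: keep and bump the retry counter
    POISON = "poison"  # permanently broken: move to the poison file


def triage(entry: dict, succeeded: bool, failures_so_far: int) -> Outcome:
    """Decide what happens to a queue entry after one drain attempt."""
    if succeeded:
        return Outcome.REMOVE
    if not os.path.exists(entry.get("transcript_path", "")):
        return Outcome.POISON  # transcript is gone; retrying cannot help
    if failures_so_far + 1 > MAX_RETRIES:
        return Outcome.POISON  # this failure pushes the count past the limit
    return Outcome.RETRY
```

Checking for the transcript before counting retries means a deleted transcript is quarantined on its first failure instead of consuming three more drain attempts.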
How each harness gets wired
The hook scripts are shared. The config plumbing is per-harness.
// .claude-plugin/plugin.json: wired by /plugin install reflect@agents-in-a-box
{
  "hooks": {
    "SessionStart": [{
      "hooks": [
        { "type":"command",
          "command":"uv run ${CLAUDE_PLUGIN_ROOT}/skills/recall/hooks/session_start_recall.py" },
        { "type":"command",
          "command":"(nohup ${CLAUDE_PLUGIN_ROOT}/hooks/reflect-drain-bg.sh &) ...",
          "timeout": 5 }
      ]
    }],
    "PreCompact": [{
      "hooks": [{
        "type":"command",
        "command":"uv run ${CLAUDE_PLUGIN_ROOT}/hooks/precompact_reflect.py --auto --verbose" }]
    }]
  }
}
// ~/.claude/settings.json — what the plugin runtime produces
{
"hooks": {
"SessionStart": [{
"matcher": "",
"hooks": [
{ "type":"command",
"command":"uv run /Users/<you>/.claude/skills/recall/hooks/session_start_recall.py" },
{ "type":"command",
"command":"(nohup /Users/<you>/.claude/plugins/.../hooks/reflect-drain-bg.sh ...",
"timeout": 5 }
]
}],
"PreCompact": [{ "matcher":"", "hooks":[{
"command":"uv run .../hooks/precompact_reflect.py --auto --verbose" }] }]
}
}
# codex has no plugin runtime, so the adapter does the wireup itself
python plugins/reflect/adapters/codex/codex_adapter.py install

# or skip the bg drain on codex-only machines without claude on PATH:
python plugins/reflect/adapters/codex/codex_adapter.py install --no-bg-drain

# adapter physically copies plugin content into ~/.codex/skills/
# and merges hook entries into ~/.codex/hooks.json
// ~/.codex/hooks.json: what codex_adapter.py produces
{
  "hooks": {
    "SessionStart": [{
      "matcher": "",
      "hooks": [
        { "type":"command",
          "command":"uv run /Users/<you>/.codex/skills/recall/hooks/session_start_recall.py" },
        { "type":"command",
          "command":"(nohup /Users/<you>/.codex/skills/reflect/hooks/reflect-drain-bg.sh &)...",
          "timeout": 5 }
      ]
    }],
    "PreCompact": [{ "matcher":"", "hooks":[{
      "command":"uv run /Users/<you>/.codex/skills/reflect/hooks/precompact_reflect.py --auto --verbose" }] }]
  }
}
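Merging hook entries into an existing hooks.json is the part of the wire-up most likely to go wrong on reinstall. One way it could be done is sketched below; `merge_hooks` is a hypothetical helper, not the function in codex_adapter.py, and de-duplicating by exact equality of the matcher group is an assumption.

```python
import copy


def merge_hooks(existing: dict, additions: dict) -> dict:
    """Return a copy of existing with the hook groups from additions merged in."""
    merged = copy.deepcopy(existing)  # never mutate the caller's config
    hooks = merged.setdefault("hooks", {})
    for event, groups in additions.get("hooks", {}).items():
        current = hooks.setdefault(event, [])
        for group in groups:
            if group not in current:  # exact duplicates are skipped, so reinstalling is idempotent
                current.append(copy.deepcopy(group))
    return merged
```

Hook groups that the user configured for other tools are left in place, and running the merge twice yields the same result as running it once.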
The cross-tool case · codex queues, claude drains
Imagine you spend the morning in Codex on a tricky migration, hit context compaction, and quit. In the afternoon you open Claude on the same repo. Here's the timeline:
Morning · Codex compaction at 11:42 codex session
Codex fires PreCompact → precompact_reflect.py appends one line to
~/.reflect/pending_reflections.jsonl with the codex transcript path. No reflection
runs. Codex compacts and continues.
Codex session ends · queue still has the entry 11:55
If another SessionStart in the same codex session had fired (e.g. on resume), it would have drained. But the user quit. The entry sits in the queue.
Afternoon · Claude session starts at 14:08 claude session
Claude fires SessionStart. Two hooks run: session_start_recall.py
(injects whatever's already in the GraphRAG index — the codex morning's learnings are not
there yet because they haven't been processed) and reflect-drain-bg.sh as a detached
background process.
Drain picks up the codex transcript · spawns claude -p 14:08 + ~1s
The drain script reads the queue, finds the morning's codex transcript path, and spawns
claude -p "/reflect <morning-codex-transcript>" — note this is Claude processing
a transcript that was produced by Codex. The transcript format is the same JSONL the harnesses use
natively, so /reflect doesn't care which harness wrote it.
Learnings land · reindex updates GraphRAG 14:09
The headless run writes .md + .entities.yaml sidecars under
~/.learnings/documents/, then reflect reindex updates the GraphRAG index.
The queue entry is removed.
Next session (Claude OR Codex) recalls them whenever
From this point forward, the next SessionStart in any harness — including the very same Claude session that triggered the drain, on its next start — will see the morning's codex learnings in the top-3 if they match the cwd context.
Gotchas worth knowing
- The drainer ALWAYS shells out to claude. Codex is the trigger, not the worker. On a codex-only machine without claude on PATH, the drain logs a warning and exits 0 (no hang), but learnings never get processed. Pass --no-bg-drain to the codex adapter on those machines to skip the hook entirely.
- SessionStart never blocks startup. Both hook scripts always exit 0, even on errors: a broken GraphRAG, a missing reflect CLI, or an unparseable queue all just result in an empty additionalContext.
- PreCompact doesn't reflect synchronously. The script only enqueues. Reflection happens on the next session start so it doesn't make the user wait through compaction.
- Claude and Codex use the same event-name casing (SessionStart / PreCompact, PascalCase). Copilot CLI uses lower camelCase (sessionStart / preCompact); if you port the adapter, watch the casing.
- The queue isn't transactional. If a drain crashes mid-entry, the retry counter (~/.reflect/retry-count.jsonl) survives. Three failed retries on the same transcript and the entry is moved to poison-reflections.jsonl, out of the way but kept for forensics.
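The "always exit 0" rule can be made concrete with a fail-open wrapper. This is a sketch under assumptions: `run_fail_open` and the `recall_fn` callback are hypothetical names, and the broad `except` is a deliberate choice for this illustration, not a description of the real scripts.

```python
import json


def run_fail_open(stdin_text: str, recall_fn) -> tuple[str, int]:
    """Run recall; on any failure return an empty context and exit code 0."""
    try:
        payload = json.loads(stdin_text)
        context = str(recall_fn(payload) or "")
    except Exception:  # deliberately broad: any failure degrades to "no learnings"
        context = ""
    envelope = {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": context,
        }
    }
    return json.dumps(envelope), 0
```

A script built this way would print the first element and pass the second to `sys.exit`, so a missing index or malformed stdin costs the session its recalled learnings but never its startup.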
FAQ
- Why doesn't recall just embed the user's last prompt?
  By the time the user submits their prompt, the SessionStart hook has already run. Recall has to use cheap pre-prompt signals (cwd, branch, recent commits) and inject before the conversation begins.
- What stops the drain from running every session?
  A PID lockfile (~/.reflect/drain.lock): if another drain is already running, the new one logs and exits. Plus daily caps via REFLECT_DRAIN_DAILY_MAX (default 20). And the queue may simply be empty.
- Can I run recall manually?
  Yes. Invoke /recall as a skill. The same script runs synchronously with a query you provide. Useful when starting a new feature where the cwd-based query misses relevant prior work.
- What happens if I install reflect on Claude AND Codex?
  That's the supported case. The hook scripts are the same; both harnesses just point at their own copies. The queue and learnings store are shared, so the cross-tool case above just works.
- Where does ~/.reflect/ live, and is it portable?
  Under $HOME/.reflect/ by default; overridable via REFLECT_STATE_DIR. Contents are JSONL/Markdown/YAML: fully grep-able, version-control friendly if you want, and portable across machines if you sync the dir.
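The state-directory override described in the last answer could be resolved as below. `resolve_state_dir` is a hypothetical helper; only the `REFLECT_STATE_DIR` variable and the `$HOME/.reflect/` default come from the text, and treating an empty value as unset is an assumption.

```python
import os
from pathlib import Path


def resolve_state_dir(env=os.environ) -> Path:
    """Return REFLECT_STATE_DIR if set and non-empty, else ~/.reflect."""
    override = env.get("REFLECT_STATE_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".reflect"
```

Every script that touches the queue would need to resolve the directory the same way, otherwise one harness could enqueue into a location the drainer never reads.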