---
name: post-mortem
description: Capture a recently resolved incident or painful debug session as durable project memory. Writes a structured feedback file into the project's memory dir so future Claude sessions remember the lesson and don't repeat the mistake.
trigger: /post-mortem
---

# /post-mortem

Painful bugs and operational disasters carry the most learning per minute. Without this skill, those lessons evaporate within a week — you remember "something about the publish cron" but not what specifically broke or how the fix worked. Post-mortem captures the lesson while it's still vivid and writes it as project memory, so the next Claude session starts already knowing.

The output is not a doc you read once. It's a memory file (`feedback_<topic>.md`) that I, the next Claude, will load when you start a new session. The "When to remember this" section at the bottom is the trigger — that's how the lesson gets retrieved later.

## Usage

`/post-mortem <topic>` — short slug for the file (e.g. `publish-cron-race`, `stuck-pending-claims`)

Or just `/post-mortem` and I'll suggest a slug from recent context.

## What it's for

You hit incidents that took real time to debug. Recent examples on Elite AI Advantage alone:
- Publish cron racing manual drain attempts
- Bootstrap seed re-introducing deleted IG accounts on every deploy
- Stuck PENDING blogs invisible to the claim filter (attempts >= MAX bug)
- Force-drain UI not updating until Railway deploy caught up

Each of those is now a one-time fix in code, but none of them left written memory — so if a similar bug pattern shows up next month, future-Claude has to rediscover the failure mode from scratch. Post-mortem fixes that.

## What You Must Do When Invoked

### Step 1 — Identify the project + incident

If a topic slug was provided, use it. If not, infer from recent context — the last ~10 turns of conversation usually point at one specific problem. Confirm with Jake in one short question if it's truly ambiguous (otherwise commit and write it).

Find the project memory dir:
- `pwd` to get project path
- Slugify path (`/` and ` ` → `-`)
- Target: `~/.claude/projects/<slug>/memory/feedback_<topic>.md`
- If memory dir doesn't exist, create it with `mkdir -p`
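The path derivation above can be sketched in Node terms (a minimal sketch; the slug rule follows this skill's convention, and `memoryTarget` is a hypothetical helper, not a Claude Code API):

```typescript
import * as path from "node:path";
import * as os from "node:os";

// Hypothetical helper: turn a project path into the feedback file target.
// Slug rule per Step 1: `/` and ` ` both become `-`.
function memoryTarget(projectPath: string, topic: string): string {
  const slug = projectPath.replace(/[/ ]/g, "-");
  return path.join(
    os.homedir(),
    ".claude",
    "projects",
    slug,
    "memory",
    `feedback_${topic}.md`,
  );
}
```

For `/Users/jake/my app` and topic `publish-cron-race`, the slug becomes `-Users-jake-my-app` and the file lands under that project's `memory/` dir. Create the dir with `mkdir -p` (or `fs.mkdirSync(..., { recursive: true })`) before writing.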

### Step 2 — Gather the timeline

Pull from:
- Recent conversation context (the most reliable source)
- Recent git log: `git log --since="3 days ago" --pretty=format:"%h %s (%cr)"`
- Recent commits' diffs for the actual fix: `git show <hash>` on the fix commit(s)

Identify:
- **Symptom** — what was visible to Jake
- **Root cause** — what was actually broken under the hood
- **Fix** — the specific code/config change, with file:line references
- **Why it happened** — the design pattern or assumption that allowed it

### Step 3 — Write the feedback file

Output structure (write this exact shape to the memory dir):

```markdown
# Feedback: <topic>

**Date:** <YYYY-MM-DD>
**Project:** <project name>
**Severity:** <minor / moderate / painful / disaster>

## What happened
<3–6 bullet timeline. Plain language. What Jake saw, in order.>

## Root cause
<1–3 sentences. The actual underlying issue. Not the symptom.>

## What we changed
<File-by-file list of the fix. Include file paths and (where stable) function names.
Example:
- `src/lib/instagram-pipeline/runner.ts` — split stale-claim recovery into two paths: `attempts >= MAX` → FAILED, `attempts < MAX` → PENDING
- `prisma/seed-instagram-accounts.ts` — added bootstrap-only guard checking `existingCount > 0`>

## Why it happened (deeper)
<1–2 paragraphs. The design pattern, the assumption, the edge case. This is the part future-Claude needs.>

## Prevention
<Concrete: what code pattern, what review check, what test would have caught this. If the answer is "nothing realistic," say so honestly.>

## When to remember this
<2–4 trigger conditions for future sessions. Format as "if you see X, recall this."
Example:
- If working on the InstagramPipeline runner or claim logic, remember stale-claim recovery must split on attempts.
- If touching `prisma/seed-*.ts`, check bootstrap guards exist or admin deletions get reversed every deploy.
- If a queue-style table has `attempts < MAX_ATTEMPTS` filters, audit whether stale rows can land at MAX without failing.>
```

### Step 4 — Update MEMORY.md index

If `~/.claude/projects/<slug>/memory/MEMORY.md` exists, append a one-line link:

```markdown
- [<topic>](feedback_<topic>.md) — <one-line summary>
```

Match the existing line style in MEMORY.md (don't introduce a new format).
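As a sketch of the append logic (hypothetical `indexFeedback` helper, assuming the one-line link format shown above; the real step is just a conditional file append):

```typescript
import * as fs from "node:fs";

// Hypothetical helper: append an index line to MEMORY.md, but only
// if the index already exists. Returns whether a line was written.
function indexFeedback(memoryDir: string, topic: string, summary: string): boolean {
  const indexPath = `${memoryDir}/MEMORY.md`;
  if (!fs.existsSync(indexPath)) return false; // no index, nothing to update
  fs.appendFileSync(indexPath, `- [${topic}](feedback_${topic}.md) — ${summary}\n`);
  return true;
}
```

The existence check matters: this step never creates MEMORY.md, it only extends one that's already there.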

### Step 5 — Confirm and stop

Print a 3-line confirmation:
- File written: `<path>`
- MEMORY.md updated: yes/no
- Next likely trigger: `<one of the "when to remember this" lines>`

Then stop. Don't propose more work.

## Calibration for Jake's projects

- **Tone:** factual, not apologetic. No "we should have caught this earlier" hand-wringing. State what happened and what changed.
- **Granularity:** if the fix was a one-line typo, the post-mortem is one paragraph. If the fix was a system redesign, the post-mortem can be a page. Match the scope.
- **Severity scale:**
  - `minor` — under 30 minutes, single file, no user impact
  - `moderate` — under 2 hours, multi-file or multi-stage thinking
  - `painful` — half a day plus, required design rethink
  - `disaster` — user-visible damage, lost work, late-night debugging

## What to avoid

- **No blame.** Code didn't "fail us," a design didn't "betray us." The pattern allowed the bug. State it neutrally.
- **No retroactive coaching.** Don't write "Jake should have…" — that's not what feedback files are for.
- **No copying full diffs.** Reference file:line, don't paste 200 lines of code into memory.
- **No vague triggers.** "If you're working on the pipeline…" is too broad. Be specific: "If editing `claimNextQueued` or any stale-recovery logic…"
- **No skipping the "When to remember" section.** That's the entire point. Without it, the file is a journal entry, not memory.

## Example output (abbreviated)

```markdown
# Feedback: stuck-pending-claims

**Date:** 2026-04-23
**Project:** Elite AI Advantage
**Severity:** painful

## What happened
- Admin reported one PENDING blog couldn't be drained.
- Manual force-drain didn't pick it up.
- Inspection showed status=PENDING but attempts=3 (MAX).

## Root cause
`claimNextQueued` increments attempts on claim. If the worker dies mid-write (Railway redeploy), stale recovery flips the row back to PENDING but leaves attempts=3. Subsequent claims filter on `attempts < MAX`, so the row is invisible forever.

## What we changed
- `src/lib/instagram-pipeline/runner.ts` — split stale recovery: `attempts >= MAX` → FAILED, `attempts < MAX` → PENDING. Plus a sweep for already-stuck PENDINGs at MAX.
- `forceDrain` now also runs the sweep for retroactive cleanup.

## Why it happened (deeper)
The claim filter assumed PENDING + attempts=MAX is impossible (since claim moves it to PROCESSING first). Mid-write death broke that assumption. Generic claim logic, generic stale recovery, but the two together produced an impossible state.

## Prevention
Any queue with `attempts < MAX` claim filters needs explicit handling for "stale row at MAX" — either fail it, or treat it as eligible. Adding this to the IG pipeline doesn't generalize; the rule belongs in the design notes for any future queue.

## When to remember this
- If editing `claimNextQueued` or any stale-recovery logic in instagram-pipeline, remember the two-path split.
- If building a new queue with attempt limits, design the stale-at-MAX state explicitly.
- If a PENDING blog won't drain, suspect attempts=MAX before suspecting the claim logic.
```
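The two-path split in that example can be sketched in isolation (all names here — `QueueRow`, `recoverStale`, `MAX_ATTEMPTS` — are illustrative, not the real `runner.ts` API):

```typescript
const MAX_ATTEMPTS = 3;

type Status = "PENDING" | "PROCESSING" | "FAILED";

interface QueueRow {
  id: string;
  status: Status;
  attempts: number;
  claimedAt: number | null; // epoch ms; null when unclaimed
}

// Hypothetical stale-claim recovery with the two-path split.
function recoverStale(rows: QueueRow[], staleBefore: number): void {
  for (const row of rows) {
    const stale =
      row.status === "PROCESSING" &&
      row.claimedAt !== null &&
      row.claimedAt < staleBefore;
    if (!stale) continue;
    // A row already at the attempt cap must FAIL. Flipping it back to
    // PENDING would leave it invisible to any claim query that filters
    // on `attempts < MAX_ATTEMPTS` — the exact stuck state in the example.
    row.status = row.attempts >= MAX_ATTEMPTS ? "FAILED" : "PENDING";
    row.claimedAt = null;
  }
}
```

The design point is that recovery must enumerate every reachable (status, attempts) combination, including the "impossible" one that a mid-write death creates.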

That's the shape. Vivid, specific, future-actionable.
