Operationsv1.0.0

Triage

/triage

Production incident response. 90 seconds of structured assessment, then commit to rollback, hotfix, patch, monitor, or ignore.

Install in one command

Install

mkdir -p ~/.claude/skills/triage && curl -fsSL https://eliteaiadvantage.com/skills/triage/SKILL.md -o ~/.claude/skills/triage/SKILL.md

Download SKILL.md

Then run /triage in Claude Code.

What it is

When something breaks in production, the instinct is to dive into code. Half the time that's right; half the time you spend an hour fixing something a 30-second rollback would have handled. Triage forces a decision step before the code dive: severity, blast radius, reversibility, time pressure, then a single response category.

Built for live-user incidents, public site down, queue stuck, admin tool broken, deploy gone wrong. Threads the middle between panic-fix and over-thought.

Why it's useful

→Forces explicit severity (🔴 down / 🟡 degraded / 🟢 cosmetic) so the response matches the actual cost.
→Names blast radius (public / admin / pipeline / one user) before deciding how urgently to act.
→Checks reversibility, rollback safe? data damaged? side effects irreversible?, before diving in.
→Picks one response category from five (rollback / hotfix / patch / monitor / ignore), no menus.
→Pairs cleanly with /post-mortem after the fix lands.

When to use it

•Production site or feature just broke.
•Deploy went out and metrics are off.
•A user reported something that might be a bigger issue.
•Anything where the first instinct is "let me jump in and fix it right now."

How it helps with Claude

Without this skill, incident response defaults to "let me investigate", which is a response category, but rarely the right one for production downtime. Triage forces the explicit assessment-then-decision pattern: name the severity, pick one of five responses, get the next concrete action. Investigation becomes a chosen response, not a default avoidance of commitment.

Published 4/25/2026 · Last updated 5/6/2026

Suggest an improvement →