How I Actually Set Up My .claude Folder for Production Work

The post that got me thinking about this

Manthan Patel published a .claude folder setup guide that's pulling Google traffic right now. It's a good starter resource. He shows you the directory tree, lists the six agents he runs, and tells you to copy the structure. If you're brand new to Claude Code, that's a perfectly fine on-ramp.

The reason I'm writing a different one: aspirational tutorials skip the maintenance reality. After 18 months of running .claude folders across a half-dozen production projects, the parts that actually compounded look very different from the parts that demo well in a screenshot. The agent catalog you set up on day one is the same catalog you delete on day 90, when you realize you ran two of them once and forgot the other four existed.

This post is the version I wish I'd had at month one. It's not a copy-paste tree. It's an honest accounting of what to add, what to wait on, and what to delete in your quarterly cleanup pass. If you're past the "does Claude Code work?" stage and into the "how do I keep this useful at month six?" stage, this is for you.

The structure (and which parts you actually need on day 1)

Here's the canonical layout. The annotations are the honest version, not the vendor pitch:

.claude/
├── settings.json    ← REAL day-1 priority. Permissions + hooks.
├── skills/          ← Ship 1 skill before you ship 6. Resist the catalog.
├── agents/          ← Most projects don't need these. Resist.
├── hooks/           ← Add ONE pre-commit hook before adding 5.
├── commands/        ← Useful at month 2, not week 1.
└── rules/           ← Premature for most projects under 50 files.
CLAUDE.md            ← Day 1, always. Project context lives here.

The rule I've landed on: every directory you create is something you'll maintain. Each one is a future cleanup task with your name on it.

Day 1 is CLAUDE.md + settings.json. That's it. Skills come when you have a real workflow that'll fire twice in a week. Agents come when you have a parallel-task pattern that a single Claude pass can't handle. Hooks come when you've actually made the mistake the hook would prevent. Commands come when you find yourself typing the same prompt prefix three times in a day. Rules come when your project is big enough that drift matters.

Pre-creating the catalog is the most common mistake I see. People stand up the full five-directory .claude with two skills, three agents, and an empty commands/ folder, and three months later the skills are dead, the agents have bit-rotted, and the commands/ folder still has nothing in it. Resist. Add directories when you have a real reason to add them.
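If it helps to make the day-1 rule concrete, here is a minimal sketch of a scaffold that creates only the two files and nothing else. The function name init_claude_day1 and the empty permission lists are illustrative assumptions, not something Claude Code ships.

```shell
#!/usr/bin/env bash
# Hypothetical helper: scaffold only the two day-1 files.
# It deliberately creates no skills/, agents/, hooks/, commands/, or rules/.
init_claude_day1() {
  local root="${1:-.}"
  mkdir -p "$root/.claude"

  # Never clobber a file that already exists.
  if [ ! -e "$root/CLAUDE.md" ]; then
    printf '# Stack\n\n# Conventions\n\n# Decisions\n\n# Backlog (active)\n\n# Gotchas\n' \
      > "$root/CLAUDE.md"
  fi

  if [ ! -e "$root/.claude/settings.json" ]; then
    printf '{\n  "permissions": {\n    "allow": [],\n    "deny": []\n  }\n}\n' \
      > "$root/.claude/settings.json"
  fi
}
```

Run it as init_claude_day1 path/to/project, then fill in the five headings by hand. Everything else waits until you have the reason for it.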

CLAUDE.md: the only file that matters in week 1

If you only do one thing, write a real CLAUDE.md. It's the brain of every Claude session that runs in your project. Get this right and the model stops suggesting Express when you're on Next.js, stops re-asking which package manager you use, and stops proposing the same architecture decisions you made (and rejected) six months ago.

What goes in:

Stack

So Claude doesn't suggest the wrong framework. Specific versions, not "we use React." If you're on Next.js 16 with Prisma 7 and deploy on Railway, say that. Two lines, every time.

Conventions you actually enforce

Linter rules that fail builds, naming patterns the team agreed on, banned APIs. Not aspirations ("we like clean code"). Things that the CI will reject. If you wouldn't ship a PR that violates it, it goes in.

Decisions made

The build-vs-buy choices, the "why X is in this repo not that one" calls, the architecture decisions you've already had the meeting about. The point is to keep Claude from re-pitching options you already killed.

Active backlog

What's WIP, what's stale, what's parked. Two sentences each. This is the section that goes out of date fastest, which is fine. The act of updating it is how you remember what you're actually working on.

Recurring gotchas

The things that bit you twice. Write them down so they don't bite a third time. This is the highest-compound section in the file. Every entry is a paid-for lesson.

What stays out: marketing fluff ("we believe in clean code"), aspirational content ("we plan to migrate to..."), and every architectural decision in the project's history. CLAUDE.md is not your wiki. It's the loadout your future Claude session needs to be useful in the first 30 seconds.

A real CLAUDE.md for a typical Next.js project looks like this:

# Stack
Next.js 16 (App Router), Prisma 7, PostgreSQL on Railway, R2 for object storage.
Package manager: pnpm. Node 22.

# Conventions
- Server Components by default. Use "use client" only when you need state/effects.
- DB access goes through src/lib/db.ts. No raw Prisma in components.
- All async server actions live in src/actions/. One file per domain.
- Banned: any, console.log in committed code, useEffect for data fetching.

# Decisions
- We use Prisma over Drizzle. Don't re-pitch.
- Auth is NextAuth v5 (not Clerk). Don't re-pitch.
- We deploy on Railway, not Vercel. Don't suggest Vercel-specific APIs.

# Backlog (active)
- Stripe webhook retry logic, WIP on branch billing-retries
- Admin RBAC, parked until next sprint

# Gotchas
- Prisma client must use the PrismaPg adapter on Railway. Default driver doesn't work with their Postgres pooler.
- Don't run db migrations from local against prod. Use Railway CLI.

That's under 250 words and worth more than 90% of the CLAUDE.md files I see in the wild. If yours is longer than two screens, it's probably full of stuff that should be deleted.
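One way to keep yourself honest about length is a word-budget check you can run by hand or in CI. This is a sketch: the function name claude_md_check and the 500-word default are my assumptions standing in for "two screens", so pick a number that matches your own tolerance.

```shell
#!/usr/bin/env bash
# Hypothetical check: print the word count of a CLAUDE.md and
# return non-zero when it exceeds the budget (default 500, an assumption).
claude_md_check() {
  local file="$1" max="${2:-500}" words
  # tr strips the padding some wc implementations add.
  words=$(wc -w < "$file" | tr -d ' ')
  echo "$file: $words words (budget $max)"
  [ "$words" -le "$max" ]
}
```

Calling claude_md_check CLAUDE.md 400 prints the count and fails when the file has outgrown the budget, which is usually the cue to delete rather than to raise the limit.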

Skills: the 80/20 of what actually compounds

Skills are reusable workflows you can invoke with a slash command. Done well, they're the highest-ROI piece of a .claude folder. Done poorly, they're the directory that grows by one every two weeks and never shrinks.

The honest taxonomy I've landed on:

High compound (build these first)

Project-specific skills that solve a recurring 30+ minute task in under five minutes. Example: a /ship-check skill that runs typecheck, build, security scan, and migration safety review before every push. I run it weekly. It has caught real issues. The build cost was an afternoon; the time saved is in the dozens of hours.

Test for this category: did I use it twice in the last 14 days? If yes, it's earning its keep.
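The core of a skill like /ship-check is usually a small script that runs each gate in order and stops at the first failure. Here is a sketch of that runner, not the skill I actually run. The run_steps name and the "label::command" format are made up for illustration, and the commands you pass in depend entirely on your project.

```shell
#!/usr/bin/env bash
# Hypothetical runner: each argument is "label::command".
# Steps run in order; the first failing step aborts the rest.
run_steps() {
  local step label cmd
  for step in "$@"; do
    label="${step%%::*}"   # text before the first "::"
    cmd="${step#*::}"      # text after the first "::"
    echo "==> $label"
    if ! bash -c "$cmd"; then
      echo "FAILED: $label" >&2
      return 1
    fi
  done
  echo "All checks passed"
}
```

On a pnpm project the call might look like run_steps "typecheck::pnpm exec tsc --noEmit" "build::pnpm build" "audit::pnpm audit", assuming those scripts exist in your package.json. Failing fast matters here: a broken typecheck should not cost you a full build before you hear about it.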

Medium compound

Decision frameworks. A five-voice council skill for high-stakes calls. A pre-mortem skill that walks the failure modes of a plan before you commit to it. These get used 2-4 times a month. Not weekly, but worth keeping because the alternative (running the same exercise from scratch each time) adds significantly more friction.

Low compound (the trap)

Skills you build because they look cool. The "code reviewer" skill you invoke once for the demo and forget. The "documentation writer" you used to prove the concept to your team and never opened again. The "blog post idea generator" you ran on day one when you were excited about Claude.

The honest test: did I use this skill twice in the last 30 days? If no, archive it. The skills directory should shrink as often as it grows, and if you've never deleted one, you're probably keeping dead weight.
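Archiving is easier to do when it's one command. The sketch below moves a skill directory into skills/_archive/ so it leaves the live catalog but stays recoverable. The archive_skill name is hypothetical, and I'm assuming skills live one directory per skill under .claude/skills.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move a skill out of the live catalog.
# Assumes one directory per skill under .claude/skills.
archive_skill() {
  local name="$1" dir="${2:-.claude/skills}"
  if [ ! -d "$dir/$name" ]; then
    echo "no such skill: $name" >&2
    return 1
  fi
  if [ -e "$dir/_archive/$name" ]; then
    echo "already archived: $name" >&2
    return 1
  fi
  mkdir -p "$dir/_archive"
  mv "$dir/$name" "$dir/_archive/$name"
  echo "archived $name"
}
```

For example, archive_skill blog-idea-generator retires the skill without deleting the afternoon you spent on it. Whether your Claude Code version still picks up anything nested under _archive/ is worth checking once after the first move.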

Most production .claude folders I trust have 3-7 skills, not 15. The ones with 15 either belong to people who built skills as content (and rarely use them) or to teams where multiple people contribute and nobody owns cleanup. Both are fixable; neither is rare.

Agents: when they're worth it (and when they're a maintenance tax)

Agents are sub-Claude instances you can spawn to do parallel work. The pitch is appealing: a "team" of specialists, each with its own role. The reality is more nuanced.

Agents are worth it when three conditions all hold:

The task is genuinely parallelizable (10 niche pages to draft, 5 search queries to run, 4 files to refactor that don't import each other).
The work is independent. No shared state, no "agent A's output feeds agent B's input."
The per-task context is genuinely different. If all four agents are reading the same files and producing similar outputs, one Claude with a list is faster.

For 95% of solo and small-team projects, one well-written CLAUDE.md beats six agents. The "agent team" pattern is a YouTube hook. It's not a workflow advantage in projects under 50 files.

For agency-scale work (multiple repos, multiple projects, real fan-out): agents help, but only if you have a way to merge results. If your "code-reviewer agent" runs and produces a 2,000-word review you don't actually read, that's negative ROI. The agent generated work; the reviewer (you) skipped it; the bug shipped anyway.

What real production looks like: 1-3 specialist agents max, each doing something a generalist Claude genuinely can't. An Explore agent for fast read-only codebase recon. A Plan agent for architecture decisions before you commit. A build agent for fan-out tasks. That's it. If you have six agents and you can't name what each one uniquely does, you have four too many.

Hooks: the part that catches all the bugs the rest doesn't

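Hooks are scripts the Claude Code runtime fires at specific lifecycle points: before a tool call, after a tool call, or when Claude finishes a response (the Stop event). There's no dedicated commit event; a "pre-commit" check here is a before-tool-call hook that watches for git commit. They're invisible until they trigger, and when they trigger, they save you from yourself.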

The single highest-ROI hook in any production .claude is a Stop hook that verifies live deployment claims. The pattern: when Claude says "shipped" or "deployed" in its final response, the hook curls the prod URL, checks the response, and refuses to let the session end cleanly if the claim isn't grounded. It catches the "ship-and-claim-without-checking" failure mode that bites everyone exactly once.

Here's a stripped-down version of one I run:

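#!/bin/bash
# Stop hook: refuses to end the session if Claude claimed
# something is live but a quick prod curl says otherwise.
# Claude Code sends hook input as JSON on stdin; this version needs jq.

INPUT=$(cat)

# Skip if this stop already follows a Stop-hook block, to avoid looping.
[ "$(echo "$INPUT" | jq -r '.stop_hook_active')" = "true" ] && exit 0

# Pull the last assistant text out of the JSONL transcript. The entry
# shape used here is an assumption; check it against your own transcripts.
TRANSCRIPT=$(echo "$INPUT" | jq -r '.transcript_path')
LAST_MSG=$(jq -rs '[.[] | select(.type == "assistant")] | last
  | [.message.content[]? | select(.type == "text") | .text] | join("\n")' \
  "$TRANSCRIPT")

if echo "$LAST_MSG" | grep -qiE "(shipped|deployed|live now|pushed to prod)"; then
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    https://eliteaiadvantage.com/health)
  if [ "$STATUS" != "200" ]; then
    # Exit code 2 blocks the stop and hands stderr back to Claude.
    echo "BLOCKED: claim says shipped but /health returned $STATUS" >&2
    exit 2
  fi
fi
exit 0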

The other hooks I'd consider, ranked by honest ROI:

Pre-commit type-check. Worth it only if your project has type errors that break things at runtime. If you're on a strict TypeScript project where the build catches everything, this is just noise. If you're on a looser config and runtime type bugs are a recurring theme, it pays back fast.

Migration-safety check. If you're on Postgres with Prisma or Drizzle and the model can suggest schema changes, a hook that flags destructive migrations (drop column, change type, NOT NULL on a big table) genuinely earns its keep. I've had this catch real near-misses.

Secrets-scan on file write. Useful if you've ever leaked an API key. Annoying if you haven't. Probably worth it once you've grown past solo work.
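The migration-safety and secrets-scan checks above both reduce to pattern matching on a file, so the core logic is small. Here is a sketch of each as a plain function. The names check_migration and scan_for_secrets and the regexes are my own illustrative choices, and they're coarse filters, not a replacement for a real migration linter or secret scanner. How you wire them into a hook depends on your setup.

```shell
#!/usr/bin/env bash
# Hypothetical check: flag destructive statements in a SQL migration file.
# Returns 0 when nothing risky is found, 1 when something is flagged.
check_migration() {
  local file="$1" hits
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 2; }
  hits=$(grep -niE \
    'drop[[:space:]]+(table|column)|alter[[:space:]]+column.*[[:space:]]type[[:space:]]|set[[:space:]]+not[[:space:]]+null' \
    "$file") || return 0
  echo "Destructive statements in $file:" >&2
  echo "$hits" >&2
  return 1
}

# Hypothetical check: look for a few common secret shapes in a file.
# Prints matching lines; returns 0 when clean, 1 when anything matched.
scan_for_secrets() {
  local file="$1" found=0
  # AWS access key IDs and PEM private-key headers (case-sensitive).
  grep -nE -e 'AKIA[0-9A-Z]{16}' \
           -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' "$file" && found=1
  # Generic assignments such as api_key = "..." (case-insensitive).
  grep -niE \
    "(api[_-]?key|secret|token)[[:space:]]*[:=][[:space:]]*[\"'][A-Za-z0-9_-]{16,}[\"']" \
    "$file" && found=1
  [ "$found" -eq 0 ]
}
```

With Prisma, the file to check would typically be the generated migration.sql under prisma/migrations/. Expect false positives from both functions, which is exactly why the advice is to add them only after the mistake has cost you something.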

The general rule: don't add hooks until you've made the mistake the hook would catch. Premature hooks rot. You'll add three on a Saturday afternoon, two of them will fire false positives the next week, and you'll disable the whole config out of frustration. Wait until the bug bites, then add the hook that would have caught it.

The settings.json that makes this all hang together

The kitchen-sink settings.json you see in tutorial posts has 40 permission rules, every hook turned on, and a model selection block that pins to a specific minor version. For most production projects, that's overkill.

The minimal version that holds up:

{
  "permissions": {
    "allow": [
      "Bash(npm *)",
      "Bash(pnpm *)",
      "Bash(git status)",
      "Bash(git diff *)",
      "Bash(git log *)",
      "Read(*)",
      "Edit(*)"
    ],
    "deny": [
      "Read(.env*)",
      "Bash(rm -rf *)",
      "Bash(git push --force*)"
    ]
  },
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/ship-claim-check.sh"
          }
        ]
      }
    ]
  }
}

That's the whole file. Five to ten allow rules, three to five deny rules, zero or one hook. Most projects don't need more.

The deny list is more important than people think. Reading .env files is one of the most common ways credentials end up in a session transcript. Force-pushing to main is one of the most common ways to lose work. rm -rf with a wildcard is one of the most common ways to nuke a directory. These three rules cost nothing and have saved real engineers real money.

The maintenance cadence (the part nobody writes about)

Every .claude folder in production rots if you don't prune it. Quarterly is the cadence I've landed on. Block 30 minutes once a quarter and walk through:

Every skill: did I use this in the last 90 days? If no, archive. Move it to a skills/_archive/ directory if you're sentimental, or just delete it. The point is that it shouldn't be in the live catalog cluttering your slash-command list.

Every agent: same test. Most agents bit-rot fast as Claude's defaults improve. The "code reviewer" agent you wrote in early 2025 is probably worse than what current Claude does inline. Kill it.

Read CLAUDE.md. Are these still the right conventions? Is the active backlog still active or is half of it shipped? Is the gotchas section growing (good) or static (probably stale)?

Read every hook. Is each one still earning its keep? Has it fired in the last quarter? If a hook hasn't fired in 90 days, either the bug it catches is genuinely solved (delete it) or it's never going to fire because the trigger is wrong (fix or delete it).
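The "has it fired in the last quarter?" question is only answerable if hooks leave a trace. One low-effort option is a tiny log that each hook appends to at the moment it blocks something. This is a sketch under my own assumptions: the log path, the tab-separated format, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical fire log: one "timestamp<TAB>hook-name" line per block.
HOOK_LOG="${HOOK_LOG:-.claude/hooks/fires.log}"

# Call this from a hook right before it exits with a blocking status.
log_hook_fire() {
  mkdir -p "$(dirname "$HOOK_LOG")"
  printf '%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" >> "$HOOK_LOG"
}

# Print the timestamp of the most recent fire for a hook, or "never".
hook_last_fired() {
  local last
  [ -f "$HOOK_LOG" ] || { echo "never"; return 0; }
  last=$(awk -F '\t' -v n="$1" '$2 == n { t = $1 } END { print t }' "$HOOK_LOG")
  echo "${last:-never}"
}
```

During the quarterly pass, hook_last_fired ship-claim-check tells you whether the hook has earned its place or has sat silent since you wrote it.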

The .claude folder grows over a year. The maintenance pass shrinks it back. Skip the pass for two cycles and it becomes a swamp: half-broken hooks, dead skills, a CLAUDE.md that lies about what's in the project. The folder turns from an asset into a liability, and the next person to onboard (including future you) will struggle to trust any of it.

What I'd tell someone setting one up tomorrow

Minimum viable production .claude on day one:

CLAUDE.md with the five sections above. Stack, conventions, decisions, backlog, gotchas. Two pages max.
settings.json with five allow rules, three deny rules, zero or one hook.
One or two skills that you'll genuinely fire weekly. If you can't name them off the top of your head, you don't have them yet.
Nothing else.

Add complexity only when the friction of not having it costs you 30+ minutes per week. That's the bar. If you're adding a skill because it sounds cool, pause. If you're adding an agent because everyone on Twitter has a team of agents, pause. The folder you build to impress someone is the folder that rots first. The folder you build to genuinely save yourself time is the one that compounds.

Closing

If you want to see what real production skills look like, the EAA Skills library has the actual ones I run, with the trigger conditions and the install commands. They're not aspirational; they're the ones that survived the quarterly cleanup pass.

If you're running Claude Code across multiple repos or a team and the maintenance pattern is starting to get away from you, that's part of what we help with at Elite AI Advantage. You can book a 30-minute review and we'll walk your setup, name the parts that are dead weight, and leave you with a one-page action list. No pitch deck, no upsell.