White Paper

5 AI Projects for Your Resume: Full Technical Breakdown

Jake McCluskey

Source post: https://www.instagram.com/p/DV-HQHHk9tD/ (datasciencebrain, 12-slide carousel)
Hook: "Hiring managers don't care about your certificates. They care about what you've actually built."

Scraping note: Instagram blocked per-slide image OCR. Text fragments leaked through for Projects 1, 2, and 4; for Projects 3 and 5, only the titles leaked. Each section below is labeled:

  • [SCRAPED], confirmed from the post.
  • [STANDARD], the conventional industry stack for this project type, filled in because the slide content couldn't be extracted. Treat it as a safe default, not a literal copy.

Project 1: RAG Document Q&A System

Stack [SCRAPED]

  • FastAPI, REST API framework
  • pdfplumber, PDF text extraction
  • OpenAI text-embedding-3-small, embeddings
  • ChromaDB, vector store
  • GPT-4o-mini, answer generation

What it does

User uploads a PDF. System chunks, embeds, and stores it. User asks questions. System retrieves relevant chunks and generates grounded answers.

Minimal implementation [STANDARD pattern for this stack]

# pip install fastapi uvicorn pdfplumber chromadb openai python-multipart
from fastapi import FastAPI, UploadFile, File
import pdfplumber, chromadb, uuid
from openai import OpenAI

app = FastAPI()
oa = OpenAI()
chroma = chromadb.PersistentClient(path="./chroma_db")
collection = chroma.get_or_create_collection("docs")

def embed(texts):
    return [e.embedding for e in oa.embeddings.create(
        model="text-embedding-3-small", input=texts).data]

def chunk(text, size=800, overlap=100):
    return [text[i:i+size] for i in range(0, len(text), size-overlap)]

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    with pdfplumber.open(file.file) as pdf:
        text = "\n".join(p.extract_text() or "" for p in pdf.pages)
    chunks = chunk(text)
    collection.add(
        ids=[str(uuid.uuid4()) for _ in chunks],
        documents=chunks,
        embeddings=embed(chunks),
    )
    return {"chunks_indexed": len(chunks)}

@app.post("/ask")
async def ask(question: str):
    q_emb = embed([question])[0]
    hits = collection.query(query_embeddings=[q_emb], n_results=4)
    context = "\n\n".join(hits["documents"][0])
    resp = oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using ONLY the context. If not in context, say you don't know."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return {"answer": resp.choices[0].message.content}
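Before wiring in the API, the sliding-window chunker above is worth sanity-checking on its own. A standalone check of the overlap arithmetic (same chunk helper, no API calls needed):

```python
def chunk(text, size=800, overlap=100):
    # Window advances by size - overlap, so consecutive chunks share `overlap` chars
    return [text[i:i+size] for i in range(0, len(text), size - overlap)]

text = "x" * 2000
chunks = chunk(text)
print(len(chunks))                          # 3 (starts at 0, 700, 1400)
print(len(chunks[0]), len(chunks[-1]))      # 800 600
print(chunks[0][-100:] == chunks[1][:100])  # True: the 100-char overlap
```

The overlap exists so a sentence that straddles a chunk boundary stays retrievable from at least one chunk.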

Resume angle

"Built a production-ready document Q&A service with semantic retrieval and grounded generation. Chunked, embedded, and indexed arbitrary PDFs against a persistent vector store."

Project 2: Multi-Agent Research Bot

Stack [SCRAPED partial]

  • Python + LangGraph (StateGraph was explicitly mentioned in the slide text)
  • Multiple LLM-powered agents in an orchestrated workflow

Stack [STANDARD defaults]

  • LangGraph, orchestration
  • Tavily API, web search tool (free tier, designed for agents)
  • GPT-4o or Claude Sonnet, agent brains
  • Agents: Planner, Searcher, Summarizer, Writer

Minimal implementation [STANDARD pattern]

# pip install langgraph langchain langchain-openai tavily-python
from typing_extensions import TypedDict
from typing import List
from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI
from tavily import TavilyClient
import os

llm = ChatOpenAI(model="gpt-4o", temperature=0)
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

class ResearchState(TypedDict):
    topic: str
    plan: List[str]           # sub-questions
    findings: List[str]       # raw search results
    summary: str              # synthesized notes
    report: str               # final writeup

def planner(state):
    prompt = f"Break this research topic into 3-5 searchable sub-questions:\n{state['topic']}"
    plan = llm.invoke(prompt).content.split("\n")
    return {"plan": [p.strip() for p in plan if p.strip()]}

def searcher(state):
    findings = []
    for q in state["plan"]:
        results = tavily.search(q, max_results=3)
        findings.append(f"Q: {q}\n" + "\n".join(r["content"] for r in results["results"]))
    return {"findings": findings}

def summarizer(state):
    joined = "\n\n".join(state["findings"])
    prompt = f"Synthesize the key facts from these findings:\n\n{joined}"
    return {"summary": llm.invoke(prompt).content}

def writer(state):
    prompt = (f"Write a polished research report on '{state['topic']}' "
              f"using these synthesized notes:\n\n{state['summary']}")
    return {"report": llm.invoke(prompt).content}

graph = StateGraph(ResearchState)
graph.add_node("plan", planner)
graph.add_node("search", searcher)
graph.add_node("summarize", summarizer)
graph.add_node("write", writer)
graph.add_edge(START, "plan")
graph.add_edge("plan", "search")
graph.add_edge("search", "summarize")
graph.add_edge("summarize", "write")
graph.add_edge("write", END)
app = graph.compile()

print(app.invoke({"topic": "Impact of Mamba architecture on long-context LLMs"})["report"])

Resume angle

"Designed a multi-agent research pipeline using LangGraph where specialized agents (planner, web searcher, summarizer, writer) pass state through a deterministic graph, demonstrating orchestration beyond single-prompt LLM usage."

Project 3: AI Voice Scheduling Bot

Stack [TITLE ONLY, slide not extractable]

Stack [STANDARD, the two canonical paths]

Path A, compose from primitives (more resume-impressive):

  • Twilio Voice, phone number, call handling (webhook on incoming call)
  • Twilio <Gather> + speech recognition OR OpenAI Whisper, speech-to-text
  • GPT-4o (with function calling), intent parsing and slot filling (name, time, duration)
  • Google Calendar API, create event
  • ElevenLabs or Twilio <Say>, text-to-speech response
  • FastAPI, webhook server

Path B, use a voice-agent platform:

  • VAPI.ai or Retell AI, handles STT/TTS/streaming. You write the prompt and tools.
  • Google Calendar API, booking tool exposed to the agent
  • Faster to build, less impressive on a resume

Minimal implementation (Path A) [STANDARD pattern]

# pip install fastapi twilio openai google-api-python-client
from fastapi import FastAPI, Request
from fastapi.responses import Response
from twilio.twiml.voice_response import VoiceResponse, Gather
from openai import OpenAI
from googleapiclient.discovery import build
from google.oauth2 import service_account
import json, datetime

app = FastAPI()
oa = OpenAI()
creds = service_account.Credentials.from_service_account_file(
    "gcal.json", scopes=["https://www.googleapis.com/auth/calendar"])
cal = build("calendar", "v3", credentials=creds)

TOOLS = [{
    "type": "function",
    "function": {
        "name": "book_meeting",
        "description": "Book a meeting on the calendar",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "start_iso": {"type": "string", "description": "ISO 8601 datetime"},
                "duration_minutes": {"type": "integer"},
                "attendee_email": {"type": "string"},
            },
            "required": ["title", "start_iso", "duration_minutes"],
        },
    },
}]

@app.post("/voice")
async def incoming_call():
    # Entry point — Twilio webhook on incoming call
    vr = VoiceResponse()
    gather = Gather(input="speech", action="/handle", speech_timeout="auto")
    gather.say("Hi! I can book a meeting for you. What would you like to schedule?")
    vr.append(gather)
    return Response(content=str(vr), media_type="application/xml")

@app.post("/handle")
async def handle_speech(request: Request):
    form = await request.form()
    user_said = form.get("SpeechResult", "")
    now = datetime.datetime.now().isoformat()

    resp = oa.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"You are a scheduling assistant. Current time: {now}. Parse the user's request and call book_meeting."},
            {"role": "user", "content": user_said},
        ],
        tools=TOOLS,
    )
    msg = resp.choices[0].message
    vr = VoiceResponse()

    if msg.tool_calls:
        args = json.loads(msg.tool_calls[0].function.arguments)
        start = datetime.datetime.fromisoformat(args["start_iso"])
        end = start + datetime.timedelta(minutes=args["duration_minutes"])
        cal.events().insert(calendarId="primary", body={
            "summary": args["title"],
            # GCal requires a timeZone when dateTime carries no UTC offset;
            # "UTC" is a stand-in — use the caller's timezone in production
            "start": {"dateTime": start.isoformat(), "timeZone": "UTC"},
            "end": {"dateTime": end.isoformat(), "timeZone": "UTC"},
        }).execute()
        # %-I is POSIX-only; use %I on Windows
        vr.say(f"Booked {args['title']} for {start.strftime('%A at %-I %p')}. Goodbye!")
    else:
        vr.say("Sorry, I couldn't understand the request. Please try again.")
    vr.hangup()
    return Response(content=str(vr), media_type="application/xml")

Resume angle

"Built a telephony-driven voice agent that handles inbound calls, transcribes speech, extracts scheduling intent via LLM function-calling, and books events on Google Calendar. Full round-trip under 3 seconds."

Project 4: AI Code Review Agent

Stack [SCRAPED]

  • Python + GPT-4
  • Functions named in the slide: fetch_pr_diff(), review_code(), post_review()
  • GitHub PR webhooks, triggers on every PR submission
  • Suggested extensions: 1-10 quality score per file, secondary agent for concrete fix suggestions

Minimal implementation [STANDARD pattern matching the scraped function names]

# pip install fastapi uvicorn pygithub openai
from fastapi import FastAPI, Request, Header, HTTPException
from github import Github
from openai import OpenAI
import hmac, hashlib, os

app = FastAPI()
oa = OpenAI()
gh = Github(os.environ["GITHUB_TOKEN"])
WEBHOOK_SECRET = os.environ["WEBHOOK_SECRET"].encode()

def verify_signature(body: bytes, signature: str):
    expected = "sha256=" + hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise HTTPException(401, "Invalid signature")

def fetch_pr_diff(repo_name: str, pr_number: int) -> list:
    """Fetch all file diffs from a PR."""
    repo = gh.get_repo(repo_name)
    pr = repo.get_pull(pr_number)
    return [{"filename": f.filename, "patch": f.patch or ""} for f in pr.get_files()]

def review_code(files: list) -> dict:
    """GPT-4 review — returns per-file comments + overall score."""
    diffs = "\n\n".join(f"### {f['filename']}\n```\n{f['patch']}\n```" for f in files)
    resp = oa.chat.completions.create(
        model="gpt-4o",  # JSON mode below requires gpt-4o/gpt-4-turbo; base gpt-4 rejects response_format
        messages=[
            {"role": "system", "content": (
                "You are a senior code reviewer. For each file, identify bugs, "
                "security issues, performance problems, and style violations. "
                "Give each file a quality score 1-10. Return JSON: "
                '{"per_file": [{"filename": str, "score": int, "comments": [str]}], "overall": str}'
            )},
            {"role": "user", "content": diffs},
        ],
        response_format={"type": "json_object"},
    )
    import json
    return json.loads(resp.choices[0].message.content)

def post_review(repo_name: str, pr_number: int, review: dict):
    """Post review as a PR comment."""
    repo = gh.get_repo(repo_name)
    pr = repo.get_pull(pr_number)
    body = f"## Automated AI Review\n\n**Overall:** {review['overall']}\n\n"
    for f in review["per_file"]:
        body += f"### `{f['filename']}` — Score: {f['score']}/10\n"
        for c in f["comments"]:
            body += f"- {c}\n"
        body += "\n"
    pr.create_issue_comment(body)

@app.post("/webhook")
async def webhook(request: Request, x_hub_signature_256: str = Header(None)):
    body = await request.body()
    verify_signature(body, x_hub_signature_256)
    event = await request.json()

    # Only act on opened/synchronize PR events
    if event.get("action") not in ("opened", "synchronize"):
        return {"skipped": True}

    repo_name = event["repository"]["full_name"]
    pr_number = event["pull_request"]["number"]

    files = fetch_pr_diff(repo_name, pr_number)
    review = review_code(files)
    post_review(repo_name, pr_number, review)
    return {"reviewed": True, "files": len(files)}

Setup: expose the FastAPI server with ngrok, then in the GitHub repo under Settings, Webhooks, Add webhook: payload URL https://<ngrok>/webhook, content type application/json, secret matching your WEBHOOK_SECRET, and events limited to Pull requests.
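To exercise the handler before pointing GitHub at it, you can forge a correctly signed request locally. A minimal sketch of the signature scheme verify_signature() expects (the secret and body here are placeholders):

```python
import hmac, hashlib

secret = b"my-webhook-secret"      # stand-in for WEBHOOK_SECRET
body = b'{"action": "opened"}'     # stand-in for a real PR event payload

# GitHub sends this value in the X-Hub-Signature-256 header
signature = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
print(signature.startswith("sha256="))  # True

# The server-side check is the same computation plus a timing-safe compare
expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
print(hmac.compare_digest(expected, signature))  # True
```

Send `body` to /webhook with that header value and the handler should pass verification; flip one byte of either and it should return 401.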

Scraped extension ideas

  1. Scoring (1-10 per file), already implemented above in review_code()
  2. Secondary fix-suggestion agent: take each comment, run a follow-up prompt that produces a concrete code patch, and post it as an inline review comment via pr.create_review(comments=[{path, position, body}])
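The second extension can reuse the reviewer's output directly. A sketch of that follow-up agent under stated assumptions: build_fix_prompt and post_inline_fix are hypothetical helper names, not from the post; the pr.create_review call is the PyGithub API named above.

```python
def build_fix_prompt(filename: str, patch: str, comment: str) -> str:
    """Prompt asking the model for replacement code for the flagged lines."""
    return (
        f"File: {filename}\nDiff:\n{patch}\n\n"
        f"Review comment: {comment}\n"
        "Reply with ONLY the corrected code, no explanation."
    )

def post_inline_fix(pr, filename: str, position: int, suggestion: str):
    """Post the model's patch as an inline review comment.
    GitHub renders fenced 'suggestion' blocks as one-click-applyable fixes."""
    body = f"```suggestion\n{suggestion}\n```"
    pr.create_review(
        event="COMMENT",
        comments=[{"path": filename, "position": position, "body": body}],
    )
```

Feed each `(filename, patch, comment)` triple from review_code()'s per_file output through the LLM with build_fix_prompt, then hand the result to post_inline_fix.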

Resume angle

"Shipped a GitHub-integrated code review bot. FastAPI webhook handler verifies HMAC signatures, pulls PR diffs via PyGithub, generates structured JSON reviews with GPT-4, and posts scored feedback per file on every PR."

Project 5: Full-Stack AI SaaS with Payments

Stack [TITLE ONLY, slide not extractable]

Stack [STANDARD, the canonical 2026 SaaS stack]

  • Next.js 14+ (App Router) + TypeScript, frontend and API routes
  • Clerk or Auth.js (NextAuth), authentication
  • Postgres (Neon or Supabase) + Prisma or Drizzle, database
  • Stripe, subscriptions and metered billing / credits
  • OpenAI API (or similar), AI feature
  • Vercel, hosting
  • shadcn/ui + Tailwind, UI
  • Credits model: Stripe webhook on subscription/purchase writes credit balance to DB. Every AI call decrements credits.
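The decrement rule in the last bullet is easiest to get right as a single conditional UPDATE rather than a read-then-write, which can race under concurrent requests. A language-agnostic sketch in Python/sqlite3 (table and column names are illustrative; the real app would do this through Prisma or Drizzle):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (clerk_id TEXT PRIMARY KEY, credits INTEGER)")
db.execute("INSERT INTO users VALUES ('user_1', 2)")

def spend_credit(conn, clerk_id) -> bool:
    """Atomically deduct one credit; returns False when the balance is empty."""
    cur = conn.execute(
        "UPDATE users SET credits = credits - 1 "
        "WHERE clerk_id = ? AND credits >= 1", (clerk_id,))
    return cur.rowcount == 1  # 0 rows matched => out of credits

print(spend_credit(db, "user_1"))  # True
print(spend_credit(db, "user_1"))  # True
print(spend_credit(db, "user_1"))  # False (balance exhausted)
```

Because the balance check and the deduction happen in one statement, two simultaneous AI calls can never both succeed on a single remaining credit.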

Architecture [STANDARD pattern]

┌─────────────────┐     ┌──────────────────┐
│  Next.js UI     │────▶│  /api/ai         │──▶ OpenAI
│  (App Router)   │     │  (check credits, │
│  Clerk auth     │     │   deduct, call)  │──▶ DB (decrement)
└─────────────────┘     └──────────────────┘
         │
         │ Subscribe / Buy credits
         ▼
┌─────────────────┐     ┌──────────────────┐
│ Stripe Checkout │────▶│  /api/webhooks/  │──▶ DB (add credits,
│                 │     │   stripe         │    update subscription)
└─────────────────┘     └──────────────────┘

Key code [STANDARD pattern]

app/api/ai/route.ts, gated AI endpoint:

import { auth } from "@clerk/nextjs/server";
import { db } from "@/lib/db";
import { OpenAI } from "openai";
import { NextResponse } from "next/server";

const openai = new OpenAI();

export async function POST(req: Request) {
  const { userId } = auth();
  if (!userId) return NextResponse.json({ error: "unauthorized" }, { status: 401 });

  const user = await db.user.findUnique({ where: { clerkId: userId } });
  if (!user || user.credits < 1) {
    return NextResponse.json({ error: "Out of credits" }, { status: 402 });
  }

  const { prompt } = await req.json();
  const resp = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: prompt }],
  });

  // Note: check-then-decrement is not atomic; in production, wrap the credit
  // check and deduction in a transaction (or a conditional updateMany) so
  // concurrent requests can't drive the balance negative.
  await db.user.update({
    where: { clerkId: userId },
    data: { credits: { decrement: 1 } },
  });

  return NextResponse.json({ output: resp.choices[0].message.content });
}

app/api/webhooks/stripe/route.ts, credit top-up on purchase:

import Stripe from "stripe";
import { db } from "@/lib/db";
import { NextResponse } from "next/server";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export async function POST(req: Request) {
  const body = await req.text();
  const sig = req.headers.get("stripe-signature")!;
  let event: Stripe.Event;
  try {
    event = stripe.webhooks.constructEvent(
      body, sig, process.env.STRIPE_WEBHOOK_SECRET!
    );
  } catch {
    // constructEvent throws on a bad signature; reject rather than 500
    return NextResponse.json({ error: "invalid signature" }, { status: 400 });
  }

  if (event.type === "checkout.session.completed") {
    const session = event.data.object as Stripe.Checkout.Session;
    const userId = session.metadata!.clerkId;
    const credits = Number(session.metadata!.credits);
    await db.user.update({
      where: { clerkId: userId },
      data: { credits: { increment: credits } },
    });
  }
  return NextResponse.json({ received: true });
}

Prisma schema:

model User {
  id        String  @id @default(cuid())
  clerkId   String  @unique
  email     String  @unique
  credits   Int     @default(10)  // free starter credits
  stripeId  String?
  createdAt DateTime @default(now())
}

Resume angle

"Shipped a full-stack AI SaaS with authenticated users, metered credit billing via Stripe webhooks, and AI feature gating. Every inference deducts from a server-authoritative credit balance, preventing abuse and tying revenue to model cost."

How to actually use this

The Instagram post's value was the project ideas and high-level stacks, not novel engineering. Each of these projects is a solid 1-3 day build. Priority order for resume impact:

  1. Project 5 (SaaS), hardest to fake, shows end-to-end product thinking, has real users and payments
  2. Project 4 (Code Review Agent), ships to a real system (GitHub), demonstrable on your own repos
  3. Project 2 (Multi-Agent Research), "agentic" is the 2026 buzzword, LangGraph is the right framework
  4. Project 3 (Voice Bot), visually and audibly impressive in demos
  5. Project 1 (RAG), table-stakes by 2026, but a solid foundation project

Confidence summary

  Project          What's scraped vs. standard
  1. RAG Q&A       Stack scraped. Code is the standard FastAPI+Chroma pattern.
  2. Multi-Agent   LangGraph scraped. Agent roles are the standard planner, search, summarize, write.
  3. Voice Bot     Only the title scraped. Full stack is the standard Twilio+OpenAI+GCal path.
  4. Code Review   Stack and function names scraped. Code matches those signatures.
  5. SaaS          Only the title scraped. Full stack is the standard Next.js+Stripe+Clerk setup.