5 AI Projects for Your Resume: Full Technical Breakdown
White Paper

Jake McCluskey

Source post: https://www.instagram.com/p/DV-HQHHk9tD/ (datasciencebrain, 12-slide carousel)
Hook: "Hiring managers don't care about your certificates. They care about what you've actually built."

Scraping note: Instagram blocked per-slide image OCR. Text fragments leaked through for Projects 1, 2, and 4; for Projects 3 and 5, only the titles did. Each section below is labeled:
[SCRAPED] is confirmed from the post.
[STANDARD] is the conventional industry stack for this project type, filled in because slide content couldn't be extracted. Treat as a safe default, not a literal copy.
[TITLE ONLY] means only the project title came through, so everything under it is a [STANDARD] fill-in.

Project 1: RAG Document Q&A System

Stack [SCRAPED]

  • FastAPI, REST API framework
  • pdfplumber, PDF text extraction
  • OpenAI text-embedding-3-small, embeddings
  • ChromaDB, vector store
  • GPT-4o-mini, answer generation

What it does

User uploads a PDF. System chunks, embeds, and stores it. User asks questions. System retrieves relevant chunks and generates grounded answers.
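
The retrieval step can be illustrated without any API keys. The sketch below is a toy stand-in, not the stack above: toy_embed (a hypothetical bag-of-words counter) takes the place of text-embedding-3-small, and a sorted list takes the place of ChromaDB. It shows only the ranking idea: embed the question, score every chunk by cosine similarity, keep the top k.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding model: word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "The warranty covers parts and labor for two years.",
    "Shipping takes five business days within the country.",
    "Returns are accepted within thirty days of purchase.",
]
top = retrieve("how long is the warranty", chunks, k=1)
print(top)
```

A real embedding model matches on meaning rather than shared words, but the retrieve-then-generate flow is the same.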

Minimal implementation [STANDARD pattern for this stack]

# pip install fastapi uvicorn pdfplumber chromadb openai python-multipart
from fastapi import FastAPI, UploadFile, File
import pdfplumber, chromadb, uuid
from openai import OpenAI

app = FastAPI()
oa = OpenAI()
chroma = chromadb.PersistentClient(path="./chroma_db")
collection = chroma.get_or_create_collection("docs")

def embed(texts):
    return [e.embedding for e in oa.embeddings.create(
        model="text-embedding-3-small", input=texts).data]

def chunk(text, size=800, overlap=100):
    return [text[i:i+size] for i in range(0, len(text), size-overlap)]

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    with pdfplumber.open(file.file) as pdf:
        text = "\n".join(p.extract_text() or "" for p in pdf.pages)
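    # Drop whitespace-only chunks so they are not embedded or indexed
    chunks = [c for c in chunk(text) if c.strip()]
    if not chunks:
        # Scanned or image-only PDFs yield no text; skip the embed and add calls, which reject empty input
        return {"chunks_indexed": 0}
    collection.add(
        ids=[str(uuid.uuid4()) for _ in chunks],
        documents=chunks,
        embeddings=embed(chunks),
    )
    return {"chunks_indexed": len(chunks)}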

@app.post("/ask")
async def ask(question: str):
    q_emb = embed([question])[0]
    hits = collection.query(query_embeddings=[q_emb], n_results=4)
    context = "\n\n".join(hits["documents"][0])
    resp = oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using ONLY the context. If not in context, say you don't know."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return {"answer": resp.choices[0].message.content}
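
The chunk() helper above advances by size - overlap characters per step, so consecutive chunks share their boundary text. A small sketch of that arithmetic, assuming a guarded variant named chunk_checked (hypothetical, not from the post) that rejects overlap >= size, which would otherwise make the step zero or negative:

```python
def chunk_checked(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size windows that overlap by `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each window start moves forward
    return [text[i:i + size] for i in range(0, len(text), step)]


# 2000 characters of repeating digits, so overlaps are easy to compare
text = "".join(str(i % 10) for i in range(2000))
chunks = chunk_checked(text)

print(len(chunks))                       # window starts at 0, 700, 1400
print([len(c) for c in chunks])          # the last window is shorter
print(chunks[0][-100:] == chunks[1][:100])  # shared boundary text
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of embedding some text twice.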

Resume angle

"Built a production-ready document Q&A service with semantic retrieval and grounded generation. Chunked, embedded, and indexed arbitrary PDFs against a persistent vector store."

Project 2: Multi-Agent Research Bot

Stack [SCRAPED partial]

  • Python + LangGraph (StateGraph was explicitly mentioned in the slide text)
  • Multiple LLM-powered agents in an orchestrated workflow

Stack [STANDARD defaults]

  • LangGraph, orchestration
  • Tavily API, web search tool (free tier, designed for agents)
  • GPT-4o or Claude Sonnet, agent brains
  • Agents: Planner, Searcher, Summarizer, Writer
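
Conceptually, each agent is a function that reads the shared state and returns only the keys it changed; the orchestrator merges those updates and hands the result to the next agent. A framework-free sketch of that hand-off, with stub agents standing in for the LLM and search calls (every function body here is a placeholder, and run_pipeline is a hypothetical helper, not a LangGraph API):

```python
from typing import Callable, Dict, List

State = Dict[str, object]


def planner(state: State) -> State:
    topic = state["topic"]
    return {"plan": [f"What is {topic}?", f"Why does {topic} matter?"]}


def searcher(state: State) -> State:
    # A real searcher would call a web search tool for each sub-question
    return {"findings": [f"(stub result for: {q})" for q in state["plan"]]}


def summarizer(state: State) -> State:
    return {"summary": " ".join(state["findings"])}


def writer(state: State) -> State:
    return {"report": f"Report on {state['topic']}: {state['summary']}"}


def run_pipeline(state: State, steps: List[Callable[[State], State]]) -> State:
    """Run steps in order, merging each partial update into the shared state."""
    for step in steps:
        state = {**state, **step(state)}
    return state


final = run_pipeline({"topic": "vector databases"},
                     [planner, searcher, summarizer, writer])
print(final["report"])
```

What an orchestration framework adds on top of this loop is branching, retries, and checkpointing; the state-in, partial-update-out contract stays the same.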

Minimal implementation [STANDARD pattern]

# pip install langgraph langchain langchain-openai tavily-python
from typing_extensions import TypedDict
from typing import List
from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI
from tavily import TavilyClient
import os

llm = ChatOpenAI(model="gpt-4o", temperature=0)
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

class ResearchState(TypedDict):
    topic: str
    plan: List[str]           # sub-questions
    findings: List[str]       # raw search results
    summary: str              # synthesized notes
    report: str               # final writeup

def planner(state):
    prompt = f"Break this research topic into 3-5 searchable sub-questions:\n{state['topic']}"
    plan = llm.invoke(prompt).content.split("\n")
    return {"plan": [p for p in plan if p.strip()]}

def searcher(state):
    findings = []
    for q in state["plan"]:
        results = tavily.search(q, max_results=3)
        findings.append(f"Q: {q}\n" + "\n".join(r["content"] for r in results["results"]))
    return {"findings": findings}

def summarizer(state):
    joined = "\n\n".join(state["findings"])
    prompt = f"Synthesize the key facts from these findings:\n\n{joined}"
    return {"summary": llm.invoke(prompt).content}

def writer(state):
    prompt = (f"Write a polished research report on '{state['topic']}' "
              f"using these synthesized notes:\n\n{state['summary']}")
    return {"report": llm.invoke(prompt).content}

graph = StateGraph(ResearchState)
graph.add_node("plan", planner)
graph.add_node("search", searcher)
graph.add_node("summarize", summarizer)
graph.add_node("write", writer)
graph.add_edge(START, "plan")
graph.add_edge("plan", "search")
graph.add_edge("search", "summarize")
graph.add_edge("summarize", "write")
graph.add_edge("write", END)
app = graph.compile()

print(app.invoke({"topic": "Impact of Mamba architecture on long-context LLMs"})["report"])
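
LLMs often return a list like the planner's with numbering or bullets attached, and splitting on newlines keeps those prefixes in every search query. A sketch of a cleanup step, assuming a hypothetical helper parse_plan that could replace the split inside planner():

```python
import re

# Leading "-", "*", "•", "1." or "1)" followed by whitespace
_PREFIX = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def parse_plan(raw: str, limit: int = 5) -> list[str]:
    """Turn a free-form LLM list into clean sub-questions, capped at `limit`."""
    questions = []
    for line in raw.splitlines():
        cleaned = _PREFIX.sub("", line).strip()
        if cleaned:  # skip blank lines
            questions.append(cleaned)
    return questions[:limit]


raw = "1. What is Mamba?\n2) How does it scale?\n\n- Which benchmarks exist?\n"
print(parse_plan(raw))
```

Capping the list also bounds how many search calls a single run can trigger.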

Resume angle

"Designed a multi-agent research pipeline using LangGraph where specialized agents (planner, web searcher, summarizer, writer) pass state through a deterministic graph, demonstrating orchestration beyond single-prompt LLM usage."

Project 3: AI Voice Scheduling Bot

Stack [TITLE ONLY, slide not extractable]

Stack [STANDARD, the two canonical paths]

Path A, compose from primitives (more resume-impressive):

  • Twilio Voice, phone number, call handling (webhook on incoming call)
  • Twilio <Gather> + speech recognition OR OpenAI Whisper, speech-to-text
  • GPT-4o (with function calling), intent parsing and slot filling (name, time, duration)
  • Google Calendar API, create event
  • ElevenLabs or Twilio <Say>, text-to-speech response
  • FastAPI, webhook server

Path B, use a voice-agent platform:

  • VAPI.ai or Retell AI, handles STT/TTS/streaming. You write the prompt and tools.
  • Google Calendar API, booking tool exposed to the agent
  • Faster to build, less impressive on a resume

Minimal implementation (Path A) [STANDARD pattern]

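# pip install fastapi uvicorn python-multipart twilio openai google-api-python-client google-auth
# (python-multipart is required for request.form(); google-auth provides google.oauth2)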
from fastapi import FastAPI, Request
from fastapi.responses import Response
from twilio.twiml.voice_response import VoiceResponse, Gather
from openai import OpenAI
from googleapiclient.discovery import build
from google.oauth2 import service_account
import json, datetime

app = FastAPI()
oa = OpenAI()
creds = service_account.Credentials.from_service_account_file(
    "gcal.json", scopes=["https://www.googleapis.com/auth/calendar"])
cal = build("calendar", "v3", credentials=creds)

TOOLS = [{
    "type": "function",
    "function": {
        "name": "book_meeting",
        "description": "Book a meeting on the calendar",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "start_iso": {"type": "string", "description": "ISO 8601 datetime"},
                "duration_minutes": {"type": "integer"},
                "attendee_email": {"type": "string"},
            },
            "required": ["title", "start_iso", "duration_minutes"],
        },
    },
}]

@app.post("/voice")
async def incoming_call():
    # Entry point, Twilio webhook on incoming call
    vr = VoiceResponse()
    gather = Gather(input="speech", action="/handle", speech_timeout="auto")
    gather.say("Hi! I can book a meeting for you. What would you like to schedule?")
    vr.append(gather)
    return Response(content=str(vr), media_type="application/xml")

@app.post("/handle")
async def handle_speech(request: Request):
    form = await request.form()
    user_said = form.get("SpeechResult", "")
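    # Timezone-aware timestamp, so the model sees the UTC offset when resolving "tomorrow at 3"
    now = datetime.datetime.now().astimezone().isoformat()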

    resp = oa.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"You are a scheduling assistant. Current time: {now}. Parse the user's request and call book_meeting."},
            {"role": "user", "content": user_said},
        ],
        tools=TOOLS,
    )
    msg = resp.choices[0].message
    vr = VoiceResponse()

    if msg.tool_calls:
        args = json.loads(msg.tool_calls[0].function.arguments)
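        start = datetime.datetime.fromisoformat(args["start_iso"])
        if start.tzinfo is None:
            # The Calendar API rejects a dateTime with no offset or timeZone; assume server-local time
            start = start.astimezone()
        end = start + datetime.timedelta(minutes=args["duration_minutes"])
        # With a service account, "primary" is the service account's own calendar.
        # To book on a person's calendar, share it with the service account and use its calendar ID.
        cal.events().insert(calendarId="primary", body={
            "summary": args["title"],
            "start": {"dateTime": start.isoformat()},
            "end": {"dateTime": end.isoformat()},
        }).execute()
        # %I is portable; the original %-I works only on Linux and macOS
        vr.say(f"Booked {args['title']} for {start.strftime('%A at %I:%M %p')}. Goodbye!")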
    else:
        vr.say("Sorry, I couldn't understand the request. Please try again.")
    vr.hangup()
    return Response(content=str(vr), media_type="application/xml")
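
The booking step hinges on turning the model's arguments into start and end timestamps that carry a UTC offset. A standalone sketch of that conversion, assuming a hypothetical helper build_event_body and a caller-supplied default timezone for inputs that arrive without an offset:

```python
from datetime import datetime, timedelta, timezone


def build_event_body(args: dict, default_tz: timezone = timezone.utc) -> dict:
    """Convert book_meeting arguments into a Calendar-style event body."""
    start = datetime.fromisoformat(args["start_iso"])
    if start.tzinfo is None:
        # Assumption: naive timestamps are interpreted in default_tz
        start = start.replace(tzinfo=default_tz)
    duration = int(args["duration_minutes"])
    if duration <= 0:
        raise ValueError("duration_minutes must be positive")
    end = start + timedelta(minutes=duration)
    return {
        "summary": args["title"],
        "start": {"dateTime": start.isoformat()},
        "end": {"dateTime": end.isoformat()},
    }


body = build_event_body(
    {"title": "Demo", "start_iso": "2026-03-02T15:00:00", "duration_minutes": 45}
)
print(body)
```

Keeping this logic in a pure function makes it testable without placing a phone call or touching a real calendar.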

Resume angle

"Built a telephony-driven voice agent that handles inbound calls, transcribes speech, extracts scheduling intent via LLM function-calling, and books events on Google Calendar. Full round-trip under 3 seconds."

Project 4: AI Code Review Agent

Stack [SCRAPED]

  • Python + GPT-4
  • Functions named in the slide: fetch_pr_diff(), review_code(), post_review()
  • GitHub PR webhooks, triggers on every PR submission
  • Suggested extensions: 1-10 quality score per file, secondary agent for concrete fix suggestions

Minimal implementation [STANDARD pattern matching the scraped function names]

# pip install fastapi uvicorn pygithub openai
from fastapi import FastAPI, Request, Header, HTTPException
from github import Github
from openai import OpenAI
import hmac, hashlib, os

app = FastAPI()
oa = OpenAI()
gh = Github(os.environ["GITHUB_TOKEN"])
WEBHOOK_SECRET = os.environ["WEBHOOK_SECRET"].encode()

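def verify_signature(body: bytes, signature: str):
    # A request without the header arrives as None; reject it instead of crashing in compare_digest
    if not signature:
        raise HTTPException(401, "Missing signature")
    expected = "sha256=" + hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise HTTPException(401, "Invalid signature")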

def fetch_pr_diff(repo_name: str, pr_number: int) -> list:
    """Fetch all file diffs from a PR."""
    repo = gh.get_repo(repo_name)
    pr = repo.get_pull(pr_number)
    return [{"filename": f.filename, "patch": f.patch or ""} for f in pr.get_files()]

def review_code(files: list) -> dict:
    """GPT-4 review, returns per-file comments + overall score."""
    diffs = "\n\n".join(f"### {f['filename']}\n```\n{f['patch']}\n```" for f in files)
    resp = oa.chat.completions.create(
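        # JSON mode (response_format below) is rejected by the original "gpt-4" model,
        # so a GPT-4-class model that supports it is used here.
        model="gpt-4o",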
        messages=[
            {"role": "system", "content": (
                "You are a senior code reviewer. For each file, identify bugs, "
                "security issues, performance problems, and style violations. "
                "Give each file a quality score 1-10. Return JSON: "
                '{"per_file": [{"filename": str, "score": int, "comments": [str]}], "overall": str}'
            )},
            {"role": "user", "content": diffs},
        ],
        response_format={"type": "json_object"},
    )
    import json
    return json.loads(resp.choices[0].message.content)

def post_review(repo_name: str, pr_number: int, review: dict):
    """Post review as a PR comment."""
    repo = gh.get_repo(repo_name)
    pr = repo.get_pull(pr_number)
    body = f"## Automated AI Review\n\n**Overall:** {review['overall']}\n\n"
    for f in review["per_file"]:
        body += f"### `{f['filename']}`, Score: {f['score']}/10\n"
        for c in f["comments"]:
            body += f"- {c}\n"
        body += "\n"
    pr.create_issue_comment(body)

@app.post("/webhook")
async def webhook(request: Request, x_hub_signature_256: str = Header(None)):
    body = await request.body()
    verify_signature(body, x_hub_signature_256)
    event = await request.json()

    # Only act on opened/synchronize PR events
    if event.get("action") not in ("opened", "synchronize"):
        return {"skipped": True}

    repo_name = event["repository"]["full_name"]
    pr_number = event["pull_request"]["number"]

    files = fetch_pr_diff(repo_name, pr_number)
    review = review_code(files)
    post_review(repo_name, pr_number, review)
    return {"reviewed": True, "files": len(files)}

Setup: expose your FastAPI server with ngrok, then in the GitHub repo go to Settings > Webhooks > Add webhook. Set the payload URL to https://<ngrok>/webhook, the content type to application/json, the secret to your WEBHOOK_SECRET, and the events to Pull requests.
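
Before pointing GitHub at the server, the signature check can be exercised locally. GitHub sends the X-Hub-Signature-256 header as sha256= followed by the hex HMAC-SHA256 of the raw request body, keyed with the webhook secret. A sketch with hypothetical helpers sign and is_valid and a made-up secret:

```python
import hashlib
import hmac
import json


def sign(secret: bytes, body: bytes) -> str:
    """Build the header value GitHub would send for this body."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid(secret: bytes, body: bytes, header) -> bool:
    """Constant-time comparison; a missing header is simply invalid."""
    if not header:
        return False
    return hmac.compare_digest(sign(secret, body), header)


secret = b"example-secret"  # made-up value for illustration
payload = json.dumps({"action": "opened"}).encode()
header = sign(secret, payload)

print(is_valid(secret, payload, header))         # True
print(is_valid(secret, payload + b" ", header))  # False: body was altered
print(is_valid(secret, payload, None))           # False: header missing
```

The signature covers the exact bytes received, so verify the raw body before parsing it as JSON.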

Scraped extension ideas

  1. Scoring (1-10 per file), already implemented above in review_code()
  2. Secondary fix-suggestion agent, take each comment, run a follow-up prompt that produces a code patch, post as inline review comment via pr.create_review(comments=[{path, position, body}])
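
Inline review comments need a position for each commented line. The sketch below assumes the rule in GitHub's review-comment documentation: position counts lines below the first @@ hunk header in a file's patch, and later hunk headers are counted as lines too. The helpers added_lines and to_review_comments are hypothetical, and the rule is worth confirming on a real PR before relying on it.

```python
import re

HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(patch: str) -> list[dict]:
    """List added lines with their diff position and new-file line number."""
    results = []
    position = 0      # lines below the first hunk header
    new_line = 0      # line number in the new version of the file
    in_hunk = False
    for raw in patch.splitlines():
        header = HUNK_HEADER.match(raw)
        if header:
            if in_hunk:
                position += 1  # assumption: later hunk headers occupy a position
            in_hunk = True
            new_line = int(header.group(1))
            continue
        if not in_hunk:
            continue
        position += 1
        if raw.startswith("+"):
            results.append({"position": position, "line": new_line, "text": raw[1:]})
            new_line += 1
        elif raw.startswith("-") or raw.startswith("\\"):
            continue  # removed lines and "\ No newline" markers are not in the new file
        else:
            new_line += 1  # unchanged context line
    return results


def to_review_comments(path: str, patch: str) -> list[dict]:
    """Shape one placeholder comment per added line as path/position/body dicts."""
    return [
        {"path": path, "position": a["position"], "body": f"Suggested fix for: {a['text']}"}
        for a in added_lines(patch)
    ]


patch = (
    "@@ -1,3 +1,4 @@\n"
    " import os\n"
    "+import sys\n"
    " \n"
    " def main():\n"
)
comments = to_review_comments("app.py", patch)
print(comments)
```

In the fix-suggestion agent, the placeholder body would be replaced by the patch text the follow-up prompt produces.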

Resume angle

"Shipped a GitHub-integrated code review bot. FastAPI webhook handler verifies HMAC signatures, pulls PR diffs via PyGithub, generates structured JSON reviews with GPT-4, and posts scored feedback per file on every PR."

Project 5: Full-Stack AI SaaS with Payments

Stack [TITLE ONLY, slide not extractable]

Stack [STANDARD, the canonical 2026 SaaS stack]

  • Next.js 14+ (App Router) + TypeScript, frontend and API routes
  • Clerk or Auth.js (NextAuth), authentication
  • Postgres (Neon or Supabase) + Prisma or Drizzle, database
  • Stripe, subscriptions and metered billing / credits
  • OpenAI API (or similar), AI feature
  • Vercel, hosting
  • shadcn/ui + Tailwind, UI
  • Credits model: Stripe webhook on subscription/purchase writes credit balance to DB. Every AI call decrements credits.
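
The credits model depends on the decrement being atomic, so two simultaneous requests cannot both spend the last credit. The idea is shown here in Python with SQLite purely for illustration (the table and the reserve_credit helper are hypothetical; the project itself would do this through Prisma or Drizzle against Postgres): a single conditional UPDATE both checks and deducts.

```python
import sqlite3


def reserve_credit(conn: sqlite3.Connection, user_id: str) -> bool:
    """Deduct one credit only if the balance allows it; True on success."""
    cur = conn.execute(
        "UPDATE users SET credits = credits - 1 WHERE id = ? AND credits >= 1",
        (user_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows changed means the user was out of credits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, credits INTEGER NOT NULL)")
conn.execute("INSERT INTO users VALUES ('u1', 2)")

results = [reserve_credit(conn, "u1") for _ in range(3)]
balance = conn.execute("SELECT credits FROM users WHERE id = 'u1'").fetchone()[0]
print(results, balance)
```

Reading the balance first and decrementing in a separate statement leaves a window in which concurrent requests can overspend; the conditional update closes it.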

Architecture [STANDARD pattern]

┌─────────────────┐     ┌──────────────────┐
│  Next.js UI     │────▶│  /api/ai         │──▶ OpenAI
│  (App Router)   │     │  (check credits, │
│  Clerk auth     │     │   deduct, call)  │──▶ DB (decrement)
└─────────────────┘     └──────────────────┘
         │
         │ Subscribe / Buy credits
         ▼
┌─────────────────┐     ┌──────────────────┐
│ Stripe Checkout │────▶│  /api/webhooks/  │──▶ DB (add credits,
│                 │     │   stripe         │    update subscription)
└─────────────────┘     └──────────────────┘

Key code [STANDARD pattern]

app/api/ai/route.ts, gated AI endpoint:

import { auth } from "@clerk/nextjs/server";
import { db } from "@/lib/db";
import { OpenAI } from "openai";
import { NextResponse } from "next/server";

const openai = new OpenAI();

export async function POST(req: Request) {
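  // auth() is async in recent Clerk releases; awaiting is harmless where it is still synchronous
  const { userId } = await auth();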
  if (!userId) return NextResponse.json({ error: "unauthorized" }, { status: 401 });

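  // Reserve one credit atomically before calling the model. The credits filter
  // stops concurrent requests from spending the same last credit.
  const reserved = await db.user.updateMany({
    where: { clerkId: userId, credits: { gte: 1 } },
    data: { credits: { decrement: 1 } },
  });
  if (reserved.count === 0) {
    return NextResponse.json({ error: "Out of credits" }, { status: 402 });
  }

  const { prompt } = await req.json();
  try {
    const resp = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: prompt }],
    });
    return NextResponse.json({ output: resp.choices[0].message.content });
  } catch {
    // Refund the reserved credit if the model call fails
    await db.user.update({
      where: { clerkId: userId },
      data: { credits: { increment: 1 } },
    });
    return NextResponse.json({ error: "AI request failed" }, { status: 502 });
  }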
}

app/api/webhooks/stripe/route.ts, credit top-up on purchase:

import Stripe from "stripe";
import { db } from "@/lib/db";
import { NextResponse } from "next/server";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export async function POST(req: Request) {
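  // Signature verification needs the raw, unparsed body
  const body = await req.text();
  const sig = req.headers.get("stripe-signature");
  let event: Stripe.Event;
  try {
    // constructEvent throws when the signature is missing or does not match
    event = stripe.webhooks.constructEvent(
      body, sig ?? "", process.env.STRIPE_WEBHOOK_SECRET!
    );
  } catch {
    return NextResponse.json({ error: "Invalid signature" }, { status: 400 });
  }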

  if (event.type === "checkout.session.completed") {
    const session = event.data.object as Stripe.Checkout.Session;
    const userId = session.metadata!.clerkId;
    const credits = Number(session.metadata!.credits);
    await db.user.update({
      where: { clerkId: userId },
      data: { credits: { increment: credits } },
    });
  }
  return NextResponse.json({ received: true });
}
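
Stripe can deliver the same webhook event more than once, so a top-up handler is safer if it records event IDs and skips repeats. A Python and SQLite sketch of that idea (the schema and apply_topup are hypothetical illustrations, not part of the stack above):

```python
import sqlite3


def apply_topup(conn: sqlite3.Connection, event_id: str, user_id: str, credits: int) -> bool:
    """Add credits once per event ID; return False for a repeated delivery."""
    with conn:  # one transaction: both statements commit together or not at all
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (id) VALUES (?)", (event_id,)
        )
        if cur.rowcount == 0:
            return False  # event already handled, do not credit again
        conn.execute(
            "UPDATE users SET credits = credits + ? WHERE id = ?", (credits, user_id)
        )
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, credits INTEGER NOT NULL)")
conn.execute("CREATE TABLE processed_events (id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users VALUES ('u1', 10)")

first = apply_topup(conn, "evt_123", "u1", 50)
second = apply_topup(conn, "evt_123", "u1", 50)  # simulated redelivery
balance = conn.execute("SELECT credits FROM users WHERE id = 'u1'").fetchone()[0]
print(first, second, balance)
```

The same pattern carries over to Postgres with a unique constraint on the event ID column.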

Prisma schema:

model User {
  id        String  @id @default(cuid())
  clerkId   String  @unique
  email     String  @unique
  credits   Int     @default(10)  // free starter credits
  stripeId  String?
  createdAt DateTime @default(now())
}

Resume angle

"Shipped a full-stack AI SaaS with authenticated users, metered credit billing via Stripe webhooks, and AI feature gating. Every inference deducts from a server-authoritative credit balance, preventing abuse and tying revenue to model cost."

How to actually use this

The Instagram post's value was the project ideas and high-level stacks, not novel engineering. Each of these projects is a solid 1-3 day build. Priority order for resume impact:

  1. Project 5 (SaaS), hardest to fake, shows end-to-end product thinking, has real users and payments
  2. Project 4 (Code Review Agent), ships to a real system (GitHub), demonstrable on your own repos
  3. Project 2 (Multi-Agent Research), "agentic" is the 2026 buzzword, LangGraph is the right framework
  4. Project 3 (Voice Bot), visually and audibly impressive in demos
  5. Project 1 (RAG), table-stakes by 2026, but a solid foundation project

Confidence summary

Project | What's scraped vs. standard
1. RAG Q&A | Stack scraped. Code is standard FastAPI+Chroma pattern
2. Multi-Agent | LangGraph scraped. Agent roles are standard planner, search, summarize, write
3. Voice Bot | Only title scraped. Full stack is standard Twilio+OpenAI+GCal
4. Code Review | Stack and function names scraped. Code matches those signatures
5. SaaS | Only title scraped. Full stack is standard Next.js+Stripe+Clerk

Common questions

What tech stack do I need to build a RAG document Q&A system for my portfolio?

You need FastAPI for the REST API, pdfplumber for PDF text extraction, OpenAI text-embedding-3-small for embeddings, ChromaDB as the vector store, and GPT-4o-mini for answer generation. The system chunks uploaded PDFs, embeds them, and retrieves relevant chunks to generate grounded answers to user questions.

How do I implement a multi-agent research bot with LangGraph?

Use LangGraph StateGraph to orchestrate specialized agents (planner, searcher, summarizer, writer) that pass state through a deterministic graph. The planner breaks the topic into sub-questions, the searcher queries Tavily API for each, the summarizer synthesizes findings using an LLM, and the writer produces a final report. This demonstrates orchestration beyond single-prompt LLM usage.

What are the two main approaches to building an AI voice scheduling bot?

Path A composes primitives: Twilio Voice for call handling, Whisper or Twilio speech recognition, GPT-4o with function calling for intent parsing, Google Calendar API for booking, and ElevenLabs or Twilio Say for text-to-speech. Path B uses a voice-agent platform like VAPI.ai or Retell AI that handles STT/TTS/streaming while you write the prompt and tools. Path A is more resume-impressive because it shows deeper technical control.

How does an AI code review agent integrate with GitHub pull requests?

The agent uses GitHub PR webhooks that trigger on every pull request submission. It fetches the PR diff via the GitHub API, sends the code to GPT-4 for review (identifying bugs, security issues, performance problems, and style violations), scores each file 1 to 10, and posts the results as a PR comment. You configure the webhook in GitHub repo settings to point to your FastAPI server endpoint.

What should I write on my resume for a RAG document Q&A project?

Write that you built a production-ready document Q&A service with semantic retrieval and grounded generation, and that you chunked, embedded, and indexed arbitrary PDFs against a persistent vector store. This framing emphasizes the production-ready architecture and retrieval engineering rather than just API integration.
