5 AI Projects for Your Resume: Full Technical Breakdown

Source post: https://www.instagram.com/p/DV-HQHHk9tD/ (datasciencebrain, 12-slide carousel)
Hook: "Hiring managers don't care about your certificates. They care about what you've actually built."
Scraping note: Instagram blocked per-slide image OCR. Text fragments leaked through for Projects 1, 2, and 4. For Projects 3 and 5, only titles leaked. Each section below is labeled:
- [SCRAPED]: confirmed from the post.
- [STANDARD]: the conventional industry stack for this project type, filled in because the slide content couldn't be extracted. Treat it as a safe default, not a literal copy.
Project 1: RAG Document Q&A System
Stack [SCRAPED]
- FastAPI, REST API framework
- pdfplumber, PDF text extraction
- OpenAI text-embedding-3-small, embeddings
- ChromaDB, vector store
- GPT-4o-mini, answer generation
What it does
User uploads a PDF. System chunks, embeds, and stores it. User asks questions. System retrieves relevant chunks and generates grounded answers.
Minimal implementation [STANDARD pattern for this stack]
# pip install fastapi uvicorn pdfplumber chromadb openai python-multipart
from fastapi import FastAPI, UploadFile, File
import pdfplumber, chromadb, uuid
from openai import OpenAI

app = FastAPI()
oa = OpenAI()
chroma = chromadb.PersistentClient(path="./chroma_db")
collection = chroma.get_or_create_collection("docs")

def embed(texts):
    return [e.embedding for e in oa.embeddings.create(
        model="text-embedding-3-small", input=texts).data]

def chunk(text, size=800, overlap=100):
    # fixed-size character windows; the overlap keeps sentences that straddle
    # a boundary retrievable from at least one chunk
    return [text[i:i+size] for i in range(0, len(text), size - overlap)]

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    with pdfplumber.open(file.file) as pdf:
        text = "\n".join(p.extract_text() or "" for p in pdf.pages)
    chunks = chunk(text)
    collection.add(
        ids=[str(uuid.uuid4()) for _ in chunks],
        documents=chunks,
        embeddings=embed(chunks),
    )
    return {"chunks_indexed": len(chunks)}

@app.post("/ask")
async def ask(question: str):
    q_emb = embed([question])[0]
    hits = collection.query(query_embeddings=[q_emb], n_results=4)
    context = "\n\n".join(hits["documents"][0])
    resp = oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using ONLY the context. If not in context, say you don't know."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return {"answer": resp.choices[0].message.content}
Resume angle
"Built a production-ready document Q&A service with semantic retrieval and grounded generation. Chunked, embedded, and indexed arbitrary PDFs against a persistent vector store."
Project 2: Multi-Agent Research Bot
Stack [SCRAPED partial]
- Python + LangGraph (StateGraph was explicitly mentioned in the slide text)
- Multiple LLM-powered agents in an orchestrated workflow
Stack [STANDARD defaults]
- LangGraph, orchestration
- Tavily API, web search tool (free tier, designed for agents)
- GPT-4o or Claude Sonnet, agent brains
- Agents: Planner, Searcher, Summarizer, Writer
Minimal implementation [STANDARD pattern]
# pip install langgraph langchain langchain-openai tavily-python
from typing_extensions import TypedDict
from typing import List
from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI
from tavily import TavilyClient
import os

llm = ChatOpenAI(model="gpt-4o", temperature=0)
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

class ResearchState(TypedDict):
    topic: str
    plan: List[str]       # sub-questions
    findings: List[str]   # raw search results
    summary: str          # synthesized notes
    report: str           # final writeup

def planner(state):
    prompt = f"Break this research topic into 3-5 searchable sub-questions:\n{state['topic']}"
    plan = llm.invoke(prompt).content.split("\n")
    return {"plan": [p for p in plan if p.strip()]}

def searcher(state):
    findings = []
    for q in state["plan"]:
        results = tavily.search(q, max_results=3)
        findings.append(f"Q: {q}\n" + "\n".join(r["content"] for r in results["results"]))
    return {"findings": findings}

def summarizer(state):
    joined = "\n\n".join(state["findings"])
    prompt = f"Synthesize the key facts from these findings:\n\n{joined}"
    return {"summary": llm.invoke(prompt).content}

def writer(state):
    prompt = (f"Write a polished research report on '{state['topic']}' "
              f"using these synthesized notes:\n\n{state['summary']}")
    return {"report": llm.invoke(prompt).content}

graph = StateGraph(ResearchState)
graph.add_node("plan", planner)
graph.add_node("search", searcher)
graph.add_node("summarize", summarizer)
graph.add_node("write", writer)
graph.add_edge(START, "plan")
graph.add_edge("plan", "search")
graph.add_edge("search", "summarize")
graph.add_edge("summarize", "write")
graph.add_edge("write", END)
app = graph.compile()

print(app.invoke({"topic": "Impact of Mamba architecture on long-context LLMs"})["report"])
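
For demos it helps to watch each agent fire rather than waiting on the final state. A minimal sketch using LangGraph's standard streaming API (not mentioned in the post):

# [Not from the post] With the default stream_mode="updates", each yielded
# item is a {node_name: partial_state_update} dict, one per completed node.
for update in app.stream({"topic": "Impact of Mamba architecture on long-context LLMs"}):
    node, out = next(iter(update.items()))
    print(f"--- {node} done, state keys updated: {list(out)} ---")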
Resume angle
"Designed a multi-agent research pipeline using LangGraph where specialized agents (planner, web searcher, summarizer, writer) pass state through a deterministic graph, demonstrating orchestration beyond single-prompt LLM usage."
Project 3: AI Voice Scheduling Bot
Stack [TITLE ONLY, slide not extractable]
Stack [STANDARD, the two canonical paths]
Path A, compose from primitives (more resume-impressive):
- Twilio Voice, phone number, call handling (webhook on incoming call)
- Twilio <Gather> + speech recognition OR OpenAI Whisper, speech-to-text
- GPT-4o (with function calling), intent parsing and slot filling (name, time, duration)
- Google Calendar API, create event
- ElevenLabs or Twilio <Say>, text-to-speech response
- FastAPI, webhook server
Path B, use a voice-agent platform:
- VAPI.ai or Retell AI, handles STT/TTS/streaming. You write the prompt and tools.
- Google Calendar API, booking tool exposed to the agent
- Faster to build, less impressive on a resume
Minimal implementation (Path A) [STANDARD pattern]
# pip install fastapi twilio openai google-api-python-client
from fastapi import FastAPI, Request
from fastapi.responses import Response
from twilio.twiml.voice_response import VoiceResponse, Gather
from openai import OpenAI
from googleapiclient.discovery import build
from google.oauth2 import service_account
import json, datetime

app = FastAPI()
oa = OpenAI()
creds = service_account.Credentials.from_service_account_file(
    "gcal.json", scopes=["https://www.googleapis.com/auth/calendar"])
cal = build("calendar", "v3", credentials=creds)

TOOLS = [{
    "type": "function",
    "function": {
        "name": "book_meeting",
        "description": "Book a meeting on the calendar",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "start_iso": {"type": "string", "description": "ISO 8601 datetime"},
                "duration_minutes": {"type": "integer"},
                "attendee_email": {"type": "string"},
            },
            "required": ["title", "start_iso", "duration_minutes"],
        },
    },
}]

@app.post("/voice")
async def incoming_call():
    # Entry point: Twilio webhook fired on an incoming call
    vr = VoiceResponse()
    # the Twilio Python lib takes snake_case kwargs (speech_timeout, not speechTimeout)
    gather = Gather(input="speech", action="/handle", speech_timeout="auto")
    gather.say("Hi! I can book a meeting for you. What would you like to schedule?")
    vr.append(gather)
    return Response(content=str(vr), media_type="application/xml")

@app.post("/handle")
async def handle_speech(request: Request):
    form = await request.form()
    user_said = form.get("SpeechResult", "")
    now = datetime.datetime.now().isoformat()
    resp = oa.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"You are a scheduling assistant. Current time: {now}. Parse the user's request and call book_meeting."},
            {"role": "user", "content": user_said},
        ],
        tools=TOOLS,
    )
    msg = resp.choices[0].message
    vr = VoiceResponse()
    if msg.tool_calls:
        args = json.loads(msg.tool_calls[0].function.arguments)
        start = datetime.datetime.fromisoformat(args["start_iso"])
        end = start + datetime.timedelta(minutes=args["duration_minutes"])
        cal.events().insert(calendarId="primary", body={
            "summary": args["title"],
            # the Calendar API requires timeZone when dateTime carries no UTC offset;
            # UTC here is an assumption, swap in your local zone
            "start": {"dateTime": start.isoformat(), "timeZone": "UTC"},
            "end": {"dateTime": end.isoformat(), "timeZone": "UTC"},
        }).execute()
        vr.say(f"Booked {args['title']} for {start.strftime('%A at %-I %p')}. Goodbye!")  # %-I is POSIX-only
    else:
        vr.say("Sorry, I couldn't understand the request. Please try again.")
    vr.hangup()
    return Response(content=str(vr), media_type="application/xml")
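
You can exercise /handle without placing a real call by posting the same form-encoded payload Twilio would send. A sketch, assuming the server runs on localhost:8000:

# [Not from the post] Simulate Twilio's webhook locally; SpeechResult is the
# form field Twilio populates with the caller's transcribed speech.
import requests

r = requests.post(
    "http://localhost:8000/handle",
    data={"SpeechResult": "Book a 30 minute sync with Alex tomorrow at 2 pm"},
)
print(r.text)  # TwiML, e.g. <Response><Say>Booked ...</Say><Hangup/></Response>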
Resume angle
"Built a telephony-driven voice agent that handles inbound calls, transcribes speech, extracts scheduling intent via LLM function-calling, and books events on Google Calendar. Full round-trip under 3 seconds."
Project 4: AI Code Review Agent
Stack [SCRAPED]
- Python + GPT-4
- Functions named in the slide: fetch_pr_diff(), review_code(), post_review()
- GitHub PR webhooks, triggers on every PR submission
- Suggested extensions: 1-10 quality score per file, secondary agent for concrete fix suggestions
Minimal implementation [STANDARD pattern matching the scraped function names]
# pip install fastapi uvicorn pygithub openai
from fastapi import FastAPI, Request, Header, HTTPException
from github import Github
from openai import OpenAI
import hmac, hashlib, json, os

app = FastAPI()
oa = OpenAI()
gh = Github(os.environ["GITHUB_TOKEN"])
WEBHOOK_SECRET = os.environ["WEBHOOK_SECRET"].encode()

def verify_signature(body: bytes, signature: str):
    expected = "sha256=" + hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise HTTPException(401, "Invalid signature")

def fetch_pr_diff(repo_name: str, pr_number: int) -> list:
    """Fetch all file diffs from a PR."""
    repo = gh.get_repo(repo_name)
    pr = repo.get_pull(pr_number)
    return [{"filename": f.filename, "patch": f.patch or ""} for f in pr.get_files()]

def review_code(files: list) -> dict:
    """LLM review: returns per-file comments plus an overall summary."""
    diffs = "\n\n".join(f"### {f['filename']}\n```\n{f['patch']}\n```" for f in files)
    resp = oa.chat.completions.create(
        # JSON mode (response_format) needs a model that supports it; the base
        # "gpt-4" model named in the slide does not, so gpt-4o is used here
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are a senior code reviewer. For each file, identify bugs, "
                "security issues, performance problems, and style violations. "
                "Give each file a quality score 1-10. Return JSON: "
                '{"per_file": [{"filename": str, "score": int, "comments": [str]}], "overall": str}'
            )},
            {"role": "user", "content": diffs},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

def post_review(repo_name: str, pr_number: int, review: dict):
    """Post the review as a PR comment."""
    repo = gh.get_repo(repo_name)
    pr = repo.get_pull(pr_number)
    body = f"## Automated AI Review\n\n**Overall:** {review['overall']}\n\n"
    for f in review["per_file"]:
        body += f"### `{f['filename']}` — Score: {f['score']}/10\n"
        for c in f["comments"]:
            body += f"- {c}\n"
        body += "\n"
    pr.create_issue_comment(body)

@app.post("/webhook")
async def webhook(request: Request, x_hub_signature_256: str = Header(None)):
    body = await request.body()
    verify_signature(body, x_hub_signature_256)
    event = await request.json()
    # Only act on opened/synchronize PR events
    if event.get("action") not in ("opened", "synchronize"):
        return {"skipped": True}
    repo_name = event["repository"]["full_name"]
    pr_number = event["pull_request"]["number"]
    files = fetch_pr_diff(repo_name, pr_number)
    review = review_code(files)
    post_review(repo_name, pr_number, review)
    return {"reviewed": True, "files": len(files)}
Setup: expose your FastAPI server with ngrok, then in the GitHub repo under Settings > Webhooks add https://<ngrok>/webhook with content type application/json, secret set to your WEBHOOK_SECRET, and events limited to Pull requests.
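
Before wiring up GitHub, you can verify the HMAC path end-to-end by signing a fake PR event yourself. A sketch, not from the post, assuming the server is on localhost:8000 and that you/your-repo#1 is a placeholder for a real PR your token can read:

# [Not from the post] Post a self-signed "opened" event at the local server.
import hmac, hashlib, json, os, requests

payload = json.dumps({
    "action": "opened",
    "repository": {"full_name": "you/your-repo"},   # placeholder repo
    "pull_request": {"number": 1},                  # placeholder PR number
}).encode()
sig = "sha256=" + hmac.new(os.environ["WEBHOOK_SECRET"].encode(),
                           payload, hashlib.sha256).hexdigest()
print(requests.post(
    "http://localhost:8000/webhook",
    data=payload,
    headers={"X-Hub-Signature-256": sig, "Content-Type": "application/json"},
).json())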
Scraped extension ideas
- Scoring (1-10 per file): already implemented above in review_code()
- Secondary fix-suggestion agent: take each review comment, run a follow-up prompt that produces a concrete code patch, and post it as an inline review comment via pr.create_review(comments=[{path, position, body}]); a sketch follows below
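
A minimal sketch of that secondary agent (not from the post), reusing the gh and oa clients defined above. The position=1 anchor is a placeholder; real inline placement means mapping each comment back to the offending line's position in the file's diff.

# [Not from the post] One follow-up prompt per review comment, posted together
# as inline comments in a single PR review.
def suggest_fixes(repo_name: str, pr_number: int, review: dict):
    pr = gh.get_repo(repo_name).get_pull(pr_number)
    inline = []
    for f in review["per_file"]:
        for c in f["comments"]:
            fix = oa.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content":
                    f"Propose a concrete code fix for this review comment on {f['filename']}:\n{c}"}],
            ).choices[0].message.content
            # position=1 pins the comment to the first line of the file's diff;
            # compute the real diff position for production use
            inline.append({"path": f["filename"], "position": 1, "body": fix})
    pr.create_review(body="Automated fix suggestions", event="COMMENT", comments=inline)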
Resume angle
"Shipped a GitHub-integrated code review bot. FastAPI webhook handler verifies HMAC signatures, pulls PR diffs via PyGithub, generates structured JSON reviews with GPT-4, and posts scored feedback per file on every PR."
Project 5: Full-Stack AI SaaS with Payments
Stack [TITLE ONLY, slide not extractable]
Stack [STANDARD, the canonical 2026 SaaS stack]
- Next.js 14+ (App Router) + TypeScript, frontend and API routes
- Clerk or Auth.js (NextAuth), authentication
- Postgres (Neon or Supabase) + Prisma or Drizzle, database
- Stripe, subscriptions and metered billing / credits
- OpenAI API (or similar), AI feature
- Vercel, hosting
- shadcn/ui + Tailwind, UI
- Credits model: Stripe webhook on subscription/purchase writes credit balance to DB. Every AI call decrements credits.
Architecture [STANDARD pattern]
┌─────────────────┐      ┌──────────────────┐
│  Next.js UI     │─────▶│  /api/ai         │──▶ OpenAI
│  (App Router)   │      │  (check credits, │
│  Clerk auth     │      │  deduct, call)   │──▶ DB (decrement)
└─────────────────┘      └──────────────────┘
        │
        │ Subscribe / Buy credits
        ▼
┌─────────────────┐      ┌──────────────────┐
│ Stripe Checkout │─────▶│  /api/webhooks/  │──▶ DB (add credits,
│                 │      │  stripe          │    update subscription)
└─────────────────┘      └──────────────────┘
Key code [STANDARD pattern]
app/api/ai/route.ts, gated AI endpoint:
import { auth } from "@clerk/nextjs/server";
import { db } from "@/lib/db";
import { OpenAI } from "openai";
import { NextResponse } from "next/server";

const openai = new OpenAI();

export async function POST(req: Request) {
  const { userId } = await auth(); // auth() is async in current @clerk/nextjs
  if (!userId) return NextResponse.json({ error: "unauthorized" }, { status: 401 });
  const user = await db.user.findUnique({ where: { clerkId: userId } });
  if (!user || user.credits < 1) {
    return NextResponse.json({ error: "Out of credits" }, { status: 402 });
  }
  const { prompt } = await req.json();
  const resp = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: prompt }],
  });
  await db.user.update({
    where: { clerkId: userId },
    data: { credits: { decrement: 1 } },
  });
  return NextResponse.json({ output: resp.choices[0].message.content });
}
app/api/webhooks/stripe/route.ts, credit top-up on purchase:
import Stripe from "stripe";
import { db } from "@/lib/db";
import { NextResponse } from "next/server";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export async function POST(req: Request) {
  // Signature verification needs the raw body, not parsed JSON
  const body = await req.text();
  const sig = req.headers.get("stripe-signature")!;
  const event = stripe.webhooks.constructEvent(
    body, sig, process.env.STRIPE_WEBHOOK_SECRET!
  );
  if (event.type === "checkout.session.completed") {
    const session = event.data.object as Stripe.Checkout.Session;
    const userId = session.metadata!.clerkId;
    const credits = Number(session.metadata!.credits);
    await db.user.update({
      where: { clerkId: userId },
      data: { credits: { increment: credits } },
    });
  }
  return NextResponse.json({ received: true });
}
Prisma schema:
model User {
  id        String   @id @default(cuid())
  clerkId   String   @unique
  email     String   @unique
  credits   Int      @default(10) // free starter credits
  stripeId  String?
  createdAt DateTime @default(now())
}
Resume angle
"Shipped a full-stack AI SaaS with authenticated users, metered credit billing via Stripe webhooks, and AI feature gating. Every inference deducts from a server-authoritative credit balance, preventing abuse and tying revenue to model cost."
How to actually use this
The Instagram post's value was the project ideas and high-level stacks, not novel engineering. Each of these projects is a solid 1-3 day build. Priority order for resume impact:
- Project 5 (SaaS), hardest to fake, shows end-to-end product thinking, has real users and payments
- Project 4 (Code Review Agent), ships to a real system (GitHub), demonstrable on your own repos
- Project 2 (Multi-Agent Research), "agentic" is the 2026 buzzword, LangGraph is the right framework
- Project 3 (Voice Bot), visually and audibly impressive in demos
- Project 1 (RAG), table-stakes by 2026, but a solid foundation project
Confidence summary
| Project | What's scraped vs. standard |
|---|---|
| 1. RAG Q&A | Stack scraped. Code is standard FastAPI+Chroma pattern |
| 2. Multi-Agent | LangGraph scraped. Agent roles are standard planner, search, summarize, write |
| 3. Voice Bot | Only title scraped. Full stack is standard Twilio+OpenAI+GCal |
| 4. Code Review | Stack and function names scraped. Code matches those signatures |
| 5. SaaS | Only title scraped. Full stack is standard Next.js+Stripe+Clerk |