The Prompt Engineering Playbook for Mid-Market Marketing & Operations Teams
White Paper

Jake McCluskey

If your team of fifteen marketers and ops coordinators has been on ChatGPT or Claude for a year and the output still reads like a tired intern wrote it, the problem is not the model. The problem is that nobody on the team writes a prompt the same way twice, nobody saves the good ones, and nobody measures whether the AI is faster than the old workflow. The fix is not training everyone to be a prompt engineer. The fix is giving the team a four-part skeleton, a shared prompt library with about five files in it, and two settings that almost nobody touches. That is the entire playbook. Below is how to install it in two weeks, what to track to know it worked, and the three mistakes I see mid-market teams repeat across every engagement.

This is a team productivity problem, not an AI literacy problem

Every mid-market team I scope into has the same shape. Five to fifty people, a mix of marketing, ops, sales, and admin. Most of them have a personal ChatGPT Plus account paid out of pocket or expensed. A few use Claude. One or two have tried Gemini because it is in their Google Workspace. They are getting value, but it is uneven. The marketing manager saves four hours a week. The ops coordinator saves zero because she keeps getting outputs that are technically correct and totally unusable.

The reflex from leadership is to schedule an AI literacy training. That is the wrong move. AI literacy training teaches people what a token is and how a transformer works. None of that helps your social media coordinator write a better caption. What helps is a shared structure for how prompts get written, a place to store the ones that work, and a manager who looks at the output and says yes or no instead of shrugging.

In my engagements, the teams that get five times more output from AI in ninety days are not the teams with the most technical staff. They are the teams that treat prompts like SOPs. Written down, version controlled, owned by someone, reviewed quarterly. The teams that stay stuck are the ones still treating prompts like a private hobby that everyone does in their own browser tab.

The four-part prompt skeleton: Role, Context, Constraint, Format

If your team learns one thing this quarter, it should be this skeleton. Every prompt that gets reused on your team should have these four parts in this order. The skeleton works for GPT-4o, Claude Sonnet 4.5, Gemini 2.5 Pro, and any model that ships in the next two years. It is model-agnostic because it maps to how humans give a junior employee an assignment.

Role. Tell the model what kind of person it is and what expertise to draw on. Two sentences. "You are a B2B content strategist with eight years writing for industrial manufacturers. You understand the buyer is a plant manager, not a CMO."

Context. Tell the model what it is working with. The product, the audience, the situation, the prior work. This is where most prompts fail. People write a one-line request and expect a model with no memory of the company to produce on-brand work. Paste the brand voice guide. Paste three previous posts. Paste the buyer persona. Three to five paragraphs is normal here.

Constraint. Tell the model what it cannot do. "Do not use any of the corporate filler words on our Reject List (paste it in). Do not write more than 180 words. Do not include statistics you cannot cite." Constraints are where you encode taste. Most teams skip this entirely and then complain the output sounds like a sales brochure.

Format. Tell the model the exact shape of the output. "Return three options. Each option should be a single paragraph between 60 and 90 words. Number them 1, 2, 3." Without a format instruction, models default to padded prose with bullet headings. Specify the shape and you get the shape.
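
If your team wants to enforce the skeleton beyond copy-and-paste, a minimal sketch like the one below does the job; the helper function and its field names are illustrative, and a shared Notion template or spreadsheet serves the same purpose.

    # Minimal sketch: assemble the four parts in order so every reused prompt
    # on the team has the same shape. Names here are illustrative, not a tool.
    def build_prompt(role: str, context: str, constraint: str, format_spec: str) -> str:
        return "\n\n".join([
            f"Role: {role}",
            f"Context: {context}",
            f"Constraint: {constraint}",
            f"Format: {format_spec}",
        ])

Anyone who fills in the four arguments gets a prompt with the same shape every time, which is the whole point of the skeleton.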

A worked example for a content brief

Here is the same task written two ways. The bad version: "Write a LinkedIn post about our new HVAC service plan." The good version uses the skeleton.

Role: You are a B2B content writer with five years writing for facilities management buyers. You write the way a plant manager talks, not the way a CMO talks.

Context: Our company is Ridgeline Mechanical, a 40-person commercial HVAC contractor in Cincinnati. We just launched a quarterly preventive maintenance plan priced at $2,400 a year for buildings under 50,000 square feet. The buyer is the facilities director at a mid-size manufacturer. They have been burned by reactive break-fix vendors and care more about predictable budgets than slick service. Our brand voice is direct, blue-collar, no jargon. Our last three LinkedIn posts averaged 11 reactions and got two real comments each.

Constraint: Do not use the words partner, solution, journey, or unlock. Do not promise specific savings numbers. Keep it under 200 words. No emojis. No hashtags inside the body.

Format: Return three options. Each option should open with a one-line hook, have two body paragraphs, and end with a single question that invites a reply. Append a separate line at the bottom with three hashtags.

The bad version produces a generic post that sounds like every other contractor on LinkedIn. The good version produces three options the marketing director can pick from in under a minute. Same model, same time spent, completely different ROI.

Job-function templates: the five your team actually needs

You do not need fifty prompt templates. You need five, one per major function, that get refined over time. Below is what each one covers and what to put in it, with a sketch of how one might be stored right after the list.

Marketing: content brief and social caption. The skeleton above, plus a section that lists your three to five competitor URLs, your last 10 posts as voice samples, and a target keyword if SEO matters. A team of three marketers using a shared content brief prompt should be producing five drafts a day instead of two.

Ops: SOP draft and process documentation. Role is "operations analyst documenting a process for a new hire." Context is a transcript or bullet list of how the process works today. Constraint is "do not invent steps. If a step is unclear, list it as a question to verify." Format is numbered steps with sub-bullets for substeps, plus a separate "Open Questions" section at the end. I have watched ops directors turn a 90-minute Loom recording into a 12-page SOP in 20 minutes using this pattern.

Sales: lead research and outreach draft. Role is "B2B sales researcher who only reports facts you can verify." Context is the prospect's company URL, LinkedIn, and a recent press release. Constraint is "do not speculate about pain points. Only report what you can quote." Format is a five-section brief: company snapshot, recent news, likely buyer titles, three conversation hooks with sources, and a draft 90-word outreach email. The sales rep edits, does not generate from scratch.

Customer support: response draft. Role is "support specialist following our voice guidelines." Context is the brand voice doc, the customer's original message, and the relevant help article URL. Constraint is "do not promise refunds, escalations, or product changes. Flag anything that needs a human decision." Format is the email body plus a separate "Internal Notes for Reviewer" section. A team of two support staff can clear a 40-ticket backlog in a Friday afternoon with this pattern.

Exec admin: meeting summary and email triage. Role is "executive assistant who has worked with this leader for three years." Context is the meeting transcript or the email thread, plus a one-paragraph note on the leader's communication style. Constraint is "do not invent action items. If an action item lacks an owner or deadline, flag it." Format is a TL;DR, a Decisions list, an Action Items table with owner and date, and a Risks section. This one alone saves an EA five hours a week.
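
To show what one of these looks like written down rather than remembered, here is the ops SOP template as a structured entry; the field names, owner, and review date are illustrative, and the same shape fits a Notion database row or a spreadsheet just as well.

    # Illustrative entry for the ops SOP template. The owner and last_reviewed
    # fields exist so a named role is accountable for keeping it current.
    sop_template = {
        "owner": "Ops Director",
        "last_reviewed": "2025-Q3",  # hypothetical; update at each quarterly review
        "role": "You are an operations analyst documenting a process for a new hire.",
        "context": "<paste the transcript or bullet list of how the process works today>",
        "constraint": "Do not invent steps. If a step is unclear, list it as a question to verify.",
        "format": ("Numbered steps with sub-bullets for substeps, plus a separate "
                   "'Open Questions' section at the end."),
    }

Feed those four values to a helper like the build_prompt sketch above, paste in the transcript, and the prompt is ready to run.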

The two settings nobody touches but should

Almost every model has two knobs that change the output dramatically and almost nobody on a mid-market team has touched either one. They are temperature and context window usage. Knowing when to push each is the difference between a prompt that works on Tuesday and fails on Thursday.

Temperature. This is the randomness setting. Most consumer chat interfaces hide it, but the API exposes it and some team tools surface it. At temperature 0 the model picks the most likely next word at every step, so the same input produces nearly the same output every time. At temperature 1 it samples more widely and gets more creative. The default in most chat interfaces is around 0.7, which is a compromise that is wrong for most business tasks.

Push temperature to 0.2 or lower for any task where you want consistency. SOP drafts, customer support replies, data extraction, summarization, contract review. You want the same input to produce the same output. Push temperature to 0.9 or higher for ideation, naming, headline brainstorms, and creative campaign concepts. You want variety. The trap most teams fall into is using the default 0.7 for everything and getting medium-quality output across the board because the model is hedging on every task.
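
If anyone on the team works through the API rather than the chat window, setting temperature is a single argument. The sketch below uses the OpenAI Python client; the model name and the task are only examples.

    # Sketch: set temperature explicitly through the OpenAI Python client.
    # Requires the openai package and an OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # example model; use whatever your plan includes
        messages=[{"role": "user", "content": "Draft the SOP from the transcript that follows."}],
        temperature=0.2,  # low for consistency; 0.9 or higher for ideation
    )
    print(response.choices[0].message.content)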

Context window. Modern models hold between 128,000 and 200,000 tokens in working memory, and some hold more; 200,000 tokens is roughly 500 pages of text. Most users paste in 200 words and wonder why the output is generic. The model can hold a lot more than you are giving it. For any high-stakes task, paste the full brand voice guide, the last 20 posts, the buyer persona, and the competitor analysis. The model will not get confused. It will get sharper.

The flip side: do not stuff the context window with material that contradicts itself. If you paste three brand voice guides from three different agencies because nobody knows which one is current, the model will average them and produce mush. Curate before you paste.
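
One way to keep the curation honest is to build the context block from the shared library files instead of ad-hoc pastes. The sketch below assumes hypothetical file names, and the four-characters-per-token figure is a rough rule of thumb, not an exact count.

    # Sketch: assemble one curated context block from the shared prompt library.
    # File names are hypothetical; ~4 characters per token is a rough estimate.
    from pathlib import Path

    LIBRARY = Path("prompt-library")
    CONTEXT_FILES = ["voice-guide.md", "buyer-personas.md", "output-examples/recent-posts.md"]

    def build_context() -> str:
        parts = [(LIBRARY / name).read_text() for name in CONTEXT_FILES]
        context = "\n\n---\n\n".join(parts)
        print(f"Context is roughly {len(context) // 4:,} tokens")
        return context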

Building a team prompt library: the five files that earn their keep

Stop letting people keep prompts in their personal Notion or in a Google Doc only they can find. A shared prompt library is the single highest-ROI artifact a mid-market team can build this quarter. It does not need to be fancy. A shared Notion database, a Google Drive folder, or a Git repo all work. What matters is that there are five files everyone can find.

File 1: The Voice Guide. Three to five pages on how your brand sounds. Include do-write and do-not-write examples side by side. This is the file that gets pasted into every Context section in every prompt. If you do not have one, write it this week. AI cannot match a voice you have not articulated.

File 2: The Approved Prompts Library. One entry per task type. Each entry has the four-part skeleton filled in, a sample input, and three sample outputs that were rated good. Mark each prompt with an owner and a last-reviewed date. Quarterly review is non-negotiable because models change.

File 3: The Reject List. A running list of words, phrases, and stylistic moves the team has decided are off-brand. Most teams build this by collecting the corporate filler verbs and adjectives that turn business writing into mush, plus a few category-specific words their buyers tune out. Paste the list into every Constraint section. Update it monthly when you spot new offenders in the wild.

File 4: The Buyer Personas. Two-page snapshots of your three to five core buyer types. Job title, day-to-day pain, what they read, what they distrust about your category, the words they use. The model performs about three times better when you give it a real person to write to instead of a generic "target customer."

File 5: The Output Examples. Five to ten pieces of past work that hit the bar, organized by task type. Three approved blog posts. Three approved emails. Three approved SOPs. The model uses these as exemplars. "Match the structure and tone of the three samples below" is one of the most powerful instructions you can give.

That is it. Five files. A team that maintains these five files and uses the four-part skeleton will outperform a team that buys a $40,000 enterprise AI platform and never writes anything down.
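
If the library lives in a shared Drive folder or a Git repo, a layout along these lines keeps all five files findable; the names are illustrative, not a standard.

    prompt-library/
        voice-guide.md              File 1
        approved-prompts/           File 2, one entry per task type with owner and last-reviewed date
            marketing-content-brief.md
            ops-sop-draft.md
            sales-lead-research.md
            support-response-draft.md
            exec-meeting-summary.md
        reject-list.md              File 3
        buyer-personas.md           File 4
        output-examples/            File 5, three approved samples per task type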

Measurement: what to track, what to ignore

Most mid-market teams measure AI adoption by counting seats. That number is meaningless. Forty seats of ChatGPT Team at $30 each is $1,200 a month, and if the team is producing the same volume of content as before, you are setting fire to more than $14,000 a year and calling it innovation.

Track three things instead. First, time-to-first-draft for the five core task types. Pick one marketer, one ops person, one sales rep, one support rep, and the EA. Time them on a representative task before AI and after AI. Re-time them quarterly. If time-to-first-draft is not dropping by at least 40 percent within 60 days of installing the prompt library, something is broken in the workflow, not the tool.

Second, output volume per week per role. Count the units that matter. Blog drafts published, SOPs documented, leads researched, tickets cleared, summaries delivered. If volume is flat, either the team is using AI for tasks that were not the bottleneck, or the prompt library is not actually being used. Both are common.

Third, rejection rate on first draft. How often does a manager send the AI-assisted draft back for a rewrite? If it is above 30 percent, the prompts are too thin or the wrong people are reviewing. If it is below 10 percent, congratulations, you have built the system. If it is zero, your manager is not reading carefully and you have a quality risk.
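
A spreadsheet is enough to track all three numbers, but if someone wants to compute them from a simple log, a sketch like this works; the CSV columns are assumptions, not an export from any particular tool.

    # Sketch: compute median time-to-first-draft and rejection rate from a CSV log.
    # Assumed columns: task_type, minutes_to_first_draft, rejected (yes/no).
    import csv
    from statistics import median

    def summarize(log_path: str) -> None:
        with open(log_path, newline="") as f:
            rows = list(csv.DictReader(f))
        minutes = [float(r["minutes_to_first_draft"]) for r in rows]
        rejections = sum(1 for r in rows if r["rejected"].strip().lower() == "yes")
        print(f"Median time to first draft: {median(minutes):.0f} minutes")
        print(f"Rejection rate on first draft: {rejections / len(rows):.0%}")

    summarize("ai_draft_log.csv")  # hypothetical log file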

Ignore vanity stats. Number of prompts written. Number of conversations started. Tokens consumed. Hours "saved" based on self-report. Self-reported time savings are almost always inflated by 50 to 100 percent. Trust the timed task tests, not the survey.

The three most expensive mistakes mid-market teams make

I scope into about 30 mid-market teams a year. Every single one of them makes at least one of these three mistakes. The combined cost is usually six figures a year in wasted seats, wasted time, and missed output.

Mistake one: treating prompts as personal property. The marketing manager has a great prompt for blog briefs in her personal ChatGPT history. She leaves the company. The prompt walks out the door. The replacement marketer rebuilds it from scratch over six weeks of trial and error. Cost: roughly $8,000 in lost productivity per departed prompt-savvy employee. Fix: shared library, owned by a role, not a person.

Mistake two: buying enterprise AI before fixing the workflow. A 25-person agency spends $60,000 on an enterprise AI platform with custom GPTs, RAG, and SSO. Six months later, 8 of 25 employees use it weekly. The other 17 still use their personal ChatGPT because the enterprise tool has a worse interface and slower model responses. Cost: $60,000 plus the IT director's time. Fix: get the prompt library and the four-part skeleton working on a $30-per-seat consumer plan first. Move to enterprise only when your team is hitting the limits of consumer tools, which most mid-market teams never actually reach.

Mistake three: skipping the review step. The marketing director stops reviewing AI drafts because they are "good enough." Three months later, a hallucinated statistic ships in a customer-facing white paper. The PR cleanup costs more than every hour AI ever saved the team. Cost: variable, but I have seen it run from $5,000 in republishing costs to a six-figure customer churn event. Fix: a human reviews every customer-facing AI output until you have at least 90 days of clean data showing the prompt is reliable. Even then, spot-check one in ten.

None of these three mistakes are technical. They are management mistakes. Which is good news, because management mistakes are cheaper to fix than technical ones.

What to do this month

Pick one team of five to ten people. Do not roll this out company-wide. Spend week one writing the voice guide and the reject list. Spend week two filling in the four-part skeleton for the five core task types and saving them in a shared folder. Spend week three running the timed before-and-after tests and tuning the prompts. Spend week four documenting what worked and rolling it to the next team.

The teams I have walked through this exact sequence are typically saving between 6 and 12 hours per person per week within 60 days, against a setup cost of about 20 hours of manager time. The math is hard to argue with. The hard part is not the math. The hard part is getting a busy operations director to slow down for two weeks to install a system instead of continuing to wing it for another year.

If you want a second pair of eyes on your team's current prompt habits and a worked-out version of the five files for your specific business, book a 30-minute scoping call. I will not pitch you on a six-month engagement. I will tell you whether your team can install this themselves or whether you need outside help, and I will be honest about which is which.
