Most practice managers I talk to know AI scribes are coming, and most have already had three or four vendors pitch them. The pitches all sound similar. The provider walks in. The AI listens. The note appears in the EHR. The provider goes home on time. The practice books two more patients per provider per day. Everyone wins.
The reality is uneven. Some practices report real, sustained time savings of 60 to 90 minutes per provider per day, providers who would quit before giving the scribe back, and a measurable lift in patient face time. Other practices spent six figures on an annual contract, watched provider adoption stall at 40 percent, and quietly let the contract lapse at renewal. The technology itself is real. The difference is mostly in the vendor evaluation, the pilot design, and the rollout strategy.
This guide walks through the 10 questions every practice manager should ask before signing a scribe contract, the pilot terms that protect the practice, the provider-adoption trap that kills 60 percent of deployments, and the BAA-anchored compliance frame that keeps the practice out of trouble. It is written for practice managers, practice administrators, and clinic owners at specialty practices with 5 to 50 locations: dental groups, PT chains, behavioral health, dermatology, optometry, vet specialty, and urgent care.
Why this matters for practice managers specifically
The scribe purchase decision usually lands on the practice manager's desk because it crosses three departments: clinical operations, IT, and finance. None of those three has the full picture alone. The clinical lead knows what the providers want. IT knows the integration realities. Finance knows what the budget tolerates. The practice manager owns the synthesis.
Get it right and the practice cuts provider documentation time by 60 to 90 minutes per day, recovers schedule capacity for additional patient visits, and meaningfully improves provider retention. Get it wrong and the practice signs a contract that does not deliver, providers who already feel underwater feel more underwater, and the practice manager owns the cleanup.
The vendors know practice managers are doing this evaluation under time pressure. The pitches are tuned to make the decision feel easy. The decision is not easy. The vendor differences are real and the pilot work matters.
What an AI scribe actually does
An AI scribe listens to the patient encounter (with patient consent) and produces a structured clinical note ready for the provider to review and sign. The scribe sits on a phone, a laptop, a desktop microphone, or sometimes a wearable. The audio gets transcribed by an ASR engine, the transcript gets structured by an LLM into the practice's note template, and the note appears in the EHR.
Three things make this different from the dictation tools providers have used for 20 years:
- It structures the note, not just transcribes the audio. Dictation produces a wall of text the provider has to format. The scribe produces a structured note with HPI, exam, assessment, and plan in the right fields.
- It works in ambient or on-demand mode. Some scribes record the entire encounter passively. Others let the provider trigger a structured dictation segment by segment. The right model depends on the specialty and the provider's preference.
- It learns the provider's voice and the practice's note structure. The good scribes pick up on individual provider documentation style within 2 to 4 weeks. The bad scribes produce generic notes the provider has to rewrite.
Think of it as a fast junior medical scribe who takes structured notes during the encounter and hands the provider a draft to review at the end. The clinical judgment, the differential, and the plan all stay with the provider. The scribe handles the formatting and structuring grind.
Before you start
You need:
- A clear answer to which providers actually want the scribe. Voluntary adoption produces evangelists. Forced adoption produces resentment.
- A short list of 3 to 5 vendors based on your specialty and your EHR. Not 12. The shortlist gets vetted seriously.
- 60 to 90 days for the pilot. Short pilots hide adoption problems.
- A signed BAA with whichever vendor you pick. No BAA, no patient encounters get recorded.
- Sample notes from your providers (de-identified) the vendor can use to tune the output during pilot setup.
One area to settle before you record anything is compliance: HIPAA, state privacy laws, patient consent for AI-assisted documentation, and (for behavioral health) 42 CFR Part 2. We have a dedicated section below. It is non-negotiable. Skipping ahead and recording a real patient encounter on the consumer tier of any AI tool is the kind of mistake that ends careers.
The 10 questions every practice manager should ask
These are the questions that separate the vendors who deliver from the ones who pitch well and disappoint at production. Send them in writing to every shortlisted vendor before the demo. Vendors who answer all 10 cleanly within a week are vendors worth the demo. Vendors who hedge or delay are vendors who will hedge and delay during implementation.
1. Will you sign a BAA, and is it standard or negotiated? Standard means the vendor signs without changes for any practice. Negotiated means the vendor changes the BAA per customer, which usually means slower onboarding and weaker terms.
2. Which subprocessors handle the audio, transcript, and note generation? The scribe is rarely a single-vendor stack. The audio gets sent to an ASR engine (Whisper, Deepgram, Google, AssemblyAI). The transcript gets sent to an LLM (Claude, GPT, Gemini). The storage sits with a hyperscaler. Each subprocessor needs to be in scope of the BAA chain.
3. What is your data retention policy by default, and can we configure shorter retention? Some vendors retain audio and transcripts for 7 years by default. Others retain for 30 days. For most specialty practices, the audio retention should be configurable and short (deleted after note approval), with the structured note retained per medical-records rules.
4. Can you produce a SOC 2 Type II report and a state-specific compliance addendum? SOC 2 Type II is the security audit. State-specific addendums cover California's CMIA, New York's SHIELD Act, Washington's My Health My Data Act, and other state rules that apply on top of HIPAA.
5. Which EHRs do you have certified integrations with, and do the notes flow into structured fields or free-text? Athenahealth, eClinicalWorks, AdvancedMD, NextGen, DrChrono, Kareo, OpenDental, and Epic Community Connect each have different integration stories per vendor. The right answer is structured, certified integration. Free-text dump into a single EHR field is the wrong answer for most specialty practices.
6. How do you tune note output to individual provider style, and what does the tuning timeline look like? The scribes that succeed tune to provider voice within 2 to 4 weeks. The scribes that produce generic notes do not. Ask for the specific tuning mechanism (sample notes, feedback loop, supervised editing, retraining) and timeline.
7. What accuracy SLA do you commit to during the pilot? Vendors quote internal QA accuracy. Internal QA is not what matters. What matters is provider-edit time per note compared to baseline. The vendor commits to a benchmark or they do not. "We have great accuracy" is not a commitment.
8. What does provider adoption look like at customers similar to us? Ask for three customer references at your specialty, your EHR, and your size. Ask the references about adoption rates at 30, 60, and 90 days, not just at year one. The adoption curve is the truth.
9. What happens to our data if we cancel the contract? Data export, data deletion, audit log access. Get the answer in writing in the contract, not in the demo.
10. What does pricing look like at our scale, and what discounts are available for multi-year or multi-location commitments? Pricing is per-provider per-month, usually $99 to $499. Volume discounts above 20 providers are real. Multi-year commits get further discounts but lock the practice in. Negotiate based on the pilot outcome, not on optimistic projections.
The 10 answers, in writing, before the demo. That changes which vendors waste your time.
How to design the pilot
The pilot is where the vendor either delivers or fails. The pilot is not a free demo. It is a structured evaluation against measurable benchmarks.
What to specify in the pilot agreement:
- 60 to 90 day pilot at [discounted rate or no charge]. No auto-renew.
- BAA signed before any patient audio is recorded.
- Pilot scope: [number] providers at [number] locations.
- Pilot success metrics: provider-edit time per note (target: under [X] minutes), provider-adoption rate (target: above [Y] percent of eligible encounters), patient-consent acceptance rate (informational), and EHR write-back accuracy (no manual reformatting required).
- Pilot end clause: practice may terminate without penalty if any success metric is missed at day 75.
- Data deletion at pilot end: all audio, transcripts, and generated notes deleted from vendor systems within 30 days, with written confirmation.
The pilot agreement is the document that protects the practice. Most vendors will negotiate. The ones who refuse all six terms above are vendors with thin product offerings hidden behind big sales pitches.
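To make the day-75 check concrete, here is a minimal Python sketch of how the success metrics could be evaluated. The class names, field names, and example numbers are hypothetical. The agreement leaves the real targets as [X] and [Y], and patient-consent acceptance is left out because it is informational only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PilotTargets:
    # Hypothetical stand-ins for the [X] and [Y] blanks in the pilot agreement.
    max_edit_minutes_per_note: float
    min_adoption_rate: float  # fraction of eligible encounters, 0.0 to 1.0


@dataclass
class PilotSnapshot:
    avg_edit_minutes_per_note: float
    scribed_encounters: int
    eligible_encounters: int
    notes_needing_manual_reformat: int


def missed_metrics(snapshot: PilotSnapshot, targets: PilotTargets) -> List[str]:
    """Return the success metrics the pilot is missing at the checkpoint."""
    missed = []
    # Target is "under X minutes", so hitting X exactly counts as a miss.
    if snapshot.avg_edit_minutes_per_note >= targets.max_edit_minutes_per_note:
        missed.append("provider-edit time per note")
    # Target is "above Y percent of eligible encounters".
    if snapshot.eligible_encounters > 0:
        adoption = snapshot.scribed_encounters / snapshot.eligible_encounters
    else:
        adoption = 0.0
    if adoption <= targets.min_adoption_rate:
        missed.append("provider-adoption rate")
    # Target is "no manual reformatting required".
    if snapshot.notes_needing_manual_reformat > 0:
        missed.append("EHR write-back accuracy")
    return missed


if __name__ == "__main__":
    # Illustrative numbers only, not benchmarks from any vendor.
    targets = PilotTargets(max_edit_minutes_per_note=3.0, min_adoption_rate=0.70)
    day_75 = PilotSnapshot(4.2, 310, 600, 0)
    print(missed_metrics(day_75, targets))
```

An empty list means the pilot is on track. Anything else is the written basis for invoking the termination clause.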
During the pilot, the practice manager owns the daily check-ins for the first 14 days. The first 14 days are where the provider-voice tuning happens. After day 14, weekly check-ins through day 60. The check-ins are short. They focus on what the providers are seeing, not on vendor demos of features.
How to handle the provider-adoption trap
The trap is the single largest cause of failed scribe deployments. The practice signs the contract, deploys to all providers at once, and the provider-adoption rate stalls at 40 percent. The contract still costs the practice the per-provider license. The 40 percent adoption is not enough to justify the spend. The contract dies at renewal.
The fix is staged rollout, not enterprise rollout.
Start with 2 to 4 providers. Pick the providers who actually want the scribe and have documentation styles that match what the AI is good at: HPI-heavy specialties, structured exam findings, common diagnoses. Skip the providers who have unique documentation quirks, the providers who already type fast, and the providers who are AI-skeptical. The first cohort is for proof, not for converting skeptics.
Run the first cohort for 60 days. Measure provider-edit time per note as the primary metric. Track provider satisfaction qualitatively in weekly check-ins. At day 60, the first cohort is either advocating for the tool or has identified specific issues that need to be addressed before the next cohort joins.
Roll to the next 3 to 5 providers only after the first cohort is producing real evidence. The first-cohort evangelists pull the next cohort along. The skeptical providers come last, and only if the practice has already proven the tool's value at the location.
Forced enterprise rollouts produce 40 percent adoption and dead contracts. Voluntary staged rollouts produce 80 to 90 percent adoption over 6 to 9 months. The math on which approach saves the practice money is not close.
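A rough way to see that math is to divide the license spend by the providers who actually use the tool. The sketch below assumes a hypothetical 20 licensed providers at $300 per provider per month (inside the price range quoted earlier) and treats both rollouts as paying for every seat, which is a simplification.

```python
def cost_per_active_provider(license_per_month: float,
                             licensed_providers: int,
                             adoption_rate: float) -> float:
    """Monthly license spend divided by the providers who actually use the scribe."""
    active = licensed_providers * adoption_rate
    if active <= 0:
        raise ValueError("no active providers; the whole license spend is wasted")
    return license_per_month * licensed_providers / active


if __name__ == "__main__":
    # Hypothetical inputs: 20 licensed providers at $300 per provider per month.
    for label, rate in [("forced rollout", 0.40), ("staged rollout", 0.85)]:
        cost = cost_per_active_provider(300.0, 20, rate)
        print(f"{label}: ${cost:,.2f} per active provider per month")
```

With those assumed inputs, 40 percent adoption works out to $750 per active provider per month, versus roughly $353 at 85 percent. A staged rollout also licenses fewer seats in the early months, so the real gap is likely wider than this sketch shows.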
How to evaluate provider-voice fidelity during the pilot
Provider-voice fidelity is the single biggest predictor of whether the scribe succeeds. A note that matches the provider's documentation style gets approved with light edits. A note that reads like every other AI scribe note gets rewritten, which kills the time savings.
What to do during the pilot:
Within the first 14 days of the pilot, send the vendor 5 to 10 representative notes from each pilot provider (de-identified through the BAA-covered process). Ask the vendor to tune the output to match each provider's note structure, vocabulary, and rhythm. At day 21, evaluate the tuned output against the same providers' subsequent notes. Measure how much editing the providers do. The benchmark: under 20 percent of the note text edited. If the editing rate is above 30 percent, the tuning is not working and the vendor escalates the issue or the pilot stops.
The 20 percent benchmark is empirical, based on what well-tuned scribes produce in production. Above 30 percent editing means the providers are essentially rewriting the note, which means the time savings claimed in the pitch are imaginary in the practice's hands.
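The guide does not prescribe how to measure "percent of the note text edited", so here is one possible sketch in Python. It assumes you can export the scribe's draft and the provider's signed note as plain text, and it compares them word by word with the standard library's difflib. The function names and the "watch" band between 20 and 30 percent are illustrative choices, not a vendor standard.

```python
import difflib


def edit_rate(draft: str, signed: str) -> float:
    """Fraction of the note that changed between draft and signed version.

    Word-level comparison: 0.0 means the provider signed the draft as-is,
    1.0 means nothing from the draft survived.
    """
    matcher = difflib.SequenceMatcher(a=draft.split(), b=signed.split(), autojunk=False)
    return 1.0 - matcher.ratio()


def classify(rate: float, target: float = 0.20, fail: float = 0.30) -> str:
    """Map an edit rate onto the 20 and 30 percent thresholds from the guide."""
    if rate < target:
        return "on target"
    if rate > fail:
        return "tuning not working"
    return "watch"


if __name__ == "__main__":
    # Synthetic, non-patient text for illustration only.
    draft = "patient reports mild knee pain after running for two weeks"
    signed = "patient reports moderate knee pain after running for two weeks"
    rate = edit_rate(draft, signed)
    print(f"{rate:.0%} edited: {classify(rate)}")
```

Run this only on de-identified notes or inside the BAA-covered environment. Averaging the rate per provider over a week is more informative than judging any single note.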
The HIPAA non-negotiables
This section is short because the rule is simple, but it is the most important section in this guide.
Do not put any of the following into the consumer tier of any AI tool, including any consumer-tier transcription service:
- Patient audio, video, or transcripts
- Patient names, dates of birth, addresses, or any of the 18 HIPAA identifiers
- Medical record numbers, account numbers, or insurance IDs
- Specific clinical histories tied to a patient
- Substance use disorder records covered by 42 CFR Part 2
- Mental health treatment notes
- Anything that could identify a patient or be linked to one
Use the consumer tier for things that are not patient-specific: drafting RFP language for vendor evaluation, building the pilot agreement template, writing internal SOPs, training materials. The actual patient encounters only flow through the BAA-covered scribe vendor.
State rules add layers. California's CMIA, Texas Medical Records Privacy Act, New York SHIELD Act, and Washington's My Health My Data Act all add requirements beyond HIPAA, especially around data sharing. Behavioral health practices subject to 42 CFR Part 2 need stricter consent for SUD encounter recordings. Get the vendor's state-specific compliance documentation in writing.
State licensure adds another layer. The scribe is administrative documentation support. The provider remains the licensed party who owns the clinical content. If the vendor pitches autonomous diagnosis, autonomous treatment recommendations, or autonomous dosing suggestions as a feature, ask them how they handle state-licensure exposure. AI giving clinical advice without a license is practicing medicine, and the practice that turned on the feature is the one explaining it to the medical board.
Patient consent for AI-assisted documentation needs to be explicit. The consent language explains that an AI tool listens to the encounter, produces a draft note the provider reviews, and is used under HIPAA terms. Patients can decline. Most do not. The ones who decline are easier to handle when the consent flow is clean than when the front desk improvises. Practices have to honor decline requests, which means the workflow needs a fallback (human scribe, dictation, manual notes) for the patients who opt out.
If your group has signed an enterprise agreement for a general-purpose AI tool that includes a Business Associate Agreement and a Data Processing Addendum, the rules can be different. Ask your IT director or general counsel what the BAA actually covers. Do not assume.
When NOT to use an AI scribe
AI scribes fit a wide range of specialty practice contexts well. The places where they do not fit are real and worth naming.
Skip AI scribes for:
- Encounters where the patient declines AI-assisted documentation. Honor the decline. Have a manual fallback.
- High-emotion encounters where ambient recording feels intrusive. End-of-life conversations, behavioral health crisis encounters, pediatric serious-diagnosis discussions. The provider's judgment on the encounter type matters more than the workflow optimization.
- Specialties where documentation is highly unstructured and idiosyncratic. Some sub-specialties (complex chronic pain, integrative medicine, certain behavioral health modalities) have notes that AI scribes struggle to structure. Pilot first. If the editing rate stays high after tuning, the scribe is not a fit.
- Providers who do not want it. Forced adoption produces resentment. The 60-day pilot is the time to find out which providers genuinely benefit. Skip the ones who do not.
A simple rule: AI scribes are an unfair advantage on the 70 to 80 percent of specialty encounters that are documentation-heavy and structurally similar. Trust manual workflows for the 20 to 30 percent where the encounter has emotional weight, structural irregularity, or provider preference against ambient recording.
The quick-start template
Here is the vendor evaluation brief the practice manager sends to the shortlisted vendors. Fill in the brackets, send to each, hold them to written answers within one week.
Practice: [name, type, locations, EHR, specialty mix].
Pilot scope: [number] providers at [number] locations for 60 to 90 days.
Required answers in writing within 7 business days:
- BAA terms (signed, standard, scope of subprocessors).
- Subprocessor list (ASR vendor, LLM vendor, storage).
- Data retention policy and configurable options.
- SOC 2 Type II report and state-specific compliance addendum.
- EHR integration mechanism (certified vs. screen-scrape) and structured-field mapping.
- Provider-voice tuning mechanism and timeline.
- Accuracy SLA committed in the pilot agreement.
- Three customer references at our specialty, EHR, and size.
- Data export and deletion terms at contract end.
- Pricing at our scale and discount structure for multi-year or multi-location.
Pilot agreement requirements:
- BAA signed before any patient audio is recorded.
- 60 to 90 day pilot at [discounted rate or no charge], no auto-renew.
- Provider-tuning commitment within first 14 days.
- Accuracy SLA: provider-edit rate under 30 percent of note text by day 30.
- Termination clause: practice may terminate without penalty if metrics not met by day 75.
- Data deletion: all audio, transcripts, generated notes deleted within 30 days of pilot end with written confirmation.
That is the brief. The vendors who answer it cleanly are the ones worth a demo. The ones who hedge are the ones who will hedge during implementation.
Bigger wins beyond the immediate evaluation
Once the scribe is running, four additional moves produce outsized value.
Build a per-provider edit-time dashboard. Track every provider's average edit time per note over 90 days. Some providers will be at 2 minutes per note. Others will be at 8 minutes. The 8-minute providers are either getting weak output (vendor problem) or have a documentation style that does not match the AI well (specialty problem). Either way, the data tells you where to invest training time or whether to accept that some providers will not adopt.
Use the structured note data for downstream workflows. The scribe produces structured data the EHR can use for coding suggestions, quality measure tracking, and pre-auth pre-flight checks. The cleaner the structured note, the cleaner the downstream workflows. Practices that connect the scribe output to coding and pre-auth often produce more total ROI from those downstream gains than from the documentation time savings alone.
Standardize note templates across the practice. Multi-location practices often have providers using slightly different note structures based on individual preference. The scribe rollout is the opportunity to standardize. Standardization makes coding cleaner, makes audits easier, and makes onboarding new providers faster.
Audit provider satisfaction quarterly. Provider burnout is the underlying reason most practices buy AI scribes. The metric that matters at year one is not just edit time. It is provider retention and provider-reported satisfaction with documentation workflow. Survey the providers at quarter-marks. The scribe is working if the providers say so. If they do not, the contract is up for review.
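The per-provider edit-time dashboard described above does not need a BI tool to get started. A minimal Python sketch, assuming you can export (provider, edit minutes) pairs from the scribe or the EHR; the record shape, function names, and the 8-minute default threshold are hypothetical, with the threshold borrowed from the example in this section.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def average_edit_minutes(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average edit time per note for each provider.

    Each record is (provider_id, minutes_spent_editing_one_note).
    """
    by_provider = defaultdict(list)
    for provider, minutes in records:
        by_provider[provider].append(minutes)
    return {provider: mean(values) for provider, values in by_provider.items()}


def flag_slow_providers(averages: Dict[str, float], threshold: float = 8.0) -> List[str]:
    """Providers whose average edit time is at or above the threshold."""
    return sorted(p for p, avg in averages.items() if avg >= threshold)


if __name__ == "__main__":
    # Synthetic sample data; no patient information is involved.
    sample = [("dr_a", 2.0), ("dr_a", 3.0), ("dr_b", 8.0), ("dr_b", 10.0)]
    averages = average_edit_minutes(sample)
    print(averages)
    print(flag_slow_providers(averages))
```

The flagged list is a starting point for a conversation, not a verdict: it tells you who to sit with, and the check-in tells you whether the cause is weak output or a documentation style the tool does not match.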
The healthcare AI consulting connection
This is one tool in one category. Practices that figure out the broader AI question (intake, pre-auth, no-show reduction, scribe vendor evaluation, recall, billing) end up with admin overhead 30 to 50 percent below their peers, providers who actually go home on time, and a hiring story that wins in tight markets. Practices that wait usually end up either banning AI awkwardly, deploying it badly, or watching the competition pull ahead on provider retention.
If your group is wrestling with the bigger AI question, the AI Consulting in Healthcare page covers the full scope: where AI fits in private practice operations, where it does not, what the vendor landscape actually looks like, and what an engagement looks like when it works.
Closing
The goal is not to buy the cheapest scribe. It is to buy the scribe that matches your specialty, integrates cleanly with your EHR, gets adopted by your providers, and earns its license fee in measurable time savings. The vendor pitches all sound similar. The pilot work is where the differences show up. The setup above is the difference between a deployment that succeeds and one that quietly dies at renewal.
Pick three vendors. Send the 10-question brief. Sign one pilot agreement with the right protections. Run a 60 to 90 day pilot with 2 to 4 providers and measure honestly. The case for the rollout makes itself if the pilot is honest. If you want to talk about how AI fits into your practice at the program level, the AI Consulting in Healthcare page lays out the full picture and how an engagement works.