AI Note-Writing Tools for Therapists: What Works, What Does Not, and What to Watch

AI-assisted note writing has arrived in mental health practice, and it arrived faster than the profession was ready for. Tools like AutoNote, Upheal, Mentalyc, and a growing list of competitors promise to reduce documentation time by generating clinical notes from session recordings or transcripts. The promise is real. The risks are underappreciated. And the research on actual time savings is considerably more complicated than the vendor marketing suggests.

This is not an argument against the technology. It is an argument for using it with your eyes open — because the clinicians adopting these tools without understanding their limitations are creating liabilities they have not accounted for.

What these tools actually do

The core functionality is consistent across platforms: you record your session (with client consent), the tool transcribes the audio, and an AI model generates a draft clinical note in the format of your choice — SOAP, DAP, BIRP, or others. You review, edit, and sign.

That workflow is straightforward. The variation is in how accurately each platform transcribes the session, how well it structures the generated note, and how it handles the editing step between AI output and clinician sign-off. The platforms are not equally good at any of these. The differences matter more than the marketing materials suggest.

What the time savings research actually shows

The vendor claim is that these tools reduce documentation time by 50 to 70 percent. That figure comes primarily from vendor surveys, which are not independent research. The picture from independent and peer-reviewed studies is more complicated.

Key figure: about 16 minutes saved per 8-hour clinical shift. That is the time savings found in a large-scale NEJM AI study on ambient AI scribes across multiple health systems. Source: NEJM AI study on ambient AI scribes, April 2026; reported by STAT News.

The most comprehensive independent evaluation of AI scribes published to date found time savings of approximately 16 minutes per 8-hour clinical shift — meaningful, but considerably more modest than the 50 to 70 percent reduction vendors report. A STAT News review of the study noted that despite the modest time savings, AI scribe use was associated with a 31% reduction in burnout — which is arguably the more important finding.

The discrepancy between vendor-reported time savings and independent study findings is worth sitting with. It does not mean the tools are not useful. It means that the 62% and 70% figures you will see in product marketing are not independent data. They are survey results from users who chose to adopt the tool and are reporting their own experience. Self-reported savings from a self-selected group are not the same as measured savings across everyone who uses the tool.
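To make the gap concrete, here is a rough back-of-the-envelope sketch in Python. The only figures taken from the article are the 16 minutes saved per 8-hour shift and the 50 to 70 percent vendor range. The 120-minute documentation baseline is a purely hypothetical assumption for illustration, and the function names are invented for this sketch. Keep in mind that the study covered ambient AI scribes across health systems, so the comparison to therapy-specific vendor claims is indicative, not exact.

```python
# Figures from the article: 16 minutes saved per 8-hour clinical shift.
SHIFT_MINUTES = 8 * 60
SAVED_MINUTES = 16


def implied_reduction(doc_minutes_per_shift: float, saved: float = SAVED_MINUTES) -> float:
    """Percent of documentation time saved, given a baseline documentation load.

    doc_minutes_per_shift is an assumption supplied by the caller,
    not a number reported in the study.
    """
    if doc_minutes_per_shift <= 0:
        raise ValueError("baseline documentation time must be positive")
    return 100 * saved / doc_minutes_per_shift


def baseline_needed(claimed_percent: float, saved: float = SAVED_MINUTES) -> float:
    """Baseline documentation minutes at which `saved` equals the claimed percent."""
    if claimed_percent <= 0:
        raise ValueError("claimed percent must be positive")
    return saved / (claimed_percent / 100)


# Savings as a share of the whole shift (no assumptions needed).
share_of_shift = 100 * SAVED_MINUTES / SHIFT_MINUTES

# Hypothetical baseline: 120 minutes of documentation per shift.
reduction_at_120 = implied_reduction(120)

# How small would the baseline have to be for 16 minutes to match vendor claims?
baseline_for_50 = baseline_needed(50)
baseline_for_70 = baseline_needed(70)

print(f"Share of shift saved: {share_of_shift:.1f}%")
print(f"Reduction at a 120-minute baseline: {reduction_at_120:.1f}%")
print(f"Baseline needed for a 50% claim: {baseline_for_50:.0f} min")
print(f"Baseline needed for a 70% claim: {baseline_for_70:.0f} min")
```

Under these assumptions, 16 minutes would only amount to a 50 to 70 percent reduction if a clinician were documenting for roughly 23 to 32 minutes per shift in the first place. Swap in your own documentation time to see what the independent figure would mean for your practice.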

The honest conclusion: AI note-writing tools will save you time. How much time depends heavily on your clinical content, your workflow, and how carefully you review the output. For some practitioners and some caseloads, the savings are substantial. For others, they are modest. Expect the latter and be pleasantly surprised by the former.

The liability question nobody is asking loudly enough

The note that goes in the clinical record is your note. It has your signature on it. If it contains an error — a misattributed statement, an inaccurate symptom description, a clinical interpretation that does not reflect what actually happened in the session — you are responsible for that error. Not the platform.

The CMF Group’s analysis of AI liability in clinical practice is direct on this point: there is no federal law that shifts malpractice liability from a clinician to an AI tool or its developer. The clinician whose name is on the chart is the clinician who bears responsibility for what is in it.

AI-generated notes are drafts. They need to be read, not skimmed. If you are signing AI output without genuinely reviewing it, you are outsourcing your clinical judgment to a language model.
— Renata Lima, Business & Therapy

The platforms are aware of this. Their terms of service are clear that the clinician bears responsibility for the final note. The risk is not theoretical — it is the ordinary risk of any documentation error, applied to a workflow that makes errors easier to miss because the note looks complete and professionally formatted.

TwoFold’s survey of clinicians using AI notes is instructive here: practitioners who use these tools well describe their role as moving from author to editor. They are not generating notes from scratch. They are verifying accuracy, adding nuance, and ensuring the output reflects their actual clinical judgment before signing. Practitioners who treat the AI output as a final product rather than a draft are the ones creating liability.

The HIPAA question you must ask first

Before you evaluate any AI note-writing tool on features, ask the HIPAA question. Where is your session audio stored? For how long? Who has access to it? Is it used to train the model?

Resilient Counseling’s analysis of AI documentation risks identifies client confidentiality as the primary concern: most AI platforms are not HIPAA-compliant without a Business Associate Agreement (BAA) in place, and the requirements for that agreement are more detailed than a checkbox on a signup form. You need to confirm the BAA exists, read it, and understand what it covers before you record a single session.

The HIPAA compliance question is table stakes. It is not sufficient on its own. Ask it first, then ask the follow-up questions.

What to look for in a platform

Data handling. Where is the audio stored, and is it used to train the AI model? If you cannot get a clear answer to both questions, that is your answer.

Accuracy on your specific clinical content. AI note-writing tools perform significantly better on structured CBT sessions than on less structured, process-oriented, or somatic approaches. A PMC study on AI note quality found that ChatGPT-4 showed substantial variability in errors and note quality, struggling particularly with non-objective data. Test the tool on a sample session before committing to it — and test it on a session that represents your most complex clinical work, not your easiest.

Edit workflow. The tools that make editing frictionless are more likely to be used carefully. The ones that make it easier to accept and sign than to review are the ones to be cautious about. If the interface design nudges you toward signing without reading, the interface is working against your clinical judgment.

Accuracy over time. Ask how the platform handles updates to its underlying model. If the model changes, does your documentation workflow change? Do you get notified? How do you validate that accuracy has been maintained?

Who should wait

If you work primarily with trauma, dissociative presentations, or any clinical content that is particularly sensitive, complex, or body-based, the current generation of AI note tools is not mature enough for your caseload. The PMC analysis of AI risk management in clinical settings identifies overreliance on AI output as the primary liability concern, and notes that AI can generate plausible-sounding but clinically incorrect content — a risk that scales with the complexity and nuance of the clinical material.

The technology will catch up. It has not yet.

For practitioners doing structured short-term work with anxiety and depression presentations, the tools are ready — with the caveats above in place. Use them carefully, review every note before signing, and do not let the efficiency gains make you careless about the clinical record.

The note is yours. The AI is a tool.
