The Legal Stack
Independent LegalTech Analysis

The Legal AI 'Deposition Transcript' Gold Rush: Why Court Reporting Firms Are Selling Transcript Libraries to AI Vendors — and Why Litigators Should Be Furious

Here is something your court reporter almost certainly didn't mention when she handed you the certified transcript: the firm that employs her may be sitting on a licensing agreement with an AI company that gives that vendor the right to train models on everything she's ever transcribed — including your client's deposition.

This is happening now, at scale, and the litigation community has been largely asleep at the wheel.

Who's Buying Transcripts and Why

The buyers fall into three overlapping categories. First, dedicated legaltech AI platforms building specialized litigation tools — think contract analysis engines that want to learn how disputed contract language actually plays out in testimony, or case outcome prediction tools that need real deposition content to understand how witnesses describe facts under oath. Second, general-purpose legal AI vendors — the kind positioning themselves as AI co-counsel — who need deposition transcripts to fine-tune models on question-and-answer legal discourse, the specific cadence of examination and cross-examination. Third, major AI infrastructure companies who have been systematically acquiring domain-specific corpora across every professional field. Legal transcripts are extraordinarily valuable because they're structured, adversarial, and fact-dense in ways that scraped web content simply isn't.

The stated purposes range from "improving legal AI accuracy" to "training models to better understand testimony patterns." These are legitimate research goals, in the abstract. The problem is whose data is being used to achieve them.

The large court reporting firms — Veritext, US Legal Support, Planet Depos — handle enormous volumes of litigation transcripts annually. Their business terms, buried in service agreements that most litigators sign without reading, typically assert that the firm owns the transcript as a work product of the court reporter, retains broad rights to the underlying text, and may use transcript content for "business improvement purposes." Whether that language actually grants the right to license transcripts for AI training is a genuinely contested legal question. What's not contested is that some firms are doing it anyway, betting that nobody will push back hard enough to find out.

The Confidentiality Problem Is Worse Than You Think

Start with work product doctrine. Deposition transcripts in active litigation frequently contain testimony shaped by counsel's strategic choices — the questions asked, the topics probed, the sequencing of examination. Under Hickman v. Taylor, 329 U.S. 495 (1947), and its progeny, work product protection extends to materials reflecting counsel's mental impressions. When a transcript captures the architecture of how your team built its case, feeding that transcript into an AI training corpus isn't just a privacy concern — it's a potential waiver of work product protection and a transfer of litigation strategy to a system that may ultimately be used by opposing counsel in a different case.

Then there's the confidentiality problem at the witness level. Deponents regularly testify about trade secrets, internal communications, financial information, and personal health data. Courts have recognized, in decisions such as In re Grand Jury Subpoenas, that confidentiality expectations attach to deposition content beyond what any protective order explicitly covers. But here's the practical trap: most standard protective orders govern the use of transcripts by the parties, not the use of the underlying text by the court reporting firm. The firm wasn't a party to the protective order. It wasn't in the room when your client agreed to produce sensitive documents in reliance on confidentiality protections. And nothing in the model protective orders from the AIPLA or the Sedona Conference, or in any federal court's standing orders, currently addresses AI training use.

This is a regulatory gap large enough to drive a data center through.

What Litigators Are Missing in Their Protective Orders

The standard protective order designates documents "Confidential" or "Attorneys' Eyes Only" and restricts disclosure to enumerated parties and purposes. What it almost never does is bind the court reporter or her firm to any use restriction beyond accurate transcription and delivery. The firm is typically engaged under a separate vendor agreement, usually one drafted by the firm and in the firm's favor.

The consequence: a deposition taken in complex litigation — where your client testified about unreleased product specifications under an AEO designation — may be transcribed, certified, delivered to counsel, and simultaneously queued for inclusion in an AI training dataset, without any of that violating the protective order as written.

Discovery sanction cases such as Paisley Park Enterprises v. Boxill, where parties were sanctioned for failing to preserve text messages they were on notice to preserve, show that courts will hold parties accountable for obligations they knew about and ignored. The same logic applies by analogy to confidential information: if you knew the transcript licensing practice existed and did nothing, that argument cuts against you.

What You Should Do Before the Next Deposition

Four practical steps, in order of priority.

Amend your protective orders now. Add explicit language prohibiting the court reporting firm, its employees, and any downstream vendors from using transcript content for AI training, model development, or any commercial purpose beyond transcription services. Get the firm's signature on the order or a separate confidentiality agreement that mirrors its terms.

Audit your vendor agreements. Pull the service agreements with your reporting firms and read the intellectual property and data usage provisions. If the contract claims broad rights to transcript content, negotiate addendum language restricting those rights before the next engagement.

Build transcript protection into your engagement letters. Advise clients at the outset that deposition transcripts may be at risk and document that you've raised the issue. This isn't just good risk management. It's increasingly a professional responsibility question under Model Rule 1.6(c), which requires reasonable efforts to prevent unauthorized disclosure of client information, read alongside the competence duty in Model Rule 1.1.

Raise it in Rule 26(f) conferences. Propose that both parties agree to joint restrictions on court reporting vendors as part of the discovery plan. Courts will likely be receptive; this is exactly the kind of emerging issue Rule 26 is meant to address prospectively.
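
For firms with a stack of vendor agreements to audit, a first pass can be automated. What follows is a minimal sketch, not a substitute for reading the contract: it assumes the agreement is already available as plain text, and the phrase list in WATCH_PATTERNS is hypothetical, written for illustration rather than drawn from any real reporting firm's terms. The function name flag_clauses and the sample text are likewise made up.

```python
import re

# Hypothetical watch list: illustrative phrases only, not taken from any real vendor contract.
WATCH_PATTERNS = {
    "business improvement": re.compile(r"business\s+improvement", re.IGNORECASE),
    "ownership of transcript": re.compile(r"\bowns?\b[^.]*\btranscripts?\b", re.IGNORECASE),
    "license or sublicense": re.compile(r"\b(sub)?licens(e|es|ed|ing)\b", re.IGNORECASE),
    "training or model development": re.compile(
        r"\b(machine\s+learning|artificial\s+intelligence|model\s+development)\b",
        re.IGNORECASE,
    ),
    "third-party sharing": re.compile(r"\bthird[\s-]+part(y|ies)\b", re.IGNORECASE),
}


def flag_clauses(agreement_text):
    """Return (label, sentence) pairs for sentences that match a watch pattern."""
    # Naive split after periods or semicolons, or at line breaks.
    # Real contracts (numbered clauses, abbreviations) need a sturdier splitter.
    parts = re.split(r"(?<=[.;])\s+|\n+", agreement_text)
    sentences = [p.strip() for p in parts if p.strip()]
    hits = []
    for sentence in sentences:
        for label, pattern in WATCH_PATTERNS.items():
            if pattern.search(sentence):
                hits.append((label, sentence))
    return hits


# Made-up sample text for demonstration.
sample = (
    "Provider owns each transcript as its work product. "
    "Provider may use transcript content for business improvement purposes. "
    "Invoices are due within thirty days."
)

for label, sentence in flag_clauses(sample):
    print(f"[{label}] {sentence}")
```

A keyword pass like this only tells you where to start reading. A clean result does not mean the agreement is safe, since a data usage grant can be worded in ways no phrase list anticipates.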

The Bottom Line

The court reporting industry is monetizing decades of litigation transcripts, and the legal profession's consent to that arrangement has been largely assumed rather than given. The confidentiality frameworks we built for discovery were designed to govern adversaries, not vendors. AI training use wasn't contemplated. That's no longer an excuse — it's a description of how unprepared we are.

Litigators have a professional obligation to keep client confidences. Right now, that obligation requires knowing where your transcripts go after the deposition ends. The answer, increasingly, is somewhere your protective order never anticipated.

© 2026 The Legal Stack — Independent LegalTech Analysis