Why Transactional Lawyers Are Suddenly Very Interested in AI Watermarking — and What the Technology Actually Does and Doesn't Prove

The conference circuit has a new obsession. At every deal conference from SuperReturn to the ABA Business Law Section's spring meetings, some version of the same panel has appeared: AI in Transactions — Managing Risk in the New Deal Environment. And somewhere in the middle of that panel, someone always raises AI watermarking as the mechanism by which the industry will separate human-reviewed deal documents from machine-generated ones. The room nods. The slide advances. Nobody pushes back hard enough.

That needs to change.

What Watermarking Actually Does

AI watermarking is not a stamp in the corner of a document. It operates at the statistical layer of text generation. The dominant technical approach, introduced in academic research (notably Kirchenbauer et al.'s 2023 paper "A Watermark for Large Language Models") and deployed in production form in tools like Google DeepMind's SynthID Text, works by subtly biasing the probability distributions that a language model uses when selecting tokens during generation. The result is text that reads normally to human eyes but contains detectable statistical patterns. A corresponding detection algorithm, which generally needs the watermarking key, can scan suspect text and estimate how likely it is to bear that signature. The answer is a probability, not a certificate, and it weakens as the text is edited or paraphrased.
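For readers who want to see the mechanics, here is a minimal toy sketch of the token-biasing idea in Python. It is not SynthID or any vendor's implementation. The whitespace "tokens", the `demo-key` secret, the 50% green fraction, and the `toy_generate` stand-in for a language model are all assumptions made for illustration. The detector counts how many words fall on a key-dependent "green list" and reports how far that count sits above chance.

```python
import hashlib
import math
import random

GAMMA = 0.5  # assumed fraction of the vocabulary treated as "green" at each step


def is_green(prev_token: str, token: str, key: str = "demo-key") -> bool:
    """Pseudorandomly place `token` on the green list, seeded by the previous token and a key."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] / 256 < GAMMA


def z_score(tokens: list[str], key: str = "demo-key") -> float:
    """Standard deviations by which the green count exceeds what unmarked text would show."""
    n = len(tokens) - 1  # number of (previous, current) pairs scored
    if n < 1:
        raise ValueError("need at least two tokens")
    greens = sum(is_green(a, b, key) for a, b in zip(tokens, tokens[1:]))
    return (greens - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


def toy_generate(vocab, length, watermark, rng, key="demo-key"):
    """Hypothetical stand-in for a model: picks random words, optionally preferring green ones."""
    out = [rng.choice(vocab)]
    while len(out) < length:
        candidates = rng.sample(vocab, 4)
        if watermark:
            green = [c for c in candidates if is_green(out[-1], c, key)]
            candidates = green or candidates  # fall back if no candidate is green
        out.append(candidates[0])
    return out


rng = random.Random(0)
vocab = [f"w{i}" for i in range(500)]
marked = toy_generate(vocab, 200, True, rng)
plain = toy_generate(vocab, 200, False, rng)
print("watermarked z:", round(z_score(marked), 1))
print("unmarked z:   ", round(z_score(plain), 1))
```

In this toy setup the watermarked sample scores far above the unmarked one. Note what the score is: a statistical statement about word choice under one particular key. It carries no timestamp, no author, and no record of anything that happened to the text afterward.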

A second approach, now increasingly common in enterprise legal AI tools, relies on cryptographic provenance logging: metadata attached to a document recording which model, which prompt, which version, and what timestamp produced which output. Harvey, Ironclad's AI features, and several document assembly tools baked into major DMS platforms have rolled out versions of this in the past eighteen months, partly in response to exactly the demand from deal counsel we're discussing.
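A provenance log of this kind can be sketched in a few lines of Python. This is an illustrative sketch only: the field names, the hash-chaining scheme, and the sample model name are assumptions, not the schema of Harvey, Ironclad, or any DMS vendor. Each record stores hashes of the prompt and output plus the model and timestamp, and links to the previous record so that later tampering with the log is detectable.

```python
import hashlib
import json


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_record(prev_hash, model, model_version, prompt, output, timestamp):
    """Build one log entry. Field names are hypothetical, chosen for illustration."""
    body = {
        "prev_hash": prev_hash,
        "model": model,
        "model_version": model_version,
        "prompt_sha256": sha256_hex(prompt),
        "output_sha256": sha256_hex(output),
        "timestamp": timestamp,
    }
    # The record hash covers every field above, including the link to the prior record.
    body["record_hash"] = sha256_hex(json.dumps(body, sort_keys=True))
    return body


def verify_chain(records) -> bool:
    """Check that no record was altered and that each one links to its predecessor."""
    prev = "0" * 64
    for rec in records:
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        if rec["prev_hash"] != prev:
            return False
        if rec["record_hash"] != sha256_hex(json.dumps(body, sort_keys=True)):
            return False
        prev = rec["record_hash"]
    return True


def output_matches(record, text: str) -> bool:
    """True only if `text` is byte-for-byte the output that was logged."""
    return record["output_sha256"] == sha256_hex(text)


r1 = make_record("0" * 64, "example-model", "v1", "Draft Schedule 3.12",
                 "Schedule 3.12: None.", "2024-05-07T14:02:00Z")
r2 = make_record(r1["record_hash"], "example-model", "v1", "Draft Schedule 3.13",
                 "Schedule 3.13: None.", "2024-05-07T14:05:00Z")
log = [r1, r2]
edited = "Schedule 3.12: See attached list of material contracts."

print(verify_chain(log))           # True: the log itself is intact
print(output_matches(r1, edited))  # False: the edited schedule is no longer the logged text
```

Two things stand out. The log authenticates what the model produced, and the moment a lawyer edits the text the hash stops matching the document that actually gets signed. And nothing in the record has a field for review: who read the output, against what, and for how long are simply outside what this kind of log captures.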

Both approaches sound reassuring. Neither proves what deal lawyers think it proves.

The Evidentiary Gap Is Enormous

Here is what these tools can establish, at best. A provenance log can show that a specific passage of text was generated by a specific model at a specific time. A statistical watermark shows less: only that a passage probably came from some model using a particular watermarking key. That is genuinely useful information in some contexts. In the transactional context, it is almost always insufficient to establish the legal conclusions being attached to it.

Consider the scenario that deal counsel now routinely raise. In a post-closing dispute, buyer's counsel alleges that the target's disclosure schedules were AI-generated without meaningful attorney review, and that specific representations in those schedules were therefore not the product of reasonable investigation. The theory is that this could vitiate the underlying rep in a reps-and-warranties insurance context or trigger a fraud-adjacent claim under Delaware common law. The watermark, in this scenario, is supposed to be the smoking gun.

But watermarking does not prove absence of review. It proves generation. A partner could generate a first draft of a disclosure schedule using an AI tool, spend four hours editing it against board materials and data room documents, retain the original generation metadata, and produce a document that is simultaneously watermarked and thoroughly reviewed. The watermark says nothing about what happened between generation and signature.

The reverse problem is equally acute. Watermarking only detects output from models that embed watermarks. OpenAI has not deployed robust production watermarking in its commercial API as of this writing. Claude's watermarking capabilities are similarly limited in enterprise deployment. A document generated in a non-watermarked environment — which describes the majority of real deal work today — produces no detectable signal. Absence of a watermark proves nothing about whether AI was used.

The Specific Nightmare Scenarios

The purchase agreement scenario is arguably more dangerous. Imagine a target company claiming, post-closing, that the purchase agreement itself was AI-generated by buyer's counsel and contains provisions the target did not meaningfully understand or negotiate — a theory that could surface in earn-out disputes or indemnification fights where ambiguous drafting is at issue. Under Lorillard Tobacco Co. v. American Legacy Foundation and its progeny on contract interpretation, courts apply objective standards to what parties agreed to. The process of drafting is generally irrelevant to interpretation. But it becomes relevant the moment someone raises fraud or negligent misrepresentation, and that is precisely where AI provenance arguments are migrating.

What provenance logging actually gives you in that scenario is a record that a particular clause was generated by GPT-4o on a Tuesday afternoon. It does not give you evidence about whether an attorney read it, whether the client was advised about it, or whether it accurately reflected negotiated deal terms. The evidentiary value for the claim being made is close to zero. The evidentiary value for creating expensive discovery disputes is extremely high.

This asymmetry matters. Watermarking may not help you win the case. It will absolutely help the opposing party generate depositions.

Sophisticated Liability Theater

Let me be direct: in its current form, AI watermarking as practiced in transactional contexts is primarily liability theater. It gives general counsel something to point to when the executive committee asks what the firm is doing about AI risk. It gives legal ops teams a line item in the AI governance policy. It creates the appearance of process discipline without the substance of it.

The substance would look different. It would involve docketed human review checkpoints tied to specific document categories — a disclosure schedule being reviewed by someone with actual knowledge of the underlying business, confirmed by contemporaneous work product. It would involve revision histories that distinguish AI-generated from human-modified text and capture who made which changes. Microsoft's track changes integration with Copilot, now in preview for enterprise Word, moves in this direction. That is more meaningful than a cryptographic hash.
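The revision-history idea is simple enough to sketch. The Python below uses the standard library's `difflib` to compare a logged AI draft with the final text and label each word of the final text by origin. It is a rough illustration under stated assumptions, not how Word, Copilot, or any DMS tracks changes: the word-level comparison, the function names, and the sample sentences are all hypothetical.

```python
import difflib


def attribute_words(ai_draft: str, final_text: str):
    """Label each word of the final text as carried over from the AI draft or added later."""
    a, b = ai_draft.split(), final_text.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    labels = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "equal" spans survive from the draft; inserted or replaced spans came afterward.
        origin = "ai" if tag == "equal" else "human"
        labels.extend((word, origin) for word in b[j1:j2])
    return labels


def human_share(labels) -> float:
    """Fraction of the final text's words that do not come from the AI draft."""
    if not labels:
        return 0.0
    return sum(1 for _, origin in labels if origin == "human") / len(labels)


draft = "The Company has no pending litigation"
final = "The Company has no material pending litigation except as set forth below"
labels = attribute_words(draft, final)

for word, origin in labels:
    print(f"{origin:5} {word}")
print("share changed after generation:", human_share(labels))
```

The "human" label here is shorthand for "not in the logged draft". A diff cannot say who made a change or why, and a reviewer who reads a clause carefully and leaves it untouched shows up as zero edits. That is why the revision history needs author metadata and contemporaneous work product alongside it.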

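The EU AI Act's transparency requirements, which apply to certain AI-generated outputs, are pushing toward provenance documentation. But the Act frames machine-readable marking as a transparency measure. Article 50 and the accompanying Recital 133 address the marking and detection of AI-generated content so that recipients know what they are looking at. They serve disclosure purposes, not evidentiary authentication of who reviewed a document or how.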

The Conclusion That Nobody on the Panel Gives

Watermarking is a technology in search of a legal use case it cannot actually serve. Transactional lawyers who are building representations and warranties frameworks around watermarking detection are building on sand. The real work is process design: what review actually happened, by whom, documented contemporaneously. The watermark may tell you something was generated. It will never tell you something was done right. In deal work, that distinction is the entire ballgame.