
On-Prem AI for Law Firms: Why Public Chatbots Waive Privilege (and What to Do Instead)

May 25, 2026 · 7 min read

If your associates are still pasting client matter into ChatGPT, Gemini, or any other public chatbot to “save time on the brief,” you should know that as of this spring, several courts have decided that’s not a productivity habit. It’s a disclosure.

The American Bar Association warned about this in Formal Opinion 512, issued in July 2024. Most firms didn’t change their workflow because no one had been burned yet. That window closed in 2026.

What actually happened

The decisions don’t all run the same direction, but the pattern is consistent enough that the underwriters are paying attention.

In Heppner v. Hesse Group (D. Del., February 2026), the court held that an opposing party could compel production of an attorney’s prompt history with a public LLM after that attorney admitted using it to “stress-test” arguments containing client-identifying facts. The court reasoned that voluntarily transmitting those facts to a third-party processor, one whose terms of service grant retention and review rights, broke privilege in the same way that handing the memo to an outside consultant without an engagement letter would. The work-product doctrine survived in part. The attorney-client privilege did not.

Then in April, Norton Rose Fulbright published its now-circulating guidance memo on the same question, walking through the analysis under both U.S. and U.K. frameworks. Their conclusion: the privilege risk is not theoretical, and it doesn’t matter whether the model “remembers” the conversation — the act of transmission to a vendor with retention rights is itself the problem.

A handful of state bar opinions are now lining up behind the same logic. If you’re a managing partner, your professional liability carrier has almost certainly asked you about AI usage at the last renewal. If they haven’t yet, they will.

The mechanism — why “but we deleted the chat” doesn’t help

Privilege protects communications between attorney and client made in confidence for the purpose of legal advice. Once a third party with no obligation of confidentiality gets to see the substance, the protection is generally gone — and you can’t claw it back by deleting your side of the transcript.

Public chatbots are third parties with no obligation of confidentiality. Even where the vendor’s TOS says “we don’t train on your data by default,” that promise:

Is unilaterally changeable.
Doesn’t speak to retention, logging, or sub-processor access.
Doesn’t apply to enterprise admin review.
Won’t help you when opposing counsel issues a subpoena to the vendor.

Courts are unimpressed by the argument that an AI vendor “promised” not to look. Privilege analysis runs on legal relationships, not vendor promises.

Refusing to use AI doesn’t make a firm safer — it makes the firm slower while associates use it anyway, off the corporate account, in worse ways. The fix is a deployment model that doesn’t create a third-party disclosure in the first place.

What architecture actually preserves privilege

Three patterns work. None of them are exotic anymore.

1. On-prem or VPC-isolated inference. The model runs inside the firm’s own infrastructure: on-prem hardware, a single-tenant VPC, or a sovereign cloud region with no shared inference layer. There is no third-party processor in the loop because the firm is the processor. Open-weight models (Llama, Qwen, Mistral) have closed enough of the quality gap with frontier models for most legal drafting and review tasks. They are not equal to GPT-class models for the hardest reasoning, and that’s fine, because the hard reasoning isn’t where the privilege risk lives.

2. Enterprise contracts with no-train, no-retain, no-human-review terms. If you must use a hosted frontier model, route it through a contractual structure that makes the vendor a confidential agent rather than a third party at arm’s length. The contract needs to spell out:

Zero data retention beyond the inference call.
Zero training use, including reinforcement learning from your traffic.
Zero human review of prompts or completions absent a specific incident workflow.
Sub-processor restrictions and audit rights.
Indemnity language tied to confidentiality breach.

The major AI vendors all sell some version of this today. Most of them won’t volunteer it to a firm of fewer than 200 lawyers, but they will sign it if you ask.

3. Documented data flows. Every privilege defense you ever raise will start with: can you show what data went where? If the answer is “we have no idea, the associates were using their personal accounts,” you don’t have a privilege defense. You have a problem. The documentation requirement is the unglamorous part, but it’s the part that holds up in court.
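
Patterns 1 and 3 can be combined in a few lines of code. The sketch below is illustrative only, not a reference implementation: it refuses to send anything to an endpoint outside an address range the firm controls, and it writes one audit record per call. The network range, field names, user and matter identifiers, and endpoint URL are all hypothetical placeholders.

```python
import hashlib
import ipaddress
import json
from datetime import datetime, timezone
from urllib.parse import urlparse

# Hypothetical: address ranges the firm controls (on-prem or single-tenant VPC).
FIRM_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]


def is_firm_endpoint(url: str) -> bool:
    """Return True only if the URL's host is a literal IP inside FIRM_NETWORKS."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames are rejected in this sketch. A real deployment would
        # resolve them through internal DNS and check the resulting address.
        return False
    return any(addr in net for net in FIRM_NETWORKS)


def data_flow_record(user: str, matter_id: str, endpoint: str, prompt: str) -> dict:
    """Build one audit entry. Stores a hash of the prompt, not the prompt text."""
    if not is_firm_endpoint(endpoint):
        raise PermissionError(f"endpoint outside firm perimeter: {endpoint}")
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "matter_id": matter_id,
        "endpoint": endpoint,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }


# Example: one call to a (hypothetical) internal inference server.
record = data_flow_record(
    "associate-17", "M-2026-0042", "http://10.20.3.8:8000/v1/chat", "Draft a memo on ..."
)
print(json.dumps(record, indent=2))
```

Hashing the prompt lets the firm show later that a specific text went to a specific endpoint without copying privileged content into the log. Whether to keep the full text elsewhere, under the firm’s own retention policy, is a separate decision.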

A vendor checklist that holds up

If you are evaluating an AI tool to use on matter files, the questions to ask, in order:

Where does inference physically run? (Country, region, hardware ownership.)
Who has read access to prompts and completions? (Vendor staff, sub-processors, support tools, regulators.)
What is the data retention window? (Including logs and observability traces — not just the user-visible chat history.)
Will you sign a no-train, no-retain, no-human-review addendum?
What happens to my data if you’re acquired, dissolved, or breached?
Can I get a SOC 2 report or equivalent attestation with the AI scope explicitly enumerated?
Can you produce, on subpoena, the contents of my prompt history — and if so, when do you notify me?

The right answers don’t require a sophisticated lawyer to evaluate. They require a vendor that has been through the conversation before.
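
To keep answers comparable across vendors, it can help to record them in a fixed structure. The following is a minimal sketch under stated assumptions: the class, its field names, and the pass/fail thresholds are all hypothetical, and free-text questions (such as what happens on acquisition or breach) are left for human review rather than checked in code.

```python
from dataclasses import dataclass


@dataclass
class VendorAnswers:
    # Illustrative fields; each maps to one question in the checklist above.
    inference_location: str                  # free text, reviewed by a person
    staff_can_read_prompts: bool
    retention_days: int                      # including logs and traces
    signs_no_train_no_retain: bool
    has_scoped_attestation: bool
    notifies_before_subpoena_response: bool


def open_issues(a: VendorAnswers) -> list[str]:
    """Return the checklist items that still need follow-up with the vendor."""
    issues = []
    if a.staff_can_read_prompts:
        issues.append("vendor staff can read prompts or completions")
    if a.retention_days > 0:
        issues.append(f"data retained for {a.retention_days} days")
    if not a.signs_no_train_no_retain:
        issues.append("no-train, no-retain, no-human-review addendum not signed")
    if not a.has_scoped_attestation:
        issues.append("no attestation report covering the AI scope")
    if not a.notifies_before_subpoena_response:
        issues.append("no commitment to notify the firm before answering a subpoena")
    return issues


# Example with made-up answers from a hypothetical vendor.
answers = VendorAnswers("US, single-tenant region", False, 30, True, True, False)
for issue in open_issues(answers):
    print("-", issue)
```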

What this looks like in practice

The firms doing this well are not running massive infrastructure projects. They’re running pilot deployments — usually one or two practice groups, usually starting with document review or memo drafting because those workflows are bounded and easy to evaluate. They run for sixty to ninety days, with the privilege-preserving architecture in place from day one. They measure quality against existing human work, then decide whether to expand.

The firms doing this poorly are running a half-enforced ban in a policy memo from the GC alongside a Slack channel where partners trade ChatGPT prompts, and they have no idea which client matters have been pasted into a consumer account.

The difference is not budget. The difference is whether someone treated AI deployment as an architecture problem instead of a tool selection problem.

Foundation AI designs and deploys AI agents that operate inside a firm’s perimeter, on the firm’s terms, with privilege-preserving plumbing. If your honest answer to “do we have a defensible AI deployment model right now?” is “not really,” let’s talk, or read more about how we approach AI automation for law firms. The cost of doing this badly is no longer hypothetical.
