AI hallucinations are complicating court cases. Legal scholar Damien Charlotin tells us about his hunt to track the issue.
Lawyers across the world are submitting legal briefs with AI fabrications in them: made-up case law, spurious quotes, and fake citations. One researcher has made it his mission to study these.
Anyone who uses AI tools regularly has experienced what researchers call hallucinations: seemingly legitimate answers the program gives in response to queries that read like facts but are completely made up.
A hallucination could be a fake news article the AI spits out on a topic you've asked about, complete with fake authors and a fake headline, or just a random mistake the software makes and presents as the truth.
This does not appear to be some bug that is getting ironed out as AI matures; the issue is actually getting worse as tools like ChatGPT grow more powerful. OpenAI, for example, found that two of its more recently released models, o3 and o4-mini, hallucinate at least 33 percent and 48 percent of the time, respectively, well over twice the rate of its previous system. And many researchers believe hallucinations will trail AI as long as it is in use, raising existential questions about the purpose and benefit of using these systems for high-level tasks.
One arena where this is popping up now is in court. AI systems being used by lawyers are hallucinating fake citations — the court decisions and research on which lawyers base their arguments. In other cases, AI programs have been found to fabricate quotes from real cases. And these hallucinations are making their way into actual briefs and memos submitted to courts around the world.
Paris-based researcher and legal scholar Damien Charlotin recently started tracking these faux pas, and how courts are addressing them: with fines, penalties, or awkward and confusing interludes. A spreadsheet he has compiled of these errors now has 156 examples, and he says the issue is accelerating.
The hallucinations and their appearance in such a formal setting are the stuff of farce; comical or tragic depending on your perspective. A particular favorite of Charlotin’s is a case where a legal team submitted a brief that included a fake decision from the real judge overseeing the case.
I spoke to Charlotin this week about his own experiments with AI, how he can tell that AI is responsible for a mistake in court, and why he's not actually pessimistic about the automated future. A lightly edited and condensed version of our conversation is below.
Eli Rosenberg: Tell us a little bit about how you started this effort?
Damien Charlotin: I was teaching students about large language models and how they might impact the future of the legal profession. Naturally, I started talking about hallucinations. But since I'm an empirical scholar, I wondered what the data showed. How many decisions are out there? Can I track them? I looked for a database and couldn't find one, so I decided to build one myself.
I’ve long worked on legal argumentation; my thesis was actually about citations. So this connects pretty directly with what I’ve always been interested in.
ER: Your database has cases from courts all over the world: what is your method for finding these errors?
DC: I only started six weeks ago, so it's still evolving. Initially, I spent a few hours collecting historical cases, before April 2025, and found about 40. Then, right as I started, the issue seemed to explode: over 20 cases in April, more than 40 in May. I repurposed scrapers I had developed for my journalism work to find these kinds of cases.
Because it's gotten so much traction, a lot of people now send me examples, some very helpful, some not. One old lady told me, for example, that ChatGPT lied to her in Italian, which was sweet but not relevant.
What I'm interested in is how courts handle these errors. There are many cases where an AI hallucination is alleged and the court does nothing about it, just ignores it. Those I don't track. But I'm interested in the engagement: looking at how courts try to deal with these kinds of things, and looking for patterns.
ER: Have you found any so far?
DC: I did post a paper last week, now under peer review, about how to handle hallucinations. So far, I’ve seen a lot of embarrassment for everyone involved when this happens, and relatively mild sanctions — mainly when someone files something with hallucinations and refuses to own up to it, which is very common. I’ve been very surprised to see so many people just deny what is obvious, because a hallucination is just not an error a human would make.
People ask me, 'How are you sure?' Because no human would fabricate a citation like this. There are a few cases predating the release of ChatGPT of people making up citations, but that kind of deliberate fraud is very, very rare, and it usually went terribly for them.
I’ve been noticing some differences in the way different jurisdictions respond: Israeli courts might give a small fine. U.S. courts tend to be very embarrassed and give you a warning, ‘Don’t do it again,’ etc. That’s partly procedural—the U.S. legal process has many stages where such corrections can be made before a final ruling. It’s still a developing area.
ER: What are the signs that there have been AI-written errors in legal documents?
DC: I look for mentions of nonexistent citations or fabricated cases, anything that suggests hallucinations. Again, if a cited case doesn't actually exist, that's not a human mistake. There is no other credible explanation. That simply didn't happen before November 2022, when ChatGPT was released.
ER: And what do you think it signifies about where we’re at with this technology, that this is happening in such a high-level setting?
DC: I'm not sure it's that significant, given the number of cases around the world that are using AI.
Are these mistakes that common? I’m not sure. My database has a disclaimer—obviously, the world of hallucinations is much larger than what I can cover, because I only include decisions that I’ve found. Many likely go undetected.
My broader argument is that it’s not such a big break from the past. Sloppy legal work has always existed. Lawyers have copied and pasted from old briefs, reused strings of citations that bear no relation to the argument they’re making.
The difference is that in the past, if you cited something, there was a case there. You could always try to argue and weasel your way around the fact that it might not really apply to the argument you’re making. But with hallucinations, the citation does not actually exist. What does it say about legal argumentation? It’s a field where we’ve always focused a lot on authority. “This is law, because X” — a full footnote, and then an authority. To an extent, that might have become a bit performative, maybe to an unhealthy degree. And it’s coming back to bite us.
So decades of relying on authorities for arguments led to these large language models being trained on that, and now they are coming back to us and showing us what reflexes we have when we generate text. So I think it says something about legal argumentation: too many citations, not enough justification.
ER: What does this mean for the future of legal work?
DC: I’m quite optimistic about AI in the legal domain actually. I didn’t start this project because I think AI doesn’t work. I expect the tools will get better, though the basic technology won’t fundamentally change.
I think AI is great from an access-to-justice perspective. The amount of work needed for briefings and legal research is a huge barrier. On the other hand, a lot of the justice system is premised on the assumption that people won't use it. I'm afraid that if we make it easier to file cases, then cases that should not be filed might be filed. Something like: I pushed you on the street, and you may have a damage case against me for $5. Would you file it if you had to go to court? No. Would you do it if you just had to click a button? Maybe. So some systemic impacts could be bad.
We are reliant on systems that are geared for humans and come with the assumption that some things take so much effort that they won’t happen that often. Will that work once we remove those barriers? I’m not totally sure. But I expect that we will have greater access to justice as well. And lawyers will have more time to do different things.
ER: Beyond the legal field, it sounds like you’re not pessimistic about the future with AI writ large?
DC: That’s complicated. On one hand, there are great aspects to it. On the other, one has to be aware of the cost of convenience. It’s a mixed bag, but I'm naturally optimistic.
That’s the paradox of automation. Philosophers have been having this discussion for centuries. If you don’t do a task yourself, what exactly are you doing? AI tends to steer towards the middle ground. We might deprive ourselves of a lot of diversity in terms of reasoning and arguments if we just rely too much on this.
ER: I’ve experimented with AI a bit for longer writing projects, and it succeeds on some tasks while failing miserably at others. It does a decent job as a research assistant — finding news articles on relevant topics from specific time periods, for example. Obviously everything needs to be double-checked for hallucinations.
But on larger tasks, it produces something that looks like good writing but is not good writing. The text lacks texture, depth or heart; it fails to connect. It produces a simulacrum. Have you had experiences like that?
DC: I recently co-authored an article about how we submitted AI-generated memos to a legal competition, the Jessup Moot Court, where you submit entries on a hypothetical case about international law. The judges didn’t know. Ours looked great, like a good lawyer’s — but if you dug deeper, they were repetitive, rambling and full of hallucinations.
We handed in about a dozen AI-generated memorials, each about 40 pages, and they were graded by a panel of judges. We did not change a word. Despite their flaws, they looked professional, and we scored above average in a contest with thousands of entries. We did well on style and structure, not as well on research or depth. It took us two days to do something students normally spend five months doing.
ER: How do you use AI in your own life?
DC: In day-to-day life, I use it the way people use Google. I’m having a kid in a few weeks, so I’ve asked what we need to do to prepare. And I expect once the kid comes, we’ll get a lot of reassurance from it — is this normal, that kind of thing.
I teach coding for lawyers to automate tasks, and now I use AI to automate even more. In my legal practice, I use it to generate text. I was writing a brief today in a case. And I used AI, because that’s what I do. Obviously I checked and read over everything, because I don’t want to end up in my own database.
Do you have an interesting, concerning or strange story about an experience with AI? Feel free to reach out to me on Signal at @elirosenberg.30.