Essay · June 2026 · 4 min read

How AI fabricates a citation that looks perfect

Ask a general-purpose AI for a reference and it will often hand you one that looks immaculate: a plausible author list, a real-sounding journal, a tidy year, even a DOI. The catch is that the paper may not exist. Understanding why is the first step to never being caught out by it.

A citation is just more text to predict

A language model works by predicting the most likely next piece of text, one token at a time. It has read enormous amounts of writing, including millions of real references, so it has learned the shape of a citation extremely well: the order of authors, how journals are abbreviated, where the year sits, what a DOI looks like.

But learning the shape is not the same as looking something up. When the model writes a reference, it isn't retrieving a record from a database; it is generating the most probable-looking string of characters. Sometimes that string happens to match a real paper. Often enough to matter, it produces a citation that is formatted flawlessly and points to nothing.
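To make the distinction concrete, here is a toy sketch in Python. It is not how a language model works internally; it only shows that a program which knows the pattern of a citation, and nothing else, can emit a well-formed one with no database behind it. Every name, journal, and DOI fragment below is a deliberately obvious placeholder, and the function name is invented for this illustration.

```python
import random

# Placeholder fragments standing in for patterns absorbed from real references.
# None of these refer to real people, journals, or papers.
SURNAMES = ["Placeholder", "Example", "Sample"]
JOURNALS = ["J. Imaginary Stud.", "Ann. Placeholder Res."]


def plausible_citation(rng: random.Random) -> str:
    """Assemble a citation-shaped string from patterns alone.

    Nothing is looked up: there is no database of papers anywhere in this code.
    """
    authors = ", ".join(
        f"{rng.choice(SURNAMES)} {rng.choice('ABCDEFG')}"
        for _ in range(rng.randint(2, 3))
    )
    journal = rng.choice(JOURNALS)
    year = rng.randint(2005, 2023)
    volume = rng.randint(1, 60)
    first_page = rng.randint(1, 900)
    last_page = first_page + rng.randint(4, 12)
    # A DOI-shaped suffix built from the same made-up numbers.
    doi = f"10.{rng.randint(1000, 9999)}/ex.{year}.{rng.randint(100, 999)}"
    return f"{authors}. {journal} {year};{volume}:{first_page}-{last_page}. doi:{doi}"


if __name__ == "__main__":
    print(plausible_citation(random.Random(0)))
```

Each run with a different seed prints a different, equally tidy reference, and none of them corresponds to anything that was ever published.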

The formatting is the easy part to fake. The existence of the paper is the hard part — and that's the part nobody checks at a glance.

Why it's so convincing

Three things make a fabricated reference hard to spot. The authors are often real people who work in the field, so the names ring true. The journal is usually a genuine one that publishes on the topic. And the DOI has the correct format, so it looks verifiable even when it resolves to the wrong article — or to nothing at all.
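The point about DOIs can be shown in a few lines of Python. The sketch below uses a commonly used pattern for modern DOIs (a "10." prefix, a numeric registrant code, a slash, then a suffix); treat the exact pattern as an assumption rather than a specification. The DOI being tested is made up for this example. A format check accepts it anyway, because the check only looks at shape.

```python
import re

# Assumed pattern for the shape of a modern DOI; it says nothing about existence.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/[-._;()/:a-z0-9]+$", re.IGNORECASE)


def looks_like_doi(text: str) -> bool:
    """Return True if the string has the shape of a DOI.

    This is a format check only. It cannot tell whether the DOI is registered
    or which article it points to.
    """
    return DOI_SHAPE.match(text.strip()) is not None


if __name__ == "__main__":
    made_up = "10.1234/not.a.real.paper.2021"  # invented for this example
    print(looks_like_doi(made_up))             # well-formed, so the check passes
    print(looks_like_doi("see journal website"))
```

Passing this kind of check is what makes a fabricated DOI look verifiable. Only following the link shows where it actually leads.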

A subtler failure is the half-true citation: a real paper, attached to a claim it doesn't actually make. The reference resolves, the formatting is clean, and only someone who opens the source and reads it would notice the mismatch.

Why it matters more in medicine

In most writing, a bad reference is an embarrassment. In a clinical manuscript, it's a liability. A single fabricated or misattributed citation can undermine an argument, mislead a reader making a care decision, and put a paper at risk of correction or retraction. The bar for a reference you put your name to is simple: you should be able to open it and see your claim supported.

How to catch it

The habit that protects you is unglamorous but reliable: open every source. Don't trust the formatting, and don't trust that a DOI exists because it's shaped like one — follow it and confirm it lands on the paper named. Then read enough of that paper to confirm it actually supports the sentence it's attached to. If you can't open it, you can't cite it.
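The checklist above can be written down as a small decision rule. The Python sketch below is one possible encoding, not a standard tool: the `Reference` fields, the function names, and the 0.9 title-similarity threshold are all assumptions chosen for illustration. It deliberately leaves out the network step. The `landed` record is whatever you found when you opened the DOI yourself, and the final flag can only be set by a person who has read the paper.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Reference:
    title: str
    year: int
    first_author: str


def normalise(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def same_paper(cited: Reference, landed: Reference, min_title_ratio: float = 0.9) -> bool:
    """Does the page the DOI landed on match the paper named in the citation?"""
    title_ratio = SequenceMatcher(
        None, normalise(cited.title), normalise(landed.title)
    ).ratio()
    return (
        title_ratio >= min_title_ratio  # threshold is an arbitrary assumption
        and cited.year == landed.year
        and normalise(cited.first_author) == normalise(landed.first_author)
    )


def may_cite(cited: Reference, landed: Optional[Reference], claim_confirmed_by_reader: bool) -> bool:
    """Apply the rule: if you can't open it, you can't cite it."""
    if landed is None:  # the source could not be opened
        return False
    return same_paper(cited, landed) and claim_confirmed_by_reader
```

Note that `claim_confirmed_by_reader` is a plain boolean on purpose: matching metadata shows the reference is real, but only reading the source shows it supports the sentence.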

A different design

Fabrication is possible because the model is allowed to write a reference from memory. Cento is built the other way around: it retrieves real papers first, and the model may cite only from that retrieved set, by pointing to an entry in it, never by writing a reference of its own. If a claim has no supporting source, it's flagged rather than dressed up in a citation. The result isn't "trust the AI"; it's that every reference is one you can open yourself.
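The general idea of citing only from a retrieved set can be sketched in Python. This is an illustration of the constraint, not Cento's implementation: the `[R1]`-style keys, the status strings, and the `check_draft` function are all hypothetical. The draft may refer to sources only by key, every key must exist in the retrieved set, and a sentence with no key is flagged instead of being given a citation.

```python
import re

# Hypothetical citation-key format: [R1], [R2], ...
CITE_KEY = re.compile(r"\[(R\d+)\]")


def check_draft(sentences: list[str], retrieved: dict[str, str]) -> list[tuple[str, str]]:
    """Label each sentence according to the keys it cites.

    `retrieved` maps a key to a paper that was actually fetched beforehand.
    """
    results = []
    for sentence in sentences:
        keys = CITE_KEY.findall(sentence)
        unknown = [key for key in keys if key not in retrieved]
        if unknown:
            status = "rejected: cites outside retrieved set"
        elif not keys:
            status = "flagged: no supporting source"
        else:
            status = "ok"
        results.append((sentence, status))
    return results


if __name__ == "__main__":
    retrieved = {"R1": "placeholder record one", "R2": "placeholder record two"}
    draft = [
        "Claim supported by a retrieved paper [R1].",
        "Claim with no source attached.",
        "Claim citing a key that was never retrieved [R9].",
    ]
    for sentence, status in check_draft(draft, retrieved):
        print(f"{status}: {sentence}")
```

A check like this rules out invented references, since a key either resolves to a retrieved record or is rejected. It does not show that the cited paper supports the claim, which is why opening and reading the source still matters.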

Write papers you can defend

Cento is an AI co-writer for medical research where every citation is real, graded, and checked. Join the early-access waitlist — ophthalmology first.
