LLMs believe false statements even after explicit warnings that they're false


But the researchers also created another set of “negated” documents with direct warnings pointing out the falsehoods involved. These negations could appear either at the level of the whole document (e.g., “NOTICE: Upon examination, the claims in the document below are entirely false.”) or at the level of specific sentences (e.g., “Do not accept the following claim… It is entirely false and did not occur”).

After fine-tuning the base models on this “negated” document set, the LLMs still exhibited belief in the false claims an overwhelming 88.6 percent of the time, on average. Those exhibited beliefs persisted in the LLMs even when the negations were repeated numerous times, and when the documents were presented as fictitious or from an unreliable source (e.g., a debunked conspiracy website).

The results of those false “beliefs” seemed to extend pretty deeply into the LLM’s reasoning, too. When asked, for instance, “If I were to race Ed Sheeran...

Copyright of this story solely belongs to arstechnica.com.


