Frontier AI models corrupt 25% of document content


As large language models become more capable, users are tempted to delegate knowledge tasks where models process documents on their behalf and provide the finished results. But how far can you trust the model to stay faithful to the content of your documents when it has to iterate over them across multiple rounds?

A new study by researchers at Microsoft shows that large language models silently corrupt documents that they work on by introducing errors. The researchers developed a benchmark that simulates multi-step autonomous workflows across 52 professional domains, using a method that automatically measures how much content degrades over time.

Their findings show that even top-tier frontier models corrupt an average of 25% of document content by the end of these workflows. And providing models with agentic tools or realistic distractor documents actually worsens their performance.

This serves as a warning that while there is increasing pressure to automate...

Copyright of this story solely belongs to venturebeat.com.

