AI will soon be capable of telling convincing lies

https://image.theregister.com/5239371.jpg?imageId=5239371&x=0&y=0&cropw=100&croph=100&panox=0&panoy=0&panow=100&panoh=100&width=1200&height=683

AI + ML

That's fine when playing poker, but less useful when we trust LLMs with serious work like finding software flaws

The smart LLM user checks models’ output for hallucinations. Now, it appears we need to inspect them for signs they are gaslighting us – an unforeseen cost of increasing intelligence.

Most of the Internet lost its marbles over the cracking abilities of Anthropic's Mythos Preview. Those capabilities are real, but – as the release of OpenAI's GPT-5.5 has shown us – they're not unique. A rising tide of intelligence makes these models increasingly competent at an ever-wider range of tasks – including finding and exploiting code vulnerabilities.

The more significant signal from Mythos is buried in its novel-length System Card and concerns the model's honesty, because on at least one occasion Anthropic detected Mythos using an explicitly forbidden technique to solve a problem.

Models always have a bit...

Copyright of this story solely belongs to theregister.com. To see the full text click HERE

Read more