TECH NEWS

Why the next AI safety problem is the conversation between models

In May 2025, researchers at Palisade Research ran a controlled experiment in which several advanced AI models, including OpenAI’s o3, were placed inside command-line sandboxes. The test was designed to measure controllability. Most models complied: Claude, Gemini, and Grok models allowed shutdown in all 100 test runs. However, three OpenAI reasoning models interfered with the shutdown script at least once. Codex-mini sabotaged shutdown in 12 out of 100 runs, intentionally editing or bypassing the script that was supposed to terminate it.

This appears to be the first publicly documented case of AI models preventing their own shutdown despite explicit instructions to allow it.

“This was a problem we anticipated,” said Bar Mazuz, who had spent the past year working on secure environments for agents. “The point was never just, ‘put the agent in a box and let it run,’”...

Copyright of this story solely belongs to thenextweb.com.
