Red teamers turned Claude Desktop into a double agent to do their evil bidding
EXCLUSIVE Pentera Labs’ red teamers compromised a developer’s AI agent via his Claude Desktop app and ultimately turned that access into full remote code execution on the dev’s machine – demonstrating how an attacker could turn a trusted, chatty AI assistant into a double agent operating on their behalf.
“Claude’s got a new voice,” Pentera's offensive security services team leader Dvir Avraham told The Register.
“We acknowledge the huge trust in AI models – everybody uses them,” he said in a phone interview. “We used this trust to manipulate the victim, like under the hood, the victim didn't see it coming.”
It also prompted Avraham to check his own platforms. “I became a little bit paranoid,” he told us. “I'm not allowing any command to run without me examining it twice.”
In a report set to publish Wednesday, and shared in advance exclusively with The Register, Avraham and research technical...
Copyright of this story solely belongs to theregister.com.