Leading AI models are more vulnerable to malicious prompts than vendors claim

Dive Brief:

Major AI developers’ model-safety claims rest on incorrect assumptions about how hackers behave, Cisco researchers said in a report published on Wednesday.
AI vendors assume that their models are safe from hijacking if they can fend off a single malicious prompt at a time, but hackers are increasingly using multistage prompts to evade model defenses, Cisco said, and most models aren’t prepared for those kinds of attacks.
The new report illustrates a mostly underappreciated danger lurking inside AI models, one that could expose businesses using these tools to a wide range of disruptions and harm.

Dive Insight:

Cisco’s evaluation of 15 leading AI models from OpenAI, Anthropic, Google, Amazon and xAI “found that single-turn attack success rate (ASR) is not a reliable proxy for what happens when an attacker can adapt across turns,” researchers Nicholas Conley and Amy Chang wrote.

Their tests revealed that AI models were much...

Copyright of this story solely belongs to ciodive.com.
