Leading AI models are more vulnerable to malicious prompts than vendors claim

Dive Brief:

  • Major AI developers’ model-safety claims rest on incorrect assumptions about how hackers behave, Cisco researchers said in a report published on Wednesday.
  • AI vendors assume that their models are safe from hijacking if they can fend off a single malicious prompt at a time. But hackers are increasingly using multistage prompts to evade model defenses, Cisco said, and most models aren’t prepared for those kinds of attacks.
  • The new report illustrates a largely underappreciated danger lurking inside AI models, one that could expose businesses using these tools to a wide range of disruptions and harm.

Dive Insight:

Cisco’s evaluation of 15 leading AI models from OpenAI, Anthropic, Google, Amazon and xAI “found that single-turn attack success rate (ASR) is not a reliable proxy for what happens when an attacker can adapt across turns,” researchers Nicholas Conley and Amy Chang wrote.
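To make the distinction concrete, here is a minimal sketch of how a single-turn rate and a multi-turn rate can diverge on the same set of test conversations. The `Conversation` type, the function names and the toy outcomes are all assumptions made up for illustration; they are not Cisco's methodology, data or results.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Conversation:
    # Hypothetical record: one flag per turn, True if the model
    # produced a harmful response on that turn.
    turn_breached: List[bool]


def single_turn_asr(conversations: List[Conversation]) -> float:
    """Share of conversations in which the first prompt alone succeeded."""
    if not conversations:
        return 0.0
    hits = sum(1 for c in conversations if c.turn_breached and c.turn_breached[0])
    return hits / len(conversations)


def multi_turn_asr(conversations: List[Conversation]) -> float:
    """Share of conversations in which any turn succeeded."""
    if not conversations:
        return 0.0
    hits = sum(1 for c in conversations if any(c.turn_breached))
    return hits / len(conversations)


# Invented toy outcomes, not measurements from any real model.
toy = [
    Conversation([False, False, True]),   # breached only on the third turn
    Conversation([True]),                 # breached on the first prompt
    Conversation([False, False, False]),  # never breached
    Conversation([False, True]),          # breached on the second turn
]

print(f"single-turn ASR: {single_turn_asr(toy):.2f}")
print(f"multi-turn ASR:  {multi_turn_asr(toy):.2f}")
```

In this toy set, only one of four conversations is breached by the opening prompt, while three of four are breached at some point, which is the kind of gap a single-turn figure would hide.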

Their tests revealed that AI models were much...

Copyright of this story belongs solely to ciodive.com.
