Hackers are learning to exploit chatbot ‘personalities’


This is The Stepback, a weekly newsletter breaking down one essential story from the tech world. For more on AI mischief, follow Robert Hart. The Stepback arrives in our subscribers’ inboxes at 8AM ET.

How it started

Hacking the first generation of AI chatbots was a laughably simple affair. You didn’t need any technical know-how, backdoor access, or even a basic understanding of what a large language model was. You didn’t need to code. To get an AI system that had cost billions to build to abandon its safety instructions, sometimes all you had to do was ask.

These attacks, known as jailbreaks, had the quality of a young child successfully outwitting an adult: Forget what you were told earlier, pretend the rules don’t apply, or let’s play a game and I’ll decide what’s allowed (hint: later bedtime, more sweets). The...

Copyright of this story solely belongs to theverge.com.


