Mean Pooling Was Hiding Prompt Injections in Our RAG Pipeline

I’ve been spending way too much time lately looking at cosine similarity scores for RAG injections, and the numbers were just making no sense. I had this one notebook where I was testing a standard corporate email against a version with a malicious "write a keylogger" command tucked in the middle. The scores were almost identical (0.98 vs 0.96). The model basically couldn't see the attack at all, even though it was right there in plain English.
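One way to see why a short injected span barely moves a document-level score is to look at what averaging does. The sketch below is a toy illustration, not the notebook described above: the three-dimensional token vectors, the 40-token document length, and the names `BENIGN` and `INJECTED` are all assumptions made up for the example, and real encoders produce contextual token embeddings that are far less cleanly separated.

```python
import math

def mean_pool(token_vectors):
    """Average a list of token vectors into a single document vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(v[i] for v in token_vectors) / n for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings: ordinary email tokens all point one way,
# injected-command tokens point in an orthogonal direction.
BENIGN = [1.0, 0.0, 0.0]
INJECTED = [0.0, 1.0, 0.0]

clean_doc = [BENIGN] * 40
# Same email with a 3-token command tucked into the middle.
injected_doc = [BENIGN] * 20 + [INJECTED] * 3 + [BENIGN] * 17

clean_vec = mean_pool(clean_doc)
injected_vec = mean_pool(injected_doc)

# Document-level view: the 3 odd tokens are averaged against 37 normal ones.
pooled_similarity = cosine(clean_vec, injected_vec)

# Token-level view: compare each token to the clean document vector.
lowest_token_similarity = min(cosine(tok, clean_vec) for tok in injected_doc)

print(f"pooled similarity:       {pooled_similarity:.3f}")
print(f"lowest token similarity: {lowest_token_similarity:.3f}")
```

In this toy setup the pooled similarity stays very close to 1 while the per-token comparison drops to zero at the injected span, which is the dilution effect in miniature. The gap will be much less clean with a real encoder.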

I’m an AI Product Lead at LatentView Analytics. We’ve been trying to harden RAG pipelines for some Fortune 500 clients, and everyone is pretty worried about prompt injection, specifically the "indirect" kind, where the attack is hidden in a retrieved document. If you saw the Slack AI incident in 2024, that’s the exact threat model.

My goal was to build a really cheap defense layer. Since your encoder is already turning every...

Copyright of this story solely belongs to hackernoon.com.
