Google says Gemini 3.5 Flash can slash enterprise AI costs by more than $1 billion a year


At its annual I/O developer conference on Tuesday, Google unveiled Gemini 3.5 Flash, a new artificial intelligence model that the company says shatters what had become a seemingly iron law of the AI industry: that the smartest models must also be the slowest and most expensive to run.

The model sits at the center of a sweeping set of announcements — from a video-generating "world model" called Gemini Omni to a 24/7 personal AI agent called Gemini Spark — but 3.5 Flash carries perhaps the most immediate consequence for the enterprises pouring billions of dollars into AI infrastructure. Sundar Pichai, Google's chief executive, told reporters during a press briefing Monday that companies running roughly one trillion tokens per day on Google Cloud could save more than $1 billion annually by shifting 80 percent of their workloads to a mix of Flash and other frontier models.

"You've probably heard anecdotes from...

Copyright of this story solely belongs to venturebeat.com.
