DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI

4 days, 4 hours ago venturebeat

Chinese AI startup DeepSeek has quietly released a new large language model that’s already sending ripples through the artificial intelligence industry — not just for its capabilities, but for how it’s being deployed. The 641-gigabyte model, dubbed DeepSeek-V3-0324, appeared on AI repository Hugging Face today with virtually no announcement, continuing the company’s pattern of low-key but impactful releases.

What makes this launch particularly notable is the model’s MIT license — making it freely available for commercial use — and early reports that it can run directly on consumer-grade hardware, specifically Apple’s Mac Studio with M3 Ultra chip.

The new Deep Seek V3 0324 in 4-bit runs at > 20 toks/sec on a 512GB M3 Ultra with mlx-lm! pic.twitter.com/wFVrFCxGS6
— Awni Hannun (@awnihannun) March 24, 2025

“The new DeepSeek-V3-0324 in 4-bit runs at > 20 tokens/second on a 512GB M3 Ultra with mlx-lm!” wrote AI ...

Copyright of this story solely belongs to venturebeat . To see the full text click HERE

Share: