Compression’s new goal: Reducing how much an AI ‘overthinks’
Back in the late ’90s, you compressed because storage was limited, bandwidth was expensive, and users valued rapid response.
Back then, file compression was about encoding, restructuring, or modifying data to reduce its size – smaller payloads meant faster, more efficient delivery and less storage space.
Today, compression is about not bankrupting yourself on inference.
In the AI world, every token generated is an act of cognition, and cognition, for machines, is expensive. So we no longer compress to make things smaller. We compress so it is cheaper for AI to “think.”
And yes, bandwidth still costs money. Cloud provider egress fees are infamous, and data transfer bills can still produce heart palpitations. But be honest and compare the cost of moving a megabyte across the wire with the cost of generating 10,000 tokens on a top-shelf large language model (LLM).
One is a forgotten rounding error on the monthly bill....
Copyright of this story solely belongs to techradar.com.