AI vision agents use 45x more tokens than APIs in benchmark
Using AI to click around on a website burns 45x as many tokens as just using APIs
For AI agents, seeing is expensive
Businesses deploying AI agents to automate computer usage may be spending far more money than necessary if those agents try to emulate human visual interaction.
Reflex, an enterprise application platform, recently set out to compare vision agents with API agents.
A vision agent in this context refers to an AI agent that mimics human interaction by relying on image processing and optical character recognition to operate an application. In this instance, that's Claude Sonnet navigating a web app user interface via browser-use 0.12, a tool for automated web browser operation.
An API agent here refers to Claude Sonnet interacting with a web app via tools and APIs. The agent calls the same handling mechanisms that the UI calls and receives structured data in response, rather than...
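To put the headline ratio in money terms, here is a minimal sketch of the arithmetic. The only figure taken from the story is the 45x token multiple. The function name, the per-task token count, and the price per million tokens are hypothetical placeholders, not numbers from the Reflex benchmark or from any vendor's price list.

```python
def compare_task_cost(api_tokens, ratio=45.0, usd_per_million_tokens=3.0):
    """Estimate per-task cost for an API agent and a vision agent.

    api_tokens: tokens the API agent uses for one task (hypothetical input).
    ratio: vision-to-API token multiple; 45.0 is the figure in the headline.
    usd_per_million_tokens: hypothetical flat price, assumed for illustration.
    """
    if api_tokens < 0 or ratio < 0 or usd_per_million_tokens < 0:
        raise ValueError("inputs must be non-negative")

    # The vision agent is assumed to use `ratio` times as many tokens.
    vision_tokens = api_tokens * ratio

    # Convert token counts to dollars at the same flat rate.
    api_cost = api_tokens / 1_000_000 * usd_per_million_tokens
    vision_cost = vision_tokens / 1_000_000 * usd_per_million_tokens

    return {
        "api_tokens": api_tokens,
        "vision_tokens": vision_tokens,
        "api_cost_usd": api_cost,
        "vision_cost_usd": vision_cost,
        "extra_cost_usd": vision_cost - api_cost,
    }


if __name__ == "__main__":
    # Hypothetical workload: 10,000 tokens per task via the API agent.
    result = compare_task_cost(10_000)
    for key, value in result.items():
        print(f"{key}: {value}")
```

Because both agents are priced at the same flat rate in this sketch, the cost multiple equals the token multiple. Real bills would also depend on how input, output, and image tokens are priced, which this example does not model.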
Copyright of this story solely belongs to theregister.com.