Intentionally Spread Thin

January 23, 2025

OpenAI is shoving more stuff out the door that seems very early, very ambitious, and very janky.

OpenAI is releasing a “research preview” of an AI agent called Operator that can “go to the web to perform tasks for you,” according to a blog post. “Using its own browser, it can look at a webpage and interact with it by typing, clicking, and scrolling,” OpenAI says. It’s launching first in the US for subscribers of OpenAI’s $200 per month ChatGPT Pro tier.

Operator relies on a “Computer-Using Agent” model that combines GPT-4o’s vision capabilities with “advanced reasoning through reinforcement learning” to be able to interact with GUIs, OpenAI says. “Operator can ‘see’ (through screenshots) and ‘interact’ (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations,” according to OpenAI.
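For a rough sense of what a "see, then interact" loop looks like, here is a toy sketch. It is not OpenAI's implementation: `FakeBrowser`, `stub_policy`, and `run_agent` are hypothetical names invented for illustration, the "screenshot" is a plain dict rather than pixels, and a hard-coded rule stands in for the model.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str         # "type", "click", or "done"
    target: str = ""  # hypothetical element label
    text: str = ""    # text to type, if any


@dataclass
class FakeBrowser:
    """Toy stand-in for a browser: one search box and one submit button."""
    search_box: str = ""
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real agent would get an image; here the observation is just visible state.
        return {"search_box": self.search_box, "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        if action.kind == "type" and action.target == "search_box":
            self.search_box += action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def stub_policy(observation: dict, goal: str) -> Action:
    """Placeholder for the model: choose the next action from the observation."""
    if observation["search_box"] != goal:
        return Action("type", "search_box", goal)
    if not observation["submitted"]:
        return Action("click", "submit")
    return Action("done")


def run_agent(browser: FakeBrowser, goal: str, max_steps: int = 10) -> list:
    """Observe, decide, act, and repeat until the policy says it is done."""
    history = []
    for _ in range(max_steps):
        action = stub_policy(browser.screenshot(), goal)
        if action.kind == "done":
            break
        browser.apply(action)
        history.append(action.kind)
    return history
```

Calling `run_agent(FakeBrowser(), "cheap flights")` types the query and then clicks submit. The point of the shape is that the agent only ever touches the page through what it can see and what a mouse and keyboard can do, which is why no site-specific API is needed.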

One thing I learned working in all these tech startups is that the easiest way to solve extremely difficult “last mile” problems is to get distracted by something else and work on that instead.