Nova Act promises AI-powered browser automation—but Amazon admits it’s still a research preview, not a finished product.
Amazon’s new SDK lets devs build agents that click, scroll, and fill out forms in your browser—but it’s still early days.
Amazon drops Nova Act: useful tech, not magic
Another week, another AI agent.
On March 31, 2025, Amazon released Nova Act, its first public attempt at a general-purpose browser-based AI agent. The idea? You give it instructions in natural language, and it performs real actions inside a web browser, with no API required.
It’s available through nova.amazon.com, a new playground where developers can explore Amazon’s homegrown “Nova” foundation models and build with them using the new SDK.
But let’s be clear up front: this is a research preview, not a finished product.
What Nova Act actually does
With the Nova Act SDK, developers can build AI agents that:
- Submit time-off requests
- Make online orders
- Fill out web forms
- Interact with drop-downs, date pickers, and popups
- Handle step-by-step flows like checkout or onboarding
No integrations, no fancy APIs—just browser control via natural language + Python.
It also supports behind-the-scenes workflows (headless mode), parallel agent execution, and interleaving natural-language steps with regular Python code for reliability and testing. Think Playwright + LLM, but more tightly integrated.
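As a rough illustration of what interleaving small natural-language steps with Python can look like, here is a minimal sketch. It does not call the Nova Act SDK itself: `FakeAgent`, `run_steps`, and the step strings are hypothetical stand-ins so the example runs without a browser or API key. The point is the pattern: short instructions, issued one at a time, with ordinary Python handling the retries.

```python
from dataclasses import dataclass, field
from typing import Protocol


class BrowserAgent(Protocol):
    """Anything that can carry out one natural-language instruction."""

    def act(self, instruction: str) -> str: ...


@dataclass
class FakeAgent:
    """Hypothetical stand-in for a real agent; fails selected steps once."""

    fail_first: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def act(self, instruction: str) -> str:
        self.log.append(instruction)
        if instruction in self.fail_first:
            self.fail_first.discard(instruction)  # fail once, then succeed
            raise RuntimeError(f"could not complete: {instruction}")
        return f"done: {instruction}"


def run_steps(agent: BrowserAgent, steps: list, max_attempts: int = 2) -> list:
    """Run small instructions in order, retrying each one on failure."""
    results = []
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(agent.act(step))
                break
            except RuntimeError:
                if attempt == max_attempts:
                    raise  # give up after the last attempt
    return results


steps = [
    "open the time-off request form",
    "set the start date to next Monday",
    "submit the form",
]
agent = FakeAgent(fail_first={"set the start date to next Monday"})
results = run_steps(agent, steps)
print(results)
```

Keeping each instruction small means a failure points at one step rather than at the whole workflow, which is where plain Python checks and retries earn their keep.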
The benchmarks: better than peers, but don’t overread it
Amazon claims Nova Act beats or comes close to OpenAI’s CUA and Anthropic’s Claude 3.7 in its own runs of browser interaction benchmarks like ScreenSpot and GroundUI.
📊 Example:
- ScreenSpot Web Text: Nova Act 0.939 vs. Claude 3.7 at 0.900
- GroundUI Web: Nova Act 0.805 vs. OpenAI’s CUA at 0.823 (slightly behind, but still competitive)
It’s a strong showing, but keep in mind:
- These are Amazon’s own reported numbers, not independent results
- No tests yet on broader evals like WebVoyager
- No real-world usage data
Translation: promising, but let’s not crown it king just yet.
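To put the two example scores above in perspective, here is a quick bit of arithmetic. It uses only the four numbers quoted in this article; the `gap` helper and the dictionary layout are just illustrative choices.

```python
# Scores as quoted above (Amazon's own reported numbers).
scores = {
    "ScreenSpot Web Text": {"Nova Act": 0.939, "Claude 3.7": 0.900},
    "GroundUI Web": {"Nova Act": 0.805, "OpenAI CUA": 0.823},
}


def gap(nova: float, rival: float) -> tuple:
    """Return (absolute gap, relative gap in percent) of Nova Act vs. the rival."""
    absolute = nova - rival
    return round(absolute, 3), round(100 * absolute / rival, 1)


gaps = {}
for benchmark, row in scores.items():
    nova = row["Nova Act"]
    # Each row holds exactly one rival besides Nova Act.
    rival_name, rival = next((k, v) for k, v in row.items() if k != "Nova Act")
    gaps[benchmark] = gap(nova, rival)
    absolute, relative = gaps[benchmark]
    print(f"{benchmark}: {absolute:+.3f} vs. {rival_name} ({relative:+.1f}%)")

# ScreenSpot Web Text: +0.039 vs. Claude 3.7 (+4.3%)
# GroundUI Web: -0.018 vs. OpenAI CUA (-2.2%)
```

A lead of about four percent on one test and a deficit of about two percent on another is a close race, not a blowout.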
The use case for ecommerce operators
If you run workflows that live entirely inside web browsers—and your tools don’t have APIs—Nova Act is worth a look.
👀 Potential use cases:
- Automating back-end ecommerce tasks (vendor portals, supplier forms)
- Scraping or monitoring PDPs
- Filling out repetitive shipping or compliance forms
- Prototyping internal tools that help employees navigate janky systems
💡 But: It’s not production-ready. It’s a dev toy—for now. And only useful if you have Python skills in-house.
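For the repetitive-form use case, much of the Python work is simply turning records into short, unambiguous instructions. Below is a minimal sketch under stated assumptions: the CSV columns, field names, and the `form_steps` helper are all hypothetical, and the resulting strings would still need to be handed to an agent to do anything.

```python
import csv
import io

# Hypothetical shipment records; a real run would read a file or database.
RAW = """order_id,carrier,weight_kg,destination
A-1001,UPS,2.5,Berlin
A-1002,DHL,0.8,Lyon
"""


def form_steps(record: dict) -> list:
    """Turn one record into small, explicit instructions for a form."""
    return [
        f"enter '{record['order_id']}' in the order ID field",
        f"choose '{record['carrier']}' from the carrier drop-down",
        f"enter '{record['weight_kg']}' in the weight (kg) field",
        f"enter '{record['destination']}' in the destination field",
        "click submit and wait for the confirmation message",
    ]


records = list(csv.DictReader(io.StringIO(RAW)))
batches = {r["order_id"]: form_steps(r) for r in records}

for order_id, steps in batches.items():
    print(order_id)
    for step in steps:
        print("  -", step)
```

Generating the instructions from data keeps the wording identical from one run to the next, which makes a flaky step easier to spot.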
Amazon’s endgame: agents, not chatbots
Amazon’s been late to the AI party, but Nova Act is part of its broader “Nova” push—models built in-house to reduce reliance on OpenAI and Anthropic, and to power Amazon-native apps like the upcoming Alexa+ refresh.
The agent layer is the next battleground: OpenAI has “Operator,” Anthropic has “Computer Use,” Google has… whatever Gemini is trying to be. Everyone’s chasing the same vision: AI that doesn’t just talk, but does.
Amazon’s angle? Skip the glossy demos. Focus on high-reliability, low-glamour browser control. It’s a solid niche—if they can execute.
The bottom line: promising, not revolutionary
Nova Act isn’t some AGI leap. It’s a browser automation tool with an LLM strapped to it. That’s useful. It might even be important. But let’s not confuse “can click buttons reliably” with “future of intelligence.”
✅ Worth exploring if:
- You’re running browser-based workflows with no API access
- You have devs interested in AI agents
- You’re already in the AWS/Nova ecosystem
🚫 Skip if:
- You need reliability now
- You want polished UX or out-of-the-box productivity
- You’re not into early-stage dev work
Amazon’s late, but they’re playing the long game. Nova Act won’t change your business tomorrow—but it’s worth keeping on the radar.