Anthropic's Claude Just Ran a Research Lab Autonomously, 20x Faster Than a Human Team
Anthropic just published the Phase 2 results from Project Fetch, and the numbers are striking.
Claude Opus 4.7 completed full research lab tasks (path planning, route coordination, physical task execution in a wetlab environment) without needing any human to make decisions along the way. The speed gap compared to human teams: roughly 20x. Compared to AI teams running without Claude: 37x.
This isn't a benchmark on a leaderboard. It's an AI model running a real research workflow, start to finish, on its own.
What Project Fetch Actually Is
Project Fetch started in August 2024 as an experiment to see whether AI could replace human teams in a wetlab research setting. Phase 1 used Claude Opus 4.1 and showed that human teams, given full automation assistance, could meaningfully outperform AI-only setups in some tasks.
Phase 2 flips that result on its head. With Claude Opus 4.7, the AI completes tasks entirely without human assistance and does it faster than any human team managed to.
The tasks involved aren't software tasks; they're physical. Route planning for robot arms. Navigation decisions in a lab space. Coordinating a sequence of steps that, in a real research setting, normally require a trained scientist to supervise.
The Numbers Worth Paying Attention To
The headline figure is the 20x speed advantage over human teams. But there's a more interesting number buried in the results: Claude Opus 4.7 also ran 37x faster than AI teams that operated without Claude.
That's not comparing AI to humans. That's comparing Claude-assisted AI systems to other AI systems. The gap is enormous, and it suggests that model quality, not just automation in general, is doing a lot of the work here.
Error rates also dropped significantly, falling to under 10x compared to earlier experiments.
Where It Still Struggles
Anthropic is transparent about the limits. Claude Opus 4.7 performed well on high-level coordination: planning routes, making sequencing decisions, handling the reasoning layer of a research task. It still struggled with tasks that require precise physical manipulation: fine motor control along coastline paths, and certain insect-handling tasks that demand sub-millimeter accuracy.
These aren't surprising failure modes. Reasoning at a distance is different from controlling a gripper at the millimeter level. But the honest disclosure is useful context: this is a model doing remarkably well at the cognitive layer of lab work, while the physical execution layer is still a harder problem.
Why It's Built on a General Model
One detail worth noting: Claude Opus 4.7 is not a robotics-specific model. Anthropic didn't fine-tune a specialized system for wetlab tasks. The performance comes from scaling a general-purpose model and applying it to a physical research domain.
That's a meaningful architectural bet. The claim is that a sufficiently capable general model, given the right context and tools, can outperform specialized systems that were purpose-built for a narrower domain. Project Fetch Phase 2 is evidence in favor of that bet.
It's also a direct counter to the "AI can't do real science" argument. Running a research task autonomously, from start to finish, with a 20x speed advantage: that's not a demo. That's a workflow.
What This Means If You Use OpenClaw
Project Fetch is a wetlab. OpenClaw is a digital workspace. But the underlying dynamic is the same: an AI agent, given a goal and the right tools, working through a multi-step task without anyone managing it turn by turn.
The reason results like this matter for OpenClaw users is that the same Claude models you use through OpenClaw are the ones Anthropic is validating in these experiments. When Anthropic publishes evidence that Claude Opus 4.7 can run an autonomous research workflow faster than a human team, that's evidence about what the underlying model is capable of, not just in a lab, but in any context where you give it a task and let it work.
Autonomous agents aren't a future feature. They're already operating, at scale, in physical research environments. OpenClaw puts the same capability inside your workflows.