Anthropic Launches Claude 3.5 Sonnet Upgrade and Claude 3.5 Haiku with Breakthrough Computer Use AI

November 05, 2025 · 3 min read

Anthropic has unveiled significant updates to its AI lineup, headlined by an upgraded Claude 3.5 Sonnet and the new Claude 3.5 Haiku model. These releases, detailed in the company's official announcement, mark a strategic push to enhance performance in coding and general intelligence tasks. The upgraded Claude 3.5 Sonnet shows broad improvements, most notably in coding: its SWE-bench Verified score rose from 33.4% to 49.0%, higher than any other publicly available model, including reasoning models such as OpenAI's o1-preview. Claude 3.5 Haiku, set for release later this month, matches Claude 3 Opus, the previous top-tier model, on many evaluations while maintaining the speed of its Haiku predecessors.

A key innovation is the introduction of computer use in public beta, which makes Claude 3.5 Sonnet the first frontier AI model to offer the capability. The feature lets Claude interact with computer interfaces the way a person does (viewing the screen, moving the cursor, clicking, and typing), so it can automate complex, multi-step tasks. Early adopters such as Asana, Canva, and Replit are already experimenting with it; Replit, for instance, is using it to evaluate apps in development for its Agent product. Anthropic emphasizes that computer use is experimental and can be error-prone, and says it is releasing the feature early to gather developer feedback for rapid refinement.
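To make the interaction pattern concrete, here is a minimal Python sketch of an observe-and-act loop of the kind described above. It is not Anthropic's API or implementation: the names Action, FakeDesktop, scripted_model, and run_agent are all hypothetical, the "desktop" is an in-memory stand-in for a real screen, and the "model" is a fixed script rather than a real model call.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One step the agent wants to take (hypothetical shape, for illustration)."""
    kind: str          # "screenshot", "move", "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """In-memory stand-in for a real screen, so the sketch runs anywhere."""
    cursor: tuple = (0, 0)
    typed: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def apply(self, action):
        # Record the action, update state, and return a text observation.
        self.log.append(action.kind)
        if action.kind == "screenshot":
            return f"cursor at {self.cursor}, typed {''.join(self.typed)!r}"
        if action.kind == "move":
            self.cursor = (action.x, action.y)
            return "moved"
        if action.kind == "click":
            return f"clicked at {self.cursor}"
        if action.kind == "type":
            self.typed.append(action.text)
            return "typed"
        raise ValueError(f"unknown action: {action.kind}")


def scripted_model(step, last_observation):
    """Assumption: a fixed plan stands in for the model choosing each action."""
    plan = [
        Action("screenshot"),
        Action("move", x=120, y=40),
        Action("click"),
        Action("type", text="hello"),
        Action("done"),
    ]
    return plan[step]


def run_agent(desktop, choose_action, max_steps=10):
    """Ask for an action, apply it, feed the observation back, and repeat."""
    observation = ""
    for step in range(max_steps):
        action = choose_action(step, observation)
        if action.kind == "done":
            return step  # number of actions actually executed
        observation = desktop.apply(action)
    return max_steps


if __name__ == "__main__":
    desktop = FakeDesktop()
    executed = run_agent(desktop, scripted_model)
    print(executed, desktop.cursor, desktop.typed)
```

The max_steps cap is a design choice worth keeping in any real version of such a loop: because each action can fail or be misread, bounding the number of steps keeps an error-prone agent from running indefinitely.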

Performance gains are not limited to coding. On TAU-bench, an agentic tool use benchmark, Claude 3.5 Sonnet improved from 62.6% to 69.2% in the retail domain and from 36.0% to 46.0% in the harder airline domain. Separately, GitLab, which tested the model on DevSecOps tasks, reported up to 10% stronger reasoning. The upgraded model keeps its predecessor's pricing and speed, so existing customers get the improvements at no extra cost. Partners such as Cognition and The Browser Company have noted substantial improvements in planning and problem-solving, with the latter calling it the best model it has tested to date.

Safety remains a priority. The US and UK AI Safety Institutes jointly tested the upgraded Claude 3.5 Sonnet before deployment, and Anthropic's own evaluation concluded that the ASL-2 standard under its Responsible Scaling Policy remains appropriate for the model. For computer use, Anthropic has implemented classifiers to detect misuse, addressing risks like spam and fraud. On the OSWorld benchmark, the capability scored 14.9% in the screenshot-only category, nearly double the next-best system's 7.8%, though challenges persist in actions like scrolling and dragging.

Claude 3.5 Haiku excels in coding, scoring 40.6% on SWE-bench Verified and surpassing many agents built on state-of-the-art models, including GPT-4o. Its low latency and improved tool use make it well suited to user-facing applications and data-heavy tasks. Both models are available or upcoming on Anthropic's API, Amazon Bedrock, and Google Cloud's Vertex AI; Haiku will launch as a text-only model, with image input to follow.

This rollout underscores Anthropic's focus on practical AI advancements, blending performance boosts with innovative interaction methods. As the AI race intensifies, these models could redefine automation in software development and beyond, inviting developers to explore new frontiers in AI-driven productivity.