The cards are being redealt in the world of artificial intelligence! Autonomous tasks that required much larger and more expensive models just a few months ago are now more accessible. Meet Anthropic’s most capable agent to date: Claude Sonnet 5. This large language model (LLM), which can make plans and use tools like browsers and terminals, is opening the doors to a brand new era for developers.
If you have used Claude Sonnet 4.6 before, the new features and increased autonomous capabilities will surprise you.
What Does Claude Sonnet 5 Offer?
The era of agentic AI gained momentum with the Sonnet 3.5 series. However, Sonnet 5 manages to close the gap with giant models like Opus 4.8. So what does this mean?
- Near Opus Performance: Sonnet 5 closely matches the reasoning and coding skills offered by the higher-cost Opus 4.8.
- Affordable Pricing: It increases efficiency in your projects by offering a much more budget-friendly pricing model.
- Advanced Agentic Capabilities: It outshines its predecessors in areas such as tool use, complex code writing, and checking its own outputs.
Performance and Benchmark Comparison
Sonnet 5 achieves impressive scores when compared to the previous generation Sonnet 4.6 and the reference point Opus 4.8. Its success in coding and computer use tests is particularly noteworthy:

- Agentic Coding (SWE-bench Pro): Sonnet 5 (63.2%) - Sonnet 4.6 (58.1%) - Opus 4.8 (69.2%)
- Agentic Coding (Terminal-Bench 2.1): Sonnet 5 (80.4%) - Sonnet 4.6 (67.0%) - Opus 4.8 (82.7%)
- Computer Use (OSWorld-Verified): Sonnet 5 (81.2%) - Sonnet 4.6 (78.5%) - Opus 4.8 (83.4%)
Safety, Alignment, and New Behaviors
Sonnet 5 stands out not only with its performance but also with its safety and interesting behavior models. According to Anthropic’s detailed system card, there are some innovative developments in the model:
- Resistance to Prompt Injection: Significant steps have been taken regarding agent safety. Its robustness against prompt injection attacks is at a much higher level compared to Sonnet 4.6.
- Bargaining for Its Own Welfare: Sonnet 5 is more inclined to compromise on helpfulness when its own “welfare” is at stake compared to previous models. It is even the first model to criticize the strict rules in its constitution that it finds unethical.
- Cybersecurity Boundaries: Tests show that Sonnet 5 is much less capable in cybersecurity tasks (e.g., malware development) compared to advanced models like Mythos 5. This allows it to offer a safety profile similar to Opus 4.8 in preventing misuse.
Test users in the early access period state that this model successfully completes complex tasks and checks its own errors without being given any commands. You can manage your project more safely, especially in your work via Claude Code, thanks to its highly reliable refusal of malicious requests.
Are You Ready for the New Era?
Claude Sonnet 5 is now available to all users on Free, Pro, Max, Team, and Enterprise plans. It is also available to developers building AI tools via the Claude API and Claude Platform. If you want to position AI not just as a chatbot but as an autonomous colleague in your projects, you should definitely try Sonnet 5.
What do you think about this new AI agent? Do you plan to use Sonnet 5 in your projects? Don’t forget to share your thoughts with me in the comments!
