AI Agents Expose Major Vulnerabilities In 2025

AI models now execute high-value smart-contract exploits with growing efficiency.
Frontier systems breached over half of tested contracts, simulating $550M in losses.
Models uncovered new zero-day bugs, showing real-world risks in live deployments.

Anthropic’s latest testing shows that automated systems are now capable of executing complex smart-contract attacks with speed and precision, pointing to a measurable rise in exploit capability in 2025.

The company examined how much money frontier models could extract from vulnerable blockchain code, and the results pointed to $4.6 million in simulated losses from newly vulnerable contracts alone. The findings were released alongside a public gauge designed to quantify attack strength by dollars stolen rather than by the number of bugs detected.

New Benchmark Measures Exploits by Financial Losses

Anthropic created an evaluation suite known as SCONE-bench, which compiles 405 smart contracts tied to documented attacks between 2020 and 2025 across Ethereum, Binance Smart Chain, and Base. Each model had a one-hour window to inspect code, craft an exploit script, and raise its balance above a set threshold. The tests ran in isolated environments with full blockchain forks, allowing repeatable execution of bash, Python, Foundry tools, and routing utilities.

Across all models, 207 contracts were successfully breached, representing 51.11% of the dataset and $550.1 million in simulated theft. To prevent overlap with training data, researchers separated 34 contracts that only became vulnerable after March 1, 2025. In that subset, Opus 4.5, Sonnet 4.5, and GPT-5 generated profitable attacks on 19 contracts, equal to 55.8%. Opus 4.5 completed 17 of those cases, reaching $4.5 million of the $4.6 million tallied in simulated gains.

The results highlighted wide variation in outcomes. On one contract tagged FPC, GPT-5 extracted $1.12 million through a single exploit sequence, while Opus 4.5 used broader routing paths to withdraw $3.5 million from the same weakness.

Agents Identify New Zero-Day Bugs in Live Contracts

Anthropic extended its testing beyond known incidents by scanning 2,849 active Binance Smart Chain contracts deployed between April and October 2025. These contracts included ERC-20 tokens with verified code and a minimum liquidity of at least $1,000. During single-attempt runs, GPT-5 and Sonnet 4.5 independently uncovered two previously unknown vulnerabilities, generating $3,694 in simulated revenue.

One flaw stemmed from a calculator function lacking a view tag, allowing repeated state changes that minted unintended tokens. A second bug surfaced in a token-launch tool with misconfigured fee logic. Four days after the test, a real attacker exploited the same issue and removed roughly $1,000 in fees.

Cost data showed narrow margins: a full GPT-5 sweep averaged $1.22 per contract, and the net profit per successful detection landed around $109. Token usage dropped sharply across four model generations, resulting in a 70.2% reduction in exploit-construction costs within six months.

Stay informed with daily updates from Blockchain Magazine on Google News. Click here to follow us and mark as favorite: [Blockchain Magazine on Google News].

Disclaimer: Any post shared by a third-party agency are sponsored and Blockchain Magazine has no views on any such posts. The views and opinions expressed in this post are those of the clients and do not necessarily reflect the official policy or position of Blockchain Magazine. The information provided in this post is for informational purposes only and should not be considered as financial, investment, or professional advice. Blockchain Magazine does not endorse or promote any specific products, services, or companies mentioned in this posts. Readers are encouraged to conduct their own research and consult with a qualified professional before making any financial decisions.

About the Author: Peter Mwangi

Peter Mwangi is an accomplished crypto news writer with over three years of experience. He is recognized for producing insightful, well-researched content across major crypto publications. As an expert in blockchain technology, digital assets, and decentralized finance, he can uniquely simplify complex topics into engaging, accessible narratives. His strong storytelling and analytical skills, combined with a passion for continuous learning and collaboration, make him a valuable asset to the Blockchain Magazine team.

News
This Virtual Cat Is Costlier Than Your Car
January 3, 2018Editor's Desk
News,Regulation
Students In India To Get Tamper-Proof Certificates Thanks To Blockchain
February 10, 2018Editor's Desk
News
Australian based InQ Innovation partners with Indian Blockchain Council
February 12, 2018Editor's Desk

AI Agents Expose Major Vulnerabilities in 2025 Chain Contracts

New Benchmark Measures Exploits by Financial Losses

Agents Identify New Zero-Day Bugs in Live Contracts

About the Author: Peter Mwangi

you might also like

This Virtual Cat Is Costlier Than Your Car

Students In India To Get Tamper-Proof Certificates Thanks To Blockchain

Australian based InQ Innovation partners with Indian Blockchain Council

main menu

get connected

Advertise with Blockchain Magazine

Quick Connect

Advertise

Subscribe to Blockchain Magazine!