Google's Gemini 3 Deep Think Gets a Major Upgrade: Reasoning Ability Crushes Opus 4.6 and GPT-5.2, Aims to Be the "Most Scientific AI"

動區BlockTempo

Google has released a major update to Gemini 3 Deep Think. The model scored 84.6% on the ARC-AGI-2 test, ahead of Claude Opus 4.6 (68.8%) and GPT-5.2 (52.9%), and reached the “Legendary Grandmaster” tier on Codeforces.

Table of Contents

  • Not only can it take exams, but it can also identify human errors
  • The tectonic shift in market share
  • Ripple effects on the crypto industry
  • The scientific victory race has only just begun

Today (13th), Google announced a major upgrade to Gemini 3 Deep Think. On the ARC-AGI-2 reasoning test, which is designed to prevent AI from memorizing answer banks and to assess whether it can infer rules from examples, Gemini 3 Deep Think scored 84.6%.

For comparison, Claude Opus 4.6 (Thinking Max mode) scored 68.8%, GPT-5.2 (Thinking xhigh mode) scored 52.9%, and the human average is around 60%.

Even more impressively, on the original ARC-AGI-1 test, Deep Think achieved 96%, essentially hitting the ceiling of what was once considered “one of the hardest AI exams.”

Deep Think is currently available to Google AI Ultra subscribers, with API early access open to enterprise users.

Not only can it take exams, but it can also identify human errors

Beyond the scores, Google highlighted a detail: when reviewing a peer-reviewed mathematical paper, Deep Think identified a logical flaw that all previous reviewers had missed. The finding was verified by mathematicians from Rutgers University.

The case matters because it demonstrates the model’s ability not just on standardized tests but in real, open-ended scientific work. Peer review is a core quality control mechanism in academia. If AI can reliably provide valuable assistance in this process, its potential to accelerate scientific research far exceeds what scores alone can measure.

Deep Think also reached gold-medal level on the written exams of the 2025 International Physics Olympiad and Chemistry Olympiad, and it holds an Elo rating of 3,455 on Codeforces, equivalent to the “Legendary Grandmaster” tier that only a handful of human programmers worldwide attain.
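To give a feel for what a 3,455 rating implies, here is a minimal Python sketch using the standard Elo expected-score formula. Two assumptions are ours, not Google's: that Codeforces ratings can be approximated with the classic Elo formula, and that the top Codeforces tier begins at a rating of 3,000. The function name `expected_score` is purely illustrative.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score of player A against player B.

    Returns a value between 0 and 1; 0.5 means the two are evenly matched.
    Assumption: Codeforces ratings behave approximately like standard Elo.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


DEEP_THINK_RATING = 3455   # figure quoted in the article
TOP_TIER_THRESHOLD = 3000  # assumed entry rating for the top Codeforces tier

p = expected_score(DEEP_THINK_RATING, TOP_TIER_THRESHOLD)
print(f"Expected score against a {TOP_TIER_THRESHOLD}-rated opponent: {p:.1%}")
```

Under these assumptions, a 455-point gap corresponds to an expected score of a little over 93% against an opponent who only just qualifies for the top tier.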

On “Humanity’s Last Exam,” a benchmark designed by experts across fields to be deliberately challenging for AI, Deep Think scored 48.4% (without tools), setting a new record.

The tectonic shift in market share

The tech race among the three AI giants is reshaping the market landscape. ChatGPT’s market share has dropped from its peak of 87% to about 68%, while Gemini has surged from under 5% to over 18%, with Anthropic’s Claude steadily nibbling away at the enterprise market.

Google’s unique advantage in this race is its distribution capability. Gemini is integrated into Android, Chrome, Google Workspace, and Search, meaning that even if its capabilities are merely on par with competitors, Google can leverage its channels to attract users.

However, distribution is a double-edged sword. If Gemini’s user experience isn’t compelling enough, it could lose user trust faster than any competitor, because users are “passively exposed” rather than “actively choosing.” OpenAI’s users are paying customers, inherently more tolerant and sticky.

Ripple effects on the crypto industry

Every upgrade in the AI arms race drives up demand for computing infrastructure. The cost to train cutting-edge models has ballooned from hundreds of millions of dollars in 2024 to billions by 2026. This directly impacts two areas:

First, the transformation path for Bitcoin miners. As mining profits are squeezed (JPM estimates that BTC production costs have fallen to $77,000, still above a price of around $66,000), large-scale mining operations are accelerating their shift toward AI computing services.

High-cost mining firms aren’t “quitting,” but “diversifying,” moving from Bitcoin mining to providing AI compute contracts.
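The margin squeeze described above can be made concrete with a quick calculation. The sketch below is a minimal Python illustration that uses only the two figures quoted in this article (an estimated production cost of $77,000 and a price of around $66,000). The function name `mining_margin` is hypothetical, and real miner economics vary with power costs and hardware.

```python
def mining_margin(price_usd: float, cost_usd: float) -> tuple[float, float]:
    """Return (margin in USD per BTC, margin as a fraction of the price)."""
    margin = price_usd - cost_usd
    return margin, margin / price_usd


# Figures quoted in the article; both are estimates, not exact values.
margin_usd, margin_frac = mining_margin(price_usd=66_000, cost_usd=77_000)
print(f"Margin per BTC: {margin_usd:,.0f} USD ({margin_frac:.1%} of price)")
```

On those two figures alone, a miner producing at the estimated cost loses about $11,000 per coin, roughly 17% of the selling price, which helps explain the appeal of AI compute contracts.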

Second, the narrative around AI tokens. Whenever Google, OpenAI, or Anthropic releases major upgrades, on-chain AI-related tokens (such as decentralized compute protocols) often experience short-term speculation.

But the fundamental issue remains unchanged: decentralized compute still falls far short of what enterprise-level AI training requires in latency and throughput. The narrative can run fast, but the infrastructure can’t keep pace with the story.

The scientific victory race has only just begun

Deep Think’s upgrade has pushed Google back to the forefront of AI competition, at least in reasoning and scientific domains. But a subtle shift in Google’s wording reveals a change in positioning: it no longer emphasizes “the smartest general AI,” but repeatedly highlights “born for science.”

As benchmarks for general AI become more crowded and differentiation harder, “my AI can help you do science” is a more compelling value proposition than “my AI scores the highest.” If Deep Think can reliably assist peer review, accelerate drug discovery, or find overlooked solutions in physical simulations, that’s more meaningful than any leaderboard ranking.

The challenge is that the gap between “scoring high on benchmarks” and “reliably assisting humans in real scientific scenarios” may be larger than Google hints. Benchmarks have clear answers; science does not.
