DeepMind’s New AI Found A Strange New Way To Think

Transcribed Jun 28, 2026

Intermediate · 4 min read · For: Tech enthusiasts and AI researchers interested in novel approaches to mathematical problem solving and AI system design.

146.4K views · 7.4K likes · 554 comments · 108 dislikes · 5.4% 🔥 High Engagement

AI Summary

DeepMind's AlphaProof Nexus tackled 350 unsolved math problems posed by Paul Erdős and solved 9 of them, a stated failure rate of 95.7%. Despite this low success rate, the AI's ability to solve decades-old open problems at low cost is considered remarkably good.

Chapters

1. Introduction and AI Performance (0:00)
2. First Law of Papers and System Explanation (1:29)
3. Unreliable Parts to Reliable System (3:36)
4. Limitations and Overall Assessment (4:44)

[0:00]

AI's Performance on Unsolved Problems

DeepMind's AI attempted 350 of Paul Erdős's more than 1,000 unsolved problems and solved only 9, a stated failure rate of 95.7%. The cost was a few hundred dollars per solved problem.

[0:44]

Criticism and Progress Over Time

Past criticisms of AI (that it could not add numbers, solve high-school problems, or win at the Olympiad) have been refuted one by one. The current criticism is that it cannot solve 50-year-old open problems, which itself suggests rapid progress.

[1:29]

First Law of Papers

The "First Law of Papers" encourages focusing on future potential (where the work will be two more papers down the line) rather than on current limitations. The present result is described as "absolutely amazing".

[2:00]

How the System Works

The system uses Lean, a formal mathematical language, to guard against AI hallucinations. A mathematician supplies the problem and a solution outline with the proof left blank. An AI attempts the proof and fails; a second AI judges the attempt and provides feedback. A cheaper judge AI then picks the better of two candidate solutions, which creates a tournament with Elo-style scoring. The process iterates until a formal proof is validated.
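
As a minimal illustration of the "blank proof" step described above, here is a toy Lean 4 statement, not one of the Erdős problems. The theorem names are hypothetical, and representing the blank with `sorry` is an assumption about what such a template might look like; `sorry` itself is Lean's standard placeholder for an unfinished proof.

```lean
-- Template a mathematician might hand over: the statement is fixed and
-- the proof is left blank with `sorry` (Lean accepts it with a warning).
theorem toy_add_zero (n : Nat) : n + 0 = n := by
  sorry

-- A candidate proof only counts once Lean checks it with no `sorry` left.
theorem toy_add_zero_filled (n : Nat) : n + 0 = n := by
  rfl
```

Because Lean checks every step mechanically, a hallucinated argument fails to compile instead of slipping through.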

[3:36]

Unreliable AI, Reliable System

The system turns an unreliable AI into a reliable system through repeated tournament iterations and a trustworthy judge, which makes it possible to solve hard problems without requiring the AI to be correct every time.
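
The tournament idea can be sketched with the standard Elo update. This is a toy simulation under stated assumptions, not DeepMind's implementation: the hidden `qualities`, the `noisy_judge` function and its `accuracy` parameter, the K-factor of 32, and the starting rating of 1000 are all hypothetical choices made for illustration.

```python
import random


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Return the new (r_a, r_b) after one pairwise comparison."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


def noisy_judge(q_a: float, q_b: float, rng: random.Random, accuracy: float) -> bool:
    """Hypothetical cheap judge: names the truly better candidate with
    probability `accuracy`, and the worse one otherwise.
    Returns True if it declares A the winner."""
    a_is_better = q_a > q_b
    judged_correctly = rng.random() < accuracy
    return a_is_better if judged_correctly else not a_is_better


def run_tournament(qualities, rounds: int, accuracy: float = 0.7, seed: int = 0):
    """Repeatedly pit two random candidates against each other and
    update their ratings from the judge's (possibly wrong) verdict."""
    rng = random.Random(seed)
    ratings = [1000.0] * len(qualities)
    for _ in range(rounds):
        a, b = rng.sample(range(len(qualities)), 2)
        a_won = noisy_judge(qualities[a], qualities[b], rng, accuracy)
        ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], a_won)
    return ratings


if __name__ == "__main__":
    # Three candidate solutions with hidden (assumed) quality scores.
    print(run_tournament([0.2, 0.5, 0.9], rounds=500))
```

Each individual comparison can be wrong, but ratings accumulate evidence over many rounds, so the strongest candidate tends to rise to the top.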

[4:08]

Shift from Smarter AI to Better Harness

The current paradigm is to improve the "harness" (the loop) around the AI, not just to make the AI smarter. The intelligence is in the loop, not solely in the model.
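
A minimal sketch of such a harness, assuming the model's attempts are independent and each succeeds with a fixed probability `p_correct`. Both are simplifications, and the 5% figure used below is illustrative rather than taken from the video. `unreliable_prover` and `trusted_checker` are hypothetical stand-ins for a model call and a formal proof checker.

```python
import random


def unreliable_prover(rng: random.Random, p_correct: float) -> dict:
    """Hypothetical model call: returns a candidate proof that is
    actually valid only with probability `p_correct`."""
    return {"valid": rng.random() < p_correct}


def trusted_checker(candidate: dict) -> bool:
    """Hypothetical stand-in for a formal checker: it never accepts
    an invalid proof."""
    return candidate["valid"]


def run_harness(max_attempts: int, p_correct: float = 0.05, seed: int = 0):
    """Loop: generate, check, retry. Returns the attempt number of the
    first accepted proof, or None if every attempt was rejected."""
    rng = random.Random(seed)
    for attempt in range(1, max_attempts + 1):
        if trusted_checker(unreliable_prover(rng, p_correct)):
            return attempt
    return None


def success_probability(p_correct: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries is valid."""
    return 1.0 - (1.0 - p_correct) ** attempts


if __name__ == "__main__":
    print(run_harness(max_attempts=100))
    print(success_probability(0.05, 100))
```

Under these assumptions, a prover that is right only 5% of the time still produces at least one accepted proof in 100 attempts with probability about 0.994, because the checker never lets a wrong proof through.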

[4:44]

Limitations: Selection Bias and Model Size

The 348 selected problems may be ones that are easier to formalize. Smaller AI models solved 0 problems, which indicates that large models are needed. There is also a trade-off at a fixed cost: a larger model versus more tournament rounds.

[6:21]

Rapid Progress and New Focus

Four years ago, AI was criticized for being unable to add numbers; now it is solving decades-old open problems. Harnesses and loops now matter alongside the models themselves.

DeepMind's AI demonstrated remarkable progress by solving unsolved math problems through an innovative tournament system that harnesses an unreliable AI with a trustworthy judge, highlighting a shift from improving model intelligence to optimizing the surrounding harness.

Clickbait Check

90% Legit

"Title accurately reflects that the AI 'found a strange new way' (tournament-based loop), though 'found' may imply discovery rather than designed approach."

Mentioned in this Video

Lean (tool)
Weights & Biases Weave (tool)
Pushmeet (person)
Károly Zsolnai-Fehér (person)
Paul Erdős (person)
Arpad Elo (person)

Study Flashcards (7)

How many of the 350 Erdős problems did DeepMind's AI solve?

Difficulty: easy

9 problems.