A 23-Year-Old Amateur Used ChatGPT to Crack a 60-Year-Old Math Problem No Expert Could Solve

April 28, 2026

In April 2026, Liam Price, a 23-year-old with no advanced mathematical training, entered a decades-old unsolved problem into GPT-5.4 Pro on an idle afternoon. He described his approach as "vibe maths": intuitive questioning combined with repeated trials. Eighty minutes later, the AI produced a proof that had eluded professional mathematicians for 60 years. Fields Medalist Terence Tao reviewed the result and confirmed it was real.

The problem was Erdős Problem #1196, one of hundreds of open conjectures posed by the legendary Hungarian mathematician Paul Erdős. It concerns primitive sets: collections of integers in which no number divides any other. The question asks how large a particular mathematical score, called the Erdős sum, can be when every number in the set is very large. The conjecture, posed around 1968 by Erdős, Sárközy, and Szemerédi, predicted that the largest possible sum approaches exactly 1 as the numbers grow toward infinity.
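
To make the two definitions concrete, here is a minimal Python sketch. It assumes the standard definition of the Erdős sum, namely the sum of 1/(a log a) over the elements a of the set, which the article does not spell out. The function names `is_primitive` and `erdos_sum` are illustrative and not taken from any library.

```python
import math
from itertools import combinations


def is_primitive(numbers):
    """Return True if no element of `numbers` divides another element."""
    distinct = sorted(set(numbers))
    # After sorting, only the smaller element of a pair can divide the larger.
    return all(b % a != 0 for a, b in combinations(distinct, 2))


def erdos_sum(numbers):
    """Sum of 1 / (a * log a) over the set (assumed standard definition)."""
    distinct = set(numbers)
    if any(a < 2 for a in distinct):
        raise ValueError("elements must be integers >= 2, since log(1) = 0")
    return sum(1.0 / (a * math.log(a)) for a in distinct)


primes_below_30 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(is_primitive(primes_below_30))  # True: no prime divides another
print(is_primitive([4, 6, 12]))       # False: 4 and 6 both divide 12
print(erdos_sum(primes_below_30))
```

Any set of distinct primes is primitive, which makes the primes a convenient first test case. The conjecture itself concerns primitive sets whose elements all lie above a large threshold, which a brute-force check like this one cannot explore.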

What the AI Found

Stanford mathematician Jared Lichtman had spent years working on this problem. In 2023, he established the best known upper bound of approximately 1.399 — strong work, but not the exact answer. GPT-5.4 Pro produced the correct main term: 1 + O(1/log x), with an even sharper follow-up correction term later extracted by Tao.

The method the AI used was the unexpected part. For 60 years, every mathematician who attempted the problem approached it by translating it from number theory into probability theory — a path so natural, Lichtman said, that no one questioned it. GPT-5.4 Pro went a different direction entirely, building the proof using the von Mangoldt function — a tool from analytic number theory that encodes the fundamental theorem of arithmetic — combined with a Markov chain argument. The technique had existed for 90 years. Nobody had ever applied it to this class of problems.
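
For readers unfamiliar with it, the von Mangoldt function Λ(n) equals log p when n is a power of a single prime p, and 0 otherwise. The sense in which it encodes unique factorization is the identity log n = Σ Λ(d), summed over the divisors d of n. The Python sketch below checks that identity numerically. It illustrates the function only, not the Markov chain argument of the proof, and the function names are illustrative.

```python
import math


def von_mangoldt(n):
    """Lambda(n) = log p if n = p**k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    # Find the smallest prime factor of n by trial division.
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    # Strip every factor of p; n is a prime power exactly when 1 remains.
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


def log_via_divisors(n):
    """Sum Lambda(d) over all divisors d of n; this should equal log(n)."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)


for n in (12, 97, 360):
    print(n, math.log(n), log_via_divisors(n))
```

For n = 12, the divisors 2, 3, and 4 contribute log 2, log 3, and log 2, while 1, 6, and 12 contribute nothing, which adds up to log 12.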

Tao said that previous researchers had "collectively made a slight wrong turn" right at the beginning, and that the AI, carrying none of the field's accumulated assumptions about the right approach, found the path they had all missed. Lichtman compared it to a new opening line in chess that human convention had simply never considered, drawing parallels to AlphaGo's Move 37 — the move in 2016 that looked like a mistake and rewrote the theory of the game.

The Nuance the Headlines Missed

The romantic version of the story, "AI defeats the experts alone," does not hold up to scrutiny, and the people involved were quick to say so. Lichtman was direct: "The raw output of ChatGPT's proof was actually quite poor. So it required an expert to kind of sift through."

Price sent the result to Kevin Barreto, a second-year mathematics undergraduate at Cambridge and his occasional collaborator. Barreto recognised the significance and contacted Tao and Lichtman. Together, they refined the argument, shortened it, extracted the cleaner formulation, and developed a follow-up note with a sharper correction term. The 80 minutes describes how long the AI took to generate the proof artifact. The broader mathematical event — verification, reformulation, and extension into a small new theory — was human work.

Tao's summary put it precisely: "We have discovered a new way of thinking about large numbers and their structures. This is a beautiful result, but its long-term significance remains to be seen."

What It Might Mean

Several things about this result are worth separating. The problem was genuinely hard; Lichtman called the proof "the first AI result at the level of Erdős's Book," referring to the imaginary volume in which, Erdős said, God keeps the most beautiful proof of every theorem. The technique the AI found is not specific to this one problem: Lichtman and Tao believe the Markov chain approach may unlock related problems in number theory that follow similar patterns, making the contribution larger than a single solve.

At the same time, the result is a collaboration, not a replacement. The AI found the strategic spine of a proof that no human had located. Human experts turned that skeleton into rigorous mathematics. The model of "vibe maths by an amateur, validation and refinement by experts" may represent something genuinely new about how mathematical discovery can work, but it also confirms that expert judgment remains essential at every stage after the initial generation.

A $20-per-month subscription produced a result that had eluded professional researchers for six decades. Whether that says more about the power of the tool or the structure of the problem (or both) is a question mathematicians and AI researchers are now actively debating.

This article is based on reporting from ByteIota, Abit.ee, Webiano, and the original discussion thread at ErdosProblems.com.

Image courtesy of Thomas T and Unsplash.

This article was generated with AI assistance and reviewed for accuracy and quality.

Last updated: April 28, 2026
