What the AI Found
Stanford mathematician Jared Lichtman had spent years working on this problem. In 2023, he established the best known upper bound of approximately 1.399 — strong work, but not the exact answer. GPT-5.4 Pro produced the correct main term: 1 + O(1/log x), with an even sharper follow-up correction term later extracted by Tao.
The method the AI used was the unexpected part. For 60 years, every mathematician who attempted the problem approached it by translating it from number theory into probability theory — a path so natural, Lichtman said, that no one questioned it. GPT-5.4 Pro went a different direction entirely, building the proof using the von Mangoldt function — a tool from analytic number theory that encodes the fundamental theorem of arithmetic — combined with a Markov chain argument. The technique had existed for 90 years. Nobody had ever applied it to this class of problems.
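For readers who want to see what "encodes the fundamental theorem of arithmetic" means, here is a minimal Python sketch. It is not part of the AI's proof and is not drawn from it. The von Mangoldt function Λ(n) equals log p when n is a power of a prime p and 0 otherwise. Because every integer factors uniquely into primes, summing Λ over the divisors of n gives exactly log n. The function names `von_mangoldt` and `divisor_sum` are hypothetical, chosen only for this illustration.

```python
import math

def von_mangoldt(n: int) -> float:
    """Return log(p) if n is a power p**k (k >= 1) of a prime p, else 0.0."""
    if n < 2:
        return 0.0
    # Find the smallest prime factor of n by trial division.
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    # Strip out every factor of p; n is a prime power iff nothing is left.
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def divisor_sum(n: int) -> float:
    """Sum the von Mangoldt function over all divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

# Unique factorization implies the divisor sum equals log(n).
for n in (12, 30, 97, 360):
    assert math.isclose(divisor_sum(n), math.log(n))
```

For n = 12, the only divisors that contribute are 2, 3, and 4, giving log 2 + log 3 + log 2 = log 12.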
Tao said that previous researchers had "collectively made a slight wrong turn" right at the beginning, and that the AI, carrying none of the field's accumulated assumptions about the right approach, found the path they had all missed. Lichtman compared it to a new opening line in chess that human convention had simply never considered, drawing parallels to AlphaGo's Move 37 — the move in 2016 that looked like a mistake and rewrote the theory of the game.
The Nuance the Headlines Missed
The romantic version of the story, "AI defeats the experts alone," does not hold up to scrutiny, and the people involved were quick to say so. Lichtman was direct: "The raw output of ChatGPT's proof was actually quite poor. So it required an expert to kind of sift through."
Price sent the result to Kevin Barreto, a second-year mathematics undergraduate at Cambridge and his occasional collaborator. Barreto recognised the significance and contacted Tao and Lichtman. Together, they refined the argument, shortened it, extracted the cleaner formulation, and developed a follow-up note with a sharper correction term. The 80-minute figure describes only how long the AI took to generate the proof artifact. The broader mathematical event (verification, reformulation, and extension into a small new theory) was human work.
Tao's summary put it precisely: "We have discovered a new way of thinking about large numbers and their structures. This is a beautiful result, but its long-term significance remains to be seen."
What It Might Mean
Two things about this result are worth separating. First, the problem was genuinely hard: Lichtman called the proof "the first AI result at the level of Erdős's Book," referring to the imaginary volume in which, Erdős said, God keeps the most beautiful proof of every theorem. Second, the technique the AI found is not specific to this one problem: Lichtman and Tao believe the Markov chain approach may unlock related problems in number theory that follow similar patterns, making the contribution larger than a single solve.
At the same time, the result is a collaboration, not a replacement. The AI found the strategic spine of a proof that no human had located. Human experts turned that skeleton into rigorous mathematics. The model of "vibe maths by an amateur, validation and refinement by experts" may represent something genuinely new about how mathematical discovery can work, but it also confirms that expert judgment remains essential at every stage after the initial generation.
A $20-per-month subscription produced a result that had eluded professional researchers for six decades. Whether that says more about the power of the tool or the structure of the problem (or both) is a question mathematicians and AI researchers are now actively debating.
This article is based on reporting from ByteIota, Abit.ee, Webiano, and the original discussion thread at ErdosProblems.com.
Image courtesy of Thomas T and Unsplash.
This article was generated with AI assistance and reviewed for accuracy and quality.