AI Just Started Making Discoveries

By Sammi Teki — 20 May 2026

A general-purpose AI model just produced an original mathematical proof, disproving a conjecture that has been open since 1946. The proof was checked by Fields medallists. It held. That is a sentence that was not possible a year ago.

For the last few years, the AI conversation has been about speed. Write faster. Summarise faster. Code faster. Automate the boring stuff. As someone building an AI startup, I use it every day and the productivity gain is undeniable.

But today something different happened. OpenAI announced that one of its internal reasoning models disproved a conjecture in discrete geometry that mathematicians have been stuck on since 1946. Not by searching the internet. Not by copying a known answer. By producing a new proof that external mathematicians then checked and verified.

That is a discovery.

Before going further, it's worth noting that OpenAI has been here before and got it wrong. Seven months ago, its former VP Sebastien Bubeck posted that GPT-5 had solved ten Erdős problems over a weekend. It turned out the model had found solutions that already existed in the literature, not new results. Yann LeCun and Demis Hassabis publicly called it out, the post was deleted, and the episode became one of the more embarrassing moments in AI marketing. So when OpenAI makes a mathematical discovery claim today, the context matters. This time, the company appears to have learned from that embarrassment.

The proof was checked by a group of external mathematicians who then wrote a companion paper explaining the argument. The names on that paper are not lightweights: Fields medallist Tim Gowers, Noga Alon, Thomas Bloom, Will Sawin, Arul Shankar, and others. Gowers calls the result "a milestone in AI mathematics." Shankar says current AI models have demonstrated they are capable of having original ideas and carrying them out. These are people with reputations they would not attach to a press release.

The problem

In 1946, Paul Erdős asked a deceptively simple question. Place n points on a flat plane. How many pairs of those points can be exactly one unit apart?

Picture dots on a piece of paper. You want to maximise the number of dot-pairs that are exactly one ruler-length apart. For nearly 80 years, mathematicians believed that structured, grid-like arrangements were essentially the ceiling. You could get creative with the arrangement, but you couldn't fundamentally beat the grid.
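
To make the counting concrete, here is a minimal sketch in Python. The function names are hypothetical and chosen only for illustration. It works with integer coordinates and squared distances to avoid floating-point comparisons, and it counts how many pairs in a small square grid sit at a given distance. Picking which distance to treat as "one unit" is the same as rescaling the grid.

```python
from itertools import combinations


def count_pairs_at_squared_distance(points, d2):
    """Count unordered pairs of points whose squared distance equals d2."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )


def square_grid(k):
    """All k*k integer points (x, y) with 0 <= x, y < k."""
    return [(x, y) for x in range(k) for y in range(k)]


if __name__ == "__main__":
    grid = square_grid(5)  # n = 25 points
    # Grid spacing itself is the unit: only horizontal and vertical neighbours count.
    print(count_pairs_at_squared_distance(grid, 1))  # 40
    # Treat sqrt(5) as the unit instead: knight-move pairs like (1, 2) count.
    print(count_pairs_at_squared_distance(grid, 5))  # 48
```

Even on 25 points, choosing a distance that the grid hits often (48 pairs) beats the obvious choice (40 pairs). Grid constructions of this flavour, with a carefully chosen distance, are the classical benchmark the conjecture said could not be fundamentally beaten.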

OpenAI's model found a construction that does beat it. It produced an infinite family of point configurations that yield more unit-distance pairs than the old conjecture allowed — a polynomial improvement over the expected bound. The proof was not a brute-force search. The model drew on algebraic number theory, a branch of mathematics that studies number systems beyond ordinary integers, and applied those ideas to a geometric problem about distances on a plane. Specifically, it used properties of algebraic number fields to build point sets where the distances between points have a structure that forces more unit-distance pairs to appear than grid-based approaches ever could.
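
For readers who want the claim in symbols, here is a rough sketch. The notation u(n) for the maximum number of unit-distance pairs among n points, and the constants c, C and δ, are shorthand assumed here rather than anything taken from the announcement. The first line is the classical picture (Erdős's 1946 grid lower bound and the Spencer-Szemerédi-Trotter upper bound). The last line is only a paraphrase of "a polynomial improvement", with no specific exponent claimed.

```latex
% Known bounds before this result, for large n (c, C > 0 are unspecified constants):
n^{1 + c/\log\log n} \;\le\; u(n) \;\le\; C\, n^{4/3}
% The conjecture: grid-like constructions are essentially optimal.
u(n) \le n^{1 + o(1)}
% What a polynomial improvement would mean, for some fixed \delta > 0:
u(n) \ge n^{1 + \delta} \quad \text{for infinitely many } n
```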

The key move was sequencing. The model took mathematical tools that all existed independently and assembled them in an order that clicked. No individual piece was new. The construction was.

Why humans missed it

The companion paper has the most interesting explanation I've seen for why this problem stayed open for 80 years, and it has nothing to do with intelligence.

The mathematicians who wrote it identified four things that all had to be true simultaneously for someone to solve this: you had to be spending serious time on the unit distance conjecture specifically; you had to be actively trying to disprove it, despite Erdős himself repeatedly saying he believed it was true; you had to think there was value in generalising the original construction to other number fields, and be willing to invest time exploring that direction; and you had to know enough class field theory to recognise the right construction when it appeared.

That is four independent filters. Any working mathematician might pass one or two. Almost nobody would pass all four at once. The career incentives don't line up: spending months trying to disprove a conjecture that the field's most prolific figure believed was true, using tools from a different subdomain, is not how most researchers allocate their time.

The AI didn't have those constraints. It wasn't worried about career risk. It wasn't rationing its time. It didn't carry the inherited assumption that Erdős was probably right. It explored the path anyway, and the path worked.

Thomas Bloom's framing is the most useful one: the model succeeded partly through "superhuman levels of patience" combined with access to a wide range of technical machinery. That is a more honest description than "AI is smarter than mathematicians." It is not about intelligence. It is about search — exploring a much larger space of possible approaches without the taste filters that make humans efficient but also make us miss things.

What this is and what it isn't

The result is real. The verification is serious. As an AI milestone, it matters.

But some context is necessary before the takes get out of hand.

AI has produced novel scientific results before. AlphaFold predicted protein structures. AlphaProof solved International Mathematical Olympiad problems. DeepMind systems have generated new conjectures. What is different here is that OpenAI claims this came from a general-purpose reasoning model — not a system built specifically for geometry or trained only on mathematical proofs. If that holds up under continued scrutiny, it is a meaningful shift in what we should expect from frontier models.

It is also worth being honest that Erdős problems, while famous and genuinely difficult, represent what some mathematicians have called an "accessible tail" of open problems. The hardest unsolved problems in mathematics, such as the Riemann Hypothesis or the existence and smoothness of solutions to the Navier-Stokes equations, are qualitatively different. The distance from here to "AI solves any hard problem" is still large.

So the honest version of the claim is not "AI can now discover anything." The honest version is: a general-purpose AI model produced an original, expert-verified result on a well-known open mathematical problem, and the way it did it, by exploring paths that human incentive structures made unlikely, tells us something about where we are heading.

Where this matters

Maths is the cleanest test case because proofs are binary. They hold or they don't. There is no room for a model to just sound convincing. Eventually the logic either works or it breaks.

But the underlying capability demonstrated here (holding long chains of reasoning together, combining tools from distant fields, exploring directions humans wouldn't prioritise) is useful far beyond pure mathematics: drug discovery, materials science, protein engineering, climate modelling, battery chemistry. All of these involve the same core challenge: too many possible paths for humans to explore manually, and correctness that can eventually be checked.

The future this points toward is not "AI replaces researchers." It is more like: AI generates a much larger set of candidate ideas, and human experts verify, refine, interpret, and direct. The human role shifts from generating every idea to judging, testing, and steering a bigger discovery engine. That is still a massive shift, even if it is not the one the headlines will describe.

The question that changed

For years, the question about AI was: can it help us work faster?

That has been answered. Yes.

The new question is whether AI can help us discover things we would not have found on our own. This result, the verification behind it, and the explanation for why humans missed it all suggest the answer is starting to be yes.

Which question a company, a lab, or a researcher is still asking — the speed question or the discovery question — probably tells you more about where they'll be in five years than any other signal.

I'm Sammi, founder of ALFRD. We're building AI that understands business financial data, so you can trust it before you analyse it.

Checks. Cleans. Standardises. Consolidates.
