If you’re looking to rub elbows with the who’s who of mathematics before they hit the big time, look no further than the International Math Olympiad (IMO).

Each year since 1959, high school math students from more than 100 countries have competed to solve a wide variety of math problems involving algebra, geometry, and number theory quickly and elegantly. Many IMO winners have secured prestigious math awards as adults, including the coveted Fields Medal.

In essence, IMO is a benchmark for students to see if they have what it takes to succeed in the field of mathematics. Now, artificial intelligence has aced the test—well, the geometry part at least.

In a paper published this January in Nature, a team of scientists from Google DeepMind has introduced a new AI called AlphaGeometry that’s capable of passing the geometry section of the International Math Olympiad without relying on human examples.

“We’ve made a lot of progress with models like ChatGPT … but when it comes to mathematical problems, these [large language models] essentially score zero,” Thang Luong, Ph.D., a senior staff research scientist at Google DeepMind and a senior author of the AlphaGeometry paper, tells Popular Mechanics. “When you ask [math] questions, the model will give you what looks like an answer, but [it actually] doesn’t make sense.”

For example, things get messy when AI tries to solve an algebraic word problem or a combinatorics problem that asks it to find the number of permutations (or distinct orderings) of a number sequence.
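As a concrete instance of the combinatorics case, consider counting the distinct orderings of the sequence (1, 2, 2, 3). A short script, shown here purely for illustration, pins down the exact answer that a chatbot might fumble:

```python
from math import factorial
from itertools import permutations

# How many distinct orderings does (1, 2, 2, 3) have?
# By the standard formula: 4! / 2! = 12 (divide out the repeated 2s).
seq = (1, 2, 2, 3)
formula = factorial(len(seq)) // factorial(seq.count(2))

# Confirm by brute-force enumeration of all unique arrangements.
enumerated = len(set(permutations(seq)))
# formula == enumerated == 12
```

Unlike a language model producing plausible-sounding text, both the formula and the enumeration are guaranteed to agree, which is exactly the kind of exactness math demands.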

To answer math questions of this caliber, AlphaGeometry relies on a combination of symbolic AI—which Luong describes as being precise but slow—and a neural network more similar to large language models (LLMs) that is responsible for the quick, creative side of problem-solving.
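That division of labor can be sketched in miniature. In the toy solver below, a symbolic engine exhaustively applies deduction rules until it stalls, and a “creative” proposer then injects a new auxiliary fact, much as AlphaGeometry’s neural model suggests auxiliary constructions. Every name and rule here is an invented stand-in, not DeepMind’s actual system:

```python
def deduce_closure(facts, rules):
    """Symbolic step (precise but slow): apply rules until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def solve(premises, goal, rules, proposals):
    """Alternate exhaustive deduction with creative 'auxiliary' guesses."""
    facts = set(premises)
    proposals = iter(proposals)  # stand-in for a trained neural proposer
    while True:
        facts = deduce_closure(facts, rules)
        if goal in facts:
            return facts
        try:
            facts.add(next(proposals))  # creative step: enrich the diagram
        except StopIteration:
            return None  # ran out of ideas

# Toy "geometry": a rule ({A}, B) means fact A entails fact B.
rules = [({"midpoint M"}, "AM = MB"),
         ({"AM = MB", "angle equality"}, "triangles congruent")]
result = solve({"midpoint M"}, "triangles congruent",
               rules, proposals=["angle equality"])
```

The symbolic engine alone stalls after proving "AM = MB"; only after the proposer contributes "angle equality" does deduction reach the goal, mirroring how the neural half unblocks the symbolic half.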

Yet, math experts aren’t convinced that an AI made to solve high school-level math problems is ready to take off the training wheels and tackle more difficult subjects, such as advanced number theory or combinatorics, let alone boundary-pushing math research.

Why AI Struggles With Math

While LLM-powered AI tools have exploded in the past two years, these models have routinely struggled to handle math problems. This is part of what makes AlphaGeometry stand out from the crowd. But even so, that doesn’t necessarily mean it’s ready to tackle higher-level math yet.


Marijn Heule, Ph.D., is an associate professor of computer science at Carnegie Mellon University whose work focuses on another kind of automated theorem prover called a SAT solver. In this case, “SAT” refers to a logical property called “satisfiability” (whether a formula can be made true at all) and not the math section of the high school SAT.
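To make “satisfiability” concrete: a formula is satisfiable if some true/false assignment to its variables makes every clause hold. The brute-force checker below is purely illustrative; the SAT solvers Heule studies use vastly smarter search:

```python
from itertools import product

def is_satisfiable(clauses, variables):
    """Each clause is a list of (variable, wanted_value) literals;
    a clause holds if at least one literal matches the assignment."""
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[v] == want for v, want in clause)
               for clause in clauses):
            return True  # found a satisfying assignment
    return False  # every assignment fails some clause

# (x OR y) AND (NOT x): satisfied by x=False, y=True.
sat = is_satisfiable([[("x", True), ("y", True)],
                      [("x", False)]], ["x", "y"])

# (x OR y) AND (NOT x OR y) AND (NOT y): no assignment works.
unsat = is_satisfiable([[("x", True), ("y", True)],
                        [("x", False), ("y", True)],
                        [("y", False)]], ["x", "y"])
```

Trying all 2^n assignments only works for toy formulas; the engineering feat of real SAT solvers is deciding satisfiability for formulas with millions of variables.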

“When it comes down to solving math problems or problems in general, the challenge of AI is that [it] cannot come up with new concepts,” Heule tells Popular Mechanics.

This limitation impacts symbolic AI and neural networks in different ways, Heule explains, but both stem from the fact that these systems rely on an existing bank of human knowledge. AlphaGeometry is a partial exception: it was trained on synthetic data, which isn’t drawn from human examples but is constructed to mimic them.

While AIs might not be effective mathematicians on their own, that doesn’t necessarily mean they can’t be great apprentices to human mathematicians.

“At least for the foreseeable future, [AI will] be mostly assisting,” Heule says. “One of the other things that these machines can do really well is they can tell you if there is an incorrect argument and [offer] a counterexample.”
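Heule’s counterexample point is easy to illustrate: a program can falsify a conjecture simply by exhibiting one failing case. The sketch below brute-force-checks the (false) claim that 2^n + 1 is always prime:

```python
def is_prime(k):
    if k < 2:
        return False
    return all(k % d for d in range(2, int(k ** 0.5) + 1))

def find_counterexample(claim, candidates):
    """Return the first candidate where the claim fails, else None."""
    for n in candidates:
        if not claim(n):
            return n
    return None

# Conjecture: 2**n + 1 is prime for every positive n.
cx = find_counterexample(lambda n: is_prime(2 ** n + 1), range(1, 20))
# cx == 3, since 2**3 + 1 = 9 = 3 * 3
```

The conjecture survives n = 1 and n = 2 (giving 3 and 5), then collapses at n = 3; handing a mathematician that single failing case can save weeks of chasing a dead end.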

These AI-powered nudges can help researchers distinguish research dead-ends from promising paths.

Why Geometry?

Of all the math fields the AlphaGeometry team could have tackled, Luong says there were a few factors that helped them zero in on geometry.

“I think geometry is visually appealing [and] we do geometry as kids,” he says. “And geometry is everywhere in design and architecture, so it’s very important.”


Geometry also offered a unique challenge: among International Math Olympiad subjects, it has one of the smallest collections of proofs written in a computer-friendly format (that is, without pictures).

While Heule agrees that AlphaGeometry is “really cool work,” he admits that designing a geometry solver is one of the easier tasks for a math AI.

While human computer scientists did work behind the scenes to formalize geometry problems in a way that computers can reason about, Heule says the reasoning is pretty straightforward once that preparation work is complete.

In part, this is because the considerations of geometry problems (e.g. the relationship between angles, points, and lines) are fairly contained compared to more complex areas, he says.

Take for example Fermat’s Last Theorem. This number theory problem took over three centuries to solve, and Heule says it would be extremely difficult to explain its solution to AI, let alone ask AI to solve it.

“Large-scale fields of modern mathematics … are so big that any one of them contains multitudes,” says Heather Macbeth, Ph.D., an assistant professor of mathematics at Fordham University with a focus on geometry. “I think, maybe a more precise question would be to talk about the styles of problems, which might occur within any mathematical field that some of these AI systems are useful for,” she tells Popular Mechanics.

For example, AI could be useful for pattern recognition or so-called needle-in-a-haystack problems where mathematicians are looking for something with a very particular property, Macbeth says.

Toward General AI

While AI likely won’t be solving centuries-old math problems in the near future, Luong is confident there are still exciting advancements on the horizon for AlphaGeometry and its ilk. Perhaps these models could even graduate high school and take on the Putnam Mathematical Competition for undergraduate students.

But beyond math tests themselves, Luong is hopeful about what models like AlphaGeometry could mean for the field of AI at large—in particular, researchers’ goals of designing a generalized AI.

“If we want to talk about building an artificial general intelligence, where we want the AI to be as smart as a human, I think the AI needs to be able to perform deep reasoning,” Luong says. “This means that the AI needs to be able to plan ahead for many, many steps [and] see the big picture of how things connect together … the IMO is the perfect test for that.”


Sarah is a science and technology journalist based in Boston interested in how innovation and research intersect with our daily lives. She has written for a number of national publications and covers innovation news at Inverse.