Learning the Language of Geometry

AlphaGeometry, a system that nears expert proficiency in proving complex geometry theorems

Published Jan 24, 2024

Machine learning algorithms often struggle with geometry. A language model learned to prove relatively difficult theorems.

What's new: Trieu Trinh, Yuhuai Wu, Quoc Le, and colleagues at Google and New York University proposed AlphaGeometry, a system that can prove geometry theorems almost as well as the most accomplished high school students. The authors focused on non-combinatorial Euclidean plane geometry.

How it works: AlphaGeometry has two components. (i) Given a geometrical premise and an unproven proposition, an off-the-shelf geometric proof finder derived statements that followed from the premise. The authors modified the proof finder to deduce proofs from not only geometric concepts but also algebraic concepts such as ratios, angles, and distances. (ii) A transformer learned to read and write proofs in the proof finder’s specialized language.

The authors generated a synthetic dataset of 100 million geometric premises, propositions, and their proofs. For instance, given the premise “Let ABC be any triangle with AB = AC” (an isosceles triangle) and the proposition “∠ABC = ∠BCA,” the proof involves constructing a line between A and the midpoint of segment BC. The authors translated these problems into the proof finder’s language. They pretrained the transformer to generate the proof, given a premise and proposition.
The authors modified 9 million proofs in the dataset to remove references to some lines, shapes, or points from premises. Instead, they introduced these elements in statements of the related proofs. They fine-tuned the transformer, given a modified premise, the proposition, and the proof up to that point, to generate the added elements.
At inference, given a premise and proposition, the proof finder added statements. If it failed to produce the proposition, the system fed the statements so far to the transformer, which predicted a point, shape, or line that might be helpful in deducing the next statement. Then it gave the premise, proposition, and proof so far — including the new element — to the proof finder. The system repeated the process until the proof finder produced the proposition.
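
The inference loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the rule table `RULES`, the functions `deduce`, `propose_construction`, and `prove`, and the plain-string encoding of facts are all hypothetical. The transformer is replaced by a stub that always suggests the midpoint of BC, matching the isosceles example.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

# A rule is (facts it requires, fact it derives). Toy rules for the
# isosceles example only; the real proof finder's rules are not shown here.
Rule = Tuple[FrozenSet[str], str]

RULES: List[Rule] = [
    (frozenset({"M midpoint of BC"}), "BM = CM"),
    # Side-side-side: AB = AC, BM = CM, and AM is shared.
    (frozenset({"AB = AC", "BM = CM"}), "triangle ABM congruent to triangle ACM"),
    (frozenset({"triangle ABM congruent to triangle ACM"}), "angle ABC = angle BCA"),
]


def deduce(facts: Set[str]) -> Set[str]:
    """Stand-in for the proof finder: apply rules until nothing new appears."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for required, result in RULES:
            if required <= derived and result not in derived:
                derived.add(result)
                changed = True
    return derived


def propose_construction(facts: Set[str], goal: str, tried: List[str]) -> Optional[str]:
    """Stand-in for the transformer: suggest a new point, line, or shape.

    A real model would condition on the facts and the goal; this stub
    just returns the first untried candidate from a fixed list.
    """
    for candidate in ["M midpoint of BC"]:
        if candidate not in tried:
            return candidate
    return None


def prove(premise: Set[str], goal: str, max_constructions: int = 4) -> Tuple[bool, List[str]]:
    """Alternate deduction and construction until the goal is derived."""
    facts = deduce(set(premise))
    constructions: List[str] = []
    while goal not in facts:
        if len(constructions) >= max_constructions:
            return False, constructions
        new_element = propose_construction(facts, goal, constructions)
        if new_element is None:
            return False, constructions
        constructions.append(new_element)
        facts = deduce(facts | {new_element})
    return True, constructions
```

Calling `prove({"AB = AC"}, "angle ABC = angle BCA")` first fails to reach the goal by deduction alone, then adds the midpoint M and succeeds. The division of labor is the point of the sketch: the deduction step only follows rules, while the proposal step supplies the new element that the rules cannot invent.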

Results: The authors tested AlphaGeometry on 30 problems posed by the International Mathematical Olympiad, an annual competition for high school students. AlphaGeometry solved 25 of them correctly. Comparing that achievement to human performance isn’t straightforward because human competitors can receive partial credit. On average, human gold medalists since 2000 solved 25.9 problems correctly, silver medalists 22.9, and bronze medalists 19.3. The previous state-of-the-art approach solved 10 problems, and the modified proof finder alone solved 14. In one instance, the system identified an unused premise and found a more general proof than required, effectively solving many similar problems at once.
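
To put these figures side by side, the short Python sketch below converts each score into a fraction of the 30 problems and ranks the entries. The numbers are the ones reported above; the labels and the helper names `solve_rates` and `ranked` are illustrative. The human figures are averages that include partial credit, so the comparison is approximate.

```python
TOTAL_PROBLEMS = 30

# Problems solved out of 30, as reported in the results above.
scores = {
    "Average gold medalist": 25.9,
    "AlphaGeometry": 25,
    "Average silver medalist": 22.9,
    "Average bronze medalist": 19.3,
    "Modified proof finder alone": 14,
    "Previous state of the art": 10,
}


def solve_rates(results, total):
    """Return each entry's score as a fraction of the total problems."""
    return {name: solved / total for name, solved in results.items()}


def ranked(results):
    """Return entry names ordered from highest to lowest score."""
    return sorted(results, key=results.get, reverse=True)


if __name__ == "__main__":
    rates = solve_rates(scores, TOTAL_PROBLEMS)
    for name in ranked(scores):
        print(f"{name}: {scores[name]} of {TOTAL_PROBLEMS} ({rates[name]:.0%})")
```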

Why it matters: Existing AI systems can juggle symbols and follow simple rules of deduction, but they struggle with steps that human mathematicians represent visually by, say, drawing a diagram. It’s possible to make up this deficit by (i) alternating between a large language model (LLM) and a proof finder, (ii) combining geometric and algebraic reasoning, and (iii) training the LLM on a large dataset. The result is a breakthrough for geometric problem solving.

We're thinking: In 1993, the teenaged Andrew Ng represented Singapore in the International Mathematical Olympiad, where he won a silver medal. AI’s recent progress in solving hard problems is a sine of the times!
