From Sequences to Symbols: Transformers Extend AI's Mathematical Capabilities

Published Apr 13, 2022

Given a sequence of numbers, neural networks have proven adept at discovering a mathematical expression that generates it. New work uses transformers to extend that success to a further class of expressions.

What’s new: A team at Meta (formerly Facebook) led by Stéphane d’Ascoli and Pierre-Alexandre Kamienny introduced Deep Symbolic Regression, training models to translate integer and float sequences into mathematical expressions. Unlike earlier work, their approach can find functions in which terms in a sequence depend on previous terms (such as the Fibonacci sequence, u_n = u_{n-1} + u_{n-2}). An interactive demo is available.

Key insight: Transformers excel at learning underlying patterns in natural language. Converting a sequence of numbers into a mathematical expression is analogous to translating one natural language into another.

How it works: Given a sequence of numbers, a transformer learned to generate a function made up of operators (such as add, multiply, modulo, and square root), constants, the index of the term to be computed, and references to previous terms.

To train the model, the authors generated 5 million expressions by sampling from possible values (operators, constants, and so on), assembling them in the proper format, and sampling any values required to start the sequence. Then they computed each expression’s results, generating sequences of random length between 5 and 30 terms.
During training, the loss function encouraged the generated function to match the true function.
The authors evaluated their approach by checking whether a generated expression correctly predicted the next 10 terms in a given sequence. This method was preferable to comparing generated expressions to their true equivalents, as a given expression can take various equivalent forms (by, say, swapping the order of two terms in a sum).
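
The data-generation and evaluation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The tuple-based expression format, the reduced operator set (add, subtract, multiply), the ranges for constants and starting values, the zero-based term index, and the MAX_ABS cutoff that discards runaway sequences are all assumptions made for the example.

```python
import random

# Assumed, reduced operator set; the article also lists modulo, square root, etc.
BINARY_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
# Leaves: term index, previous term, the term before that, an integer constant.
LEAVES = ("n", "u1", "u2", "const")
MAX_ABS = 10**12  # assumed cutoff so that exploding sequences are discarded


def sample_expression(rng, depth=2):
    """Sample a random expression tree as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        leaf = rng.choice(LEAVES)
        return ("const", rng.randint(1, 3)) if leaf == "const" else (leaf,)
    op = rng.choice(sorted(BINARY_OPS))
    return (op, sample_expression(rng, depth - 1), sample_expression(rng, depth - 1))


def evaluate(expr, n, history):
    """Compute the term at zero-based index n, given all earlier terms."""
    head = expr[0]
    if head == "n":
        return n
    if head == "const":
        return expr[1]
    if head == "u1":
        return history[-1]
    if head == "u2":
        return history[-2]
    return BINARY_OPS[head](evaluate(expr[1], n, history), evaluate(expr[2], n, history))


def generate_sequence(expr, initial, length):
    """Extend the starting terms with expr until the sequence has `length` terms."""
    terms = list(initial)
    while len(terms) < length:
        value = evaluate(expr, len(terms), terms)
        if abs(value) > MAX_ABS:
            raise OverflowError("term too large")
        terms.append(value)
    return terms


def make_training_example(rng):
    """Return a (sequence, expression) pair with 5 to 30 terms."""
    while True:
        expr = sample_expression(rng)
        initial = [rng.randint(-5, 5) for _ in range(2)]  # sampled starting values
        length = rng.randint(5, 30)
        try:
            return generate_sequence(expr, initial, length), expr
        except OverflowError:
            continue  # resample when the sequence grows too fast


def predicts_continuation(expr, seen, continuation):
    """Check that expr reproduces every term that follows the observed ones."""
    terms = list(seen)
    for target in continuation:
        if evaluate(expr, len(terms), terms) != target:
            return False
        terms.append(target)
    return True


fib = ("add", ("u1",), ("u2",))
print(generate_sequence(fib, [1, 1], 10))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

Scoring by continuation rather than by matching expressions means that a prediction such as ("add", ("u2",), ("u1",)), which swaps the two terms of the Fibonacci sum, still counts as correct because it produces the same next terms.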

Results: The authors compared their symbolic approach with a numeric model (a transformer trained to predict the next 10 terms in a sequence). Generating expressions of up to 10 operators that resulted in integer sequences, the symbolic model achieved 78.4 percent accuracy compared to the numeric model’s 70.3 percent. Generating expressions that resulted in float sequences, a more difficult task, the symbolic model achieved 43.3 percent accuracy compared to the numeric model’s 29 percent. Both models also outperformed Mathematica’s built-in methods for deriving functions from sequences, tested on sequences sampled from the On-Line Encyclopedia of Integer Sequences (OEIS). Given the first 15 terms of a sequence and asked to generate the next 10, the numeric and symbolic models achieved accuracies of 27.4 percent and 19.2 percent respectively. Mathematica’s FindSequenceFunction and FindLinearRecurrence achieved 12 percent and 14.8 percent respectively.

Yes, but: To rule out arbitrary sequences such as the digits of pi, the authors selected OEIS sequences classified as easy; that is, results of expressions deemed easy to compute and understand. Finding expressions that yield more complicated sequences might strain the authors’ approach.

Why it matters: Machine learning research struggles with abstract reasoning tasks. Mathematical symbols may be a piece of the solution.

We’re thinking: 2, 4, 6, 8, who do we appreciate? Transformers!
