The Ascent of the Machines: How AI is Fundamentally Changing the Nature of Mathematics

The landscape of modern scientific discourse has undeniably shifted, moving the conversation among mathematicians away from traditional subjects like string theory or quantum gravity toward the rapid rise of Artificial Intelligence (AI). While public perception oscillates between fear of a coming “singularity” and dismissal of AI as “total hype,” a growing and vibrant community of mathematicians and theoretical physicists believes that AI is poised to fundamentally change how mathematics is done.

Mathematical data is inherently well-suited for AI. For mathematicians, AI is best understood through the mantra that connectionism leads to emergence. AI systems rely on neural networks: structures in which many simple functions are linked together, loosely imitating the operations of the human brain. The theoretical power of this structure is secured by the universal approximation theorems, which prove that a sufficiently large network can approximate any continuous input-output relationship to arbitrary precision.
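
As a concrete illustration (not part of the original discussion), the following minimal sketch trains a one-hidden-layer network by plain gradient descent to approximate $\sin(x)$ on an interval; the width, learning rate, and step count are arbitrary illustrative choices.

```python
import numpy as np

# Toy universal-approximation demo: a one-hidden-layer tanh network is
# trained by plain gradient descent to approximate sin(x) on [-pi, pi].
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)  # inputs
y = np.sin(x)                                       # target pattern

hidden = 32                                         # width of the hidden layer
W1 = rng.normal(0, 1, (1, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 1, (hidden, 1)); b2 = np.zeros(1)
lr = 0.05

for step in range(10_000):
    h = np.tanh(x @ W1 + b1)        # hidden activations
    pred = h @ W2 + b2              # network output
    err = pred - y                  # prediction error
    # Backpropagation: gradients of the mean-squared error.
    gW2 = h.T @ err / len(x); gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h**2)
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print("max |network - sin| after training:", float(np.abs(pred - y).max()))
```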

A Brief History of Artificial Intelligence

The conceptual roots of AI stretch back centuries, with René Descartes proposing the idea of the “beast machine” in the seventeenth century. In 1842, Ada Lovelace, working on Charles Babbage’s Analytical Engine, made a prescient statement, suggesting the machine might one day “compose elaborate and scientific pieces of music of any degree of complexity”, a prediction reminiscent of modern large language models (LLMs).

The term “Artificial Intelligence” itself was coined by John McCarthy and colleagues in the proposal for the 1956 Dartmouth workshop. Shortly thereafter, Frank Rosenblatt’s perceptron marked the beginning of artificial neural networks. An earlier critical milestone was Alan Turing’s 1950 imitation game (later named the Turing Test) for determining whether a person can distinguish between interacting with a human or a machine, a test that ChatGPT is widely argued to have passed after its release in 2022. Other key moments include IBM’s Deep Blue beating world champion Garry Kasparov at chess in 1997, and DeepMind’s AlphaGo defeating top human players at the vastly more complicated game of Go in 2016.

Defining Mathematics: The Study of Patterns

The essence of mathematics, according to the great mathematician G.H. Hardy, is that a mathematician, “like a painter or a poet, is a maker of patterns”. Mathematics, in this view, is the natural language for studying patterns, whether they appear in sums, sequences, pictures, or geometry.

The influence of AI on mathematical practice can be explored through three distinct directions: bottom up, top down, and meta-mathematics.

1. Bottom Up: The Formalization of Proof

The “bottom up” approach traces back to Euclid’s Elements (circa 300 BC), built upon foundational axioms. A major 20th-century attempt to axiomatize mathematics was Principia Mathematica (1910) by Russell and Whitehead, which, remarkably, took 362 pages to prove that $1+1=2$. This ambitious program was curtailed by Gödel’s incompleteness theorems, which established that any consistent formal system rich enough to express arithmetic contains true statements that cannot be proved within the system.

Despite Gödel’s limitation, computer scientists persisted. In 1956, the Logic Theorist (also known as the Logic Theory Machine) of Newell, Simon, and Shaw used early computing to prove dozens of theorems from Principia Mathematica.

Today, computational power (far exceeding that available to the 1960s lunar landing program) has revived this pursuit. The Mathlib project, together with related efforts such as Kevin Buzzard’s Xena Project, is an ongoing community effort to formalize the axiomatic foundations and proofs of modern mathematics using proof assistants like Lean. Within roughly a decade, this effort has formalized essentially the entire undergraduate mathematics curriculum. Formalization is deemed crucial because even in the most rigorously peer-reviewed journals, such as the Annals of Mathematics, mutually contradictory papers have slipped through review, indicating a need for machine-checked proofs.
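
To give a flavour of formalization in practice, here is a minimal sketch in Lean 4 syntax (illustrative only, not excerpted from Mathlib): the fact that famously took Principia Mathematica hundreds of pages is checked in a single line, and a statement about all natural numbers is discharged by citing a library lemma.

```lean
-- The Principia milestone: 1 + 1 = 2 holds by definitional computation,
-- so the kernel accepts `rfl` (reflexivity) as the entire proof.
example : 1 + 1 = 2 := rfl

-- A statement about all natural numbers, discharged by citing a lemma
-- already available in the library.
example (n : Nat) : 0 + n = n := Nat.zero_add n
```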

2. Top Down: Experimental Mathematics and Conjecture

The “top down” direction aligns with experimental mathematics, where practitioners “muck around” with data and ideas, trying to find patterns. The great Soviet mathematician Vladimir Arnold half-jokingly suggested that mathematics is merely “a branch of physics where the experiments are cheap”.

Historically, this approach yielded profound results:

  • Gauss (at age 16) observed patterns in the distribution of prime numbers and conjectured the Prime Number Theorem, that the number of primes up to $x$ grows like $x/\log x$; it was only rigorously proved about a century later (a small numerical sketch of this kind of experiment follows this list).

  • Two of the six remaining Millennium Prize Problems originated from experimental observations. The Riemann Hypothesis was posited by Bernhard Riemann in 1859 after checking only about 20 zeros of the zeta function. The Birch and Swinnerton-Dyer (BSD) conjecture arose in the 1960s, when Bryan Birch and Peter Swinnerton-Dyer used the EDSAC computer to plot properties of elliptic curves, making it one of the first major conjectures raised with computer assistance.
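
This style of “top down” experimentation is easy to reproduce today. The minimal sketch below (illustrative, not drawn from the original sources) tabulates the prime-counting function $\pi(x)$ with a simple sieve and compares it with Gauss’s conjectured growth rate $x/\log x$.

```python
import math

def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes up to and including n."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

primes = primes_up_to(10**6)

# Compare the prime-counting function pi(x) with Gauss's guess x / log x.
for x in (10**3, 10**4, 10**5, 10**6):
    pi_x = sum(1 for p in primes if p <= x)
    guess = x / math.log(x)
    print(f"x = {x:>8}  pi(x) = {pi_x:>6}  x/log x = {guess:>9.1f}"
          f"  ratio = {pi_x / guess:.3f}")
```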

Today, AI is being deployed for conjecture formulation. To assess the quality of AI-assisted mathematical discovery, a stringent benchmark has been proposed: The Birch Test. A system must pass three criteria:

  1. Automatic (A): The system must generate the conjecture automatically, without human guidance filtering the data.

  2. Interpretable (I): The output must be mathematically useful—an equation or a formula, not just a black-box program prediction.

  3. Non-trivial (N): The conjecture must be interesting enough for human mathematicians to actively work on.

Over the last eight years, **nothing has fully passed the Birch Test**. However, systems have come close. Using neural networks and statistical techniques, researchers analyzed $3.5$ million elliptic curves from the LMFDB database. This analysis produced a precise, open conjecture related to the BSD problem that has defied the top minds in the field, representing a major breakthrough in AI-human collaboration.
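
The general shape of such a pipeline can be sketched schematically. The snippet below is purely illustrative: randomly generated placeholder features and labels stand in for real LMFDB curve data (for instance sequences of $a_p$ coefficients and ranks), and a simple off-the-shelf classifier stands in for the models actually used; only the workflow, not the mathematics, is being shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Schematic only: random placeholder "features" and "labels" stand in for
# genuine elliptic-curve data; the point is the shape of the workflow
# (featurize curves, fit a model, check held-out accuracy), not the result.
rng = np.random.default_rng(1)
n_curves, n_features = 5_000, 50
X = rng.normal(size=(n_curves, n_features))      # placeholder curve features
y = (X[:, :5].sum(axis=1) > 0).astype(int)       # placeholder binary invariant

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

# In the real studies, unexpectedly high accuracy on genuine curve data is
# the signal that a pattern exists; interrogating what the model relies on
# is what points humans toward a precise, stateable conjecture.
```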

3. Meta-Mathematics: Large Language Models in Research

The final, and perhaps most mysterious, direction is meta-mathematics, involving the use of Large Language Models (LLMs) like ChatGPT and DeepSeek. LLMs are based on neural networks that statistically infer the most likely next word, making them, in essence, sophisticated auto-completion systems.
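
The “most likely next word” idea can be made concrete with a deliberately tiny toy (nothing like a real LLM in architecture or scale, but the prediction loop has the same shape): a bigram table built from a short made-up corpus, extended greedily one word at a time.

```python
from collections import Counter, defaultdict

# A bigram "language model": record which word tends to follow which, then
# auto-complete a prompt by repeatedly appending the most likely next word.
corpus = (
    "the prime numbers thin out as the numbers grow and "
    "the prime counting function grows like x over log x"
).split()

next_counts: defaultdict[str, Counter] = defaultdict(Counter)
for word, following in zip(corpus, corpus[1:]):
    next_counts[word][following] += 1

def complete(prompt: str, n_words: int = 5) -> str:
    words = prompt.split()
    for _ in range(n_words):
        candidates = next_counts.get(words[-1])
        if not candidates:
            break                                      # nothing ever followed this word
        words.append(candidates.most_common(1)[0][0])  # greedy next-word choice
    return " ".join(words)

print(complete("the prime"))
```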

LLMs have demonstrated stunning capabilities in high-level mathematics competitions. In 2024, a combination of DeepMind’s AlphaProof (which builds on the Lean proof assistant and its Mathlib library) and AlphaGeometry 2 performed at silver-medal level at the extremely difficult International Mathematical Olympiad (IMO). More recently, a Gemini-based system reached gold-medal level.

Moving to true research level, the **Frontier Math Project**, the research analog of the IMO challenge, was launched. In a secret meeting in Berkeley, 30 human mathematicians formulated 40 challenging research-level problems (Tier 4) that required precise, long numerical answers to prevent guessing. Recent benchmarking shows that large language models achieved a roughly 10% success rate on these Tier 4 problems. This suggests that LLMs can sometimes take problems that even specialists struggle to understand and produce a precise answer with correct reasoning, apparently by pattern matching across the entirety of the internet’s mathematical corpus.

Conclusion

While AI has yet to solve any major open conjecture in mathematics, it is rapidly helping advance discovery and changing the daily routine of practitioners, assisting with coding and citation searches. Researchers anticipate the next phase, Tier 5 of the Frontier Math Project, will involve challenging AI with open problems that humans cannot yet solve.

The consensus among those driving this change is that the future of mathematics will be defined by humans working in tandem with AI, viewing the technology as a partner rather than a competitor poised to take jobs. Despite these developments, funding agencies in some countries remain conservative. Nevertheless, the current era is arguably the most exciting for mathematics since the time of Euclid, driven by the rise of AI across axiomatic foundations, experimental discovery, and general research problem-solving.