The AI revolution in math has arrived

bgirard · 36 days ago

Last week I got together with a fellow math alum. We cracked some beers, chatted with ChatGPT in voice mode, toyed around with the Collatz Conjecture, and sent some prompts to a coding agent to build visualizations and simulations. It was a lot of fun directing these agents while we bounced ideas around and the models explored them.

With the right problem and the right agentic loop, it's clear to me that improvements will speed up.
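For anyone who wants to poke at the same thing, here is a minimal sketch of the Collatz iteration mentioned above (halve even numbers, map odd n to 3n + 1, stop at 1). The function name `collatz_steps` is just an illustrative choice, not what the coding agent in the comment produced.

```python
def collatz_steps(n: int) -> int:
    """Count iterations of the Collatz map needed to reach 1 from n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Even: halve. Odd: 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


if __name__ == "__main__":
    # Stopping times for a few small starting values.
    for start in (6, 7, 27):
        print(start, collatz_steps(start))
```

The conjecture is that this loop terminates for every positive integer; the code only checks individual starting values and proves nothing.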

dogscatstrees · 36 days ago

> As they did so, they also learned how to improve the prompts they gave AlphaEvolve. One key takeaway: The model seemed to benefit from encouragement. It worked better “when we were prompting with some positive reinforcement to the LLM,” Gómez-Serrano said. “Like saying ‘You can do this’ — this seemed to help. This is interesting. We don’t know why.”

Four of the most logical people in the world are acknowledging this. It is mind-blowing and we don't know why.

sm0ss117 · 36 days ago

Mathematics seems like the ideal candidate for AIs to achieve absurd results. It's a purely abstract grammar with true auto-verifiability. Even SWE has the requirement of interacting with real physical things. In math there's no external feedback required; you're bounded only by the rate and quality of token generation.

claysmithr · 36 days ago

I wonder when AI will be able to discern the passage of time

Buttons840 · 36 days ago

Can't you just give it the time in each prompt? Would that work?

I've seen this mentioned a few times though, so I think maybe it's more complicated than this?
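A minimal sketch of what "give it the time in each prompt" could look like. The helper `with_timestamp` and the bracketed header format are hypothetical, and no real model API is called here; this only builds the text you would send.

```python
from datetime import datetime, timezone
from typing import Optional


def with_timestamp(prompt: str, now: Optional[datetime] = None) -> str:
    """Prepend the current UTC time to a prompt (hypothetical header format)."""
    if now is None:
        now = datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d %H:%M UTC")
    return f"[Current time: {stamp}]\n{prompt}"


if __name__ == "__main__":
    print(with_timestamp("How many days until the end of the month?"))
```

This covers calendar arithmetic within a conversation. It does not by itself tell the model anything about what has happened in the world since its training data ended.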

quartz · 36 days ago

Giving it the date helps for calendar math and "how long ago" questions, but that's not really what people mean. The harder problem is the model acting like its training cutoff is "now" — confidently reasoning about current events or the state of a field as if nothing happened since. Knowing today's date doesn't fix that gap.

1970-01-01 · 36 days ago

It already does time in prompt-blocks. It knows time is linear and what just happened, what happened before that, and what happened before that.

maplethorpe · 36 days ago

Altman has estimated one year until ChatGPT is capable of measuring time passed.

https://tech.yahoo.com/ai/chatgpt/articles/chatgpt-fails-mis...

smarsh · 36 days ago

So we need AI to figure out when AI will understand time.

mwalsh · 36 days ago

Technically "discerning time" and "knowing how much real-world time has passed since training" are different problems. The first is mostly handled within a context window already. The second is really just a knowledge cutoff issue, not a temporal perception one. Though I guess the distinction doesn't matter much practically.

themafia · 36 days ago

There are several high value prizes for mathematical research. Let me know when an "AI" has earned one of them. Otherwise:

> When Ryu asked ChatGPT, “it kept giving me incorrect proofs,” [...] he would check its answers, keep the correct parts, and feed them back into the model

So you had a conversational calculator being operated by an actual domain expert.

> With ChatGPT, I felt like I was covering a lot of ground very rapidly

There's no way to convert that feeling into a measurement of any actual value and we happen to know that domain experts are surprisingly easy to fool when outside of their own domains.

gxs · 36 days ago

Wow that was your takeaway?

> “2025 was the year when AI really started being useful for many different tasks,” said Terence Tao

I think I’ll go out on a limb and agree with Terence Tao. I think the dude is well known in the math community, or something

doubledamio · 36 days ago

All these overly optimistic articles about AI solving maths problems are very annoying. Can we agree that maths is not about solving problems, but about understanding them by developing a language and the conditions for new insights? It is misleading because GPTs do provide easy access to new information, but they do not deepen understanding.

I think AI-assisted research will likely have a very negative net impact on mathematics in the long run by lowering the average level of understanding within the community.

Also, research directions are influenced by what people can solve, and this will slowly shift research toward purely algebraic/symbolic manipulations that mathematicians no longer fully keep track of.

somethingsome · 34 days ago

It's highly dependent on why you use it. For me a problem looks like 'a step in the proof I'm not familiar with', and I use LLMs to help me understand it deeply: make visualizations, check some difficult step, draw parallels with something else I know, and so on. I don't really care whether the LLM could 'solve the global problem I'm facing'. I use it more for insights on smaller parts, to get through difficult steps and to teach me areas I'm not familiar with. The more capable the LLM is of doing complicated proofs by itself, the more I can trust it to help me without making errors that I could miss in unfamiliar areas of maths.

norejisace · 36 days ago

Interesting development. It feels like AI is getting much better at symbolic reasoning, not just pattern recognition.

440bx · 36 days ago

Boring mathematical reality here. This is nice and all, but as a (part-time) corporate mathematician, I'd like an AI that organises conference trips, picks the best accommodation and food, and gaslights the execs into approving it. Then fixes the perpetually broken coffee machine. Everything else for me starts on paper and is mostly undergrad-level problems, which I need to do by hand to keep my brain going for when I actually might need it one day. And with the geopolitical instability out there at the moment, I'm not that willing to put my eggs into that basket.

pyuser583 · 35 days ago

I just want it to cook and clean.

viccis · 36 days ago

What is the telos for AI chewing around the edges of pure math problems? Does AI care about math?

yabutlivnWoods · 36 days ago

We can define a Dyson Sphere in math.

We cannot build one.

AI outputting axiomatically valid syntax isn't going to be all that useful. It's possible to generate all axiomatically correct math with a for loop until the machine OOMs.

Physics is not math and math is not physics.

djsjajah · 36 days ago

You just failed the Turing test.

ameyer · 36 days ago

The Dyson Sphere analogy cuts the wrong way here. The useful part isn't generating valid syntax — it's that AI can search proof space in ways humans can't. We spent months stuck on a formalization problem at work; a model found a lemma we'd overlooked in two hours. That's not a for loop.

homarp · 36 days ago

https://www.quantamagazine.org/about/ says "launched by the Simons Foundation in 2012"

and https://www.simonsfoundation.org/about/ has "Since its founding in 1994 by Jim and Marilyn Simons"

https://en.wikipedia.org/wiki/Jim_Simons explains how Jim Simons got rich.

The book 'The Man Who Solved the Market' - https://www.gregoryzuckerman.com/the-books/the-man-who-solve... is a nice read.

HN discussion on a review of the book - https://news.ycombinator.com/item?id=29392041

Wissenschafter · 36 days ago

More neo-luddite nonsense.