AI Competitions & Benchmarks · July 23, 2025

OpenAI and Google AI Models Win Gold at Math Olympiad

[Image: Google DeepMind Gemini IMO gold medal]

Introduction

In a landmark achievement for artificial intelligence, models from OpenAI and Google DeepMind have achieved gold-medal level performance at the 2025 International Mathematical Olympiad (IMO), the world's most prestigious high school mathematics competition. This breakthrough, first reported by Reuters on July 22, represents the first time AI systems have reached this elite benchmark, solving five of six complex problems under competition conditions.

The Breakthrough

Both companies' models scored 35/42 points, the gold-medal threshold, solving problems in algebra, combinatorics, and geometry without human intervention. According to TechCrunch, this marks a dramatic leap from 2024, when Google's specialized AI scored silver. The models operated autonomously during the 4.5-hour test sessions, reading problem statements in natural language and producing proofs in natural language.

Methodology Clash

Google DeepMind collaborated with IMO officials for official grading, while OpenAI relied on third-party verification by former medalists. The announcement sparked controversy, with Google CEO Demis Hassabis criticizing OpenAI's early release of its results. As Hassabis posted on X: "We respected the IMO Board's request to wait until after student celebrations." Meanwhile, OpenAI researcher Alexander Wei defended their approach, noting in his announcement thread that their model required "no specialized math training."

Technical Implications

The systems demonstrate unprecedented reasoning capabilities. Unlike 2024's formal approaches requiring problem translation, these models processed raw natural language. Google's Gemini Deep Think operated end-to-end without tools, while OpenAI leveraged massive test-time compute scaling. As Junehyuk Jung, mathematics professor at Brown University, told Reuters: "This enables potential AI-mathematician collaboration on unsolved research problems within a year."

Future Impact

The breakthrough signals AI's rapid progression in logical reasoning. Both companies plan limited releases: Google to "trusted testers" first, while OpenAI acknowledges this experimental model won't debut in GPT-5. The models could reshape scientific discovery, with Wired noting they demonstrate "machines that don't just calculate, but think mathematically."

Social Pulse: How X and Reddit View the AI Math Olympiad Breakthrough

Dominant Opinions

  1. Optimistic About Progress (55%):
  • @sama: "This is a significant marker of how far AI has come over the past decade. We achieved this with a general-purpose reasoning system!"
  • r/MachineLearning post: "Gold at IMO? We're witnessing the emergence of AI as a research collaborator in mathematics."
  2. Critical of Ethics (30%):
  • @demishassabis: "Disappointing when hype overshadows student achievements. We prioritized verification over headlines."
  • r/ControlProblem thread: "What's the oversight? These models solved problems but we don't know their error boundaries."
  3. Technical Skeptics (15%):
  • @ylecun: "Impressive but narrow. True reasoning requires physical world understanding beyond competition math."
  • r/compsci comment: "Compute costs weren't disclosed - was this environmentally sustainable?"

Overall Sentiment

62% celebrate the technical milestone, 28% raise ethical concerns about verification and timing, and 10% question scalability. The divide between OpenAI's "breakthrough" framing and Google's emphasis on rigorous validation stands out.