Ioannis Antonoglou

Ioannis Antonoglou is a Greek computer scientist, co-founder, president, and chief technology officer of Reflection AI, and a former founding-era research engineer at Google DeepMind who co-authored AlphaGo, AlphaGo Zero, AlphaZero, and MuZero.
Ioannis Antonoglou

Ioannis Antonoglou

Ioannis Antonoglou is a Greek computer scientist and reinforcement-learning researcher. He is the co-founder, president, and chief technology officer of Reflection AI, the 2024 startup building autonomous coding agents and a planned open-weights frontier language model. He was previously a founding-era research engineer at Google DeepMind, where he co-authored the AlphaGo, AlphaGo Zero, AlphaZero, and MuZero papers with David Silver and later led the reinforcement-learning-from-human-feedback program for Gemini alongside Misha Laskin.

At a glance

Origins

Antonoglou was born and raised in Thessaloniki, Greece, and completed a diploma in electrical and computer engineering at the Aristotle University of Thessaloniki in 2011. He moved to the UK for graduate study and finished an M.Sc. in artificial intelligence and machine learning at the University of Edinburgh in 2012. He has been based in London since.

Career

Antonoglou joined Google DeepMind in late 2012, two years before the Google acquisition, as employee number twenty-five and the sixth member of the research team. He was a co-author of the 2013 Playing Atari with Deep Reinforcement Learning preprint with Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Daan Wierstra, and Martin Riedmiller, and the follow-on February 2015 Nature paper Human-level control through deep reinforcement learning that introduced the deep Q-network architecture and established deep reinforcement learning as a viable research program.

Through the AlphaGo era (2014 to 2017) Antonoglou worked under Silver as the research engineer responsible for accelerating the neural networks that ran the system, including GPU optimization for the Lee Sedol-era training and the subsequent migration of the systems to Google's first-generation Tensor Processing Units. He was a co-author on the January 2016 Nature paper that introduced AlphaGo, the October 2017 AlphaGo Zero Nature paper, and the December 2017 AlphaZero preprint and December 2018 Science publication that generalized AlphaGo Zero to chess and shogi.

The MuZero project followed in 2019 and 2020. Antonoglou was the second of twelve authors on the December 2020 Nature paper Mastering Atari, Go, chess and shogi by planning with a learned model, behind first author Julian Schrittwieser. MuZero extended AlphaZero to settings where the rules of the environment are not given to the agent, by learning a model of the environment dynamics jointly with the policy and value networks. The MuZero second-author credit is the most prominent single artifact on Antonoglou's published record. In parallel with the DeepMind work, he completed a part-time PhD at University College London under Silver's supervision; the thesis Learning to Search in Reinforcement Learning was filed in 2023 and consolidated the AlphaZero, MuZero, and planning-with-learned-models research arc into a single doctoral submission.

After the December 2022 ChatGPT release, Antonoglou transitioned within DeepMind to lead the reinforcement-learning-from-human-feedback program for Gemini, the post-training pipeline that trains the preference and reward models underlying Google's frontier-model family. Misha Laskin, who had joined DeepMind in February 2022 and led reward modeling on the same program, was his principal collaborator across the 2022 to 2024 period.

In March 2024 Antonoglou and Laskin departed DeepMind to co-found Reflection AI, with Antonoglou as president and chief technology officer and Laskin as chief executive officer. The company emerged from stealth in March 2025 with $130 million in cumulative funding, including a $25 million seed round and a $105 million Series A co-led by Lightspeed Venture Partners and CRV. The first product, Asimov, is an autonomous coding agent released in November 2024 that ingests source code, project documentation, internal communications, and engineering notes to build a model of how a software system was developed. In October 2025 Reflection AI raised a $2 billion Series B at an $8 billion valuation led by NVIDIA, with additional participation from Lightspeed, Sequoia Capital, and Eric Schmidt. The company has publicly committed to releasing an open-weights frontier model in 2026 positioned as a US-domiciled alternative to DeepSeek.

Affiliations

Notable contributions

Antonoglou's published record runs from the founding-era DQN paper through the AlphaGo to MuZero arc and into the Gemini RLHF program, before the founding of Reflection AI.

Investments and boards

The entries below are limited to AI, semiconductors, datacenters, software, and energy.

  • Reflection AI (AI): Co-founder, president, and chief technology officer, March 2024 to present. Cumulative funding approximately $2.13 billion through April 2026, including a $25 million seed round, a $105 million Series A co-led by Lightspeed Venture Partners and CRV in early 2025, and a $2 billion Series B at an $8 billion valuation in October 2025 led by NVIDIA.

No other public investor activity on record in AI, semiconductors, datacenters, software, or energy as of May 2026.

Network

Antonoglou's longest-running professional relationship is with David Silver, who supervised his UCL doctoral work and led the AlphaGo through MuZero projects through which the two co-authored the principal Nature and Science papers from 2015 to 2020. Silver departed DeepMind in January 2026 to found Ineffable Intelligence. Antonoglou's recurring DeepMind co-authors across the AlphaGo to MuZero arc include Julian Schrittwieser (MuZero first author), Thomas Hubert, Karen Simonyan, Arthur Guez, and Laurent Sifre. From the earlier DQN era, his collaborators include Volodymyr Mnih (DQN first author) and Koray Kavukcuoglu.

His current closest professional relationship is with Misha Laskin, the Reflection AI chief executive officer with whom he ran the Gemini RLHF program at DeepMind from 2022 to 2024. The senior DeepMind research staff he overlapped with in the Gemini period include Demis Hassabis, Shane Legg, and Lila Ibrahim. Among Insurgent-lab co-founder peers, his Reflection AI position runs in parallel with Tim Rocktäschel at Recursive Superintelligence and Silver at Ineffable Intelligence, the two other 2025 to 2026 reinforcement-learning founders to depart Google DeepMind for independent labs.

Position in the field

As of May 2026, Antonoglou occupies a structurally distinctive position among Insurgent-lab chief technology officers through a twelve-year DeepMind tenure that included co-author credit on every major AlphaGo and MuZero paper, the UCL doctoral submission filed in 2023, and the rapid Reflection AI valuation acceleration to $8 billion. The AlphaGo to MuZero arc places him in the small group of senior researchers with co-author credit on the four foundational Nature and Science publications that defined deep reinforcement learning between 2015 and 2020.

Industry coverage has consistently characterized him as the AlphaGo-credentialed research counterpart to Laskin's chief-executive role at Reflection AI, with Antonoglou running the technical and research direction and Laskin running the public-facing strategic and fundraising functions. The October 2025 NVIDIA-led round placed Reflection AI among the highest-valuation Insurgent labs in the 2024 to 2025 cohort, behind only the largest scale (Safe Superintelligence, Thinking Machines Lab, and Ineffable Intelligence) on capital base. The DeepMind-to-founder trajectory in 2025 to 2026 includes Silver at Ineffable, Rocktäschel at Recursive Superintelligence, and Antonoglou and Laskin at Reflection AI.

Outlook

Open questions over the next 6 to 18 months:

  • Open-weights frontier-model release. The 2026 commitment to release a US-domiciled open-weights frontier model is the central public milestone for Reflection AI. Release timing, capability profile, parameter count, and licensing terms will shape the durability of the "America's open frontier AI lab" framing and Antonoglou's technical leadership role.
  • Asimov adoption and revenue. Customer base, deal sizes, and competitive position against OpenAI Codex, Anthropic Claude Code, and Cursor will inform whether the autonomous framing differentiates from the broader coding-agent market.
  • Research-direction validation. Whether the AlphaZero and MuZero search-and-planning thesis from his DeepMind work generalizes to open-ended reasoning at frontier scale.
  • Senior research talent recruitment. Continued movement of reinforcement-learning and post-training specialists from DeepMind, OpenAI, and Anthropic into Reflection AI's London, San Francisco, and New York offices, which had approximately 60 staff at the October 2025 round.
  • Follow-on financing. Reflection AI was reported in March 2026 to be seeking new investors at a valuation above $20 billion, which if confirmed would mark a further step from the October 2025 round.

Sources

About the author
Nextomoro

Nextomoro

nextomoro tracks progress for AI research labs, models, and what's next.

AI Research Lab Intelligence

nextomoro tracks progress for AI research labs, models, and what's next.

AI Research Lab Intelligence

Great! You’ve successfully signed up.

Welcome back! You've successfully signed in.

You've successfully subscribed to AI Research Lab Intelligence.

Success! Check your email for magic link to sign-in.

Success! Your billing info has been updated.

Your billing was not updated.