Ioannis Antonoglou
Ioannis Antonoglou is a Greek computer scientist and reinforcement-learning researcher. He is the co-founder, president, and chief technology officer of Reflection AI, a startup founded in 2024 that builds autonomous coding agents and plans to release an open-weights frontier language model. He was previously a founding-era research engineer at DeepMind (now Google DeepMind), where he co-authored the AlphaGo, AlphaGo Zero, AlphaZero, and MuZero papers with David Silver and later led the reinforcement-learning-from-human-feedback program for Gemini alongside Misha Laskin.
At a glance
- Education: Diploma in electrical and computer engineering, Aristotle University of Thessaloniki (2011); M.Sc. in artificial intelligence and machine learning, University of Edinburgh (2012); PhD in computer science, University College London (2023), supervised by David Silver, with a thesis titled Learning to Search in Reinforcement Learning.
- Current role: Co-founder, president, and chief technology officer of Reflection AI since March 2024.
- Key contributions: co-author of the AlphaGo (Nature, January 2016), AlphaGo Zero (Nature, October 2017), and AlphaZero (December 2017) papers; second author on the MuZero Nature paper (December 2020); co-author of the DQN Nature paper (February 2015); reinforcement-learning-from-human-feedback program lead on Gemini at Google DeepMind (2022 to 2024); co-founder of Reflection AI alongside Misha Laskin (2024).
- X / Twitter: @real_ioannis
- LinkedIn: Ioannis Alexandros Antonoglou
- Google Scholar: Ioannis Antonoglou
Origins
Antonoglou was born and raised in Thessaloniki, Greece, and completed a diploma in electrical and computer engineering at the Aristotle University of Thessaloniki in 2011. He moved to the UK for graduate study and finished an M.Sc. in artificial intelligence and machine learning at the University of Edinburgh in 2012. He has been based in London since.
Career
Antonoglou joined DeepMind in late 2012, more than a year before Google acquired the company in January 2014, as employee number twenty-five and the sixth member of the research team. He was a co-author of the 2013 preprint Playing Atari with Deep Reinforcement Learning with Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Daan Wierstra, and Martin Riedmiller, and of the follow-on February 2015 Nature paper Human-level control through deep reinforcement learning. Together these papers introduced the deep Q-network (DQN) architecture and established deep reinforcement learning as a viable research program.
Through the AlphaGo era (2014 to 2017) Antonoglou worked under Silver as the research engineer responsible for accelerating the neural networks that ran the system, including GPU optimization for the Lee Sedol-era training and the subsequent migration of the systems to Google's first-generation Tensor Processing Units. He was a co-author on the January 2016 Nature paper that introduced AlphaGo, the October 2017 AlphaGo Zero Nature paper, and the December 2017 AlphaZero preprint and December 2018 Science publication that generalized AlphaGo Zero to chess and shogi.
The MuZero project followed in 2019 and 2020. Antonoglou was the second of twelve authors on the December 2020 Nature paper Mastering Atari, Go, chess and shogi by planning with a learned model, behind first author Julian Schrittwieser. MuZero extended AlphaZero to settings where the rules of the environment are not given to the agent, by learning a model of the environment dynamics jointly with the policy and value networks. The MuZero second-author credit is the most prominent single item on Antonoglou's published record. In parallel with the DeepMind work, he completed a part-time PhD at University College London under Silver's supervision; the thesis, Learning to Search in Reinforcement Learning, was filed in 2023 and consolidated the AlphaZero, MuZero, and planning-with-learned-models research arc into a single doctoral submission.
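The learned-model idea behind MuZero described above can be illustrated with a toy sketch. It is not DeepMind's implementation: the class and function names are hypothetical, the weights are random and untrained, and the tree search and training loss are omitted. It only shows the three learned functions from the MuZero paper (representation, dynamics, prediction) and how a sequence of actions is unrolled in hidden-state space without ever querying the real environment's rules.

```python
import math
import random
from typing import List, Tuple

State = List[float]


def make_matrix(rows: int, cols: int, rng: random.Random) -> List[List[float]]:
    """Random weights standing in for trained network parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def layer(weights: List[List[float]], inputs: List[float]) -> List[float]:
    """One tanh layer: a tiny stand-in for a deep network."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def softmax(logits: List[float]) -> List[float]:
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyLearnedModel:
    """Hypothetical toy version of MuZero's three learned functions."""

    def __init__(self, obs_dim: int, hidden_dim: int, num_actions: int, seed: int = 0):
        rng = random.Random(seed)
        self.num_actions = num_actions
        self.w_repr = make_matrix(hidden_dim, obs_dim, rng)
        self.w_dyn = make_matrix(hidden_dim, hidden_dim + num_actions, rng)
        self.w_reward = make_matrix(1, hidden_dim + num_actions, rng)
        self.w_policy = make_matrix(num_actions, hidden_dim, rng)
        self.w_value = make_matrix(1, hidden_dim, rng)

    def represent(self, observation: List[float]) -> State:
        """Representation function: observation -> hidden state."""
        return layer(self.w_repr, observation)

    def dynamics(self, state: State, action: int) -> Tuple[float, State]:
        """Dynamics function: (hidden state, action) -> (reward, next hidden state)."""
        one_hot = [1.0 if a == action else 0.0 for a in range(self.num_actions)]
        joint = state + one_hot
        return layer(self.w_reward, joint)[0], layer(self.w_dyn, joint)

    def predict(self, state: State) -> Tuple[List[float], float]:
        """Prediction function: hidden state -> (policy, value)."""
        return softmax(layer(self.w_policy, state)), layer(self.w_value, state)[0]


def unroll(model: ToyLearnedModel, observation: List[float], actions: List[int]):
    """Plan along an action sequence using only the learned model."""
    state = model.represent(observation)
    steps = []
    for action in actions:
        policy, value = model.predict(state)
        reward, state = model.dynamics(state, action)
        steps.append((policy, value, reward))
    return steps


model = ToyLearnedModel(obs_dim=4, hidden_dim=3, num_actions=2)
observation = [0.1, -0.2, 0.3, 0.0]
actions = [0, 1, 1]
steps = unroll(model, observation, actions)
for policy, value, reward in steps:
    print([round(p, 3) for p in policy], round(value, 3), round(reward, 3))
```

In the published system the three functions are deep networks trained jointly so that predicted policies, values, and rewards match search results and observed outcomes; the sketch keeps only the data flow between them.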
After the November 2022 ChatGPT release, Antonoglou moved within DeepMind to lead the reinforcement-learning-from-human-feedback program for Gemini, the post-training pipeline that trains the preference and reward models underlying Google's frontier-model family. Misha Laskin, who had joined DeepMind in February 2022 and led reward modeling on the same program, was his principal collaborator from 2022 to 2024.
In March 2024 Antonoglou and Laskin left DeepMind to co-found Reflection AI, with Antonoglou as president and chief technology officer and Laskin as chief executive officer. The company emerged from stealth in March 2025 with $130 million in cumulative funding, comprising a $25 million seed round and a $105 million Series A co-led by Lightspeed Venture Partners and CRV. The first product, Asimov, is an autonomous coding agent released in July 2025 that ingests source code, project documentation, internal communications, and engineering notes to build a model of how a software system was developed. In October 2025 Reflection AI raised a $2 billion Series B at an $8 billion valuation led by NVIDIA, with additional participation from Lightspeed, Sequoia Capital, and Eric Schmidt. The company has publicly committed to releasing an open-weights frontier model in 2026, positioned as a US-domiciled alternative to DeepSeek.
Affiliations
- Google DeepMind: Founding-era research engineer through senior staff researcher, late 2012 to March 2024.
- University College London: Part-time PhD in computer science, supervised by David Silver; thesis filed 2023.
- Reflection AI: Co-founder, president, and chief technology officer, March 2024 to present.
Notable contributions
Antonoglou's published record runs from the founding-era DQN paper through the AlphaGo to MuZero arc and into the Gemini RLHF program, before the founding of Reflection AI.
- Playing Atari with Deep Reinforcement Learning (December 2013) and Human-level control through deep reinforcement learning (February 2015). Co-author on the NIPS 2013 workshop preprint and the follow-on Nature paper introducing the deep Q-network architecture for Atari, with Volodymyr Mnih, Koray Kavukcuoglu, and David Silver as lead authors.
- Mastering the game of Go with deep neural networks and tree search (January 2016). Eighth of twenty authors on the AlphaGo Nature paper, with Silver and Aja Huang as lead authors. The system defeated Fan Hui 5 to 0 in October 2015 and Lee Sedol 4 to 1 in March 2016 in Seoul.
- Mastering the game of Go without human knowledge (October 2017). Fourth of seventeen authors on the AlphaGo Zero Nature paper, which learned Go from self-play and surpassed the original AlphaGo in three days of training.
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (December 2017). Fourth of thirteen authors on the AlphaZero preprint and December 2018 Science paper, generalizing AlphaGo Zero across chess, shogi, and Go from random initialization.
- Mastering Atari, Go, chess and shogi by planning with a learned model (December 2020). Second of twelve authors on the MuZero Nature paper, with Julian Schrittwieser as first author. MuZero extended AlphaZero to environments where the rules are unknown.
- Learning to Search in Reinforcement Learning (2023). UCL PhD thesis supervised by David Silver.
- Gemini RLHF program (2022 to 2024). Reinforcement-learning-from-human-feedback program lead at Google DeepMind, working alongside Misha Laskin on reward modeling.
- Reflection AI founding (March 2024). Co-founder, president, and chief technology officer. The company has shipped the Asimov coding agent, raised approximately $2.13 billion, and committed to a 2026 open-weights frontier-model release.
- Public-talk record. From AlphaGo to AGI on Sequoia Capital's Training Data podcast (January 2025); Why Reflection AI Bets Their Business on Open Weights at Sierra Ventures (2025); Outliers Greek-language podcast.
Investments and boards
The entries below are limited to AI, semiconductors, datacenters, software, and energy.
- Reflection AI (AI): Co-founder, president, and chief technology officer, March 2024 to present. Cumulative funding approximately $2.13 billion through April 2026, including a $25 million seed round, a $105 million Series A co-led by Lightspeed Venture Partners and CRV in early 2025, and a $2 billion Series B at an $8 billion valuation in October 2025 led by NVIDIA.
No other public investor activity on record in AI, semiconductors, datacenters, software, or energy as of May 2026.
Network
Antonoglou's longest-running professional relationship is with David Silver, who supervised his UCL doctoral work and led the AlphaGo through MuZero projects. The two co-authored the principal Nature and Science papers of that period, from 2015 to 2020. Silver departed DeepMind in January 2026 to found Ineffable Intelligence. Antonoglou's recurring DeepMind co-authors across the AlphaGo to MuZero arc include Julian Schrittwieser (MuZero first author), Thomas Hubert, Karen Simonyan, Arthur Guez, and Laurent Sifre. From the earlier DQN era, his collaborators include Volodymyr Mnih (DQN first author) and Koray Kavukcuoglu.
His closest current professional relationship is with Misha Laskin, the Reflection AI chief executive officer with whom he ran the Gemini RLHF program at DeepMind from 2022 to 2024. The senior DeepMind leadership he overlapped with in the Gemini period includes Demis Hassabis, Shane Legg, and Lila Ibrahim. Among Insurgent-lab co-founder peers, his Reflection AI position runs in parallel with Tim Rocktäschel at Recursive Superintelligence and Silver at Ineffable Intelligence, the two other 2025 to 2026 reinforcement-learning founders to depart Google DeepMind for independent labs.
Position in the field
As of May 2026, Antonoglou stands out among Insurgent-lab chief technology officers on three counts: a DeepMind tenure of more than eleven years that included co-author credit on every major AlphaGo and MuZero paper, the UCL doctoral thesis filed in 2023, and Reflection AI's rapid rise to an $8 billion valuation. The AlphaGo to MuZero arc places him in the small group of senior researchers with co-author credit on all four of its foundational Nature and Science publications (AlphaGo, AlphaGo Zero, AlphaZero, and MuZero), which shaped deep reinforcement learning between 2016 and 2020.
Industry coverage has consistently characterized him as the AlphaGo-credentialed research counterpart to Laskin's chief-executive role at Reflection AI, with Antonoglou running the technical and research direction and Laskin running the public-facing strategic and fundraising functions. The October 2025 NVIDIA-led round placed Reflection AI among the highest-valuation Insurgent labs in the 2024 to 2025 cohort, behind only the largest labs by capital base (Safe Superintelligence, Thinking Machines Lab, and Ineffable Intelligence). The DeepMind-to-founder trajectory in 2025 to 2026 includes Silver at Ineffable, Rocktäschel at Recursive Superintelligence, and Antonoglou and Laskin at Reflection AI.
Outlook
Open questions over the next 6 to 18 months:
- Open-weights frontier-model release. The 2026 commitment to release a US-domiciled open-weights frontier model is the central public milestone for Reflection AI. Release timing, capability profile, parameter count, and licensing terms will shape the durability of the "America's open frontier AI lab" framing and Antonoglou's technical leadership role.
- Asimov adoption and revenue. Customer base, deal sizes, and competitive position against OpenAI Codex, Anthropic Claude Code, and Cursor will show whether the autonomous-agent framing differentiates Asimov within the broader coding-agent market.
- Research-direction validation. Whether the AlphaZero and MuZero search-and-planning thesis from his DeepMind work generalizes to open-ended reasoning at frontier scale.
- Senior research talent recruitment. Continued movement of reinforcement-learning and post-training specialists from DeepMind, OpenAI, and Anthropic into Reflection AI's London, San Francisco, and New York offices, which had approximately 60 staff at the October 2025 round.
- Follow-on financing. Reflection AI was reported in March 2026 to be seeking new investors at a valuation above $20 billion, which, if confirmed, would be more than double the $8 billion valuation of the October 2025 round.
Sources
- Ioannis Alexandros Antonoglou. LinkedIn profile with current role and career history.
- Ioannis Antonoglou | Google Scholar. Google Scholar profile listing the AlphaGo, AlphaGo Zero, AlphaZero, MuZero, DQN, and Gemini publications.
- Ioannis Antonoglou | Sequoia Capital. Sequoia Capital founder profile page.
- Reflection AI: The Race to Unlock Superintelligence. Sequoia spotlight covering the Antonoglou and Laskin DeepMind backgrounds and the Reflection AI founding thesis.
- Ioannis Antonoglou | Endeavor Greece. Endeavor Greece entrepreneur profile with biographical and education detail.
- Mastering the game of Go with deep neural networks and tree search. The January 2016 Nature paper introducing AlphaGo, Antonoglou as eighth author.
- Mastering the game of Go without human knowledge. The October 2017 Nature paper on AlphaGo Zero, Antonoglou as fourth author.
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. The December 2017 AlphaZero preprint, Antonoglou as fourth author.
- Mastering Atari, Go, chess and shogi by planning with a learned model. The December 2020 Nature paper on MuZero, Antonoglou as second author behind Julian Schrittwieser.
- Human-level control through deep reinforcement learning. The February 2015 Nature paper on the deep Q-network architecture.
- Learning to Search in Reinforcement Learning. UCL PhD thesis filed in 2023, supervised by David Silver.
- From AlphaGo to AGI ft ReflectionAI Founder Ioannis Antonoglou. January 2025 Sequoia Capital Training Data podcast covering the DeepMind years and the Reflection AI founding.
- ReflectionAI founder Ioannis Antonoglou: From AlphaGo to AGI. Sequoia Capital podcast page with episode notes and full transcript.
- Open Models, Open Futures with Ioannis Antonoglou. Sierra Ventures Ascend interview confirming the co-founder, president, and CTO title.
- Photo: Ioannis Antonoglou | Sequoia Capital, Sequoia Capital founder portrait.