Jimmy Ba

Jimmy Lei Ba is a Canadian machine-learning researcher and associate professor of computer science at the University of Toronto. He is a co-author of the Adam optimizer (Kingma and Ba, ICLR 2015) and first author of Layer Normalization (Ba, Kiros and Hinton, 2016), and he was one of the eleven publicly named founding team members of xAI from March 2023 until his departure on February 10, 2026. As of May 2026, he retains his University of Toronto faculty appointment, Vector Institute affiliation, and Canada CIFAR AI Chair; his post-xAI plans are undisclosed.

At a glance

  • Education: BASc (2011), MASc (2014), and PhD (2018) in Electrical and Computer Engineering at the University of Toronto. PhD supervised by Geoffrey Hinton; MASc supervised by Brendan Frey.
  • Current role: Associate Professor of computer science at the University of Toronto since July 2024 (Assistant Professor 2018 to 2024). Vector Institute faculty affiliate and a Canada CIFAR AI Chair since 2018.
  • Previous role: Founding team member of xAI from March 2023 to February 10, 2026. Press coverage described his portfolio as research, safety, and enterprise efforts, with credited contributions to the Grok 4 model line.
  • Key contributions: Adam optimizer (Kingma and Ba, ICLR 2015); Layer Normalization (Ba, Kiros, Hinton, 2016); Lookahead optimizer (Zhang, Lucas, Ba, Hinton, NeurIPS 2019); attention-mechanism work including Show, Attend and Tell (Xu, Ba et al., ICML 2015).
  • Awards: Sloan Research Fellowship in computer science (2023); Facebook Graduate Student Fellowship (2016 to 2018).
  • X / Twitter: @jimmybajimmyba
  • Personal site: jimmylba.github.io
  • Google Scholar: Jimmy Ba (h-index 69, more than 317,000 citations as of May 2026)

Origins

Public biographical material on Ba is comparatively thin. He has no Wikipedia entry as of May 2026, and the available record consists of his personal site at jimmylba.github.io, his University of Toronto faculty page, his Wikidata entry, the Adam and Layer Normalization papers, and the February 2026 press coverage of his departure from xAI. His full name appears as "Jimmy Lei Ba" on the Layer Normalization paper. His entire higher-education record runs through the University of Toronto from 2007 onward; his personal background before university is not publicly documented.

Career

Ba completed a BASc (2011), MASc (2014), and PhD (2018) in Electrical and Computer Engineering at the University of Toronto. The MASc was supervised by Brendan Frey and produced the 2013 NeurIPS paper "Adaptive Dropout for Training Deep Neural Networks". The PhD was supervised by Geoffrey Hinton, with the thesis Learning to Attend with Neural Networks. The doctoral period included a 2013 research internship at Microsoft Research with Rich Caruana, producing "Do Deep Nets Really Need to be Deep?" at NeurIPS 2014, and a 2014 internship at Google DeepMind with Volodymyr Mnih and Koray Kavukcuoglu, producing "Multiple Object Recognition with Visual Attention" at ICLR 2015. He held a Massey College Junior Fellowship and a Facebook Graduate Student Fellowship.

The Adam optimizer paper, "Adam: A Method for Stochastic Optimization" by Diederik Kingma and Ba, was posted to arXiv in December 2014 and presented at ICLR 2015. Adam combines moving-average gradient estimates with adaptive per-parameter learning rates derived from second-moment estimates, in a form that is straightforward to implement and computationally efficient. As of May 2026 the paper has accumulated approximately 250,000 citations on Google Scholar, and Adam and its descendants are the dominant optimizers in contemporary frontier-model training stacks.
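
The update itself is compact. A minimal NumPy sketch of one Adam step, assuming the paper's default hyperparameters; the function and variable names here are illustrative rather than taken from the paper's pseudocode:

    import numpy as np

    def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        """One Adam update; m and v are running first- and second-moment estimates."""
        m = beta1 * m + (1 - beta1) * grad             # moving average of gradients
        v = beta2 * v + (1 - beta2) * grad ** 2        # moving average of squared gradients
        m_hat = m / (1 - beta1 ** t)                   # bias correction for early steps (t starts at 1)
        v_hat = v / (1 - beta2 ** t)
        param = param - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
        return param, m, v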

The Layer Normalization paper by Ba, Jamie Ryan Kiros, and Geoffrey Hinton was posted to arXiv in July 2016 and presented at the NeurIPS 2016 Deep Learning Symposium. The technique computes normalization statistics across the hidden units of a single training case rather than across a batch, removing the dependence on batch size and applying cleanly to recurrent architectures. It later became a default component of the transformer architecture and remains in use across the contemporary large-language-model stack.
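
A minimal NumPy sketch of the computation, assuming a (batch, features) activation matrix with learned per-feature gain and bias vectors; the names are illustrative:

    import numpy as np

    def layer_norm(x, gain, bias, eps=1e-5):
        """Normalize each row of x (one training case) over its own features."""
        mu = x.mean(axis=-1, keepdims=True)       # per-example mean over hidden units
        var = x.var(axis=-1, keepdims=True)       # per-example variance over hidden units
        x_hat = (x - mu) / np.sqrt(var + eps)     # statistics do not depend on batch size
        return gain * x_hat + bias                # learned elementwise scale and shift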

Ba joined the University of Toronto Department of Computer Science as an Assistant Professor in 2018, with a Vector Institute affiliation, and was named a Canada CIFAR AI Chair in the inaugural cohort that December. The Sloan Research Fellowship in computer science followed in February 2023, and he was promoted to Associate Professor with tenure effective July 1, 2024.

In March 2023, Ba joined xAI as one of the eleven publicly named founding team members. The team was assembled from senior researchers at Google DeepMind, OpenAI, Google Brain, Microsoft Research, and the University of Toronto, and the launch was publicly announced on July 12, 2023. He retained his University of Toronto appointment in parallel with the xAI work.

On February 10, 2026, Ba posted on X that it was his last day at xAI. The post described xAI's mission as a project to "push humanity up the Kardashev tech tree", expressed gratitude to Elon Musk and the founding team, and indicated he would stay close to the company as a friend. Bloomberg, CNBC, and Silicon Republic covered the announcement, characterizing the departure as the sixth among the twelve original co-founders (Elon Musk plus the eleven named founding team members) and the second within twenty-four hours, following Yuhuai (Tony) Wu's announcement earlier the same day. Press coverage described Ba's portfolio at xAI as research, safety, and enterprise efforts, with credited contributions to the Grok 4 model line. The departures came shortly after SpaceX's acquisition of xAI on February 2, 2026, an all-stock transaction valuing the combined entity at $1.25 trillion. As of May 2026, Ba's post-xAI plans have not been publicly disclosed.

Affiliations

  • University of Toronto: BASc, MASc, and PhD in Electrical and Computer Engineering, 2007 to 2018.
  • Microsoft Research: Research Intern, 2013, in Redmond, Washington.
  • Google DeepMind: Research Intern, 2014, in London, England.
  • University of Toronto, Department of Computer Science: Assistant Professor, 2018 to July 2024; Associate Professor with tenure, July 2024 to present.
  • Vector Institute: Faculty Affiliate, 2018 to present.
  • Canada CIFAR AI Chair: Inaugural cohort named December 2018; renewed thereafter.
  • xAI: Founding team member, March 2023 to February 10, 2026.

Notable contributions

Ba's published record concentrates in optimization, normalization, attention mechanisms, and reinforcement learning. His Google Scholar profile lists Adam as the most-cited single paper, followed by Layer Normalization, "Show, Attend and Tell", and the Lookahead optimizer.

  • "Adam: A Method for Stochastic Optimization" (Kingma and Ba, ICLR 2015). The most-cited deep-learning paper of the past decade, with approximately 250,000 citations on Google Scholar as of May 2026. Adam is the dominant first-order optimizer in contemporary frontier-model training stacks.
  • "Layer Normalization" (Ba, Kiros, Hinton, 2016). First-author paper introducing the normalization technique that became a default component of the transformer architecture. Approximately 18,700 citations.
  • "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention" (Xu, Ba, Kiros, Cho, Courville, Salakhutdinov, Zemel, Bengio, ICML 2015). Co-author paper introducing soft and hard attention mechanisms for image captioning.
  • "Multiple Object Recognition with Visual Attention" (Ba, Mnih, Kavukcuoglu, ICLR 2015). First-author paper on glimpse-based attention, written during the 2014 DeepMind internship.
  • "Do Deep Nets Really Need to be Deep?" (Ba and Caruana, NeurIPS 2014). First-author paper on knowledge distillation, written during the 2013 Microsoft Research internship.
  • "Lookahead Optimizer: k Steps Forward, 1 Step Back" (Zhang, Lucas, Ba, Hinton, NeurIPS 2019). An outer-loop wrapper that improves the stability and generalization of inner optimizers including Adam.
  • "Large Language Models Are Human-Level Prompt Engineers" (Zhou et al., ICLR 2023). Co-author paper on automatic prompt generation using language models as their own optimizers.
  • xAI founding-team research contributions (March 2023 to February 2026). Press coverage credited Ba with research contributions to the Grok 4 model line. Specific authored xAI artifacts are not publicly disclosed.
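
The Lookahead wrapper referenced above is simple enough to sketch in full. A minimal NumPy version, assuming a generic inner_step update function and the paper's suggested defaults of k = 5 and alpha = 0.5; this standalone function is illustrative, not the authors' implementation:

    import numpy as np

    def lookahead(slow, inner_step, outer_iters, k=5, alpha=0.5):
        """slow: ndarray of weights; inner_step maps weights to updated weights."""
        for _ in range(outer_iters):
            fast = slow.copy()
            for _ in range(k):                    # k steps forward with the inner optimizer
                fast = inner_step(fast)
            slow = slow + alpha * (fast - slow)   # 1 step back toward the slow weights
        return slow

    # Illustrative usage: plain SGD on f(w) = ||w||^2 as the inner optimizer.
    w = lookahead(np.ones(3), lambda w: w - 0.1 * (2 * w), outer_iters=20)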

Investments and boards

Ba has no public investment or board activity on record in AI, semiconductors, datacenters, software, or energy as of May 2026.

Network

Ba's longest-running professional relationships fall into three cohorts. The first is the University of Toronto machine-learning group and his thesis advisors, Geoffrey Hinton and Brendan Frey, with Ruslan Salakhutdinov as a frequent co-author. The second is the small set of long-standing co-authors on his most-cited work, including Diederik Kingma on Adam, Jamie Ryan Kiros on Layer Normalization and "Show, Attend and Tell", Yoshua Bengio and Kelvin Xu on "Show, Attend and Tell", and Roger Grosse and James Martens on the Kronecker-factored optimization line. The third is the xAI founding team. Beyond Elon Musk, the founding cohort included Igor Babuschkin (departed August 2025 to launch Babuschkin Ventures), Greg Yang (informal advisor since January 2026), Christian Szegedy (departed February 2025 to join Morph Labs and later found Math Inc), Yuhuai (Tony) Wu (departed February 10, 2026), Manuel Kroiss, Toby Pohlen, Ross Nordeen, Kyle Kosic, Guodong Zhang, and Zihang Dai.

Position in the field

As of May 2026, Ba occupies a structurally distinctive position among machine-learning researchers. The Adam optimizer is the most widely used first-order optimization method for training deep neural networks, and Layer Normalization is a default component of the transformer architecture. The Adam paper alone accounts for approximately 250,000 of his more than 317,000 aggregate Google Scholar citations, an unusually high count for the associate-professor career stage. Press coverage of his February 2026 departure from xAI consistently led with the Adam attribution.

The 2023 Sloan Research Fellowship and the 2024 promotion to Associate Professor with tenure both fell within the xAI founding-team period. His public-commentary cadence is moderate. He posts intermittently to @jimmybajimmyba, with limited solo English-language video presence beyond the CIFAR Brains Behind AI profile in 2019, the DLRLSS 2019 Optimization in Deep Learning lecture, and the October 2020 CAIDA seminar at the University of British Columbia.

Outlook

Open questions over the next 6 to 18 months:

  • Post-xAI role. Whether Ba returns to a full-time University of Toronto research focus, joins another industry lab, or starts a new venture. The departure post framed 2026 as a recalibration year without naming a destination.
  • xAI advisor relationship. Whether the post-departure relationship remains active, including any continued advisory work, given the language about staying close to the team.
  • University of Toronto cadence. Whether a full-time academic posture produces new optimizer or scaling-method publications at the cadence of the pre-2023 period.
  • Optimizer research direction. Whether the next generation of optimizer work continues the Adam and Lookahead trajectory, including engagement with the Muon and second-order-method literature.
  • Public-commentary cadence. Whether Ba's posting frequency and seminar-and-podcast schedule pick up after the xAI departure.

About the author

nextomoro tracks progress for AI research labs, models, and what's next.