Greg Yang
Greg Yang is a Chinese-American mathematician and artificial intelligence researcher known for theoretical work on the scaling of large neural networks. A founding team member of xAI since March 2023, he is the author of the Tensor Programs series of papers and the developer of the maximal update parametrization (muP) and the muTransfer hyperparameter-transfer technique referenced in the GPT-4 technical report. As of May 2026, he serves as an informal advisor at xAI, having stepped back from his operational role on January 21, 2026 following a Lyme-disease diagnosis.
At a glance
- Education: AB in Mathematics and SM in Computer Science from Harvard University (2018, accelerated AB/SM program). 2018 Hoopes Prize for senior thesis "A Homological Theory of Functions"; honorable mention for the 2018 Frank and Brennie Morgan Prize for outstanding undergraduate research in mathematics.
- Current role: Informal advisor at xAI since January 21, 2026, after stepping back from an operational founding-team role for health reasons.
- Previous roles: Founding team member of xAI from March 2023 to January 2026; researcher, later Senior Researcher, at Microsoft Research from 2018 to 2023.
- Key contributions: Tensor Programs I through VI (2019 to 2024); maximal update parametrization (muP) and the muTransfer zero-shot hyperparameter-transfer technique referenced in the GPT-4 training stack.
- X / Twitter: @TheGregYang
- GitHub: thegregyang
- Personal site: thegregyang.com
- Mastodon: @thegregyang@mathstodon.xyz
- Google Scholar: Greg Yang
Origins
Public biographical material on Yang is comparatively thin. He has no Wikipedia entry as of May 2026, and the available record consists of his personal site at thegregyang.com, his X account, the Tensor Programs paper series, his Microsoft Research talks, and the January 2026 press coverage of his step-back from xAI.
Yang was born in Hunan Province, China, and moved with his family to Guangzhou for kindergarten and to Beijing for elementary school. The family relocated to the United States during his middle-school years; he lived in Houston, Texas, before completing high school in Montgomery County, Maryland.
He entered Harvard College and concentrated in mathematics, with research interests in algebraic topology, computational complexity, and learning theory. At the end of his sophomore year he took a leave of absence to work as an electronic-dance-music producer and disc jockey under the stage name Zeta. Public reporting dates his interest in artificial intelligence to this period; on returning to Harvard he accelerated his remaining course load and completed both the AB in Mathematics and the SM in Computer Science through the combined AB/SM program in 2018.
Career
Yang's senior thesis at Harvard, "A Homological Theory of Functions", developed a framework placing Boolean function complexity, learning theory, and algebraic topology on a common footing. The thesis won the 2018 Hoopes Prize, and the work earned Yang an honorable mention for the 2018 Morgan Prize. He acknowledged Madhu Sudan, Shing-Tung Yau, and Michael Freedman, among others, in the thesis.
Yang joined Microsoft Research as a researcher in 2018, on the recommendation of Harry Shum, then Microsoft's executive vice president of artificial intelligence and research, after declining an offer from Google. He was based in Redmond, Washington, and rose to Senior Researcher. The principal output of this period is the Tensor Programs series of papers, beginning with "Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes" at NeurIPS 2019 and continuing through "Tensor Programs II: Neural Tangent Kernel for Any Architecture" (June 2020), "Tensor Programs III: Neural Matrix Laws" (September 2020), "Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks" (ICML 2021), "Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer" (NeurIPS 2021), and "Tensor Programs IVb: Adaptive Optimization in the Infinite-Width Limit" (August 2023). Tensor Programs VI extended the framework to depthwise hyperparameter transfer in residual networks under a parametrization labeled Depth-muP.
Tensor Programs V is the most-cited single paper in the series. The work introduced the maximal update parametrization (muP), which preserves optimal hyperparameters across model widths, and the muTransfer technique, which uses muP to tune hyperparameters on a small proxy model and zero-shot transfer them to a much larger target model. The accompanying Microsoft Research blog post reported that hyperparameters transferred from a 40-million-parameter proxy outperformed published numbers for the 6.7-billion-parameter version of GPT-3, with tuning compute amounting to roughly seven percent of the final pretraining run. The GPT-4 technical report cites Yang and collaborators in describing the predictable-scaling methodology used to train the model.
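The sketch below illustrates the muTransfer workflow using the open-source microsoft/mup library discussed later in this profile. The toy MLP, the widths, and the learning rate are illustrative assumptions rather than configurations from the papers; only the mup calls (MuReadout, set_base_shapes, MuAdam) follow the library's documented usage pattern.

```python
# Minimal muTransfer sketch with the microsoft/mup library (pip install mup).
# The model, widths, and learning rate here are illustrative assumptions.
import torch
import torch.nn as nn
from mup import MuReadout, set_base_shapes, MuAdam


class MLP(nn.Module):
    def __init__(self, width: int, d_in: int = 32, d_out: int = 10):
        super().__init__()
        self.hidden = nn.Linear(d_in, width)
        # MuReadout replaces the final nn.Linear so the output layer is
        # scaled according to the maximal update parametrization (muP).
        self.readout = MuReadout(width, d_out)

    def forward(self, x):
        return self.readout(torch.relu(self.hidden(x)))


# The base and delta models tell mup which dimensions are "widths";
# the target model is the large model actually being trained.
base = MLP(width=64)
delta = MLP(width=128)
target = MLP(width=4096)
set_base_shapes(target, base, delta=delta)

# Under muP, a learning rate tuned on a narrow proxy model can be reused
# on the wide target model without re-tuning (the muTransfer step).
optimizer = MuAdam(target.parameters(), lr=1e-3)

x = torch.randn(8, 32)
loss = target(x).square().mean()
loss.backward()
optimizer.step()
```

In this pattern, a hyperparameter sweep would be run on the narrow proxy model, and the winning settings carried over unchanged to the wide target, which is the proxy-to-target transfer described above.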
In March 2023, Yang left Microsoft Research and co-founded xAI with Elon Musk, as one of the eleven publicly named founding team members. The team drew senior researchers from Google DeepMind, OpenAI, Google Brain, and Microsoft Research, and the launch was publicly announced on July 12, 2023. Yang's role at xAI was described in the founding announcement and in subsequent xAI communications as that of a mathematician focused on theoretical foundations for the lab's training methods. Specific functional titles beyond "founding team member" are not publicly stated.
On January 21, 2026, Yang posted on X that he had been diagnosed with Lyme disease and would step back from his operational role at xAI into an informal advisory position. He stated that the symptoms began in early 2025 with what he had assumed was a cold or flu, and that the lingering fatigue progressed to multi-day exhaustion after exercise or certain foods. Bloomberg, Yahoo Finance, and Investing.com covered the announcement; Musk replied publicly with a recovery wish. The transition took effect immediately.
Affiliations
- Harvard University: AB in Mathematics and SM in Computer Science, combined AB/SM program, through 2018.
- Microsoft Research: Researcher and then Senior Researcher, 2018 to March 2023.
- xAI: Founding team member, March 2023 to January 2026 (operational role); informal advisor, January 2026 to present.
Notable contributions
Yang's published record concentrates on the theoretical foundations of large-scale deep learning, with the Tensor Programs series and the muP / muTransfer technique as the principal entries. His Google Scholar profile lists Tensor Programs V as his most-cited paper.
- "Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes" (NeurIPS 2019). Sole-author paper introducing the Tensor Programs formal language and the Master Theorem for the infinite-width limit of any architecture expressible in the language.
- "Tensor Programs II: Neural Tangent Kernel for Any Architecture" (June 2020). Extended the framework to neural tangent kernel computations across architectures.
- "Tensor Programs III: Neural Matrix Laws" (September 2020). Proved laws governing the spectra of neural-network weight and activation matrices in the wide limit.
- "Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks" (ICML 2021). Identified the maximal update parametrization (muP) as the unique parametrization in which feature learning persists in the infinite-width limit.
- "Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer" (NeurIPS 2021, with co-authors at Microsoft Research and OpenAI). Introduced the muTransfer technique. The associated Microsoft Research blog post presented the GPT-3-scale results, and the GPT-4 technical report cites the work in describing the predictable-scaling stack.
- "Tensor Programs IVb: Adaptive Optimization in the Infinite-Width Limit" (August 2023). Extended the framework to general adaptive optimizers including Adam, refining the optimizer treatment in earlier installments.
- "Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks" (2024). Extended muP to residual networks of arbitrary depth, introducing Depth-muP for depthwise hyperparameter transfer.
- muP open-source library (microsoft/mup on GitHub). PyTorch implementation of the maximal update parametrization, available via pip install mup.
- xAI founding-team research contributions (March 2023 to January 2026). Theoretical-foundations work on the scaling methodology applied across the Grok model family. Specific authored xAI artifacts are not publicly disclosed.
Investments and boards
No public investor activity on record in AI, semiconductors, datacenters, software, or energy as of May 2026.
Network
Yang's longest-running professional relationships fall into three cohorts. The first is the Microsoft Research cohort that worked alongside him on the Tensor Programs and muP papers, including co-authors on Tensor Programs V drawn from Microsoft and OpenAI. The second is his Harvard cohort and the academic advisors named in his thesis acknowledgements, including Madhu Sudan, Shing-Tung Yau, and Michael Freedman. The third is the xAI founding team. Beyond Elon Musk, the founding cohort included Igor Babuschkin (engineering lead through August 2025, departed to launch Babuschkin Ventures), Christian Szegedy (departed February 2025 to join Morph Labs, later founding Math Inc), Yuhuai (Tony) Wu (departed February 10, 2026), Jimmy Ba (departed February 10, 2026), Manuel Kroiss, Toby Pohlen, Ross Nordeen, Kyle Kosic, Guodong Zhang, and Zihang Dai.
Position in the field
As of May 2026, Yang occupies a structurally distinctive position among researchers working on large-scale neural-network theory. The Tensor Programs framework is the principal mathematical treatment of infinite-width neural-network behavior across architectures and optimizers, and muTransfer is one of the few theoretically motivated methods to have crossed into the production training stacks of frontier large language models. The GPT-4 technical report citation is the clearest public signal of the work's practical reach.
The xAI founding-team role is the most publicly documented part of his record outside the academic publication channel. Press coverage in 2023 and 2024 named him among the technical leadership of the founding team, and Bloomberg, Yahoo Finance, and other outlets covered the January 2026 step-back. The transition to an informal advisor role removed him from day-to-day operations but preserved the formal affiliation.
His public commentary runs through the @TheGregYang X account, the thegregyang.com personal site, and a series of academic seminars and podcast appearances. The January 2023 Cartesian Cafe episode with Timothy Nguyen is a three-hour treatment of the Tensor Programs framework, intended as a canonical long-form introduction to the work.
Outlook
Open questions over the next 6 to 18 months:
- Health and return cadence. Whether Yang's recovery from Lyme disease permits a return to operational research at xAI or elsewhere, and over what timeline.
- Tensor Programs continuation. Whether the framework continues to develop publicly through xAI publications, through return engagement at Microsoft Research, or through independent academic collaborations.
- muP adoption pattern. Whether published industry implementations of muP and muTransfer continue to spread beyond the GPT-4 reference and the open-source microsoft/mup library, and whether xAI publishes its own muP variants.
- xAI advisor relationship. Whether the informal advisor role becomes a publicly visible commentary or research-direction channel, or remains a private capacity.
- Public-commentary cadence. Whether Yang resumes the seminar and podcast cadence of his Microsoft-era public profile, given the reduced operational load.
Sources
- Greg Yang (personal site). Personal site listing his career history at Microsoft Research and xAI, the Tensor Programs paper series, and selected talks.
- Greg Yang on X. The @TheGregYang X account, the primary public channel for his commentary and the source of the January 21, 2026 Lyme-disease announcement.
- Greg Yang on GitHub. Public repositories including code accompanying the Tensor Programs papers.
- Greg Yang on Google Scholar. Publication and citation listing.
- Greg Yang on Mathstodon. Mathematics-focused Mastodon presence.
- xAI Co-Founder Yang Leaves Musk's Startup After Lyme Diagnosis. Bloomberg coverage of the January 21, 2026 step-back announcement.
- xAI cofounder is taking a step back to 'go founder mode on my health' after Lyme disease diagnosis. Yahoo Finance / AOL coverage of the January 2026 announcement and the move to an informal advisor role.
- xAI co-founder Greg Yang steps down after Lyme disease diagnosis. Investing.com coverage of the symptom timeline and the advisor transition.
- Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes. The 2019 sole-author paper introducing the Tensor Programs formalism.
- Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer. The NeurIPS 2021 paper (arXiv preprint March 2022) introducing muP and muTransfer.
- muTransfer: A technique for hyperparameter tuning of enormous neural networks. Microsoft Research blog post on the muP / muTransfer results.
- GPT-4 technical report. OpenAI report citing Yang and collaborators in the predictable-scaling methodology section.
- Announcing xAI. The July 12, 2023 founding announcement, naming Yang among the founding team members.
- 2018 Frank and Brennie Morgan Prize. Wikipedia entry on the Morgan Prize, recording Yang's 2018 honorable mention for "Homological theory of functions" at Harvard University.
- Greg Yang | Large N Limits: Random Matrices & Neural Networks | The Cartesian Cafe. The January 2023 long-form podcast episode with Timothy Nguyen on the Tensor Programs framework.