Edouard Grave
Edouard Grave is a French computer scientist, born June 28, 1986. He is the Chief Language Officer of Kyutai, the Paris-based nonprofit AI research lab founded in November 2023, where he leads work on large language models. He is a co-author of the fastText library, of the original LLaMA paper at Meta AI's FAIR Paris lab, and of the Moshi speech-text foundation model. As of May 2026, he is one of the senior researchers shaping the open-research direction at Kyutai, alongside ex-FAIR Paris colleagues Hervé Jégou and Patrick Pérez.
At a glance
- Education: Engineer's degree, École Polytechnique (X2006 promotion, graduated 2009); PhD in statistics, Université Pierre et Marie Curie (Paris VI, now Sorbonne Université), 2010 to 2014, advised by Francis Bach and Guillaume Obozinski at Inria.
- Current role: Chief Language Officer at Kyutai since November 2023.
- Key contributions: co-author on the fastText word-embeddings line, including Enriching Word Vectors with Subword Information (TACL 2017) and Bag of Tricks for Efficient Text Classification (EACL 2017); co-author on the LLaMA paper (February 2023); co-author on the retrieval-augmented Atlas language model (2022) and the Moshi full-duplex spoken-dialogue model (2024).
- X / Twitter: @EXGRV
- GitHub: EdouardGrave
- Google Scholar: 7UV4ET4AAAAJ
- OpenReview: Edouard_Grave1
- arXiv author page: Grave, E.
Origins
Grave was born on June 28, 1986 in France. He entered École Polytechnique in 2006 as a member of the X2006 promotion; his admission and 2009 engineer's-degree graduation are recorded in the Légifrance decrees that publish each Polytechnique class roster. His Polytechnique years immediately preceded the late-2000s and early-2010s expansion of machine-learning research in Paris that produced much of the cohort he would later work with at FAIR and at Kyutai.
After Polytechnique, Grave moved into doctoral research at the boundary of statistics and machine learning, the path that took him through Inria, three postdoctoral positions, and into industrial AI research.
Career
Grave began his PhD in 2010 at Université Pierre et Marie Curie (Paris VI), the science campus of what became Sorbonne Université, with research carried out at Inria under Francis Bach and Guillaume Obozinski. The thesis, "A Markovian approach to distributional semantics," was defended in 2014 and developed probabilistic models for learning word and phrase representations from large unlabeled corpora using hidden-Markov-tree structure over syntactic dependency parses. The thesis work also produced a 2011 NIPS paper on the Trace Lasso regularizer and a 2013 CoNLL paper on hidden Markov tree models for semantic class induction, both co-authored with his advisors.
After defending in 2014, Grave held three postdoctoral positions in succession. He spent 2014 at the University of California, Berkeley on the STATWEB Inria-Berkeley associated team, working between Berkeley and Inria's SIERRA group. In 2015 he was a postdoctoral researcher at Columbia University in New York. In 2016 he joined Facebook AI Research (FAIR) as a postdoctoral researcher in the Paris and New York offices, a period that produced Improving Neural Language Models with a Continuous Cache (ICLR 2017), the first of the language-modeling artifacts that defined the next several years of his record.
Grave converted from postdoctoral researcher to Research Scientist at FAIR in 2017 and remained at the lab through early 2023. The FAIR period produced three of his most-cited research lines. The first was the fastText word-embeddings work with Armand Joulin, Piotr Bojanowski, and Tomas Mikolov. The two flagship papers, Enriching Word Vectors with Subword Information (TACL 2017) and Bag of Tricks for Efficient Text Classification (EACL 2017), introduced subword-aware embeddings and the fastText classifier and reached over 22,000 combined Google Scholar citations by mid-2026. The associated GitHub library became one of the most widely used pieces of open-source NLP infrastructure, with multilingual word vectors for 157 languages released in Grave et al. 2018 at LREC.
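The subword scheme at the heart of the TACL 2017 paper can be sketched in a few lines. The following is an illustrative reimplementation of the n-gram extraction idea, not code from the fastText library itself; the function name and defaults here are ours:

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of `word` in the style of fastText subword
    embeddings: mark word boundaries with '<' and '>', take every
    n-gram with n_min <= n <= n_max, and keep the full boundary-marked
    word as an extra unit of its own."""
    marked = f"<{word}>"
    grams = [marked[i:i + n]
             for n in range(n_min, n_max + 1)
             for i in range(len(marked) - n + 1)]
    grams.append(marked)  # the whole word, distinct from any n-gram
    return grams

# The paper's running example: the 3-grams of "where"
print(char_ngrams("where", n_min=3, n_max=3))
# → ['<wh', 'whe', 'her', 'ere', 're>', '<where>']
```

In the full model, each n-gram has its own vector and a word's embedding is the sum of its n-gram vectors, which is what lets the model produce representations for words never seen during training.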
The second line was retrieval-augmented language modeling, with the 2022 Atlas paper co-authored with Gautier Izacard and others showing that an 11-billion-parameter retrieval-augmented model could outperform a 540-billion-parameter dense model on Natural Questions with only 64 training examples. The line built on earlier Grave-Izacard collaborations on Fusion-in-Decoder retrieval architectures.
The third line was the LLaMA paper, submitted to arXiv on February 27, 2023 under the title "LLaMA: Open and Efficient Foundation Language Models". The 14-author byline lists Hugo Touvron as first author and Guillaume Lample as the last-listed senior author, with Grave appearing alongside Timothée Lacroix, Marie-Anne Lachaux, Baptiste Rozière, and others. The paper introduced the foundation-model family in 7B, 13B, 33B, and 65B parameter sizes and triggered the wave of open-weights frontier-class language models that followed through 2023 and 2024.
Grave left Meta in 2023 in the larger FAIR Paris exodus that also produced Mistral AI. He spent a brief period at Apple in 2023 before joining Kyutai at its November 2023 launch as one of the lab's six founding researchers, alongside Patrick Pérez, Hervé Jégou, Laurent Mazaré, Neil Zeghidour, and Alexandre Défossez. He carries the Chief Language Officer title and leads the language-model side of the program, with the Moshi speech-text foundation model in 2024 and the Helium compact multilingual model in 2025 as the period's most visible artifacts.
Affiliations
- École Polytechnique: Engineering student, X2006 promotion, 2006 to 2009.
- Université Pierre et Marie Curie (Paris VI) and Inria: PhD candidate in statistics, 2010 to 2014.
- University of California, Berkeley: Postdoctoral researcher (STATWEB Inria-Berkeley associated team), 2014.
- Columbia University: Postdoctoral researcher, 2015.
- Meta AI (FAIR Paris and New York): Postdoctoral researcher 2016 to 2017; Research Scientist 2017 to 2023.
- Apple: Researcher, 2023.
- Kyutai: Chief Language Officer, November 2023 to present.
Notable contributions
Grave's published record runs from distributional-semantics work at Inria, through the fastText embeddings line and the retrieval-augmented language-modeling line at FAIR, into the LLaMA paper, and on to the Moshi and Helium releases at Kyutai. His Google Scholar profile lists approximately 86,000 citations and an h-index of 52 as of mid-2026.
- Enriching Word Vectors with Subword Information (TACL 2017). Co-authored with Bojanowski, Joulin, and Mikolov; the paper introducing the subword-aware fastText word embeddings, with over 15,500 citations.
- Bag of Tricks for Efficient Text Classification (EACL 2017). Co-authored with Joulin, Bojanowski, and Mikolov; the fastText text-classifier paper, with over 7,500 citations.
- fastText library. The open-source word-representation and text-classification library that grew out of the EACL and TACL 2017 papers and became one of the most widely used NLP toolkits of the late 2010s.
- Learning Word Vectors for 157 Languages (LREC 2018). First-authored multilingual extension of the fastText embeddings, distributing pretrained word vectors for 157 languages from Wikipedia and Common Crawl.
- LLaMA: Open and Efficient Foundation Language Models (February 2023). Co-author on the 14-author paper introducing the LLaMA model family, with over 27,000 citations and a foundational role in the wave of open-weights frontier-class language models.
- Atlas: Few-shot Learning with Retrieval Augmented Language Models (2022, JMLR 2023). Co-author with Izacard and others of the retrieval-augmented language-model paper showing that an 11B-parameter model could outperform much larger dense models on knowledge-intensive few-shot tasks.
- Improving Neural Language Models with a Continuous Cache (ICLR 2017). First-authored language-modeling paper introducing a non-parametric cache mechanism for neural language models.
- Unsupervised Cross-lingual Representation Learning at Scale (XLM-R, 2020). Co-author on the multilingual masked-language-model paper that defined cross-lingual pretraining at scale, with over 10,000 citations.
- Moshi: a speech-text foundation model for real-time dialogue (2024). Co-author on the Kyutai full-duplex speech-text foundation model paper, with Alexandre Défossez, Laurent Mazaré, Manu Orsini, Amélie Royer, Patrick Pérez, Hervé Jégou, and Neil Zeghidour.
- Helium 1 (2025). Co-author on the Kyutai compact multilingual 2-billion-parameter language model release.
Investments and boards
No public investor activity on record in AI, semiconductors, datacenters, software, or energy as of May 2026.
Network
Grave's research collaborations trace through three overlapping cohorts. The first is his Inria PhD lineage: Francis Bach and Guillaume Obozinski were his doctoral advisors, and the early statistics-and-NLP papers from 2011 to 2014 were co-authored with them.
The second is FAIR Paris. The fastText line ran with Armand Joulin, Piotr Bojanowski, and Tomas Mikolov. The retrieval-augmented language-modeling line ran with Gautier Izacard, Grave's PhD advisee at Sorbonne and FAIR from 2019 to 2023. The LLaMA-paper byline links Grave to Hugo Touvron as first author and to several authors who later followed each other across the FAIR-Paris-to-Mistral migration, including Timothée Lacroix, Guillaume Lample, Marie-Anne Lachaux, and Thibaut Lavril. Yann LeCun was Meta's chief AI scientist throughout Grave's FAIR period.
The third is the Kyutai founding cohort: Pérez (CEO), Hervé Jégou, Laurent Mazaré, Neil Zeghidour, and Alexandre Défossez. The Moshi paper byline collects all six founders. The Kyutai scientific committee includes Yann LeCun, Yejin Choi, and Bernhard Schölkopf.
Position in the field
As of May 2026, Grave occupies a role at the intersection of two research trajectories. The first is the multilingual-NLP and word-embeddings line that traces through fastText and XLM-R, which shaped how a generation of practitioners worked with non-English language data and made fastText one of the most widely deployed open-source NLP libraries of the late 2010s. The second is the retrieval-augmented and foundation-model line that traces through Atlas, the LLaMA paper, and the language-model side of Moshi and Helium at Kyutai.
His position differs from that of the ex-FAIR-Paris peers who founded Mistral AI. The Mistral co-founders moved from FAIR into a venture-backed commercial company; Grave moved into a nonprofit open-research lab funded by Iliad, CMA CGM, and Schmidt Futures. His public-facing presence is also lower-profile than that of Arthur Mensch or Guillaume Lample, with public communications channeled through arXiv submissions, the @EXGRV X account, and conference appearances. Press coverage of Kyutai has consistently identified Grave as the senior researcher carrying the language-model thread of the lab's program, alongside Zeghidour and Défossez on the speech and audio side.
Outlook
Open questions over the next 6 to 18 months:
- Helium 2 and successor language-model releases. Whether Kyutai's compact multilingual line continues at the cadence and capability bar set by Helium 1, and whether the lab releases a frontier-scale multilingual successor.
- Moshi successor and the speech-language stack. The integration of language-model and speech-model research at Kyutai, including any Moshi-2 or full-duplex multilingual successor, with Grave's language-side leadership shaping the language-model component.
- Open-weights and open-data commitment. Whether Kyutai's open-research and open-data positioning continues at the commitment level set in 2023 and 2024, particularly as the lab's compute and capability footprint scales.
- PhD pipeline and senior recruitment. Whether Grave's PhD advisees and the broader Sorbonne and Inria pipeline continue feeding researchers into Kyutai, and whether the lab can attract additional senior FAIR alumni.
- Public-facing posture. Whether the lower-profile communications style of the Kyutai senior research group shifts as the lab's public-launch artifacts (Moshi, Helium, Mimi) attract more press and policy attention.
Sources
- Edouard Grave on Wikidata. Wikidata record covering birth date, education, and identifiers.
- Edouard Grave on OpenReview. Career and education record covering the Inria PhD, postdoctoral positions, and FAIR-and-later roles, with Francis Bach listed as PhD advisor.
- Edouard Grave on the Mathematics Genealogy Project. Doctoral record listing the 2014 Université Pierre et Marie Curie thesis "A Markovian approach to distributional semantics" with advisors Bach and Obozinski.
- Edouard Grave's Google Scholar profile. Citation metrics, h-index, and chronological publication record.
- Liste des élèves de la promotion X 2006 inscrits sur la liste des ingénieurs diplômés de l'École polytechnique en 2009. The Légifrance government decree confirming Grave's 2009 engineer's-degree graduation from the X2006 Polytechnique promotion.
- Enriching Word Vectors with Subword Information. The 2017 TACL paper introducing subword-aware fastText embeddings.
- Bag of Tricks for Efficient Text Classification. The 2017 EACL paper introducing the fastText classifier.
- Learning Word Vectors for 157 Languages. The 2018 LREC paper distributing pretrained multilingual fastText word vectors.
- LLaMA: Open and Efficient Foundation Language Models. The February 2023 Meta AI paper introducing the LLaMA model family.
- Atlas: Few-shot Learning with Retrieval Augmented Language Models. The 2022 paper on retrieval-augmented language modeling at scale.
- Improving Neural Language Models with a Continuous Cache. The 2017 ICLR first-authored paper on cache-based neural language modeling.
- Unsupervised Cross-lingual Representation Learning at Scale. The 2020 XLM-R cross-lingual masked-language-model paper.
- Moshi: a speech-text foundation model for real-time dialogue. The 2024 Kyutai full-duplex spoken-dialogue model paper.
- Kyutai Team page. Lists Grave with the Chief Language Officer title and links to his external profiles.
- Kyutai launch announcement (TechCrunch). November 2023 coverage of the Kyutai launch and founding research team.
- fastText library. The open-source word-representation and text-classification library.
- Photo: Kyutai Team page, used as the source portrait under press-use convention for the team-page photographs.