AI4Bharat
AI4Bharat is an Indian academic artificial intelligence research initiative based at the Indian Institute of Technology Madras (IIT Madras), with a research mandate focused on AI for Indic languages. It was founded in 2019 by Mitesh Khapra (IIT Madras professor and One Fourth Labs co-founder), Pratyush Kumar, and other IIT Madras researchers. AI4Bharat develops the IndicTrans translation models, the IndicBERT language models, the IndicVoices speech datasets, and other open research outputs covering 22 Indian languages. As of April 2026, AI4Bharat is one of the principal Indian academic AI research initiatives, cooperating with the Indian government through Bhashini (the National Language Translation Mission) and with Indian commercial customers.
At a glance
- Founded: 2019 at IIT Madras by Mitesh Khapra, Pratyush Kumar, and other IIT Madras researchers.
- Status: Academic research initiative at IIT Madras with Indian government cooperation through Bhashini.
- Funding: Indian government funding through Bhashini and academic research funding through IIT Madras, plus selected industry cooperative agreements and philanthropic funding.
- Lead: Mitesh Khapra, Co-Founder and Lead. IIT Madras professor.
- Other notable leadership: Pratyush Kumar, Co-Founder.
- Open weights: Yes. Open-research outputs released through Hugging Face and GitHub.
- Flagship outputs: IndicTrans translation models; IndicBERT language models; IndicVoices speech datasets; Indic-language NLP research; cooperation with Bhashini.
Origins
AI4Bharat was founded in 2019 at IIT Madras with a research mandate centered on AI for Indic languages, with particular emphasis on the long tail of Indian languages that existing tools cover poorly. The initiative built open research infrastructure including IndicTrans (translation), IndicBERT (language models), IndicVoices (speech datasets), and other Indic-language NLP research outputs.
The Indian government's 2022 launch of Bhashini, the National Language Translation Mission, anchored government cooperation around AI4Bharat's research output. From 2024 to 2026, the initiative has continued to release open research and to cooperate with the Indian government.
Mission and strategy
AI4Bharat's mission is to advance Indic-language AI research and to release the results openly. The strategy combines two threads: Indic-language AI research across translation, language modeling, speech, and other areas; and cooperation with Bhashini and other Indian government Indic-language AI initiatives.
Distribution channels include open-weights distribution through Hugging Face, open-source code through GitHub, published research through major academic venues, and cooperation with Bhashini.
Models and products
- IndicTrans translation models. Indic-to-Indic and Indic-to-English translation across 22 Indian languages.
- IndicBERT. Language models for Indic languages.
- IndicVoices. Speech datasets across Indian languages.
- Indic-language NLP research output.
These outputs are distributed as open weights and through cooperation with Bhashini.
Benchmarks and standing
AI4Bharat's evaluation focuses on Indic-language NLP benchmarks (translation, language modeling, speech recognition) and on adoption of its open research outputs. Industry coverage of Indic-language NLP has characterized the IndicTrans line as one of the principal Indic-language translation systems globally.
Leadership
- Mitesh Khapra, Co-Founder and Lead. IIT Madras professor.
- Pratyush Kumar, Co-Founder.
- Senior research staff across the Indic-language AI program.
Funding and backers
Funding comes from the Indian government through Bhashini, from academic research funding through IIT Madras, and from selected industry cooperative agreements.
Industry position
AI4Bharat is one of the principal Indian academic AI research initiatives, distinguished by open research outputs covering 22 Indian languages and by cooperation with the Indian government through Bhashini.
Competitive landscape
- Krutrim. Direct Indian commercial AI peer, structured as a commercial startup rather than an academic initiative.
- Sarvam AI. Indian AI peer.
- Hugging Face. Open-research distribution partner.
- Bhashini. Indian government Indic-language AI initiative cooperation partner.
- Allen Institute for AI (Ai2), EleutherAI, LAION, BigScience. Non-Indian open-research peers.
Outlook
- The cadence of Indic-language AI research output through 2026 and 2027.
- Continued cooperation with Bhashini.
- The trajectory of academic research funding through IIT Madras.
Sources
- AI4Bharat official site. Initiative reference.
- AI4Bharat on GitHub. Open-source repositories.
- AI4Bharat on Hugging Face. Open-weights model distribution.
- Bhashini. Indian government National Language Translation Mission.
- IIT Madras. Parent academic institution.