AI4Bharat

AI4Bharat is an Indian academic AI research initiative based at IIT Madras. It focuses on Indic-language AI and publishes open research outputs, including the IndicTrans translation models and the IndicBERT language models, to broaden coverage of Indian languages.

AI4Bharat is an Indian academic artificial intelligence research initiative based at the Indian Institute of Technology Madras (IIT Madras), with a research mandate focused on AI for Indic languages. It was founded in 2019 by Mitesh Khapra (IIT Madras professor and One Fourth Labs co-founder), Pratyush Kumar, and other IIT Madras researchers. AI4Bharat develops the IndicTrans translation models, the IndicBERT language models, the IndicVoices speech datasets, and other open research outputs covering 22 Indian languages. As of April 2026, it is one of the principal Indian academic AI research initiatives, and it cooperates with the Indian government through Bhashini (the National Language Translation Mission) and with Indian commercial customers.

At a glance

  • Founded: 2019 at IIT Madras by Mitesh Khapra, Pratyush Kumar, and other IIT Madras researchers.
  • Status: Academic research initiative at IIT Madras with Indian government cooperation through Bhashini.
  • Funding: Indian government funding through Bhashini and academic research funding through IIT Madras, plus selected industry cooperative agreements and philanthropic funding.
  • Lead: Mitesh Khapra, Co-Founder and Lead. IIT Madras professor.
  • Other notable leadership: Pratyush Kumar, Co-Founder.
  • Open weights: Yes. Open-research outputs released through Hugging Face and GitHub.
  • Flagship outputs: IndicTrans translation models; IndicBERT language models; IndicVoices speech datasets; Indic-language NLP research; cooperation with Bhashini.

Origins

AI4Bharat was founded in 2019 at IIT Madras with a research mandate centered on Indic-language AI applications, with particular emphasis on the problem of covering the long tail of Indian languages. The initiative built open research infrastructure including IndicTrans (translation), IndicBERT (language models), IndicVoices (speech datasets), and other Indic-language NLP research output.

The 2022 launch of Bhashini, the Indian government's National Language Translation Mission, gave AI4Bharat's research output an anchor for government cooperation. From 2024 to 2026, the initiative has continued both its open research output and its cooperation with the Indian government.

Mission and strategy

AI4Bharat's mission is to advance Indic-language AI research and to release that research openly. Its strategy combines two threads: Indic-language AI research across translation, language modeling, speech, and other areas; and cooperation with Bhashini and other Indian government Indic-language AI initiatives.

Distribution channels include open-weights distribution through Hugging Face, open-source code through GitHub, published research through major academic venues, and cooperation with Bhashini.

Models and products

  • IndicTrans translation models. Indic-to-Indic and Indic-to-English translation across 22 Indian languages.
  • IndicBERT. Language models for Indian languages.
  • IndicVoices. Speech datasets across Indian languages.
  • Indic-language NLP research output.

These outputs reach users through open-weights releases and through cooperation with Bhashini.

Benchmarks and standing

AI4Bharat evaluates its work on Indic-language NLP benchmarks (translation, language modeling, and speech recognition) and on the adoption of its open research outputs. Industry coverage of Indic-language NLP has characterized the IndicTrans line as one of the principal Indic-language translation systems globally.

Leadership

  • Mitesh Khapra, Co-Founder and Lead. IIT Madras professor.
  • Pratyush Kumar, Co-Founder.
  • Senior research staff across the Indic-language AI program.

Funding and backers

Funding comes from the Indian government through Bhashini, from academic research funding through IIT Madras, and from selected industry cooperative agreements.

Industry position

AI4Bharat occupies a distinctive position among Indian academic AI research initiatives: it is one of the principal ones, its open research output covers 22 Indian languages, and it cooperates with the Indian government through Bhashini.

Competitive landscape

Outlook

  • Continued cadence of Indic-language AI research output through 2026 and 2027.
  • Continued cooperation with Bhashini.
  • Continued academic research funding through IIT Madras.

About the author
Nextomoro

AI Research Lab Intelligence

nextomoro tracks progress for AI research labs, models, and what's next.
