Hugging Face, Inc. is an American company founded in 2016 by French entrepreneurs Clément Delangue, Julien Chaumond, and Thomas Wolf. The company initially developed a chatbot app targeted at teenagers but later pivoted to become a platform for machine learning after open-sourcing the model behind the chatbot.
The company is best known for its Transformers library, a Python package providing open-source implementations of transformer models for text, image, and audio tasks. The library is compatible with the PyTorch, TensorFlow, and JAX deep learning frameworks and includes implementations of notable models such as BERT and GPT-2. It has undergone several name changes, from "pytorch-pretrained-bert" to "pytorch-transformers" and finally "transformers."
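A minimal sketch of how the library is typically used, via its high-level `pipeline` API (this assumes `transformers` and a backend such as PyTorch are installed; the first call downloads a default pretrained model from the Hub):

```python
# Sketch: sentiment analysis with the Transformers `pipeline` API.
# Assumes `pip install transformers` plus a backend (e.g. PyTorch)
# and network access for the initial model download.
from transformers import pipeline

# Build a sentiment-analysis pipeline backed by a default pretrained model.
classifier = pipeline("sentiment-analysis")

# Returns a list of dicts with a predicted label and a confidence score.
result = classifier("Hugging Face's Transformers library is easy to use.")
print(result)
```

The same `pipeline` entry point covers other tasks (e.g. `"text-generation"`, `"translation"`), which is what makes the library usable without touching model internals.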
The Hugging Face platform, also known as the Hugging Face Hub, is a centralized web service for hosting Git-based repositories of code, models, and datasets, along with web applications for small-scale demonstrations of machine learning applications. The Hub operates much like GitHub, with features such as discussions and pull requests. The platform is used by more than 5,000 organizations, including the Allen Institute for AI, Meta AI, Amazon Web Services, and Microsoft.
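The Hub can also be queried programmatically; a small sketch using the `huggingface_hub` client library (an assumption here: the package is installed and network access is available; listing public models requires no authentication token):

```python
# Sketch: browsing the Hugging Face Hub with the `huggingface_hub` client.
# Assumes `pip install huggingface_hub` and network access.
from huggingface_hub import HfApi

api = HfApi()

# List a handful of public model repositories hosted on the Hub.
models = api.list_models(limit=5)
for model in models:
    print(model.id)  # repository identifier, e.g. "bert-base-uncased"
```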
In addition to the Transformers library and the Hugging Face Hub, the company's ecosystem includes libraries for other tasks, such as dataset processing ("Datasets"), model evaluation ("Evaluate"), simulation ("Simulate"), and machine learning demonstrations ("Gradio"). The company has also developed a web app called "Write with Transformers," which showcases the text generation capabilities of the Transformers library.
Hugging Face has made significant contributions to AI research. Notable examples include "Multitask Prompted Training Enables Zero-Shot Task Generalization," a BigScience paper introducing an open-source zero-shot language model with state-of-the-art performance; DistilBERT, a more compact and efficient distilled version of BERT; and Neuralcoref, an open-source coreference resolution library that users can train on their own datasets and languages.
The company has been involved in numerous collaborations and partnerships. For example, it launched the BigScience Research Workshop in collaboration with several other research groups, which culminated in the release of BLOOM, a multilingual large language model with 176 billion parameters. It also announced a partnership with Graphcore to optimize the Transformers library for the Graphcore IPU, and another with Amazon Web Services (AWS) to make Hugging Face's products available to AWS customers.
On the financial front, Hugging Face has raised several rounds of funding. In March 2021, it raised $40 million in a Series B round, and by May 2022 it had reached a $2 billion valuation in a Series C round led by Coatue and Sequoia.