Falcon
Falcon is an open-weights large language model family developed by the Technology Innovation Institute (TII) in Abu Dhabi, with releases running from May 2023 through Falcon-H1 in 2025 and Falcon-H1 Arabic in early 2026. The family spans 0.5 billion to 180 billion parameters and is distributed through Hugging Face under permissive licensing: Apache 2.0 for the early Falcon variants and the TII Falcon LLM License, which also permits commercial use, for later ones. As of April 2026, the Falcon line is the public face of UAE sovereign AI research; it anchored the early 2023 to 2024 wave of open-weights models from non-US, non-Chinese sources and continues to iterate with the hybrid Mamba-Transformer Falcon-H1 architecture introduced in 2025.
At a glance
- Lab: Technology Innovation Institute
- Released: Falcon 7B and Falcon 40B (May 2023), Falcon 180B (September 2023), Falcon 2 (May 2024), Falcon 3 (December 2024), Falcon-H1 family (May 2025), Falcon-H1 Arabic (January 2026)
- Modality: Text across all generations; text and vision for the Falcon 2 11B VLM variant
- Open weights: Yes. Apache 2.0 license for Falcon 7B and Falcon 40B; TII Falcon LLM License (a permissive license based on Apache 2.0 with a responsible-use clause) for Falcon 180B, Falcon 2, Falcon 3, and Falcon-H1 variants. Weights available on Hugging Face under the tiiuae organization.
- Context window: Up to 32,000 tokens for Falcon 3; up to 262,000 tokens for Falcon-H1 variants
- Pricing: Free for self-hosted open-weights deployment; no first-party hosted commercial API
- Distribution channels: Hugging Face tiiuae organization, Falcon LLM website, AWS SageMaker JumpStart (Falcon 40B and Falcon 180B), various third-party inference providers
Origins
The Falcon program was established within TII's AI and Digital Science Research Center (AIDRC), with founding director Ebtesam Almazrouei leading the early work. The first major releases, Falcon 7B and Falcon 40B, arrived in May 2023 and were trained on the RefinedWeb dataset, a TII-developed web-derived corpus published alongside the models. RefinedWeb's technical premise was that careful filtering and deduplication of common-crawl-derived web data could match or exceed curated training corpora.
Falcon 40B was a 40-billion-parameter dense decoder-only transformer trained on 1 trillion tokens; Falcon 7B, the smaller variant, was trained on 1.5 trillion tokens. Both were released under Apache 2.0, an unusually permissive licensing posture in mid-2023 (Meta's LLaMA had launched in February 2023 under a research-only license that excluded commercial use). On release, Falcon 40B reached the top of the Hugging Face Open LLM Leaderboard ahead of LLaMA 65B, giving TII early international visibility as a sovereign-AI research center.
Falcon 180B followed in September 2023, scaling the family to 180 billion parameters trained on 3.5 trillion tokens of RefinedWeb data. The 180B release used 4,096 Nvidia A100 GPUs across approximately 7 million GPU-hours, the largest publicly disclosed open-weights training run of 2023. Licensing shifted to a TII-modified Apache 2.0 with restrictions on hosted-service redistribution. Falcon 180B outperformed Meta's Llama 2 70B and OpenAI's GPT-3.5 on MMLU at release, scoring 68.74 on the Hugging Face Leaderboard composite as the highest openly released pretrained LLM at the time.
Falcon 2, released May 2024, included an 11-billion-parameter dense base model trained on 5.5 trillion tokens of multilingual data across 11 languages, plus a vision-language variant (Falcon 2 11B VLM) pairing the base model with a CLIP ViT-L/14 encoder. Falcon 2 11B performed on par with Google's Gemma 7B on the Hugging Face Leaderboard and exceeded Meta's Llama 3 8B.
Falcon 3 launched on December 17, 2024, as a family of smaller models for edge and resource-constrained deployments. The family included 1B, 3B, 7B, and 10B dense transformer variants plus a Falcon3-Mamba-7B state-space-model variant. The 7B and 10B variants were trained on 14 trillion tokens using 1,024 H100 GPUs. Falcon 3 reached the top of the Hugging Face Open LLM Leaderboard for sub-13B-parameter models, with the Falcon3-10B-Base scoring 73.1 on MMLU and 83.0 on GSM8K.
The most architecturally distinctive release came in May 2025 with the Falcon-H1 family, which introduced a hybrid Mamba-Transformer design combining attention with Mamba-2 state-space heads in parallel within each mixer block. Falcon-H1 spanned six sizes (0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B), with a 262,000-token context window and multilingual coverage across 18 languages. Falcon Arabic, also announced in May 2025, was the first Arabic-specific Falcon release. Falcon-H1 Arabic, released January 2026, set the Open Arabic LLM Leaderboard record at three model scales.
Capabilities
The Falcon family covers a broad range of model sizes and architectural approaches, reflecting TII's strategic emphasis on accessibility and breadth rather than peak benchmark leadership at the largest scales.
The Falcon 1, 2, and 3 transformer variants use standard decoder-only architectures with rotary position embeddings, SwiGLU activations, and grouped-query attention in later releases. Falcon 3 adopted a 256-dimension head configuration optimized for FlashAttention-3. The Falcon3-Mamba-7B variant uses a state-space-model architecture rather than transformer attention, anticipating the hybrid direction that became central to Falcon-H1.
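The grouped-query attention mentioned above can be sketched as a toy NumPy illustration. This is not Falcon's actual implementation; dimensions, weight shapes, and the absence of RoPE are simplifications for clarity. The key idea is that several query heads share each key/value head, shrinking the KV cache by the ratio of query heads to KV heads:

```python
import numpy as np

def grouped_query_attention(x, Wq, Wk, Wv, n_q, n_kv):
    """Toy grouped-query attention: n_q query heads share n_kv key/value
    heads, so the KV cache is n_q / n_kv times smaller than full
    multi-head attention. Causal masking; no RoPE or output projection."""
    seq, d = x.shape
    hd = d // n_q                                    # per-head dimension
    q = (x @ Wq).reshape(seq, n_q, hd)               # (seq, n_q, hd)
    k = (x @ Wk).reshape(seq, n_kv, hd)              # (seq, n_kv, hd)
    v = (x @ Wv).reshape(seq, n_kv, hd)
    group = n_q // n_kv                              # query heads per KV head
    out = np.empty_like(q)
    mask = np.tril(np.ones((seq, seq), dtype=bool))  # causal mask
    for h in range(n_q):
        kv = h // group                              # which shared KV head
        scores = q[:, h] @ k[:, kv].T / np.sqrt(hd)  # (seq, seq)
        scores = np.where(mask, scores, -1e30)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)           # softmax over keys
        out[:, h] = w @ v[:, kv]
    return out.reshape(seq, d)

# Illustrative sizes only (not Falcon's real hyperparameters)
rng = np.random.default_rng(0)
d, n_q, n_kv, seq = 32, 4, 2, 5
hd = d // n_q
x = rng.standard_normal((seq, d))
y = grouped_query_attention(x,
                            rng.standard_normal((d, d)) * 0.1,
                            rng.standard_normal((d, n_kv * hd)) * 0.1,
                            rng.standard_normal((d, n_kv * hd)) * 0.1,
                            n_q, n_kv)
```

With `n_kv` set to 1 this reduces to multi-query attention, the configuration the early Falcon releases used to cut inference memory.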
Falcon-H1's hybrid architecture is the most consequential technical contribution from the line. The hybrid mixer block runs attention and Mamba-2 heads in parallel, with the optimal attention-to-state-space ratio determined empirically per model scale. The configuration retains the contextual reasoning strength of full-attention transformers while substantially reducing memory footprint and inference latency at long context. The 262,000-token context window of Falcon-H1 variants is among the longest in the open-weights tier, and the architecture's lower per-token compute cost makes long-context inference more economical than transformer-only alternatives at comparable parameter count.
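The parallel-mixer idea can be illustrated with a minimal sketch: both paths see the same input, the state-space path carries only a fixed-size recurrent state (constant memory per token, unlike attention's growing KV cache), and their outputs are combined. Everything here is an assumption-laden toy, not Falcon-H1's real block; the diagonal scan, the concatenate-then-project combination, and all shapes are illustrative:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Minimal diagonal linear state-space scan: h_t = A*h_{t-1} + B*x_t,
    y_t = C*h_t. Memory is O(d) regardless of sequence length."""
    seq, d = x.shape
    h = np.zeros(d)
    ys = np.empty((seq, d))
    for t in range(seq):
        h = A * h + B * x[t]
        ys[t] = C * h
    return ys

def causal_attention(x):
    """Single-head causal self-attention with x as Q, K, and V (toy)."""
    seq, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    scores = np.where(np.tril(np.ones((seq, seq), dtype=bool)), scores, -1e30)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x

def hybrid_mixer(x, A, B, C, W_out):
    """Parallel hybrid block in the spirit of Falcon-H1: attention and
    state-space paths run on the same input; outputs are concatenated and
    projected. The real attention-to-SSM head ratio is tuned per scale."""
    attn = causal_attention(x)                        # (seq, d)
    ssm = ssm_scan(x, A, B, C)                        # (seq, d)
    return np.concatenate([attn, ssm], axis=-1) @ W_out

# Illustrative sizes only
rng = np.random.default_rng(1)
seq, d = 6, 16
x = rng.standard_normal((seq, d))
A = np.full(d, 0.9)               # per-channel decay of the recurrent state
y = hybrid_mixer(x, A, np.ones(d), np.ones(d),
                 rng.standard_normal((2 * d, d)) * 0.1)
```

The sketch makes the efficiency claim concrete: at generation time the attention path must retain all past keys and values, while the state-space path retains only `h`, which is why the hybrid's memory advantage grows with context length.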
Multilingual coverage is a distinctive strength. Falcon 2 introduced 11-language coverage; Falcon 3 expanded it; Falcon-H1's multilingual tokenizer was designed for over 100 languages with first-class support for 18. Arabic-language capability is a particular focus, reflecting TII's UAE national-research mandate. Falcon-H1 Arabic models lead the Open Arabic LLM Leaderboard, outperforming substantially larger general-purpose alternatives.
The Falcon 2 11B VLM extended the line into vision-language tasks, pairing the base model with a CLIP ViT-L/14 vision encoder and a multimodal projector trained on image-caption pairs. The Falcon line does not currently include dedicated reasoning, coding, or audio specialist variants comparable to the modality-specific lines from larger labs.
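The projector's role can be sketched as follows. This is a generic illustration, not Falcon 2's published design: the two-layer GELU MLP, the language-model hidden size, and all weights are assumptions; only the 1024-dimensional CLIP ViT-L/14 feature width reflects the encoder named above.

```python
import numpy as np

def project_vision_features(patches, W1, b1, W2, b2):
    """Map vision-encoder patch embeddings into the language model's
    embedding space so image patches can be consumed as pseudo-tokens.
    A two-layer GELU MLP is a common projector choice (assumed here)."""
    h = patches @ W1 + b1
    # tanh approximation of GELU
    h = 0.5 * h * (1 + np.tanh(np.sqrt(2 / np.pi) * (h + 0.044715 * h**3)))
    return h @ W2 + b2

# CLIP ViT-L/14 emits 1024-dim patch features; the LM hidden size is illustrative
rng = np.random.default_rng(0)
vision_dim, lm_dim, n_patches = 1024, 4096, 256
patches = rng.standard_normal((n_patches, vision_dim))
tokens = project_vision_features(
    patches,
    rng.standard_normal((vision_dim, lm_dim)) * 0.02, np.zeros(lm_dim),
    rng.standard_normal((lm_dim, lm_dim)) * 0.02, np.zeros(lm_dim))
# tokens can now be prepended to the text embedding sequence
```

Training only the projector on image-caption pairs, as the VLM description above indicates, leaves both the vision encoder and the language model frozen while aligning their representation spaces.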
Benchmarks and standing
Falcon's benchmark positioning has varied substantially across the family's release history.
At the May 2023 Falcon 7B and Falcon 40B release, Falcon 40B briefly held the top position on the Hugging Face Open LLM Leaderboard ahead of LLaMA 65B. Falcon 180B at the September 2023 release reached 68.74 on the same composite leaderboard, the highest score among openly released pretrained LLMs at the time, and outperformed Meta's Llama 2 70B and OpenAI's GPT-3.5 on MMLU.
Falcon 2 11B at the May 2024 release scored 64.28 on the Hugging Face Leaderboard, comparable to Google's Gemma 7B (64.29) and ahead of Meta's Llama 3 8B.
Falcon 3 at the December 2024 release reached the top of the Hugging Face Open LLM Leaderboard for models under 13 billion parameters. Falcon3-10B-Base scored 73.1 on MMLU, 83.0 on GSM8K, 59.7 on BBH, and 73.8 on MBPP; Falcon3-10B-Instruct scored 78.0 on IFEval and 86.3 on BFCL (function calling). Falcon 3 was the leading open-weights option in the sub-13B-parameter category through early 2025.
Falcon-H1 at the May 2025 release positioned as a competitive open-weights alternative to similarly sized transformer-only models, with the hybrid architecture's main advantage showing on long-context and inference-efficiency benchmarks rather than absolute task performance. Falcon-H1 Arabic in January 2026 set the Open Arabic LLM Leaderboard record at three model scales.
The Falcon line has not pursued top-tier positions on the Artificial Analysis Intelligence Index, GPQA Diamond, SWE-bench Verified, or LMArena leaderboards in the way that DeepSeek, Qwen, or Llama have. Strategic emphasis has been on accessibility, multilingual coverage, and architectural diversification rather than peak leadership at the largest scales. Benchmark positions are point-in-time and the open-weights tier rotates rapidly.
Access and pricing
All Falcon weights are distributed through the tiiuae organization on Hugging Face, which hosts model cards, weights, and configuration files for every Falcon variant from Falcon 7B through Falcon-H1 Arabic. The Falcon LLM website provides documentation, technical reports, and community resources.
Licensing varies by release. Falcon 7B and Falcon 40B (May 2023) use Apache 2.0, the most permissive standard open-weights license. Falcon 180B (September 2023) and subsequent releases including Falcon 2, Falcon 3, and Falcon-H1 use the TII Falcon LLM License, a permissive license based on Apache 2.0 with an acceptable-use policy that prohibits high-risk applications. The TII license permits commercial use, modification, and redistribution under standard open-source terms with the responsible-use addendum.
There is no first-party hosted commercial API for Falcon models. Self-hosted deployment is the primary access path, with the smaller Falcon 3 1B, 3B, and 7B variants tractable on consumer GPUs and the Falcon3-10B and larger Falcon-H1 sizes requiring professional or multi-GPU configurations. AWS SageMaker JumpStart provides managed deployment for Falcon 40B and Falcon 180B with AWS-hosted infrastructure.
Third-party inference providers including Together AI, Replicate, and various Hugging Face inference endpoints offer hosted Falcon access at per-token rates that vary by provider.
Comparison
Direct competitors to the Falcon family across the 2023 to 2026 release period:
- Llama 4 (Meta AI). Meta's open-weights line is the dominant Western competitor at every Falcon release. Llama has consistently led the largest open-weights releases on US-centric benchmarks. Falcon's distinguishing position has been Apache 2.0 licensing on the early variants (Llama carried more restrictive licenses through Llama 3) and stronger multilingual and Arabic coverage.
- Mistral AI Mixtral and Mistral Large 2. The European open-weights peer. Mistral 7B and Mixtral 8x7B competed directly with Falcon variants through 2023 to 2024; Mistral's Apache 2.0 licensing and English-language benchmark strength positioned it as the primary alternative for Western enterprise deployments. Falcon's advantages have been broader size coverage and Arabic depth.
- DeepSeek V3 (DeepSeek) and Qwen 3 (Alibaba Qwen). The major Chinese open-weights frontier releases of late 2024 and 2025. Both substantially outperform any current Falcon variant on the largest-scale benchmark leaderboards. Falcon's positioning relative to these rests on architectural diversity (Falcon-H1's hybrid design), regional and language coverage, and licensing suited to Western enterprise deployments where a Chinese-origin supply chain raises regulatory questions.
- Phi-4 (Microsoft AI). The closest peer for Falcon 3's small-model emphasis. Phi-4 covers similar parameter ranges (14B and below) with strong reasoning-tuned benchmarks; Falcon 3 emphasizes multilingual coverage and architectural variety where Phi-4 is transformer-only and English-focused.
- Allen Institute for AI OLMo. The fully open-source-and-data peer. OLMo releases include training data, training code, and weights under unconstrained terms; Falcon releases weights and the RefinedWeb dataset but not full training-code reproducibility. The two represent different points on the open-weights-versus-open-source spectrum.
The Falcon family's distinctive position: the broadest UAE-origin sovereign-AI research output, Apache 2.0 licensing on early variants, the hybrid Mamba-Transformer Falcon-H1 architecture, and Arabic-language depth that no major Western or Chinese open-weights line matches.
Outlook
Open questions for the Falcon family over the next 6 to 18 months:
- Falcon-H1 family extension. Whether TII releases an H1-class model above 34B, or extends the hybrid architecture into multimodal variants, will determine whether Falcon-H1 graduates from a research-architecture demonstration to a frontier-tier line.
- Frontier-scale ambition. Falcon 180B in 2023 was the family's largest release; subsequent generations have not exceeded that scale. Whether TII pursues a Falcon 4 or Falcon-H2 at substantially larger scale is an unresolved strategic question.
- Regional Arabic-AI leadership. Falcon-H1 Arabic's leaderboard positioning establishes TII as the leading open-weights provider for Arabic deployments. Sustaining that lead against potential entrants from G42, MBZUAI, or other regional labs will require continued release cadence.
- Adoption beyond research. Falcon's commercial-deployment footprint has been narrower than peer Chinese open-weights lines despite comparable licensing. Whether enterprise adoption broadens, particularly in Middle East and South Asia markets, will affect the line's strategic standing.
- Hybrid-architecture validation. The Falcon-H1 hybrid Mamba-Transformer design is one of the most prominent applications of state-space-model components at scale. Whether peer labs adopt similar designs will inform whether hybrid architectures become standard in the open-weights tier.
Sources
- Falcon LLM website. Official Falcon family hub with technical reports, model documentation, and release announcements.
- Hugging Face: tiiuae organization. All Falcon model weights, model cards, and configuration files.
- TII: World's Most Powerful Open LLM Falcon 180B announcement. Falcon 180B release announcement, September 2023.
- Hugging Face Blog: The Falcon 3 Family of Open Models. Technical overview of the Falcon 3 release with model sizes, training tokens, and benchmark scores.
- Hugging Face Blog: Falcon 2 11B and 11B VLM. Falcon 2 release announcement with multilingual and vision-language details.
- Falcon LLM Blog: Falcon-H1 Hybrid-Head Language Models. Falcon-H1 architecture and family overview.
- The RefinedWeb Dataset for Falcon LLM (arXiv). Technical paper on the RefinedWeb training corpus that underpins the Falcon line.
- The Falcon Series of Open Language Models (arXiv). Falcon 1 family technical paper covering Falcon 7B, 40B, and 180B.
- TII: Falcon-H1 Arabic launch. Falcon-H1 Arabic release announcement, January 2026.