Llama 4 Maverick

400B parameters, 128 experts - Meta's most capable open model

Llama 4 Maverick is Meta's flagship MoE model. With 400B total parameters routed through 128 experts and only 17B active per token, it delivers frontier-class performance that beats GPT-4o on key benchmarks while remaining fully open-weight.

Model variants

Instruction-tuned and base models

Choose between the instruction-tuned variant optimized for chat and complex tasks, or the base model for fine-tuning and research.

128-Expert MoE Architecture

400B total parameters, 17B active per token

Maverick scales to 128 experts from Scout's 16, packing 400B total parameters while keeping the same 17B active footprint per token. This gives it significantly stronger reasoning, coding, and multimodal capabilities.
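
For intuition, here is a toy sketch of top-k expert routing in PyTorch - a generic MoE layer, not Meta's implementation (Llama 4 also interleaves dense layers and uses a shared expert, which this sketch omits):

```python
# Toy top-k MoE routing sketch (illustrative only, not Meta's code).
# A learned router scores each token against every expert, keeps the top-k,
# and only those experts run - which is how hundreds of billions of parameters
# of capacity can cost only ~17B of compute per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=128, n_experts=128, top_k=1):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # one score per expert, per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                              # x: (num_tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)
        top_w, top_idx = probs.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, k] == e              # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += top_w[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(8, 64)
print(ToyMoELayer()(tokens).shape)                     # torch.Size([8, 64])
```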

The default chat model on this site. Best for tasks requiring maximum quality: complex reasoning, code generation, multimodal analysis, and research synthesis.

Instruction-tuned

Maverick Instruct

Optimized for conversational AI, complex reasoning, and code generation

Fine-tuned with RLHF for following instructions and multi-turn dialogue

Available now

Pre-trained

Maverick Base

Foundation MoE model for fine-tuning and specialized applications

Pre-trained on diverse multimodal data with 128-expert routing

Available now

Capabilities

Frontier performance from an open-weight model

Llama 4 Maverick combines 128-expert MoE efficiency with advanced reasoning, strong coding, and native multimodal understanding - all at 17B active parameters per token.

128-expert MoE

Routes each token through specialized experts from a pool of 128. 400B total parameters deliver frontier quality at 17B inference cost per token.

Advanced reasoning

Strong performance on MMLU Pro (80.5%) and GPQA Diamond (69.8%). Competitive with proprietary models on complex reasoning tasks.

Code generation

Outperforms GPT-4o on coding benchmarks. Native function calling enables agentic workflows and autonomous code execution.
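
As a rough illustration only (not an API this page documents), function calling can be exercised through any OpenAI-compatible serving layer; the base_url, API key, model id, and the get_weather tool below are all placeholders:

```python
# Hedged sketch: exercising native function calling through an
# OpenAI-compatible endpoint. Endpoint, model id, and tool are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",                       # hypothetical tool exposed to the model
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="llama-4-maverick",                        # placeholder model id
    messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)            # structured function call from the model
```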

1M token context

Process long documents, codebases, and extended conversations. Sufficient for most production use cases.
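
A quick way to sanity-check whether an entire codebase fits in that window is a character-count estimate; the ~4 characters per token ratio and the my_project path below are rough assumptions, not the model's actual tokenizer:

```python
# Back-of-envelope check against Maverick's 1M-token window.
# Uses the rough ~4 chars/token heuristic, not the real tokenizer.
from pathlib import Path

CONTEXT_LIMIT = 1_000_000
chars = sum(len(p.read_text(errors="ignore")) for p in Path("my_project").rglob("*.py"))
approx_tokens = chars // 4
print(f"~{approx_tokens:,} of ~{CONTEXT_LIMIT:,} tokens "
      f"({'fits' if approx_tokens < CONTEXT_LIMIT else 'needs chunking'})")
```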

Native multimodal

Early fusion architecture processes text and images together natively. Analyze screenshots, diagrams, and documents alongside text.

Multilingual

Strong performance across multiple languages. Built for global applications with cultural context understanding.

Key highlights

Why Maverick stands out

Maverick is among the first open-weight models to consistently beat GPT-4o across multiple benchmark categories.

Benchmark highlights

  • MMLU Pro 80.5% - competitive with frontier proprietary models
  • GPQA Diamond 69.8% - strong scientific reasoning
  • MMMU 73.4% - excellent multimodal understanding
  • Outperforms GPT-4o on coding benchmarks
  • Arena ELO competitive with top-tier models

Technical specs

  • 400B total parameters, 17B active per token
  • 128 experts in MoE architecture
  • 1M token context window
  • Native multimodal (text + image)
  • Llama 3.1 compatible license

Performance

Frontier quality from an open-weight MoE model

Llama 4 Maverick achieves 80.5% on MMLU Pro and 73.4% on MMMU, outperforming GPT-4o on multiple benchmarks while activating only 17B parameters per token.

Maverick demonstrates that open-weight models can compete with the best proprietary offerings. Its 128-expert architecture delivers consistent excellence across reasoning, coding, and multimodal tasks.

[Image: Llama 4 Maverick performance comparison chart]

  • MMLU Pro 80.5% - frontier-class knowledge and reasoning
  • GPQA Diamond 69.8% - strong scientific reasoning
  • MMMU 73.4% - excellent multimodal understanding
  • Outperforms GPT-4o on coding benchmarks
  • 17B active parameters from 400B total (128 experts)

Benchmark comparison

Maverick vs Scout and previous generation

Maverick's 128-expert architecture delivers significant improvements over Scout and Llama 3.1 across all categories.

Benchmark                              Llama 4 Maverick   Llama 4 Scout   Llama 3.1 70B   GPT-4o
                                       (128 experts)      (16 experts)    (dense)         (proprietary)
MMLU Pro (knowledge & reasoning)       80.5%              74.3%           66.4%           78.4%
GPQA Diamond (scientific knowledge)    69.8%              57.2%           46.7%           53.6%
LiveCodeBench v5 (coding)              43.4%              32.8%           28.5%           37.0%
MMMU (multimodal)                      73.4%              69.4%           -               69.1%
Context window (max tokens)            1M                 10M             128K            128K
Total parameters                       400B               109B            70B             -
Active parameters (per token)          17B                17B             70B             -

Data from Meta's official model card and independent evaluations.

128-Expert Scale

400B capacity at 17B inference cost

Maverick's 128-expert MoE architecture is a significant scale-up from Scout's 16 experts. Each token is routed to specialized experts, giving the model access to 400B parameters of knowledge while only activating 17B per forward pass.

  • 128 experts vs Scout's 16 - 8x more specialization
  • 400B total parameters vs Scout's 109B
  • Same 17B active parameter cost per token as Scout

[Image: Llama 4 Maverick 128-expert MoE architecture]
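
As back-of-envelope arithmetic only (real layer-by-layer breakdowns differ), the quoted figures imply roughly a 4% active fraction per token:

```python
# Rough arithmetic from the figures quoted above; not an exact parameter audit.
total_params, active_params = 400e9, 17e9
scout_total = 109e9
print(f"Active fraction per token: {active_params / total_params:.1%}")   # ~4.2%
print(f"Total capacity vs Scout:   {total_params / scout_total:.1f}x")    # ~3.7x
print(f"Experts vs Scout:          {128 // 16}x")                         # 8x
```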

Multimodal

Native text and image understanding

Maverick uses early fusion architecture to process text and images together natively. This means visual understanding is built into the model from the ground up, not bolted on as a separate module.

  • 73.4% on MMMU multimodal benchmark
  • Early fusion architecture for native multimodal processing
  • Analyze screenshots, diagrams, charts, and documents

[Image: Llama 4 Maverick multimodal capabilities]
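
As a rough sketch of what an image-plus-text request can look like through an OpenAI-compatible endpoint (the base_url, model id, and image URL are placeholders, not an API this page documents):

```python
# Hedged sketch: one request combining an image and a text instruction.
# Endpoint, model id, and image URL are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_KEY")

resp = client.chat.completions.create(
    model="llama-4-maverick",                        # placeholder model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize the architecture shown in this diagram."},
            {"type": "image_url", "image_url": {"url": "https://example.com/diagram.png"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```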

Download & deploy

Self-hosted deployment

Download official model weights for deployment on your infrastructure.
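
A minimal self-hosting sketch using vLLM as the inference engine - the Hugging Face weights id is assumed, and a 400B-parameter MoE needs multi-GPU tensor parallelism sized to your hardware:

```python
# Minimal self-hosting sketch with vLLM (one common open-source engine).
# The model id is assumed; adjust parallelism and context length to your GPUs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct",  # assumed HF weights id
    tensor_parallel_size=8,                                  # match your GPU count
    max_model_len=131072,                                    # trim the window to fit memory
)
outputs = llm.chat(
    [{"role": "user", "content": "Give me three uses for a 1M-token context window."}],
    SamplingParams(max_tokens=256, temperature=0.7),
)
print(outputs[0].outputs[0].text)
```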

Llama 4 Family

Explore the full Llama 4 lineup

Maverick is Meta's flagship open model. Compare it with Scout and see how it stacks up against other frontier models.

Llama 4 Scout

10M context window specialist

Compare

All Llama 4 Models

Complete family overview

View all

Llama 4 vs Kimi K2.6

Maverick vs Moonshot's 1T model

Compare

Llama 4 vs Qwen 3.6

Meta vs Alibaba's latest

Compare

Llama 4 vs DeepSeek V4

MoE architecture showdown

Compare

Llama 4 vs MiniMax M2.7

Scale vs cost efficiency

Compare

Get started

Ready to try Llama 4 Maverick?

Start chatting instantly for free. Maverick is the default model on this site - no setup required.