Models
| Model | Type | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Price Tier |
|---|---|---|---|---|---|
| Gemini 2.0 Flash | Text | $0.10 | $0.40 | 1M | Mid |
| Gemini 2.0 Flash 001 | Text | $0.10 | $0.40 | 1M | Mid |
| Gemini 2.5 Pro Preview | Text | $1.25 | $10.00 | 1M | Expensive |
| Gemini 2.0 Flash Lite | Text | $0.07 | $0.30 | 1M | Cheapest |
| Gemini 2.0 Flash Lite | Text | $0.07 | $0.30 | 1M | Cheapest |
| Gemma 3 27B | Text | $0.10 | $0.20 | 131.1K | Mid |
| Gemma 3 4B | Text | $0.02 | $0.04 | 131.1K | Cheapest |
| Gemma 3 12B | Text | $0.05 | $0.10 | 131.1K | Cheapest |
| Gemini 1.5 Pro Latest | Text | $1.25 | $5.00 | 2M | Expensive |
| Gemini 1.5 Flash | Text | $0.07 | $0.30 | 1M | Cheapest |
| Gemini 1.5 Flash Latest | Text | $0.07 | $0.30 | 1M | Cheapest |
| Gemini 1.5 Flash 001 | Text | $0.07 | $0.30 | 1M | Cheapest |
| Gemini 1.5 Flash 002 | Text | $0.07 | $0.30 | 1M | Mid |
| Gemini 1.5 Flash 8B | Text | $0.04 | $0.15 | 1M | Cheapest |
| Gemini 1.5 Flash 8B 001 | Text | $0.04 | $0.15 | 1M | Cheapest |
| Gemini 1.5 Flash 8B Latest | Text | $0.04 | $0.15 | 1M | Cheapest |
| Gemini 1.5 Pro | Text | $1.25 | $5.00 | 2M | Expensive |
| Gemini 1.5 Pro 001 | Text | $1.25 | $5.00 | 2M | Expensive |
| Gemini 1.5 Pro 002 | Text | $1.25 | $5.00 | 2M | Expensive |
| Gemini 2.5 Flash Preview 4 17 Thinking | Text | $0.15 | $3.50 | 1M | Mid |
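Because prices are quoted per million tokens, the cost of a single request can be estimated by scaling each price by the token count. The following is a minimal sketch, not an official billing formula: the `PRICES` dictionary and the `estimate_cost` function are hypothetical names introduced for illustration, the prices are copied from this listing for three of the models, and actual billing may include factors not shown here.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from this listing.
# The dictionary itself is a hypothetical helper, not part of any API.
PRICES = {
    "gemini-2.0-flash": (Decimal("0.10"), Decimal("0.40")),
    "gemini-2.5-pro-preview": (Decimal("1.25"), Decimal("10.00")),
    "gemma-3-4b-it": (Decimal("0.02"), Decimal("0.04")),
}

MILLION = Decimal(1_000_000)


def estimate_cost(model_id, input_tokens, output_tokens):
    """Estimate the USD cost of one request from per-million-token prices."""
    input_price, output_price = PRICES[model_id]
    return (input_price * input_tokens + output_price * output_tokens) / MILLION


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens on each model.
    for model_id in PRICES:
        print(model_id, estimate_cost(model_id, 10_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many requests.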
Gemini 2.0 Flash
(gemini-2.0-flash)
Mid
Gemini 2.0 Flash, by Google
Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro. It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. These advancements come together to deliver more seamless and robust agentic experiences.
$0.10 Input (Per Million)
$0.40 Output (Per Million)
1M Context Window
Gemini 2.0 Flash 001
(gemini-2.0-flash-001)
Mid
Gemini 2.0 Flash 001, by Google
Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro. It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. These advancements come together to deliver more seamless and robust agentic experiences.
$0.10 Input (Per Million)
$0.40 Output (Per Million)
1M Context Window
Gemini 2.5 Pro Preview
(gemini-2.5-pro-preview)
Expensive
Gemini 2.5 Pro Preview, by Google
Gemini 2.5 Pro is Google's state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs 'thinking' capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.
$1.25 Input (Per Million)
$10.00 Output (Per Million)
1M Context Window
Gemini 2.0 Flash Lite
(gemini-2.0-flash-lite-001)
Cheapest
Gemini 2.0 Flash Lite, by Google
Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro, all at extremely economical token prices.
$0.07 Input (Per Million)
$0.30 Output (Per Million)
1M Context Window
Gemini 2.0 Flash Lite
(gemini-2.0-flash-lite)
Cheapest
Gemini 2.0 Flash Lite, by Google
Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro, all at extremely economical token prices.
$0.07 Input (Per Million)
$0.30 Output (Per Million)
1M Context Window
Gemma 3 27B
(gemma-3-27b-it)
Mid
Gemma 3 27B, by Google
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 27B is Google's latest open source model, successor to Gemma 2.
$0.10 Input (Per Million)
$0.20 Output (Per Million)
131.1K Context Window
Gemma 3 4B
(gemma-3-4b-it)
Cheapest
Gemma 3 4B, by Google
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling.
$0.02 Input (Per Million)
$0.04 Output (Per Million)
131.1K Context Window
Gemma 3 12B
(gemma-3-12b-it)
Cheapest
Gemma 3 12B, by Google
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 12B is the second largest in the family of Gemma 3 models after Gemma 3 27B.
$0.05 Input (Per Million)
$0.10 Output (Per Million)
131.1K Context Window
Gemini 1.5 Pro Latest
(gemini-1.5-pro-latest)
Expensive
Gemini 1.5 Pro Latest, by Google
Google's latest multimodal model. It supports image and video in text or chat prompts.
Optimized for language tasks including:
- Code generation
- Text generation
- Text editing
- Problem solving
- Recommendations
- Information extraction
- Data extraction or generation
- AI agents
$1.25 Input (Per Million)
$5.00 Output (Per Million)
2M Context Window
Gemini 1.5 Flash
(gemini-1.5-flash)
Cheapest
Gemini 1.5 Flash, by Google
Gemini 1.5 Flash is a foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from image, audio and video. It's adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots.
Gemini 1.5 Flash is designed for high-volume, high-frequency tasks where cost and latency matter. On most common tasks, Flash achieves comparable quality to other Gemini Pro models at a significantly reduced cost. Flash is well-suited for applications like chat assistants and on-demand content generation where speed and scale matter.
Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms).
#multimodal
$0.07 Input (Per Million)
$0.30 Output (Per Million)
1M Context Window
Gemini 1.5 Flash Latest
(gemini-1.5-flash-latest)
Cheapest
Gemini 1.5 Flash Latest, by Google
Alias that points to the most recent production (non-experimental) release of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks.
$0.07 Input (Per Million)
$0.30 Output (Per Million)
1M Context Window
Gemini 1.5 Flash 001
(gemini-1.5-flash-001)
Cheapest
Gemini 1.5 Flash 001, by Google
Stable version of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in May of 2024.
$0.07 Input (Per Million)
$0.30 Output (Per Million)
1M Context Window
Gemini 1.5 Flash 002
(gemini-1.5-flash-002)
Mid
Gemini 1.5 Flash 002, by Google
Stable version of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in September of 2024.
$0.07 Input (Per Million)
$0.30 Output (Per Million)
1M Context Window
Gemini 1.5 Flash 8B
(gemini-1.5-flash-8b)
Cheapest
Gemini 1.5 Flash 8B, by Google
Gemini 1.5 Flash 8B is optimized for speed and efficiency, offering enhanced performance in small prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective for real-time and large-scale operations. This model focuses on cost-effective solutions while maintaining high-quality results.
$0.04 Input (Per Million)
$0.15 Output (Per Million)
1M Context Window
Gemini 1.5 Flash 8B 001
(gemini-1.5-flash-8b-001)
Cheapest
Gemini 1.5 Flash 8B 001, by Google
Gemini 1.5 Flash 8B is optimized for speed and efficiency, offering enhanced performance in small prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective for real-time and large-scale operations. This model focuses on cost-effective solutions while maintaining high-quality results.
$0.04 Input (Per Million)
$0.15 Output (Per Million)
1M Context Window
Gemini 1.5 Flash 8B Latest
(gemini-1.5-flash-8b-latest)
Cheapest
Gemini 1.5 Flash 8B Latest, by Google
Gemini 1.5 Flash 8B is optimized for speed and efficiency, offering enhanced performance in small prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective for real-time and large-scale operations. This model focuses on cost-effective solutions while maintaining high-quality results.
$0.04 Input (Per Million)
$0.15 Output (Per Million)
1M Context Window
Gemini 1.5 Pro
(gemini-1.5-pro)
Expensive
Gemini 1.5 Pro, by Google
Google's latest multimodal model. It supports image and video in text or chat prompts.
Optimized for language tasks including:
- Code generation
- Text generation
- Text editing
- Problem solving
- Recommendations
- Information extraction
- Data extraction or generation
- AI agents
$1.25 Input (Per Million)
$5.00 Output (Per Million)
2M Context Window
Gemini 1.5 Pro 001
(gemini-1.5-pro-001)
Expensive
Gemini 1.5 Pro 001, by Google
Google's latest multimodal model. It supports image and video in text or chat prompts.
Optimized for language tasks including:
- Code generation
- Text generation
- Text editing
- Problem solving
- Recommendations
- Information extraction
- Data extraction or generation
- AI agents
$1.25 Input (Per Million)
$5.00 Output (Per Million)
2M Context Window
Gemini 1.5 Pro 002
(gemini-1.5-pro-002)
Expensive
Gemini 1.5 Pro 002, by Google
Google's latest multimodal model. It supports image and video in text or chat prompts.
Optimized for language tasks including:
- Code generation
- Text generation
- Text editing
- Problem solving
- Recommendations
- Information extraction
- Data extraction or generation
- AI agents
$1.25 Input (Per Million)
$5.00 Output (Per Million)
2M Context Window
Gemini 2.5 Flash Preview 4 17 Thinking
(gemini-2.5-flash-preview-04-17-thinking)
Mid
Gemini 2.5 Flash Preview 4 17 Thinking, by Google
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling.
This thinking variant allows the model to generate thinking tokens, which incurs higher output pricing but enables more complex reasoning capabilities.
$0.15 Input (Per Million)
$3.50 Output (Per Million)
1M Context Window
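The effect of thinking tokens on cost can be illustrated with a short calculation. This is a rough sketch under a stated assumption: that thinking tokens are billed at the output rate alongside the visible answer tokens, which the description above suggests but does not spell out. The function name `thinking_request_cost` and its parameters are hypothetical, and the two prices are the ones listed for this model.

```python
from decimal import Decimal

INPUT_PRICE = Decimal("0.15")   # USD per million input tokens (from this listing)
OUTPUT_PRICE = Decimal("3.50")  # USD per million output tokens (from this listing)
MILLION = Decimal(1_000_000)


def thinking_request_cost(input_tokens, answer_tokens, thinking_tokens):
    """Estimate USD cost when thinking tokens count toward billed output.

    Assumption (not confirmed by the listing): thinking tokens are charged
    at the same per-million rate as ordinary output tokens.
    """
    billed_output = answer_tokens + thinking_tokens
    return (INPUT_PRICE * input_tokens + OUTPUT_PRICE * billed_output) / MILLION


if __name__ == "__main__":
    # Same prompt and answer length, with and without 1,500 thinking tokens.
    print(thinking_request_cost(1_000, 500, 0))
    print(thinking_request_cost(1_000, 500, 1_500))
```

Under this assumption, output dominates the bill: with a 1,000-token prompt and a 500-token answer, adding 1,500 thinking tokens raises the estimate from $0.0019 to $0.00715.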
Showing 1-20 of 48 models