Models

	Provider	Model	Types	Input (Per Million)	Output (Per Million)	Context Window	Tier
	Google	Gemini 2.0 Flash	Text	$0.10	$0.40	1M	Mid
	Google	Gemini 2.0 Flash 001	Text	$0.10	$0.40	1M	Mid
	Google	Gemini 2.5 Pro Preview	Text	$1.25	$10.00	1M	Expensive
	Google	Gemini 2.0 Flash Lite	Text	$0.07	$0.30	1M	Cheapest
	Google	Gemini 2.0 Flash Lite	Text	$0.07	$0.30	1M	Cheapest
	Google	Gemma 3 27B	Text	$0.10	$0.20	131.1K	Mid
	Google	Gemma 3 4B	Text	$0.02	$0.04	131.1K	Cheapest
	Google	Gemma 3 12B	Text	$0.05	$0.10	131.1K	Cheapest
	Google	Gemini 1.5 Pro Latest	Text	$1.25	$5.00	2M	Expensive
	Google	Gemini 1.5 Flash	Text	$0.07	$0.30	1M	Cheapest
	Google	Gemini 1.5 Flash Latest	Text	$0.07	$0.30	1M	Cheapest
	Google	Gemini 1.5 Flash 001	Text	$0.07	$0.30	1M	Cheapest
	Google	Gemini 1.5 Flash 002	Text	$0.07	$0.30	1M	Mid
	Google	Gemini 1.5 Flash 8B	Text	$0.04	$0.15	1M	Cheapest
	Google	Gemini 1.5 Flash 8B 001	Text	$0.04	$0.15	1M	Cheapest
	Google	Gemini 1.5 Flash 8B Latest	Text	$0.04	$0.15	1M	Cheapest
	Google	Gemini 1.5 Pro	Text	$1.25	$5.00	2M	Expensive
	Google	Gemini 1.5 Pro 001	Text	$1.25	$5.00	2M	Expensive
	Google	Gemini 1.5 Pro 002	Text	$1.25	$5.00	2M	Expensive
	Google	Gemini 2.5 Flash Preview 4 17 Thinking	Text	$0.15	$3.50	1M	Mid
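
As a rough illustration of how the per-million prices in the table translate into the cost of a single request, here is a minimal sketch. It assumes the two dollar columns are the input and output prices per million tokens and that billing is linear in token count. The `PRICES` dictionary and the `estimate_cost` function are illustrative names, not part of any provider API, and only three rows are copied in.

```python
# Prices copied from three rows of the table above:
# (input $, output $) per million tokens. Assumption: linear billing.
PRICES = {
    "Gemini 2.0 Flash": (0.10, 0.40),
    "Gemini 2.5 Pro Preview": (1.25, 10.00),
    "Gemma 3 4B": (0.02, 0.04),
}


def estimate_cost(model_name, input_tokens, output_tokens):
    """Return the estimated cost in USD for one request."""
    input_price, output_price = PRICES[model_name]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # 200K input tokens and 50K output tokens on Gemini 2.0 Flash.
    cost = estimate_cost("Gemini 2.0 Flash", 200_000, 50_000)
    print(f"${cost:.4f}")  # prints $0.0400
```

The same call with "Gemini 2.5 Pro Preview" shows how strongly the output price dominates for generation-heavy workloads.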

Gemini 2.0 Flash

(gemini-2.0-flash)

Mid

Text

Gemini 2.0 Flash, by Google

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro. It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. These advancements come together to deliver more seamless and robust agentic experiences.

$0.10 Input (Per Million)

$0.40 Output (Per Million)

1M Context Window

Gemini 2.0 Flash 001

(gemini-2.0-flash-001)

Mid

Text

Gemini 2.0 Flash 001, by Google

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro. It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. These advancements come together to deliver more seamless and robust agentic experiences.

$0.10 Input (Per Million)

$0.40 Output (Per Million)

1M Context Window

Gemini 2.5 Pro Preview

(gemini-2.5-pro-preview)

Expensive

Text

Gemini 2.5 Pro Preview, by Google

Gemini 2.5 Pro is Google's state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs 'thinking' capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

$1.25 Input (Per Million)

$10.00 Output (Per Million)

1M Context Window

Gemini 2.0 Flash Lite

(gemini-2.0-flash-lite-001)

Cheapest

Text

Gemini 2.0 Flash Lite, by Google

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro, all at extremely economical token prices.

$0.07 Input (Per Million)

$0.30 Output (Per Million)

1M Context Window

Gemini 2.0 Flash Lite

(gemini-2.0-flash-lite)

Cheapest

Text

Gemini 2.0 Flash Lite, by Google

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) than Gemini 1.5 Flash, while maintaining quality on par with larger models like Gemini 1.5 Pro, all at extremely economical token prices.

$0.07 Input (Per Million)

$0.30 Output (Per Million)

1M Context Window

Gemma 3 27B

(gemma-3-27b-it)

Mid

Text

Gemma 3 27B, by Google

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 27B is Google's latest open-source model and the successor to Gemma 2.

$0.10 Input (Per Million)

$0.20 Output (Per Million)

131.1K Context Window

Gemma 3 4B

(gemma-3-4b-it)

Cheapest

Text

Gemma 3 4B, by Google

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling.

$0.02 Input (Per Million)

$0.04 Output (Per Million)

131.1K Context Window

Gemma 3 12B

(gemma-3-12b-it)

Cheapest

Text

Gemma 3 12B, by Google

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 12B is the second-largest model in the Gemma 3 family, after Gemma 3 27B.

$0.05 Input (Per Million)

$0.10 Output (Per Million)

131.1K Context Window

Gemini 1.5 Pro Latest

(gemini-1.5-pro-latest)

Expensive

Text

Gemini 1.5 Pro Latest, by Google

Google's latest multimodal model, which supports image and video in text or chat prompts. Optimized for language tasks including:
- Code generation
- Text generation
- Text editing
- Problem solving
- Recommendations
- Information extraction
- Data extraction or generation
- AI agents

$1.25 Input (Per Million)

$5.00 Output (Per Million)

2M Context Window

Gemini 1.5 Flash

(gemini-1.5-flash)

Cheapest

Text

Gemini 1.5 Flash, by Google

Gemini 1.5 Flash is a foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from image, audio and video. It's adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots. Gemini 1.5 Flash is designed for high-volume, high-frequency tasks where cost and latency matter. On most common tasks, Flash achieves comparable quality to other Gemini Pro models at a significantly reduced cost. Flash is well-suited for applications like chat assistants and on-demand content generation where speed and scale matter. Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms).

$0.07 Input (Per Million)

$0.30 Output (Per Million)

1M Context Window

Gemini 1.5 Flash Latest

(gemini-1.5-flash-latest)

Cheapest

Text

Gemini 1.5 Flash Latest, by Google

Alias that points to the most recent production (non-experimental) release of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks.

$0.07 Input (Per Million)

$0.30 Output (Per Million)

1M Context Window

Gemini 1.5 Flash 001

(gemini-1.5-flash-001)

Cheapest

Text

Gemini 1.5 Flash 001, by Google

Stable version of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in May of 2024.

$0.07 Input (Per Million)

$0.30 Output (Per Million)

1M Context Window

Gemini 1.5 Flash 002

(gemini-1.5-flash-002)

Mid

Text

Gemini 1.5 Flash 002, by Google

Stable version of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in September of 2024.

$0.07 Input (Per Million)

$0.30 Output (Per Million)

1M Context Window

Gemini 1.5 Flash 8B

(gemini-1.5-flash-8b)

Cheapest

Text

Gemini 1.5 Flash 8B, by Google

Gemini 1.5 Flash 8B is optimized for speed and efficiency, offering enhanced performance in small-prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective for real-time and large-scale operations. This model focuses on cost-effective solutions while maintaining high-quality results.

$0.04 Input (Per Million)

$0.15 Output (Per Million)

1M Context Window

Gemini 1.5 Flash 8B 001

(gemini-1.5-flash-8b-001)

Cheapest

Text

Gemini 1.5 Flash 8B 001, by Google

Gemini 1.5 Flash 8B is optimized for speed and efficiency, offering enhanced performance in small-prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective for real-time and large-scale operations. This model focuses on cost-effective solutions while maintaining high-quality results.

$0.04 Input (Per Million)

$0.15 Output (Per Million)

1M Context Window

Gemini 1.5 Flash 8B Latest

(gemini-1.5-flash-8b-latest)

Cheapest

Text

Gemini 1.5 Flash 8B Latest, by Google

Gemini 1.5 Flash 8B is optimized for speed and efficiency, offering enhanced performance in small-prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective for real-time and large-scale operations. This model focuses on cost-effective solutions while maintaining high-quality results.

$0.04 Input (Per Million)

$0.15 Output (Per Million)

1M Context Window

Gemini 1.5 Pro

(gemini-1.5-pro)

Expensive

Text

Gemini 1.5 Pro, by Google

Google's latest multimodal model, which supports image and video in text or chat prompts. Optimized for language tasks including:
- Code generation
- Text generation
- Text editing
- Problem solving
- Recommendations
- Information extraction
- Data extraction or generation
- AI agents

$1.25 Input (Per Million)

$5.00 Output (Per Million)

2M Context Window

Gemini 1.5 Pro 001

(gemini-1.5-pro-001)

Expensive

Text

Gemini 1.5 Pro 001, by Google

Google's latest multimodal model, which supports image and video in text or chat prompts. Optimized for language tasks including:
- Code generation
- Text generation
- Text editing
- Problem solving
- Recommendations
- Information extraction
- Data extraction or generation
- AI agents

$1.25 Input (Per Million)

$5.00 Output (Per Million)

2M Context Window

Gemini 1.5 Pro 002

(gemini-1.5-pro-002)

Expensive

Text

Gemini 1.5 Pro 002, by Google

Google's latest multimodal model, which supports image and video in text or chat prompts. Optimized for language tasks including:
- Code generation
- Text generation
- Text editing
- Problem solving
- Recommendations
- Information extraction
- Data extraction or generation
- AI agents

$1.25 Input (Per Million)

$5.00 Output (Per Million)

2M Context Window

Gemini 2.5 Flash Preview 4 17 Thinking

(gemini-2.5-flash-preview-04-17-thinking)

Mid

Text

Gemini 2.5 Flash Preview 4 17 Thinking, by Google

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling. This thinking variant allows the model to generate thinking tokens, which incurs higher output pricing but enables more complex reasoning capabilities.

$0.15 Input (Per Million)

$3.50 Output (Per Million)

1M Context Window
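
To make the effect of thinking tokens on cost concrete, here is a minimal sketch. It assumes thinking tokens are billed at the output rate together with the visible answer tokens. The description above implies this but does not spell out the billing rule, so treat it as an assumption. The function name and the token counts are made up for illustration.

```python
# Prices for the thinking variant, taken from the entry above.
INPUT_PRICE = 0.15   # USD per million input tokens
OUTPUT_PRICE = 3.50  # USD per million output tokens


def thinking_request_cost(input_tokens, answer_tokens, thinking_tokens):
    """Estimate USD cost of one request.

    Assumption: thinking tokens count as output tokens for billing.
    """
    billed_output = answer_tokens + thinking_tokens
    total = input_tokens * INPUT_PRICE + billed_output * OUTPUT_PRICE
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10K prompt tokens, 1K answer tokens, 4K thinking tokens.
    cost = thinking_request_cost(10_000, 1_000, 4_000)
    print(f"${cost:.4f}")  # prints $0.0190
```

Under this assumption, the 4K thinking tokens in the example account for most of the bill even though they never appear in the visible answer.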

Showing 1-20 of 48 models