Google Gemini API Pricing 2026

Pricing for Gemini 2.0 Pro, Flash, and Flash-Lite: tiered context pricing, context caching, and the differences between Vertex AI and the Gemini Developer API.

Updated May 2026

Model Pricing

Google prices Gemini by prompt length: once a prompt exceeds 128K tokens, the per-token rates for both input and output double. Billing is therefore non-linear and harder to predict than a flat per-token rate.

Model	Input ≤128K ($ per 1M tokens)	Input >128K ($ per 1M tokens)	Output ≤128K ($ per 1M tokens)	Output >128K ($ per 1M tokens)
Gemini 2.0 Pro	$1.25	$2.50	$5.00	$10.00
Gemini 2.0 Flash	$0.075	$0.15	$0.30	$0.60
Gemini 2.0 Flash-Lite	$0.0375	$0.075	$0.15	$0.30
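
The tiered table can be turned into a small per-request estimator. The sketch below is illustrative only. The model keys are hypothetical labels, not official API model IDs, and it assumes that 128K means 128,000 tokens and that a prompt over the threshold moves the whole request (input and output) to the higher rates, as the table columns suggest.

```python
# Rates in USD per 1M tokens, copied from the pricing table.
# Tuple order: (input <=128K, input >128K, output <=128K, output >128K)
# The dictionary keys are hypothetical labels, not official model IDs.
RATES = {
    "gemini-2.0-pro": (1.25, 2.50, 5.00, 10.00),
    "gemini-2.0-flash": (0.075, 0.15, 0.30, 0.60),
    "gemini-2.0-flash-lite": (0.0375, 0.075, 0.15, 0.30),
}

# Assumption: "128K" means 128,000 tokens, and a prompt of exactly
# that length still falls in the lower tier.
THRESHOLD = 128_000


def request_cost(model: str, prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under tiered pricing."""
    in_low, in_high, out_low, out_high = RATES[model]
    # Assumption: the prompt length alone selects the tier for both
    # input and output tokens of the request.
    long_prompt = prompt_tokens > THRESHOLD
    in_rate = in_high if long_prompt else in_low
    out_rate = out_high if long_prompt else out_low
    return (prompt_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # A 100K-token prompt versus a 130K-token prompt, 1,000 output tokens each.
    print(f"{request_cost('gemini-2.0-pro', 100_000, 1_000):.3f}")  # 0.130
    print(f"{request_cost('gemini-2.0-pro', 130_000, 1_000):.3f}")  # 0.335
```

Under these assumptions, a 30% longer prompt costs roughly 2.6 times as much, because the extra tokens and the higher rate compound.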

128K threshold: If your average prompt is 100K tokens but some prompts exceed 128K, those requests are billed at double the per-token rate. Track the distribution of actual prompt lengths, not the average.
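
To see why an average can mislead, here is a minimal sketch comparing an estimate built from the mean prompt length with the cost of the actual requests. The prompt lengths are made-up sample data, only input tokens are counted, and the rates are the Gemini 2.0 Flash input rates quoted on this page. It assumes 128K means 128,000 tokens.

```python
THRESHOLD = 128_000                 # assumption: 128K = 128,000 tokens
LOW_RATE, HIGH_RATE = 0.075, 0.15   # Gemini 2.0 Flash input, USD per 1M tokens


def input_cost(prompt_tokens: int) -> float:
    """Input-only cost of one request in USD."""
    rate = HIGH_RATE if prompt_tokens > THRESHOLD else LOW_RATE
    return prompt_tokens * rate / 1_000_000


# Hypothetical sample: three prompts under the threshold, one over it.
prompt_lengths = [80_000, 90_000, 90_000, 140_000]

mean_length = sum(prompt_lengths) // len(prompt_lengths)  # 100,000 tokens

# Estimate that treats every request as an "average" request.
estimate_from_mean = input_cost(mean_length) * len(prompt_lengths)

# Cost computed from each request's real length.
actual = sum(input_cost(n) for n in prompt_lengths)

# Share of requests that crossed the threshold.
over_share = sum(n > THRESHOLD for n in prompt_lengths) / len(prompt_lengths)

print(f"estimate from mean: ${estimate_from_mean:.4f}")
print(f"actual:             ${actual:.4f}")
print(f"share over 128K:    {over_share:.0%}")
```

In this sample the mean sits comfortably below the threshold, yet the single long prompt accounts for more than half of the actual input cost.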

Context Caching

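Gemini supports context caching to reduce costs on repeated large contexts (documentation, a codebase, long system prompts). Cached tokens are billed at a discounted input rate, 25% of the standard ≤128K input rate for both models listed here, plus an hourly storage charge.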

Model	Cache storage ($ per 1M tokens per hour)	Cached input ($ per 1M tokens)
Gemini 2.0 Pro	$4.50	$0.3125
Gemini 2.0 Flash	$1.00	$0.01875
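
Whether caching pays off depends on how often the cached context is reused, since storage is charged by the hour. The sketch below estimates a break-even reuse rate from the figures on this page. It is a simplification: it assumes cached tokens are billed at the cached input rate instead of the standard ≤128K input rate, and it ignores the cost of creating the cache, the uncached part of each prompt, output tokens, and the >128K tier. The model keys are hypothetical labels.

```python
# USD per 1M tokens: (standard input <=128K, cached input, storage per hour).
# Values are taken from the pricing tables; keys are hypothetical labels.
CACHE_RATES = {
    "gemini-2.0-pro": (1.25, 0.3125, 4.50),
    "gemini-2.0-flash": (0.075, 0.01875, 1.00),
}


def break_even_reads_per_hour(model: str) -> float:
    """Reads per hour at which storage cost equals the per-read savings."""
    standard, cached, storage = CACHE_RATES[model]
    return storage / (standard - cached)


def hourly_cost(model: str, context_tokens: int, reads_per_hour: float,
                use_cache: bool) -> float:
    """Hourly USD cost of sending the same context repeatedly."""
    standard, cached, storage = CACHE_RATES[model]
    millions = context_tokens / 1_000_000
    if use_cache:
        # Storage for one hour plus discounted reads.
        return millions * (storage + reads_per_hour * cached)
    # No cache: every read pays the standard input rate.
    return millions * reads_per_hour * standard


if __name__ == "__main__":
    for model in CACHE_RATES:
        print(model, f"{break_even_reads_per_hour(model):.1f}")
    # A 100K-token context reused 10 times per hour on Pro.
    print(f"{hourly_cost('gemini-2.0-pro', 100_000, 10, use_cache=True):.4f}")
    print(f"{hourly_cost('gemini-2.0-pro', 100_000, 10, use_cache=False):.4f}")
```

Under these assumptions the break-even is about 4.8 reads per hour for Pro and about 17.8 for Flash, so caching a rarely reused context can cost more than resending it.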

Vertex AI vs. Gemini Developer API

Google offers two ways to access Gemini models. They differ in billing, features, and target audience:

Dimension	Gemini Developer API	Vertex AI
Target	Startups, prototypes	Enterprise, regulated
Free tier	Yes (rate-limited)	No
Pricing	Direct Google billing	GCP billing (slightly different rates)
Data residency	Limited	Full regional control
SLA	Best-effort	Enterprise SLA

Best For

Long-context document analysis

Gemini's 1M+ token context window is among the largest available, which makes it possible to process entire books, large codebases, or extensive conversation histories in a single call.

Multimodal workloads

Native vision, audio, and video understanding is built into Gemini, so image analysis does not require switching to a separate model.

Budget-sensitive at scale

Gemini 2.0 Flash, at $0.075 per 1M input tokens for prompts up to 128K, is among the cheapest capable models available. It is a strong default for high-volume, shorter-context tasks.
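
As a rough illustration of what these rates mean at volume, the sketch below compares the three models on one hypothetical workload: 10 million requests per month, each with a 2,000-token prompt and a 500-token response. The workload numbers are invented for the example, and only the ≤128K rates from this page are used.

```python
# (input, output) in USD per 1M tokens for prompts up to 128K,
# copied from the pricing table on this page.
SHORT_CONTEXT_RATES = {
    "Gemini 2.0 Pro": (1.25, 5.00),
    "Gemini 2.0 Flash": (0.075, 0.30),
    "Gemini 2.0 Flash-Lite": (0.0375, 0.15),
}


def monthly_cost(model: str, requests: int, prompt_tokens: int,
                 output_tokens: int) -> float:
    """Monthly USD cost for a uniform short-context workload."""
    in_rate, out_rate = SHORT_CONTEXT_RATES[model]
    input_millions = requests * prompt_tokens / 1_000_000
    output_millions = requests * output_tokens / 1_000_000
    return input_millions * in_rate + output_millions * out_rate


if __name__ == "__main__":
    # Hypothetical workload: 10M requests, 2,000 prompt and 500 output tokens.
    for model in SHORT_CONTEXT_RATES:
        cost = monthly_cost(model, 10_000_000, 2_000, 500)
        print(f"{model}: ${cost:,.0f}")
```

For this workload the estimate comes to $50,000 on Pro, $3,000 on Flash, and $1,500 on Flash-Lite, before any free tier, caching, or negotiated discounts.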

Track your Google Gemini spend automatically

PayMesh connects to your Google Cloud billing to track Gemini API costs. See which models and context lengths are driving your bill.
