G Google Gemini API

Google Gemini API Pricing 2026

Gemini 2.0 Pro and Gemini 2.0 Flash — tiered context pricing, context caching, and Vertex AI vs. direct API differences.

Updated May 2026
OpenAI Anthropic Google AWS Bedrock Azure OpenAI

Model Pricing

Google charges differently based on prompt length — crossing the 128K token threshold doubles your input cost. This creates non-linear billing that's harder to predict than flat per-token rates.

Model Input ≤128K / 1M Input >128K / 1M Output ≤128K / 1M Output >128K / 1M
Gemini 2.0 Pro $1.25 $2.50 $5.00 $10.00
Gemini 2.0 Flash $0.075 $0.15 $0.30 $0.60
Gemini 2.0 Flash-Lite $0.0375 $0.075 $0.15 $0.30
128K threshold: If your average prompt is 100K tokens and occasionally spills past 128K, your cost can double on those requests. Track actual prompt lengths — not averages.

Context Caching

Gemini supports context caching to reduce costs on repeated large contexts (documentation, codebase, long system prompts). Cached content is billed at a discounted rate.

Model Cache storage / 1M tokens / hour Cache input / 1M tokens
Gemini 2.0 Pro $4.50 $0.3125
Gemini 2.0 Flash $1.00 $0.01875

Vertex AI vs. Gemini Developer API

Google offers two ways to access Gemini models with meaningfully different pricing and features:

Dimension Gemini Developer API Vertex AI
Target Startups, prototypes Enterprise, regulated
Free tier Yes (rate-limited) No
Pricing Direct Google billing GCP billing (slightly different rates)
Data residency Limited Full regional control
SLA Best-effort Enterprise SLA

Best For

Long-context document analysis

Gemini's 1M+ token context window is the industry leader for processing entire books, large codebases, or extensive conversation histories in a single call.

Multimodal workloads

Native vision, audio, and video understanding is built into Gemini. No separate model switching for image analysis tasks.

Budget-sensitive at scale

Gemini 2.0 Flash at $0.075/M input is among the cheapest capable models available. For high-volume, shorter-context tasks it's hard to beat.

Track your Google Gemini spend automatically

PayMesh connects to your Google Cloud billing to track Gemini API costs. See which models and context lengths are driving your bill.

Compare providers

See also: AI API Cost Comparison 2026.