AWS Bedrock Pricing 2026

On-demand vs. provisioned throughput, marketplace model pricing, and the real cost of routing via AWS vs. directly through providers.

Updated May 2026

On-Demand Model Pricing

Bedrock is a model marketplace: you access third-party models (Anthropic, Meta, Mistral, Cohere) through AWS infrastructure, alongside Amazon's own Titan and Nova models. AWS, not the model developer, sets the price you pay, so Bedrock rates may differ from direct API rates.

| Model (via Bedrock) | Input / 1M tokens | Output / 1M tokens | vs. Direct API |
|---|---|---|---|
| Claude 3.5 Sonnet | $3.00 | $15.00 | Same as direct |
| Claude 3.5 Haiku | $0.80 | $4.00 | Same as direct |
| Llama 3.1 70B | $0.99 | $1.32 | AWS-only |
| Llama 3.1 8B | $0.22 | $0.22 | AWS-only |
| Mistral 7B | $0.15 | $0.20 | AWS-only |
| Amazon Titan Text Premier | $0.50 | $1.50 | AWS-only |
| Amazon Nova Pro | $0.80 | $3.20 | AWS-only |
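
To make the per-token rates concrete, here is a minimal sketch that turns the table into a monthly estimate. The rates are copied from the table above; the function name `on_demand_cost` and the 50M/10M token workload are illustrative assumptions, not figures from AWS.

```python
# On-demand rates in USD per 1M tokens (input, output), copied from the table above.
ON_DEMAND_RATES = {
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "Llama 3.1 70B": (0.99, 1.32),
    "Llama 3.1 8B": (0.22, 0.22),
    "Mistral 7B": (0.15, 0.20),
    "Amazon Titan Text Premier": (0.50, 1.50),
    "Amazon Nova Pro": (0.80, 3.20),
}


def on_demand_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the on-demand cost in USD for one workload on one model."""
    input_rate, output_rate = ON_DEMAND_RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical workload: 50M input tokens and 10M output tokens per month.
for name in ON_DEMAND_RATES:
    print(f"{name}: ${on_demand_cost(name, 50_000_000, 10_000_000):,.2f}")
```

Because output tokens cost several times more than input tokens on most models, the input/output mix matters as much as the total volume when comparing rows.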

Provisioned Throughput (PTU)

Provisioned Throughput gives you guaranteed model capacity in exchange for a fixed monthly commitment that you pay whether or not you use it. It is cheaper per token at high utilization but expensive when usage is variable or seasonal.

| Commitment type | Claude 3.5 Sonnet (per model unit/month) | Best for |
|---|---|---|
| 1-month commitment | ~$15,000 | High-volume, consistent traffic |
| 6-month commitment | ~$12,000 | Stable enterprise workloads |

Watch out: Underutilized PTU reservations are a sunk cost. Unused capacity doesn't roll over and can't be resold. Only commit to PTU when your utilization is consistently above 70%.
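
One way to sanity-check a commitment is to ask how many tokens per month you would need to push through on-demand before the bill matches the PTU price. The sketch below does that arithmetic using the Claude 3.5 Sonnet figures from this page. The 80% input share and the function names are assumptions for illustration, and the calculation ignores the throughput ceiling of a single model unit, which this page does not state.

```python
def blended_rate_per_million(input_rate: float, output_rate: float,
                             input_share: float) -> float:
    """Average on-demand USD per 1M tokens for a given input/output mix."""
    return input_rate * input_share + output_rate * (1.0 - input_share)


def break_even_tokens(monthly_commitment: float, input_rate: float,
                      output_rate: float, input_share: float) -> float:
    """Monthly token volume at which on-demand spend equals the commitment."""
    rate = blended_rate_per_million(input_rate, output_rate, input_share)
    return monthly_commitment / rate * 1_000_000


# Commitment prices and on-demand rates come from this page;
# the 80% input share is an assumed traffic mix.
for label, commitment in [("1-month", 15_000), ("6-month", 12_000)]:
    tokens = break_even_tokens(commitment, 3.00, 15.00, input_share=0.80)
    print(f"{label}: break-even at about {tokens / 1e9:.2f}B tokens/month")
```

Under these assumptions the 1-month commitment breaks even at roughly 2.8 billion tokens per month. Whether one model unit can actually serve that volume depends on its throughput limits, so treat the result as a lower bound on the traffic you need, not as a sizing guide.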

Regional Billing

Bedrock prices vary by region and by whether Cross-Region Inference is enabled. A workload split across us-east-1 and eu-west-1 generates two separate line items. Each additional region adds its own set of line items, so a two-region deployment for latency or compliance doubles the number of costs you have to track.
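
The effect on cost tracking can be sketched with a small aggregation. The records below are made-up sample data, and `line_items` and `by_region` are hypothetical helper names. The point is only that the same model in two regions yields two separate entries to reconcile.

```python
from collections import defaultdict

# Hypothetical usage records: (region, model, cost in USD).
records = [
    ("us-east-1", "Claude 3.5 Sonnet", 420.00),
    ("eu-west-1", "Claude 3.5 Sonnet", 310.50),
    ("us-east-1", "Llama 3.1 8B", 12.25),
    ("eu-west-1", "Claude 3.5 Sonnet", 89.50),
]


def line_items(rows):
    """Sum cost per (region, model) pair, one entry per billing line item."""
    totals = defaultdict(float)
    for region, model, cost in rows:
        totals[(region, model)] += cost
    return dict(totals)


def by_region(rows):
    """Roll the same rows up to one total per region."""
    totals = defaultdict(float)
    for region, _model, cost in rows:
        totals[region] += cost
    return dict(totals)


for (region, model), cost in sorted(line_items(records).items()):
    print(f"{region:10} {model:20} ${cost:,.2f}")
```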

Hidden Costs

| Cost type | What it is | Typical impact |
|---|---|---|
| Knowledge base storage | S3 + vector index for RAG | $0.10–$1.00/GB/month |
| Model evaluation | Automated eval job runs | Per-token at standard rates |
| Inference profiles | Cross-region inference routing | 5–15% markup on base rates |
| Guardrails | Content filtering per API call | $0.75 per 1K text units |
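
These add-ons are easy to underestimate, so a rough estimator helps. The sketch below uses the rates from the table above. It assumes one text unit covers up to 1,000 characters and that each call is rounded up to whole units. Check both assumptions against your own bill. All function names are illustrative.

```python
import math

GUARDRAIL_RATE_PER_1K_UNITS = 0.75  # USD, from the table above
CHARS_PER_TEXT_UNIT = 1000          # assumption: one text unit = up to 1,000 characters


def guardrail_cost(char_counts):
    """Guardrail fee for a batch of calls, rounding each call up to whole text units."""
    units = sum(math.ceil(n / CHARS_PER_TEXT_UNIT) for n in char_counts)
    return units / 1000 * GUARDRAIL_RATE_PER_1K_UNITS


def with_cross_region_markup(base_cost, markup):
    """Apply a fractional markup (0.05 to 0.15 per the table) to a base inference cost."""
    return base_cost * (1.0 + markup)


def storage_cost(gigabytes, rate_per_gb):
    """Monthly knowledge base storage at a per-GB rate ($0.10 to $1.00 per the table)."""
    return gigabytes * rate_per_gb


# Hypothetical month: 1M guarded calls of 1,500 characters each,
# $300 of base inference at a 10% markup, and 50 GB stored at $0.10/GB.
total = (
    guardrail_cost([1500] * 1_000_000)
    + with_cross_region_markup(300.0, 0.10)
    + storage_cost(50, 0.10)
)
print(f"Estimated monthly total: ${total:,.2f}")
```

In this made-up scenario the guardrail fee is several times larger than the inference cost it protects, which is why per-call filtering deserves its own line in a budget.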

Best For

Existing AWS shops

If you're already on AWS, Bedrock fits neatly into IAM, VPC, CloudWatch, and consolidated billing. No new vendor relationships, no new payment method.

Multi-model routing

Bedrock gives you access to Claude, Llama, Mistral, Amazon Titan, and Cohere through a single API surface — useful for routing by cost, latency, or capability.

Enterprise compliance

Data stays within AWS's compliance perimeter. For teams with HIPAA, FedRAMP, or regional data residency requirements, Bedrock avoids third-party data transfer issues.
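
For the multi-model routing case, cost-based selection can be as simple as picking the cheapest model that clears a capability bar. The sketch below uses output prices from the on-demand table. The `tier` values are invented for illustration and are not benchmark results, and `cheapest_model` is a hypothetical helper, not part of any AWS SDK.

```python
# Output price in USD per 1M tokens (from the on-demand table) and a
# hypothetical capability tier; the tiers are illustrative, not benchmarks.
MODELS = [
    {"name": "Mistral 7B", "output_price": 0.20, "tier": 1},
    {"name": "Llama 3.1 8B", "output_price": 0.22, "tier": 1},
    {"name": "Llama 3.1 70B", "output_price": 1.32, "tier": 2},
    {"name": "Claude 3.5 Haiku", "output_price": 4.00, "tier": 2},
    {"name": "Claude 3.5 Sonnet", "output_price": 15.00, "tier": 3},
]


def cheapest_model(min_tier: int) -> str:
    """Pick the lowest-priced model whose tier meets the requirement."""
    candidates = [m for m in MODELS if m["tier"] >= min_tier]
    if not candidates:
        raise ValueError(f"no model satisfies tier {min_tier}")
    return min(candidates, key=lambda m: m["output_price"])["name"]


for tier in (1, 2, 3):
    print(f"tier {tier}: {cheapest_model(tier)}")
```

A real router would also weigh latency and input price, but the structure stays the same: filter by requirement, then minimize cost.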

Track your AWS Bedrock spend automatically

PayMesh syncs with AWS Cost Explorer to pull Bedrock costs across regions and models. See your actual spend breakdown — including PTU utilization, knowledge base storage, and guardrail fees.

Compare providers

See also: AI API Cost Comparison 2026.