On-demand vs. provisioned throughput, marketplace model pricing, and the real cost of routing via AWS vs. directly through providers.
Updated May 2026Bedrock is a model marketplace — you access third-party models (Anthropic, Meta, Mistral, Cohere) through AWS infrastructure. Prices are set by AWS, not the model developers, and may differ from direct API rates.
| Model (via Bedrock) | Input / 1M tokens | Output / 1M tokens | vs. Direct API |
|---|---|---|---|
| Claude 3.5 Sonnet | $3.00 | $15.00 | Same as direct |
| Claude 3.5 Haiku | $0.80 | $4.00 | Same as direct |
| Llama 3.1 70B | $0.99 | $1.32 | AWS-only |
| Llama 3.1 8B | $0.22 | $0.22 | AWS-only |
| Mistral 7B | $0.15 | $0.20 | AWS-only |
| Amazon Titan Text Premier | $0.50 | $1.50 | AWS-only |
| Amazon Nova Pro | $0.80 | $3.20 | AWS-only |
Provisioned Throughput gives you guaranteed model capacity in exchange for a fixed monthly commitment — regardless of whether you use it. It's cheaper per token at high utilization but expensive if your usage is variable or seasonal.
| Commitment type | Claude 3.5 Sonnet (per model unit/month) | Best for |
|---|---|---|
| 1-month commitment | ~$15,000 | High-volume, consistent traffic |
| 6-month commitment | ~$12,000 | Stable enterprise workloads |
Bedrock prices vary by region and by whether Cross-Region Inference is enabled. A workload split across us-east-1 and eu-west-1 generates two separate line items. Multi-region deployments for latency or compliance reasons can double the number of cost tracking points.
| Cost type | What it is | Typical impact |
|---|---|---|
| Knowledge base storage | S3 + vector index for RAG | $0.10–1.00/GB/month |
| Model evaluation | Automated eval job runs | Per-token at standard rates |
| Inference profiles | Cross-region inference routing | 5–15% markup on base rates |
| Guardrails | Content filtering per API call | $0.75 per 1K text units |
If you're already on AWS, Bedrock fits neatly into IAM, VPC, CloudWatch, and consolidated billing. No new vendor relationships, no new payment method.
Bedrock gives you access to Claude, Llama, Mistral, Amazon Titan, and Cohere through a single API surface — useful for routing by cost, latency, or capability.
Data stays within AWS's compliance perimeter. For teams with HIPAA, FedRAMP, or regional data residency requirements, Bedrock avoids third-party data transfer issues.
PayMesh syncs with AWS Cost Explorer to pull Bedrock costs across regions and models. See your actual spend breakdown — including PTU utilization, knowledge base storage, and guardrail fees.