Pay-as-you-go vs. Provisioned Throughput Units (PTU) — with real-world guidance on when each model makes financial sense.
Updated May 2026Azure PAYG rates are set by Microsoft and slightly differ from OpenAI's direct API prices due to Azure infrastructure overhead. The models are identical — the pricing model is not.
| Model | Input / 1M tokens | Output / 1M tokens | Context window |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | 128K |
| GPT-4o mini | $0.165 | $0.66 | 128K |
| GPT-4 Turbo | $10.00 | $30.00 | 128K |
| o1 | $15.00 | $60.00 | 200K |
| text-embedding-3-large | $0.13 | — | 8K |
Azure's PTU model commits you to reserved model capacity. Unlike PAYG where you pay per token, PTU is a flat monthly fee per throughput unit regardless of utilization. The math only works if you're running at high sustained throughput.
| Model | PTU price (per unit/hour) | Effective input $/1M at 100% util. | Break-even vs. PAYG |
|---|---|---|---|
| GPT-4o PTU | ~$2.00 | ~$1.00–1.50 | ~50–60% utilization |
| GPT-4o mini PTU | ~$0.50 | ~$0.06–0.09 | ~50–60% utilization |
| Dimension | Azure OpenAI | OpenAI Direct |
|---|---|---|
| Model parity | GPT-4o, o1, some models lag | All models, day-one access |
| PAYG pricing | Slightly higher (GPT-4o mini: $0.165 vs $0.15) | Lower base rate |
| PTU / reserved capacity | Yes | Enterprise agreements only |
| Microsoft 365 integration | Native | Requires custom wiring |
| Compliance (HIPAA, FedRAMP) | Full Azure coverage | Separate BAA required |
| Azure bill consolidation | Yes | Separate invoice |
Azure OpenAI integrates with Active Directory, Azure Monitor, Azure Private Link, and M365 Copilot. For shops already deep in Azure, it's the path of least friction.
Azure's FedRAMP High, HIPAA, and ISO 27001 certifications extend to Azure OpenAI. For healthcare and government, compliance is a hard requirement that narrows your options.
PTU gives you guaranteed throughput and up to 40% discount vs. PAYG for workloads that run 24/7 at consistent volume. Batch processing pipelines and always-on assistants qualify.
Azure OpenAI includes default content filtering (Azure AI Content Safety) enabled on all endpoints. This adds latency and a small per-call overhead. Custom content filter policies (additional configuration) are part of the standard service but complex setups may require Azure AI Studio usage.
Fine-tuning is available for GPT-4o on Azure at rates comparable to direct OpenAI. Training costs: ~$25/M tokens. Hosted fine-tuned model inference runs at 1.5× the base model rate.
PayMesh connects to Azure's Usage API to sync costs across deployments and models. See PAYG vs. PTU utilization side-by-side and get alerted before you overspend.