AI API Cost Calculator
Estimate and compare API costs for popular AI models across providers — text, image, video, and embeddings — by input/output token pricing.
Cost / request: $0.0100
Total (1,000 requests): $10.00
Selected model: GPT-5.4
All text models: cost for your inputs (cheapest first)

| Model | Provider | Cost / request | Total × 1,000 |
|---|---|---|---|
| Mistral Small | Mistral | $0.000250 | $0.2500 |
| DeepSeek V4 Flash (chat/reasoner) | DeepSeek | $0.000280 | $0.2800 |
| GPT-4.1 nano | OpenAI | $0.000300 | $0.3000 |
| Gemini 2.5 Flash-Lite | Google (Gemini) | $0.000300 | $0.3000 |
| GPT-4o mini | OpenAI | $0.000450 | $0.4500 |
| Grok 4 Fast | xAI (Grok) | $0.000450 | $0.4500 |
| GPT-5.4 nano | OpenAI | $0.000825 | $0.8250 |
| GPT-4.1 mini | OpenAI | $0.001200 | $1.20 |
| Gemini 2.5 Flash | Google (Gemini) | $0.001550 | $1.55 |
| Gemini 3 Flash | Google (Gemini) | $0.002000 | $2.00 |
| GPT-5.4 mini | OpenAI | $0.003000 | $3.00 |
| Claude Haiku 4.5 | Anthropic (Claude) | $0.003500 | $3.50 |
| Mistral Large | Mistral | $0.005000 | $5.00 |
| GPT-4.1 | OpenAI | $0.006000 | $6.00 |
| Gemini 3.5 Flash | Google (Gemini) | $0.006000 | $6.00 |
| Gemini 2.5 Pro (higher rate of $2.50/$15 above 200k input tokens) | Google (Gemini) | $0.006250 | $6.25 |
| Claude Sonnet 5 (introductory pricing of $2/$10 through Aug 31, 2026; then $3/$15) | Anthropic (Claude) | $0.007000 | $7.00 |
| GPT-4o | OpenAI | $0.007500 | $7.50 |
| Cohere Command A | Cohere | $0.007500 | $7.50 |
| Gemini 3.1 Pro (higher rate of $4/$18 above 200k input tokens) | Google (Gemini) | $0.008000 | $8.00 |
| GPT-5.4 | OpenAI | $0.0100 | $10.00 |
| Claude Sonnet 4.6 | Anthropic (Claude) | $0.0105 | $10.50 |
| Grok 4 | xAI (Grok) | $0.0105 | $10.50 |
| Claude Opus 4.8 | Anthropic (Claude) | $0.0175 | $17.50 |
| GPT-5.5 | OpenAI | $0.0200 | $20.00 |
| Claude Fable 5 | Anthropic (Claude) | $0.0350 | $35.00 |
| Claude Opus 4.1 (deprecated) | Anthropic (Claude) | $0.0525 | $52.50 |
| GPT-5.5 Pro | OpenAI | $0.1200 | $120.00 |
Official pricing pages
How AI API Pricing Works: Tokens, Input vs Output, and Estimating Real Costs
Almost every large language model API is billed by the token rather than by the request. A token is a chunk of text — roughly four characters or about three-quarters of an English word — and both the text you send (the input, or prompt) and the text the model generates (the output, or completion) are counted. Providers quote prices per million tokens, which makes headline numbers look small until you multiply them by the volume of a real application. This calculator turns those per-million rates into the numbers that actually matter: cost per request, and cost across thousands or millions of requests.
The single most important thing to understand is that input and output tokens are priced differently. Generating text is far more compute-intensive than reading it, so output almost always costs several times more than input, commonly three to five times. A model advertised at a cheap input rate can still be expensive if your workload produces long, verbose answers. The table above shows only the combined cost per request, so when you compare models, check each one's output rate as well, because for chat and generation workloads it usually dominates the bill.
The Token Cost Formula
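The math for a single text request is straightforward once prices are normalised to a per-token basis: cost per request = (input tokens × input price per million tokens + output tokens × output price per million tokens) ÷ 1,000,000.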
For a batch job you simply multiply that per-request cost by the number of requests. Embedding models charge on input tokens only, image models charge per generated image (or per image token), and video models charge per second of output — this tool applies the right formula automatically for each modality.
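The per-request and batch arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names are made up for the example, and the workload of 1,000 input and 500 output tokens is an assumption. The $2/$10 rates are the introductory Claude Sonnet 5 prices quoted in the table, and with that assumed workload the sketch reproduces the table's $0.007000 per request.

```python
def text_request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in USD of one text request, given prices quoted per million tokens."""
    input_cost = input_tokens * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    return input_cost + output_cost


def batch_cost(cost_per_request, num_requests):
    """Total cost in USD of running the same request num_requests times."""
    return cost_per_request * num_requests


# Assumed workload: 1,000 input and 500 output tokens per request.
# Rates: $2 input / $10 output per million tokens (from the table note).
per_request = text_request_cost(1_000, 500, input_price_per_m=2.00, output_price_per_m=10.00)
total = batch_cost(per_request, 1_000)
print(f"${per_request:.6f} per request, ${total:.2f} per 1,000 requests")
# prints: $0.007000 per request, $7.00 per 1,000 requests
```

Note that the 500 output tokens account for $0.005 of the $0.007, even though there are half as many of them as input tokens.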
Prompt Caching Can Cut Costs Dramatically
Several providers, including OpenAI and Anthropic, offer prompt caching. When you repeatedly send the same large context — a long system prompt, a knowledge-base document, or a running conversation history — the cached portion is billed at a steep discount on cache hits, often around 10% of the normal input price.
For agents and chatbots that resend a big fixed prompt on every turn, caching can cut the cost of the cached part of the input by up to roughly 90% at that rate. Enter your cached token count in the text calculator to see the discount reflected in the estimate.
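To see how the cache discount feeds into the input side of the bill, here is a rough Python sketch. It assumes cached tokens are billed at a fixed fraction of the normal input price (10% by default, matching the typical figure mentioned above) and models cache hits only; any charge a provider applies when the cache is first written is not included. The function name, the parameter names, and the 10,000-token prompt are illustrative assumptions.

```python
def input_cost_with_cache(input_tokens, cached_tokens, input_price_per_m,
                          cached_price_ratio=0.10):
    """Input cost in USD when part of the prompt is served from cache.

    cached_price_ratio is an assumption: the fraction of the normal input
    price charged for cached tokens. Check the provider's actual rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    cached_price_per_m = input_price_per_m * cached_price_ratio
    return (uncached_tokens * input_price_per_m
            + cached_tokens * cached_price_per_m) / 1_000_000


# Illustrative prompt: 10,000 input tokens at $3 per million, 9,000 of them cached.
full = input_cost_with_cache(10_000, 0, input_price_per_m=3.00)
cached = input_cost_with_cache(10_000, 9_000, input_price_per_m=3.00)
print(f"no cache: ${full:.4f}, 90% cached: ${cached:.4f}, saving {1 - cached / full:.0%}")
# prints: no cache: $0.0300, 90% cached: $0.0057, saving 81%
```

The saving applies to input tokens only; output tokens are billed at the normal output rate regardless of caching.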
Choosing a Model: Balancing Cost, Quality, and Latency
The cheapest model is rarely the right default, and the most expensive one is rarely necessary. A practical approach is to route by task difficulty: use a small, low-cost model (such as a mini or nano tier, Gemini Flash-Lite, or DeepSeek) for classification, extraction, and simple summarisation, and reserve a flagship model (GPT-5.5, Claude Opus, Gemini Pro) for complex reasoning, long-form generation, or high-stakes output. Because the price gap between tiers is often 10× to 50×, moving routine traffic to a cheaper model is usually the biggest single lever on an AI bill.
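The effect of routing by task difficulty can be estimated with a short Python sketch. The two per-request costs are taken from the comparison table above (GPT-5.4 nano and GPT-5.5); the task labels, the traffic mix, and the function names are hypothetical and only illustrate the idea.

```python
# Per-request costs in USD, taken from the comparison table above.
SMALL_MODEL_COST = 0.000825  # GPT-5.4 nano
FLAGSHIP_COST = 0.0200       # GPT-5.5

# Hypothetical task labels; a real router would use its own taxonomy.
SIMPLE_TASKS = {"classification", "extraction", "summarisation"}


def cost_for_task(task_type):
    """Per-request cost after routing: simple tasks go to the small model."""
    return SMALL_MODEL_COST if task_type in SIMPLE_TASKS else FLAGSHIP_COST


def blended_cost(task_counts):
    """Total cost in USD for a mapping of task type to request count."""
    return sum(cost_for_task(task) * count for task, count in task_counts.items())


# Assumed mix: 1,000 requests, 80% of them routine.
traffic = {"classification": 500, "extraction": 300, "complex_reasoning": 200}
routed = blended_cost(traffic)
flagship_only = FLAGSHIP_COST * sum(traffic.values())
print(f"routed: ${routed:.2f}, flagship only: ${flagship_only:.2f}")
# prints: routed: $4.66, flagship only: $20.00
```

In this assumed mix, almost all of the remaining cost comes from the 200 requests that still go to the flagship model.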
Estimate before you build, not after. Plug your realistic average token counts and monthly request volume into the calculator, compare the top few candidates in the table, and you will often find that two models with similar quality differ enormously in cost. For high-volume or non-urgent workloads, also check whether a provider offers a batch tier (frequently around a 50% discount) — those rates are not shown here but are listed on each provider's official pricing page linked above.
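A monthly estimate that accounts for a batch tier might look like the following Python sketch. It assumes a flat 50% batch discount, in line with the typical figure mentioned above; the real discount, the share of traffic that can tolerate batch latency, and the monthly volume are all assumptions to replace with your own numbers. The $0.0100 per-request cost is the GPT-5.4 figure from the table.

```python
def monthly_cost(cost_per_request, requests_per_month,
                 batch_share=0.0, batch_discount=0.50):
    """Monthly cost in USD when a share of requests runs on a batch tier.

    batch_share: fraction of requests sent to the batch tier (0.0 to 1.0).
    batch_discount: assumed discount on batch requests; confirm with the provider.
    """
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be between 0.0 and 1.0")
    batch_requests = requests_per_month * batch_share
    realtime_requests = requests_per_month - batch_requests
    batch_rate = cost_per_request * (1 - batch_discount)
    return realtime_requests * cost_per_request + batch_requests * batch_rate


# Assumed volume: 100,000 requests per month at $0.0100 each.
all_realtime = monthly_cost(0.0100, 100_000)
with_batch = monthly_cost(0.0100, 100_000, batch_share=0.4)
print(f"all real-time: ${all_realtime:.2f}, 40% batched: ${with_batch:.2f}")
# prints: all real-time: $1000.00, 40% batched: $800.00
```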
Disclaimer: This AI API Cost Calculator is provided for general estimation and educational purposes. Prices were compiled from official provider pages in July 2026 and can change at any time; regional endpoints, data-residency options, batch and priority tiers, fine-tuning, and server-side tools (such as web search) may carry different or additional charges. INR figures are approximate conversions from USD, which is the billing currency for these APIs. Always confirm current pricing on the provider's official page before making purchasing or architectural decisions.
How to Use
Pick a modality: text/chat, image, video, or embeddings.
Choose a model, then enter your token counts (or images/seconds) and number of requests.
See cost per request and total, plus a comparison table of every model sorted cheapest-first.
Cross-check the number on the provider’s official pricing page before you rely on it.
Features
FAQ
Use this free AI API cost calculator to estimate and compare pricing for large language models and other AI APIs. Enter your input and output token counts to see the cost per request and monthly total for OpenAI GPT, Anthropic Claude, Google Gemini, DeepSeek, Mistral, xAI Grok, and Cohere. It also covers image generation cost per image, video generation cost per second, and embedding cost per token — all calculated in your browser with no signup.
About AI API Cost Calculator
Estimate and compare the cost of AI API calls across every popular provider and model. Enter input and output token counts to see the cost per request and monthly total for OpenAI GPT, Anthropic Claude, Google Gemini, DeepSeek, Mistral, xAI Grok, and Cohere. Includes prompt-cache pricing, image generation cost per image, video generation cost per second, and embedding cost per token, with a comparison table sorted cheapest-first. All calculations run in your browser.
Processing Note
AI API Cost Calculator runs in your browser, so the input you enter is processed locally on this page and is not uploaded to a ToolMintX account.