AI API Cost Calculator

Estimate and compare API costs for popular AI models across providers — text, image, video, and embeddings — by input/output token pricing.

Prices were gathered in July 2026 and are per 1M tokens unless noted. API pricing changes often, so always confirm on the provider’s official page before relying on a figure.

Cost / request: $0.0100
Total (1,000 requests): $10.00
Selected model: GPT-5.4

All text models: cost for your inputs (cheapest first)

| Model | Provider | Cost / request | Total × 1,000 | Notes |
|---|---|---|---|---|
| Mistral Small | Mistral | $0.000250 | $0.2500 | |
| DeepSeek V4 Flash (chat/reasoner) | DeepSeek | $0.000280 | $0.2800 | |
| GPT-4.1 nano | OpenAI | $0.000300 | $0.3000 | |
| Gemini 2.5 Flash-Lite | Google (Gemini) | $0.000300 | $0.3000 | |
| GPT-4o mini | OpenAI | $0.000450 | $0.4500 | |
| Grok 4 Fast | xAI (Grok) | $0.000450 | $0.4500 | |
| GPT-5.4 nano | OpenAI | $0.000825 | $0.8250 | |
| GPT-4.1 mini | OpenAI | $0.001200 | $1.20 | |
| Gemini 2.5 Flash | Google (Gemini) | $0.001550 | $1.55 | |
| Gemini 3 Flash | Google (Gemini) | $0.002000 | $2.00 | |
| GPT-5.4 mini | OpenAI | $0.003000 | $3.00 | |
| Claude Haiku 4.5 | Anthropic (Claude) | $0.003500 | $3.50 | |
| Mistral Large | Mistral | $0.005000 | $5.00 | |
| GPT-4.1 | OpenAI | $0.006000 | $6.00 | |
| Gemini 3.5 Flash | Google (Gemini) | $0.006000 | $6.00 | |
| Gemini 2.5 Pro | Google (Gemini) | $0.006250 | $6.25 | Higher rate ($2.50/$15) above 200k input tokens. |
| Claude Sonnet 5 | Anthropic (Claude) | $0.007000 | $7.00 | Introductory pricing through Aug 31, 2026 ($2/$10); then $3/$15. |
| GPT-4o | OpenAI | $0.007500 | $7.50 | |
| Cohere Command A | Cohere | $0.007500 | $7.50 | |
| Gemini 3.1 Pro | Google (Gemini) | $0.008000 | $8.00 | Higher rate ($4/$18) above 200k input tokens. |
| GPT-5.4 | OpenAI | $0.0100 | $10.00 | |
| Claude Sonnet 4.6 | Anthropic (Claude) | $0.0105 | $10.50 | |
| Grok 4 | xAI (Grok) | $0.0105 | $10.50 | |
| Claude Opus 4.8 | Anthropic (Claude) | $0.0175 | $17.50 | |
| GPT-5.5 | OpenAI | $0.0200 | $20.00 | |
| Claude Fable 5 | Anthropic (Claude) | $0.0350 | $35.00 | |
| Claude Opus 4.1 | Anthropic (Claude) | $0.0525 | $52.50 | Deprecated. |
| GPT-5.5 Pro | OpenAI | $0.1200 | $120.00 | |

How AI API Pricing Works: Tokens, Input vs Output, and Estimating Real Costs

Almost every large language model API is billed by the token rather than by the request. A token is a chunk of text — roughly four characters or about three-quarters of an English word — and both the text you send (the input, or prompt) and the text the model generates (the output, or completion) are counted. Providers quote prices per million tokens, which makes headline numbers look small until you multiply them by the volume of a real application. This calculator turns those per-million rates into the numbers that actually matter: cost per request, and cost across thousands or millions of requests.
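
When no tokenizer is at hand, the rules of thumb above can be turned into a rough estimator. The sketch below is only an approximation: the function names are illustrative, and real tokenizers vary by model and language, so treat the result as a planning figure rather than a billing figure.

```python
import math

CHARS_PER_TOKEN = 4      # rule of thumb: roughly four characters per token
WORDS_PER_TOKEN = 0.75   # rule of thumb: a token is about 3/4 of an English word


def estimate_tokens(text: str) -> int:
    """Rough token estimate from the character count of a string."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count: int) -> int:
    """Rough token estimate from an English word count."""
    return math.ceil(word_count / WORDS_PER_TOKEN)
```

By this estimate a 750-word English prompt is about 1,000 tokens. For exact counts, use the tokenizer or token-counting endpoint that the provider documents for the model in question.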

The single most important thing to understand is that input and output tokens are priced differently. Generating text is far more compute-intensive than reading it, so output almost always costs several times more than input — commonly three to five times. A model advertised at a cheap input rate can still be expensive if your workload produces long, verbose answers. When you compare models in the table above, watch the output column closely, because for chat and generation workloads it usually dominates the bill.

The Token Cost Formula

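The math for a single text request is straightforward: scale each token count to millions, then multiply by the matching per-million price:

cost = (input_tokens ÷ 1,000,000 × input_price) + (output_tokens ÷ 1,000,000 × output_price)

Here input_price and output_price are the provider's rates in USD per 1M tokens.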
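
As a minimal sketch, the formula translates directly into a small function. The rates used in the example call ($2.00 input and $8.00 output per 1M tokens) are hypothetical values chosen for easy arithmetic, not any provider's actual prices.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD of one text request; prices are USD per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# Hypothetical rates for illustration only.
per_request = request_cost(1_000, 500, input_price=2.00, output_price=8.00)
total = per_request * 1_000  # the same request repeated 1,000 times
print(f"${per_request:.4f} per request, ${total:.2f} per 1,000 requests")
```

With these assumed rates the request costs $0.006, and the 500 output tokens account for two-thirds of it even though they are only a third of the tokens, which is why the output price deserves close attention.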

For a batch job you simply multiply that per-request cost by the number of requests. Embedding models charge on input tokens only, image models charge per generated image (or per image token), and video models charge per second of output — this tool applies the right formula automatically for each modality.
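
The per-modality rules described above can be sketched as one small function each. This is an illustration of the arithmetic only; the function and parameter names are assumptions and do not reflect how the calculator itself is implemented.

```python
def embedding_cost(input_tokens, price_per_million):
    """Embeddings are billed on input tokens only (price in USD per 1M tokens)."""
    return input_tokens / 1_000_000 * price_per_million


def image_cost(images, price_per_image):
    """Image generation billed per generated image."""
    return images * price_per_image


def video_cost(seconds, price_per_second):
    """Video generation billed per second of output."""
    return seconds * price_per_second


def batch_cost(cost_per_request, requests):
    """Total for a batch: per-request cost times the number of requests."""
    return cost_per_request * requests
```

For example, `batch_cost(0.0100, 1_000)` gives 10.0, matching the $0.0100 per request and $10.00 per 1,000 requests shown for the selected model at the top of the page.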

Prompt Caching Can Cut Costs Dramatically

Several providers, including OpenAI and Anthropic, offer prompt caching. When you repeatedly send the same large context — a long system prompt, a knowledge-base document, or a running conversation history — the cached portion is billed at a steep discount on cache hits, often around 10% of the normal input price.

For agents and chatbots that resend a big fixed prompt on every turn, caching can reduce input cost by the better part of an order of magnitude. Enter your cached token count in the text calculator to see the discount reflected in the estimate.
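
To see how a cache hit changes the estimate, the sketch below bills the cached part of the input at a fraction of the normal input price. The default `cached_rate` of 0.10 reflects the "around 10%" figure mentioned above and is an assumption, as are the function name and the example prices; the sketch also ignores any surcharge a provider may apply when the cache is first written.

```python
def request_cost_with_cache(input_tokens, cached_tokens, output_tokens,
                            input_price, output_price, cached_rate=0.10):
    """Cost in USD of one request when part of the input is a cache hit.

    cached_tokens is the portion of input_tokens served from the cache.
    Prices are USD per 1M tokens; cached_rate is the fraction of the
    normal input price charged for cached tokens (assumed, not official).
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    input_cost = (uncached_tokens * input_price
                  + cached_tokens * input_price * cached_rate) / 1_000_000
    output_cost = output_tokens * output_price / 1_000_000
    return input_cost + output_cost


# Hypothetical workload: 10,000 input tokens, 9,000 of them a cached system prompt.
without_cache = request_cost_with_cache(10_000, 0, 0, 2.00, 8.00)
with_cache = request_cost_with_cache(10_000, 9_000, 0, 2.00, 8.00)
```

Under these assumptions the input cost drops from $0.0200 to $0.0038 per request, a reduction of about 81%. The saving approaches 90% as the cached share of the prompt approaches 100%.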

Choosing a Model: Balancing Cost, Quality, and Latency

The cheapest model is rarely the right default, and the most expensive one is rarely necessary. A practical approach is to route by task difficulty: use a small, low-cost model (such as a mini or nano tier, Gemini Flash-Lite, or DeepSeek) for classification, extraction, and simple summarisation, and reserve a flagship model (GPT-5.5, Claude Opus, Gemini Pro) for complex reasoning, long-form generation, or high-stakes output. Because the price gap between tiers is often 10× to 50×, moving routine traffic to a cheaper model is usually the biggest single lever on an AI bill.
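
A routing rule of this kind can be very small. The sketch below is a hypothetical illustration: the task labels, tier names, and the 80% traffic split are assumptions, while the two per-request costs are the GPT-5.4 nano and GPT-5.5 figures from the table above (which depend on the inputs entered in the calculator).

```python
# Task types considered simple enough for a small, low-cost model (assumed labels).
SIMPLE_TASKS = {"classification", "extraction", "summarisation"}


def pick_tier(task_type: str) -> str:
    """Route simple tasks to the small tier and everything else to the flagship."""
    return "small" if task_type in SIMPLE_TASKS else "flagship"


def blended_cost(requests, simple_share, small_cost, flagship_cost):
    """Total cost when simple_share of the requests go to the small tier."""
    simple_requests = requests * simple_share
    flagship_requests = requests - simple_requests
    return simple_requests * small_cost + flagship_requests * flagship_cost


# Per-request costs taken from the comparison table; the 80% split is hypothetical.
all_flagship = blended_cost(1_000, 0.0, 0.000825, 0.0200)
routed = blended_cost(1_000, 0.8, 0.000825, 0.0200)
```

Sending everything to the flagship costs $20.00 per 1,000 requests, while routing 80% of traffic to the small tier brings that to about $4.66. Whether the cheaper model is good enough for the routed tasks still has to be checked against your own quality bar.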

Estimate before you build, not after. Plug your realistic average token counts and monthly request volume into the calculator, compare the top few candidates in the table, and you will often find that two models with similar quality differ enormously in cost. For high-volume or non-urgent workloads, also check whether a provider offers a batch tier (frequently around a 50% discount) — those rates are not shown here but are listed on each provider's official pricing page linked above.
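
A monthly projection with an optional batch tier might be sketched as follows. The 50% default for `batch_discount` mirrors the "around a 50% discount" figure above and is an assumption; actual batch rates and eligibility differ by provider, and the function and parameter names are illustrative.

```python
def monthly_cost(cost_per_request, requests_per_month,
                 batch_share=0.0, batch_discount=0.50):
    """Projected monthly spend in USD.

    batch_share is the fraction of requests that can wait for a batch tier;
    batch_discount is the assumed discount on those requests.
    """
    batch_requests = requests_per_month * batch_share
    realtime_requests = requests_per_month - batch_requests
    realtime_total = realtime_requests * cost_per_request
    batch_total = batch_requests * cost_per_request * (1 - batch_discount)
    return realtime_total + batch_total
```

At $0.0100 per request and 100,000 requests a month, the projection is $1,000 with no batching, $750 if half of the traffic can be batched, and $500 if all of it can.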

Disclaimer: This AI API Cost Calculator is provided for general estimation and educational purposes. Prices were compiled from official provider pages in July 2026 and can change at any time; regional endpoints, data-residency options, batch and priority tiers, fine-tuning, and server-side tools (such as web search) may carry different or additional charges. INR figures are approximate conversions from USD, which is the billing currency for these APIs. Always confirm current pricing on the provider's official page before making purchasing or architectural decisions.

How to Use

1. Pick a modality: text/chat, image, video, or embeddings.
2. Choose a model, then enter your token counts (or images/seconds) and number of requests.
3. See cost per request and total, plus a comparison table of every model sorted cheapest-first.
4. Cross-check the number on the provider’s official pricing page before you rely on it.

Features

Text, image, video, and embedding pricing in one place
Compare every model side by side, sorted by cost
Handles input, output, and cached (prompt-cache) token pricing
Covers OpenAI, Claude, Gemini, DeepSeek, Mistral, Grok, and Cohere
100% client-side — nothing is sent to a server

FAQ

Use this free AI API cost calculator to estimate and compare pricing for large language models and other AI APIs. Enter your input and output token counts to see the cost per request and monthly total for OpenAI GPT, Anthropic Claude, Google Gemini, DeepSeek, Mistral, xAI Grok, and Cohere. It also covers image generation cost per image, video generation cost per second, and embedding cost per token — all calculated in your browser with no signup.

About AI API Cost Calculator

Estimate and compare the cost of AI API calls across every popular provider and model. Enter input and output token counts to see the cost per request and monthly total for OpenAI GPT, Anthropic Claude, Google Gemini, DeepSeek, Mistral, xAI Grok, and Cohere. Includes prompt-cache pricing, image generation cost per image, video generation cost per second, and embedding cost per token, with a comparison table sorted cheapest-first. All calculations run in your browser.

Processing Note

AI API Cost Calculator runs in your browser, so the input you enter is processed locally on this page and is not uploaded to a ToolMintX account.
