How Much Does Vapi Actually Cost Per Minute? Real Numbers Breakdown (2026)

Vapi's pricing page says 5 cents per minute. That is the Vapi platform fee, not your total cost. The actual cost of running a Vapi agent is that 5 cents plus the LLM, plus the TTS, plus the STT, plus phone charges. For most production configurations the real number is 15 to 28 cents per minute. This breakdown walks through every line item so you can estimate accurately.

The Four Cost Components

Every Vapi call has four core cost components: the Vapi platform fee (5 cents per minute, flat), the LLM fee (varies by model, typically 2 to 10 cents per minute), the TTS fee (varies by provider, typically 3 to 12 cents per minute), and the STT fee (typically 1 to 3 cents per minute). If the call runs over a phone line, add phone minutes via Twilio (about 1.3 cents per minute for US inbound).

Total for a typical production setup: 15 to 28 cents per minute. For a 5-minute call that is 75 cents to $1.40.
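As a rough sketch of that arithmetic, the snippet below sums the per-minute ranges listed above. The dictionary keys and function names are illustrative and not part of any Vapi API. Adding up every low end and every high end gives a wider band (about 12 to 31 cents) than the typical 15 to 28 cents, because real setups rarely pick the cheapest or the priciest option on every line.

```python
# Per-minute cost ranges in cents (low, high), copied from the component list above.
# Keys are illustrative labels, not identifiers from any Vapi API.
COMPONENTS = {
    "vapi_platform": (5.0, 5.0),
    "llm": (2.0, 10.0),
    "tts": (3.0, 12.0),
    "stt": (1.0, 3.0),
    "phone": (1.3, 1.3),  # optional: US inbound via Twilio
}


def per_minute_range(components=COMPONENTS):
    """Return (low, high) total cost per minute in cents."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


def per_call_range(minutes, components=COMPONENTS):
    """Return (low, high) total cost in cents for a call of the given length."""
    low, high = per_minute_range(components)
    return low * minutes, high * minutes
```

Calling `per_call_range(5)` gives the outer bounds for a 5-minute call. Swap in the rates for your own providers to narrow the range.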

LLM Costs in Detail

The LLM is where the biggest cost swings happen. GPT-4 Turbo runs about 8 to 12 cents per minute depending on how verbose the prompts are. GPT-4o is cheaper at roughly 3 to 5 cents per minute. GPT-4o-mini is dramatically cheaper at under 1 cent per minute. Claude 3.5 Sonnet lands at 4 to 6 cents per minute. Claude Haiku is similar to GPT-4o-mini. Groq-hosted Llama 3 models can be under half a cent per minute.

For simple use cases (appointment booking, hours inquiries), GPT-4o-mini is sufficient and cuts your LLM costs by roughly 90 percent versus GPT-4 Turbo. For complex reasoning, you probably want GPT-4o or Claude Sonnet. The model you pick is the biggest lever on per-call cost.

TTS Costs in Detail

TTS is the second biggest variable cost. ElevenLabs Turbo is around 8 to 12 cents per minute on their current pricing. ElevenLabs Flash is cheaper at 5 to 7 cents. PlayHT Turbo is similar to ElevenLabs. Cartesia Sonic is 4 to 7 cents. Deepgram Aura is 3 to 5 cents. OpenAI TTS is 3 to 5 cents. Azure TTS is under 2 cents.

For most production use cases, ElevenLabs or Cartesia sound best but cost more. Azure and OpenAI are cheaper and sound fine for simpler use cases. Test them on your actual agent before picking purely on cost.

Cost Per Minute by Configuration

Premium (GPT-4 Turbo + ElevenLabs + Deepgram): ~$0.28/min

Balanced (GPT-4o + ElevenLabs Flash + Deepgram): ~$0.20/min

Cost-optimized (GPT-4o-mini + Cartesia + Deepgram): ~$0.11/min

Ultra-cheap (Llama 3 + Azure TTS + Deepgram): ~$0.05/min

STT Costs in Detail

STT is usually the smallest line item. Deepgram Nova runs about 1 cent per minute. Whisper via OpenAI is around 1 cent per minute. Deepgram is faster and slightly more expensive when used streaming. Do not optimize here first; the dollars are elsewhere.

Phone Number Costs

If Vapi handles your number, it costs about 1 dollar per month per number plus pass-through minute costs of roughly 1.3 cents per minute for inbound. If you bring your own Twilio account, the rate is similar but you manage it directly. Over a busy month, phone costs are typically under 10 percent of your total bill (about 8 percent in the balanced configuration) unless you are doing outbound at high volume.

A Real 5-Minute Call Breakdown

Balanced configuration: Vapi 25 cents, GPT-4o 17 cents, ElevenLabs Flash 30 cents, Deepgram 5 cents, phone 7 cents. Total: 84 cents for a 5-minute call. At 500 calls per month, that is 420 dollars.

Cost-optimized configuration: Vapi 25 cents, GPT-4o-mini 3 cents, Cartesia 22 cents, Deepgram 5 cents, phone 7 cents. Total: 62 cents for a 5-minute call. At 500 calls per month, that is 310 dollars.
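The totals above can be reproduced with a few lines of arithmetic. This is a minimal sketch using the line items quoted in the two breakdowns. The dictionary keys and function names are made up for illustration.

```python
# Line items in cents for one 5-minute call, as quoted in the breakdowns above.
BALANCED = {
    "vapi": 25,
    "gpt_4o": 17,
    "elevenlabs_flash": 30,
    "deepgram": 5,
    "phone": 7,
}
COST_OPTIMIZED = {
    "vapi": 25,
    "gpt_4o_mini": 3,
    "cartesia": 22,
    "deepgram": 5,
    "phone": 7,
}


def call_cost_cents(line_items):
    """Total cost of one call in cents."""
    return sum(line_items.values())


def monthly_cost_dollars(line_items, calls_per_month):
    """Monthly spend in dollars for a given call volume."""
    return call_cost_cents(line_items) * calls_per_month / 100


for name, items in (("balanced", BALANCED), ("cost-optimized", COST_OPTIMIZED)):
    print(name, call_cost_cents(items), "cents per call,",
          monthly_cost_dollars(items, 500), "dollars per month")
```

Replacing the line items with figures from your own call logs gives a quick monthly projection at any call volume.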

Free Tier and Credits

Vapi gives you some free credits on signup for testing. The free tier is enough to build and validate an assistant but not enough for production use. Expect to deposit 50 to 100 dollars to run real production traffic while you iterate.

Ways to Cut Costs

Switch the LLM to GPT-4o-mini or Claude Haiku for any use case that does not require complex reasoning. Coming from GPT-4 Turbo, this alone saves roughly 7 to 11 cents per minute; coming from GPT-4o, the saving is closer to 2 to 4 cents. Shorten the system prompt by removing unnecessary examples and instructions, which reduces per-call LLM token usage. Pick cheaper TTS: Azure or OpenAI TTS sound fine for simple conversational agents.

Cache expensive tool calls at the start of the call. If a CRM lookup takes 2 seconds and returns the same data every time for a given phone number, cache it and save both latency and repeated API cost. For knowledge base queries, pre-embed content once instead of querying OpenAI every call.

Vapi Cost Share by Component (Balanced Config)

TTS (ElevenLabs Flash): 38%

Vapi platform fee: 28%

LLM (GPT-4o): 21%

Phone minutes: 8%

STT (Deepgram): 5%

Monitoring Costs in Production

Vapi's dashboard breaks down cost per call by line item. Check this weekly, not monthly. It is easy for a misconfigured assistant to double your costs overnight (e.g., someone accidentally switching the TTS to ElevenLabs Ultra Quality). Set a billing alert at 2x your average monthly spend so you catch runaway costs before they become expensive.
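The 2x alert rule is simple enough to script against whatever spend figures you export. The sketch below assumes you already have month-to-date and average monthly spend as plain numbers. The function name and the default multiplier are illustrative, and nothing here calls a Vapi API.

```python
def should_alert(month_to_date_spend, average_monthly_spend, multiplier=2.0):
    """Return True once spend reaches the alert threshold (default: 2x average)."""
    if average_monthly_spend <= 0:
        raise ValueError("average_monthly_spend must be positive")
    return month_to_date_spend >= multiplier * average_monthly_spend
```

With an average of 420 dollars per month, the alert fires at 840 dollars. Running the check daily rather than at month end leaves time to fix a misconfiguration.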

Track cost per successful outcome rather than cost per minute. A 7-minute call that converts to a booked appointment is a better use of money than a 3-minute call that ends with the caller hanging up confused, even though it costs more in absolute terms. Optimize for the business outcome, not the minute rate.
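A minimal sketch of that metric, assuming each call record carries its length in minutes and a flag for whether it converted. The field names and the flat per-minute rate are assumptions for illustration.

```python
def cost_per_outcome(calls, rate_cents_per_minute):
    """Total spend in cents divided by the number of converted calls.

    Returns None when no call converted, since the metric is undefined.
    """
    total_cents = sum(call["minutes"] for call in calls) * rate_cents_per_minute
    successes = sum(1 for call in calls if call["converted"])
    if successes == 0:
        return None
    return total_cents / successes


# Example records with hypothetical field names.
calls = [
    {"minutes": 7, "converted": True},   # long call, booked an appointment
    {"minutes": 3, "converted": False},  # short call, caller hung up
]
print(cost_per_outcome(calls, rate_cents_per_minute=17))
```

The failed call's minutes still count toward the total, so reducing confused hang-ups lowers the metric even when average call length goes up.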

