New · 5 companies · First observed June 2025 · Updated June 2026

The voice minute is unbundling into pass-through component meters

Quick answer

Self-serve voice-agent platforms are decomposing the per-minute price into a thin platform fee plus at-cost pass-through of the LLM, TTS and telephony underneath. Vapi's advertised $0.05/min is only its hosting fee; real deployments run ~$0.13–$0.33/min. Bringing your own components is a published discount lever, priced as precisely as Deepgram's $0.010/min BYO-TTS delta. A counter-camp (Cartesia, Bland, Tavus) still sells one bundled minute at one price.

As little as ~15% of a real Vapi minute is covered by the advertised $0.05 hosting fee

What's happening — and why

What's happening: the per-minute price of a voice agent is splitting into a bill of materials. Retell itemizes a minute into Voice Infra ($0.055/min) plus separately priced LLM, TTS, telephony and add-on lines, metered to the nearest second; Vapi charges $0.05/min for hosting and passes model and telephony through at cost; Synthflow abolished its platform fee entirely and bills the Voice Engine, the LLM and telephony as three separate per-minute meters; Deepgram publishes managed and BYO-component tiers side by side, exactly $0.010/min apart; ElevenLabs keeps cutting its Agents pricing while steering new subscriptions to pay-as-you-go minutes.

Why: the components of a conversation minute — speech-to-text, the LLM turn, text-to-speech, the phone line — have wildly different and fast-moving costs, and many buyers already hold their own Twilio contract or model key. Pricing each component separately lets the platform shrink its own fee to a defensible orchestration margin, transmit component-price deflation automatically, and turn bring-your-own keys into a published, predictable discount instead of a negotiation — voice AI's version of the BYOK split between orchestration and inference. The counter-camp (Cartesia, Bland, Tavus) bets the other way: buyers want a quote, not a bill of materials, so they sell one flat bundled minute.

How it works

[Figure: one voice-agent minute as a bill of materials. Lines: platform fee (advertised $0.05/min); LLM, TTS and telephony (passed through at cost); BYO Twilio → $0.00/min telephony. Real minute ≈ $0.13–$0.33 (Vapi); one swapped component = an exact $0.010/min delta (Deepgram).]
The per-minute price decomposes into a thin platform fee plus at-cost component meters; BYO keys zero individual lines.
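
The decomposition above can be sketched as a small calculation. This is a minimal illustration, not any vendor's billing logic: only the $0.05/min platform fee (Vapi's advertised hosting fee) comes from the figures above, while the LLM, TTS and telephony rates and the `minute_price` helper are hypothetical, chosen so the total lands inside the $0.13–$0.33 range.

```python
from decimal import Decimal

# Per-minute rates in USD. Only the platform fee comes from the text above
# (Vapi's advertised $0.05/min); the three component rates are hypothetical.
RATES = {
    "platform": Decimal("0.05"),
    "llm": Decimal("0.06"),        # assumed
    "tts": Decimal("0.04"),        # assumed
    "telephony": Decimal("0.01"),  # assumed
}


def minute_price(rates, byo=frozenset()):
    """Sum the per-minute lines; a BYO component is billed at $0.00 here."""
    total = Decimal("0")
    for component, rate in rates.items():
        if component not in byo:
            total += rate
    return total


all_in = minute_price(RATES)
with_own_twilio = minute_price(RATES, byo={"telephony"})
print(all_in, with_own_twilio)  # 0.16 0.15
```

A zeroed line only disappears from the platform's invoice; the buyer still pays their own carrier or model provider for that component, so a full comparison should add that cost back in.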

Evidence over time

5 supporting · 3 counter.

[Timeline chart, 2025–2026: supporting evidence plotted above the axis, counterexamples below.]

Evidence

Company | Date | What happened
Synthflow | Jun 2025 | Killed cheap flat tiers (Starter was $29/mo) and moved to no-platform-fee pay-as-you-go that bills the Voice Engine, the LLM, and telephony as three separate per-minute meters; BYO Twilio zeroes the telephony line.
Retell AI | Jun 2026 | Pricing fully unbundled: a voice-agent minute is itemized into Retell Voice Infra ($0.055/min) plus separately priced LLM, TTS, telephony and add-ons, metered to the nearest second.
Vapi | Jun 2026 | Headline $0.05/min is only the hosting fee: model and telephony costs are passed through at cost, so deployments run ~$0.13–$0.33/min depending on the stack assembled.
Deepgram | May 2026 | Voice Agent API sells managed vs BYO component tiers side by side ("Standard – BYO TTS" at $0.065/min vs $0.075/min fully managed), pricing the bundled component as an explicit per-minute delta.
ElevenLabs | May 2026 | Cut Conversational AI (Agents) pricing again while steering new subscriptions to pay-as-you-go per-minute agent usage; the platform-minute fee keeps thinning.

Counterexamples

  • Cartesia · Feb 2026 — Took Voice Agents to GA on a flat, bundled per-minute rate — one number, no component meters.
  • Bland AI · Dec 2025 — Moved from a single flat $0.09/min to tier-linked flat per-minute rates (free tier up 55% to $0.14/min) — repriced the bundle rather than unbundling it; compliance (HIPAA, SOC 2, PCI) stays included rather than itemized.
  • Tavus · Feb 2026 — Deliberately bundles three of its own models (Raven perception, Sparrow turn-taking, Phoenix rendering) into one billed conversation minute — the opposite bet: the minute as an indivisible multimodal unit.

Trivia

  • Synthflow (2025-06-24) shows how wide the unbundled minute swings: the same agent can cost roughly $0.11 to $0.24 per minute depending on stack picks — GPT-4.1 mini adds $0.02/min, full GPT-4.1 adds $0.05/min, and bringing your own Twilio drops the telephony meter to $0.00/min. The "price" of a Synthflow minute is really a configuration, not a number.

  • Vapi's headline $0.05/min (verified 2026-06-09) is only its hosting fee — model and telephony costs pass through at cost, so real deployments run roughly $0.13–$0.33/min. The advertised rate covers as little as 15% of what a buyer actually pays per minute.

  • Deepgram (2026-05) prices the BYO discount to the cent: its Voice Agent "Standard – BYO TTS" tier is $0.065/min versus $0.075/min fully managed — an exact $0.010/min line item for one swapped component, the clearest published unit price for unbundling in the corpus.

  • Retell AI (verified 2026-06-09) meters calls to the nearest second with no per-call rounding — but the meter keeps running during silence and hold, because the speech-to-text engine stays active and listening. Unbundling makes even dead air a priced component.


For buyers

On an unbundled platform the advertised per-minute rate is a floor, not a price — Vapi's $0.05/min headline covers as little as ~15% of a real deployment's $0.13–$0.33/min. Model an actual stack (which LLM, whose telephony, whose TTS) before comparing vendors, and treat the BYO levers as the negotiation: bringing your own Twilio zeroes Synthflow's telephony meter, and Deepgram prices a swapped TTS at exactly $0.010/min off. Check what the meter counts, too — Retell bills to the nearest second but keeps metering through silence and hold. On bundled platforms (Cartesia, Bland, Tavus) compare the all-in minute directly, but expect repricing risk to arrive as one opaque number: Bland's free-tier minute rose 55% in a single rewrite.
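
The two percentages in this advice are simple ratios, and a buyer can rerun them for any vendor. The sketch below uses only figures quoted above (Vapi's $0.05/min headline against the $0.13–$0.33/min range, and Bland's free-tier move from $0.09 to $0.14/min); the function names are illustrative.

```python
from decimal import Decimal


def headline_share(headline, real_minute):
    """Fraction of the real per-minute cost that the advertised rate covers."""
    return headline / real_minute


def pct_change(old, new):
    """Percentage change from an old rate to a new one."""
    return (new - old) / old * 100


VAPI_HEADLINE = Decimal("0.05")
share_at_high = headline_share(VAPI_HEADLINE, Decimal("0.33"))  # worst case
share_at_low = headline_share(VAPI_HEADLINE, Decimal("0.13"))   # best case
bland_rise = pct_change(Decimal("0.09"), Decimal("0.14"))

print(round(share_at_high, 2), round(share_at_low, 2), round(bland_rise, 1))
```

The headline covers about 15% of the minute only at the top of the range; at the bottom it covers closer to 38%, which is why the stack has to be configured before the comparison means anything.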

For vendors

Running the unbundled play needs component-level metering (per-second resolution at Retell), a rate card per swappable component, pass-through billing that tracks upstream LLM/TTS/telephony prices at cost, and BYO-key support priced as an explicit per-minute delta — Deepgram's $0.065 BYO-TTS vs $0.075 fully-managed pair is the template. The strategic cost is a thin, visible platform fee under constant deflation pressure: ElevenLabs has cut its Agents pricing repeatedly and Synthflow dropped its platform fee to zero, so margin has to come from volume and orchestration value. Bundling stays viable where the buyer wants one predictable number — Tavus deliberately bills three of its own models as a single conversation minute.
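
A rate card with a BYO tier priced as an explicit delta might be modeled roughly as follows. The two rates are Deepgram's published pair quoted above; the `Tier` class, the `byo_delta` and `monthly_saving` helpers, and the 10,000-minute volume are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Tier:
    name: str
    per_minute: Decimal                    # USD per minute
    byo_components: frozenset = frozenset()


# Rates as quoted in the text; the data model itself is hypothetical.
managed = Tier("fully managed", Decimal("0.075"))
byo_tts = Tier("Standard – BYO TTS", Decimal("0.065"), frozenset({"tts"}))


def byo_delta(managed_tier, byo_tier):
    """Per-minute discount for taking the BYO components off the meter."""
    return managed_tier.per_minute - byo_tier.per_minute


def monthly_saving(delta, minutes):
    """Total discount over a given number of billed minutes."""
    return delta * minutes


delta = byo_delta(managed, byo_tts)
saving = monthly_saving(delta, 10_000)  # 10,000 minutes is an assumed volume
print(delta, saving)
```

Publishing the delta this way lets a buyer weigh the discount against what their own TTS contract costs per minute, without a sales conversation.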

Outlook — what to watch

First logged in June 2026 off the wave-27 voice-AI intake, at a corpus of 207. Expect the unbundled camp to grow: component costs keep deflating and pass-through pricing transmits those cuts without repricing, while BYO discounts deepen as buyers consolidate their own Twilio and model contracts. The trend would sharpen if a bundled vendor breaks out component meters or if published BYO deltas become the norm; it would weaken if buyers reject bill-of-materials quotes and the flat-minute camp (Cartesia, Bland, Tavus) wins on predictability. Watch whether unbundlers add flat all-in SKUs on top of the meters: that would signal the bundle regaining ground.

Bottom line

Five voice-agent platforms — Synthflow, Retell, Vapi, Deepgram, ElevenLabs — now price the minute as a thin platform fee plus at-cost component pass-through, with BYO keys as a published discount lever, while Cartesia, Bland and Tavus keep selling one bundled minute. On unbundled platforms the advertised rate is a floor: price your real stack before comparing.

FAQ

Why does my voice AI agent cost more than the advertised per-minute rate?

Because on unbundled platforms the headline number is only the platform fee. Vapi's $0.05/min covers hosting alone — model and telephony pass through at cost, so real deployments run roughly $0.13–$0.33/min. Synthflow's minute swings from about $0.11 to $0.24 depending on stack picks. Always price a configured stack, not the headline.

What does BYO (bring your own) mean in voice AI pricing?

Plugging your own contract in for a component — your Twilio account for telephony, or your own TTS or LLM key — so that line drops off the platform's meter. Bringing your own Twilio zeroes Synthflow's telephony line, and Deepgram publishes the discount to the cent: $0.065/min with BYO TTS versus $0.075/min fully managed.

Which voice AI platforms unbundle the minute, and which bundle it?

In the corpus, Synthflow, Retell AI, Vapi, Deepgram and ElevenLabs price components separately or pass them through at cost. Cartesia, Bland AI and Tavus sell one flat bundled minute — Tavus deliberately meters its three-model pipeline (perception, turn-taking, rendering) as a single indivisible unit.

Does per-second billing mean I only pay for talk time?

No. Retell meters calls to the nearest second with no per-call rounding, but the meter keeps running during silence and hold because the speech-to-text engine stays active and listening. Unbundling makes even dead air a priced component.
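
As a rough illustration of that answer, the sketch below prices only the Voice Infra line at the $0.055/min quoted above and assumes the meter counts every connected second. The call durations are made up, and the other component lines (LLM, TTS, telephony) are left out.

```python
from decimal import Decimal

RETELL_VOICE_INFRA = Decimal("0.055")  # USD per minute, as quoted in the text


def infra_cost(talk_seconds, silent_seconds, rate_per_minute=RETELL_VOICE_INFRA):
    """Per-second metering with no per-call rounding.

    Silence and hold are billed like talk time, on the assumption that the
    meter runs for the whole connected call.
    """
    billed_seconds = talk_seconds + silent_seconds
    return rate_per_minute * billed_seconds / 60


# Hypothetical call: 95 seconds of conversation plus 25 seconds on hold.
with_hold = infra_cost(talk_seconds=95, silent_seconds=25)
talk_only = infra_cost(talk_seconds=95, silent_seconds=0)
print(with_hold)  # 0.11
```

Per-second billing removes rounding waste, but the billed duration is still the connected call, not the talk time.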
