Is Speechmatics free to use?

Yes. The free tier requires no credit card and includes 3,000 speech-to-text minutes (50 hours) per month (1,200 real-time and 1,800 batch) plus 1 million text-to-speech characters (~20 hours).

How much does Speechmatics speech-to-text cost?

On the pay-as-you-go Pro tier, the multilingual Batch Melia 1 model is $0.129/hr, batch and real-time standard accuracy are $0.24/hr, batch enhanced accuracy is $0.40/hr, and real-time enhanced accuracy is $0.43/hr. Usage is billed to the second based on the per-hour rate.

Does Speechmatics offer volume discounts?

Yes. Pro usage above 500 hours/month for a given speech-to-text type is automatically discounted 20%, with additional discounts available from 24,000 hours/year. Enterprise pricing is custom.

How is Speechmatics text-to-speech priced?

Text-to-speech is metered separately at $0.011 per 1,000 characters on the Pro tier, after the free 1 million characters per month.

What is the Speechmatics Startup Program?

It grants early-stage founders (typically under $10M raised) up to $50,000 in usage credits across real-time, batch, and text-to-speech APIs, plus onboarding and engineering support.

Speechmatics Pricing

AI Summary

Speechmatics prices its speech-to-text API by the hour: $0.129/hr for the multilingual Batch Melia 1 model, $0.24/hr for batch or real-time standard accuracy, $0.40/hr for batch enhanced, and $0.43/hr for real-time enhanced accuracy on the self-serve Pro tier.
A free tier gives every account 3,000 free speech-to-text minutes (50 hours) per month (1,200 real-time + 1,800 batch) plus 1 million free text-to-speech characters (~20 hours), with no credit card required.
Text-to-speech is metered separately at $0.011 per 1,000 characters on the Pro tier.
Speech-to-text bolt-ons (Translation, Summaries, Chapters, Sentiment, Topics) are billed as add-on per-hour rates from $0.12/hr to $0.65/hr on top of transcription.
Pro usage above 500 hours/month for a given speech-to-text type automatically earns a 20% volume discount; Enterprise pricing is fully custom with volume discounts and on-prem deployment.
A Startup Program grants early-stage founders (typically <$10M raised) up to $50,000 in usage credits across real-time, batch, and text-to-speech APIs.

Pricing summary

Speechmatics 2026 — usage-priced speech AI APIs

Pure usage: free monthly allowance, then per-hour speech-to-text and per-character text-to-speech; Enterprise is custom.

Tier	Price	Best for
Free	$0	Developers and early exploration
Pro (pay-as-you-go)	from $0.129 / hr	Demanding projects and growing needs
Enterprise	Custom	Unlimited scale, flexible deployments

Startup Program: up to $50,000 in usage credits for early-stage founders (typically <$10M raised). STT rates shown for the 'Hours' unit toggle at standard (non-Model-Training) rates; lowest Pro rate is the multilingual Batch Melia 1 model at $0.129/hr.

About

Speechmatics is a speech AI company based in Cambridge, UK, founded in 2006 as Cantab Research by speech-recognition researcher Dr. Tony Robinson. It sells automatic speech recognition (ASR, or speech-to-text) and text-to-speech (TTS) as developer-facing APIs. It raised a $62M Series B in June 2022 led by Susquehanna Growth Equity (with AlbionVC and IQ Capital), employs 100–250 staff, and reported ~£11.3M revenue in 2021. Its models cover 56+ languages for transcription (with 69 language pairs for AI translation) and emphasise accuracy across accents and dialects; the company markets reaching “over 4 billion people” through its language and accent coverage. The product line spans batch and real-time speech-to-text (powered by its Ursa generation of GPU-scaled models, launched March 2023), the Flow voice-agent API (2024), a low-latency text-to-speech engine, and a set of speech-to-text “bolt-ons” (translation, summaries, chapters, sentiment, topics).

The buyer is primarily a developer or product team building voice products — contact-center analytics, media captioning, medical and legal transcription, note-taking assistants, and real-time voice agents. Speechmatics positions itself directly against Deepgram and AssemblyAI (it publishes head-to-head comparison pages for both) on the axis of transcription accuracy and language breadth.

Pricing is split across three tiers — a free tier for exploration, a self-serve pay-as-you-go Pro tier, and a sales-led Enterprise tier with custom volume discounts and on-premises/private-cloud deployment. A Startup Program offers up to $50,000 in usage credits to early-stage founders, capped at roughly 20 startups per cohort.

Pricing summary : How Speechmatics meters speech-to-text and text-to-speech by usage

Speechmatics uses a pay-as-you-go usage model with a free monthly allowance, billed across three independent usage dimensions plus a custom Enterprise tier:

Speech-to-text (per hour): On the Pro tier, the multilingual Batch Melia 1 model is $0.129/hr, batch and real-time standard accuracy are $0.24/hr, batch enhanced accuracy is $0.40/hr, and real-time enhanced accuracy is $0.43/hr. Usage is metered to the second and billed at the per-hour rate. (The page also exposes a “Minutes” unit toggle that re-expresses the same rates per minute.)
Text-to-speech (per character): $0.011 per 1,000 characters on Pro, metered separately from speech-to-text.
Speech-to-text bolt-ons (per hour): Translation $0.65/hr, Summaries $0.12/hr, Chapters $0.40/hr, Sentiment $0.12/hr, Topics $0.20/hr — added on top of the underlying transcription rate.
Free allowance: Every account gets 3,000 free STT minutes/month (50 hrs: 1,200 real-time + 1,800 batch) and 1 million free TTS characters (~20 hrs) before any charges begin.

Volume discounts apply automatically: Pro usage above 500 hours/month for a given speech-to-text type is discounted 20%, with further discounts from 24,000 hours/year. Enterprise pricing is entirely custom (“volume discounts available”). Enabling Model Training (sharing anonymized data) earns a 33% usage discount.
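
To make the metering rule concrete, here is a minimal Python sketch of per-second billing at a per-hour rate. The rates are the Pro figures quoted above. The function name `stt_cost` and the rounding to whole cents are illustrative assumptions, not Speechmatics' documented invoicing logic.

```python
from decimal import Decimal, ROUND_HALF_UP

# Pro per-hour rates quoted in this article (USD per hour of audio).
PRO_RATES_PER_HOUR = {
    "batch_melia_1": Decimal("0.129"),
    "batch_standard": Decimal("0.24"),
    "batch_enhanced": Decimal("0.40"),
    "realtime_standard": Decimal("0.24"),
    "realtime_enhanced": Decimal("0.43"),
}

def stt_cost(sku: str, seconds: int) -> Decimal:
    """Cost of `seconds` of audio, metered per second at the per-hour rate."""
    rate = PRO_RATES_PER_HOUR[sku]
    cost = rate * Decimal(seconds) / Decimal(3600)
    # Rounding to whole cents is an assumption made for display purposes.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# 90 minutes of real-time enhanced audio.
print(stt_cost("realtime_enhanced", 90 * 60))
```

Ninety minutes of real-time enhanced audio works out to 1.5 × $0.43 = $0.645, which this sketch rounds to $0.65.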

What makes this different: Speechmatics prices a pure-usage API by the hour of audio rather than by tokens or API calls, separates “standard” and “enhanced” accuracy into distinct per-hour SKUs, and folds a generous free allowance into both the Free and Pro tiers so paid customers keep the same monthly freebie. It is a textbook freemium pricing model wrapped around a self-serve sales motion, with sales-led Enterprise as the only gated layer.

Pricing by product

Speech-to-Text (per-hour usage rates, Pro tier)

Tier	Price	Included	Key mechanics
Free	$0	3,000 STT minutes/mo (50 hrs: 1,200 real-time + 1,800 batch); 2 concurrent real-time sessions	No credit card; legacy page copy still reads “8 hrs free/month to try”
Pro	from $0.129 / hr	Same free monthly allowance, then metered per hour; 50 concurrent real-time sessions; 10 file jobs/sec	Pay-as-you-go, no commitment; billed to the second
Enterprise	Custom	Unlimited scale, no rate limits, custom models, multi-region cloud	Sales-led; volume discounts; on-prem / on-device

Speech-to-Text — Pro per-hour accuracy rates

SKU	Price	Notes
Batch Melia 1 (multilingual)	$0.129 / hr	Multilingual model; matches Standard accuracy; batch only
Batch standard accuracy	$0.24 / hr	Cost-control / turnaround model
Batch enhanced accuracy	$0.40 / hr	Best-in-class accuracy
Real-time standard accuracy	$0.24 / hr	Standard model (no turnaround benefit in real-time)
Real-time enhanced accuracy	$0.43 / hr	Highest-accuracy real-time
Volume discount	20% off	Automatic above 500 hr/month per STT type

Speech-to-Text bolt-ons (Pro, per hour)

Bolt-on	Price
Translation	$0.65 / hr
Summaries	$0.12 / hr
Chapters	$0.40 / hr
Sentiment	$0.12 / hr
Topics	$0.20 / hr

Text-to-Speech (per-character usage)

Tier	Price	Included	Key mechanics
Free	$0	1M characters/mo (~20 hrs); low-latency	English (more languages coming)
Pro	$0.011 / 1k chars	Same 1M free characters, then metered	Metered separately from speech-to-text
Enterprise	Custom	On-prem TTS, custom voice development	Sales-led, quoted

Sales motions across products: PLG / self-serve for the Free and Pro tiers (sign up and pay online, no commitment); sales-led for Enterprise (custom volume discounts, on-prem deployment, dedicated CSM).

Hidden costs : What real speech-to-text bills look like at volume

The headline $0.24/hr makes Speechmatics look almost free, but the real bill depends on which accuracy SKU you pick, which bolt-ons you switch on, and how far your usage runs past the 3,000-minute (50-hour) free allowance. Two archetypes show how the per-hour rates compound.

Archetype 1 — a voice-agent startup running real-time enhanced STT + TTS. A team running 800 hours/month of real-time enhanced transcription (for a live voice agent), layering Translation and Summaries bolt-ons, and synthesising ~5M characters of TTS replies:

Line item	Monthly cost
Real-time enhanced STT: 800 hrs × $0.43	$344.00
Less free allowance: 20 hrs real-time (1,200 min)	−$8.60
Volume discount: 20% off the 300 hrs over 500/mo	−$25.80
Translation bolt-on: 800 hrs × $0.65	$520.00
Summaries bolt-on: 800 hrs × $0.12	$96.00
Text-to-speech: (5M − 1M free) chars × $0.011/1k	$44.00
Total	$969.60

The transcription line ($344) is barely a third of the bill — the Translation bolt-on alone ($520) costs more than the underlying transcription, because each capability is a full per-hour rate stacked on top. This is the speech-AI version of the metered-add-on trap we cover in the hidden costs of usage-based pricing: the advertised base rate is a fraction of the realised bill once required capabilities are switched on.

Archetype 2 — a media-captioning team running batch enhanced. A post-production team transcribing 2,000 hours/month of pre-recorded media at enhanced accuracy, with Chapters and the free allowance applied:

Line item	Monthly cost
Batch enhanced STT: 2,000 hrs × $0.40	$800.00
Less free allowance: 30 hrs batch (1,800 min)	−$12.00
Volume discount: 20% off the 1,500 hrs over 500/mo	−$120.00
Chapters bolt-on: 2,000 hrs × $0.40	$800.00
Total	$1,468.00

The automatic 20% volume discount only applies to the hours above 500/month and only to the speech-to-text line (not the Chapters bolt-on), so a high-volume captioning workload still pays full freight on every add-on. Buyers modelling cost need to count each enabled capability as its own meter — see our primer on choosing the right value metric for why bundling vs unbundling these matters.
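
The Archetype 2 arithmetic can be sketched as a small itemised estimator. This is a hedged illustration only: `monthly_stt_bill` and its parameters are hypothetical names, and the ordering of credits (free allowance credited at the SKU rate, discount applied to gross hours above 500) simply mirrors the table above rather than any published Speechmatics formula.

```python
from decimal import Decimal

def monthly_stt_bill(hours, rate, free_hours, bolt_on_rates,
                     threshold_hours=500, volume_discount="0.20"):
    """Itemise one month's bill for a single speech-to-text type.

    Assumptions (mirroring the Archetype 2 table, not an official formula):
    the free allowance is credited at the SKU's rate, the volume discount
    applies only to transcription hours above the threshold, and bolt-ons
    are charged on every hour with no discount.
    """
    hours, rate = Decimal(hours), Decimal(rate)
    lines = {
        "transcription": hours * rate,
        "free_allowance": -(min(hours, Decimal(free_hours)) * rate),
        "volume_discount": -(max(Decimal(0), hours - Decimal(threshold_hours))
                             * rate * Decimal(volume_discount)),
    }
    for name, bolt_on_rate in bolt_on_rates.items():
        lines[f"bolt_on:{name}"] = hours * Decimal(bolt_on_rate)
    lines["total"] = sum(lines.values())
    return lines

# Archetype 2: 2,000 hrs of batch enhanced with the Chapters bolt-on.
bill = monthly_stt_bill(2000, "0.40", free_hours=30,
                        bolt_on_rates={"chapters": "0.40"})
for item, amount in bill.items():
    print(f"{item:>20}: {amount:>10.2f}")
```

Run as written, the total line reproduces the $1,468.00 in the table. Adding more entries to `bolt_on_rates` shows how quickly each extra meter moves the total.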

Pricing evolution : From per-hour transcription toward a full speech-AI platform

Speechmatics has cut its speech-to-text prices across three distinct waves while steadily turning a single batch-transcription SKU into a multi-product speech-AI platform. The direction of travel is unusual: where most AI vendors raised prices as models improved, Speechmatics pushed per-hour rates down by roughly 5× over three years — and further still for its cheapest multilingual model — as GPU-scaled Ursa models lowered its own inference cost. The 2026-07-06 cut of real-time enhanced to $0.43/hr and the launch of the $0.129/hr Batch Melia 1 model are the latest steps in that descent.

Cadence

Quarter	Price changes	Product / SKU additions	Notes
2022 Q4	0	0	Baseline snapshot: Free (4 hrs/mo) / On Demand / Enterprise; batch-only at $1.25/hr Standard, $1.90/hr Enhanced; 48 languages.
2023 Q1	1	1	2023-03: Real-Time Transcription added as a priced SKU ($1.65/hr Standard, $2.15/hr Enhanced), alongside the Ursa model launch and Translation.
2023 Q2–Q3	1	2	Major cut by 2023-07: Batch Standard $1.25→$0.80, RT Enhanced $2.15→$1.35; “Lite Mode” batch added at $0.30/hr; speech bolt-ons split out; free tier doubled to 8 hrs/mo.
2024 Q1	0	0	2024-03: prices stable; language coverage up to 50; 10 concurrent real-time sessions on PAYG.
2026 Q2	1	1	2026-06-04: tiers renamed Free / Pro; second deep cut (Batch Std $0.80→$0.24, RT Enh $1.35→$0.56); Lite Mode retired; Text-to-Speech launched at $0.011/1k chars; free allowance expanded to 2,400 min/mo.
2026 Q3	1	1	2026-07-06: real-time enhanced cut again $0.56→$0.43; Batch Melia 1 multilingual model added at $0.129/hr (now the lowest Pro rate); free allowance grew to 3,000 min/mo (50 hrs: 1,200 real-time + 1,800 batch).

Tracked range: 2022-10 – 2026-07. Quarters not listed (2023 Q4, 2024 Q2–Q4, 2025) showed no priced changes in the sampled Wayback snapshots.

Notable changes

2022-10 — Earliest archived structure: Free / On Demand / Enterprise, batch-only at $1.25–$1.90/hr, 48 languages (pricing page, Wayback 2022-10-13).
2023-03 — Real-time transcription priced separately ($1.65/$2.15 per hr); coincides with the Ursa launch, which Speechmatics claimed beat OpenAI Whisper by ~25% and Microsoft by ~22% on accuracy.
2023 mid-year — First deep price cut and unbundling: standard batch dropped to $0.80/hr, a $0.30/hr “Lite Mode” appeared, and Translation/Summaries/Sentiment/Topics/Chapters became per-hour bolt-ons (verified between the 2023-03 and 2023-07 Wayback snapshots).
2024 — Flow voice-agent API launched; pricing page otherwise stable through the 2024-03 snapshot.
2026-06 — Second deep cut to $0.24/$0.40/$0.56 STT, retirement of Lite Mode, launch of per-character Text-to-Speech, and a 5× expansion of the free allowance to 2,400 minutes/month.
2026-07 — Real-time enhanced trimmed a third time ($0.56→$0.43/hr), a multilingual Batch Melia 1 model added at $0.129/hr as the new lowest Pro rate, and the free allowance grew again to 3,000 minutes/month (50 hrs: 1,200 real-time + 1,800 batch).

The three-wave price decline in detail

Across the tracked range, real-time enhanced transcription fell from $2.15/hr (2023-03) → $1.35/hr (2023-07) → $0.56/hr (2026-06) → $0.43/hr (2026-07) — a 5.0× reduction — and batch standard fell from $1.25/hr → $0.80/hr → $0.24/hr, roughly 5.2×. The 2026-07-06 Batch Melia 1 model at $0.129/hr pushes the effective floor deeper still: against the 2022 batch-standard baseline of $1.25/hr that is nearly a 10× reduction for the cheapest multilingual SKU, and it undercuts the prior $0.24/hr entry rate by ~1.9×. Each cut tracked Speechmatics’ own published engineering work on moving inference to GPUs and shrinking cost-per-hour (its GPU-optimization writeup drew an 87-point Hacker News thread on 2025-05-21). Rather than capture that margin, Speechmatics passed most of it to customers as lower per-hour rates and a far larger free allowance — a deliberate land-grab against Deepgram and AssemblyAI, both of which now sit in the same $0.13–$0.45/hr band.
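
The reduction multiples quoted above are simple ratios of the earliest to the latest per-hour rate. A short Python check, using only the rates cited in this article (the helper name `fold_reduction` is illustrative):

```python
def fold_reduction(old_rate: float, new_rate: float) -> float:
    """How many times cheaper new_rate is than old_rate."""
    return old_rate / new_rate

# Per-hour rates (USD) from the tracked snapshots cited in this article.
comparisons = {
    "real-time enhanced, 2023-03 to 2026-07": (2.15, 0.43),
    "batch standard, 2022-10 to 2026-06": (1.25, 0.24),
    "Melia 1 vs 2022 batch standard": (1.25, 0.129),
    "Melia 1 vs prior $0.24 entry rate": (0.24, 0.129),
}

for label, (old, new) in comparisons.items():
    print(f"{label}: {fold_reduction(old, new):.1f}x")
```

The four ratios come out at 5.0×, 5.2×, 9.7× and 1.9×, matching the figures in the paragraph above.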

What’s unique : Per-hour accuracy SKUs and a data-for-discount trade

Per-hour-of-audio metering with distinct accuracy SKUs. Speechmatics splits “standard” and “enhanced” accuracy into separate per-hour line items, letting buyers trade cost against accuracy per job rather than locking into one model. This makes model quality itself a priced dimension — a $0.24/hr vs $0.40/hr batch choice — which is rare among AI APIs that usually meter only volume. It maps cleanly to how transcription buyers already budget (hours of audio, not tokens), a value-metric fit we explore in why per-token pricing confuses buyers.

A free allowance that survives into the paid tier — and grew over time. Both Free and Pro accounts get the same 3,000 STT minutes (50 hrs) + 1M TTS characters every month, so paying customers keep the freebie rather than losing it the moment a card is added. That allowance grew from 4 hrs/mo (2022) → 8 hrs/mo (2023) → 40 hrs/mo (2026-06) → 50 hrs/mo (2026-07), tracking the same generosity curve as the price cuts.

A data-for-discount trade instead of a cash discount. Enabling “Model Training” (letting Speechmatics use your anonymised audio to improve its models) applies a 33% usage discount that can be toggled off at any time. This is a non-cash lever — the buyer pays in data rather than dollars — and it directly funds the cost-curve improvements that let Speechmatics keep cutting prices. It’s one of the cleaner examples of pricing as a two-sided value exchange in the corpus.

Counter-cyclical price cuts as a competitive weapon. While most AI vendors raised prices as capability improved, Speechmatics drove per-hour rates down ~5× from 2022 to 2026 (and nearly 10× on the new $0.129/hr multilingual Batch Melia 1 model), passing GPU-efficiency gains to customers to undercut Deepgram and AssemblyAI on a published, public rate card. Transparent public per-hour pricing for every SKU and bolt-on, in a market where many enterprise ASR vendors still quote on request, is itself a differentiator.

Automatic, no-negotiation volume discounting. Pro usage above 500 hours/month for a given speech-to-text type is discounted 20% with zero action required, and Speechmatics’ example billing splits base-rate and discounted hours within the same month. This gives self-serve customers an enterprise-style tapered curve without a sales call, a usage-tier mechanic most APIs reserve for negotiated contracts.

Strengths & weaknesses

Strengths	Weaknesses
Transparent public per-hour rates for every STT SKU and bolt-on	Enterprise pricing fully opaque (“Custom” with no indicative band)
Generous free allowance (50 hrs/mo) carried into the paid Pro tier	TTS limited to English at launch (more languages “coming”)
Accuracy priced as its own axis (standard vs enhanced)	Bolt-ons stack as full per-hour rates — a real bill is far above base
Counter-cyclical price cuts (~5×; ~10× on Melia 1) keep it competitive vs Deepgram	Inconsistent free-allowance wording (cards “3,000 min / 50 hrs” vs legacy FAQ “8 hrs free”)
Automatic 20% volume discount with no sales call	Pro hard-capped at 6,000 hrs/mo, forcing a sales handoff at scale
Data-for-discount (33% Model Training) lever beyond cash discounts	Per-hour STT vs per-character TTS units don’t compare cleanly

Billing UX : Self-serve usage controls, unit toggles and automatic volume discounts

Hours ↔ Minutes unit toggle — the pricing page lets you re-express every speech-to-text rate per hour or per minute without changing the underlying price.
Model Training toggle (“Enable for 33% discount”) — opting into anonymized model training applies a 33% discount to usage; it can be turned off at any time for future usage.
Automatic volume discounting — Pro usage above 500 hours/month for a given speech-to-text type is discounted 20% with no action required; example billing splits base-rate and discounted hours within the same month.
Per-second metering, monthly invoicing — Pro customers are billed on the 1st of each month for the prior month’s usage, costed to the second at the per-hour rate.
Free-then-card upgrade path — accounts use the free allowance with no card on file; reaching the limit simply prompts adding a credit card in account settings to continue.
Pro usage cap — Pro tier usage is capped at 6,000 hours/month; beyond that customers are directed to sales for Enterprise terms.

Strategic wins : Decisions that strengthen the model

1. Transparent per-hour rates lower the barrier to evaluation

Publishing exact per-hour speech-to-text rates, including every bolt-on, lets developers self-qualify before talking to sales, which suits a developer-led buying motion in a market where many enterprise ASR vendors still price by quote only. Transparency is itself the wedge: a buyer can read the rate card, run the math, and start a free trial in one session. This is the product-led growth motion the rest of the AI-infra market is converging on.

2. Pricing the cost curve down instead of up

Most AI vendors treat capability gains as pricing power and raise rates; Speechmatics did the opposite, cutting STT ~5× from 2022 to 2026 (and nearly 10× on the new $0.129/hr multilingual Batch Melia 1 model) as GPU-scaled Ursa models lowered its own inference cost. Passing efficiency to customers — rather than banking it as margin — turned a premium ASR vendor into a price-competitive one against Deepgram and AssemblyAI without abandoning its accuracy story. It’s a textbook case of letting unit economics drive the price floor rather than the ceiling.

3. A generous free allowance that survives into the paid tier

Carrying the same 3,000 free STT minutes and 1M TTS characters into Pro removes the usual “free runs out the moment you pay” friction, and the allowance has grown over 12× since 2022. A paying customer never feels punished for upgrading, which lowers the psychological cost of the first paid invoice — the conversion-friction problem we unpack in designing free tiers that convert.

4. A Startup Program that seeds the top of the funnel

Up to $50,000 in usage credits for early-stage founders (typically <$10M raised) converts pre-revenue teams into Speechmatics-native architectures before they can afford Enterprise. Capping cohorts at ~20 startups keeps the credit liability bounded while planting switching costs early — the same land-and-expand logic behind cloud-credit programs.

5. A non-cash discount lever that funds future price cuts

The 33% “Model Training” discount lets price-sensitive buyers pay in anonymised data instead of dollars, and that data feeds the model-improvement flywheel that makes the next price cut affordable. It converts a cost (R&D data acquisition) into a customer-facing incentive — a rare two-sided lever most pricing teams overlook when they default to flat percentage discounts.

Areas to improve : Gaps and proposed fixes

1. Reconcile the free-allowance wording

The page states “3,000 minutes (50 hours)” on the cards, “1,200 + 1,800 minutes” in the comparison table, and “8 hrs free” in legacy FAQ copy carried over from the older 8-hr allowance. Proposed fix: state one consistent free figure (50 hrs / 3,000 min) across cards, comparison table, and FAQ so buyers don’t distrust the headline number. Inconsistent allowance language is one of the pricing-page trust killers we flag most often.

2. Surface the true cost of stacked bolt-ons

A buyer reading “$0.24/hr” has no way to know that adding Translation ($0.65/hr) more than triples the bill. Proposed fix: show a worked example — or a live estimate — of base STT + selected bolt-ons, the same way our hidden-cost analysis recommends, so the realised per-hour cost is visible before commitment rather than discovered on the first invoice.
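
As a rough illustration of what such a live estimate would compute, the sketch below stacks bolt-on rates on a base rate and reports the multiple. The rates are those listed in this article; `effective_rate` is a hypothetical helper, not part of any Speechmatics tooling.

```python
from decimal import Decimal

# Pro bolt-on rates quoted in this article (USD per hour of audio).
BOLT_ON_RATES = {
    "translation": Decimal("0.65"),
    "summaries": Decimal("0.12"),
    "chapters": Decimal("0.40"),
    "sentiment": Decimal("0.12"),
    "topics": Decimal("0.20"),
}

def effective_rate(base_rate, bolt_ons):
    """Return (stacked per-hour rate, multiple of the base rate)."""
    stacked = base_rate + sum(BOLT_ON_RATES[name] for name in bolt_ons)
    return stacked, stacked / base_rate

# Standard-accuracy transcription ($0.24/hr) with Translation switched on.
rate, multiple = effective_rate(Decimal("0.24"), ["translation"])
print(f"${rate}/hr, {multiple:.1f}x the base rate")
```

With Translation alone the stacked rate is $0.89/hr, about 3.7× the $0.24 headline.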

3. Surface Enterprise price anchors

Enterprise is “Custom” everywhere with no indicative band. Proposed fix: publish a starting volume-discount example (e.g. “from $0.18/hr at 24,000 hrs/yr”) so buyers can self-qualify before booking a demo, instead of forcing every large buyer into a sales conversation to learn whether the economics even work.

4. Expose the per-character TTS rate in the same unit toggle

The Hours/Minutes toggle covers speech-to-text, but text-to-speech stays priced per 1k characters. Proposed fix: add a per-hour-equivalent estimate for TTS so the two products compare cleanly and a voice-agent buyer can reason about combined STT+TTS cost in one mental unit.
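
A per-hour equivalent can be approximated from figures already in this article. The sketch below assumes the page's own "1M characters (~20 hrs)" conversion, i.e. roughly 50,000 characters per hour of speech. Real speaking rates vary, so treat the result as an estimate; the names used are illustrative.

```python
from decimal import Decimal

TTS_RATE_PER_1K_CHARS = Decimal("0.011")  # Pro rate quoted in this article
# Derived from the article's "1M characters (~20 hrs)" approximation.
CHARS_PER_HOUR = Decimal(1_000_000) / Decimal(20)

def tts_cost_per_hour(chars_per_hour=CHARS_PER_HOUR):
    """Approximate TTS cost for one hour of synthesised speech."""
    return TTS_RATE_PER_1K_CHARS * chars_per_hour / Decimal(1000)

print(f"~${tts_cost_per_hour():.2f} per hour of synthesised speech")
```

On that assumption TTS costs about $0.55 per hour of output, which puts it in the same unit as the $0.24–$0.43/hr speech-to-text rates.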

Monetization stack & signals : how Speechmatics builds & buys its revenue engine

Buys 3 · Builds 0 · 1 signal role

The read — where the monetization investment is going

Speechmatics buys its GTM engine (HubSpot CRM + Lemlist + Dreamdata) but names no billing/metering vendor — the meter behind its own per-hour pricing stays undisclosed. The signal worth watching is the RevOps hire below, building the bridge as it shifts "from a sales-led to a product-led growth motion."

Stack — build vs buy

Buys (vendor) · 3

HubSpot · CRM · Job post Jun 2026

“We're hiring a Hubspot Specialist to lead the management of Hubspot ... Owning HubSpot end-to-end, including, configuration, automations, workflows, routing, permissions, and data reliability”

Lemlist · Analytics · Job post Jun 2026

“Running outbound systems (Lemlist) and AI-driven GTM workflows”

Dreamdata · Analytics · Job post Jun 2026

“Managing marketing attribution (Dream Data), with cross-over into Salesforce and broader commercial tooling”

Unconfirmed · 1

Metering · inferred (no vendor named)

What the hiring reveals

Hubspot Specialist · RevOps · Jun 16, 2026

The canonical PLG-onto-a-sales-led-core inflection: a RevOps hire owning HubSpot and explicitly tasked with "blending CRM data with product usage signals" as the company shifts to product-led growth — the operational bridge between negotiated Enterprise deals and self-serve Pro usage.

“Speechmatics is transitioning from a sales-led to a product-led growth motion, which means this role has genuine scope to shape how that plays out operationally ... Maintaining data quality across the stack: deduplication, enrichment, and blending CRM data with product usage signals”

1 more matched role — supporting evidence

Customer Success Manager · Customer success · May 20, 2026

Signals reviewed Jun 2026 · derived from public job posts

Key takeaways

Match the value metric to how buyers already budget. Pricing by hour of audio rather than tokens or API calls matches how transcription buyers already think about cost, removing a translation step before purchase. Pick the unit your customer measures their own work in.
You can pass a cost curve down as a moat. Speechmatics cut prices ~5× as its inference got cheaper — and the July-2026 $0.129/hr multilingual Batch Melia 1 model reset the entry rate nearly 10× below its 2022 baseline — using efficiency gains to undercut rivals rather than bank margin. A viable strategy when your unit cost is falling faster than the market’s willingness to pay drops, and a fresh low-cost model (rather than another across-the-board cut) is a clean way to reset the headline price without repricing every existing SKU.
Make capability a priced axis, not just volume. Splitting standard vs enhanced accuracy into separate per-hour SKUs lets the same product serve cost-sensitive and quality-sensitive buyers without a discount negotiation.
A free tier that survives into paid removes upgrade friction. Carrying the identical monthly allowance into the paid plan means the first invoice never feels like a penalty — a small mechanic that materially lowers conversion anxiety. Speechmatics keeps widening that allowance in lockstep with its price cuts (up to 3,000 min / 50 hrs as of July 2026), so the free-to-paid on-ramp gets more generous with every repricing rather than being clawed back.
Non-cash discounts can fund your own roadmap. The 33% Model-Training trade buys the data that improves the models that justify the next price cut — discounts don’t have to be pure margin giveaways.

UBP implications

Accuracy can be a priced dimension. Splitting standard vs enhanced accuracy into separate per-hour SKUs shows model quality itself can be a billable usage axis, not just a feature gate. This widens the addressable market without adding tiers.
Bolt-on-per-meter packaging maximises revenue but raises bill-shock risk. Charging each capability (translation, summaries, chapters) as its own full per-hour rate captures more revenue per workload, but means the realised bill can be 2–3× the advertised base — vendors must surface stacked costs or risk churn.
Falling unit costs let usage-based vendors compete on price aggressively. When inference cost drops faster than perceived value, a UBP vendor can cut per-unit rates and expand free allowances to grab share, something a flat-subscription competitor can’t match without restructuring its whole model.

Sources

Speechmatics pricing page (accessed 2026-07-06)
Speechmatics Startup Program (accessed 2026-07-06)
Speechmatics speak-to-sales (Enterprise) (accessed 2026-07-06)
Speechmatics docs & changelog (accessed 2026-06-04)
Speechmatics pricing page, Wayback 2022-10-13 (historical) (accessed 2026-06-04)
Speechmatics pricing page, Wayback 2024-03-05 (historical) (accessed 2026-06-04)

Bottom line

Speechmatics sells speech AI the way its buyers consume it — by the hour of audio and the character of speech — with a free allowance generous enough to evaluate seriously and per-hour SKUs that price accuracy as its own dimension. Enterprise remains a custom black box, but the self-serve Pro tier is unusually transparent for the ASR market.

Pricing timeline : Major events on a vertical axis

Each milestone below corresponds to a public pricing change, product launch, or material adjustment. Major events use a filled marker; minor adjustments use a faded one.

Real-time enhanced cut to $0.43/hr, Melia 1 model, bigger free tier

Jul 2026

Current snapshot: real-time enhanced accuracy cut $0.56→$0.43/hr. New multilingual Batch Melia 1 model added at $0.129/hr (now the lowest Pro rate; Pro headline reads 'from $0.129/hr'). Free monthly allowance grew to 3,000 min (50 hrs): 1,200 real-time (20 hrs) + 1,800 batch (30 hrs). Batch Standard $0.24, Batch Enhanced $0.40, Real-time Standard $0.24 unchanged; TTS $0.011/1k chars, bolt-ons and 20%/33% discounts unchanged.

captured 2026-07-06

Pro repricing, second deep cut, and TTS launch

Jun 2026

Tiers renamed Free / Pro / Enterprise. Speech-to-text cut again — Batch Standard $0.80→$0.24, Real-time Enhanced $1.35→$0.56, Lite Mode retired (standard now $0.24). Free allowance expanded to 2,400 min/mo (40 hrs: 1,200 real-time + 1,200 batch) plus 1M TTS characters. Text-to-speech launched at $0.011/1k characters. Automatic 20% volume discount over 500 hr/mo; 33% discount for enabling Model Training; 56+ languages.

captured 2026-06-04

PAYG pricing stable; language coverage grows to 50

Mar 2024

Wayback snapshot (2024-03-05): Free / Pay As You Go / Enterprise unchanged on price (Lite $0.30, Batch Std $0.80, Batch Enh $1.04, RT Std $1.04, RT Enh $1.35); language count up to 50; 10 concurrent real-time sessions and 10 batch jobs/sec on PAYG.

Major price cut + Lite Mode + capability bolt-ons

Jul 2023

By the 2023-07-29 snapshot the per-hour rates had been cut sharply — Batch Standard $1.25→$0.80, Real-Time Enhanced $2.15→$1.35 — and 'Lite Mode' batch transcription appeared at $0.30/hr. Speech bolt-ons (Translation $0.65, Summaries/Sentiment $0.12, Topics $0.20, Chapters $0.40 per hr) became separately billed; free tier doubled to 8 hrs/mo (4hr batch + 4hr real-time). Restructure landed between the 2023-03 and 2023-07 snapshots, alongside Ursa GPU-scaled models.

Real-time transcription added as a priced SKU

Mar 2023

Wayback snapshot (2023-03-07): On-demand now splits Batch ($1.25/hr Standard, $1.90/hr Enhanced) from Real-Time ($1.65/hr Standard, $2.15/hr Enhanced). Coincides with the March-2023 Ursa model launch and Translation/auto-language-ID release. Still 48 languages, no TTS.

Three-tier on-demand pricing ($1.25–$1.90/hr)

Oct 2022

Wayback snapshot (2022-10-13): Free (4 hrs/mo) / On Demand / Enterprise (min 200 hrs/mo). Batch speech-to-text only — no real-time split, no TTS, no bolt-ons. On Demand priced at $1.25/hr Standard, $1.90/hr Enhanced; 48 languages.

Trivia

· Speechmatics meters Pro speech-to-text by the second, but quotes prices per hour — billing is rounded to the second based on the per-hour rate.
· Enabling 'Model Training' (letting Speechmatics use your anonymized data) earns a 33% usage discount — a data-for-credit trade rather than a cash discount.
· The free tier hands every account 3,000 free minutes (50 hours) per month split across real-time (1,200 min) and batch (1,800 min) speech-to-text, plus 1 million free text-to-speech characters.
