What is Sarvam AI's pricing model?

Pure usage-based, billed in Indian rupees. The LLM API charges per million tokens (Sarvam-30B ₹2.5 in / ₹10 out, Sarvam-105B ₹4 in / ₹16 out), speech-to-text is ₹30/hr, text-to-speech is ₹15–30 per 10K characters, and translation is ₹20 per 10K characters. You prepay credits that draw down as you use the APIs.

Does Sarvam AI offer a free tier?

Yes. New users get free bonus credits to test every API, with a 60 requests/minute rate limit and community support. The Starter plan card shows ₹300 bonus credits, although the amount is cited inconsistently elsewhere, from ₹100 in the docs to ₹1,000 in the marketing copy. Sarvam also open-weights its models (Sarvam-30B, Sarvam-105B; Sarvam-M is now deprecated) on Hugging Face for self-hosting.

How much does the Sarvam LLM API cost per token?

Sarvam-30B is ₹2.5 per 1M input tokens, ₹1.5 cached, ₹10 output. Sarvam-105B is ₹4 input, ₹2.5 cached, ₹16 output. At roughly ₹85 to the dollar that is about $0.03–$0.05 per 1M input and $0.12–$0.19 per 1M output — well under Western frontier API rates.

How does Sarvam price speech and translation APIs?

Saaras speech-to-text is ₹30/hour (₹45/hour with speaker diarization), billed per second rounded up. Bulbul text-to-speech is ₹15/10K characters (v2) or ₹30/10K characters (v3). Translation, transliteration and Mayura are ₹20/10K characters, language identification ₹3.5/10K, and document parsing ₹0.5/page.

What is Sarvam's connection to the IndiaAI Mission?

In 2025 Sarvam was selected under India's government-run IndiaAI Mission to build the country's sovereign foundation model, backed by a reported ~₹99 crore (~$11M) GPU-compute subsidy and access to 4,096 Nvidia H100 GPUs via Yotta. That mandate underpins the from-scratch Sarvam-30B and Sarvam-105B models the paid API now serves.

Can I self-host Sarvam's models instead of using the API?

Yes. Sarvam open-weights its models on Hugging Face: Sarvam-M (24B, built on Mistral Small, now marked deprecated) and the from-scratch Sarvam-30B and Sarvam-105B. You can download and self-host them, and pay Sarvam's per-token API only when you want hosted inference.

Sarvam AI Pricing

AI Summary

Sarvam AI prices a full sovereign Indic-AI stack purely on usage, denominated in Indian rupees (₹) rather than USD — a geo-native price sheet built for the India market.
The LLM API bills per million tokens: Sarvam-30B at ₹2.5 in / ₹1.5 cached / ₹10 out, and Sarvam-105B at ₹4 in / ₹2.5 cached / ₹16 out (~$0.03–$0.19 per 1M).
Speech APIs meter differently: Saaras speech-to-text at ₹30/hr (₹45/hr with speaker diarization) and Bulbul text-to-speech at ₹15/10K chars (v2) or ₹30/10K chars (v3).
Text tools bill per 10,000 characters — translation, transliteration and Mayura at ₹20, language ID at ₹3.5, and document parsing at ₹0.5/page.
Prepaid plans (Starter free +₹300 bonus, Pro ₹10,000 +₹2,000 bonus = 12,000 credits, Business ₹50,000 +₹12,500 bonus = 62,500 credits) buy higher rate limits; credits are universal across all APIs, never expire, and roll over indefinitely.

Pricing summary

Sarvam AI 2026 — a usage-priced sovereign Indic-AI stack, billed in rupees

Per-million-token LLM inference (Sarvam-30B/105B) sits alongside per-hour speech and per-character text APIs. Prepaid credits buy higher rate limits; new users start free.

Starter: Free (prototyping & testing)
Pro: ₹10,000 prepaid (startups & POCs)

About

Sarvam AI is a Bengaluru-based foundation-model company building a full-stack sovereign AI platform for India: open-weight Indic large language models plus speech (ASR/TTS), translation, transliteration, and document-digitisation APIs tuned for 22 Indian languages. It sells to Indian developers, enterprises, and the public sector — and its pricing reflects that focus, denominated entirely in Indian rupees rather than US dollars. Founded in August 2023 by Vivek Raghavan and Pratyush Kumar (both veterans of AI4Bharat and the EkStep/Aadhaar ecosystem; the legal entity is Axonwise Private Limited), Sarvam raised about $41M in December 2023 led by Lightspeed with Peak XV and Khosla — at the time the largest early-stage Indian AI round — and later a ~$200M Series B at an estimated ~$1.2B valuation, with reports of a further raise toward a ~$1.5B valuation.

What sets Sarvam apart from the rest of this foundation-model cluster is its government anchoring. In 2025 it was selected under India’s IndiaAI Mission to build the country’s first homegrown sovereign foundation model, backed by a reported ~₹99 crore (~$11M) compute subsidy and access to 4,096 Nvidia H100 SXM GPUs provisioned through Yotta Data Services. That mandate is the strategic spine of the whole price sheet: Sarvam is building India-trained models on subsidised national infrastructure, then metering hosted inference on those models at rupee-native rates a domestic buyer can budget against.

The model catalog runs from Sarvam-M — a 24B open-weights hybrid launched in May 2025, built on top of Mistral Small and fine-tuned for Indian languages, math, and code (it drew “foreign model in a desi kurta” criticism for that lineage) — to the Sarvam-30B (mixture-of-experts) and Sarvam-105B (activates ~9B params/token, 128K context) models launched in February 2026 and trained from scratch in Bengaluru. All are open-weighted on Hugging Face, so buyers can self-host the same models they could call over the API — the open-weight hedge familiar from Mistral AI, here wrapped in a sovereign-AI mission rather than a European one.

Pricing summary : How Sarvam AI’s pricing model works

Sarvam AI runs a pure usage-based model with a freemium on-ramp, and every meter is priced in Indian rupees. There is no per-seat subscription — you prepay credits and each API draws them down at a per-unit rate. The dimensions are:

LLM tokens — separate input, cached-input, and output rates per million tokens, varying by model: Sarvam-30B at ₹2.5 in / ₹1.5 cached / ₹10 out, Sarvam-105B at ₹4 in / ₹2.5 cached / ₹16 out.
Speech-to-text (Saaras) — billed per audio hour (per second, rounded up): ₹30/hr standard, ₹45/hr with speaker diarization. Translation-while-transcribing carries no extra charge over the base/diarized rate.
Text-to-speech (Bulbul) — billed per 10,000 characters: ₹15 for v2, ₹30 for v3.
Translation & text tools — per 10,000 characters: Sarvam Translate / Mayura / transliteration at ₹20, language identification at ₹3.5; document parsing (Sarvam Vision) at ₹0.5/page.
Prepaid plan tiers — Starter (free, ₹300 bonus credits, 60 req/min), Pro (₹10,000 prepay + ₹2,000 bonus = 12,000 credits, 200 req/min), Business (₹50,000 + ₹12,500 bonus = 62,500 credits, 1,000 req/min, now the “Most Popular” tier). Credits are universal across all APIs, never expire, and roll over indefinitely.
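To make the credit wallet concrete, the sketch below works out how many billing units a Pro wallet would cover on a few of the meters listed above. It assumes 1 credit equals ₹1, which is inferred from the "₹10,000 + ₹2,000 bonus = 12,000 credits" arithmetic rather than stated outright; the dictionary keys and the function name are illustrative, not part of any Sarvam API.

```python
from fractions import Fraction

# Rates copied from the price sheet, in rupees per billing unit.
# Key names are hypothetical labels chosen for this sketch.
RATES_INR = {
    "sarvam_30b_output_per_1m_tokens": Fraction(10),
    "saaras_stt_per_hour": Fraction(30),
    "bulbul_v3_tts_per_10k_chars": Fraction(30),
    "translate_per_10k_chars": Fraction(20),
}


def units_affordable(credits: int, rate_inr: Fraction) -> Fraction:
    """Billing units a wallet covers, ASSUMING 1 credit == 1 rupee."""
    return Fraction(credits) / rate_inr


pro_wallet = 10_000 + 2_000  # Pro prepay plus bonus credits

for meter, rate in RATES_INR.items():
    units = units_affordable(pro_wallet, rate)
    print(f"{meter}: {float(units):,.0f} units")
```

Under that assumption the same 12,000 credits stretch very differently by meter: 400 hours of standard transcription, but 1,200 blocks of one million Sarvam-30B output tokens.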

What makes this different: Sarvam prices its entire stack in rupees with no USD card at all — a deliberately sovereign, geo-native price built for the India market — and serves open-weight models trained on government-subsidised national compute, so the per-token rate is for hosted convenience, not the model itself.

Pricing by product

LLM API — chat & reasoning (per 1,000,000 tokens, INR)

Model	Input /1M	Cached input /1M	Output /1M	Key mechanics
Sarvam-30B	₹2.5	₹1.5	₹10	Mixture-of-experts; cost-sensitive default
Sarvam-105B	₹4	₹2.5	₹16	~9B active params/token, 128K context, 22 Indian languages

At roughly ₹85 to the dollar, Sarvam-30B input works out to a few US cents per 1M tokens and Sarvam-105B output to roughly twenty US cents per 1M — well under Western frontier API rates. Cached input is the discounted prompt-cache re-read rate.
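The dollar figures are a straight division by the exchange rate. A minimal sketch, assuming the same rough ₹85-per-dollar rate used in the text (a live rate will differ), with the rupee rates taken from the table above:

```python
INR_PER_USD = 85.0  # assumed exchange rate; the rough figure used in the text

# Rupees per 1M tokens: (input, cached input, output), from the price table.
LLM_RATES_INR = {
    "Sarvam-30B": (2.5, 1.5, 10.0),
    "Sarvam-105B": (4.0, 2.5, 16.0),
}


def to_usd(inr: float, inr_per_usd: float = INR_PER_USD) -> float:
    """Convert a rupee amount to US dollars at the given rate."""
    return inr / inr_per_usd


for model, (inp, cached, out) in LLM_RATES_INR.items():
    print(
        f"{model}: input ${to_usd(inp):.3f}, "
        f"cached ${to_usd(cached):.3f}, output ${to_usd(out):.3f} per 1M tokens"
    )
```

Passing a different `inr_per_usd` shows how sensitive the dollar view is to FX movement, which is the conversion burden an INR-only sheet leaves with non-Indian buyers.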

Speech APIs (INR)

Service	Price	Key mechanics
Saaras speech-to-text	₹30 / hour	Billed per second, rounded up
STT + speaker diarization	₹45 / hour	Adds speaker labels
STT + translation	₹30 / hour	Transcribe + translate, no surcharge over base
STT + translation + diarization	₹45 / hour	Full pipeline
Bulbul text-to-speech v3	₹30 / 10K chars	Latest voices (beta pricing)
Bulbul text-to-speech v2	₹15 / 10K chars	Prior generation, half the v3 rate
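The "billed per second, rounded up" mechanic in the table can be sketched as follows. This assumes the rounding is applied once per clip, which the price sheet does not spell out; the function name is illustrative.

```python
import math


def stt_cost_inr(duration_seconds: float, rate_per_hour: float = 30.0) -> float:
    """Cost of one clip, billed in whole seconds with partial seconds rounded up.

    ASSUMPTION: rounding happens per clip, not per batch or per month.
    """
    billed_seconds = math.ceil(duration_seconds)
    return billed_seconds * rate_per_hour / 3600


# A 12.3-second clip is billed as 13 seconds.
print(stt_cost_inr(12.3))                      # standard rate, ₹30/hour
print(stt_cost_inr(12.3, rate_per_hour=45.0))  # with diarization, ₹45/hour
```

Per-second granularity means the rounding overhead is at most one second per clip, instead of the up-to-59-second overhead a per-minute block would add.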

Text & document APIs (per 10,000 characters unless noted, INR)

Service	Price	Key mechanics
Sarvam Translate V1	₹20 / 10K chars	Indic translation
Translate Mayura V1	₹20 / 10K chars	Translation model
Transliterate	₹20 / 10K chars	Script conversion
Language identification	₹3.5 / 10K chars	Cheapest meter on the sheet
Doc digitisation (Sarvam Vision)	₹0.5 / page	Max 10 pages per job

Sales motions across products: PLG / self-serve for every API and the Starter, Pro, and Business prepaid plans; sales-led only for Enterprise sovereign/on-prem deployments with custom rate limits, SLAs, and data-residency controls.

Hidden costs : What Sarvam AI users actually pay

Sarvam’s per-unit rates are unusually low and fully public, but the real bill is shaped by three things the headline rate doesn’t show: the output-token premium on the LLM, the fact that speech and text meters are denominated differently (per hour vs per 10K chars), and the rate-limit ceiling that effectively forces a prepaid upgrade for production traffic. Two archetypes show how the total assembles.

Archetype 1 — a Hindi voice-assistant startup. Transcribing 2,000 hours of call audio a month with Saaras (with diarization), generating 50M characters of Bulbul v3 voice replies, and routing 40M input plus 10M output tokens a month through Sarvam-30B for the conversation logic.

Line item	Monthly cost
Saaras STT (diarized) — 2,000 hrs @ ₹45/hr	₹90,000
Bulbul v3 TTS — 50M chars @ ₹30 / 10K	₹1,50,000
Sarvam-30B input — 40M tok @ ₹2.5/1M	₹100
Sarvam-30B output — 10M tok @ ₹10/1M	₹100
Estimated total	~₹2,40,200/mo (~$2,825)

The lesson: for a voice product the speech meters dominate, not the LLM. Tokens are almost free at these rates — the bill is overwhelmingly TTS characters and ASR hours, so the value metric to optimize is audio minutes and spoken characters, not prompt size. A product team must read all three meters together because they scale on completely different units.
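The Archetype 1 table can be reproduced with a few lines, which also makes the share of each meter explicit. The function and key names are illustrative; the rates and volumes are the ones quoted above.

```python
def monthly_bill_inr(stt_hours, tts_chars, input_tokens, output_tokens):
    """Archetype 1 bill: diarized Saaras, Bulbul v3, and Sarvam-30B."""
    items = {
        "stt_diarized": stt_hours * 45,               # ₹45 per audio hour
        "tts_v3": tts_chars / 10_000 * 30,            # ₹30 per 10K characters
        "llm_input": input_tokens / 1_000_000 * 2.5,  # ₹2.5 per 1M tokens
        "llm_output": output_tokens / 1_000_000 * 10,  # ₹10 per 1M tokens
    }
    items["total"] = sum(items.values())
    return items


bill = monthly_bill_inr(
    stt_hours=2_000,
    tts_chars=50_000_000,
    input_tokens=40_000_000,
    output_tokens=10_000_000,
)

for name, inr in bill.items():
    print(f"{name}: ₹{inr:,.0f} ({inr / bill['total']:.2%} of total)")
```

The two speech lines account for ₹2,40,000 of the ₹2,40,200 total, so changing the token volumes barely moves the bill while changing audio hours or spoken characters moves it almost one for one.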

Archetype 2 — a translation pipeline at production scale. Translating 500M characters of catalog/content a month via Sarvam Translate, with language ID on each item.

Line item	Monthly cost
Sarvam Translate — 500M chars @ ₹20 / 10K	₹10,00,000
Language ID — 500M chars @ ₹3.5 / 10K	₹1,75,000
Estimated total	~₹11,75,000/mo (~$13,800)

Here the surprise is the rate limit, not the rupees: 500M chars/month at production cadence can blow past the Starter 60 req/min and even the Pro 200 req/min ceiling when items are short or traffic is bursty, so the real cost of “scale” is moving to the Business plan (₹50,000 prepay, 1,000 req/min) or an Enterprise quote. The per-character price is cheap; the throughput tier is the gating cost.
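Whether a given volume hits a rate-limit ceiling depends on how many requests it turns into, which the price sheet alone does not tell you. The sketch below estimates the average request rate for the translation archetype under a hypothetical item size (100 characters per catalog item, one translate call plus one language-ID call each, spread evenly over a 30-day month) and picks the smallest plan whose limit covers it. All function names and the item-size figures are assumptions for illustration.

```python
import math

# Requests-per-minute limits from the plan tiers, smallest first.
PLAN_LIMITS = [("Starter", 60), ("Pro", 200), ("Business", 1_000)]
MINUTES_PER_MONTH = 30 * 24 * 60  # assumes a 30-day month


def average_req_per_min(chars_per_month, chars_per_request, calls_per_item=1):
    """Average request rate if traffic were spread evenly across the month."""
    requests = math.ceil(chars_per_month / chars_per_request) * calls_per_item
    return requests / MINUTES_PER_MONTH


def smallest_plan(req_per_min):
    """First plan whose rate limit covers the load, else an Enterprise quote."""
    for name, limit in PLAN_LIMITS:
        if req_per_min <= limit:
            return name
    return "Enterprise"


# Hypothetical workload: 100-character items, two API calls per item.
load = average_req_per_min(500_000_000, 100, calls_per_item=2)
print(round(load, 1), smallest_plan(load))

# Same volume batched into hypothetical 1,000-character requests.
batched = average_req_per_min(500_000_000, 1_000, calls_per_item=2)
print(round(batched, 1), smallest_plan(batched))
```

These are averages; real traffic is bursty, so the peak rate that actually has to fit under the limit will be higher than either figure.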

Want to estimate your own Sarvam AI bill? Use the Sarvam AI pricing calculator to model your costs across tokens, audio hours, and characters.

Pricing evolution : Sarvam AI pricing history and changes

Sarvam’s pricing followed its model roadmap. There was no public API price until the first hosted model shipped; the rupee-native usage sheet appeared with Sarvam-M in May 2025 and broadened as the from-scratch Sarvam-30B/105B models landed in early 2026. The milestones below are reconstructed from primary announcements and contemporaneous press; quarter-level cadence will be tightened with archived snapshots on a later pass.

Cadence

Quarter	Price changes	Product / SKU additions	Notes
2023 Q4	0	0	~$41M seed + Series A; pre-product, no public pricing
2025 Q2	1	1	2025-05 Sarvam-M ships; first public INR usage API + free credits
2025 Q2	0	0	2025-04/05 Selected under IndiaAI Mission for the sovereign model
2026 Q1	0	1	2026-02-18 Sarvam-30B + Sarvam-105B (from-scratch) become the priced API models
2026 Q2	0	0	Live INR price sheet verified across LLM, speech, and text meters
2026 Q3	0	0	2026-07-23 Prepaid credits repackaged — Pro/Business bonuses raised, “Most Popular” moves to Business; per-unit rates held

Tracked range: 2023 Q4–2026 Q3. Quarters not listed had no publicly announced price or SKU change. Per-snapshot price reconstruction is deferred to a later pass; the api-pricing page now resolves (it previously returned 404), and prices are read from the api-pricing page and the docs.

Notable changes

2023-12: ~$41M seed + Series A led by Lightspeed (Peak XV, Khosla), the largest early-stage Indian AI round at the time. No public pricing yet (TechCrunch).
2025-04/05: Selected under India’s IndiaAI Mission to build the sovereign foundation model; reported ~₹99 crore (~$11M) compute subsidy + 4,096 H100 GPUs via Yotta (Inc42).
2025-05-23: Sarvam-M (24B open-weights, built on Mistral Small) launches with a public, INR-denominated usage API and free starter credits, the first priced surface.
2026-02-18: Sarvam-30B + Sarvam-105B (from-scratch, fully domestic) launch and become the models behind the paid per-token API (Sarvam-30B ₹2.5/₹10, Sarvam-105B ₹4/₹16 per 1M) (TechCrunch).
2026: ~$200M Series B at an estimated ~$1.2B valuation (Peak XV, Lightspeed), with reports of a further raise toward ~$1.5B.
2026-07-23: Prepaid credits repackaged; per-unit rates held. Every API meter stayed flat (LLM ₹2.5–₹16/1M, STT ₹30–45/hr, TTS ₹15–30/10K, translate ₹20/10K, Vision ₹0.5/page), but the credit ladder that gates rate limits was restructured: the Pro bonus doubled (₹1,000→₹2,000; 11,000→12,000 credits) and the Business bonus rose ~67% (₹7,500→₹12,500; 57,500→62,500 credits). That lifts the bonus to 20% of the prepay on Pro and 25% on Business, an effective discount of roughly 17% and 20% on the credits received. The “Most Popular” badge moved from Pro (₹10,000) to the ₹50,000 Business tier, and Business support was trimmed from Slack + dedicated engineer to email: a coherent move to pull production buyers up the ladder while lowering the cost to serve that tier. Starter now shows ₹300 bonus credits; the api-pricing page now resolves (previously 404); Sarvam-M is marked deprecated and Bulbul v3 is flagged as beta pricing.
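A prepaid bonus can be read two ways: as extra credits relative to the rupees paid, or as a discount on the credits received. The sketch below computes both for the current Pro and Business bonuses; the function name is illustrative and the figures are the ones quoted in the entry above.

```python
def bonus_and_discount(prepay_inr, bonus_inr):
    """Return (total credits, bonus as share of prepay, discount on credits)."""
    credits = prepay_inr + bonus_inr
    bonus_rate = bonus_inr / prepay_inr  # extra credits per rupee paid
    discount = bonus_inr / credits       # share of credits received for free
    return credits, bonus_rate, discount


for plan, prepay, bonus in [("Pro", 10_000, 2_000), ("Business", 50_000, 12_500)]:
    credits, rate, disc = bonus_and_discount(prepay, bonus)
    print(f"{plan}: {credits:,} credits, {rate:.0%} bonus, {disc:.1%} discount")
```

The two readings differ because the denominators differ: a 20% bonus on Pro is about a 16.7% discount on the credits received, and a 25% bonus on Business is a 20% discount.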

From fine-tune to from-scratch, in detail

The most consequential shift in Sarvam’s short history is not a price change but a provenance change that the price sheet now rests on. Sarvam-M (May 2025) was a 24B open-weights fine-tune built on top of Mistral Small — capable on Indic benchmarks but, critics argued, “a foreign model in a desi kurta.” Under the IndiaAI sovereign mandate, the February 2026 Sarvam-30B and Sarvam-105B were trained from scratch in Bengaluru on subsidised national H100 compute. The pricing implication is sovereignty-as-credibility: the same rupee-native per-token rates now buy inference on a genuinely India-built model, which is the exact value proposition a public-sector or sovereignty-conscious enterprise buyer is paying the platform to stand behind.

What’s unique : Sarvam AI’s distinctive pricing mechanics

1. Sovereign, rupee-native pricing — no USD card at all. Almost every foundation-model lab in this corpus prices in dollars (and many geo-lock a rupee or euro view behind a toggle). Sarvam does the opposite: its entire price sheet is denominated in Indian rupees with no USD option, because the buyer it is built for — Indian developers, enterprises, and the government — budgets in rupees. That is not a cosmetic choice; it is the monetization expression of an IndiaAI-Mission sovereign mandate. The pricing is the positioning: a national AI stack priced for the nation that subsidised it.

2. Three differently-denominated meters in one stack. Sarvam bills LLM inference per million tokens, speech per audio hour, and text/TTS per 10,000 characters — three distinct units in a single platform. For a multimodal Indic product (voice assistant, dubbing, doc pipeline) the cost driver shifts surface to surface: tokens are nearly free, but audio hours and spoken characters dominate. Buyers have to model each meter on its own unit, which makes Sarvam’s bill behave very differently from a token-only lab like OpenAI.

3. Open weights on subsidised national compute, then metered hosting. Like Mistral AI, Sarvam open-weights its models (Sarvam-M, 30B, 105B) so buyers can self-host — but the models were trained on government-subsidised H100s under the IndiaAI Mission. So the per-token rate isn’t for the model (which you can download free) and isn’t even fully for the compute (which was subsidised); it’s for managed inference plus the sovereignty assurance, a structurally cheaper-to-produce inference product than a privately-funded lab’s.

Strengths & weaknesses

Strengths	Weaknesses
Fully public, rupee-native per-unit rates across LLM, speech, and text — no “contact sales” wall for any meter	INR-only sheet with no USD card adds friction for global buyers who must convert and watch FX
Token rates are strikingly low (Sarvam-30B ₹2.5/1M in, ~$0.03) versus Western frontier APIs	Output-token premium on the LLM (₹16 vs ₹4 input on 105B, 4×) favors short-answer workloads
Open weights on Hugging Face let buyers self-host the same models — a credible lock-in hedge	Speech and text use different meters (per hour vs per 10K chars), making blended cost harder to predict
Government-anchored sovereign-AI mandate + subsidised compute keep inference cheap and credible for Indian buyers	Rate limits (60/200/1,000 req/min) gate throughput, so production scale forces a prepaid upgrade
Credits never expire and roll over indefinitely — no use-it-or-lose-it prepaid trap	Enterprise sovereign/on-prem deployments are fully sales-gated with no public floor price
Free starter credits + a free open-weight path make the on-ramp genuinely zero-cost	Early model (Sarvam-M) was a Mistral-Small fine-tune, drawing “desi kurta” provenance criticism

Billing UX : Sarvam AI billing controls and transparency

Prepaid credit wallet — you top up a rupee credit balance (Starter free + ₹300 bonus, Pro ₹10,000 + ₹2,000 bonus = 12,000 credits, Business ₹50,000 + ₹12,500 bonus = 62,500 credits) and every API call draws it down at its per-unit rate; credits are universal across all APIs and there is no monthly seat commitment.
Non-expiring, rolling credits — credits “never expire and roll over indefinitely,” so unused balance is never forfeited — a buyer-friendly contrast to the typical expiring-credit model.
Per-second billing on speech — speech-to-text is billed per second, rounded up, rather than per whole minute or hour block, so short clips aren’t over-charged.
Rate-limit tiers as the upgrade lever — instead of the unit price changing, the prepaid tier raises requests/minute (60 → 200 → 1,000), so throughput, not unit price, is what you buy up; the “Most Popular” default sits on the ₹50,000 Business tier.
Free self-host path — open weights on Hugging Face let cost-sensitive teams run inference themselves and skip the API meter entirely for non-managed workloads.
Tiered support — community support on Starter and email support on both Pro and Business (Business is displayed with “Slack + Solutions Engineer” in the page’s legacy plan copy, but the current pricing cards list email support); Enterprise/Contact-Sales adds custom rate limits, VPC/on-prem, and dedicated support.

Strategic wins : Why Sarvam AI’s pricing decisions worked

1. Pricing the nation’s stack in the nation’s currency

By denominating the entire sheet in rupees with no USD card, Sarvam turned a sovereign-AI mandate into a pricing strategy. For the Indian developer or public-sector buyer it is competing for, a rupee-native rate removes FX friction and signals “built for you.” It is the monetization expression of the IndiaAI Mission selection — the pricing reinforces the positioning rather than fighting it. See usage-based pricing strategy for why aligning the meter to the buyer’s mental model wins.

2. Subsidised compute funds aggressive unit economics

Because the models were trained on government-subsidised H100 compute under the IndiaAI Mission, Sarvam can price hosted inference far below privately-funded frontier labs (Sarvam-30B input at ~$0.03/1M). That lets it under-price global APIs on the exact languages it specialises in, turning a national-infrastructure advantage into a durable price edge. This mirrors the shift away from rigid per-seat licensing toward cost-following usage rates.

3. Open weights plus non-expiring credits lower the on-ramp to zero

A free open-weight self-host path, free starter credits, and credits that never expire combine into an unusually low-risk on-ramp. A developer can prototype free, self-host if they prefer, and never lose prepaid balance, which de-risks adoption in a price-sensitive market. Choosing a durable usage metric and pairing it with a forgiving credit model is what makes that on-ramp stick.

Areas to improve : Gaps in Sarvam AI’s pricing approach

1. Offer a USD view for global buyers

The INR-only sheet is perfect for India but adds conversion friction for the diaspora developers, NRIs, and global teams who want Indic-language inference. A USD toggle (as most peers offer) would widen the addressable market without diluting the sovereign positioning — the rupee can stay the default. The absence today risks reading as “domestic-only,” which understates the models’ reach.

2. Make the blended multi-meter cost legible up front

Because LLM, speech, and text bill on three different units, a multimodal product’s total is hard to forecast from the price sheet alone. A worked “estimated cost per voice session” or per-document calculator surfaced in the dashboard would prevent the bill-shock and unpredictability that mixed meters invite, and help buyers self-select a prepaid tier confidently.

3. Expose an Enterprise / sovereign-deployment anchor

On-prem and sovereign deployments are fully sales-gated with no public starting point. Given that the IndiaAI mandate makes public-sector and regulated buyers the core market, a published floor or a worked deployment example would shorten procurement cycles for exactly the institutional buyers Sarvam is positioned to win. Compare how other AI companies stage enterprise transparency.

Monetization stack & signals : how Sarvam AI builds & buys its revenue engine

Buys: 0 · Builds: 0 · Open roles: 13

The read — where the monetization investment is going

Sarvam's monetization stack is still greenfield: a May 2026 "Product Manager, Monetization & Retention" req wants someone who "has debated Lago vs Orb vs Stripe Billing and has opinions" — the billing/metering layer for its prepaid INR credit wallet is being chosen now, not yet operated. Investment is clearly flowing into the revenue engine: ~5 billing/API-platform and data-engineering roles, plus monetization/Studio GTM and a retention-heavy product cluster.

Stack — build vs buy

Unconfirmed · 3

Usage billing / metering layer (Billing; inferred from a job post, May 2026): “You've debated Lago vs Orb vs Stripe Billing and have opinions”
Product analytics (Analytics; inferred from two job posts, May 2026): “Mixpanel, PostHog, Amplitude, whatever the stack is”
CRM (CRM; inferred from a job post, May 2026): “You are highly proficient with Salesforce or HubSpot at an operational level”

Open roles in the revenue & lifecycle org — 13

View open roles

GTM & Strategy, Sarvam Studio · Monetization, Growth · Jun 3, 2026
Head of Growth Marketing · Retention, Growth · Jun 3, 2026
Product Manager, Monetization & Retention · Monetization, Retention · May 25, 2026
Product Manager, Growth · Retention · May 25, 2026
Staff Engineer, API Platform · Billing engineering · May 21, 2026
Staff Data Engineer · Billing engineering · May 21, 2026
Backend Engineer - Studio Media Platform · Billing engineering · May 21, 2026
Engineering - Full Stack AI Engineer · Billing engineering · May 21, 2026
GTM Manager On-Device AI · RevOps · May 13, 2026
Engagement Principal, Chanakya · RevOps · May 13, 2026
Engagement Manager, Chanakya · Customer success · seen Apr 17, 2026
Account Manager · Customer success · seen Apr 17, 2026
GTM/Product, Vision · Customer success · seen Apr 17, 2026
+9 more matched roles

Signals reviewed Jun 2026 · derived from public job posts

Job postings fill and close over time — once a posting is filled we keep it as a dated citation (the quoted evidence remains); use View open roles for current listings.

Key takeaways

Price in the buyer’s currency, literally. Sarvam’s rupee-only sheet is a deliberate sovereign signal — the pricing is the positioning. When your strategic edge is “built for this market,” denominating in that market’s currency reinforces it more than any tagline.
Subsidised inputs become a pricing moat. Government-subsidised compute under the IndiaAI Mission lets Sarvam under-price global APIs on Indic workloads. A structural cost advantage upstream shows up as a durable price advantage downstream.
Multi-meter stacks need multi-meter thinking. Tokens, audio hours, and characters scale on different units; for a voice or translation product the LLM is nearly free and the speech/text meters dominate. Model every meter on its own unit, not the headline token rate.
A forgiving credit model lowers adoption risk. Free starter credits, a free self-host path, and credits that never expire combine into a near-zero-risk on-ramp — decisive in a price-sensitive market.
Provenance is a value metric. Moving from a Mistral-Small fine-tune to from-scratch domestic models is what lets the same rupee rates carry a sovereignty assurance that public-sector buyers actually pay for.

UBP implications

Currency and geo-denomination are pricing levers, not just settings. Sarvam shows that choosing to price natively in the buyer’s currency — and declining to bolt on a USD card — can be a strategic statement. UBP practitioners targeting a specific market should treat denomination as part of the value proposition, not an afterthought.
Subsidised or differentiated input costs should flow through to the meter. When an upstream advantage (here, subsidised national compute) lowers your cost to serve, passing it through as a lower unit rate converts an infrastructure edge into a competitive pricing edge — the cleanest way usage pricing turns cost structure into go-to-market.
Mixed-meter platforms must teach buyers which unit drives the bill. When a stack bills tokens, audio hours, and characters together, the dominant cost driver shifts by use case. UBP design has to surface the binding meter per workload, or buyers misjudge cost — an early lesson for any multimodal AI platform.

Sources

Sarvam AI API pricing page (accessed 2026-07-23)
Sarvam AI docs — pricing reference (accessed 2026-07-23)
Sarvam — India’s full-stack sovereign AI platform (accessed 2026-07-23)
Sarvam open weights on Hugging Face (accessed 2026-07-23)
TechCrunch — Sarvam raises $41M (2023-12) (accessed 2026-06-11)
TechCrunch — Sarvam’s from-scratch open models (2026-02) (accessed 2026-06-11)
Inc42 — Sarvam and the sovereign AI dream (accessed 2026-06-11)

Bottom line

Sarvam AI prices a full sovereign Indic-AI stack on pure usage, denominated entirely in Indian rupees: an LLM API from ₹2.5/1M tokens (Sarvam-30B) up to ₹16/1M output (Sarvam-105B), Saaras speech-to-text at ₹30–45/hr, Bulbul text-to-speech at ₹15–30/10K chars, and translation at ₹20/10K — all on open-weight models trained from scratch on government-subsidised compute under India’s IndiaAI Mission. The rupee-native sheet is the monetization face of a sovereign-AI mandate, the subsidised compute funds rates well below global APIs, and a free open-weight path plus non-expiring credits make the on-ramp nearly free. The main friction is the INR-only view for global buyers and a mixed-meter bill that needs per-unit modeling.

Want to compare Sarvam AI against other foundation-model providers? See Mistral AI and OpenAI, or browse the full pricing blueprint.

Pricing timeline : Major events on a vertical axis

Each milestone below corresponds to a public pricing change, product launch, or material adjustment. Major events use a filled marker; minor adjustments use a faded one.

Prepaid credits repackaged; 'Most Popular' moves to Business

Jul 2026

All per-unit API rates held (LLM ₹2.5–₹16/1M, STT ₹30–45/hr, TTS ₹15–30/10K, translate ₹20/10K, Vision ₹0.5/page), but the prepaid credit packaging changed: Pro bonus ₹1,000→₹2,000 (11,000→12,000 credits), Business bonus ₹7,500→₹12,500 (57,500→62,500 credits), Starter now shows ₹300 bonus credits, and the 'Most Popular' badge moved from Pro to the ₹50,000 Business tier. The api-pricing page now returns 200 (previously 404). Sarvam-M is marked deprecated; Bulbul v3 flagged as beta pricing.

captured 2026-07-23

Live INR price sheet: per-token LLM + per-hour speech + per-char text

Jun 2026

Captured live INR pricing across the stack: LLM API Sarvam-30B ₹2.5 in / ₹1.5 cached / ₹10 out and Sarvam-105B ₹4 / ₹2.5 / ₹16 per 1M tokens; Saaras STT ₹30/hr (₹45 with diarization); Bulbul TTS ₹15–30/10K chars; translation/transliteration ₹20/10K, language ID ₹3.5/10K, doc parsing ₹0.5/page; prepaid plans Starter (free) / Pro ₹10,000 (+₹1,000) / Business ₹50,000 (+₹7,500) with 60/200/1,000 req/min limits and non-expiring credits. The bare /pricing path returns 404; prices are read from the api-pricing page and the docs.

Sarvam-30B + Sarvam-105B (from-scratch, fully domestic) launch

Feb 2026

Sarvam releases two open-source models trained from scratch in Bengaluru: Sarvam-30B (mixture-of-experts) and Sarvam-105B (activates ~9B params/token, 128K context, 22 Indian languages) — India's first fully domestically-trained open LLMs under the IndiaAI mandate. These become the models behind the paid per-token API (Sarvam-30B ₹2.5/₹10, Sarvam-105B ₹4/₹16 per 1M). (Source: TechCrunch, Sarvam, 2026-02.)

Sarvam-M (24B open-weights) launches with public API + free credits

May 2025

Sarvam ships Sarvam-M, a 24B open-weights hybrid model built on top of Mistral Small and fine-tuned for Indian languages, math and code (+86% on romanised GSM-8K). It is served via Sarvam's API, playground and Hugging Face — the first model behind a public, INR-denominated usage API with free starter credits. Critics call it 'a foreign model in a desi kurta.' (Source: Sarvam blog/X, Entrepreneur, 2025-05.)

Selected under India's IndiaAI Mission to build the sovereign model

Apr 2025

The Government of India selects Sarvam under the IndiaAI Mission to build the country's first homegrown sovereign foundation model, backed by a reported ~₹99 crore (~$11M) compute subsidy and 4,096 Nvidia H100 SXM GPUs provisioned via Yotta Data Services (a 100% compute subsidy reported). This government anchoring shapes the later geo-native INR price sheet. (Source: Inc42, NVIDIA blog, 2025.)

Sarvam raises ~$41M seed + Series A

Dec 2023

Five months after its August 2023 founding in Bengaluru, Sarvam (legal entity Axonwise Private Limited) raises about $41M led by Lightspeed with Peak XV and Khosla — at the time the largest early-stage funding for an Indian AI startup. No public API pricing yet; the company is in model-build mode. (Source: TechCrunch, Sarvam blog, 2023-12.)

Trivia

· Sarvam's price sheet is denominated entirely in Indian rupees (₹) with no USD card — a deliberately geo-native price for the India market, where Sarvam-30B input runs ₹2.5 per 1M tokens (~$0.03).
· Sarvam was selected under India's IndiaAI Mission to build the country's sovereign foundation model, backed by a reported ~₹99 crore (~$11M) GPU subsidy and 4,096 Nvidia H100 GPUs via Yotta — a government-anchored sovereign-AI story.
· Its first hosted model, Sarvam-M (May 2025), was a 24B fine-tune built on top of Mistral Small — drawing 'foreign model in a desi kurta' criticism — before the February 2026 Sarvam-30B/105B models were trained from scratch in Bengaluru.
