Sarvam AI pricing

sarvam.ai
Quick summary
Product: Sovereign Indic LLM, speech & translation APIs
Industry: technology
Commits: None
AI Summary
  • Sarvam AI prices a full sovereign Indic-AI stack purely on usage, denominated in Indian rupees (₹) rather than USD — a geo-native price sheet built for the India market.
  • The LLM API bills per million tokens: Sarvam-30B at ₹2.5 in / ₹1.5 cached / ₹10 out, and Sarvam-105B at ₹4 in / ₹2.5 cached / ₹16 out (~$0.03–$0.19 per 1M).
  • Speech APIs meter differently: Saaras speech-to-text at ₹30/hr (₹45/hr with speaker diarization) and Bulbul text-to-speech at ₹15/10K chars (v2) or ₹30/10K chars (v3).
  • Text tools bill per 10,000 characters — translation, transliteration and Mayura at ₹20, language ID at ₹3.5, and document parsing at ₹0.5/page.
  • Prepaid plans (Starter free, Pro ₹10,000 +₹1,000 bonus, Business ₹50,000 +₹7,500) buy higher rate limits and support; credits never expire and roll over indefinitely.
Pricing summary
Sarvam AI 2026 — a usage-priced sovereign Indic-AI stack, billed in rupees
Per-million-token LLM inference (Sarvam-30B/105B) sits alongside per-hour speech and per-character text APIs. Prepaid credits buy higher rate limits; new users start free.
Plan / surface | Price | Best for
Starter | Free | Developers testing the APIs
Business | ₹50,000 prepaid | Scaled, latency-sensitive workloads
Enterprise | Contact us | Sovereign / on-prem deployments
LLM API (per 1M tokens) | from ₹2.5 /1M tok | Per-token chat / reasoning inference
Speech & text APIs | usage metered | Speech, translation, OCR
All prices are in Indian rupees (₹) — Sarvam publishes no USD card. ~₹85 = $1 (June 2026). New users receive free starter credits to test every API; full per-model table below.

About

Sarvam AI is a Bengaluru-based foundation-model company building a full-stack sovereign AI platform for India: open-weight Indic large language models plus speech (ASR/TTS), translation, transliteration, and document-digitisation APIs tuned for 22 Indian languages. It sells to Indian developers, enterprises, and the public sector — and its pricing reflects that focus, denominated entirely in Indian rupees rather than US dollars. Founded in August 2023 by Vivek Raghavan and Pratyush Kumar (both veterans of AI4Bharat and the EkStep/Aadhaar ecosystem; the legal entity is Axonwise Private Limited), Sarvam raised about $41M in December 2023 led by Lightspeed with Peak XV and Khosla — at the time the largest early-stage Indian AI round — and later a ~$200M Series B at an estimated ~$1.2B valuation, with reports of a further raise toward a ~$1.5B valuation.

What sets Sarvam apart from the rest of this foundation-model cluster is its government anchoring. In 2025 it was selected under India’s IndiaAI Mission to build the country’s first homegrown sovereign foundation model, backed by a reported ₹99 crore ($11M) compute subsidy and access to 4,096 Nvidia H100 SXM GPUs provisioned through Yotta Data Services. That mandate is the strategic spine of the whole price sheet: Sarvam is building India-trained models on subsidised national infrastructure, then metering hosted inference on those models at rupee-native rates a domestic buyer can budget against.

The model catalog runs from Sarvam-M — a 24B open-weights hybrid launched in May 2025, built on top of Mistral Small and fine-tuned for Indian languages, math, and code (it drew “foreign model in a desi kurta” criticism for that lineage) — to the Sarvam-30B (mixture-of-experts) and Sarvam-105B (activates ~9B params/token, 128K context) models launched in February 2026 and trained from scratch in Bengaluru. All are open-weighted on Hugging Face, so buyers can self-host the same models they could call over the API — the open-weight hedge familiar from Mistral AI, here wrapped in a sovereign-AI mission rather than a European one.


Pricing summary : How Sarvam AI’s pricing model works

Sarvam AI runs a pure usage-based model with a freemium on-ramp, and every meter is priced in Indian rupees. There is no per-seat subscription — you prepay credits and each API draws them down at a per-unit rate. The dimensions are:

  • LLM tokens — separate input, cached-input, and output rates per million tokens, varying by model: Sarvam-30B at ₹2.5 in / ₹1.5 cached / ₹10 out, Sarvam-105B at ₹4 in / ₹2.5 cached / ₹16 out.
  • Speech-to-text (Saaras) — billed per audio hour (per second, rounded up): ₹30/hr standard, ₹45/hr with speaker diarization. Translation-while-transcribing carries no extra charge over the base/diarized rate.
  • Text-to-speech (Bulbul) — billed per 10,000 characters: ₹15 for v2, ₹30 for v3.
  • Translation & text tools — per 10,000 characters: Sarvam Translate / Mayura / transliteration at ₹20, language identification at ₹3.5; document parsing (Sarvam Vision) at ₹0.5/page.
  • Prepaid plan tiers — Starter (free, ₹1,000 credits, 60 req/min), Pro (₹10,000 prepay + ₹1,000 bonus, 200 req/min), Business (₹50,000 + ₹7,500 bonus, 1,000 req/min). Credits never expire and roll over indefinitely.
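
The meters above can be read as a small rate table plus one cost function per unit. The following Python sketch encodes the published rupee rates under that reading. The dictionary keys and function names are hypothetical labels chosen for illustration, not Sarvam API identifiers, and the sketch ignores rounding rules and rate limits.

```python
# Rates in INR, copied from the price sheet above.
# Keys such as "sarvam-30b" are illustrative labels, not official API model names.
LLM_RATES_PER_1M_TOKENS = {
    "sarvam-30b": {"input": 2.5, "cached": 1.5, "output": 10.0},
    "sarvam-105b": {"input": 4.0, "cached": 2.5, "output": 16.0},
}
STT_RATE_PER_HOUR = {"standard": 30.0, "diarized": 45.0}
CHAR_RATE_PER_10K = {
    "bulbul-v2": 15.0,
    "bulbul-v3": 30.0,
    "translate": 20.0,
    "transliterate": 20.0,
    "language-id": 3.5,
}


def llm_cost(model: str, input_tokens: int = 0, cached_tokens: int = 0,
             output_tokens: int = 0) -> float:
    """Cost in INR, with each token class billed at its own per-1M rate."""
    rates = LLM_RATES_PER_1M_TOKENS[model]
    total = (input_tokens * rates["input"]
             + cached_tokens * rates["cached"]
             + output_tokens * rates["output"])
    return total / 1_000_000


def speech_to_text_cost(audio_hours: float, mode: str = "standard") -> float:
    """Cost in INR for Saaras transcription, priced per audio hour."""
    return audio_hours * STT_RATE_PER_HOUR[mode]


def character_cost(service: str, characters: int) -> float:
    """Cost in INR for any meter priced per 10,000 characters."""
    return characters / 10_000 * CHAR_RATE_PER_10K[service]


if __name__ == "__main__":
    print(llm_cost("sarvam-105b", input_tokens=1_000_000, output_tokens=1_000_000))
    print(speech_to_text_cost(2, "diarized"))
    print(character_cost("translate", 10_000))
```

Keeping one function per unit mirrors how the bill is assembled: each API draws down the same credit balance, but on its own meter.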

What makes this different: Sarvam prices its entire stack in rupees with no USD card at all — a deliberately sovereign, geo-native price built for the India market — and serves open-weight models trained on government-subsidised national compute, so the per-token rate is for hosted convenience, not the model itself.


Pricing by product

LLM API — chat & reasoning (per 1,000,000 tokens, INR)

Model | Input /1M | Cached input /1M | Output /1M | Key mechanics
Sarvam-30B | ₹2.5 | ₹1.5 | ₹10 | Mixture-of-experts; cost-sensitive default
Sarvam-105B | ₹4 | ₹2.5 | ₹16 | ~9B active params/token, 128K context, 22 Indian languages

At roughly ₹85 to the dollar, Sarvam-30B input works out to about $0.03 per 1M tokens and Sarvam-105B output to about $0.19 per 1M, well under Western frontier API rates. Cached input is the discounted rate for re-reading a prompt that is already in the prompt cache.
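
The cached-input rate matters most for workloads that resend a long shared prompt. As a rough illustration, the sketch below blends the fresh and cached input rates by an assumed cache-hit fraction. The function name and the hit fractions are hypothetical, and how Sarvam decides which tokens count as cached is not covered here.

```python
def blended_input_rate(input_rate: float, cached_rate: float,
                       cache_hit_fraction: float) -> float:
    """Effective INR per 1M input tokens when part of the prompt is a cache hit."""
    if not 0.0 <= cache_hit_fraction <= 1.0:
        raise ValueError("cache_hit_fraction must be between 0 and 1")
    return cached_rate * cache_hit_fraction + input_rate * (1.0 - cache_hit_fraction)


# Published rates per 1M tokens in INR: (input, cached input).
RATES = {"Sarvam-30B": (2.5, 1.5), "Sarvam-105B": (4.0, 2.5)}

for model, (input_rate, cached_rate) in RATES.items():
    for hit in (0.0, 0.5, 1.0):  # assumed cache-hit fractions, for illustration only
        rate = blended_input_rate(input_rate, cached_rate, hit)
        print(f"{model}: {hit:.0%} cached -> ₹{rate:.2f} per 1M input tokens")
```

Even at a full cache hit, output tokens still bill at the output rate, so caching trims the smaller side of the LLM bill.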

Speech APIs (INR)

Service | Price | Key mechanics
Saaras speech-to-text | ₹30 / hour | Billed per second, rounded up
STT + speaker diarization | ₹45 / hour | Adds speaker labels
STT + translation | ₹30 / hour | Transcribe + translate, no surcharge over base
STT + translation + diarization | ₹45 / hour | Full pipeline
Bulbul text-to-speech v3 | ₹30 / 10K chars | Latest voices
Bulbul text-to-speech v2 | ₹15 / 10K chars | Prior generation, half the v3 rate
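
Per-second billing is easiest to see with a short clip. The sketch below is a minimal reading of "billed per second, rounded up": it assumes each clip is rounded up to the next whole second before the hourly rate is applied. The function name is hypothetical, and the exact rounding granularity Sarvam applies is an assumption.

```python
import math

# INR per audio hour, keyed by whether speaker diarization is enabled.
# Translation adds no surcharge on the sheet, so it is not a parameter here.
STT_RATE_PER_HOUR_INR = {False: 30.0, True: 45.0}


def saaras_cost_inr(duration_seconds: float, diarized: bool = False) -> float:
    """Cost of one clip, assuming partial seconds round up to a whole second."""
    billable_seconds = math.ceil(duration_seconds)
    return billable_seconds * STT_RATE_PER_HOUR_INR[diarized] / 3600


if __name__ == "__main__":
    for seconds in (0.2, 12.2, 3600):
        print(seconds, "s ->", round(saaras_cost_inr(seconds), 4), "INR")
```

Under this assumption a 12.2-second clip bills as 13 seconds, not as a full minute or hour block.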

Text & document APIs (per 10,000 characters unless noted, INR)

Service | Price | Key mechanics
Sarvam Translate V1 | ₹20 / 10K chars | Indic translation
Translate Mayura V1 | ₹20 / 10K chars | Translation model
Transliterate | ₹20 / 10K chars | Script conversion
Language identification | ₹3.5 / 10K chars | Cheapest meter on the sheet
Doc digitisation (Sarvam Vision) | ₹0.5 / page | Max 10 pages per job
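
The 10-page cap on document jobs affects how a long file is submitted, not what it costs. A small sketch, assuming a longer document is simply split into jobs of at most 10 pages and that the per-page rate is unaffected by the split (both are assumptions; the function name is hypothetical):

```python
DOC_RATE_PER_PAGE_INR = 0.5
MAX_PAGES_PER_JOB = 10


def plan_document_jobs(total_pages: int) -> tuple[int, float]:
    """Return (number of jobs needed, total cost in INR) for a page count."""
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    jobs = -(-total_pages // MAX_PAGES_PER_JOB)  # ceiling division
    return jobs, total_pages * DOC_RATE_PER_PAGE_INR


if __name__ == "__main__":
    print(plan_document_jobs(25))
```

The job count matters mainly for throughput planning, since each job is presumably at least one request against the plan's rate limit.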

Sales motions across products: PLG / self-serve for every API and the Starter, Pro, and Business prepaid plans; sales-led only for Enterprise sovereign/on-prem deployments with custom rate limits, SLAs, and data-residency controls.


Hidden costs : What Sarvam AI users actually pay

Sarvam’s per-unit rates are unusually low and fully public, but the real bill is shaped by three things the headline rate doesn’t show: the output-token premium on the LLM, the fact that speech and text meters are denominated differently (per hour vs per 10K chars), and the rate-limit ceiling that effectively forces a prepaid upgrade for production traffic. Two archetypes show how the total assembles.

Archetype 1: a Hindi voice-assistant startup. It transcribes 2,000 hours of call audio a month with Saaras (with diarization), generates 50M characters of Bulbul v3 voice replies, and routes 40M input plus 10M output tokens a month through Sarvam-30B for the conversation logic.

Line item | Monthly cost
Saaras STT (diarized), 2,000 hrs @ ₹45/hr | ₹90,000
Bulbul v3 TTS, 50M chars @ ₹30 / 10K | ₹1,50,000
Sarvam-30B input, 40M tok @ ₹2.5/1M | ₹100
Sarvam-30B output, 10M tok @ ₹10/1M | ₹100
Estimated total | ₹2,40,200/mo ($2,825)

The lesson: for a voice product the speech meters dominate, not the LLM. Tokens are almost free at these rates — the bill is overwhelmingly TTS characters and ASR hours, so the value metric to optimize is audio minutes and spoken characters, not prompt size. A product team must read all three meters together because they scale on completely different units.
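
The arithmetic behind Archetype 1 can be reproduced in a few lines, which also makes each meter's share of the bill explicit. This is a sketch using the volumes and rates stated above; the variable names are illustrative.

```python
# Monthly volumes from Archetype 1 multiplied by the published INR rates.
line_items = {
    "Saaras STT (diarized)": 2_000 * 45.0,             # hours x INR per hour
    "Bulbul v3 TTS": 50_000_000 / 10_000 * 30.0,       # chars / 10K x INR
    "Sarvam-30B input": 40_000_000 / 1_000_000 * 2.5,  # tokens / 1M x INR
    "Sarvam-30B output": 10_000_000 / 1_000_000 * 10.0,
}

total = sum(line_items.values())
shares = {name: cost / total for name, cost in line_items.items()}

for name, cost in line_items.items():
    print(f"{name:<24} ₹{cost:>10,.0f}  {shares[name]:6.2%}")
print(f"{'Total':<24} ₹{total:>10,.0f}")
```

At these volumes the two LLM lines together come to well under 1% of the total, which is the point of the archetype: the speech meters set the bill.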

Archetype 2: a translation pipeline at production scale. It translates 500M characters of catalog and content a month via Sarvam Translate and runs language ID on each item.

Line item | Monthly cost
Sarvam Translate, 500M chars @ ₹20 / 10K | ₹10,00,000
Language ID, 500M chars @ ₹3.5 / 10K | ₹1,75,000
Estimated total | ₹11,75,000/mo ($13,800)

Here the surprise is the rate limit, not the rupees. Depending on request size and how bursty the traffic is, 500M chars/month can exceed the Starter 60 req/min and even the Pro 200 req/min ceiling, so the real cost of “scale” is moving to the Business plan (₹50,000 prepay, 1,000 req/min) or an Enterprise quote. The per-character price is cheap; the throughput tier is the gating cost.
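
Whether a rate limit binds depends on how many characters each request carries. The sketch below estimates the smallest average request size that would fit 500M characters a month under each ceiling. It assumes a 30-day month, that every request counts equally against the limit, and a `utilization` factor (hypothetical) for the share of the month in which traffic actually flows; language-ID calls are ignored.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumes a 30-day month

RATE_LIMITS_RPM = {"Starter": 60, "Pro": 200, "Business": 1_000}


def min_avg_chars_per_request(monthly_chars: int, requests_per_minute: int,
                              utilization: float = 1.0) -> float:
    """Smallest average request size (chars) that fits the volume under the limit."""
    max_requests = requests_per_minute * MINUTES_PER_MONTH * utilization
    return monthly_chars / max_requests


if __name__ == "__main__":
    for plan, rpm in RATE_LIMITS_RPM.items():
        flat = min_avg_chars_per_request(500_000_000, rpm)
        bursty = min_avg_chars_per_request(500_000_000, rpm, utilization=0.25)
        print(f"{plan}: {flat:.0f} chars/request if flat, {bursty:.0f} if traffic "
              "is squeezed into a quarter of the month")
```

With perfectly even traffic the lower tiers only hold if requests are reasonably large; short catalog strings or peak-hour bursts push the required tier up quickly.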

Want to estimate your own Sarvam AI bill? Use the Sarvam AI pricing calculator to model your costs across tokens, audio hours, and characters.


Pricing evolution : Sarvam AI pricing history and changes

Sarvam’s pricing followed its model roadmap. There was no public API price until the first hosted model shipped; the rupee-native usage sheet appeared with Sarvam-M in May 2025 and broadened as the from-scratch Sarvam-30B/105B models landed in early 2026. The milestones below are reconstructed from primary announcements and contemporaneous press; quarter-level cadence will be tightened with archived snapshots on a later pass.

Cadence

Quarter | Price changes | Product / SKU additions | Notes
2023 Q4 | 0 | 0 | ~$41M seed + Series A; pre-product, no public pricing
2025 Q2 | 1 | 1 | 2025-05 Sarvam-M ships; first public INR usage API + free credits
2025 Q2 | 0 | 0 | 2025-04/05 Selected under IndiaAI Mission for the sovereign model
2026 Q1 | 0 | 1 | 2026-02-18 Sarvam-30B + Sarvam-105B (from-scratch) become the priced API models
2026 Q2 | 0 | 0 | Live INR price sheet verified across LLM, speech, and text meters

Tracked range: 2023 Q4–2026 Q2. Quarters not listed had no publicly announced price or SKU change. Per-snapshot price reconstruction is left for a later pass; the bare /pricing URL returns a 404, so prices are read from the api-pricing page and the docs.

Notable changes

  • 2023-12: ~$41M seed + Series A led by Lightspeed (Peak XV, Khosla), the largest early-stage Indian AI round at the time. No public pricing yet (TechCrunch).
  • 2025-04/05: Selected under India’s IndiaAI Mission to build the sovereign foundation model; reported ₹99 crore ($11M) compute subsidy + 4,096 H100 GPUs via Yotta (Inc42).
  • 2025-05-23: Sarvam-M (24B open-weights, built on Mistral Small) launches with a public, INR-denominated usage API and free starter credits, the first priced surface.
  • 2026-02-18: Sarvam-30B + Sarvam-105B (from-scratch, fully domestic) launch and become the models behind the paid per-token API (Sarvam-30B ₹2.5/₹10, Sarvam-105B ₹4/₹16 per 1M) (TechCrunch).
  • 2026: ~$200M Series B at an estimated ~$1.2B valuation (Peak XV, Lightspeed), with reports of a further raise toward ~$1.5B.

From fine-tune to from-scratch, in detail

The most consequential shift in Sarvam’s short history is not a price change but a provenance change that the price sheet now rests on. Sarvam-M (May 2025) was a 24B open-weights fine-tune built on top of Mistral Small — capable on Indic benchmarks but, critics argued, “a foreign model in a desi kurta.” Under the IndiaAI sovereign mandate, the February 2026 Sarvam-30B and Sarvam-105B were trained from scratch in Bengaluru on subsidised national H100 compute. The pricing implication is sovereignty-as-credibility: the same rupee-native per-token rates now buy inference on a genuinely India-built model, which is the exact value proposition a public-sector or sovereignty-conscious enterprise buyer is paying the platform to stand behind.


What’s unique : Sarvam AI’s distinctive pricing mechanics

1. Sovereign, rupee-native pricing — no USD card at all. Almost every foundation-model lab in this corpus prices in dollars (and many geo-lock a rupee or euro view behind a toggle). Sarvam does the opposite: its entire price sheet is denominated in Indian rupees with no USD option, because the buyer it is built for — Indian developers, enterprises, and the government — budgets in rupees. That is not a cosmetic choice; it is the monetization expression of an IndiaAI-Mission sovereign mandate. The pricing is the positioning: a national AI stack priced for the nation that subsidised it.

2. Three differently-denominated meters in one stack. Sarvam bills LLM inference per million tokens, speech per audio hour, and text/TTS per 10,000 characters: three distinct units in a single platform. For a multimodal Indic product (voice assistant, dubbing, doc pipeline) the cost driver shifts surface to surface: tokens are nearly free, but audio hours and spoken characters dominate. Buyers have to model each meter on its own unit, which makes Sarvam’s bill behave very differently from that of a predominantly token-metered lab like OpenAI.

3. Open weights on subsidised national compute, then metered hosting. Like Mistral AI, Sarvam open-weights its models (Sarvam-M, 30B, 105B) so buyers can self-host — but the models were trained on government-subsidised H100s under the IndiaAI Mission. So the per-token rate isn’t for the model (which you can download free) and isn’t even fully for the compute (which was subsidised); it’s for managed inference plus the sovereignty assurance, a structurally cheaper-to-produce inference product than a privately-funded lab’s.


Strengths & weaknesses

Strengths | Weaknesses
Fully public, rupee-native per-unit rates across LLM, speech, and text, with no “contact sales” wall for any meter | INR-only sheet with no USD card adds friction for global buyers who must convert and watch FX
Token rates are strikingly low (Sarvam-30B ₹2.5/1M in, ~$0.03) versus Western frontier APIs | Output-token premium on the LLM (₹16 vs ₹4 input on 105B, 4×) favors short-answer workloads
Open weights on Hugging Face let buyers self-host the same models, a credible lock-in hedge | Speech and text use different meters (per hour vs per 10K chars), making blended cost harder to predict
Government-anchored sovereign-AI mandate + subsidised compute keep inference cheap and credible for Indian buyers | Rate limits (60/200/1,000 req/min) gate throughput, so production scale forces a prepaid upgrade
Credits never expire and roll over indefinitely, so there is no use-it-or-lose-it prepaid trap | Enterprise sovereign/on-prem deployments are fully sales-gated with no public floor price
Free starter credits + a free open-weight path make the on-ramp genuinely zero-cost | Early model (Sarvam-M) was a Mistral-Small fine-tune, drawing “desi kurta” provenance criticism

Billing UX : Sarvam AI billing controls and transparency

  • Prepaid credit wallet — you top up a rupee credit balance (Starter free, Pro ₹10,000 + ₹1,000 bonus, Business ₹50,000 + ₹7,500 bonus) and every API call draws it down at its per-unit rate; no monthly seat commitment.
  • Non-expiring, rolling credits — credits “never expire and roll over indefinitely,” so unused balance is never forfeited — a buyer-friendly contrast to the typical expiring-credit model.
  • Per-second billing on speech — speech-to-text is billed per second, rounded up, rather than per whole minute or hour block, so short clips aren’t over-charged.
  • Rate-limit tiers as the upgrade lever — instead of price changing, the prepaid tier raises requests/minute (60 → 200 → 1,000), so throughput, not unit price, is what you buy up.
  • Free self-host path — open weights on Hugging Face let cost-sensitive teams run inference themselves and skip the API meter entirely for non-managed workloads.
  • Tiered support — community support on Starter, email on Pro, and a Slack channel with a dedicated engineer on Business; Enterprise adds solutions support and SLAs.
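
Since the prepaid tier buys throughput and bonus credit while unit prices stay the same, choosing a plan reduces to two small calculations. The sketch below uses the plan figures listed above; the `Plan` type and function names are hypothetical, and treating anything above 1,000 req/min as an Enterprise conversation is an assumption.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    name: str
    prepay_inr: int
    extra_credits_inr: int  # free credits (Starter) or bonus credits (Pro, Business)
    requests_per_minute: int


# Ordered from smallest to largest prepay.
PLANS = [
    Plan("Starter", 0, 1_000, 60),
    Plan("Pro", 10_000, 1_000, 200),
    Plan("Business", 50_000, 7_500, 1_000),
]


def bonus_percent(plan: Plan) -> Optional[float]:
    """Bonus credits as a percentage of the prepay; None for the free plan."""
    if plan.prepay_inr == 0:
        return None
    return 100 * plan.extra_credits_inr / plan.prepay_inr


def cheapest_plan_for(required_rpm: int) -> str:
    """Lowest-prepay plan whose rate limit covers the required requests/minute."""
    for plan in PLANS:
        if plan.requests_per_minute >= required_rpm:
            return plan.name
    return "Enterprise"  # assumed: custom rate limits need a sales quote


if __name__ == "__main__":
    print(cheapest_plan_for(150), bonus_percent(PLANS[1]), bonus_percent(PLANS[2]))
```

The bonus works out to 10% on Pro and 15% on Business, so the larger prepay is modestly cheaper per credit in addition to raising the rate limit.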

Strategic wins : Why Sarvam AI’s pricing decisions worked

1. Pricing the nation’s stack in the nation’s currency

By denominating the entire sheet in rupees with no USD card, Sarvam turned a sovereign-AI mandate into a pricing strategy. For the Indian developer or public-sector buyer it is competing for, a rupee-native rate removes FX friction and signals “built for you.” It is the monetization expression of the IndiaAI Mission selection — the pricing reinforces the positioning rather than fighting it. See usage-based pricing strategy for why aligning the meter to the buyer’s mental model wins.

2. Subsidised compute funds aggressive unit economics

Because the models were trained on government-subsidised H100 compute under the IndiaAI Mission, Sarvam can price hosted inference far below privately-funded frontier labs (Sarvam-30B input at ~$0.03/1M). That lets it under-price global APIs on the exact languages it specialises in, turning a national-infrastructure advantage into a durable price edge. This mirrors the shift away from rigid per-seat licensing toward cost-following usage rates.

3. Open weights plus non-expiring credits lower the on-ramp to zero

A free open-weight self-host path, free starter credits, and credits that never expire combine into an unusually low-risk on-ramp. A developer can prototype for free, self-host if they prefer, and never lose prepaid balance, which de-risks adoption in a price-sensitive market. Choosing a durable usage metric and pairing it with a forgiving credit model is what makes that on-ramp stick.


Areas to improve : Gaps in Sarvam AI’s pricing approach

1. Offer a USD view for global buyers

The INR-only sheet is perfect for India but adds conversion friction for the diaspora developers, NRIs, and global teams who want Indic-language inference. A USD toggle (as most peers offer) would widen the addressable market without diluting the sovereign positioning — the rupee can stay the default. The absence today risks reading as “domestic-only,” which understates the models’ reach.
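
Until such a view exists, a global buyer has to do the conversion themselves. A minimal sketch of a derived USD view over the LLM rates, assuming the approximate ₹85 = $1 rate quoted on this page (a live FX rate would differ, and the names here are illustrative):

```python
INR_PER_USD = 85.0  # approximate rate quoted on this page; not a live FX feed

# Published LLM rates in INR per 1M tokens.
LLM_RATES_INR = {
    "Sarvam-30B": {"input": 2.5, "cached": 1.5, "output": 10.0},
    "Sarvam-105B": {"input": 4.0, "cached": 2.5, "output": 16.0},
}


def to_usd(amount_inr: float, inr_per_usd: float = INR_PER_USD) -> float:
    """Convert an INR amount to USD, rounded to four decimal places."""
    return round(amount_inr / inr_per_usd, 4)


usd_view = {
    model: {meter: to_usd(rate) for meter, rate in rates.items()}
    for model, rates in LLM_RATES_INR.items()
}

if __name__ == "__main__":
    for model, rates in usd_view.items():
        print(model, rates)
```

Because the rupee figure stays the source of truth, the USD column moves with the exchange rate, which is exactly the FX exposure a non-Indian buyer has to budget for.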

2. Make the blended multi-meter cost legible up front

Because LLM, speech, and text bill on three different units, a multimodal product’s total is hard to forecast from the price sheet alone. A worked “estimated cost per voice session” or per-document calculator surfaced in the dashboard would prevent the bill-shock and unpredictability that mixed meters invite, and help buyers self-select a prepaid tier confidently.
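
To make the idea concrete, here is a rough sketch of what an "estimated cost per voice session" calculation could look like. The session shape (audio seconds, reply characters, token counts) and the function name are hypothetical; the default rates are the published Saaras standard, Bulbul v3, and Sarvam-30B rates, and per-second rounding is assumed to apply per session.

```python
import math


def voice_session_cost_inr(audio_seconds: float, reply_chars: int,
                           input_tokens: int, output_tokens: int, *,
                           stt_rate_per_hour: float = 30.0,
                           tts_rate_per_10k: float = 30.0,
                           llm_input_per_1m: float = 2.5,
                           llm_output_per_1m: float = 10.0) -> dict[str, float]:
    """Break one voice session into its three meters, in INR."""
    breakdown = {
        "stt": math.ceil(audio_seconds) * stt_rate_per_hour / 3600,
        "tts": reply_chars / 10_000 * tts_rate_per_10k,
        "llm": (input_tokens * llm_input_per_1m
                + output_tokens * llm_output_per_1m) / 1_000_000,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    # Illustrative session: 3 minutes of caller audio, 1,500 spoken characters,
    # 4,000 input and 1,000 output tokens.
    session = voice_session_cost_inr(180, 1_500, 4_000, 1_000)
    for meter, cost in session.items():
        print(f"{meter:<6} ₹{cost:.2f}")
```

A per-session figure like this is easier to multiply by expected traffic than three separate unit rates, and it shows at a glance which meter binds.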

3. Expose an Enterprise / sovereign-deployment anchor

On-prem and sovereign deployments are fully sales-gated with no public starting point. Given that the IndiaAI mandate makes public-sector and regulated buyers the core market, a published floor or a worked deployment example would shorten procurement cycles for exactly the institutional buyers Sarvam is positioned to win. Compare how other AI companies stage enterprise transparency.


Key takeaways

  1. Price in the buyer’s currency, literally. Sarvam’s rupee-only sheet is a deliberate sovereign signal — the pricing is the positioning. When your strategic edge is “built for this market,” denominating in that market’s currency reinforces it more than any tagline.
  2. Subsidised inputs become a pricing moat. Government-subsidised compute under the IndiaAI Mission lets Sarvam under-price global APIs on Indic workloads. A structural cost advantage upstream shows up as a durable price advantage downstream.
  3. Multi-meter stacks need multi-meter thinking. Tokens, audio hours, and characters scale on different units; for a voice or translation product the LLM is nearly free and the speech/text meters dominate. Model every meter on its own unit, not the headline token rate.
  4. A forgiving credit model lowers adoption risk. Free starter credits, a free self-host path, and credits that never expire combine into a near-zero-risk on-ramp — decisive in a price-sensitive market.
  5. Provenance is a value metric. Moving from a Mistral-Small fine-tune to from-scratch domestic models is what lets the same rupee rates carry a sovereignty assurance that public-sector buyers actually pay for.

UBP implications

  1. Currency and geo-denomination are pricing levers, not just settings. Sarvam shows that choosing to price natively in the buyer’s currency — and declining to bolt on a USD card — can be a strategic statement. UBP practitioners targeting a specific market should treat denomination as part of the value proposition, not an afterthought.
  2. Subsidised or differentiated input costs should flow through to the meter. When an upstream advantage (here, subsidised national compute) lowers your cost to serve, passing it through as a lower unit rate converts an infrastructure edge into a competitive pricing edge — the cleanest way usage pricing turns cost structure into go-to-market.
  3. Mixed-meter platforms must teach buyers which unit drives the bill. When a stack bills tokens, audio hours, and characters together, the dominant cost driver shifts by use case. UBP design has to surface the binding meter per workload, or buyers misjudge cost — an early lesson for any multimodal AI platform.


Bottom line

Sarvam AI prices a full sovereign Indic-AI stack on pure usage, denominated entirely in Indian rupees: an LLM API from ₹2.5/1M tokens (Sarvam-30B) up to ₹16/1M output (Sarvam-105B), Saaras speech-to-text at ₹30–45/hr, Bulbul text-to-speech at ₹15–30/10K chars, and translation at ₹20/10K — all on open-weight models trained from scratch on government-subsidised compute under India’s IndiaAI Mission. The rupee-native sheet is the monetization face of a sovereign-AI mandate, the subsidised compute funds rates well below global APIs, and a free open-weight path plus non-expiring credits make the on-ramp nearly free. The main friction is the INR-only view for global buyers and a mixed-meter bill that needs per-unit modeling.

Want to compare Sarvam AI against other foundation-model providers? See Mistral AI and OpenAI, or browse the full pricing blueprint.

Pricing timeline : Major events on a vertical axis

Each milestone below corresponds to a public pricing change, product launch, or material adjustment, listed newest first.

Live INR price sheet: per-token LLM + per-hour speech + per-char text

Captured live INR pricing across the stack: LLM API Sarvam-30B ₹2.5 in / ₹1.5 cached / ₹10 out and Sarvam-105B ₹4 / ₹2.5 / ₹16 per 1M tokens; Saaras STT ₹30/hr (₹45 with diarization); Bulbul TTS ₹15–30/10K chars; translation/transliteration ₹20/10K, language ID ₹3.5/10K, doc parsing ₹0.5/page; prepaid plans Starter (free) / Pro ₹10,000 (+₹1,000) / Business ₹50,000 (+₹7,500) with 60/200/1,000 req-min limits and non-expiring credits. The naked /pricing path 404s; prices read from api-pricing + docs.

Sarvam-30B + Sarvam-105B (from-scratch, fully domestic) launch

Sarvam releases two open-source models trained from scratch in Bengaluru: Sarvam-30B (mixture-of-experts) and Sarvam-105B (activates ~9B params/token, 128K context, 22 Indian languages) — India's first fully domestically-trained open LLMs under the IndiaAI mandate. These become the models behind the paid per-token API (Sarvam-30B ₹2.5/₹10, Sarvam-105B ₹4/₹16 per 1M). (Source: TechCrunch, Sarvam, 2026-02.)

Sarvam-M (24B open-weights) launches with public API + free credits

Sarvam ships Sarvam-M, a 24B open-weights hybrid model built on top of Mistral Small and fine-tuned for Indian languages, math and code (+86% on romanised GSM-8K). It is served via Sarvam's API, playground and Hugging Face — the first model behind a public, INR-denominated usage API with free starter credits. Critics call it 'a foreign model in a desi kurta.' (Source: Sarvam blog/X, Entrepreneur, 2025-05.)

Selected under India's IndiaAI Mission to build the sovereign model

The Government of India selects Sarvam under the IndiaAI Mission to build the country's first homegrown sovereign foundation model, backed by a reported ~₹99 crore (~$11M) compute subsidy and 4,096 Nvidia H100 SXM GPUs provisioned via Yotta Data Services (a 100% compute subsidy reported). This government anchoring shapes the later geo-native INR price sheet. (Source: Inc42, NVIDIA blog, 2025.)

Sarvam raises ~$41M seed + Series A

About four months after its August 2023 founding in Bengaluru, Sarvam (legal entity Axonwise Private Limited) raises about $41M led by Lightspeed with Peak XV and Khosla, at the time the largest early-stage funding for an Indian AI startup. No public API pricing yet; the company is in model-build mode. (Source: TechCrunch, Sarvam blog, 2023-12.)

Trivia
  • Sarvam's price sheet is denominated entirely in Indian rupees (₹) with no USD card, a deliberately geo-native price for the India market, where Sarvam-30B input runs ₹2.5 per 1M tokens (~$0.03).
  • Sarvam was selected under India's IndiaAI Mission to build the country's sovereign foundation model, backed by a reported ~₹99 crore (~$11M) GPU subsidy and 4,096 Nvidia H100 GPUs via Yotta: a government-anchored sovereign-AI story.
  • Its first hosted model, Sarvam-M (May 2025), was a 24B fine-tune built on top of Mistral Small, which drew 'foreign model in a desi kurta' criticism, before the February 2026 Sarvam-30B/105B models were trained from scratch in Bengaluru.

Questions & answers

What is Sarvam AI's pricing model?
Pure usage-based, billed in Indian rupees. The LLM API charges per million tokens (Sarvam-30B ₹2.5 in / ₹10 out, Sarvam-105B ₹4 in / ₹16 out), speech-to-text is ₹30/hr, text-to-speech is ₹15–30 per 10K characters, and translation is ₹20 per 10K characters. You prepay credits that draw down as you use the APIs.
Does Sarvam AI offer a free tier?
Yes. New users get free starter credits to test every API (₹1,000 free credits on the Starter plan per the API-pricing sheet), with a 60 requests/minute rate limit and community support. Sarvam also open-weights its models (Sarvam-M, Sarvam-30B, Sarvam-105B) on Hugging Face for self-hosting.
How much does the Sarvam LLM API cost per token?
Sarvam-30B is ₹2.5 per 1M input tokens, ₹1.5 cached, ₹10 output. Sarvam-105B is ₹4 input, ₹2.5 cached, ₹16 output. At roughly ₹85 to the dollar that is about $0.03–$0.05 per 1M input and $0.12–$0.19 per 1M output — well under Western frontier API rates.
How does Sarvam price speech and translation APIs?
Saaras speech-to-text is ₹30/hour (₹45/hour with speaker diarization), billed per second rounded up. Bulbul text-to-speech is ₹15/10K characters (v2) or ₹30/10K characters (v3). Translation, transliteration and Mayura are ₹20/10K characters, language identification ₹3.5/10K, and document parsing ₹0.5/page.
What is Sarvam's connection to the IndiaAI Mission?
In 2025 Sarvam was selected under India's government-run IndiaAI Mission to build the country's sovereign foundation model, backed by a reported ~₹99 crore (~$11M) GPU-compute subsidy and access to 4,096 Nvidia H100 GPUs via Yotta. That mandate underpins the from-scratch Sarvam-30B and Sarvam-105B models the paid API now serves.
Can I self-host Sarvam's models instead of using the API?
Yes. Sarvam open-weights its models on Hugging Face — Sarvam-M (24B, built on Mistral Small), and the from-scratch Sarvam-30B and Sarvam-105B. You can download and self-host them, and pay Sarvam's per-token API only when you want hosted inference.