SambaNova pricing

cloud.sambanova.ai facts checked analysis reviewed
Quick summary
Product: SambaNova Cloud inference API & RDU AI systems
Industry: technology
Commits: Available (annual)
AI Summary
  • SambaNova sells two things: SambaNova Cloud (SambaCloud), a public per-1M-token inference API, and RDU-based hardware systems (SambaStack/SambaManaged) that are sales-quoted.
  • SambaCloud token rates (June 2026) span $0.15 input / $0.75 output for DeepSeek-V3.1-cb up to $3.00 input / $4.50 output for full DeepSeek-V3.1/V3.2; Meta-Llama-3.3-70B is $0.60 in / $1.20 out and gpt-oss-120b is $0.22 in / $0.59 out.
  • Three account tiers: Free ($0, $5 of credits that expire in 30 days, no credit card), Developer (pay-as-you-go per token), and Enterprise (subscription/custom with production rate limits, BYOC and custom limits).
  • Custom RDU silicon (SN40L, and the agentic SN50 launched Feb 2026) is SambaNova's wedge — it sells speed-per-token rather than the cheapest token, competing on fastest inference.
  • Hardware/DataScale systems carry no public rate card and are sold via enterprise contracts; the company raised a $350M Series E in February 2026 on top of a 2021 Series D that valued it above $5B.
Pricing summary
SambaNova Cloud 2026 — Inference tiers
SambaCloud is a public per-1M-token inference API. Free credits to start, pay-as-you-go after, subscription for scale.
Free: $0 (developers evaluating the API)
Enterprise: Contact sales (high-volume & regulated workloads)
SambaNova Cloud rates as of June 2026 (cloud.sambanova.ai/pricing & /plans). RDU hardware systems are sales-quoted separately.

About

SambaNova Systems is a Palo Alto AI company founded in 2017 by Stanford professors Kunle Olukotun and Christopher Ré together with former Oracle executive Rodrigo Liang. Rather than building on NVIDIA GPUs, SambaNova designs its own AI silicon, the Reconfigurable Dataflow Unit (RDU), most recently the SN40L and the agentic-inference SN50, and packages it into full “chips-to-model” systems. The company raised a $676M Series D led by SoftBank Vision Fund 2 in 2021 at a valuation above $5B, pushing total funding past $1B, and announced a further $350M Series E in February 2026 alongside the SN50 and an Intel collaboration.

For pricing purposes, SambaNova is really two businesses. SambaNova Cloud (branded SambaCloud) is a developer-facing, OpenAI-compatible inference API that rents access to open models — Llama, DeepSeek, Qwen-class, gpt-oss, Gemma, MiniMax — billed per million tokens, with a published rate card and a free tier. SambaStack / SambaManaged is the enterprise hardware side: RDU systems and racks sold as sales-quoted contracts with no public price. The throughline is the chip: SambaNova competes less on the cheapest token and more on the fastest token, routinely claiming record tokens-per-second on its own hardware.

For current rates, see SambaNova Cloud pricing. Note the rate card lives on the cloud.sambanova.ai subdomain — the marketing site’s /pricing path returns a 404 because the systems business is sales-only.


Pricing summary: How SambaNova’s pricing model works

SambaNova’s pricing is hybrid, split cleanly by product:

  1. SambaNova Cloud (inference API): pure usage-based, billed per 1M tokens with separate input and output rates per model. It has three account tiers: a Free plan ($0, $5 of credits, no credit card, 30-day expiry), a pay-as-you-go Developer plan, and a subscription-based Enterprise plan with production rate limits and add-ons like BYOC and custom limits. This rate card is fully public.
  2. RDU systems (SambaStack / SambaManaged / DataScale): sales-quoted. There is no public price for the hardware, racks, or managed deployments; these are enterprise contracts sold by SambaNova’s go-to-market team.

So the buyer journey is genuinely self-serve at the bottom (sign up, get $5, call the API) and sales-led at the top (buy or rent RDU capacity), with the per-token API serving as both a product and a demand-generation funnel into the silicon.

What makes this different: Most inference APIs are reselling NVIDIA GPU time and compete on price-per-token. SambaNova runs the same open models on its own RDU silicon and competes on speed-per-token — the rate card is the wrapper, but the pitch is “fastest inference,” not “cheapest.” That makes its per-token prices closer to mid-pack while its differentiation lives in throughput and latency.


Pricing by product

SambaNova Cloud per-1M-token rates, as of June 2026 (input / output, USD):

| Model | Input /1M | Output /1M | Notes |
| --- | --- | --- | --- |
| DeepSeek-V3.1-cb | $0.15 | $0.75 | Cheapest on the card |
| gpt-oss-120b | $0.22 | $0.59 | Open-weight reasoning |
| gemma-4-31B-it | $0.38 | $1.15 | Mid-size instruct |
| Meta-Llama-3.3-70B-Instruct | $0.60 | $1.20 | Mainstream workhorse |
| MiniMax-M2.7 | $0.60 | $2.40 | High output cost |
| DeepSeek-R1-Distill-Llama-70B | $0.70 | $1.40 | Distilled reasoning |
| DeepSeek-V3.1 / V3.2 | $3.00 | $4.50 | Frontier-class, priciest |
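
As a worked example of per-token billing, the sketch below prices a workload from the rates in the table. The `RATES` dictionary and the `token_cost` helper are illustrative names, not part of any SambaNova SDK, and the keys are the display names from the rate card, which may differ from the model IDs the API expects.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the June 2026 rate card.
# Keys are rate-card display names; the actual API model IDs are an assumption.
RATES = {
    "DeepSeek-V3.1-cb": (Decimal("0.15"), Decimal("0.75")),
    "gpt-oss-120b": (Decimal("0.22"), Decimal("0.59")),
    "gemma-4-31B-it": (Decimal("0.38"), Decimal("1.15")),
    "Meta-Llama-3.3-70B-Instruct": (Decimal("0.60"), Decimal("1.20")),
    "MiniMax-M2.7": (Decimal("0.60"), Decimal("2.40")),
    "DeepSeek-R1-Distill-Llama-70B": (Decimal("0.70"), Decimal("1.40")),
    "DeepSeek-V3.1": (Decimal("3.00"), Decimal("4.50")),  # V3.2 is listed at the same rate
}

PER_MILLION = Decimal(1_000_000)


def token_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of a workload, billing input and output separately."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_rate, output_rate = RATES[model]
    return (input_rate * input_tokens + output_rate * output_tokens) / PER_MILLION


if __name__ == "__main__":
    # 10M input tokens and 2M output tokens on Llama 3.3 70B.
    print(token_cost("Meta-Llama-3.3-70B-Instruct", 10_000_000, 2_000_000))
```

For 10M input and 2M output tokens on Meta-Llama-3.3-70B-Instruct this comes to $8.40: $6.00 of input plus $2.40 of output. `Decimal` is used so that currency arithmetic stays exact.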

Account tiers (SambaNova Cloud):

| Tier | Price | Included | Key mechanics |
| --- | --- | --- | --- |
| Free | $0 | $5 credits, Production models | No card; credits expire in 30 days |
| Developer | Pay-as-you-go | All Production & Preview models | Standard rate limits, per-token billing |
| Enterprise | Subscription / custom | Production rate limits, BYOC | Sales-quoted for larger usage |

Sales motions across products: the cloud API Free and Developer tiers are fully self-serve (PLG); Enterprise and all RDU hardware (SambaStack, SambaManaged, DataScale) are sales-led and quoted. There is no public price for the systems business.


Hidden costs: What SambaNova users actually pay

On the cloud side the rate card is clean, but real bills depend on a few things beyond the headline per-token number:

| Line item | Cost |
| --- | --- |
| Input tokens (e.g. Llama-3.3-70B) | $0.60 per 1M |
| Output tokens (e.g. Llama-3.3-70B) | $1.20 per 1M |
| Reasoning / “thinking” tokens | Billed as output; DeepSeek-R1-distill at $1.40/1M output adds up fast |
| Free credits | $5, then they expire in 30 days |
| Enterprise rate limits / dedicated capacity | Sales-quoted (subscription) |
| RDU systems / SambaStack | Sales-quoted; no public price |

The real cost traps are structural, not line-item. First, output and reasoning tokens dominate: output rates run 1.5–5x input (MiniMax-M2.7 is $0.60 in but $2.40 out), so chatty or chain-of-thought workloads cost far more than the input-side rate suggests. Second, the DeepSeek-V3.1/V3.2 frontier tier at $3.00/$4.50 is 20x the input rate and 6x the output rate of DeepSeek-V3.1-cb ($0.15/$0.75), so model choice swings the bill enormously. Third, the $5 free credit expires in 30 days, so the trial doesn’t bridge a slow procurement cycle. And on the systems side, the entire cost is opaque until you talk to sales.
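
The ratios behind these traps can be checked directly against the published rates. The sketch below computes, for each model, how many times more an output token costs than an input token, plus the spread between the cheapest and priciest rate. `output_multiple` and `spread` are hypothetical helper names used only for this illustration.

```python
# USD per 1M tokens as (input, output), from the June 2026 rate card.
RATES = {
    "DeepSeek-V3.1-cb": (0.15, 0.75),
    "gpt-oss-120b": (0.22, 0.59),
    "gemma-4-31B-it": (0.38, 1.15),
    "Meta-Llama-3.3-70B-Instruct": (0.60, 1.20),
    "MiniMax-M2.7": (0.60, 2.40),
    "DeepSeek-R1-Distill-Llama-70B": (0.70, 1.40),
    "DeepSeek-V3.1": (3.00, 4.50),  # V3.2 is listed at the same rate
}


def output_multiple(model: str) -> float:
    """How many times more an output token costs than an input token."""
    input_rate, output_rate = RATES[model]
    return output_rate / input_rate


def spread(index: int) -> float:
    """Ratio of the highest to the lowest rate (index 0 = input, 1 = output)."""
    column = [rates[index] for rates in RATES.values()]
    return max(column) / min(column)


if __name__ == "__main__":
    for model in sorted(RATES, key=output_multiple):
        print(f"{model}: output costs {output_multiple(model):.2f}x input")
    print(f"input spread: {spread(0):.1f}x, output spread: {spread(1):.1f}x")
```

On these rates the output multiple is lowest for DeepSeek-V3.1 and highest for DeepSeek-V3.1-cb, and the gap between the cheapest and priciest model is much wider on input than on output.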

Want to estimate your own SambaNova Cloud bill? Use the SambaNova pricing calculator to model your costs based on model and token volume.


Pricing evolution: SambaNova pricing history and changes

Cadence

| Period | Price changes | Product / SKU additions | Notes |
| --- | --- | --- | --- |
| 2024 H2 | Public token rate card launched | SambaNova Cloud (free + pay-as-you-go) | OpenAI-compatible API on RDU |
| 2025 H2 | Per-model rates tracked | Sovereign-AI regional clouds | Argyll, Infercom, OVHcloud, SouthernCrossAI |
| 2026 Q1–Q2 | Rate card spans $0.15–$3.00 input | SN50 RDU; $350M Series E | Newer DeepSeek/gpt-oss/Gemma/MiniMax models added |

Tracked range: 2024–present. The systems/hardware business has never published a public price, so only the cloud rate card is trackable.

Notable changes

  • 2024 H2 — SambaNova Cloud launches as a public, OpenAI-compatible inference API with a free developer tier and per-token pay-as-you-go billing, positioned on fastest-token throughput for open models rather than per-GPU-hour rental.
  • Late 2025 — Sovereign-AI inference partnerships (UK, Germany, EU, Australia) extend the token-based cloud regionally while keeping the published rate card.
  • June 2026 — Rate card spans $0.15/$0.75 (DeepSeek-V3.1-cb) to $3.00/$4.50 (DeepSeek-V3.1/V3.2), with Meta-Llama-3.3-70B at $0.60/$1.20 and gpt-oss-120b at $0.22/$0.59. SN50 RDU and a $350M Series E announced in February 2026.

The direction of travel is model proliferation, not headline price moves: SambaNova keeps adding newer open models at tiered rates rather than re-cutting a flat per-token price, so the effective cost depends almost entirely on which model you pick.


What’s unique: SambaNova’s distinctive pricing mechanics

1. Speed as the value metric, not price. SambaNova prices per token like everyone else, but the product it’s actually selling is throughput on custom RDU silicon. Its marketing leads with record tokens-per-second, so buyers pay mid-pack token rates for top-tier latency rather than the cheapest possible token.

2. A true free tier on inference, sales-only on hardware. The same company offers a no-credit-card $5 free tier on the cloud API and a fully gated, contact-sales motion on its RDU systems — a clean split between PLG funnel and enterprise sale within one brand.

3. Per-model price spread, not per-tier. Instead of bundling tokens into plan tiers, SambaNova lets the model choice set the price: from $0.15 input for DeepSeek-V3.1-cb to $3.00 input for the full DeepSeek-V3.1/V3.2, a 20x spread in input rates on the same rate card.
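
Because input and output are priced separately, which model is cheapest depends on the workload's token mix. A minimal sketch, assuming the June 2026 rates and a hypothetical `blended_rate` helper, ranks the models by cost per 1M total tokens for a given share of input tokens:

```python
# USD per 1M tokens as (input, output), from the June 2026 rate card.
RATES = {
    "DeepSeek-V3.1-cb": (0.15, 0.75),
    "gpt-oss-120b": (0.22, 0.59),
    "gemma-4-31B-it": (0.38, 1.15),
    "Meta-Llama-3.3-70B-Instruct": (0.60, 1.20),
    "MiniMax-M2.7": (0.60, 2.40),
    "DeepSeek-R1-Distill-Llama-70B": (0.70, 1.40),
    "DeepSeek-V3.1": (3.00, 4.50),  # V3.2 is listed at the same rate
}


def blended_rate(model: str, input_share: float) -> float:
    """USD per 1M total tokens when `input_share` of the tokens are input."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    input_rate, output_rate = RATES[model]
    return input_rate * input_share + output_rate * (1.0 - input_share)


def rank_models(input_share: float) -> list[str]:
    """Model names ordered from cheapest to priciest for the given token mix."""
    return sorted(RATES, key=lambda model: blended_rate(model, input_share))


if __name__ == "__main__":
    for share in (0.75, 0.50):
        cheapest = rank_models(share)[0]
        print(f"{share:.0%} input: cheapest is {cheapest} "
              f"at ${blended_rate(cheapest, share):.4f} per 1M tokens")
```

With a 75% input mix DeepSeek-V3.1-cb is the cheapest option, but at an even input/output split gpt-oss-120b undercuts it because of its lower output rate.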


Strengths & weaknesses

| Strengths | Weaknesses |
| --- | --- |
| Public, transparent per-token rate card | Marketing-site /pricing 404s; rate card hidden on cloud subdomain |
| Genuine free tier ($5, no card) on the API | Free credits expire in 30 days |
| Differentiated on inference speed (custom RDU) | Token rates are mid-pack, not cheapest |
| OpenAI-compatible API, easy migration | Hardware/systems pricing fully opaque (sales-only) |
| Newer open models added quickly | Output/reasoning tokens make bills hard to predict |

Billing UX: SambaNova billing controls and transparency

  • Billing controls — Self-serve console issues an API key; the Free tier draws down $5 of credits, after which you add a card and pay-as-you-go on the Developer tier. Enterprise moves to subscription-based pricing with production rate limits.
  • Usage visibility — Per-token billing with separate input/output rates is shown on the public pricing page; consumption is metered against credits, then the card.
  • Payment options — Self-serve credit-card checkout for Free/Developer; sales-led contracts, invoicing, and BYOC/custom-rate-limit arrangements for Enterprise and all RDU hardware.
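
The credits-then-card metering described above can be sketched as a simple drawdown. This is an illustration under stated assumptions, not SambaNova's billing logic: it assumes a card is already on file, that unused credits are worth nothing once 30 days have passed, and the names `split_charge` and `tokens_covered` are hypothetical.

```python
from decimal import Decimal

FREE_CREDIT_USD = Decimal("5.00")  # Free-tier grant
CREDIT_LIFETIME_DAYS = 30          # credits expire 30 days after they are granted


def split_charge(usage_usd: Decimal, credit_balance: Decimal, days_since_grant: int):
    """Split a usage charge into (paid_from_credits, billed_to_card, credits_left)."""
    # Assumption: expired credits are treated as a zero balance.
    if days_since_grant >= CREDIT_LIFETIME_DAYS:
        credit_balance = Decimal("0")
    from_credits = min(usage_usd, credit_balance)
    return from_credits, usage_usd - from_credits, credit_balance - from_credits


def tokens_covered(credit_usd: Decimal, rate_per_million: Decimal) -> int:
    """Whole tokens a credit balance buys at a single per-1M-token rate."""
    return int(credit_usd / rate_per_million * 1_000_000)


if __name__ == "__main__":
    # An $8.40 workload in the first week: credits absorb $5.00, the card the rest.
    print(split_charge(Decimal("8.40"), FREE_CREDIT_USD, days_since_grant=3))
    # Input tokens the free credit buys at the Llama-3.3-70B input rate.
    print(tokens_covered(FREE_CREDIT_USD, Decimal("0.60")))
```

At the $0.60 input rate the $5 grant covers roughly 8.3M input tokens, which is enough for an evaluation but not for a pilot that outlasts the 30-day window.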

Strategic wins: Why SambaNova’s pricing decisions worked

1. Using a free token tier as a funnel into custom silicon

The $5-no-card cloud tier lets any developer try RDU-backed inference in minutes, turning a hardware company’s API into a top-of-funnel acquisition channel. See how AI companies structure pricing.

2. Competing on speed instead of racing token prices to zero

By anchoring on fastest-inference rather than cheapest-token, SambaNova avoids the deflationary token price war and justifies mid-pack rates with throughput — a value-metric choice. Related: outcome-based pricing trends.

3. Letting model choice carry the price spread

Rather than rigid plan tiers, SambaNova prices each model independently across a 20x range, so customers self-select cost/quality without a packaging negotiation. See choosing the right usage metric.


Areas to improve: Gaps in SambaNova’s pricing approach

1. Discoverability of the rate card

The marketing site’s /pricing path 404s and the real rate card lives on a separate cloud subdomain, so prospective buyers hit a dead end on the obvious URL. See bill shock and cost unpredictability.

2. Output-token predictability

With output rates 1.5–5x input and reasoning tokens billed as output, bills are hard to forecast. A token estimator or per-request cost preview would reduce surprise charges for chain-of-thought workloads.
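
A per-request preview of this kind is straightforward to build client-side. The sketch below bounds a request's cost by its output-token budget and then prices the actual usage. It assumes the response carries an OpenAI-style `usage` object with `prompt_tokens` and `completion_tokens`, and that reasoning tokens are already counted inside `completion_tokens`; both are assumptions about SambaNova Cloud's responses, and the function names are hypothetical.

```python
# USD per 1M tokens as (input, output); an excerpt of the June 2026 rate card.
RATES = {
    "Meta-Llama-3.3-70B-Instruct": (0.60, 1.20),
    "DeepSeek-R1-Distill-Llama-70B": (0.70, 1.40),
}


def max_request_cost(model: str, prompt_tokens: int, max_output_tokens: int) -> float:
    """Upper bound on one request's cost if the full output budget is used."""
    input_rate, output_rate = RATES[model]
    return (prompt_tokens * input_rate + max_output_tokens * output_rate) / 1_000_000


def actual_request_cost(model: str, usage: dict) -> float:
    """Cost of a completed request, priced from an OpenAI-style usage object."""
    # Assumption: reasoning tokens are included in completion_tokens.
    return max_request_cost(model, usage["prompt_tokens"], usage["completion_tokens"])


if __name__ == "__main__":
    model = "DeepSeek-R1-Distill-Llama-70B"
    ceiling = max_request_cost(model, prompt_tokens=1_000, max_output_tokens=4_000)
    print(f"worst case before sending: ${ceiling:.4f}")
    usage = {"prompt_tokens": 1_000, "completion_tokens": 2_500, "total_tokens": 3_500}
    print(f"billed after response: ${actual_request_cost(model, usage):.4f}")
```

Capping the output budget is the main lever here, since the output side of the bound scales with the higher of the two rates.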

3. Opaque systems pricing

The entire RDU hardware business is sales-quoted with no indicative public number, which slows evaluation for buyers comparing against GPU-cloud alternatives that publish at least banded rates.


Key takeaways

  1. SambaNova is a hybrid model — public per-token usage pricing on the cloud API, sales-quoted contracts on RDU hardware. For the underlying model, see the introduction to usage-based pricing.
  2. Token rates span ~20x by model — from $0.15 input (DeepSeek-V3.1-cb) to $3.00 input (DeepSeek-V3.1/V3.2) — so model selection, not tier, drives the bill.
  3. There’s a real free tier on inference — $5 of credits, no credit card — but it expires in 30 days.
  4. The differentiation is speed, not price — SambaNova runs open models on its own RDU silicon and sells fastest-inference at mid-pack token rates.
  5. The hardware business stays opaque — no public price for SambaStack/SambaManaged/DataScale; everything above the API is a sales conversation.

UBP implications

  1. A usage-based API can be a funnel for a non-usage product. SambaNova uses a metered, free-tier inference API to generate demand for sales-quoted silicon — usage pricing as acquisition, not just monetization.
  2. The value metric need not be the cheapest unit. Pricing per token while competing on tokens-per-second shows a usage-based vendor can hold mid-pack unit prices if it differentiates on a quality dimension buyers can feel.
  3. Per-item pricing can replace tiered packaging. Letting each model set its own rate across a wide spread lets customers self-select cost vs. quality without bundles — a clean pattern for catalogs of fungible units.

Bottom line

SambaNova is a hybrid pricing story: a transparent, usage-based inference API (SambaNova Cloud) with a free tier and per-1M-token rates from $0.15 to $3.00 input, bolted onto a sales-only RDU hardware business with no public price at all. The cloud rate card competes on speed rather than the cheapest token — SambaNova runs open models on its own silicon and sells fastest-inference — while the free $5-no-card tier funnels developers toward both pay-as-you-go usage and, eventually, enterprise systems deals. The things to watch are output-token costs and the opaque hardware pricing above the API. Browse the pricing blueprint for more fully-researched company profiles, or compare SambaNova against other Infrastructure, Compute & MLOps companies.

Pricing timeline: Major events

Each milestone below corresponds to a public pricing change, product launch, or material adjustment, listed newest first.

June 2026: Rate card spans $0.15 to $3.00 input across newer models

June 2026 SambaCloud rate card: DeepSeek-V3.1-cb $0.15/$0.75, gpt-oss-120b $0.22/$0.59, gemma-4-31B-it $0.38/$1.15, Meta-Llama-3.3-70B $0.60/$1.20, MiniMax-M2.7 $0.60/$2.40, DeepSeek-R1-Distill-Llama-70B $0.70/$1.40, DeepSeek-V3.1/V3.2 $3.00/$4.50. SN50 RDU and $350M Series E announced Feb 2026.

Late 2025: Sovereign-AI inference partnerships expand the footprint

SambaNova signs sovereign-AI inference deals (Argyll UK, Infercom Germany, OVHcloud EU, SouthernCrossAI Australia), extending the token-based cloud into region-specific clouds while keeping the published rate card.

2024 H2: SambaNova Cloud launches with a free developer tier

SambaNova opens a public, OpenAI-compatible inference API positioned on fastest-token-throughput for open models (Llama family), with a free tier and pay-as-you-go per-token billing rather than per-GPU-hour.

Trivia
  • SambaNova was founded in 2017 by Stanford professors Kunle Olukotun and Christopher Ré with ex-Oracle exec Rodrigo Liang; its 2021 Series D ($676M, SoftBank-led) valued it above $5B.
  • Its pricing pitch isn't the cheapest token; it's the fastest. SambaNova runs open models on its own RDU silicon and routinely claims record tokens-per-second for Llama, DeepSeek, gpt-oss and Gemma.
  • The public rate card lives on cloud.sambanova.ai, not sambanova.ai/pricing; the marketing domain's /pricing path 404s because the hardware business has no public price at all.

Questions & answers

How does SambaNova's pricing work?
SambaNova is hybrid. SambaNova Cloud (SambaCloud) is a public, usage-based inference API billed per 1M tokens, with a Free tier ($5 of credits, no card), a pay-as-you-go Developer tier, and a subscription-based Enterprise tier. Separately, SambaNova sells RDU-based hardware systems (SambaStack, SambaManaged) that are sales-quoted with no public rate card.
How much does SambaNova Cloud cost per million tokens?
As of June 2026, SambaCloud token rates range from $0.15 input / $0.75 output for DeepSeek-V3.1-cb to $3.00 input / $4.50 output for full DeepSeek-V3.1 and V3.2. Meta-Llama-3.3-70B-Instruct is $0.60 input / $1.20 output, gpt-oss-120b is $0.22 / $0.59, gemma-4-31B-it is $0.38 / $1.15, and MiniMax-M2.7 is $0.60 / $2.40.
Does SambaNova have a free tier?
Yes, for the cloud API. The SambaNova Cloud Free plan gives you $5 in API credits with no credit card required, access to Production models, and community support; the credits expire in 30 days. The hardware/systems business has no free tier and is sold through sales.
Is SambaNova usage-based or subscription pricing?
Both, depending on the product. The cloud inference API is pure usage-based (per-token, pay-as-you-go) on the Free and Developer tiers, shifting to subscription-based pricing on Enterprise for larger usage. The RDU hardware systems are sold as sales-quoted enterprise contracts.