What is it
Committed-Use Pricing is a pricing model where the customer commits to a minimum spend over a period (typically annual) in exchange for a discounted rate.
It is the enterprise counterpart to pure-usage pricing. Where pure usage lets a buyer start at zero and pay only for what they consume, committed-use asks the buyer to put a floor under their spend (a dollar minimum, a reserved machine, or a prepaid volume) and rewards that floor with a lower per-unit rate. The trade is symmetric: the vendor converts a volatile usage stream into predictable, bookable revenue, and the buyer converts an unpredictable bill into a known rate.
The discount is almost always the headline. Bright Data charges $8/GB for residential proxies with no commitment, but $2.50/GB once you sit on its $1,999/month committed tier — the same bytes, a third of the price. Together AI lists on-demand H100s at $6.49/hour and reserved H100s at $4.99/hour for buyers who commit to a 7–30 day cluster. Vast.ai advertises up to 50% off its on-demand marketplace rate for reserved, pre-paid capacity. The mechanism is identical across very different products: spend certainty buys rate.
Committed-use rarely stands alone. In this corpus it is almost always the top rung of a ladder that starts with self-serve usage — a free trial or pay-as-you-go entry, then volume tiers, then an annual commit reserved for the enterprise buyer. That structure is so consistent among infrastructure vendors that it has become a documented cross-corpus pattern: the infrastructure-layer commitment-pricing trend tracks how GPU-compute and data vendors have converged on offering identical “annual commit” enterprise tiers alongside on-demand rates.
How it works
Committed-use pricing has three structural variants, distinguished by what the buyer actually commits to:
| Variant | What the buyer commits to | Typical discount | Example |
|---|---|---|---|
| Dollar floor | A minimum monthly or annual spend | 20–70% | Bright Data committed tiers ($499 / $999 / $1,999/mo) |
| Reserved capacity | A specific GPU or machine for a term | Up to 50% | Vast.ai (1/3/6-month), DeepInfra DeepCluster (3-year) |
| Annual usage commit | A booked annual usage volume | Negotiated | Baseten, Fireworks AI, Groq |
The unit math is what makes the model work for both sides. A commit is a floor, not a cap: the buyer pays at least the committed amount, draws down usage against it, and — at most vendors — pays standard on-demand rates on anything above the commitment. Fireworks AI, Baseten, and Groq all state explicitly that over-commitment usage bills at list price, so the discount applies only inside the committed band.
Worked example. Suppose a team runs steady inference on Together AI H100s, about 5,000 GPU-hours a year:
- On-demand: 5,000 hr × $6.49/hr = $32,450/year, fully variable.
- Reserved commit at $4.99/hr: 5,000 hr × $4.99 = $24,950/year, a 23% saving — but the team is now on the hook for that floor even in a quiet month.
- If the team underestimates and runs 6,000 hours, the extra 1,000 hours bill at the on-demand rate, not the reserved rate.
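The arithmetic in the worked example can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's billing logic: the function names are hypothetical, the two rates are the Together AI figures quoted above, and it assumes overage bills at the on-demand rate, as the example states.

```python
from decimal import Decimal

ON_DEMAND_RATE = Decimal("6.49")  # $/GPU-hour, on-demand H100 rate quoted above
RESERVED_RATE = Decimal("4.99")   # $/GPU-hour, reserved H100 rate quoted above


def on_demand_bill(usage_hours: int) -> Decimal:
    """Pure-usage bill: every hour is charged at the on-demand rate."""
    return usage_hours * ON_DEMAND_RATE


def committed_bill(usage_hours: int, committed_hours: int) -> Decimal:
    """Commit bill: pay the whole floor, then on-demand for any overage.

    Assumption: overage is billed at the on-demand rate, not the reserved rate.
    """
    floor = committed_hours * RESERVED_RATE
    overage_hours = max(0, usage_hours - committed_hours)
    return floor + overage_hours * ON_DEMAND_RATE


# Compare both models for a quiet, an on-forecast, and a busy year.
for hours in (4_000, 5_000, 6_000):
    print(hours, on_demand_bill(hours), committed_bill(hours, committed_hours=5_000))
```

At 4,000 hours the commit still bills the full $24,950 floor, and at 6,000 hours it bills $31,440 (the floor plus 1,000 overage hours at $6.49), which is the floor-not-cap behavior described above.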
The break-even logic is the buyer’s whole decision: a commit only pays off if forecasted usage reliably clears the floor. That is why committed-use clusters in vendors whose customers have predictable, always-on workloads — GPU inference, proxy bandwidth, vector search — and is far rarer in spiky, experimental usage. See the usage-invoicing & billing-cycles guide for how commits, drawdowns, and overage reconcile on the invoice.
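One way to make the break-even concrete is to compute the usage level at which the on-demand bill equals the committed floor. The sketch below is illustrative only: the function names are hypothetical, and it assumes a single annual floor with overage billed at the on-demand rate.

```python
from decimal import Decimal


def break_even_hours(committed_hours: int, reserved_rate: Decimal,
                     on_demand_rate: Decimal) -> Decimal:
    """Usage at which the on-demand bill equals the committed floor."""
    floor = committed_hours * reserved_rate
    return floor / on_demand_rate


def commit_saving(usage_hours: int, committed_hours: int,
                  reserved_rate: Decimal, on_demand_rate: Decimal) -> Decimal:
    """On-demand bill minus commit bill; negative means the commit lost money."""
    on_demand = usage_hours * on_demand_rate
    overage_hours = max(0, usage_hours - committed_hours)
    committed = committed_hours * reserved_rate + overage_hours * on_demand_rate
    return on_demand - committed


reserved, on_demand = Decimal("4.99"), Decimal("6.49")  # rates from the example above
print(break_even_hours(5_000, reserved, on_demand))     # a little over 3,844 hours
print(commit_saving(3_000, 5_000, reserved, on_demand)) # negative: usage is below break-even
```

With the Together AI rates, a 5,000-hour commit beats on-demand only if actual usage stays above roughly 3,844 hours, about 77% of the committed volume.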
Companies using this
The companies below are all the in-corpus entries whose pricing model includes a commitment component, verified against their public pricing pages. The cluster is heavily weighted toward infrastructure (GPU compute and web-data platforms), where always-on workloads make a spend floor a rational trade.
Patterns observed
Commitment is the enterprise rung of a usage ladder, not a standalone model. None of these 19 companies sells committed-use as its only option. RunPod, Together AI, Fireworks AI, Baseten, Groq, and Replicate all lead with self-serve on-demand usage and reserve the “annual commit” badge for the enterprise tier. The commit is the destination of the PLG-to-sales upgrade path, not its starting point.
GPU compute is the center of gravity. The largest concentration is GPU-hour vendors, where the underlying capital cost makes utilization guarantees valuable to both sides. Vast.ai sells reserved machines at up to 50% off, DeepInfra offers a 3-year DeepCluster at $2.99/GPU-hour all-in, Anyscale layers annual commit discounts on its $1-per-ACU rate card, and Cerebras reserves capacity for committed orgs. This concentration is exactly what the infra-commitment-standard trend documents: the infrastructure layer has standardized on the same commit-plus-on-demand shape.
Discounts scale with commitment depth, transparently or not. The most legible version is Bright Data, which publishes a full ladder: $8/GB PAYG dropping to $7, $6, and $5/GB on the $499, $999, and $1,999 monthly committed tiers (with residential rates falling as low as $2.50/GB). Most vendors are far less transparent — Baseten, Fireworks, and Replicate say “volume discount, annual commit” and quote privately. The published-ladder approach is the exception, not the rule.
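A published ladder lets a buyer self-qualify with simple arithmetic. The sketch below picks the cheapest rung for a given monthly volume using the $8 / $7 / $6 / $5 per-GB ladder quoted above. The tier labels are hypothetical, and it assumes each tier's monthly price acts as a minimum, with usage billed at that tier's per-GB rate; the actual Bright Data terms may differ.

```python
# (label, monthly minimum in $, rate in $/GB). Labels are hypothetical;
# minimums and rates are the ladder figures quoted in the text.
LADDER = [
    ("payg", 0, 8),
    ("commit-499", 499, 7),
    ("commit-999", 999, 6),
    ("commit-1999", 1999, 5),
]


def monthly_cost(gb: int, minimum: int, rate: int) -> int:
    """Assumed billing rule: pay the tier minimum or metered usage, whichever is larger."""
    return max(minimum, gb * rate)


def cheapest_tier(gb: int) -> tuple[str, int]:
    """Return (label, monthly cost) of the cheapest rung for a monthly GB volume."""
    costs = [(label, monthly_cost(gb, minimum, rate)) for label, minimum, rate in LADDER]
    return min(costs, key=lambda item: item[1])


for gb in (50, 100, 300, 400):
    print(gb, cheapest_tier(gb))
```

Under these assumptions a 50 GB/month buyer stays on pay-as-you-go, while the top rung only wins once volume approaches 400 GB/month, which is the kind of self-service modeling a gated "contact sales" tier does not allow.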
The commitment is increasingly disguised as a tier minimum. turbopuffer doesn’t sell a contract at all — it sets a monthly minimum spend per tier ($64 / $256 / $4,096) that functions as a soft commitment floor: pick a tier, and you’ve effectively committed to that minimum even on pure usage. Lorikeet does the same with annual credit pools (18,000 / 48,000 / custom resolutions) billed monthly. The floor is the commitment, dressed as packaging.
Counterexamples & variants
Twelve Labs: the gate-only variant. Twelve Labs offers committed-use contracts, but only as a fully gated Enterprise tier: there is no published prepaid bundle or committed-use discount visible on the page. The only commitment path is “contact sales,” which means a developer on the pay-as-you-go Developer plan has no on-page incentive to commit and no way to model the saving. This is the failure mode of commitment pricing: when the discount is invisible, the floor stops being a value exchange and becomes pure friction.
m3ter: the platform-fee floor. m3ter’s “commitment” is a custom core platform fee bundling allowances for usage data ingested and bills calculated, which makes it a committed platform spend rather than a discounted usage rate. It’s a reminder that not every commitment is a discount mechanism; some are simply a minimum to access the product at all.
Reserved-capacity commits carry lock-in risk the dollar-floor variant doesn’t. Vast.ai is explicit that reserved credits are “locked to one machine” — pre-pay for a host and your capacity is tied to it, not portable across the marketplace. DeepInfra’s 3-year DeepCluster delivers the corpus’s lowest GPU rate precisely because the buyer absorbs three years of obsolescence risk on a fast-moving hardware curve. The deeper the discount, the longer the lock-in window — and on GPUs, three years is several hardware generations.
What this means for buyers vs vendors
For buyers
Commit only against the floor you are confident you will clear in your worst month, not your average month. In the Together AI example above, the saving starts shrinking as soon as usage falls below the 5,000 committed hours, because you still pay the full $24,950 floor, and it turns negative once usage drops below about 3,844 hours ($24,950 ÷ $6.49), the point at which on-demand would have cost less. Favor dollar-floor commits (Bright Data, turbopuffer minimums) over reserved-capacity commits (Vast.ai, DeepInfra DeepCluster) when your workload mix is uncertain: a dollar floor is fungible across products; a reserved machine is not. And always extract the overage rate before signing: vendors like Lorikeet that sell a pre-committed pool without a published overage rate leave you unable to model the downside. Use the pricing calculator hub to model commit-versus-on-demand break-even before you negotiate.
For vendors
Committed-use converts volatile usage into bookable revenue, which is why every infrastructure vendor in this corpus offers it. But the published-ladder approach (Bright Data) consistently outperforms the gate-only approach (Twelve Labs) at converting self-serve users into committed accounts, because the buyer can see the saving and self-qualify. The most common gap is the missing middle: Fireworks AI’s own teardown flags that self-serve customers grow into the $10K–$50K/month band with no published mid-tier discount before the Enterprise commit — a friction point that pushes growing accounts to shop around. Publish a visible commit ladder, state the overage rate, and you remove the two biggest reasons a ready buyer stalls. The usage-invoicing & billing-cycles guide covers how to reconcile commits and overage cleanly on the invoice.
| Company | Product | Pricing model | Billing units | Free tier | Verified |
|---|---|---|---|---|---|
| Anyscale | Managed Ray platform for distributed AI training, inference, and batch processing (RayTurbo, Anyscale Compute Units) | pure-usage, commitment, hybrid | gpu-hours, cpu-hours, credits | Yes | 2026-05-29 |
| Baseten | ML inference infrastructure: dedicated GPU deployments, Model APIs, and Truss framework | pure-usage, hybrid, commitment | gpu-hours, tokens, requests | Yes | 2026-05-29 |
| Bright Data | Web data platform: proxy networks, scraping APIs, a managed scraping browser, SERP and unlocker APIs, ready-made datasets, and eCommerce insights | pure-usage, hybrid, commitment, +1 | bandwidth-gb, requests, records, +1 | Yes | 2026-06-04 |
| Browse AI | No-code web scraping and website-monitoring platform that turns any site into a structured dataset or API | freemium, hybrid, commitment | credits, seats | Yes | 2026-06-04 |
| Cerebras | Wafer-scale AI inference cloud and WSE hardware systems | pure-usage, subscription, commitment | tokens, api-calls, gpu-hours | Yes | 2026-05-30 |
| Clay | AI-powered GTM data-enrichment and outbound platform billed on Actions plus Data Credits | hybrid, freemium, commitment | credits, actions | Yes | 2026-06-02 |
| DeepInfra | Serverless inference cloud: per-token LLM/embedding APIs, per-image and per-minute media models, per-hour on-demand GPU containers, and reserved DeepCluster GPU clusters | pure-usage, commitment | tokens, gpu-hours, requests, +1 | No | 2026-06-02 |
| Fireworks AI | Generative AI inference platform: serverless per-token, on-demand GPU, fine-tuning, batch API | pure-usage, hybrid, commitment | tokens, gpu-hours, requests | Yes | 2026-05-30 |
| Groq | GroqCloud: LPU-based ultra-low-latency inference API for Llama, GPT-OSS, Qwen, Whisper, and Mixtral | pure-usage, hybrid, commitment | tokens, requests, api-calls | Yes | 2026-05-29 |
| Lorikeet | AI customer-support agent that resolves chat, email, SMS, and voice tickets | outcome-based, commitment | resolutions, credits | No | 2026-06-07 |
| m3ter | Usage-based billing and metering infrastructure for B2B SaaS | hybrid, commitment | transactions, events | No | 2026-06-03 |
| Nomic | Nomic Platform (AEC agentic workflows) + Atlas data-exploration app + Nomic Embed embedding/Developer API | hybrid, seat-based, commitment, +1 | seats, tokens, credits, +2 | Yes | 2026-06-04 |
| Replicate | Cloud platform for running, fine-tuning, and deploying AI models via REST API | pure-usage, hybrid, commitment | gpu-hours, tokens, requests | Yes | 2026-05-30 |
| RunPod | GPU cloud marketplace: Secure Cloud and Community Cloud Pods, Serverless endpoints, and persistent storage | pure-usage, hybrid, commitment | gpu-hours, storage-gb | No | 2026-05-30 |
| Together AI | AI Acceleration Cloud: serverless inference, dedicated endpoints, GPU clusters, Code Sandbox, fine-tuning | pure-usage, hybrid, commitment | tokens, gpu-hours, cpu-hours, +1 | Yes | 2026-05-29 |
| turbopuffer | Serverless vector and full-text search database on object storage | pure-usage, commitment | storage-gb, vectors-indexed, gb-hours, +1 | No | 2026-06-04 |
| Twelve Labs | Video understanding foundation models (Marengo for search/embeddings, Pegasus for analysis) delivered as a usage-metered API | pure-usage, freemium, commitment | media-minutes, tokens, requests | Yes | 2026-06-02 |
| Vast.ai | GPU rental marketplace: on-demand, interruptible (spot), and reserved cloud GPUs plus autoscaling serverless inference | pure-usage, commitment | gpu-hours, storage-gb, bandwidth-gb | No | 2026-06-02 |
| Vectara | Enterprise RAG-as-a-Service and agent platform for trusted, grounded, auditable AI | commitment, subscription | credits, requests, storage-gb | No | 2026-06-02 |
FAQ
What is committed-use pricing?
Committed-use pricing is a model where the customer agrees to a minimum spend or volume over a fixed period — usually a year — in exchange for a discounted per-unit rate. The vendor gets revenue predictability; the buyer gets a lower price.
How much can you save with a committed-use discount?
In this corpus, committed discounts range from roughly 20% to nearly 70%. Bright Data cuts residential-proxy rates from $8/GB to $2.50/GB at its top committed tier, Vast.ai offers up to 50% off on-demand for reserved capacity, and Together AI drops H100s from $6.49 to $4.99/hour on a reserved commit.
What's the difference between committed-use and reserved-instance pricing?
Reserved instances are a subset of committed-use: you pre-pay or commit to a specific GPU or machine for a term (Vast.ai's 1/3/6-month reservations, DeepInfra's 3-year DeepCluster). General committed-use can also be a dollar floor with no specific resource attached, like Bright Data's $499/$999/$1,999 monthly tiers.
What happens if you exceed your commitment?
Most vendors bill overage above the commitment at standard on-demand rates. Baseten, Fireworks, and Groq all state that over-commitment usage reverts to list pricing — so the commit is a floor, not a cap.
Which AI companies use committed-use pricing?
19 in-corpus companies offer it, concentrated in GPU compute and data infrastructure: RunPod, Together AI, Fireworks AI, Baseten, Cerebras, Groq, Replicate, DeepInfra, Anyscale, Vast.ai, Bright Data, and others.
Trivia
- Bright Data's residential proxies drop from $8/GB pay-as-you-go to $2.50/GB on the $1,999/month committed tier, a 69% rate cut bought purely with a spend floor.
- DeepInfra's cheapest published GPU rate, $2.99/GPU-hour all-in, requires a 3-year reserved DeepCluster term, the longest commitment window in the corpus.
- Every GPU-compute vendor in the corpus (RunPod, Together AI, Fireworks, Baseten, Groq, Replicate, Anyscale) ships the identical 'annual commit' enterprise badge alongside on-demand rates.
Related pricing models
- Hybrid Pricing Model: A pricing model that combines a fixed recurring fee with variable usage-based charges, both meaningful to the bill.
- Seat Plus Usage Pricing: A subset of hybrid pricing where a per-user seat fee is combined with usage-based charges that typically dominate the bill at scale.
- Outcome-Based Pricing: A pricing model where the customer is charged per business outcome (a resolved support ticket, a converted lead, a closed sale) rather than per unit of input.
- Freemium Pricing: A pricing model that combines a permanently free tier with paid upgrade plans, used to drive product-led growth and self-serve acquisition.
- Subscription Pricing: A pricing model that charges a flat recurring fee, monthly or annual, with no usage component meaningful to the bill.
- Pure Usage Pricing: A pricing model where the customer pays only for what they consume, with no fixed recurring fee beyond a possible minimum.
- Seat-Based Pricing: A pricing model where the primary billing dimension is the number of named users, regardless of their consumption.