About
Snowflake Cortex is the AI layer of Snowflake’s data cloud — a set of in-database AI capabilities you call straight from SQL or Python without moving data out of the warehouse. It spans LLM functions (COMPLETE, SUMMARIZE, TRANSLATE, SENTIMENT, EXTRACT_ANSWER), the newer AISQL functions, model APIs that route to hosted models (Llama, Mistral, Claude, OpenAI, Snowflake’s own Arctic/Llama variants), plus two higher-level products: Cortex Analyst (natural-language-to-SQL over your semantic models) and Cortex Search (managed hybrid vector + keyword retrieval for RAG).
Snowflake Inc. (NYSE: SNOW) is public, with over 7,200 brands on the platform and multi-billion-dollar product revenue. Cortex is not a standalone product with its own price book — it is monetized entirely through Snowflake’s existing credit consumption model. That single design choice shapes everything about how Cortex costs behave: there are no seats, no Cortex subscription, and no separate invoice — just more credits burned against whatever rate your account already pays.
For the live rate card, see Snowflake’s pricing page and the AI pricing docs.
Pricing summary : How Snowflake Cortex’s pricing model works
Cortex is pure consumption, and every AI query stacks two metered layers on top of each other:
- AI / LLM token usage: each LLM function call is billed in Snowflake credits per 1M tokens, at a published rate that varies by model. For generative functions (AI_COMPLETE, SUMMARIZE, TRANSLATE) both input and output tokens count; for extractive functions (EXTRACT_ANSWER, SENTIMENT, AI_EMBED) only input tokens count.
- Warehouse compute: the virtual warehouse that executes the SQL is billed by the second at credits per hour by warehouse size (X-Small = 1, Small = 2, Medium = 4, Large = 8, doubling per step). Snowflake recommends staying at Medium or smaller for Cortex calls.
Both layers — and storage — draw down credits at your edition’s per-credit price: $2.00 (Standard), $3.00 (Enterprise), $4.00 (Business Critical), with Virtual Private Snowflake (VPS) sales-quoted. So the dollar cost of an identical Cortex query is literally edition-dependent: the same token volume costs 2x as much on Business Critical as on Standard.
What makes this different: there is no AI price tag. Cortex doesn’t sell tokens at a dollar rate the way OpenAI or Anthropic do — it sells credits, and the AI is just one more thing credits can buy. That makes Cortex trivially easy to adopt (it’s already in your existing contract and bill) but harder to reason about, because the headline credit rate hides both the model-token component and the warehouse-time component underneath it.
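The two stacked meters can be sketched as a small calculator. This is a minimal illustration using only the list prices and warehouse rates quoted above; the function name, parameter names, and the 600-second runtime are hypothetical, and real bills depend on your contract rates.

```python
# Credits per hour by warehouse size, as listed above (doubling per step).
WAREHOUSE_CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8}

# Per-credit list price by edition, as listed above.
CREDIT_PRICE_USD = {"standard": 2.00, "enterprise": 3.00, "business_critical": 4.00}


def cortex_query_cost_usd(input_tokens, output_tokens, in_rate, out_rate,
                          warehouse_size, runtime_seconds, edition):
    """Return (token_usd, compute_usd) for one query.

    in_rate / out_rate are credits per 1M tokens for the chosen model.
    Pass out_rate=0 for functions that bill input tokens only.
    """
    token_credits = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    compute_credits = WAREHOUSE_CREDITS_PER_HOUR[warehouse_size] * runtime_seconds / 3600
    price = CREDIT_PRICE_USD[edition]
    return token_credits * price, compute_credits * price


# Hypothetical query: 1M input + 0.5M output tokens at ~0.29 credits per 1M,
# running for an assumed 600 seconds on a Medium warehouse at Enterprise pricing.
token_usd, compute_usd = cortex_query_cost_usd(
    input_tokens=1_000_000, output_tokens=500_000,
    in_rate=0.29, out_rate=0.29,
    warehouse_size="M", runtime_seconds=600, edition="enterprise",
)
print(f"tokens ${token_usd:.2f} + compute ${compute_usd:.2f}")
```

Returning the two components separately, rather than one total, keeps the warehouse meter visible next to the headline token rate.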
Pricing by product
| Component | What it meters | Price | Notes |
|---|---|---|---|
| Edition credit price | Every credit you spend | $2.00 / $3.00 / $4.00 per credit | Standard / Enterprise / Business Critical; VPS sales-quoted |
| Cortex LLM token usage | LLM function tokens | ~0.11 credits/1M (llama 3.1-8b) up to several credits/1M (frontier/long-context) | Output tokens cost more than input on most models; generative funcs count both |
| Warehouse compute | Query runtime | 1 / 2 / 4 / 8 credits/hour (XS/S/M/L) | Recommended ≤ Medium for Cortex; doubles per size step |
| Cortex Search serving | Indexed data kept warm | 6.3 credits / GB / month | Continuous charge while service is resumed, even with zero queries |
| Cortex Search embeddings | Tokens embedded | ~0.05 credits/1M (lightweight model) | Charged on each insert/update |
| Cortex Analyst | NL-to-SQL questions | Bundled at edition credit rate | No separate fee; cost = underlying tokens + compute |
| Optimized storage | Data at rest | $23.00 / TB / month | List price; capacity (pre-paid) discounts apply |
Worked example at Enterprise ($3/credit): summarizing 1M input + 0.5M output tokens with snowflake-llama-3.3-70b (~0.29/0.29 credits per 1M) costs a fraction of a credit in tokens plus the Medium-warehouse seconds it took to run — and the compute layer can easily exceed the token cost on small queries.
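To make "the compute layer can easily exceed the token cost" concrete, the sketch below works out how many seconds of warehouse runtime burn the same number of credits as the tokens in the worked example. The helper name is hypothetical, and the calculation ignores any minimum billing increments.

```python
def break_even_seconds(token_credits, warehouse_credits_per_hour):
    """Seconds of warehouse runtime whose compute credits equal token_credits."""
    return token_credits / warehouse_credits_per_hour * 3600


# Worked example: 1M input + 0.5M output tokens at ~0.29 credits per 1M each.
token_credits = (1_000_000 * 0.29 + 500_000 * 0.29) / 1_000_000  # about 0.435 credits
seconds = break_even_seconds(token_credits, warehouse_credits_per_hour=4)  # Medium
print(f"{token_credits:.3f} token credits ~ {seconds:.1f} s of Medium warehouse time")
```

Because both meters are denominated in credits, the break-even point (roughly six and a half minutes on a Medium warehouse here) is the same on every edition. Any runtime beyond that makes compute the larger line.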
Sales motions across products: Cortex is self-serve / PLG for any existing Snowflake customer (turn it on in SQL, no procurement), while net-new platform contracts and capacity commitments are sales-led.
Hidden costs : What Snowflake Cortex users actually pay
The headline “credits per 1M tokens” rate is the part people budget for — and the part that’s almost never the whole bill.
- Warehouse time is the silent second meter. Every Cortex call runs inside a warehouse you’re already paying for by the second. Run an AI_COMPLETE across a large table and the warehouse stays hot for the whole scan; the compute credits can dwarf the token credits. Practitioners have reported five-figure single-query surprise bills from exactly this pattern.
- Edition multiplies everything. Business Critical’s $4/credit means identical AI usage costs 2x a Standard account. Teams on regulated editions for compliance reasons quietly pay a 2x AI premium.
- Long-context variants ~2x. Switching a model to its long-context variant roughly doubles the per-token rate (e.g. claude-sonnet-4-5-long-context at ~3.30/12.38 credits per 1M).
- Cortex Search is always-on. The 6.3 credits/GB-month serving charge accrues continuously while the index is resumed — you pay to keep it warm even on a day with zero searches.
- Output-token asymmetry. Output tokens cost far more than input on most models (e.g. claude-haiku-4-5 at 0.55 in / 2.75 out), so chatty generations cost more than the input size suggests.
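The output-token asymmetry in the last bullet is easy to underestimate. As a rough illustration, the sketch below prices two hypothetical workloads with the same total token count at the claude-haiku-4-5 rates quoted above; the workload splits are invented for the example.

```python
def token_credits(input_tokens, output_tokens, in_rate, out_rate):
    """Credits for one workload; rates are credits per 1M tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# claude-haiku-4-5 rates quoted above: 0.55 in / 2.75 out per 1M tokens.
IN_RATE, OUT_RATE = 0.55, 2.75

# Two hypothetical workloads, each totalling 1M tokens.
read_heavy = token_credits(900_000, 100_000, IN_RATE, OUT_RATE)   # long prompt, short answer
write_heavy = token_credits(100_000, 900_000, IN_RATE, OUT_RATE)  # short prompt, long answer
print(f"read-heavy {read_heavy:.2f} credits, write-heavy {write_heavy:.2f} credits")
```

Under these assumptions the write-heavy workload costs about 2.53 credits against 0.77 for the read-heavy one, more than three times as much for the same total token volume.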
| Line item (Enterprise, $3/credit) | Monthly cost |
|---|---|
| LLM tokens — 50M in / 25M out on snowflake-llama-3.3-70b (~0.29/0.29 per 1M) | ~$65 |
| Warehouse compute — ~40 Medium-hours running the calls (4 cr/hr x 40 x $3) | ~$480 |
| Cortex Search — 20 GB index kept warm (6.3 cr/GB x 20 x $3) | ~$378 |
| Estimated total | ~$923 |
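The arithmetic behind the table can be reproduced in a few lines. This sketch uses only the figures in the table; the dictionary keys are illustrative labels, not Snowflake billing categories.

```python
CREDIT_PRICE_USD = 3.00  # Enterprise list price per credit

line_items_credits = {
    "llm_tokens": (50 + 25) * 0.29,  # 75M tokens at ~0.29 credits per 1M
    "warehouse": 4 * 40,             # Medium (4 credits/hour) for ~40 hours
    "cortex_search": 6.3 * 20,       # 6.3 credits per GB-month on a 20 GB index
}

line_items_usd = {name: credits * CREDIT_PRICE_USD
                  for name, credits in line_items_credits.items()}
total_usd = sum(line_items_usd.values())

for name, usd in line_items_usd.items():
    print(f"{name:14s} ${usd:8.2f}  ({usd / total_usd:.0%} of total)")
print(f"{'total':14s} ${total_usd:8.2f}")
```

In this scenario the token line is well under a tenth of the total; warehouse time and the always-on Search index account for the rest.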
Pricing evolution : Snowflake Cortex pricing history and changes
Cadence
| Period | Price changes | Product / SKU additions | Notes |
|---|---|---|---|
| 2024 | None (edition rates unchanged) | Cortex LLM functions GA | Per-1M-token credit rates added to consumption table |
| 2025 | None (edition rates unchanged) | Cortex Search + Analyst | Search serving charge + Analyst bundling formalized |
| 2026 Q2 | None (edition rates unchanged) | Frontier + long-context models | New model rates added; long-context ~2x standard |
Tracked range: 2024–present. Snowflake holds edition credit prices steady ($2/$3/$4) and evolves Cortex pricing by adding per-model token rates to the Service Consumption Table rather than changing the credit price.
Notable changes
- 2024-05 — Cortex LLM functions reached GA with published per-1M-token credit rates; no new SKU.
- 2025-01 — Cortex Search introduced its continuous ~6.3 credits/GB-month serving charge; Cortex Analyst bundled at the standard credit rate.
- 2026-06 — Consumption table expanded to cover Claude 4.5 / GPT-5.5-class models and long-context variants (~2x their standard rate); edition credit prices unchanged.
What’s unique : Snowflake Cortex’s distinctive pricing mechanics
1. AI sold as credits, not tokens. Cortex never quotes a dollar-per-token rate. It publishes credits-per-1M-tokens and lets your edition’s $/credit convert it to dollars. The AI is monetized as just another credit sink alongside storage and compute — the purest expression of “AI on the warehouse.”
2. Two stacked meters per query. Almost every other LLM provider bills you once (tokens). Cortex bills the model tokens and the warehouse seconds that ran the SQL. The compute layer is frequently the larger of the two, which is unique among AI APIs.
3. Edition-multiplied AI cost. Because the credit price is set by edition, the same AI workload costs 2x on Business Critical vs Standard. Your compliance posture, not your AI usage, can be the bigger driver of your AI bill.
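The edition multiplier in point 3 can be shown with a one-step conversion. The sketch below prices the same credit total under each edition's list price; the 100-credit workload is a made-up figure and the function name is hypothetical.

```python
CREDIT_PRICE_USD = {"Standard": 2.00, "Enterprise": 3.00, "Business Critical": 4.00}


def cost_by_edition(total_credits):
    """Convert one credit total into dollars under each edition's list price."""
    return {edition: total_credits * price
            for edition, price in CREDIT_PRICE_USD.items()}


# Hypothetical month: 100 credits of Cortex tokens plus warehouse time.
costs = cost_by_edition(100)
for edition, usd in costs.items():
    print(f"{edition:18s} ${usd:,.2f}  ({usd / costs['Standard']:.1f}x Standard)")
```

The credit total never changes between rows; only the conversion rate does, which is why the edition choice can move the bill more than any model choice.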
Strengths & weaknesses
| Strengths | Weaknesses |
|---|---|
| Zero procurement friction — already in your Snowflake contract and bill | Two stacked meters (tokens + warehouse) make cost hard to predict |
| Per-model token rates are publicly published in the consumption table | Warehouse compute can quietly exceed token cost on big-table queries |
| Data never leaves the warehouse — no egress, governance stays in place | Edition multiplier means regulated accounts pay 2x for identical AI |
| Cortex Analyst bundled at the credit rate (no separate fee) | Cortex Search bills continuously to keep the index warm |
| Fully consumption-based, with no per-seat AI license | $/credit varies by edition/region, so headline rates need conversion |
Billing UX : Snowflake Cortex billing controls and transparency
- Billing controls — Cortex spend is governed by the same levers as the rest of Snowflake: resource monitors with credit quotas and auto-suspend, warehouse auto-suspend/auto-resume, and budgets. There’s no Cortex-specific cap, so guardrails are warehouse- and account-level.
- Usage visibility: Per-function token consumption is fully observable via ACCOUNT_USAGE views (CORTEX_FUNCTIONS_USAGE_HISTORY, CORTEX_SEARCH_SERVING_USAGE_HISTORY), letting you attribute credits to specific models and queries. That is strong transparency once you know where to look, but it requires SQL, not a dashboard tile.
- Payment options: On-demand (pay month-to-month by credit) or pre-paid capacity commitments that unlock volume discounts on credits and storage. Billing flows through the existing Snowflake account; Cortex never invoices separately.
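Once rows from a usage view such as CORTEX_FUNCTIONS_USAGE_HISTORY have been fetched into Python, attributing credits to models is a simple aggregation. The sketch below assumes each row is a dict; the keys `model_name` and `token_credits` are assumed names for illustration and should be checked against the view's actual columns, and the sample rows are invented.

```python
from collections import defaultdict


def credits_by_model(rows):
    """Sum token credits per model from usage-history rows.

    Each row is a dict; the keys "model_name" and "token_credits" are
    assumed names for this sketch, not confirmed column names.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["model_name"]] += row["token_credits"]
    # Largest spenders first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample rows standing in for a query result.
sample_rows = [
    {"model_name": "llama 3.1-8b", "token_credits": 0.5},
    {"model_name": "claude-haiku-4-5", "token_credits": 2.0},
    {"model_name": "llama 3.1-8b", "token_credits": 0.25},
]
for model, credits in credits_by_model(sample_rows):
    print(f"{model:20s} {credits:.2f} credits")
```

This covers only the token meter. Warehouse credits for the same queries would need to be joined in from warehouse metering data to see both meters side by side.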
Strategic wins : Why Snowflake Cortex’s pricing decisions worked
1. Frictionless adoption via the existing meter
By refusing to create a separate Cortex SKU, Snowflake made AI a one-line SELECT away for every existing customer — no new contract, no new budget approval. The credit model is the distribution channel. See usage-based pricing strategy for why removing procurement friction accelerates adoption.
2. Data-gravity lock-in
Cortex runs where the data already lives, so the pricing story (“no egress, no data movement, governance intact”) is also a retention story. Customers spend AI credits because moving data out to a standalone model API would be slower and riskier. Related: how AI companies structure pricing and outcome-based pricing trends.
3. Pricing stability through model churn
Snowflake keeps the credit price fixed and absorbs model economics into per-token credit rates in the consumption table. Customers get a stable unit of account (the credit) even as the underlying model market reprices constantly. See choosing the right usage metric.
Areas to improve : Gaps in Snowflake Cortex’s pricing approach
1. The double meter breeds bill shock
Because warehouse compute is billed alongside tokens, users routinely underestimate Cortex cost: a single AI_COMPLETE over a large table can keep a warehouse running for the whole scan, and practitioners have reported five-figure surprise charges from this pattern. See bill shock and cost unpredictability.
2. Dollar cost is one conversion removed
Quoting AI in credits, not dollars, means every team has to multiply by their edition rate to know what a query actually costs — and that rate also varies by region. A native per-function dollar estimator in the UI would close the gap.
3. Always-on Cortex Search charge
Paying 6.3 credits/GB-month just to keep a search index resumed penalizes spiky or low-query-volume RAG workloads. A true serverless / pay-per-query Search tier would better fit intermittent use.
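To see why the always-on charge penalizes low-volume workloads, the sketch below spreads the monthly serving charge across the queries actually run. It covers the serving charge only (no embedding tokens or refresh compute), defaults to the Enterprise list price, and uses a hypothetical 20 GB index and invented query volumes.

```python
def search_cost_per_query_usd(index_gb, queries_per_month,
                              credits_per_gb_month=6.3, credit_price_usd=3.00):
    """Monthly serving charge spread across queries (serving charge only)."""
    monthly_usd = index_gb * credits_per_gb_month * credit_price_usd
    return monthly_usd / queries_per_month


# Hypothetical 20 GB index at Enterprise pricing, at three query volumes.
for queries in (100, 10_000, 1_000_000):
    per_query = search_cost_per_query_usd(20, queries)
    print(f"{queries:>9,} queries/month -> ${per_query:.4f} per query")
```

At 100 queries a month the same index works out to several dollars per query, while at a million queries the serving charge per query is a small fraction of a cent.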
Key takeaways
- Cortex has no price of its own. AI is metered as Snowflake credits and converted to dollars by your edition rate ($2/$3/$4 per credit).
- Budget for two meters, not one. Every AI query costs model tokens plus warehouse seconds — and compute often wins.
- Token rates are public and per-model. From ~0.11 credits/1M (llama 3.1-8b) to several credits/1M for frontier/long-context, with output priced above input on most models.
- Cortex Search is an always-on cost (~6.3 credits/GB-month), while Cortex Analyst carries no separate fee beyond its underlying tokens and compute.
- For the data-platform category, consumption-on-the-warehouse is the winning AI monetization — zero friction, strong lock-in, predictable unit of account.
UBP implications
- Meter what you already sell. Snowflake bolted AI onto an existing credit meter instead of inventing a new unit — the fastest path to monetizing AI when you already have a consumption relationship.
- Beware stacked meters. Bundling a hidden second meter (compute) under a headline meter (tokens) maximizes revenue but erodes predictability; expose both clearly or expect bill-shock churn.
- A stable unit of account beats raw transparency. Holding the credit price fixed while repricing models underneath gives customers a steady mental model even as your COGS moves — a powerful UBP pattern for AI resellers.
Sources
- Snowflake pricing — editions & credits (accessed 2026-06-16)
- Snowflake AI (Cortex) pricing docs (accessed 2026-06-16)
- Cost considerations for Cortex AI Functions (accessed 2026-06-16)
- Understanding cost for Cortex Search Services (accessed 2026-06-16)
- Flexera — Snowflake Cortex LLM functions overview (2026) (accessed 2026-06-16)
- Modern DataTools — Snowflake Cortex pricing (2026) (accessed 2026-06-16)
Bottom line
Snowflake Cortex is the cleanest example in the corpus of “AI on the warehouse” pricing: there is no Cortex subscription and no per-seat fee — AI/LLM usage is metered as Snowflake credits (a published per-1M-token rate per model) and stacked on top of the warehouse compute that runs the query, all converted to dollars by your edition’s credit price ($2 Standard / $3 Enterprise / $4 Business Critical). That makes adoption nearly frictionless for existing Snowflake customers but means real Cortex cost lives in two meters — tokens and compute — plus always-on charges like Cortex Search’s 6.3 credits/GB-month. Budget for both, watch your warehouse sizing, and convert credits to dollars at your edition rate before you trust any headline number.
Pricing timeline : Major events, newest first
Each milestone below corresponds to a public pricing change, product launch, or material adjustment.
- 2026-06: Consumption table refreshed with frontier + long-context models. Per-1M-token credit rates expanded to cover newer models (Claude 4.5 family, GPT-5.5) and long-context variants priced at roughly 2x their standard counterparts; edition credit prices held at $2/$3/$4.
- 2025-01: Cortex Search and Cortex Analyst pricing formalized. Cortex Search introduced a continuous serving charge (~6.3 credits/GB-month of indexed data) plus embedding-token credits; Cortex Analyst bundled at the standard edition credit rate with no separate fee.
- 2024-05: Cortex LLM functions reach general availability. Snowflake GA'd Cortex LLM functions (COMPLETE, SUMMARIZE, TRANSLATE, SENTIMENT, EXTRACT_ANSWER) with per-1M-token credit pricing published in the Service Consumption Table; no new SKU, billed as standard credits.
- Cortex has no price tag of its own. It inherits Snowflake's credit rate, so the same llama call costs 2x as much on a Business Critical account ($4/credit) as on Standard ($2/credit).
- Long-context model variants are priced at roughly double their standard counterparts: claude-sonnet-4-5-long-context runs ~3.30 input / 12.38 output credits per 1M tokens.
- Cortex Search bills 6.3 credits per GB per month just to keep the index warm. You pay it whether or not anyone runs a single search query.
Questions & answers
- How is Snowflake Cortex priced?
- Cortex has no separate license. AI/LLM functions are metered in Snowflake credits — a published per-1M-token rate per model — and you also pay for the virtual warehouse compute that runs the query plus storage. Credits are billed at your edition's rate: $2 (Standard), $3 (Enterprise), or $4 (Business Critical) per credit.
- How much do Cortex LLM functions cost per token?
- Rates are quoted in credits per 1M tokens and vary by model. Lightweight models like llama 3.1-8b are about 0.11 credits per 1M input and output tokens (~$0.33/$0.33 at Enterprise's $3/credit). Larger and long-context models cost more (e.g. claude-haiku-4-5 is roughly 0.55 input / 2.75 output credits per 1M tokens), and output tokens are pricier than input on most models.
- Does Snowflake Cortex have a free tier?
- There is no standalone Cortex free tier. New Snowflake accounts come with free trial credits you can spend on anything including Cortex, but ongoing Cortex use draws down paid credits like any other Snowflake workload.
- What does Cortex Search cost on top of LLM functions?
- Cortex Search adds a serving charge of 6.3 credits per GB per month of indexed data (including vector embeddings) that runs continuously while the service is active, plus embedding token credits and the warehouse compute to refresh and query it.