What is 01.AI's pricing model?

01.AI runs a three-layer model: the Yi open-weight models (Yi-34B, Yi-1.5, Yi-VL, Yi-Coder) are free to download on HuggingFace; the Yi API bills per token from a prepaid balance on two public SKUs (yi-lightning ¥0.99/1M, yi-vision-v2 ¥6/1M); and the enterprise agent platforms (万智, 万策) are quoted through sales.

Does 01.AI offer a free tier?

Yes, in three senses. The Yi open-weight models are free to download and self-host from HuggingFace; the API platform grants ¥36 of cash credit on registration; and the rate-limit ladder starts at a Free tier (10 RPM / 80,000 TPM on yi-lightning) before any top-up. Beyond the credit, the API is pay-per-token from a prepaid balance rather than a flat subscription.

How much does the Yi API cost per million tokens?

The live platform card lists two SKUs: yi-lightning at ¥0.99 per 1M tokens and yi-vision-v2 at ¥6 per 1M tokens, both with 16K context. The price covers input and output tokens combined, and images consume roughly 500–700 tokens each. Yi-Lightning has held the ¥0.99 level since the October 2024 China price war; the older Yi-Large, Yi-Medium and Yi-Spark tiers no longer appear on the card.

What is 01.AI's intelligent routing?

Since 7 August 2025 the yi-lightning and yi-vision-v2 endpoints are routed services: the platform inspects your input and dispatches it to whichever model — DeepSeek-V3, Qwen3-30B-A3B, Qwen2.5-VL-72B-Instruct, or the Yi model itself — offers better value, while you pay one fixed per-token rate. You buy a price/latency envelope, not a specific set of weights.

Are Yi models open source?

Yes. The Yi open-weight family — Yi-34B, Yi-1.5 (6B/9B/34B), Yi-VL multimodal, and Yi-Coder — is published under Apache 2.0 on HuggingFace and free to download and self-host. The flagship Yi-Lightning and Yi-Large models are closed and served only via the API.

01.AI Pricing

AI Summary

01.AI (零一万物 / Lingyiwanwu) is Kai-Fu Lee's Chinese foundation-model lab behind the open-weight Yi model family, the Yi API, and — since 2026 — a set of enterprise AI-agent platforms.
Its open-weight models (Yi-34B, Yi-1.5 6B/9B/34B, Yi-VL, Yi-Coder) remain free to download on HuggingFace; you only pay for hosted inference.
The live API card on platform.lingyiwanwu.com now lists exactly two SKUs: yi-lightning at ¥0.99 per 1M tokens and yi-vision-v2 at ¥6 per 1M tokens, both with a 16K context window and billed on combined input + output tokens.
Since 7 August 2025 both SKUs are 'intelligent routing' services: the platform routes a request to DeepSeek-V3, Qwen3-30B-A3B, Qwen2.5-VL-72B-Instruct or the Yi model behind one fixed per-token price, so the branded SKU is now a router rather than a single model.
Access is prepaid: new accounts get ¥36 in cash credit, any top-up carries a bonus of 20% or more and instantly lifts the account to rate-limit Tier 3, and the Free/Tier 1–5 ladder governs RPM and TPM rather than price.
The company's primary motion is sales-led enterprise: the Wanzhi (万智) platform (2025-03) and the TrueNorth (万策) decision platform with Boss AI / Investor AI / TopSales AI agents (2026-07) are quoted through consultations with no public price. 01.AI is reported (Tech in Asia, July 2026) to be targeting a 2027 Hong Kong IPO and repositioning as 'China's Palantir,' emphasising enterprise and government deployment over open-model distribution, a direction that points its priced surface toward negotiated deployment contracts rather than the public ¥0.99 token card, though nothing on the rate card has moved yet.

Pricing summary

01.AI 2026 — free Yi open weights + a two-SKU routed API, with enterprise agents sold by quote

The Yi open weights stay free to self-host. The live platform card now lists exactly two prepaid per-token SKUs — yi-lightning ¥0.99/1M and yi-vision-v2 ¥6/1M — while the 万智 / 万策 enterprise agent platforms are quoted through sales.

Plan	Price	Best for
Open weights	Free	Developers self-hosting Yi models
yi-lightning (flagship rate)	¥0.99 / 1M tokens	Real-time chat and high-complexity reasoning
yi-vision-v2	¥6 / 1M tokens	Image, chart, OCR and document understanding
Enterprise agent platforms	Quoted through sales	Large enterprises and sovereign-AI buyers

Prices are read from the live platform card (platform.lingyiwanwu.com/docs, captured 2026-07-22) and quoted in RMB, roughly $0.14 and $0.85 per 1M tokens at about 7.1 RMB to the US dollar. New accounts receive ¥36 of cash credit; any top-up adds a bonus of 20% or more and lifts the account to rate-limit Tier 3. The older Yi-Large / Yi-Medium / Yi-Spark ladder no longer appears on the card.

About

01.AI (零一万物 / Lingyiwanwu) is a Chinese foundation-model company founded in March 2023 by Kai-Fu Lee — the former head of Google China and founder of Sinovation Ventures. It builds the Yi model family: open-weight models (Yi-34B, Yi-1.5, Yi-VL, Yi-Coder) published free on HuggingFace, alongside a small hosted API that today sells exactly two per-token SKUs — yi-lightning and yi-vision-v2. It also ships consumer apps — Wanzhi (万知) and the international PopAI — and, increasingly, sells enterprise AI-agent platforms to large accounts.

01.AI reached unicorn status (over $1B valuation) within eight months of launching, with backers including Alibaba and Xiaomi. Kai-Fu Lee’s cost-efficiency thesis is central to its pricing identity: the company has claimed it trained a top-tier model on about 2,000 GPUs for roughly $3 million, versus an estimated $80–100 million for comparable frontier runs — a gap it converts directly into cheap token rates.

The strategic story since 2024 is a deliberate retreat from the frontier. After Yi-Large (May 2024) and Yi-Lightning (October 2024) made 01.AI a price-war protagonist, the company scaled back independent frontier pre-training in early 2025, formed a joint “industrial large model laboratory” with Alibaba Cloud (Alibaba trains the giant models; 01.AI builds smaller, cost-efficient ones), and reoriented toward sales-led enterprise solutions. By mid-2026 that reorientation is the whole company: the corporate site leads with a “Top-down Thesis” (一号位工程) transformation programme, AI consulting plus forward-deployed engineers, sovereign-AI deployments, and the TrueNorth (万策) decision platform launched in July 2026 with three packaged agents — Boss AI, Investor AI and TopSales AI. None of it carries a public price; the only call to action is “Book a Consultation”.

What survives on the self-serve side is a deliberately small API. The developer platform still publishes an open RMB card, but it lists only two SKUs — and since 7 August 2025 both are intelligent routing services that dispatch a request to DeepSeek, Qwen or a Yi model behind one fixed per-token price. 01.AI sits in the same sovereign-lab cluster as Mistral AI, but where Mistral sells its own weights through its own card, 01.AI now sells a price/latency envelope that may be served by someone else’s model.

For the latest on 01.AI’s models and platform, visit the Yi developer platform or 零一万物.

Pricing summary: free open weights, a two-SKU routed API, quoted enterprise agents

01.AI prices on three dimensions — tokens (the only meter on the API), prepaid credit balance (how you fund those tokens, and what sets your rate limits), and quoted engagements (everything on the enterprise side). The layers:

Open weights (free). The Yi open-weight family — Yi-34B / Yi-34B-200K, Yi-1.5 (6B/9B/34B), Yi-VL multimodal, and Yi-Coder — is published on HuggingFace and free to download. You self-host; your only cost is your own compute. This was 01.AI’s first go-to-market (Yi-34B, November 2023) and is still live.
Yi API (per token, prepaid). The public card at platform.lingyiwanwu.com/docs lists two SKUs: yi-lightning at ¥0.99 per 1M tokens and yi-vision-v2 at ¥6 per 1M tokens, both with a 16K context window. The billing unit is the token, and the rate is charged on input and output tokens combined — including conversation history replayed as input, and roughly 500–700 tokens per image on the vision model. There is no separate input/output split and no long-context surcharge on the card.
Intelligent routing (since 2025-08-07). Both SKUs are routers, not single models: the platform reads your input and dispatches it to DeepSeek-V3, Qwen3-30B-A3B or Yi-Lightning (for yi-lightning) and Qwen2.5-VL-72B-Instruct or Yi-Vision-V2 (for yi-vision-v2), “to provide better value for money.” The price stays fixed regardless of which model answers.
Prepaid balance + rate-limit tiers. Access is funded by top-ups, not subscriptions. New accounts get ¥36 of cash credit; any top-up carries a bonus of 20% or more and, under the current promotion, lifts the account straight to Tier 3. Tiers govern throughput only — Free/Tier 1 allows 10 RPM and 80,000 TPM on yi-lightning, rising to 200 RPM / 400,000 TPM at Tier 5. The ladder is bought with cumulative top-up, at published thresholds: any amount → Tier 1, ¥500 → Tier 2, ¥2,000 → Tier 3, ¥10,000 → Tier 4, ¥100,000 → Tier 5. Vouchers do not count toward the cumulative top-up total that determines your tier.
Enterprise agent platforms (sales-led, no public price). The Wanzhi (万智) enterprise LLM platform (2025-03) and the TrueNorth (万策) decision platform (2026-07) with Boss AI / Investor AI / TopSales AI, plus AI-transformation consulting with forward-deployed engineers and sovereign-AI deployments, are all scoped through a “Book a Consultation” form. No list price, no published floor.
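The metering rules in the Yi API layer reduce to one multiplication, which a short sketch makes concrete. This is an illustration under stated assumptions, not 01.AI's billing code: the function names are hypothetical, the 600-token image figure is simply the midpoint of the documented range, and the 7.1 RMB/USD conversion is the approximate rate used on this page.

```python
# Rates from the public card, in RMB per 1M tokens (input + output combined).
RATES_RMB_PER_1M = {"yi-lightning": 0.99, "yi-vision-v2": 6.0}

# Assumptions for illustration only (not published billing constants):
TOKENS_PER_IMAGE = 600  # midpoint of the documented ~500-700 tokens per image
RMB_PER_USD = 7.1       # approximate conversion used on this page


def estimate_cost_rmb(model, input_tokens, output_tokens, images=0):
    """Estimate the RMB charge for a workload on one SKU.

    Input tokens should already include any conversation history
    that is replayed with each request, since that is billed as input.
    """
    if images and model != "yi-vision-v2":
        raise ValueError("images are only metered on yi-vision-v2")
    billable = input_tokens + output_tokens + images * TOKENS_PER_IMAGE
    return billable / 1_000_000 * RATES_RMB_PER_1M[model]


def to_usd(rmb):
    """Convert RMB to approximate USD at the assumed rate."""
    return rmb / RMB_PER_USD


text_cost = estimate_cost_rmb("yi-lightning", 150_000_000, 50_000_000)
print(f"yi-lightning, 200M tokens: ¥{text_cost:.2f} (~${to_usd(text_cost):.2f})")
```

Because input and output share one rate, the split between them does not change the bill; only the total does.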

What makes this different: the branded SKU is no longer the model. 01.AI sells a fixed price-per-token envelope and reserves the right to serve it from a competitor’s weights — a routed, cost-arbitraged meter that most labs keep hidden inside their own stack. Meanwhile the money moved to quoted agent platforms, so the ¥0.99 card is now a small, honest shop window in front of a consulting business.

The API card renders publicly without login, so this page sets price_transparency: public — with the caveat that the enterprise platforms carry no published price at all.

Pricing by product

Yi API (developer plans)

Model	Price / 1M tokens	Context & included	Key mechanics
yi-lightning	¥0.99 (~$0.14)	16K context; ¥36 signup credit applies	Routed to DeepSeek-V3 / Qwen3-30B-A3B / Yi-Lightning; input + output billed together
yi-vision-v2	¥6 (~$0.85)	16K context; multi-image input	Routed to Qwen2.5-VL-72B-Instruct / Yi-Vision-V2; ~500–700 tokens per image
Prepaid balance	Top-up, any amount	Bonus of 20%+ on any top-up	Cumulative total buys rate-limit tiers at ¥500 / ¥2,000 / ¥10,000 / ¥100,000; vouchers excluded from that total
New-account credit	¥36 cash credit	Granted on registration	Under the current promotion, any top-up also lifts the account straight to Tier 3

Only these two SKUs appear on the live card. The 2024-era ladder — Yi-Large, Yi-Large-Turbo, Yi-Medium, Yi-Medium-200K, Yi-Vision, Yi-Spark — is no longer listed or priced on the platform.

Yi API (rate-limit tiers)

Tier	yi-lightning RPM / TPM	yi-vision-v2 RPM / TPM	How you get it
Free / Tier 1	10 / 80,000	12 / 48,000	Free on registration; any cumulative top-up amount reaches Tier 1
Tier 2	40 / 120,000	40 / 120,000	¥500 cumulative top-up
Tier 3	120 / 160,000	120 / 240,000	¥2,000 cumulative top-up — or instantly, under the current top-up promotion
Tier 4	120 / 240,000	200 / 400,000	¥10,000 cumulative top-up
Tier 5	200 / 400,000	400 / 800,000	¥100,000 cumulative top-up

Tiers price throughput, not tokens: the per-token rate is identical at every tier. RPM = requests per minute, TPM = tokens per minute, both measured per model. A sixth tab labelled “Tier S” exists on the console but publishes neither a threshold nor rate limits.
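Since tiers gate throughput rather than price, a useful pre-purchase check is whether a workload fits under a tier's ceilings. The sketch below copies the limits from the table above; the function name and the choice to describe a workload by requests per minute and average tokens per request are assumptions made for illustration.

```python
# (tier name, RPM, TPM) per model, copied from the rate-limit table.
LIMITS = {
    "yi-lightning": [
        ("Free / Tier 1", 10, 80_000),
        ("Tier 2", 40, 120_000),
        ("Tier 3", 120, 160_000),
        ("Tier 4", 120, 240_000),
        ("Tier 5", 200, 400_000),
    ],
    "yi-vision-v2": [
        ("Free / Tier 1", 12, 48_000),
        ("Tier 2", 40, 120_000),
        ("Tier 3", 120, 240_000),
        ("Tier 4", 200, 400_000),
        ("Tier 5", 400, 800_000),
    ],
}


def lowest_tier_for(model, requests_per_min, tokens_per_request):
    """Return the first tier whose RPM and TPM both cover the workload.

    Returns None when even Tier 5 is too small.
    """
    tokens_per_min = requests_per_min * tokens_per_request
    for name, rpm, tpm in LIMITS[model]:
        if requests_per_min <= rpm and tokens_per_min <= tpm:
            return name
    return None


print(lowest_tier_for("yi-lightning", 30, 5_000))
```

A workload can be bound by either ceiling: 30 requests per minute at 5,000 tokens each fits Tier 2 on RPM but needs Tier 3 for its 150,000 TPM.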

Open-weight Yi models (free to self-host)

Model	Price	Included	Key mechanics
Yi-34B / Yi-34B-200K	Free (weights)	34B params, 4K–200K context variants	#1 on the HuggingFace base-model board at release
Yi-1.5 (6B / 9B / 34B)	Free (weights)	Base + chat checkpoints, 4K/16K/32K context	Self-host; pay your own compute only
Yi-VL (6B / 34B)	Free (weights)	Multimodal image+text	Open multimodal
Yi-Coder (1.5B / 9B)	Free (weights)	Sub-10B code models	Open coding models

The open-weight family carries no meter at all — it is the only 01.AI surface where the buyer’s bill is entirely their own infrastructure.

Enterprise & consumer surfaces

Surface	Price	Key mechanics
TrueNorth / 万策 decision platform (2026-07)	Contact us	Boss AI, Investor AI, TopSales AI packaged agents; quoted via “Book a Consultation”
Wanzhi Enterprise (万智) LLM platform (2025-03)	Contact us	Model deployment, application practice, fine-tuning tooling
AI transformation consulting + FDE	Contact us	Forward-deployed engineers co-build with the customer team
Sovereign AI / industry solutions	Contact us	Supply chain, manufacturing, energy, agriculture, investment, education, retail
Wanzhi (万知) consumer assistant	Free (legacy)	Launched free in 2024; no longer listed among products on the 2026 corporate site
PopAI	Freemium (legacy)	International consumer app; no longer listed on the 2026 corporate site

Sales motions across products: PLG / self-serve for the open weights and the two-SKU prepaid API; sales-led for the 万策 / 万智 platforms, consulting, and sovereign-AI deployments, which are the company’s primary revenue motion.

Hidden costs: What 01.AI users actually pay

01.AI’s headline numbers look almost free — open weights at $0 and Yi-Lightning at ~$0.14/1M — but the real cost depends heavily on which layer you use. Two archetypes show how the total assembles.

Archetype 1 — a developer self-hosting Yi-34B. The weights are free, but you pay for GPUs. Assume one A100-class instance running roughly 720 hours/month at a typical cloud rate.

Line item	Monthly cost
Yi-34B weights (Apache 2.0)	$0
GPU inference compute (1 A100-class, ~720 hrs)	~$1,000–1,500
Ops / serving overhead (est.)	~$200
Estimated total	~$1,200–1,700/mo

The lesson: “free open weights” shifts cost from a per-token line to your own infrastructure. For low volume, the closed API is far cheaper; self-hosting wins only at scale or where data residency demands it.
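The break-even point behind that lesson can be worked out directly. The sketch below is rough arithmetic using this page's own figures (the estimated $1,200 to $1,700 monthly self-hosting bill and the approximate $0.14 per 1M yi-lightning rate); the function name is hypothetical, and the result ignores any quality difference between a self-hosted Yi-34B and the routed API.

```python
API_USD_PER_1M = 0.14  # yi-lightning, approximate USD rate quoted on this page


def breakeven_tokens_per_month(self_host_usd_per_month, api_usd_per_1m=API_USD_PER_1M):
    """Monthly token volume at which API spend equals the self-hosting bill."""
    return self_host_usd_per_month / api_usd_per_1m * 1_000_000


low = breakeven_tokens_per_month(1_200)
high = breakeven_tokens_per_month(1_700)
print(f"Break-even: {low / 1e9:.1f}B to {high / 1e9:.1f}B tokens per month")
```

At the low end that is about 8.6B tokens a month, or roughly 3,300 tokens per second sustained around the clock, which may be more than a single instance can serve. Below that volume the per-token API is the cheaper option on these assumptions.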

Archetype 2: a developer calling the two-SKU API. At ¥0.99/1M (~$0.14) for text and ¥6/1M (~$0.85) for vision, even heavy usage stays cheap; the cost that bites is everything the card doesn’t meter.

Line item	Monthly cost
yi-lightning — 200M tokens @ ~$0.14/1M	~$28
yi-vision-v2 — 20M tokens @ ~$0.85/1M (≈30k images at ~650 tokens each)	~$17
Multi-turn history replayed as input (~25% of text tokens)	~$7
Prepaid top-up to clear Free-tier limits (10 RPM / 80,000 TPM)	Balance, not net-new spend
Platform onboarding — CN registration + 用户信息 identity verification	Friction, not $
Estimated total	~$52/mo + access overhead

The bill is trivial; three non-price costs are not. First, conversation history is billed as input on every turn, so a chatty agent quietly re-pays for its own context, and that is the one line that scales faster than your request count. Second, throughput is bought separately from tokens: the per-token rate is identical at every tier, so escaping the Free tier’s 10 RPM / 80,000 TPM ceiling means parking cash in a prepaid balance you may not spend, and vouchers don’t count toward the cumulative total that promotes you. Third, the on-ramp still runs through the Chinese platform with identity verification: the ¥36 signup credit and the 20%-plus top-up bonus are real, but they sit behind a workflow that non-China buyers have to deliberately opt into.
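The history effect is easy to underestimate, so here is a minimal sketch of it. It assumes a client that resends the full conversation on every turn with no truncation or summarisation, and fixed message sizes; the function name and the example sizes are illustrative, not taken from 01.AI's docs.

```python
def billed_tokens(turns, user_tokens, reply_tokens):
    """Total billed tokens for one conversation when full history is resent each turn."""
    total = 0
    history = 0
    for _ in range(turns):
        prompt = history + user_tokens   # earlier turns are billed again as input
        total += prompt + reply_tokens   # input and output are billed together
        history = prompt + reply_tokens  # this turn joins the replayed history
    return total


turns, user, reply = 10, 200, 300
new_tokens = turns * (user + reply)
print(f"new content: {new_tokens} tokens, billed: {billed_tokens(turns, user, reply)} tokens")
```

For a 10-turn chat with 200-token user messages and 300-token replies, the conversation contains 5,000 new tokens but is billed for 27,500. The gap grows quadratically with turn count until the 16K context cap forces the client to truncate.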

The fourth cost has no line at all: since the 2025-08-07 routing switch, you cannot pin which model answers. If DeepSeek-V3 and Yi-Lightning behave differently on your prompt, your eval variance is a cost you carry, not one 01.AI prices.
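One way to put a number on that variance is to replay the same prompt and measure how often the answers agree. The sketch below is a generic measurement helper, not part of any 01.AI SDK: `call_model` is a hypothetical stand-in for whatever client function you use, and exact-match agreement after light normalisation is a deliberately crude metric.

```python
from collections import Counter


def agreement_rate(outputs):
    """Share of responses equal to the most common response (1.0 means fully consistent)."""
    if not outputs:
        raise ValueError("need at least one output")
    normalised = [o.strip().lower() for o in outputs]
    _, top_count = Counter(normalised).most_common(1)[0]
    return top_count / len(normalised)


def sample(call_model, prompt, n=10):
    """Call a user-supplied function n times with the same prompt.

    call_model is hypothetical: any callable that takes a prompt string
    and returns the response text.
    """
    return [call_model(prompt) for _ in range(n)]


# Stubbed example; replace the lambda with a real client call.
outputs = sample(lambda prompt: "Yes", "Is 17 prime?", n=5)
print(agreement_rate(outputs))
```

Sampling settings also introduce variance, so a low agreement rate does not by itself prove that different models answered.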

Want to estimate your own 01.AI bill? Use the 01.AI pricing calculator to model token volume against the yi-lightning and yi-vision-v2 rates, or weigh self-hosting compute versus the per-token API.

Pricing evolution: 01.AI pricing history and changes

01.AI’s pricing arc is unusually sharp: open weights first, then a price-war sprint to the cheapest flagship token rate in China, then a sales-led enterprise pivot, and a public card that shrank from seven priced SKUs to two within about seven weeks (between the 2 December 2024 and 22 January 2025 snapshots) without the flagship rate moving at all. The milestones below are reconstructed from official launch announcements, the live platform docs, dated archive snapshots of that card, and contemporaneous press.

Cadence

Quarter	Price changes	Product / SKU additions	Notes
2023 Q4	0	1	2023-11 Yi-34B open weights released free (Apache 2.0)
2024 Q2	1	2	2024-05 Yi-Large closed API at ¥20/1M; Wanzhi free assistant ships
2024 Q3	1	0	2024-08 international platform.01.ai suspends API; routes to CN platform
2024 Q4	1	1	2024-10 Yi-Lightning cut to ¥0.99/1M (~$0.14) in the China price war
2025 Q1	1	0	by 2025-01-22 the card collapses from seven priced SKUs to two — yi-lightning ¥0.99/1M and yi-vision-v2 ¥6/1M survive, Yi-Large ¥20 / Yi-Large-Turbo ¥25 / Yi-Medium-200K ¥12 / Yi-Medium ¥2.5 / Yi-Spark ¥1 are delisted; 2025-01/03 pre-training scaled back; Alibaba Cloud industrial-model lab; enterprise pivot
2025 Q3	0	2	2025-08-07 yi-lightning and yi-vision-v2 become intelligent-routing services at unchanged rates
2026 Q3	0	1	2026-07 TrueNorth (万策) ships with Boss AI / Investor AI / TopSales AI, quoted; 2026-07-20 01.AI reported targeting a 2027 Hong Kong IPO and a “China’s Palantir” enterprise-deployment repositioning (no rate-card change); 2026-07-22 rate card re-verified against live and archived sources — ¥0.99 and ¥6 unchanged, no price movement

Tracked range: 2023 Q4–2026 Q3. Quarters not listed had no publicly announced price or SKU change. RMB rates are labeled with approximate USD conversions at about 7.1 RMB to the dollar.

Notable changes

2023-11 — Yi-34B open weights released free under Apache 2.0; ranked #1 on HuggingFace’s pretrained base board (open weights, not a paid API, as the first go-to-market).
2024-05 — Yi-Large closed API launches at ¥20/1M (~$2.7), under a third of GPT-4 Turbo; Kai-Fu Lee declares “cash burning is no longer a winning strategy.”
2024-08 — International platform.01.ai suspends API services and routes users to the Chinese platform.lingyiwanwu.com — the first narrowing of the public, English-card API surface.
2024-10 — Yi-Lightning cut to ¥0.99/1M (~$0.14), ~40% faster than the prior flagship; LMSYS Chatbot Arena top-6 globally / #1 in China. The signature 01.AI beat in the 2024 price war.
by 2025-01-22 — The public rate card collapses from seven priced SKUs to two. Yi-Large (¥20/1M), Yi-Large-Turbo (¥25), Yi-Medium-200K (¥12), Yi-Medium (¥2.5) and Yi-Spark (¥1) are delisted; only yi-lightning (¥0.99) and yi-vision-v2 (¥6) remain. No surviving rate changed — the ladder around them was simply deleted, and the card kept rendering publicly without a login.
2025-01 — Reports that 01.AI is selling its pre-training team to Alibaba; Kai-Fu Lee denies a sale, reframing the Alibaba relationship as a partnership.
2025-03 — Joint “industrial large model laboratory” with Alibaba Cloud; frontier pre-training scaled back; pivot to sales-led enterprise vertical solutions (finance, energy, gaming), as Kai-Fu Lee described it to KrASIA.
2025-08-07 — A platform notice converts yi-lightning and yi-vision-v2 into intelligent-routing services: each request is dispatched to DeepSeek-V3, Qwen3-30B-A3B or Yi-Lightning (text), or Qwen2.5-VL-72B-Instruct or Yi-Vision-V2 (vision), at an unchanged per-token rate. No price moved; what the price buys did.
2026-07 — TrueNorth (万策) enterprise decision platform ships with Boss AI, Investor AI and TopSales AI, all quoted via “Book a Consultation” — the packaging is named and productised, the pricing is not.
2026-07-20 — 01.AI is reported (Tech in Asia; finance.biggo.com) to be targeting a 2027 Hong Kong IPO and repositioning as “China’s Palantir,” emphasising enterprise and government deployment over open-model distribution. No published rate moved, but it puts a public-markets clock on the pivot: an enterprise-deployment posture typically replaces published per-token rates with negotiated deployment contracts, so the ¥0.99 shop window is the surface most exposed at the next sweep.
2026-07-22 — Rate card re-verified at platform.lingyiwanwu.com/docs against both the live page and independent archive snapshots: no price movement. yi-lightning holds ¥0.99/1M and yi-vision-v2 holds ¥6/1M, and the card has rendered publicly, without a login, continuously since at least 2025-01-22. A June 2026 review that recorded the card as gated was a capture failure, not a vendor change.

The card that shrank — in detail

The two-step from ¥20/1M (Yi-Large, May 2024) to ¥0.99/1M (Yi-Lightning, October 2024) is the entire story of China’s 2024 LLM price war compressed into one lab. Yi-Large arrived already cheap relative to GPT-4 Turbo; five months later Yi-Lightning was roughly 20× cheaper still, matching peers like Alibaba Qwen, ByteDance Doubao, DeepSeek and Zhipu in a race toward ~¥1/1M and below.

What happened next is the more instructive half, and the July 2026 recheck reframes it. The narrowing was real, but it was a narrowing of the catalogue, not of transparency. Snapshots of the docs billing card show seven priced SKUs on 2 December 2024 and only two by 22 January 2025 — and the card itself has rendered publicly, without a login, on every snapshot since. The headline rate is exactly the same ¥0.99 it was in October 2024: more than 18 months of flat nominal pricing on the flagship SKU, published the whole time.

The change is in the shape, not the number. Three things happened at once:

The ladder collapsed. Seven priced SKUs became two. A buyer who wanted a long-context or premium option — Yi-Medium-200K, Yi-Large — no longer has one to buy; the catalogue is a fast SKU and a vision SKU, both capped at 16K context. 01.AI stopped selling choice and started selling a single price point.
The SKU stopped being a model. Since 2025-08-07 the ¥0.99 buys a routed envelope, and the weights answering may be DeepSeek’s or Alibaba’s. 01.AI is now, in part, a reseller of its competitors’ inference at a fixed price, which is precisely how it can hold ¥0.99 while capping its own training near Yi-Lightning scale.
The meter moved off price and onto throughput. The published Tier 1–5 ladder charges the same per-token rate at every level; what a top-up buys is RPM and TPM, at ¥500 for Tier 2, ¥2,000 for Tier 3, ¥10,000 for Tier 4 and ¥100,000 for Tier 5 in cumulative spend. Combined with the ¥36 credit and the 20%-plus bonus, the commercial lever is balance held, not rate paid.
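The ladder in the last item can be read as a lookup from cumulative cash top-up to tier. The following sketch encodes the published thresholds and the current promotion; the function names are hypothetical, the 20% figure is treated as a lower bound because the bonus is quoted as "20% or more", and whether bonus credit counts toward the cumulative total is not stated, so only cash paid is counted here.

```python
# Cumulative top-up thresholds in RMB for Tiers 2-5, from the published ladder.
THRESHOLDS = ((500, 2), (2_000, 3), (10_000, 4), (100_000, 5))
MIN_BONUS = 0.20  # "20% or more": used as a lower bound, not an exact rate


def tier_for(cumulative_topup_rmb, promo_active=False):
    """Return 0 for the Free tier, otherwise the tier number 1-5.

    cumulative_topup_rmb is cash paid only; vouchers are excluded.
    """
    if cumulative_topup_rmb <= 0:
        return 0
    tier = 1  # any top-up amount reaches Tier 1
    for threshold, number in THRESHOLDS:
        if cumulative_topup_rmb >= threshold:
            tier = number
    # Under the current promotion any top-up lifts the account to at least Tier 3.
    return max(tier, 3) if promo_active else tier


def min_balance_after(topup_rmb):
    """Lowest balance credited for a top-up, assuming the minimum 20% bonus."""
    return topup_rmb * (1 + MIN_BONUS)


print(tier_for(500), tier_for(500, promo_active=True), min_balance_after(500))
```

With the promotion active, the thresholds below ¥2,000 stop mattering: the smallest top-up and a ¥500 top-up both land on Tier 3.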

Read together, this is not a company that hid and restored its token pricing so much as one that has reduced the priced surface to the smallest thing it can defend — a single credible rate, served however is cheapest — while the actual business, the TrueNorth agent platform, stays behind a consultation form. The pricing lesson from 2024 stands and sharpens: winning a price war does not keep you in the business of selling priced tokens. It may only leave you with a shop window.

What’s unique: 01.AI’s distinctive pricing mechanics

1. The cheapest flagship token in the 2024 China price war. Yi-Lightning at ¥0.99/1M (~$0.14) wasn’t just cheap — it was a top-6 LMSYS Arena model priced below GPT-4o-mini. 01.AI deliberately decoupled benchmark rank from price, using a claimed $3M / 2,000-GPU training cost to justify a rate that closed-frontier labs couldn’t match. The price was the product story.

2. The SKU is a price envelope, not a model. This is 01.AI’s most unusual mechanic. Since 2025-08-07, yi-lightning and yi-vision-v2 are routers: the platform reads your input and answers it with DeepSeek-V3, Qwen3-30B-A3B, Qwen2.5-VL-72B-Instruct or a Yi model, and charges one fixed per-token rate either way. Most vendors that arbitrage inference this way hide it inside a product; 01.AI documents it on the rate card and names the competitors it routes to. The commercial logic is clean — a lab that capped its own training near Yi-Lightning scale can still hold ¥0.99 by buying the cheapest adequate inference — but it inverts what a buyer normally purchases. You are guaranteed a price and a latency band, not a set of weights, and you cannot pin the model that serves you.

3. A two-SKU shop window in front of a quoted business. The catalogue went from seven priced SKUs in December 2024 to exactly two by January 2025, both capped at 16K context, while the enterprise side grew a named product line (TrueNorth 万策, Boss AI / Investor AI / TopSales AI) with no published price at all. The pure-usage card is now deliberately minimal — a single credible rate that proves cost discipline — and the pricing mechanic that carries revenue isn’t on it.

4. Throughput, not tokens, is what money buys. The published Free/Tier 1–5 ladder charges an identical per-token rate at every level; a top-up only raises RPM and TPM. Layered on top are a ¥36 registration credit, a 20%-plus bonus on any top-up, and an instant promotion to Tier 3 — so the real commercial lever is prepaid balance held, not rate paid, closer to prepaid-credit mechanics than to a subscription. The docs are explicit that vouchers don’t count toward the cumulative total that sets your tier, which tells you the balance itself is the thing being sold.

5. A foundation lab that priced its way out of frontier pricing. After winning on token price, 01.AI concluded that selling priced tokens against giants was a losing game and shifted to sales-led vertical solutions with Alibaba doing the heavy training. Like Mistral AI it gives open weights away under Apache 2.0 — but where Mistral doubled down on selling its own models through its own card, 01.AI kept the card and gave up the models behind it.

Strengths & weaknesses

Strengths	Weaknesses
The public card renders without a login and holds ¥0.99/1M (~$0.14), the same rate as October 2024 and archived continuously at that rate since	The catalogue collapsed from seven priced SKUs to two within about seven weeks (Dec 2024 → Jan 2025), both capped at 16K context, leaving no long-context or premium SKU to buy
Only two SKUs and one meter: an evaluator can price a workload in under a minute	Both SKUs are routed — you buy a price band, not a model, and cannot pin the weights that answer you
Metering rules are stated plainly in docs: token-based, input + output combined, history billed as input, ~500–700 tokens per image	The rate card is RMB-only on the Chinese platform; the international on-ramp closed in 2024 and never reopened
Rate-limit ladder is published with exact RPM/TPM per tier, so throughput ceilings are visible before spend	Escaping the 10 RPM Free tier requires parking prepaid balance, and vouchers don’t count toward the tier total
Open weights (Apache 2.0) let buyers self-host and avoid the routing question entirely	Enterprise platforms (TrueNorth 万策, Wanzhi 万智) are fully sales-gated — no public floor price or worked example
Cost-efficiency thesis ($3M / 2,000-GPU training) plus routing to cheaper third-party inference makes ¥0.99 structurally defensible	Frequent strategic resets (price war → narrowing → routed two-SKU card) make the priced surface hard to plan against

Billing UX: 01.AI billing controls and transparency

账单概览 (Billing overview) — a first-class console page showing the prepaid balance and spend against it, sitting alongside 调用明细 (Call details), the per-request usage log.
充值/提速 (Top up / speed up) — the single funding control. It doubles as the throughput lever: the same page that adds balance is where the cumulative-top-up total that sets your rate-limit tier is accrued. The docs note explicitly that 代金券 (vouchers) do not count toward the cumulative top-up total.
API Key 管理 (API key management) — keys are issued per account after 用户信息 identity verification, which is a prerequisite before any call is billable.
Rate-limit tier table (Free / Tier 1–5) — published in the docs with exact RPM and TPM per model, so a buyer can see the throughput ceiling before spending, not after hitting a 429.
Model & billing card (模型与计费) — the price table sits in the public docs with no login wall, and the same section spells out the metering rules: token-based, input + output combined, multi-turn history billed as input, ~500–700 tokens per image.
Routing disclosure — an “升级通知 (upgrade notice)” banner pinned in the docs sidebar and a notice on the platform home state that since 2025-08-07 yi-lightning and yi-vision-v2 are served by intelligent routing, and ask users to optimise prompts accordingly.
Enterprise surfaces — no self-serve billing at all: the only control on the corporate site (CN and EN) is a “Book a Consultation” / 预约咨询 form, with scoping and invoicing through sales.
Currency — pricing is RMB-native (¥); USD figures on this page are approximate conversions at about 7.1 RMB to the US dollar and are not official quotes.

Strategic wins: Why 01.AI’s pricing decisions worked

1. Winning the benchmark-per-dollar story

Pricing a top-6 LMSYS Arena model (Yi-Lightning) at ¥0.99/1M let 01.AI own the “frontier quality at commodity price” narrative in China — a clean, headline-able claim that turned a cost-efficiency thesis into marketing. For a period it made 01.AI the reference point for cheap-but-good. See usage-based pricing strategy for why metering inference rather than the model itself scales.

2. Open weights as a distribution flywheel

Releasing Yi-34B and the Yi-1.5 family free under Apache 2.0 — with Yi-34B topping HuggingFace’s base board — bought developer mindshare and credibility that no ad spend could. The open artifact seeded adoption while the closed flagships and enterprise solutions carried revenue. This mirrors the shift away from rigid per-user licensing toward open, adoption-first distribution.

3. Knowing when to stop racing

The most underrated win is the retreat: rather than burn capital chasing frontier pre-training against giants, 01.AI partnered with Alibaba Cloud and redirected to vertical solutions where smaller models win on depth. Choosing not to compete on the most expensive axis — and repricing around solutions — is a discipline most labs learn too late. See choosing the right usage metric for the broader framing.

4. Decoupling the price promise from the cost base

The 2025-08-07 routing switch is the move that lets everything else hold together. By making yi-lightning a router across DeepSeek-V3, Qwen3-30B-A3B and its own weights, 01.AI turned its rate card into a promise it can keep regardless of what its own training roadmap does — the ¥0.99 headline survived the decision to stop building frontier models, because the price is now backed by procurement rather than by a single asset. Vendors whose price is welded to one model have to reprice every time their cost base moves; 01.AI decoupled the meter from the thing being metered and bought itself room. The cost is buyer control, which is a real trade — but as a way to defend a headline rate through a strategy reversal, it works.

5. Publishing the throughput ladder instead of hiding it

The Free/Tier 1–5 table gives exact RPM and TPM per model before a buyer spends anything, and states plainly that the token rate doesn’t change between tiers. That is a small piece of honesty with outsized effect: it moves the “will this scale for me?” question from a post-purchase 429 to a pre-purchase reading, and it stops the tier ladder from being mistaken for a price ladder. Most vendors that gate throughput leave the ceilings undocumented and let bill shock or throttling surprise do the discovery.

Areas to improve : Gaps in 01.AI’s pricing approach

1. Give the router a control surface

Disclosing that yi-lightning routes to DeepSeek-V3 or Qwen3-30B-A3B is honest, but disclosure alone leaves the buyer holding an unpriced risk: two models with different failure modes can answer the same prompt, and nothing in the API says which one did. Three cheap fixes would close it — return the serving model in the usage field of each response, offer a pinned-model parameter (even at a higher rate), and publish a rough routing mix. Buyers running evals need reproducibility more than they need the last few fen of savings, and today they have to absorb that variance as an unmodelled cost.
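To make the first fix concrete, here is a minimal sketch of what a buyer could do if each response carried the serving model. The `served_model` field is hypothetical: as noted above, the API does not report which model answered, and the sample responses are invented for illustration rather than captured from the platform.

```python
from collections import Counter
from typing import Iterable, Mapping

# Hypothetical field name used only for this sketch; the article notes
# that the API does not currently report the serving model.
SERVED_MODEL_FIELD = "served_model"


def routing_mix(responses: Iterable[Mapping]) -> dict:
    """Return each serving model's share of the given responses."""
    counts = Counter(
        resp.get("usage", {}).get(SERVED_MODEL_FIELD, "undisclosed")
        for resp in responses
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {model: n / total for model, n in counts.items()}


# Made-up responses for illustration; not real API output.
sample = [
    {"usage": {"total_tokens": 812, "served_model": "DeepSeek-V3"}},
    {"usage": {"total_tokens": 640, "served_model": "DeepSeek-V3"}},
    {"usage": {"total_tokens": 455, "served_model": "Qwen3-30B-A3B"}},
    {"usage": {"total_tokens": 390}},  # field absent, as it is today
]
print(routing_mix(sample))
```

With a field like this, an eval run could be split by serving model instead of treating routing variance as noise.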

2. Sell something above ¥0.99

Collapsing to two 16K SKUs means a customer who grows out of the card has nowhere to go but a sales call. The 2024 ladder had a long-context option (Yi-Medium-200K) and a premium option (Yi-Large); both are gone, and the gap between “¥0.99 routed envelope” and “Book a Consultation” is now the entire middle of the market. A single priced step up — longer context, a pinned model, or a guaranteed-throughput SKU — would give the self-serve funnel somewhere to expand into instead of leaking to Qwen or DeepSeek directly.

3. Give international developers a real on-ramp

The card is public, but it is RMB-only on a Chinese platform that requires identity verification, and the international platform suspended in 2024 never came back. A lightweight global tier — or even an English mirror of the docs billing card with USD conversions — would recover the developer funnel that the open-weight reputation keeps filling. Compare how other AI companies stage their pricing transparency.

4. Expose an enterprise floor or worked example

TrueNorth (万策) shipped in July 2026 as three named, productised agents — Boss AI, Investor AI, TopSales AI — which is exactly the point at which packaging usually earns a published price band. It didn’t get one; the only call to action is “Book a Consultation.” A starting band or a worked finance/energy deployment example would shorten evaluation for mid-market buyers who can’t justify a sales call but have outgrown self-hosting open weights.

Monetization stack & signals : how 01.AI builds & buys its revenue engine

The read — where the monetization investment is going

No monetization or lifecycle build-vs-buy signal is reachable: 01.AI (零一万物) is a Beijing lab with no public Greenhouse/Lever/Ashby board, its careers page returns 404, and its blog feed carries no revenue-org disclosures. The stack is unobserved, not absent — and the live posture is sales-led enterprise solution selling (CN-platform, RMB-only public card) rather than a self-serve meter.

Signals reviewed Jun 2026 · derived from public sources

Key takeaways

Cheap can be the whole pitch — until it isn’t. Yi-Lightning at ¥0.99/1M made 01.AI the price-war benchmark, but winning on token price did not keep the company in the business of selling priced tokens. A headline rate is a moment, not a moat.
You can defend a price by changing what’s behind it. Rather than reprice when it stopped building frontier models, 01.AI made the SKU a router and served ¥0.99 from DeepSeek and Qwen weights. If your headline number is load-bearing for the brand, procurement can hold it after the original cost story stops being true — as long as you’re willing to sell a price band instead of a product.
A page can stay transparent while the offer narrows. 01.AI’s card never stopped rendering publicly, yet between December 2024 and January 2025 it went from seven priced SKUs to two. Transparency and optionality are separate things; audit what is still purchasable on the card, not just whether the card is readable.
Cost structure is a pricing weapon. A claimed $3M / 2,000-GPU training run is what made ¥0.99/1M credible in 2024. When your unit economics are genuinely lower — whether from your own efficiency or from arbitraging someone else’s inference — aggressive pricing is defensible rather than suicidal.
Decide whether you’re selling capacity or rate. 01.AI charges the same per-token price at every tier and sells throughput through prepaid balance instead. Splitting the two makes each legible, but it also means growth revenue comes from customers holding money with you, not from customers paying more per unit — a very different forecast.

UBP implications

A price-war winner can still exit usage pricing. 01.AI shows that achieving the lowest token rate doesn’t lock you into per-token monetization: the macro-level economics of frontier training cost can push even a usage leader toward solution selling. UBP strategists should treat a cheap meter as a tactic, not a permanent business model. A reported 2027 Hong Kong IPO and a “China’s Palantir” enterprise-deployment repositioning (July 2026) now put a public-markets clock on that exit, the kind of pressure that tends to trade a published token card for negotiated deployment contracts.
Open weights change the value question from “what” to “where” and “who runs it.” When the model is free to download, the priced dimension becomes hosted inference, region access, and integration — not the model. UBP design has to follow the cost driver customers can’t trivially replicate.
Routing severs the unit from the deliverable, and UBP has to decide whether that’s allowed. 01.AI’s post-2025-08-07 card sells a token at a fixed price without committing to what produces it. That is the logical end of metering an input rather than an outcome: the meter stays honest while the substance floats. Practitioners adopting routed or model-agnostic SKUs should pair them with a disclosure the buyer can act on — a served-model field, a pin option, a published mix — or the price becomes the only thing the contract actually specifies.

Sources

Yi API docs — model & billing card, rate-limit tiers (accessed 2026-07-22)
01.AI Yi developer platform — prepaid balance and top-up promotion (accessed 2026-07-22)
TrueNorth (万策) enterprise decision platform (accessed 2026-07-22)
01.AI / Lingyiwanwu — company (EN) (accessed 2026-07-22)
01.AI on HuggingFace — open weights (accessed 2026-07-22)
Yi API docs — archived rate card, 22 January 2025 (two SKUs) (accessed 2026-07-22)
Yi API docs — archived rate card, 2 December 2024 (seven priced SKUs) (accessed 2026-07-22)
Yi-Lightning Technical Report — 01.AI, arXiv 2412.01253 (accessed 2026-06-11)

Bottom line

01.AI is a Chinese foundation-model lab that priced its way to the front of the 2024 LLM price war — Yi-Lightning at ¥0.99/1M (~$0.14) — and then spent two years shrinking what that price buys. The open-weight Yi models stay free on HuggingFace under Apache 2.0, and the API card is still public on the Chinese platform at the same ¥0.99, joined by yi-vision-v2 at ¥6/1M. But the 2024 ladder is gone, both remaining SKUs route to DeepSeek and Qwen weights behind a fixed rate, top-ups buy throughput rather than a better price, and the actual business — the TrueNorth (万策) agent platform and its consulting motion — carries no published price at all. A reported 2027 Hong Kong IPO and a “China’s Palantir” repositioning now put a clock on that direction: a public-markets story built on enterprise and government deployment, not on the ¥0.99 card. The number held; the product behind it did not. 01.AI is a case study in keeping a famous headline rate by turning the SKU it names into a price envelope.


Pricing timeline : Major events on a vertical axis

Each milestone below corresponds to a public pricing change, product launch, or material adjustment, listed newest first.

Rate card re-verified unchanged — yi-lightning ¥0.99/1M, yi-vision-v2 ¥6/1M

Jul 2026

An independent recheck against the live docs payload and Wayback snapshots confirms no price movement: yi-lightning holds ¥0.99/1M tokens (~$0.14) and yi-vision-v2 holds ¥6/1M (~$0.85), both 16K context, both billed on input and output tokens combined, exactly as archived continuously since at least 2025-01-22. The two-SKU card was never withdrawn — the June 2026 'gated' reading was a capture failure. Prepaid mechanics re-confirmed: cumulative top-up buys the rate-limit ladder at ¥500 (Tier 2), ¥2,000 (Tier 3), ¥10,000 (Tier 4) and ¥100,000 (Tier 5), with the token rate identical at every tier.

captured 2026-07-22
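The billing mechanics in this entry reduce to two small calculations, sketched below under stated assumptions. The rates and top-up thresholds come from the entry above; the function names are illustrative, thresholds are assumed to be inclusive, and the Tier 1 threshold is not modelled because it is not stated here.

```python
from decimal import Decimal

# CNY per 1M tokens, from the rate card described above.
RATE_CNY_PER_1M = {
    "yi-lightning": Decimal("0.99"),
    "yi-vision-v2": Decimal("6"),
}

# Cumulative top-up thresholds in CNY, highest first.
# Assumption: a threshold is met when the cumulative top-up is >= the amount.
TIER_THRESHOLDS = [
    (Decimal("100000"), "Tier 5"),
    (Decimal("10000"), "Tier 4"),
    (Decimal("2000"), "Tier 3"),
    (Decimal("500"), "Tier 2"),
]


def cost_cny(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of one workload; input and output tokens are billed at one combined rate."""
    total_tokens = input_tokens + output_tokens
    return RATE_CNY_PER_1M[model] * total_tokens / Decimal(1_000_000)


def tier_for(cumulative_topup_cny) -> str:
    """Highest rate-limit tier unlocked by a cumulative top-up amount."""
    amount = Decimal(str(cumulative_topup_cny))
    for threshold, name in TIER_THRESHOLDS:
        if amount >= threshold:
            return name
    # Free vs Tier 1 cannot be distinguished from the figures given here.
    return "below Tier 2"


print(cost_cny("yi-lightning", 3_000_000, 1_000_000))
print(tier_for(2000))
```

Note that `tier_for` never feeds into `cost_cny`: a larger top-up changes the throughput ceiling, not the per-token rate.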

TrueNorth (万策) enterprise decision platform ships, quoted only

Jul 2026

01.AI launches TrueNorth (万策), an enterprise AI decision hub packaged as three role-shaped agents — Boss AI, Investor AI and TopSales AI — alongside the Wanzhi (万智) platform, transformation consulting with forward-deployed engineers, and sovereign-AI deployments. Nothing carries a list price; the only call to action is 'Book a Consultation'. The company is now explicitly two-speed: a ¥0.99 self-serve shop window in front of a fully quoted business.

captured 2026-07-01

Live check: open weights free, billing card missed by capture, enterprise-led

Jun 2026

At this review, Yi open-weight models remained free on HuggingFace under Apache 2.0 and the primary motion was sales-led enterprise/vertical solutions. Live capture of the legacy /pricing URL returned 404 and the docs billing card did not render for the capture run, so the API was recorded as gated. That reading was a capture artifact, not a vendor change: archive snapshots show the same public two-SKU card rendering without a login on this date. (Corrected 2026-07-22.)

captured 2026-06-11

Yi-Lightning and Yi-Vision-V2 become intelligent-routing services

Aug 2025

A platform notice states that from 7 August 2025 the yi-lightning and yi-vision-v2 endpoints route each request to whichever model offers better value — DeepSeek-V3, Qwen3-30B-A3B or Yi-Lightning for text, Qwen2.5-VL-72B-Instruct or Yi-Vision-V2 for vision — at one unchanged per-token rate. The branded SKU stops being a set of weights and becomes a price/latency envelope, with the routing margin kept by 01.AI. (Source: platform.lingyiwanwu.com notice, 2025-08-07.)

Frontier pre-training scaled back; pivot to enterprise vertical solutions

Mar 2025

01.AI forms a joint 'industrial large model laboratory' with Alibaba Cloud — Alibaba trains the giant models while 01.AI builds smaller, cost-efficient vertical models (future training capped near Yi-Lightning scale). The business reorients to sales-led enterprise solutions in finance, energy, and gaming; 2024 revenue exceeded RMB 100M (~$14M), ~70% enterprise. Per-token API pricing recedes behind solution selling. (Source: kr-asia, SCMP, 2025-01/03.)

Rate card collapses from seven priced SKUs to two (¥0.99 and ¥6)

Jan 2025

Between 2 December 2024 and 22 January 2025 the public docs billing card drops from seven priced SKUs — Yi-Large ¥20/1M, Yi-Large-Turbo ¥25, Yi-Medium-200K ¥12, Yi-Medium ¥2.5, Yi-Vision ¥6, Yi-Spark ¥1 and Yi-Lightning ¥0.99 — to just two: yi-lightning at ¥0.99/1M and yi-vision-v2 at ¥6/1M. No surviving rate moved; the ladder around them was deleted. The card kept rendering publicly without a login throughout. (Source: platform.lingyiwanwu.com/docs, Wayback snapshots 2024-12-02 and 2025-01-22.)
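One way to quantify the collapse is to compare the two snapshots directly. The sketch below uses only the rates listed in this entry; the helper names are illustrative, and the comparison is by price point rather than SKU name because the surviving SKUs were renamed (Yi-Vision to yi-vision-v2).

```python
# CNY per 1M tokens, as listed in the two archived snapshots described above.
CARD_2024_12_02 = {
    "Yi-Large": 20.0,
    "Yi-Large-Turbo": 25.0,
    "Yi-Medium-200K": 12.0,
    "Yi-Medium": 2.5,
    "Yi-Vision": 6.0,
    "Yi-Spark": 1.0,
    "Yi-Lightning": 0.99,
}
CARD_2025_01_22 = {"yi-lightning": 0.99, "yi-vision-v2": 6.0}


def dropped_price_points(before: dict, after: dict) -> list:
    """Price points on the earlier card that no longer appear on the later one."""
    return sorted(set(before.values()) - set(after.values()))


def price_span(card: dict) -> float:
    """Ratio between the most and least expensive SKU on a card."""
    return max(card.values()) / min(card.values())


print(dropped_price_points(CARD_2024_12_02, CARD_2025_01_22))
print(round(price_span(CARD_2024_12_02), 2), round(price_span(CARD_2025_01_22), 2))
```

The two surviving price points are unchanged, while the spread between the cheapest and most expensive option shrinks from roughly 25x to roughly 6x.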

Pre-training sale rumors denied; Alibaba tie-up reframed as partnership

Jan 2025

Press reports that 01.AI is selling its pre-training and infrastructure teams to Alibaba Cloud; Kai-Fu Lee publicly denies a sale and reframes the Alibaba relationship as a partnership. Signals the wind-down of independent frontier pre-training without an outright acqui-hire. (Source: TechNode, Yicai Global, SCMP, 2025-01.)

Yi-Lightning cut to ¥0.99/1M (~$0.14) in the China price war

Oct 2024

01.AI ships its flagship MoE model Yi-Lightning at just ¥0.99 per million tokens (~$0.14), about 40% faster than its prior flagship and roughly half the price of GPT-4o-mini's $0.26. It reached the top 6 globally and #1 in China on the LMSYS Chatbot Arena, briefly beating GPT-4o on that board. This was 01.AI's signature move in the 2024 China LLM price war. (Source: Wikipedia, PingWest, arXiv 2412.01253, 2024-10.)

International API platform suspended, routed to China platform

Aug 2024

The international platform.01.ai suspends API services (effective ~Aug 25, 2024), inviting users to re-register on the Chinese platform.lingyiwanwu.com. This narrows the publicly served, English-card API surface and is the first step toward today's RMB-only, China-first platform. (Source: platform notice / OpenRouter, 2024-08.)

Yi-Large closed API launches at ¥20/1M tokens (~$2.7)

May 2024

01.AI opens its proprietary Yi API with Yi-Large priced at ¥20 per million tokens (~$2.7), under one-third of GPT-4 Turbo's price. Kai-Fu Lee frames it as proof that 'cash burning is no longer a winning strategy.' The card adds a tiered ladder: Yi-Large-Turbo, Yi-Medium, Yi-Medium-200K (long-doc), Yi-Vision, and Yi-Spark. (Source: NBD / kr-asia, 2024-05.)

Wanzhi free consumer assistant ships

May 2024

01.AI launches Wanzhi (万知), a free Copilot-style productivity assistant in Chinese and English, plus the international PopAI app — seeding a consumer surface alongside the developer API. Consumer apps stay free / freemium rather than per-token. (Source: 01.AI, 2024-05.)

Yi-34B open-weight model released (free)

Nov 2023

01.AI publishes Yi-34B, a 34B-parameter open-weight LLM, ranked first on HuggingFace's pretrained base-model board at release. Free to download and self-host under Apache 2.0 — establishing open weights, not a paid API, as the company's first go-to-market. (Source: 01.AI / HuggingFace, 2023-11.)

Trivia

· 01.AI's Yi-Lightning was priced at just ¥0.99 per million tokens (~$0.14) in October 2024, a signature shot in the 2024 China LLM price war that undercut GPT-4o-mini's $0.26 and came in at roughly 1/30th of GPT-4's $4.40.
· Kai-Fu Lee said 01.AI trained a top-tier model on about 2,000 GPUs for roughly $3 million, versus an estimated $80–100 million for comparable OpenAI runs — the cost-efficiency thesis behind its cheap token rates.
· The Yi open-weight models are free to download under Apache 2.0 on HuggingFace, where Yi-34B ranked first on the pretrained base-model board at its November 2023 release — adoption seeded by giving the artifact away.

Questions & answers

What happened to 01.AI's frontier model strategy?: In early 2025 Kai-Fu Lee said only tech giants can afford frontier pre-training, so 01.AI scaled it back, formed a joint industrial-large-model lab with Alibaba Cloud (Alibaba trains the giant models), and refocused on smaller, cost-efficient vertical models for finance, energy, and gaming — with future training not exceeding Yi-Lightning scale.