
Unstructured pricing

unstructured.io · facts checked · analysis reviewed
AI Summary
  • Unstructured turns unstructured files (PDFs, docs, images, audio/video) into clean, structured, LLM-ready data for RAG and agents.
  • The Serverless API meters per page processed at a flat $0.03/page — one rate for any file type and any processing pipeline.
  • New accounts get 15,000 free pages with no expiration and full feature access, no card required.
  • The Business tier is custom-quoted and adds dedicated-instance / in-VPC / bare-metal deployment, RBAC, and multi-user accounts.
  • Pricing has simplified twice: compute-hour billing (~$12.93/1,000 pages) → strategy-tiered per-page (Fast $1, Hi-Res $10 per 1,000) in 2024 → today's single flat per-page rate.
  • Backed by a $40M Series B (Menlo, Databricks, IBM, NVIDIA); the open-source library has 6M+ downloads, and 45,000+ organizations use it.
Pricing summary
Unstructured 2026 — Pricing overview
A per-page Serverless API for turning messy files into LLM-ready data: flat $0.03/page, 15,000 free pages, custom VPC tier.
Free (Let's Go) | Free | Developers evaluating the platform
Business | Custom | Teams needing privacy, control & isolation
Source: unstructured.io/pricing (captured 2026-06-16). Pay-As-You-Go is a single flat per-page rate; Business is sales-quoted.

About

Unstructured is the document-ingestion / ETL layer for AI. It takes the messiest enterprise inputs — PDFs, Office docs, scanned images, HTML, even audio and video — and turns them into clean, chunked, enriched, embedding-ready data that an LLM can actually use. In RAG and agent stacks, this is the unglamorous-but-decisive step: garbage parsing in, hallucinations out. Unstructured handles partitioning, chunking, enrichment (OCR, NER, table-to-HTML, image/table descriptions), embedding, and loading into 20+ vector and data destinations, with 40+ source connectors on the front end.

The company sits on a widely-used open-source core (6M+ downloads, 12,000+ codebases) and monetizes a managed Serverless API and a self-hostable Platform on top of it. More than 45,000 organizations — including over a third of the Fortune 500 — use it to preprocess proprietary data. Unstructured raised a $40M Series B in March 2024 led by Menlo Ventures, with Databricks Ventures, IBM Ventures, and NVIDIA’s NVentures participating (≈$65M total raised; reported valuation ~$230M, 2024 ARR ~$7.7M). The investor list is a tell: the three biggest data/AI-infra platforms all wrote checks because document ingestion is the on-ramp to everything they sell.

For the most current information on Unstructured’s pricing and market position, visit Unstructured.


Pricing summary : How Unstructured’s pricing model works

Unstructured prices the way a metered utility does: you pay $0.03 per page processed, full stop. There is no per-seat fee, no monthly platform charge on the Pay-As-You-Go plan, and — critically — the same rate applies to any file type and any processing pipeline. Whether a page is run through the cheap text-native Fast strategy or the expensive VLM strategy, you pay the same three cents. That flatness is the headline simplification: cost scales linearly with document volume and nothing else.

Before you pay anything, every account gets 15,000 free pages that never expire, with full access to every connector and transform strategy. That is a real free tier, not a 14-day trial, and it’s enough to build and validate a full pipeline before a card is ever required.
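
The arithmetic in the paragraphs above can be sketched in a few lines. This is a minimal illustration, not Unstructured's billing logic: the function and constant names are hypothetical, the rate and allowance are the figures quoted on this page, and the sketch assumes the free allowance is fully consumed before any page is billed. Amounts are kept in integer cents to avoid floating-point rounding.

```python
RATE_CENTS_PER_PAGE = 3   # $0.03 per page, as quoted on this page
FREE_PAGES = 15_000       # one-time allowance with no expiration


def estimate_bill_cents(pages: int, free_pages_left: int = FREE_PAGES) -> int:
    """Charge in cents for `pages`, after applying any remaining free pages.

    Hypothetical helper: assumes free pages are used up first and that
    every page after that is billed at the flat rate.
    """
    if pages < 0 or free_pages_left < 0:
        raise ValueError("page counts must be non-negative")
    billable = max(0, pages - free_pages_left)
    return billable * RATE_CENTS_PER_PAGE


def format_usd(cents: int) -> str:
    """Render integer cents as a dollar string."""
    return f"${cents / 100:,.2f}"


if __name__ == "__main__":
    # A new account processing 115,000 pages: 15,000 free, 100,000 billed.
    print(format_usd(estimate_bill_cents(115_000)))
```

Under these assumptions, a new account that stays within 15,000 pages pays nothing, and the bill afterwards depends only on the billable page count.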

At the top of the range, Business is custom-quoted and is about deployment rather than volume discounts: dedicated instance, in-VPC (AWS/Azure/GCP), bare-metal, multi-user accounts, RBAC, and compliance (HIPAA, SOC 2 Type 2, GDPR, ISO 27001). The meter stays per-page; what you’re buying is data isolation and control.

What makes this different: most document-AI APIs charge more for higher-fidelity processing (premium OCR, layout models, or VLM parsing cost extra). Unstructured deliberately flattened that — one page, one price, every strategy included — making the bill a pure function of how many pages you push, not which model touched them.


Pricing by product

Tier | Price | Included | Key mechanics
Free (“Let’s Go”) | Free | 15,000 free pages (no expiration), all features, all connectors & strategies | Self-serve, no card; pure free allowance, not a time trial
Pay-As-You-Go | $0.03 / page | Flat rate for any file type & any pipeline (Fast/Hi-Res/VLM/Auto), all features | Pure usage; no minimums, maximums, or commitment; metered by page processed
Business | Custom (quoted) | Dedicated/VPC/bare-metal deployment, multi-user + RBAC, dedicated support, full compliance | Sales-led; data isolation and control, not volume tiers

Sales motions across products: self-serve PLG on Free and Pay-As-You-Go (sign up, get 15,000 pages, start sending files), and a sales-led motion for Business where deployment, isolation, and compliance are negotiated. The per-page meter is identical across the self-serve tiers.


Hidden costs : What Unstructured users actually pay

The Pay-As-You-Go line item is refreshingly clean ($0.03 × pages), but the real bill is driven by what counts as a “page” and how your pipeline inflates page counts. A 200-page PDF is 200 pages, but image-heavy decks, spreadsheets exploded into pages, and repeated processing of the same documents (during pipeline iteration, schema changes, or re-embedding) all multiply the count. The biggest hidden cost is usually reprocessing: every time you tune a chunking or enrichment strategy and re-run a corpus, you pay the full per-page rate again. Change-detection and incremental processing exist to limit that, but you have to wire them in.

The other “hidden” cost is downstream, not on this invoice: enrichment with 3rd-party models (OpenAI/Bedrock/Voyage embeddings, VLM enrichment) and the vector store you load into are billed by those providers, not by Unstructured. Unstructured’s flat rate covers the orchestration and parsing; the model and storage costs ride alongside it.

Line item | Cost basis
100,000 pages/mo (after free allowance) | $0.03 × 100,000 = $3,000
First 15,000 pages (one-time) | Free
Reprocessing on strategy changes | Full $0.03/page again per re-run
Embeddings / VLM enrichment | Billed by the 3rd-party model provider
Vector destination storage | Billed by your vector DB
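
To see how re-runs multiply the bill, the sketch below models one initial pass plus a number of further passes over the same corpus. It is a rough model under stated assumptions, not a description of how Unstructured meters anything: the function names and the `rerun_fraction` parameter are hypothetical, the free allowance is ignored, and third-party model and storage costs are out of scope.

```python
RATE_CENTS_PER_PAGE = 3  # $0.03 per page, as quoted on this page


def pages_billed(corpus_pages: int, reruns: int, rerun_fraction: float = 1.0) -> int:
    """Pages metered for one initial pass plus `reruns` further passes.

    rerun_fraction is the share of the corpus each re-run touches:
    1.0 models a full re-run, smaller values model incremental processing.
    """
    if corpus_pages < 0 or reruns < 0:
        raise ValueError("corpus_pages and reruns must be non-negative")
    if not 0.0 <= rerun_fraction <= 1.0:
        raise ValueError("rerun_fraction must be between 0 and 1")
    pages_per_rerun = round(corpus_pages * rerun_fraction)
    return corpus_pages + reruns * pages_per_rerun


def cost_cents(corpus_pages: int, reruns: int, rerun_fraction: float = 1.0) -> int:
    """Total charge in cents at the flat per-page rate."""
    return pages_billed(corpus_pages, reruns, rerun_fraction) * RATE_CENTS_PER_PAGE


if __name__ == "__main__":
    full = cost_cents(100_000, reruns=3)
    incremental = cost_cents(100_000, reruns=3, rerun_fraction=0.1)
    print(f"full re-runs: ${full / 100:,.2f}")
    print(f"10% incremental re-runs: ${incremental / 100:,.2f}")
```

In this model, three full re-runs of a 100,000-page corpus meter 400,000 pages ($12,000), while re-runs that touch only 10% of the corpus meter 130,000 pages ($3,900). The headline rate is identical in both cases; only the page count moves.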

Want to estimate your own Unstructured bill? Use the Unstructured pricing calculator to model your costs based on page volume and reprocessing.


Pricing evolution : Unstructured pricing history and changes

Cadence

Period | Price changes | Product / SKU additions | Notes
Pre-2024 | Compute-hour billing | Serverless API not yet launched | ~$12.93 to process 1,000 PDF pages; cost tied to compute time
2024 Q2 | Strategy-tiered per-page introduced | Serverless API launched (Jun 20, 2024) | Fast $1 / Hi-Res $10 per 1,000 pages; infra spin-up charge removed
2026 | Collapsed to flat $0.03/page | 15,000 free no-expiration pages | One rate for any file type and any pipeline

Tracked range: 2024–present. Historical per-1,000-page rates sourced from Unstructured’s Serverless API launch blog (Jun 2024).

Notable changes

  • Pre-2024: Billed by compute hour. Parsing 1,000 PDF pages cost ~$12.93, with cost exposed to processing time and infrastructure spin-up.
  • 2024-06-20: Serverless API launch. Switched to per-page pricing, tiered by strategy: Fast $1 per 1,000 pages ($0.001/page), Hi-Res $10 per 1,000 pages ($0.01/page). Infrastructure-creation charges eliminated. The pitch was predictability and transparency over hourly billing.
  • 2026 (current): Flat $0.03/page for any file type and any pipeline, replacing the tiered card, plus 15,000 free pages with no expiration. The strategy distinction (Fast/Hi-Res/VLM/Auto) survives as a quality/speed choice but is no longer a price lever.
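
Putting the three eras on a common per-1,000-page basis makes the changes easier to compare. The sketch below uses only the figures quoted in this profile; the dictionary keys and function names are hypothetical, the compute-hour figure is approximate, and the eras may not be like-for-like in what a processed page includes.

```python
from decimal import Decimal

# USD per page in each pricing era, from the figures quoted in this profile.
USD_PER_PAGE = {
    "compute_hour_pre_2024": Decimal("12.93") / 1000,  # approx., per 1,000 PDF pages
    "fast_2024": Decimal("1") / 1000,                   # $1 per 1,000 pages
    "hi_res_2024": Decimal("10") / 1000,                # $10 per 1,000 pages
    "flat_current": Decimal("0.03"),                    # $0.03 per page
}


def cost_usd(pages: int, era: str) -> Decimal:
    """Cost of `pages` pages at the named era's per-page rate."""
    return USD_PER_PAGE[era] * pages


def blended_2024_usd(fast_pages: int, hi_res_pages: int) -> Decimal:
    """Cost of a Fast/Hi-Res mix under the 2024 strategy-tiered card."""
    return cost_usd(fast_pages, "fast_2024") + cost_usd(hi_res_pages, "hi_res_2024")


if __name__ == "__main__":
    for era in USD_PER_PAGE:
        print(f"{era:>24}: ${cost_usd(1000, era):.2f} per 1,000 pages")
```

On these numbers the current flat rate works out to $30 per 1,000 pages, which is above both 2024 tiers. Read that way, the move to a flat rate bought a simpler meter rather than a lower unit price, though the comparison is only as good as the assumption that a page means the same thing in each era.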

What’s unique : Unstructured’s distinctive pricing mechanics

1. Strategy-agnostic flat per-page rate. The defining choice is charging the same $0.03/page whether a page goes through cheap text extraction or an expensive VLM. Competitors typically up-charge for premium parsing; Unstructured folded all strategies into one price, so buyers never have to model a per-strategy rate matrix or fear that turning on Hi-Res blows up the bill.

2. A free allowance, not a free trial. 15,000 pages with no expiration and full feature access is a developer-acquisition lever sitting on top of a 6M-download open-source funnel. It lets teams prove the pipeline end-to-end before paying — converting OSS users into metered API users without a clock.

3. Deployment, not volume, is the enterprise lever. The Business tier doesn’t sell cheaper pages; it sells isolation (dedicated instance, in-VPC, bare-metal), RBAC, and compliance. Pricing stays usage-shaped at every tier — the enterprise upsell is where your data is processed, not how much each page costs.


Strengths & weaknesses

Strengths | Weaknesses
Dead-simple, predictable meter: $0.03/page, one rate, no strategy matrix | Reprocessing costs full rate again; iteration-heavy pipelines can rack up pages
Genuinely generous free tier (15,000 no-expiration pages, all features) | Page-count semantics (what inflates a “page”) aren’t obvious until you see the bill
Flat rate means high-fidelity VLM parsing costs the same as Fast | Embeddings, VLM enrichment, and vector storage are billed separately by 3rd parties
Strong OSS funnel and blue-chip backers (Menlo, Databricks, IBM, NVIDIA) | Enterprise pricing is fully gated; no public volume discounting

Billing UX : Unstructured billing controls and transparency

  • Billing controls: Pay-As-You-Go is no-commitment, with no minimums, no maximums, and no contract to start. The 15,000-page free allowance keeps spend at $0 until you exceed it.
  • Usage visibility: Usage is metered per page in the platform dashboard, and the flat rate makes forecasting trivial (estimated cost = billable pages × $0.03). The main thing to watch is reprocessing volume during pipeline tuning.
  • Payment options: Self-serve Pay-As-You-Go is card-based; Business is invoiced under a custom contract and includes dedicated support through a personal support representative. Support channels span a Slack community, email (support@unstructured.io), and named reps on paid plans.

Strategic wins : Why Unstructured’s pricing decisions worked

1. Killing the strategy rate-matrix removed buying friction

By charging one flat rate across Fast/Hi-Res/VLM, Unstructured eliminated the most common objection in document-AI: “if I turn on the good parser, what happens to my bill?” Buyers can default to the highest-quality strategy without re-modeling cost. See choosing the right usage metric for why a single legible meter beats a fidelity-tiered one.

2. The free allowance converts an OSS army into API revenue

15,000 no-expiration pages let the 45,000+ orgs already using the open-source library validate the managed API at zero cost, then flip to per-page billing when they go to production. It’s a textbook PLG bridge from open source to metered SaaS. Related: how AI companies structure pricing.

3. Selling isolation, not discounts, at the top

Enterprises rarely churn over per-page price; they churn over data residency and compliance. Putting VPC/dedicated/bare-metal and SOC 2 / HIPAA behind the Business tier keeps the meter intact while capturing willingness-to-pay on control. See outcome-based pricing trends for adjacent enterprise-lever patterns.


Areas to improve : Gaps in Unstructured’s pricing approach

1. Reprocessing is the silent bill-inflator

Because every re-run is billed at the full per-page rate, teams iterating on chunking/enrichment strategies can pay several times over for the same corpus. Clearer guidance (and defaults) on change-detection and incremental processing would reduce avoidable bill shock.
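
One common way to wire in change detection on the client side is to fingerprint each document together with the pipeline settings and skip anything already processed under the same fingerprint. The sketch below is a generic, hypothetical illustration using only Python's standard library. It does not use or describe any Unstructured API, and every name in it (`fingerprint`, `docs_needing_processing`, `pipeline_version`, the manifest dictionary) is an assumption made for the example.

```python
import hashlib


def fingerprint(content: bytes, pipeline_version: str) -> str:
    """SHA-256 over the pipeline version and the document bytes."""
    digest = hashlib.sha256()
    digest.update(pipeline_version.encode("utf-8"))
    digest.update(b"\0")  # separator so version and content cannot run together
    digest.update(content)
    return digest.hexdigest()


def docs_needing_processing(
    docs: dict[str, bytes],
    manifest: dict[str, str],
    pipeline_version: str,
) -> dict[str, str]:
    """Return {name: fingerprint} for docs that are new or changed.

    `manifest` maps document names to the fingerprint recorded after their
    last successful run. It is not modified here; the caller should update
    it only once processing has actually succeeded.
    """
    pending = {}
    for name, content in docs.items():
        current = fingerprint(content, pipeline_version)
        if manifest.get(name) != current:
            pending[name] = current
    return pending


if __name__ == "__main__":
    manifest: dict[str, str] = {}
    docs = {"a.pdf": b"one", "b.pdf": b"two"}
    pending = docs_needing_processing(docs, manifest, "chunking-v1")
    manifest.update(pending)          # record after a successful run
    docs["b.pdf"] = b"two, edited"
    print(sorted(docs_needing_processing(docs, manifest, "chunking-v1")))
```

Because the pipeline version is part of the fingerprint, changing a chunking or enrichment setting still invalidates every document, so this approach limits re-billing for unchanged inputs but does not make strategy changes free.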

2. “What counts as a page” needs to be obvious up front

A flat per-page meter is only predictable if buyers can predict page counts. Spreadsheets, image-heavy files, and audio/video inputs don’t map cleanly to “pages,” and the pricing page doesn’t spell out the conversion. Surfacing page-count rules pre-purchase would make the famously simple meter genuinely predictable.

3. No public enterprise signal

Business pricing is fully gated, so larger buyers can’t self-qualify or benchmark before a sales call. A published starting point or volume band would shorten the enterprise evaluation cycle without giving away the negotiation.


Monetization stack & signals : how Unstructured builds & buys its revenue engine


Stack — build vs buy
Builds in-house · 1
  • In-house usage metering & billing (in-house build, inferred from a job post, Apr 2026)
Buys (vendor) · 2
Open roles in the revenue & lifecycle org — 2
Where the investment is going

Unstructured looks to be building its monetization plumbing in-house rather than buying it: an open Staff Software Engineer role on the Commercial team is staffing a platform that bundles "identity and access controls, dashboards, usage metering, and billing" — fitting for a company whose product is itself a per-page usage meter, where the metering layer is close to core IP. No third-party billing/metering vendor (Stripe Billing, Orb, Metronome) is named in any public posting. On the go-to-market side the buy signals are clearer: a Marketing Operations Manager req names HubSpot (Marketing + Operations Hub) as the owned marketing instance and references an internal Snowflake lakehouse + BI stack run by a data analyst. Net: in-house metering/billing for the product meter, HubSpot for marketing CRM, Snowflake for the data/analytics layer.

Signals reviewed · derived from public job posts

Key takeaways

  1. One rate, every strategy. Unstructured charges a flat $0.03/page regardless of file type or pipeline — the rare document-AI vendor that doesn’t up-charge for high-fidelity parsing.
  2. The free tier is the funnel. 15,000 no-expiration pages on top of a 6M-download OSS base is a deliberate PLG bridge from open source to metered API revenue.
  3. Predictability was the whole strategy. Each pricing change (compute-hour → tiered per-page → flat per-page) traded cost exposure for legibility; today the bill is just pages × $0.03.
  4. Watch reprocessing, not rate. The headline rate is trivial to model; iteration and page-count inflation are where real spend hides.
  5. Enterprise = isolation, not discounts. The Business tier monetizes VPC/dedicated deployment and compliance while leaving the usage meter untouched.

UBP implications

  1. A single legible meter can beat a “fairer” tiered one. Unstructured proves buyers will trade theoretical cost-optimality (pay less for cheap pages) for a meter they can forecast in their head.
  2. Free allowances outperform free trials for infra products. No-expiration free pages let usage-based products earn trust on the buyer’s timeline, not a 14-day clock — a strong pattern for usage-based pricing adoption.
  3. Decouple the enterprise lever from the meter. Selling deployment/compliance separately from per-unit price lets you capture enterprise willingness-to-pay without distorting the simple usage story that wins developers.


Bottom line

Unstructured turned the unglamorous, decisive step of LLM data prep into a metered utility: a flat $0.03 per page for any file type and any pipeline, 15,000 free no-expiration pages to start, and a custom Business tier that sells isolation rather than discounts. Its pricing arc — compute-hour → strategy-tiered per-page → one flat rate — is a clinic in trading cost-optimality for predictability. Browse the pricing blueprint for fully-researched company profiles.

Want to compare Unstructured against other AI infrastructure companies? Browse the pricing blueprint.

Pricing timeline : Major events on a vertical axis

Each milestone below corresponds to a public pricing change, product launch, or material adjustment. Major events use a filled marker; minor adjustments use a faded one.

Flat $0.03/page + 15,000 free no-expiration pages

Current public pricing is a single flat $0.03/page for any file type and any pipeline (Fast/Hi-Res/VLM/Auto), replacing the strategy-tiered rate card. New accounts get 15,000 free pages with no expiration. Business tier custom-quoted for VPC/dedicated.


Serverless API launch — strategy-tiered per-page pricing

Moved off compute-hour billing (~$12.93 per 1,000 PDF pages) to per-page pricing tiered by strategy: Fast at $1 per 1,000 pages ($0.001/page) and Hi-Res at $10 per 1,000 pages ($0.01/page). Infrastructure spin-up charges removed.

Trivia
  • Unstructured's open-source parsing library has been downloaded more than 6 million times and is used across 12,000+ codebases and 45,000+ organizations, including more than a third of the Fortune 500.
  • Its 2024 Serverless API launch cut the effective price of parsing 1,000 PDF pages from ~$12.93 (compute-hour billing) to as little as $1 (Fast strategy), a ~13x drop that traded compute-time exposure for a predictable per-page meter.
  • The free tier is unusually generous: 15,000 pages that never expire, with full access to every connector and transform strategy, rather than a time-boxed trial.

Questions & answers

What is Unstructured's pricing model?
Unstructured's Serverless API is pure usage-based: you pay a flat $0.03 per page processed, with the same rate for any file type and any processing pipeline (Fast, Hi-Res, VLM, Auto). There are no minimums, maximums, or commitments. A Business tier with dedicated/VPC deployment is custom-quoted.
Does Unstructured offer a free tier?
Yes. Every account starts with 15,000 free pages that never expire, with full access to every connector, transform strategy, and feature. No minimums and no credit card to begin.
How much does Unstructured cost per page?
The public Pay-As-You-Go rate is $0.03 per page processed, charged identically regardless of file type or whether you use the Fast, Hi-Res, or VLM partitioning strategy. After your 15,000 free pages, you only pay for what you process.
Does the per-page rate change for Hi-Res or VLM processing?
No longer. The current public page advertises a single flat $0.03/page 'for any file type and any pipeline.' At the 2024 Serverless API launch the rate was strategy-tiered (Fast $1 per 1,000 pages, Hi-Res $10 per 1,000), but Unstructured has since collapsed that into one flat rate.