Observability Platform Pricing: Examples & Companies

What is it

Observability platform pricing is how LLM and ML observability platforms charge for tracing, evaluation, and monitoring of model behavior in production. The product ingests the telemetry an AI application throws off (traces, spans, events, evaluation scores, logged payloads), and the pricing usually follows the pipe: a free ingestion quota, metered volume above it, and a retention window that decides how long the data stays queryable.

The LLM-native core of the category converged on a remarkably uniform shape — meter ingestion, give away seats, sell retention — and the nine telemetry vendors (Langfuse, Arize AI, Galileo, Comet’s Opik, Braintrust, HoneyHive, Athina AI, Patronus AI, PromptLayer) each vary the recipe without leaving it. What differs from vendor to vendor is which unit gets counted — a span, a whole trace, a composite unit, an ingested byte — and that choice quietly decides the bill.

Two adjacent flavors stretch the category at its edges. Weights & Biases folds LLM observability (its Weave product) into a broader MLOps platform that also bills seats, model storage, and inference tokens. And the FinOps pair — Finout and Vantage — are cost observability: they watch cloud and AI spend and price by the size of the bill being monitored rather than by telemetry ingested. Together the twelve show that “observability pricing” spans a spectrum from pure per-span metering to flat subscriptions keyed to the problem being observed.

The whole category sits under two structural pressures. Open source is everywhere — Arize’s Phoenix, Comet’s Opik, Langfuse’s self-host path, and W&B’s free wandb SDK — so managed pricing is disciplined by the self-hosting alternative. And the category is consolidating fast: ClickHouse acquired Langfuse in January 2026, Cisco closed on Galileo in May 2026, and CoreWeave acquired Weights & Biases for a reported ~$1.7B — which makes pricing (and data-export) stability itself a buying criterion.

Figure: One trace, four ways to count it, from coarse to fine.

How it works

For the LLM/ML-telemetry vendors the standard bill is platform fee + (ingestion − quota) × rate, with retention and seats as packaging levers. The FinOps outliers replace the ingestion meter with a flat tier keyed to monitored spend:

Lever	Range in the cohort	Examples
Ingestion unit	span · trace · event · composite unit · request · GB	Arize spans + GB; Galileo traces; Langfuse units; HoneyHive events; PromptLayer requests
Free quota	5k–50k units/mo	Langfuse 50k units; Arize & Opik 25k spans; HoneyHive 10k events; Galileo 5k traces
Overage rate	~$2–$100 per relevant unit block	Langfuse $8→$6/100k; Arize spans + ~GB; PromptLayer $0.003→$0.002/txn; W&B Weave $0.10/MB
Retention	14 days → 3 years	Braintrust 14→30d; Arize 15→30d; Patronus 2-week window; Langfuse 90d→3y at $199
Seats	Free/unlimited (mostly)	Braintrust unlimited all tiers; Opik $19 covers 50 members; W&B & PromptLayer cap by tier
FinOps variant	Flat fee by monitored spend	Vantage free ≤$2.5k, Pro ≤$7.5k, Business ≤$20k; Finout flat annual ~1% of cloud bill
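The standard bill described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's billing logic: the `MeteredPlan` type, the pro-rated (rather than rounded-up) overage blocks, and the example numbers are all assumptions chosen to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeteredPlan:
    """Hypothetical plan shape: flat fee plus metered overage above a quota."""
    platform_fee: float    # flat monthly fee in USD
    free_quota: int        # units included per month
    rate_per_block: float  # USD charged per overage block
    block_size: int        # units per overage block (e.g. 100_000)


def monthly_bill(plan: MeteredPlan, units_ingested: int) -> float:
    """platform fee + max(0, ingestion - quota) x rate.

    Overage is pro-rated per unit here (an assumption); a vendor could
    instead round up to whole blocks.
    """
    overage_units = max(0, units_ingested - plan.free_quota)
    return plan.platform_fee + overage_units / plan.block_size * plan.rate_per_block


# Hypothetical plan: $100 fee, 50k units included, $8 per extra 100k units.
example_plan = MeteredPlan(platform_fee=100.0, free_quota=50_000,
                           rate_per_block=8.0, block_size=100_000)

if __name__ == "__main__":
    print(monthly_bill(example_plan, 1_050_000))  # 1M units of overage
    print(monthly_bill(example_plan, 10_000))     # under quota: fee only
```

The `max(0, ...)` clamp matters: usage below the quota should leave the bill at the platform fee rather than produce a credit.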

Worked example — dual-axis fan-out. A RAG app sending 1M spans and 25 GB a month to Arize AI’s Pro plan pays the base fee plus overage on two axes — trace-spans above the included allowance and ingested GB above 10 — with the data axis, not the raw span count, typically doing most of the damage on payload-heavy agent traces. The same behavior on Braintrust hits its processed-data meter, which counts every byte moving through the platform under a $249 Pro fee, while on Weights & Biases the same trace stream runs through Weave at $0.10/MB — roughly $100/GB, about 3,000x the $0.03/GB it charges for cold model storage. Instrument first, then price: the usage-event tracking guide covers how to measure your own fan-out before committing to a tier.
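The arithmetic behind this worked example can be checked with a short sketch. The per-MB and per-GB rates ($0.10/MB and $0.03/GB) and the 10 GB included allowance come from the text; everything else is an assumption made for illustration: decimal gigabytes (1 GB = 1,000 MB), no included Weave allowance, and the span allowance and overage rates passed to `dual_axis_overage`, which are hypothetical placeholders rather than Arize's published prices.

```python
MB_PER_GB = 1000  # assumption: decimal GB, consistent with "$0.10/MB, roughly $100/GB"


def per_mb_cost(gigabytes: float, usd_per_mb: float = 0.10) -> float:
    """Cost of a data stream billed per MB, ignoring any included allowance."""
    return gigabytes * MB_PER_GB * usd_per_mb


def hot_to_cold_ratio(usd_per_mb_hot: float = 0.10, usd_per_gb_cold: float = 0.03) -> float:
    """How many times more the per-MB line costs than cold storage, per GB."""
    return (usd_per_mb_hot * MB_PER_GB) / usd_per_gb_cold


def dual_axis_overage(spans: int, gb: float, included_spans: int, included_gb: float,
                      usd_per_1k_spans: float, usd_per_gb: float) -> tuple[float, float]:
    """Return (span overage, data overage) in USD for a two-meter plan.

    All rates and the span allowance are caller-supplied; none are real prices.
    """
    span_overage = max(0, spans - included_spans) / 1000 * usd_per_1k_spans
    data_overage = max(0.0, gb - included_gb) * usd_per_gb
    return span_overage, data_overage


if __name__ == "__main__":
    print(per_mb_cost(25))        # the 25 GB stream at $0.10/MB
    print(hot_to_cold_ratio())    # per-GB ratio of hot trace data to cold storage
    # Hypothetical rates: 500k spans included, $0.01 per 1k spans, $3 per GB.
    print(dual_axis_overage(1_000_000, 25, 500_000, 10, 0.01, 3.0))
```

Under these assumptions the 25 GB stream costs about $2,500 a month on the per-MB line, and the exact hot-to-cold ratio is about 3,333x, which the text rounds to roughly 3,000x. With the placeholder rates, the data axis also outweighs the span axis, which is the shape of surprise the paragraph warns about.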

Companies using this

Twelve in-corpus companies carry the observability product segment: the LLM/ML-telemetry group of Arize AI, Athina AI, Braintrust, Comet (Opik), Galileo, HoneyHive, Langfuse, Patronus AI, and PromptLayer, plus Weights & Biases on the MLOps side and the FinOps cost-observability pair, Finout and Vantage. Eleven of the twelve ship a free tier (Finout is the exception, quote-only across every tier), and all twelve gate their top tier behind sales.

Patterns observed

The telemetry core prices ingestion, gives away seats, and charges for memory. A per-user tax would fight the meter, so seats are free or unlimited almost everywhere. The money moves to two places instead: metered overage above the free ingestion quota, and retention as a quiet upsell — Patronus AI gates its entire free tier on a rolling 2-week data window rather than on features, while Braintrust and Arize both roughly double their retention window exactly where the paid tier begins. Keeping data queryable, not receiving it, is where the second dollar comes from.

Defensive unit design is the second pattern. Every vendor that survived repricing simplified toward a unit the buyer can audit. Comet prices Opik spans flat per account; Athina AI deliberately meters executions while ingested logs cost zero; PromptLayer folds requests, agent runs, and eval cells into a single transaction, trading precision for forecastability. That is the legibility-versus-cost-tracking trade-off that the “choosing the right usage metric” guide frames directly. Even where meters multiply, they stay named and published: Braintrust’s three lines and Weights & Biases’s four are each individually visible on the pricing page rather than rolled into an opaque total.

Free tier as an acquisition funnel, backed by open source. Because telemetry is only valuable at production volume, the free quota is the top of the funnel, and behind most tiers sits an open-source escape hatch (Phoenix, Opik, Langfuse self-host, W&B’s wandb SDK) that both seeds adoption and disciplines the managed price. The FinOps outliers reach the same “adopt free, grow into paid” outcome from the other direction: Vantage pairs a genuinely free tier with high-traffic free tools like EC2Instances.info to seed developers before any paid plan.

Counterexamples & variants

Multi-meter cloud bills, not SaaS tiers. Braintrust breaks the “one ingestion meter” mold with three independent meters closer to a cloud invoice than a SaaS plan, with the processed-data line as the recurring surprise. Weights & Biases goes further: four simultaneous meters and a deliberately asymmetric price that runs its hot, queryable Weave trace stream at roughly 3,000x its cold model-storage rate, pricing observability far above artifacts. W&B also codes its self-serve-to-sales handoff into the pricing page with a hard 50-employee cap on Pro — a rare structural tripwire instead of a usage threshold.

Metering work done, not data received. Athina AI inverts the category’s core assumption: telemetry in is free, and only compute the platform itself initiates — prompt runs, eval cells, flow steps — burns credits. Patronus AI breaks it the other way, with no ingestion meter at all on its self-serve tier, just a hard 2-week data window, the real money living in a quoted enterprise plan plus an optional pay-as-you-go evaluation API. Both prove “observability pricing” can meter evaluation and computation rather than raw data volume.

FinOps observability, priced by the bill it watches. The sharpest variant is the cost-observability pair, where neither vendor meters telemetry at all — the “observed” object is the customer’s own spend. Vantage gates plans by monitored cloud spend at fixed monthly rates it advertises as pricing that “doesn’t contribute to your cost problems” — usage-shaped without being a percentage of spend. Finout prices for the certainty its finance buyers want: a flat annual fee tiered by committed cloud spend (reported at roughly 1% of the bill, ~$1,000/mo up to ~$500k spend), unlimited users, and “no surprise overage charges.” The price scales with the size of the problem, not with events ingested.
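The spend-keyed variant amounts to a threshold lookup rather than a meter. The sketch below uses the Vantage ceilings quoted in this article ($2.5k, $7.5k, and $20k of monitored monthly spend); the function name, the inclusive treatment of each ceiling, and the "Enterprise (sales-quoted)" label for spend above the top ceiling are assumptions for illustration, and no tier fees are modeled because none are given here.

```python
# (monthly monitored-spend ceiling in USD, tier name), in ascending order
SPEND_TIERS = [
    (2_500, "Free"),
    (7_500, "Pro"),
    (20_000, "Business"),
]


def tier_for_spend(monthly_spend_usd: float) -> str:
    """Return the first tier whose ceiling covers the monitored spend.

    Ceilings are treated as inclusive (spend <= ceiling), which is an
    assumption about how the boundaries are applied.
    """
    if monthly_spend_usd < 0:
        raise ValueError("monitored spend cannot be negative")
    for ceiling, name in SPEND_TIERS:
        if monthly_spend_usd <= ceiling:
            return name
    # Hypothetical label: the article only says top tiers are gated behind sales.
    return "Enterprise (sales-quoted)"


if __name__ == "__main__":
    for spend in (1_200, 2_500, 6_000, 18_000, 45_000):
        print(spend, "->", tier_for_spend(spend))
```

Because the input is the customer's own cloud bill, the plan changes only when spend crosses a ceiling, not with every event ingested.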

What this means for buyers vs vendors

For buyers

Run a week of production-shaped traffic through two or three free tiers and read the actual meters before choosing — the same workload registers as wildly different volumes depending on whether the unit is a trace (Galileo), a span (Arize, Comet), a composite unit (Langfuse), or a byte, where a per-MB line can produce four-figure surprises on payload-heavy traces. Price the retention you actually need for audits rather than the tier default, and separate your two questions: if you are buying application observability, the LLM-telemetry cohort is the field; if you are buying cost observability, Finout and Vantage price by your cloud bill, not your trace volume, and the comparison is a different one. In a consolidating category, ask what happens to your rates and your data-export path if the vendor is acquired — three of these twelve were, inside eighteen months.
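To see why the unit choice matters before running that trial, it can help to count the same traffic several ways locally. The following is a rough sketch under stated assumptions: the `Span` and `Trace` types are hypothetical stand-ins for whatever your instrumentation emits, and the composite unit is defined here as traces + spans + scores purely for illustration, not as any vendor's published definition.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    payload_bytes: int  # size of the logged inputs/outputs for this span
    scores: int = 0     # evaluation scores attached to this span


@dataclass
class Trace:
    spans: list[Span] = field(default_factory=list)


def count_units(traces: list[Trace]) -> dict[str, int]:
    """Count one workload under four billing units, coarse to fine."""
    all_spans = [span for trace in traces for span in trace.spans]
    n_traces = len(traces)
    n_spans = len(all_spans)
    n_scores = sum(span.scores for span in all_spans)
    return {
        "traces": n_traces,
        "spans": n_spans,
        # Hypothetical composite definition; check the vendor's own docs.
        "composite_units": n_traces + n_spans + n_scores,
        "bytes": sum(span.payload_bytes for span in all_spans),
    }


# One agent request: 12 spans of 20 kB each, two of which carry a score.
example_trace = Trace(spans=[Span(20_000, scores=1), Span(20_000, scores=1)]
                      + [Span(20_000) for _ in range(10)])

if __name__ == "__main__":
    print(count_units([example_trace]))
```

A single request here registers as 1 trace, 12 spans, 15 composite units, or 240,000 bytes, so multiplying by monthly request volume gives very different positions against each vendor's quota.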

For vendors

The telemetry cohort’s settled playbook is clear: meter ingestion, free the seats, ladder retention, and keep an open-source pressure valve. The live design question is bill-shock control. Graduated curves (Langfuse) and flat team accounts (Comet’s Opik) both beat raw linear overage on buyer trust. If your true cost driver is bytes rather than events, surface it as its own published axis the way Arize does, rather than letting a hidden data meter aggregate into an invoice the buyer cannot reconstruct; that is the lesson Braintrust’s processed-data line and W&B’s Weave meter both teach. The FinOps variant points to a second strategy entirely: when your buyer is finance, price for certainty the way Finout does, with a flat committed-spend fee and no overage, even if it means leaving usage upside on the table, because a predictable invoice is itself the product for that buyer.

Company	Product	Pricing model	Billing units	Free tier	Verified
Arize AI	AI & LLM observability (Arize AX + Phoenix OSS)	freemium, hybrid	trace-spans, gb-ingested	Yes	2026-06-09
Athina AI	Collaborative AI development platform for building, testing, evaluating and monitoring LLM features	freemium	credits, events	Yes	2026-06-04
Braintrust	LLM evaluation & observability platform	hybrid	tokens, storage-gb, scores	Yes	2026-07-22
Comet	AI/ML observability and experiment-tracking platform: Opik (LLM/agent observability) and Comet MLOps (experiment tracking)	freemium, seat-based, hybrid	seats, gpu-hours, storage-gb	Yes	2026-06-02
Finout	Enterprise cloud + AI cost observability (FinOps) platform	subscription, commitment	datapoints	No	2026-07-23
Galileo	AI observability, evaluation, and guardrails platform for agents and LLM apps	freemium, hybrid	events	Yes	2026-06-04
HoneyHive	AI observability and evaluation platform for LLM and agent applications	freemium	events	Yes	2026-06-04
Langfuse	Open-source LLM observability, evals, and prompt management	freemium, hybrid, subscription	units, events, seats	Yes	2026-07-23
Patronus AI	LLM and AI agent evaluation, monitoring, and guardrail platform	freemium, pure-usage	api-calls, credits	Yes	2026-06-04
PromptLayer	Prompt management, evaluation, and observability platform for LLM and AI-agent teams	freemium, hybrid	seats, requests, transactions	Yes	2026-07-22
Vantage	Cloud + AI cost monitoring and FinOps platform	subscription, hybrid	seats, datapoints	Yes	2026-06-10
Weights & Biases	MLOps experiment tracking, W&B Weave LLM observability/evals, Models registry, and Serverless Inference	freemium, hybrid, seat-plus-usage	seats, storage-gb, traces	Yes	2026-07-21

FAQ

How is LLM observability priced?

Almost universally on ingestion volume — traces, spans, events, or composite units — against a tier's included quota, with retention windows as the second axis. Langfuse bills units at $8 falling to $6 per 100k past a 50k free allowance, Arize meters trace-spans plus ingested GB, and Galileo meters whole traces (5,000 free, 50,000 on its $100 Pro tier). Seats are usually free or unlimited so the meter does the work.

Which companies are in the observability pricing cohort?

Twelve in-corpus platforms carry the observability product segment: Arize AI, Athina AI, Braintrust, Comet (Opik), Galileo, HoneyHive, Langfuse, Patronus AI, and PromptLayer on the LLM/ML-telemetry side, plus Weights & Biases (MLOps + Weave) and the FinOps cost-observability pair Finout and Vantage. Eleven of the twelve ship a free tier; Finout, which is quote-only, is the exception.

Why do observability free tiers feel so generous?

Because telemetry only becomes valuable at production volume, the free quota is the acquisition funnel. Langfuse gives 50k units a month, Arize and Comet's Opik 25k spans, HoneyHive 10k events, and Galileo 5k traces. The bet is that instrumented apps grow into the paid meters.

What is the retention axis in observability pricing?

How long your traces stay queryable, sold separately from ingestion. Arize keeps 15 days free and 30 on Pro, Braintrust steps from 14 days on Starter to 30 on Pro, Patronus gates its free tier on a rolling 2-week data window, and Langfuse jumps from 90 days to 3 years only at its $199 Pro tier. Keeping data usually costs more than ingesting it.

Do observability platforms charge per seat?

Mostly no — the meter is ingestion, so seats are deliberately free to spread adoption. Braintrust offers unlimited users on every tier, Comet's Opik Pro covers 50 members for a flat $19, and Langfuse and Finout keep users unlimited. PromptLayer and Weights & Biases are the exceptions, with W&B billing seats and capping its Pro tier to teams under 50 employees.

How is FinOps cost-observability priced differently?

Finout and Vantage price by the size of the cloud bill they watch, not by telemetry ingested. Vantage is free up to $2,500 monitored monthly spend and steps through fixed-rate Pro and Business tiers, while Finout charges a flat annual fee tiered by committed spend, reported to work out to roughly 1% of the bill. Both avoid a running percentage-of-spend meter: the fee is fixed per tier rather than recomputed from each month's spend.
