Methodology

How we collect, normalize, and analyze hospital pricing data from thousands of facilities to give you an accurate picture of what medical procedures actually cost.

Overview

ORVO aggregates publicly available hospital pricing data mandated by federal transparency rules. Every data point in our system comes from an auditable, public source — either a hospital's own machine-readable file or a CMS-published fee schedule. We do not estimate, model, or fabricate prices.

When you upload a bill, we match your line items against this dataset to show you what other facilities charge for the same procedures — locally, statewide, and nationally.

Data Sources

Two federally mandated data sources, combined with a three-level geographic analysis, form the foundation of every comparison we generate.

01. Hospital MRF Data

Every US hospital is legally required to publish machine-readable pricing files under 45 CFR Part 180. We ingest all of them, including the median allowed amounts required starting in 2026, which are derived from real insurer payments.

02. Medicare Benchmarks

CMS publishes what Medicare pays for every covered procedure. We treat this as the floor: a baseline rate that a hospital should not reasonably exceed by more than 2–3x for a self-pay patient.

03. Geographic Analysis

We compare your bill at three levels (local facilities near your ZIP code, statewide hospital data, and national benchmarks) so you can see where your charges sit in the market.

Machine-Readable Files

What are MRFs?

Since January 2021, the Hospital Price Transparency Rule (45 CFR Part 180) has required every US hospital to publish a machine-readable file containing the prices it has negotiated with each insurer, as well as its gross (chargemaster) and cash/self-pay rates.

Starting in 2026, hospitals must also publish the median allowed amount: the midpoint of the dollar amounts actually paid by commercial insurers for each service, so half of the payments fall at or below it. Because it reflects real payments rather than list prices, it is the closest reflection of real-world pricing in these files.

These files are published in CSV or JSON format, typically containing thousands to hundreds of thousands of rows per facility. We discover, download, parse, and normalize all of them into a single queryable dataset.
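As a rough illustration of handling both formats, the sketch below reads either a CSV or a JSON payload into a list of plain dictionaries. The field names (`code`, `payer`, `price`) and the JSON key `items` are hypothetical placeholders chosen for this example, not the columns of the actual CMS templates.

```python
import csv
import io
import json


def load_rows(payload: str, fmt: str) -> list[dict]:
    """Return one dict per price record from a CSV or JSON payload.

    Assumes a simplified, hypothetical layout; real MRFs are much larger
    and use a richer schema.
    """
    if fmt == "csv":
        return [dict(row) for row in csv.DictReader(io.StringIO(payload))]
    if fmt == "json":
        return list(json.loads(payload)["items"])
    raise ValueError(f"unsupported format: {fmt}")


# Made-up sample data carrying the same record in both formats.
csv_payload = "code,payer,price\n70551,Acme Health,1200.00\n"
json_payload = (
    '{"items": [{"code": "70551", "payer": "Acme Health", "price": "1200.00"}]}'
)

csv_rows = load_rows(csv_payload, "csv")
json_rows = load_rows(json_payload, "json")
```

Both calls yield the same list of dictionaries, which is the property a later normalization step would rely on.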

Our ingestion pipeline

Discovery

We crawl hospital websites and CMS-published index files to locate every MRF URL. Files are validated for schema conformance before processing.
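A minimal sketch of what a per-record conformance check could look like is shown below. The required field names are assumptions made for illustration; they are not taken from the CMS schema or from our production validator.

```python
# Hypothetical required columns, chosen only for this example.
REQUIRED_FIELDS = ("code", "code_type", "price")


def _text(value) -> str:
    """Render a field as stripped text, treating None as empty."""
    return "" if value is None else str(value).strip()


def validate_record(record: dict) -> list[str]:
    """Return the problems found in one record (empty list if it conforms)."""
    problems = [
        f"missing {field}"
        for field in REQUIRED_FIELDS
        if not _text(record.get(field))
    ]
    price = _text(record.get("price"))
    if price:
        try:
            if float(price) < 0:
                problems.append("negative price")
        except ValueError:
            problems.append("price is not numeric")
    return problems


good = {"code": "70551", "code_type": "CPT", "price": "1200.00"}
bad = {"code": "70551", "code_type": "", "price": "n/a"}
```

Returning a list of problems rather than a boolean lets a file be rejected with a specific reason for each failing row.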

Parsing & Normalization

Raw MRF data comes in dozens of formats. We normalize every record to a consistent schema — standardizing billing codes (CPT, HCPCS, DRG), aligning payer names, and converting all dollar amounts to a uniform representation.
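The three normalization steps named above could be sketched roughly as follows. The alias table, the function names, and the choice of integer cents as the "uniform representation" are all assumptions for illustration.

```python
import re
from decimal import ROUND_HALF_UP, Decimal

# Hypothetical alias table; a real one would hold many more entries.
PAYER_ALIASES = {
    "bcbs": "Blue Cross Blue Shield",
    "blue cross blue shield": "Blue Cross Blue Shield",
}


def normalize_code(raw: str) -> str:
    """Uppercase a billing code and remove all whitespace."""
    return re.sub(r"\s+", "", raw).upper()


def normalize_payer(raw: str) -> str:
    """Map a payer name to its canonical spelling when an alias is known."""
    key = re.sub(r"\s+", " ", raw).strip().lower()
    return PAYER_ALIASES.get(key, raw.strip())


def to_cents(raw: str) -> int:
    """Convert text such as '$1,200.50' to integer cents (assumed target form)."""
    cleaned = raw.replace("$", "").replace(",", "").strip()
    cents = (Decimal(cleaned) * 100).to_integral_value(rounding=ROUND_HALF_UP)
    return int(cents)


code = normalize_code(" g0463 ")
payer = normalize_payer("  BCBS ")
amount = to_cents("$1,200.50")
```

Using `Decimal` rather than `float` avoids binary rounding surprises when dollar strings are converted.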

Entity Resolution

Hospitals appear under multiple names, NPIs, and CCNs across different files. We resolve these to a single canonical facility record, linked to geographic coordinates via PostGIS.
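One common way to merge records that share identifiers is a union-find over the identifiers. The sketch below groups record ids that share an NPI or a CCN, directly or transitively. It illustrates the general technique, not our production resolver; the record layout and the identifier values are made up.

```python
def resolve_facilities(records: list[dict]) -> list[set[str]]:
    """Group record ids that share an NPI or a CCN, directly or transitively."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    first_seen: dict[tuple[str, str], str] = {}  # identifier -> first record id
    for r in records:
        for key in ("npi", "ccn"):
            value = r.get(key)
            if not value:
                continue
            ident = (key, value)
            if ident in first_seen:
                parent[find(r["id"])] = find(first_seen[ident])
            else:
                first_seen[ident] = r["id"]

    groups: dict[str, set[str]] = {}
    for r in records:
        groups.setdefault(find(r["id"]), set()).add(r["id"])
    return list(groups.values())


# Placeholder identifiers, not real NPIs or CCNs.
records = [
    {"id": "a", "npi": "1111111111", "ccn": None},
    {"id": "b", "npi": "1111111111", "ccn": "330001"},
    {"id": "c", "npi": None, "ccn": "330001"},
    {"id": "d", "npi": "9999999999", "ccn": "450002"},
]
clusters = sorted(sorted(group) for group in resolve_facilities(records))
```

Here `a` and `c` share no identifier with each other, yet they land in the same cluster because each shares one with `b`.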

Quality Assurance

Automated audits flag outliers (e.g., a $500,000 aspirin), check for completeness, and verify that price distributions are statistically reasonable before data enters the production dataset.
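As a simplified illustration of outlier flagging, the sketch below marks any price for a given item that is non-positive or that differs from the median by more than a fixed ratio. The 50x threshold and the function name are hypothetical; the actual audit rules are not specified here.

```python
from statistics import median


def flag_outliers(prices: list[float], max_ratio: float = 50.0) -> list[float]:
    """Return prices that are non-positive or far from the median.

    max_ratio is an illustrative threshold, not a documented setting.
    """
    positive = [p for p in prices if p > 0]
    flagged = [p for p in prices if p <= 0]
    if positive:
        mid = median(positive)
        flagged += [
            p for p in positive if p > mid * max_ratio or p < mid / max_ratio
        ]
    return flagged


# Made-up per-tablet aspirin prices, including one implausible entry.
aspirin_prices = [0.10, 0.12, 0.15, 0.20, 500000.0]
suspicious = flag_outliers(aspirin_prices)
```

Comparing against the median rather than the mean keeps a single extreme value from distorting the reference point it is judged against.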

Medicare Fee Schedules

The government baseline

CMS publishes Medicare Physician Fee Schedule (MPFS) and Outpatient Prospective Payment System (OPPS) rates annually. These represent what Medicare pays clinicians and hospital outpatient departments for covered procedures, adjusted for geographic locality.

We load the complete fee schedule for every billing code and locality, giving us a floor price for every procedure in every region of the country. When your bill charges 5x or 10x the Medicare rate, that context is immediately visible.
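The comparison described above amounts to a lookup keyed by billing code and locality, followed by a division. In this sketch the table contents, locality labels, and dollar amounts are invented for illustration and are not actual Medicare rates.

```python
from typing import Optional

# Hypothetical fee schedule: (billing code, locality) -> Medicare rate in dollars.
FEE_SCHEDULE = {
    ("70551", "locality-A"): 240.00,
    ("70551", "locality-B"): 300.00,
}


def medicare_multiple(code: str, locality: str, billed: float) -> Optional[float]:
    """Return billed / Medicare rate, or None when no rate is on file."""
    rate = FEE_SCHEDULE.get((code, locality))
    if rate is None:
        return None
    return billed / rate


multiple = medicare_multiple("70551", "locality-A", 2400.00)
unknown = medicare_multiple("99999", "locality-A", 100.00)
```

With these made-up numbers the bill comes out at 10 times the Medicare rate, and a code with no rate on file returns `None` instead of a misleading ratio.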

Why Medicare matters for self-pay patients

Medicare rates are set through public federal rulemaking by the largest single payer in the US, rather than negotiated in private. Many hospitals offer self-pay discounts benchmarked at 1.5–3x the Medicare rate. Knowing the Medicare rate for your procedure gives you a concrete, defensible number to reference in any negotiation.
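To make the 1.5–3x benchmark concrete, the sketch below turns a Medicare rate into a reference range for a self-pay negotiation. The function name and the $240 rate are illustrative assumptions; only the two multipliers come from the text above.

```python
def self_pay_reference_range(
    medicare_rate: float, low: float = 1.5, high: float = 3.0
) -> tuple[float, float]:
    """Return (low, high) dollar targets as multiples of the Medicare rate."""
    if medicare_rate <= 0:
        raise ValueError("medicare_rate must be positive")
    return (medicare_rate * low, medicare_rate * high)


# Made-up Medicare rate of $240 for some procedure.
low_target, high_target = self_pay_reference_range(240.00)
```

For a $240 Medicare rate this gives a range of $360 to $720.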

Geographic Comparison Model

Pricing varies dramatically by location. A knee MRI in Manhattan can cost 4x what it costs 30 miles away in New Jersey. Our three-tier geographic model ensures comparisons are meaningful.

Local (25 mi)

All facilities within 25 miles of your ZIP code. This is your most actionable comparison — these are hospitals you could realistically visit instead.
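A radius filter of this kind can be sketched with the haversine great-circle formula. This is a plain-Python stand-in for illustration only; it assumes the ZIP code has already been mapped to a latitude and longitude, and the facility records and coordinates are made up.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def within_radius(
    origin: tuple[float, float], facilities: list[dict], radius_miles: float = 25.0
) -> list[dict]:
    """Keep facilities whose distance from origin is at most radius_miles."""
    lat, lon = origin
    return [
        f
        for f in facilities
        if haversine_miles(lat, lon, f["lat"], f["lon"]) <= radius_miles
    ]


# Hypothetical coordinates for a ZIP code centroid and two facilities.
zip_centroid = (40.0, -74.0)
facilities = [
    {"name": "Near Hospital", "lat": 40.1, "lon": -74.0},  # about 7 miles away
    {"name": "Far Hospital", "lat": 41.0, "lon": -74.0},   # about 69 miles away
]
local = within_radius(zip_centroid, facilities)
```

Because the filter uses distance alone, a local comparison can include facilities across a state line.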

Statewide

Every hospital in your state. State-level data reveals whether your local market is generally expensive or competitive, and gives leverage in negotiations.

National

All US facilities. The national distribution shows the full range of what hospitals charge, from the cheapest rural clinics to the most expensive academic medical centers.
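The three tiers can be thought of as three nested views over the same price records. The sketch below assumes each record already carries a precomputed distance from the patient's ZIP code and a state abbreviation; the field names and sample values are hypothetical.

```python
def tier_prices(
    facilities: list[dict], home_state: str, radius_miles: float = 25.0
) -> dict[str, list[float]]:
    """Split facility prices into local, statewide, and national tiers."""
    return {
        "local": [
            f["price"] for f in facilities if f["distance_miles"] <= radius_miles
        ],
        "state": [f["price"] for f in facilities if f["state"] == home_state],
        "national": [f["price"] for f in facilities],
    }


# Made-up prices for one billing code at four facilities.
sample = [
    {"state": "NY", "distance_miles": 3.0, "price": 2400.0},
    {"state": "NJ", "distance_miles": 18.0, "price": 600.0},
    {"state": "NY", "distance_miles": 140.0, "price": 900.0},
    {"state": "TX", "distance_miles": 1500.0, "price": 700.0},
]
tiers = tier_prices(sample, home_state="NY")
```

In this example the local tier includes the New Jersey facility because it is within 25 miles, while the statewide tier includes only the two New York facilities.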

Statistical Approach

How we compute market rates

For every billing code in every geographic tier, we compute a distribution of observed prices. Key statistics include:

P25

The 25th percentile — what lower-cost facilities charge. A strong benchmark for negotiation.

P50

The median — what a typical facility charges. This is the "fair market rate" for a procedure in your area.

P75

The 75th percentile, marking the upper range. If your bill is above this, you were charged more than roughly three quarters of the prices observed in that market.

These statistics are pre-computed as materialized views in our database and refreshed as new data is ingested. This ensures sub-second query times when you compare your bill.
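The P25, P50, and P75 statistics described above can be reproduced on a small sample with Python's standard library. The interpolation method (`"inclusive"`) is an assumption made for this sketch, since different percentile conventions give slightly different values on small samples, and the prices are made up.

```python
from statistics import quantiles


def market_quartiles(prices: list[float]) -> dict[str, float]:
    """Compute P25, P50, and P75 for a list of observed prices."""
    p25, p50, p75 = quantiles(prices, n=4, method="inclusive")
    return {"p25": p25, "p50": p50, "p75": p75}


def position(charge: float, stats: dict[str, float]) -> str:
    """Describe where a billed charge sits relative to the quartiles."""
    if charge > stats["p75"]:
        return "above P75"
    if charge > stats["p50"]:
        return "between P50 and P75"
    if charge >= stats["p25"]:
        return "between P25 and P50"
    return "below P25"


# Made-up observed prices for one billing code in one tier.
observed = [100.0, 200.0, 300.0, 400.0, 500.0]
stats = market_quartiles(observed)
label = position(450.0, stats)
```

With these five prices the quartiles are 200, 300, and 400, so a $450 charge is labeled as above P75.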

Data Freshness & Updates

Hospitals are required to update their MRFs at least annually, though many update more frequently. We re-crawl and re-ingest the full dataset on a rolling basis to capture updates as they are published.
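A rolling re-crawl can be pictured as repeatedly picking the facilities whose files were fetched longest ago. The sketch below is one simple way to do that; the 30-day window, the function name, and the facility ids are hypothetical and do not describe our actual schedule.

```python
from datetime import date, timedelta


def due_for_recrawl(
    last_crawled: dict[str, date], today: date, max_age_days: int = 30
) -> list[str]:
    """Return facility ids last crawled before the cutoff, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    stale = [fid for fid, seen in last_crawled.items() if seen < cutoff]
    return sorted(stale, key=lambda fid: last_crawled[fid])


# Made-up crawl history.
history = {
    "facility-A": date(2026, 2, 20),
    "facility-B": date(2025, 12, 1),
    "facility-C": date(2026, 1, 15),
}
queue = due_for_recrawl(history, today=date(2026, 3, 1))
```

Ordering the queue oldest first means the stalest files are refreshed before the more recent ones.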

Medicare fee schedules are published annually by CMS, typically in November for the following calendar year. We load each new schedule within days of release.

What we don't do

  • We do not estimate or predict prices using statistical models
  • We do not scrape patient-reported billing data from forums or surveys
  • We do not use insurance claims data that is not publicly mandated

Every number in our system traces back to a published, verifiable source.
