Methodology
How we estimate home value
The data, the model, and the limits. We'd rather be honest about what our estimate can and can't tell you than pretend it's infallible.
In plain English
When you enter an address, our system pulls public records for that property (recorded sale prices, parcel details, tax assessments), identifies comparable sales in the same area, and applies an AI model calibrated on hundreds of thousands of historical transactions to produce a value estimate and a confidence range.
It is not a substitute for a licensed appraisal or a comparative market analysis. It is a fast, reasonable starting point — the kind of number you'd want before making a phone call, not before signing a contract.
What data we use
Public records
- Recorded deed transfers
- Parcel & tax assessor data
- Historical sale prices
- Property characteristics (beds/baths/sqft/lot size)
Refreshed as counties publish updates
Market signals
- Neighborhood sale trends (90-day rolling)
- Days-on-market
- List-to-sale ratios
- Market-level mortgage-rate context
Weekly refresh
Property-specific inputs
- User-supplied condition & updates
- Photographs (when provided)
- Reported renovations & permits
Per-request; apply only to that estimate
Named vendor disclosures (AVM provider, walk-score source, flood-zone source, school ratings source) are available on request — we route those through a formal disclosure document so the list stays accurate as we switch providers.
How the estimate is produced
Each estimate is a blend of up to three signals, weighted by how trustworthy each one is for the specific address:
- A licensed professional AVM (Automated Valuation Model) that produces a central point estimate from public records, tax assessments, and recent sales nationwide. This is the dominant signal — typically 85–100% of the blended weight — when comp coverage is good.
- A comparable-sales calculation: we pull the most similar recent sales within the same market, normalize to price-per-square-foot, and weight them by recency, proximity, and how closely the comp matches the subject property on beds, baths, lot, and age. This layer is where the estimate gets adjusted for the things the national AVM misses — local condition, recent renovations, micro-market pricing.
- When available, a trained machine-learning model fitted on our own history of estimate/actual-sale pairs. This is a secondary signal, capped at roughly 15% of the blend, and only contributes when we have enough coverage in the target market for the model to be calibrated.
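A minimal sketch of how the comparable-sales calculation and the blend described above might fit together. Only the price-per-square-foot normalization, the weighting by recency, proximity, and similarity, and the roughly 15% cap on the machine-learning signal come from the description; every name, the exponential decay constants, the `similarity` score, and the `comp_share` parameter are hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Comp:
    sale_price: float
    sqft: float
    months_ago: float   # recency of the sale
    miles_away: float   # distance from the subject property
    similarity: float   # 0..1 match on beds, baths, lot, age (hypothetical score)

def comp_weight(c: Comp) -> float:
    # ASSUMPTION: exponential decay over age and distance; constants are illustrative.
    recency = math.exp(-c.months_ago / 6.0)
    proximity = math.exp(-c.miles_away / 0.25)
    return recency * proximity * c.similarity

def comp_estimate(comps: Sequence[Comp], subject_sqft: float) -> Optional[float]:
    """Weighted average price per square foot, scaled to the subject's size."""
    weights = [comp_weight(c) for c in comps]
    total = sum(weights)
    if total <= 0:
        return None  # no usable comps
    ppsf = sum(w * c.sale_price / c.sqft for w, c in zip(weights, comps)) / total
    return ppsf * subject_sqft

ML_CAP = 0.15  # the ML signal is capped at roughly 15% of the blend

def blend(avm: float, comp: Optional[float] = None, ml: Optional[float] = None,
          comp_share: float = 0.0, ml_share: float = 0.0) -> float:
    """Blend up to three signals; the AVM takes whatever weight is left over."""
    comp_share = comp_share if comp is not None else 0.0
    ml_share = min(ml_share, ML_CAP) if ml is not None else 0.0
    total = (1.0 - comp_share - ml_share) * avm
    if comp is not None:
        total += comp_share * comp
    if ml is not None:
        total += ml_share * ml
    return total
```

With only an AVM value available, `blend(avm)` returns that value unchanged, which matches the case where the AVM carries 100% of the weight.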
The confidence range you see next to the estimate is not a guess. It is computed from a four-pillar confidence score: address quality (did we resolve a specific parcel or a rough centroid?), detail completeness (did you provide beds / baths / sqft, or did we fall back on tax records?), comparable coverage (how many usable comps within 0.5 miles in the last 12 months?), and market stability (is the local market moving fast or slowly?). Each pillar contributes 0–25 points, for a total score of 0–100. High-confidence properties get tight bands (±4.5%); low-confidence ones get wider bands (up to ±12%).
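As a rough illustration of the scoring above, the sketch below sums four pillar scores and maps the total to a band width. The pillar ranges and the ±4.5% and ±12% endpoints come from the text; the linear interpolation between them is an assumption (the real mapping may be tiered), and all function names are hypothetical.

```python
def confidence_score(address_quality: float, detail_completeness: float,
                     comp_coverage: float, market_stability: float) -> float:
    """Sum of four pillars, each worth 0-25 points (total 0-100)."""
    pillars = (address_quality, detail_completeness, comp_coverage, market_stability)
    for p in pillars:
        if not 0 <= p <= 25:
            raise ValueError("each pillar must be between 0 and 25")
    return sum(pillars)

def band_pct(score: float, tight: float = 4.5, wide: float = 12.0) -> float:
    # ASSUMPTION: linear interpolation from ±12% at score 0 to ±4.5% at score 100.
    return wide - (wide - tight) * (score / 100.0)

def value_range(estimate: float, score: float) -> tuple[float, float]:
    """Low and high ends of the range shown next to the estimate."""
    half_width = estimate * band_pct(score) / 100.0
    return (estimate - half_width, estimate + half_width)
```

Under these assumptions, a $500,000 estimate with a perfect score of 100 would show a range of $477,500 to $522,500.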
Post-sale, we track the gap between our estimate and the recorded sale price and feed that error back into market-specific adjustment factors. The calibration infrastructure is in place; automated rolling refreshes are planned but not yet in production.
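One plausible way to turn estimate/sale pairs into market-specific adjustment factors is sketched below. The text does not say how the factors are computed, so the median sale-to-estimate ratio, the `min_pairs` threshold, and the function name are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import median

def market_adjustment_factors(pairs, min_pairs=30):
    """pairs: iterable of (market, estimate, sale_price) tuples.

    Returns {market: factor}, where factor is the median sale/estimate ratio.
    A factor above 1.0 means estimates in that market have been running low.
    """
    ratios = defaultdict(list)
    for market, estimate, sale_price in pairs:
        if estimate > 0:
            ratios[market].append(sale_price / estimate)
    # ASSUMPTION: skip markets with too few pairs to give a stable factor.
    return {m: median(r) for m, r in ratios.items() if len(r) >= min_pairs}
```

A future estimate in a calibrated market could then be multiplied by its factor; markets below the threshold would be left unadjusted.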
Accuracy & limits
Public AVMs benchmark their accuracy with Median Absolute Error (MAE): the median of the absolute differences between estimate and sale price on recent sales, usually expressed as a percentage of the sale price. We'll publish our own per-market MAE when the sample is large enough for the numbers to be stable, rather than putting an unstable figure on the page. What we can tell you today is where the estimate is structurally most and least reliable.
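For reference, Median Absolute Error as a percentage of sale price can be computed as in the sketch below. The function name and the input shape are hypothetical; the numbers in the usage note are made up purely to show the arithmetic.

```python
from statistics import median

def median_absolute_error_pct(pairs):
    """pairs: iterable of (estimate, sale_price) tuples.

    Returns the median of |estimate - sale| / sale, as a percentage.
    """
    errors = [abs(est - sale) / sale * 100.0 for est, sale in pairs if sale > 0]
    if not errors:
        raise ValueError("need at least one pair with a positive sale price")
    return median(errors)
```

For three illustrative sales at $100,000 with estimates of $102,000, $95,000, and $110,000, the absolute errors are 2%, 5%, and 10%, so the median is 5%. Because it is a median, a single large miss does not move the figure the way it would move a mean.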
Where the estimate is strongest
- Single-family homes in active suburban markets
- Properties with at least five comps within 0.5 miles in the last 12 months
- Homes between 1,200 and 4,500 sq ft built after 1950
- Standard-grade construction (not custom or luxury tier)
Where error is higher
- New construction with no comparable sales history
- Rural properties (sparse comps; acreage drives value)
- Unique or custom architecture
- Luxury tier ($3M+) where each sale is effectively bespoke
- Properties recently renovated in ways not yet in public records
- Markets with low transaction volume in the past 12 months
On the roadmap
We're deliberate about what we publish here. Accuracy numbers are only useful when they're honest, recent, and cut by the dimensions that matter — which means waiting until we have enough sale-to-estimate pairs for the statistics to stabilize rather than putting a plausible-looking number on the page.
- Per-market Median Absolute Error by MSA and property type — will publish once the sample reaches a rolling-window size that makes the numbers stable, not before.
- Public accuracy dashboard (live, not point-in-time) — follows the MAE publication.
When you should get a CMA instead
An instant estimate is a research tool. It's not the right tool for these decisions:
- Setting a list price. A licensed agent's CMA can account for the specific home's condition, recent upgrades, and micro-market context that an AVM can't.
- Qualifying for a refinance or cash-out. Lenders require an appraisal. Don't rely on the AVM number for underwriting decisions.
- Negotiating a contract. Use the AVM to sanity-check; don't rely on it as the anchor number.
- Tax assessment disputes. Jurisdictions typically want evidence from appraisal reports or comparable-sale affidavits, not AVM screenshots.
Report a bad estimate
If you know an estimate is materially wrong — say your house just appraised for $100,000 more or less than our number — tell us. Corrections feed our calibration pipeline, and we flag known-bad estimates in the UI.
Frequently asked
Where does the estimate data come from?
Estimates combine public records (deed transfers, recorded sales, parcel + tax assessor data), regional market trends, and — where available — user-supplied details about the specific property. The platform does not have direct MLS access, so active-listing data used to inform market context is derived from public aggregators, not the underlying MLS.
How accurate is a PropertyTools AI home value estimate?
Accuracy varies meaningfully by market, property type, and data freshness. The primary signal behind our estimate is a licensed professional AVM used by mortgage lenders and title insurers — accuracy is in the same ballpark as public AVMs like Zestimate and Redfin Estimate (both around 1.8–2.0% on-market MAE). Edge cases — new construction, rural markets, unique architecture, recent renovations not yet in public records — can produce larger errors. We'll publish our own per-market MAE here when the sample size makes the numbers stable, rather than posting a plausible-looking figure now.
How often is the estimate updated?
Underlying market data and comparable sales refresh on a rolling cadence. Individual address estimates recalculate on each request, so the value you see reflects the most recent data at the moment of the query. Update cadence for specific data feeds is listed under "What data we use."
When should I use a real CMA instead?
An AVM-based estimate is useful for quick benchmarking. If you're pricing to sell, qualifying for a refi, or making a contract decision, use a full comparative market analysis (CMA) prepared by a licensed agent who can inspect the property, weight comps by hand, and account for recent improvements. PropertyTools AI offers a separate CMA-report generator, and access to licensed agents via LeadSmart AI.
What if my estimate is wrong?
Report it. Every estimate ships with a confidence indicator and a range; the range is intentionally wider when our input quality is lower. When a report comes in, it feeds into the calibration data we track post-sale, which tunes market-specific adjustment factors over time.
Do you train AI models on user-submitted data?
User-submitted property details are used to refine estimates for that specific address. Aggregated, de-identified patterns inform model calibration across markets. Personally identifying information and address-level inputs are not used for generalized model training.