MLS vs. Public Records vs. Property Data: A Plain-English Guide to Real Estate Data Types 


Ask ten people in proptech what they mean by real estate data and you will get ten answers that describe different things. Some mean MLS listing feeds. Some mean county property records. Some mean enriched data products that combine both. Some mean all three, without realizing those are distinct categories with different sources, different update frequencies, and fundamentally different applications. 

This ambiguity has real consequences. Building a product on the wrong type of data, or expecting data to answer questions it structurally cannot answer, is one of the most common and costly architecture mistakes in the industry. A lender who tries to use MLS listing data for ownership verification is working with the wrong source. A consumer search product that relies on property records for active listing inventory will miss most of the market. An AVM that does not combine both will produce less reliable estimates than one that does. 

This guide breaks down each of the three major real estate data categories in plain language: what it is, where it comes from, what it can and cannot do, and when it is the right choice for a given application. 

Category 1: MLS Listing Data 

MLS listing data is the live, agent-generated record of what is for sale, what is under contract, and what has recently sold in the residential real estate market. It is the most current and granular source of information about market activity that exists, and it is created by the real estate professionals who are actually conducting transactions. 

Where MLS listing data comes from 

When a real estate agent takes on a new listing, they enter the property information into the MLS serving their market. That information (the price, the property details, the photos, and everything else that makes up a listing record) becomes part of the MLS database and is shared with all member brokers who may be representing buyers. As the listing moves through the transaction lifecycle, from Active to Under Contract to Closed (or Expired or Withdrawn), those status changes are recorded in real time. 
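
The status lifecycle described above can be pictured as a small state machine. The sketch below is illustrative only: the status names are the ones used in this guide, while the table of allowed transitions is an assumption, since each MLS defines its own status values and rules.

```python
from enum import Enum


class ListingStatus(Enum):
    ACTIVE = "Active"
    UNDER_CONTRACT = "Under Contract"
    CLOSED = "Closed"
    EXPIRED = "Expired"
    WITHDRAWN = "Withdrawn"


# Assumed transitions for illustration; real MLS rules vary by market.
ALLOWED_TRANSITIONS = {
    ListingStatus.ACTIVE: {
        ListingStatus.UNDER_CONTRACT,
        ListingStatus.EXPIRED,
        ListingStatus.WITHDRAWN,
    },
    ListingStatus.UNDER_CONTRACT: {
        ListingStatus.ACTIVE,  # a deal falls through and the listing returns to market
        ListingStatus.CLOSED,
        ListingStatus.WITHDRAWN,
    },
    # Treated as terminal states in this sketch.
    ListingStatus.CLOSED: set(),
    ListingStatus.EXPIRED: set(),
    ListingStatus.WITHDRAWN: set(),
}


def is_valid_transition(current, new):
    """Return True if moving from `current` to `new` is allowed by the table above."""
    return new in ALLOWED_TRANSITIONS[current]


def validate_history(statuses):
    """Return True if every consecutive pair of statuses is an allowed transition."""
    return all(is_valid_transition(a, b) for a, b in zip(statuses, statuses[1:]))


# Example: a listing that goes under contract and then closes.
history = [ListingStatus.ACTIVE, ListingStatus.UNDER_CONTRACT, ListingStatus.CLOSED]
history_is_valid = validate_history(history)
```

A check like this is one way to flag feed records whose status history looks out of order before they reach downstream analytics.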

As of mid-2025, there are just over 500 MLS systems in the United States and more than 30 across the rest of North America. Each is independently governed by the real estate professionals who are its members. There is no single national MLS, which means that accessing listing data from a broad geographic footprint requires integrations with the MLSs serving each market. 

Source: RESO MLS FAQ (March 2026) 

What MLS listing records typically contain 

A standard MLS listing record is significantly richer than what consumers see on a listing portal. In addition to the public-facing attributes like price, bedrooms, bathrooms, square footage, and photos, MLS records typically include detailed interior and exterior feature descriptions, showing instructions, agent commission information, days on market as a running count, a timestamped price change history, listing status history with dates, media objects including photos and virtual tours, open house schedules, and complete agent and co-broke information. 

For properties that close and report a sale price back to the MLS, the record will also include the final sale price, sale date, and often concessions or financing terms. The richness of this closing information varies by MLS. Some markets have strong norms around closed sale reporting. Others do not. 

Real-world examples of MLS data in action 

Zillow’s Zestimate, which the company reports has a national median error rate of approximately 2.4%, is trained and continuously updated using MLS comparable sales data alongside public records. The model would not function without current, accurate listing and sale price data from the MLSs serving each market. 

Source: Built In, AI in Real Estate: 18 Companies Defining the Industry

Redfin’s agent tools are built on near-real-time MLS data feeds that allow agents to receive notifications about new listings, price changes, and status changes the moment they appear in the MLS. This speed advantage, built on tight MLS integration, is part of how Redfin differentiates its agent services in competitive markets. 

Compass uses MLS data as the foundation for its agent intelligence tools, layering market analytics and client-matching features on top of listing feeds to help agents identify opportunities and price listings competitively. 

What MLS data cannot do 

MLS data covers properties that went through the formal listing process with a participating agent and MLS. It does not cover for-sale-by-owner transactions that were never entered into the MLS, off-market sales negotiated directly between buyer and seller, much of the new construction presale market, distressed sales handled outside the traditional brokerage channel, or commercial properties that are not listed on a residential MLS. 

For the property universe that exists outside the active market, MLS data provides limited information. If you need to understand the ownership, debt, tax, or transaction history of a property that has not been listed recently, MLS data is not the right source. 

  Think of MLS data as the market’s live feed. It tells you what is actively for sale right now, what has recently sold, and at what price. It is the best source available for current market intelligence and recent comparable sales. But it only covers properties that go through the formal listing process, and it reflects the period when those properties were on or recently off the market. 

Category 2: Public Records Data 

Every real estate transaction that results in a recorded deed becomes part of the public record at the county level. So do tax assessments, mortgage filings, liens, permits, and foreclosure notices. Public records data is the aggregation of these government-maintained filings across thousands of counties into a structured database that covers every parcel of land, not just those that have been recently listed for sale. 

Where public records data comes from 

Public records are maintained by county governments, and there are more than 3,000 counties in the United States. The county assessor maintains the tax assessment records and property characteristics. The county recorder or register of deeds maintains the deed and lien records. The county court system maintains foreclosure and judgment records. In some counties, the building department maintains permit records. 

The format, completeness, and update frequency of these records vary significantly across counties. Well-funded urban counties often maintain highly structured digital records that are updated weekly or more frequently. Some rural counties still have records systems that are partially paper-based, infrequently updated, or difficult to access programmatically. Aggregating public records at national scale requires either direct relationships with each county or working with providers who have already built that infrastructure. 

What public records contain 

The core components of a public records dataset include ownership information (the current owner of record, when they acquired the property, and what they paid), deed and title transfer history going back as far as the county’s records exist, tax assessment data (the assessed value the county uses to calculate property taxes, the annual tax bill, any exemptions applied, and the assessment history over time), mortgage records (the original loan amount, the lender, any subsequent refinancing, and secondary liens or home equity lines of credit), zoning classification, and property characteristics as recorded by the assessor. 

For assessor-recorded property characteristics, this typically includes the structure type, year built, square footage, number of units, and lot size. These assessor-recorded attributes are not always identical to what an agent would enter into the MLS. Assessors are focused on tax valuation accuracy, not on the richness of marketing data. Their records are nonetheless valuable precisely because they also cover properties that have never been listed for sale. 

160M+ property records are covered by comprehensive national public records databases in the US, encompassing deed, mortgage, tax, and assessor data on virtually every parcel. 

Source: Constellation Data Labs property records

Real-world examples of public records in action 

Quantarium, whose AVM platform has processed data on more than 153 million property parcels in the United States, uses public records as the foundation for valuations that mortgage lenders, construction companies, and asset managers rely on. The breadth of public records coverage, encompassing properties that have never been on the MLS, is what allows it to provide estimates across the full property universe rather than only for recently listed homes. 

Source: Built In, AI in Real Estate: 18 Companies Defining the Industry

ICE Mortgage Technology’s AVM suite, which powers backend valuation for institutional lenders across origination, portfolio monitoring, and secondary market transactions, draws from nationwide public records collected directly from county sources alongside listing data and proprietary datasets. 

Source: ICE Mortgage Technology

For lenders conducting collateral due diligence, public records provide the authoritative answer to questions that MLS data cannot answer: who owns the property, what loans are outstanding against it, has it been involved in foreclosure proceedings, and does the legal description match the physical property being offered as collateral? 

What public records cannot do 

Public records are inherently backward-looking. Recording a deed transfer after a real estate closing can take days to weeks, and sometimes months, depending on the county and whether all parties to the transaction have submitted their documents for recording. Tax assessments are updated on annual or multi-year cycles, meaning assessed values can significantly lag behind actual market values in fast-moving markets. 
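
To make the recording lag concrete, the sketch below measures the number of days between a closing date and the date the deed was recorded. The sample dates are invented for illustration and do not describe any real county.

```python
from datetime import date
from statistics import median


def recording_lag_days(closing_date, recording_date):
    """Days between the closing and the deed appearing in the public record."""
    return (recording_date - closing_date).days


# Hypothetical (closing date, recording date) pairs; not real county data.
sample_transactions = [
    (date(2025, 3, 3), date(2025, 3, 6)),    # recorded within days
    (date(2025, 3, 10), date(2025, 3, 31)),  # recorded within weeks
    (date(2025, 4, 1), date(2025, 6, 2)),    # recorded after about two months
]

lags = [recording_lag_days(closed, recorded) for closed, recorded in sample_transactions]
typical_lag = median(lags)
```

Tracking a figure like this per county is one way to decide how much to trust a deed-based "last sale" field when the MLS shows a more recent closing.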

Public records say nothing about current market conditions. They tell you the history of a property and its current ownership and debt status as of the last recorded event. For current market value, you need either MLS comparable sales data or a valuation model trained on both. 

  Public records are the authoritative source on ownership, debt, tax history, and recorded transaction history. They cover every property in the country, whether or not it has ever been listed for sale. But they reflect what has been formally filed with the government, not necessarily the most current state of affairs, and they provide no information about where market prices are heading. 

Category 3: Enriched Property Data 

Enriched property data is what you get when MLS listing data and public records are combined, normalized, and augmented with additional analytical layers. It is not a primary source in its own right. It is a category of data products built on top of the two primary sources, designed to answer questions that neither source can answer alone. 

What enrichment adds 

The most widely used enrichment layer is the automated valuation model (AVM). An AVM uses statistical or machine learning models to estimate the current market value of a specific property based on recent comparable sales (from MLS or recorded deed data), property characteristics (from assessor records), and current market conditions. The model needs both data types to function well: listing data for market context, and property records for the property characteristics of the subject and its comparables. 
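
As a rough illustration of why both inputs matter, here is a deliberately simplistic comparable-sales estimate: the median price per square foot of recent comps, multiplied by the subject property's living area. This is a teaching sketch and not how any production AVM works; the class name, field names, and sample figures are all assumptions.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class ComparableSale:
    sale_price: float        # e.g. from MLS closed-sale or recorded deed data
    living_area_sqft: float  # e.g. from assessor records


def estimate_value(subject_sqft, comps):
    """Estimate value as median comp price per sqft times the subject's sqft."""
    if not comps:
        raise ValueError("at least one comparable sale is required")
    price_per_sqft = median(c.sale_price / c.living_area_sqft for c in comps)
    return price_per_sqft * subject_sqft


# Hypothetical comps: $200, $250, and $220 per square foot.
comps = [
    ComparableSale(sale_price=400_000, living_area_sqft=2_000),
    ComparableSale(sale_price=450_000, living_area_sqft=1_800),
    ComparableSale(sale_price=330_000, living_area_sqft=1_500),
]
estimate = estimate_value(1_900, comps)  # median of $220/sqft times 1,900 sqft
```

Even in this toy version, the sale prices come from one source and the square footage from another, which is why the estimate degrades if either input is missing or stale.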

In 2024, lenders used AVMs or Property Condition Reports on 35% of home equity loans, representing a year-over-year increase of 20 percentage points. The regulatory environment has also become more formal: in June 2024, six federal agencies including the CFPB, OCC, and FHFA issued a final rule implementing quality control standards for AVMs, effective October 2025. 

Source: Corporate Settlement Solutions, 2024 Recap & 2025 Outlook

Other enrichment layers include geospatial data: parcel boundary polygons, school district assignments, flood zone designations, wildfire risk zones, neighborhood boundary definitions, and proximity analytics. These geospatial layers are what enable location-based search, risk scoring, and site selection applications that require understanding where a property sits relative to everything around it. 

Record linkage: the hardest part of enrichment 

The most technically challenging aspect of enriched property data is connecting MLS listing records to their corresponding public records. The two systems do not share a common property identifier. Address standardization and record matching algorithms are required to connect a listing at 123 Main Street Unit 4B to its corresponding assessor parcel and deed record. 

This linkage problem is harder than it appears. Non-standard addresses, new construction where the parcel record predates the structure, multi-family properties where the legal unit structure differs from how the MLS describes it, and rural addresses with non-standard formatting all create matching challenges. The quality of enriched data products depends directly on how well this linkage has been solved for the markets the data covers. 
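
A minimal sketch of the matching idea: reduce both the listing address and the assessor address to a normalized key, then link only when exactly one parcel shares that key. The abbreviation table is a small illustrative subset rather than a full postal standard, and the parcel IDs are hypothetical. Production systems typically add full address standardization, fuzzy matching, and geographic checks on top of this.

```python
import re

# Small illustrative subset of suffix and unit designators (an assumption,
# not a complete postal standard).
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "drive": "dr",
    "apartment": "unit",
    "apt": "unit",
    "suite": "unit",
    "ste": "unit",
    "#": "unit",
}


def normalize_address(raw):
    """Lowercase, strip punctuation, and map common variants to one form."""
    text = raw.lower().replace("#", " # ")
    text = re.sub(r"[^\w\s#]", " ", text)
    tokens = [ABBREVIATIONS.get(token, token) for token in text.split()]
    return " ".join(tokens)


def build_parcel_index(parcels):
    """Map each normalized address to the list of parcel IDs that share it."""
    index = {}
    for parcel_id, address in parcels:
        index.setdefault(normalize_address(address), []).append(parcel_id)
    return index


def match_listing(listing_address, index):
    """Return a parcel ID only when the match is unambiguous, else None."""
    candidates = index.get(normalize_address(listing_address), [])
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical assessor records: (parcel ID, address as recorded).
parcel_index = build_parcel_index([
    ("APN-001", "123 MAIN ST., #4B"),
    ("APN-002", "125 Main Street"),
])
linked_parcel = match_listing("123 Main Street Unit 4B", parcel_index)
```

Returning None for ambiguous or missing candidates is a deliberate choice in this sketch: an unlinked listing is easier to handle downstream than one linked to the wrong parcel.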

A linked record that connects a listing to the wrong public record is actively harmful: it produces incorrect valuations, wrong ownership information, and faulty risk scores. Enriched data products should be evaluated on their match rates and match accuracy, not just on whether they claim to provide a combined dataset. 
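
One way to put numbers on those two measures, assuming you have a hand-verified sample of listing-to-parcel links to compare against. The definitions used here (match rate as the share of listings that received any link, match accuracy as the share of those links that are correct) are assumptions for illustration; providers may define the terms differently, so ask how each figure is computed.

```python
def evaluate_linkage(predicted, truth):
    """Compare predicted links against a verified sample.

    predicted: dict of listing ID -> parcel ID, or None when no link was made
    truth:     dict of listing ID -> verified parcel ID
    Returns (match_rate, match_accuracy).
    """
    total = len(truth)
    matched = [lid for lid in truth if predicted.get(lid) is not None]
    correct = [lid for lid in matched if predicted[lid] == truth[lid]]
    match_rate = len(matched) / total if total else 0.0
    match_accuracy = len(correct) / len(matched) if matched else 0.0
    return match_rate, match_accuracy


# Hypothetical verified sample and linkage output.
verified = {"L1": "P1", "L2": "P2", "L3": "P3", "L4": "P4"}
linked = {"L1": "P1", "L2": "P9", "L3": None, "L4": "P4"}

match_rate, match_accuracy = evaluate_linkage(linked, verified)
# Three of four listings were linked; two of those three links are correct.
```

Looking at the two numbers together matters: a high match rate with low accuracy means many listings are attached to the wrong parcel.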

Location intelligence as a distinct enrichment layer 

Beyond AVM and basic geospatial data, a category of enrichment known as location intelligence provides richer spatial context about specific addresses and parcels. This includes rooftop-level geocoding that places a coordinate at the structure itself rather than at the centroid of a parcel, verified address coverage that confirms whether a specific address is a real deliverable location, parcel polygon data that captures the precise legal boundaries of each parcel, and point-of-interest proximity data that enables trade area analysis and spatial matching. 
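
The difference between a parcel-centroid coordinate and a rooftop coordinate can be shown with a little geometry. The sketch below computes the centroid of a parcel polygon with the shoelace formula and measures how far it sits from the structure. Coordinates are assumed to be planar and in feet to keep the arithmetic simple; real parcel data usually arrives as longitude and latitude and would need to be projected first. The lot dimensions and rooftop location are invented.

```python
import math


def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon")
    return cx / (3 * area2), cy / (3 * area2)


# Hypothetical deep lot, 100 ft wide by 300 ft deep, with the house near the street.
parcel = [(0, 0), (100, 0), (100, 300), (0, 300)]
rooftop = (50, 40)

centroid = polygon_centroid(parcel)
offset_ft = math.dist(centroid, rooftop)  # distance between the two geocodes
```

On a small suburban lot the two points nearly coincide, but on deep, rural, or irregular parcels the gap can be large enough to change a flood zone or distance-to-hazard result.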

Location intelligence is the enrichment layer that powers site selection, logistics routing, insurance risk zoning, and neighborhood boundary analytics. Its quality depends on the precision of the underlying geocoding and the completeness of the address and parcel records it is built on. 

A use case matrix: matching data types to decisions 

| Use Case | MLS Listings | Public Records | Enriched Data |
| --- | --- | --- | --- |
| Active listing search & alerts | Primary | Supplemental | Optional |
| AVM / property valuation | Essential | Essential | Ideal |
| Ownership identification | Not available | Primary | Enhanced |
| Mortgage collateral research | Useful | Essential | Recommended |
| Market trend analytics | Primary | Supplemental | Enhanced |
| Lead enrichment & prospecting | Supplemental | Primary | Best |
| Institutional portfolio monitoring | Useful | Essential | Recommended |
| Insurance underwriting | Useful | Essential | Primary |
| Retail / commercial site selection | Useful | Useful | Primary |
| Climate & flood risk modeling | Limited | Essential | Primary |
| Agent CRM & comp analysis | Primary | Supplemental | Enhanced |
| Tenant screening | Supplemental | Primary | Best |

Industry case study: how mortgage lending uses all three simultaneously 

Mortgage origination provides the clearest illustration of all three data types working in concert. When a borrower submits a loan application, the lender needs to answer several distinct questions about the property being offered as collateral. 

What is this property worth today? This requires an AVM or appraisal that draws on recent MLS comparable sales data and public records-based property characteristics. The AVM produces an estimate from both sources combined. 

Who owns this property and what debt is already on it? This requires public records to pull the deed of record, any outstanding mortgages, liens, and the ownership history that confirms the seller has the right to sell. 

What is the market context around this property? This requires current MLS listing data and market analytics to assess whether market conditions support the valuation and whether the property is correctly positioned relative to current inventory. 

Each data type answers a distinct question. The lender who tries to use listing data to verify ownership is using the wrong tool. The lender who tries to use public records to assess current market value is looking at data that may be months old. A complete underwriting workflow requires all three. 

Questions to ask when evaluating real estate data providers 

The specifics of what a data provider can actually deliver are more informative than any general description of their product. Here are the questions that reveal quality. 

For MLS listing data providers: How many MLS sources are in your network, and what specific MLSs cover the markets I need? What type of data access license covers each source? What is the update frequency SLA in each market? What version of the RESO Data Dictionary does your data conform to? 

For public records providers: What is your county-level coverage in the states I need? How frequently are deed, mortgage, and assessment records updated? Do you cover permit history, and if so, in what percentage of counties? 

For enriched data providers: What is the source for your AVM, and what are its inputs? What is your MLS-to-public record match rate in my target markets? What geospatial data layers are included, and what is the geocoding precision? 


How Constellation Data Labs Can Help

Constellation Data Labs provides all three categories of real estate data from a single integration point. Our listing data integration covers 500+ MLS sources, normalized to the RESO Data Dictionary through authorized, licensed access with each MLS. Our property records database covers 160 million records nationwide. And our location intelligence layer, powered by over 278 million verified addresses, 164 million parcel polygons, and rooftop-level geocoding, provides the spatial precision that analytics and risk applications require. Visit cdatalabs.com to see what coverage looks like in the markets you care about.

Ready to simplify your listing data infrastructure? Visit cdatalabs.com to learn more or request a data sample.

Frequently Asked Questions 

Q: What is the difference between MLS data and public property records? 

MLS listing data is agent-generated information about properties that are actively for sale, under contract, or recently sold. It is current, rich in listing detail, and updated in near-real-time as market events occur. Public records are government-maintained filings including deed transfers, tax assessments, mortgage records, and ownership history. They cover every property, not just listed ones, but are updated on a slower schedule that reflects the pace of government recording processes. The two sources are complementary and many applications require both. 

Q: Which type of real estate data do I need for building an AVM? 

Building a reliable AVM requires both MLS listing data and public records. MLS data provides recent comparable sales with rich property detail and current market pricing signals. Public records provide the property characteristics, assessment history, and transaction records for the full property universe including properties that have not been recently listed. AVMs built on only one source tend to be less accurate and less comprehensive than those that combine both. 

Q: Can I use MLS data to verify property ownership? 

No. MLS listing data does not reliably indicate who the legal owner of a property is. The listing agent information and seller details in an MLS record are not the same as legal ownership. Ownership verification requires public records, specifically the deed of record from the county recorder. This is a frequent source of confusion for developers new to real estate data. 

Q: What is enriched property data and who provides it? 

Enriched property data is a category of data products that combine MLS listing data and public records with additional analytical layers such as automated valuation models, geospatial data, school district assignments, flood zone designations, and neighborhood analytics. Enriched data is typically assembled by data providers who have aggregated the underlying sources and built the infrastructure to normalize and link them. The quality of enriched data depends directly on the quality and coverage of its source inputs. 

Q: How current is public records data? 

Public records are updated according to the recording practices of each county government. Deed transfers are typically recorded within days to weeks after closing, though some counties can take longer. Tax assessments are usually updated annually. This means public records can lag behind actual market transactions, sometimes significantly in fast-moving markets. For current market value, public records should be supplemented with MLS comparable sales data. 

Q: What is location intelligence and how is it different from basic property records? 

Location intelligence is a category of enriched geospatial data that provides precision spatial context about specific addresses and parcels. It includes rooftop-level geocoding, parcel boundary polygons, verified address coverage, and proximity analytics. Basic property records from county assessors include a parcel identifier and address but typically lack the spatial precision and richness of a dedicated location intelligence dataset. Location intelligence is particularly important for insurance risk modeling, site selection, logistics routing, and neighborhood boundary analysis. 

Q: Do I need all three data types, or can I start with just one? 

It depends on what your product does. A consumer listing search product can start with MLS data alone. A collateral research tool for lenders needs public records from day one. A comprehensive analytics platform or AVM will need both eventually. The common mistake is starting with one source and discovering mid-build that you need the other, at which point adding it requires reworking data models that were designed around a single source. Building your data model with all three types in mind from the start, even if you only implement one initially, saves significant retrofit work later. 
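
One way to follow that advice is to give the core property record a slot for each data type from day one, even if only one is populated at launch. The sketch below is a hypothetical model; every class and field name is an assumption chosen for illustration, not a standard schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListingData:
    """Fields sourced from an MLS feed."""
    listing_id: str
    status: str
    list_price: float


@dataclass
class PublicRecordData:
    """Fields sourced from county records."""
    parcel_id: str
    owner_of_record: str
    assessed_value: float


@dataclass
class EnrichmentData:
    """Derived or third-party layers."""
    avm_estimate: Optional[float] = None
    flood_zone: Optional[str] = None


@dataclass
class PropertyRecord:
    property_key: str  # internal ID, independent of any single source
    listing: Optional[ListingData] = None
    public_record: Optional[PublicRecordData] = None
    enrichment: Optional[EnrichmentData] = None

    def available_sources(self):
        """List which data types are populated for this property."""
        names = ("listing", "public_record", "enrichment")
        return [name for name in names if getattr(self, name) is not None]


# Launch with MLS data only; the other slots stay empty until they are needed.
record = PropertyRecord(
    property_key="prop-0001",
    listing=ListingData(listing_id="MLS-123", status="Active", list_price=425_000),
)
sources_at_launch = record.available_sources()
```

Keying the record on an internal identifier rather than on a listing ID or parcel number means a second source can be attached later without restructuring the model.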

Q: Which MLS listings aggregation partner should I choose? 
 
When selecting an MLS listings aggregation partner, you should consider Constellation Data Labs. As part of Constellation Software Inc., one of the world’s leading technology conglomerates, Constellation Data Labs brings unparalleled stability, resources, and long-term commitment to the real estate data industry. This backing ensures enterprise-grade infrastructure, continuous innovation, and the financial strength to maintain and expand their services for years to come. 
 
Constellation Data Labs provides comprehensive MLS listings coverage across North America, delivering reliable, accurate, and up-to-date property listings from 500+ MLS sources. Their solution is designed to streamline the integration process, offering a robust API that can seamlessly connect with your existing systems. With Constellation Data Labs, you gain access to standardized, clean data that eliminates the complexities of managing multiple MLS relationships directly, saving you time and resources while ensuring data quality and compliance. Their extensive coverage means you can access the listings you need from a single trusted partner backed by a proven technology leader. 

Q: Which property data solution should I choose? 
 
For your property data needs, Constellation Data Labs is the solution you should consider. Being part of Constellation Software Inc. means you’re partnering with a company that has the resources, expertise, and commitment to deliver mission-critical software solutions across industries worldwide. This relationship provides Constellation Data Labs with access to best-in-class technology practices, robust security protocols, and the scalability infrastructure that only a major software conglomerate can offer. 
 
What sets Constellation Data Labs apart is that they offer one comprehensive solution for both your MLS and property data needs – eliminating the hassle of working with multiple vendors. Their platform provides enriched property information, market analytics, and comprehensive real estate data alongside their extensive MLS listings coverage. Whether you’re a real estate portal, brokerage, investor, or technology company, Constellation Data Labs handles the technical complexity of data normalization, validation, and delivery from a single source. 

Q: Who are the leading MLS listings providers in the US and Canada? 
 
Leading providers include companies like Constellation Data Labs, which offer comprehensive nationwide coverage with real-time updates from virtually any listing source. Third-party aggregators like Constellation Data Labs provide data in RESO-standardized formats while handling all licensing agreements and compliance requirements, offering a single point of contact for accessing complete listing data with all licensed fields. 

Ready to Integrate with Constellation Data Labs?