Trulia, a real estate site, has built a platform that uses big data analytics and machine learning to create a smarter browsing experience tailored to the preferences of each person searching for a new home. The company invested heavily in personalization, predictive analytics and machine learning (including computer vision and recommender engines) to streamline the home-buying process.
The company handles more than 1.5 terabytes of data every day, drawing on two data sets: public records, which include listing information, and consumer behavior and engagement data.
Processing the public records entails seven stages:
- As data streams come in different formats, Trulia uses parsers to convert them into one standard format and then defines unique short codes for attributes to enable efficient storage.
- All addresses are standardized using Trulia’s own address standardization and normalization technology.
- All pictures of properties are processed and resized for a better consumer experience.
- Location-aware data is added to listings. This includes local amenities as well as neighborhood crime scores, statistics and trends.
- All historical datasets are assigned to listings.
- All datasets are merged and matched in an index to make the information searchable for consumers.
- Lastly, the data goes through Trulia’s data service API. From there, it can move into a searchable index on the firm’s site (available on desktop computers and mobile devices) and its mobile app. Trulia uses both dedicated code bases and responsive design.
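The first two stages above (parsing feeds into one standard format with short attribute codes, then standardizing addresses) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `SHORT_CODES` mapping, the two feed formats and the toy `normalize_address` rule are hypothetical and are not Trulia’s actual codes or address technology.

```python
import csv
import io
import json

# Hypothetical short codes; the article does not list the real ones.
SHORT_CODES = {"address": "ad", "price": "pr", "bedrooms": "bd"}


def parse_json_feed(text):
    """Parse a JSON feed that holds a list of listing records."""
    return json.loads(text)


def parse_csv_feed(text):
    """Parse a CSV feed with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def normalize_address(raw):
    """Toy rule: collapse whitespace, uppercase, abbreviate STREET to ST."""
    cleaned = " ".join(raw.split()).upper()
    return cleaned.replace(" STREET", " ST")


def to_standard(record):
    """Convert one parsed record into the short-code standard format."""
    out = {}
    for name, code in SHORT_CODES.items():
        if name not in record:
            continue
        value = record[name]
        if name == "address":
            value = normalize_address(value)
        else:
            value = int(value)  # CSV values arrive as strings
        out[code] = value
    return out


json_feed = '[{"address": "12  Main street", "price": 450000, "bedrooms": 3}]'
csv_feed = "address,price,bedrooms\n7 Oak Street,385000,2\n"

records = parse_json_feed(json_feed) + parse_csv_feed(csv_feed)
standard = [to_standard(r) for r in records]
```

Each feed gets its own parser, but every record leaves `to_standard` in the same compact shape, which is the property the later merge and indexing stages would rely on.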
Monitoring Engagement to Customize Responses
While other real estate sites also make this kind of public data accessible, what sets Trulia’s platform apart is what it does with consumer behavior data. Within a few minutes of engagement on the firm’s site, consumers generate an average of 18 to 20 signals about their intent. Those signals, which include clicks and time spent looking at a particular image, indicate which features on the site attracted the individual’s attention.
Trulia then applies data science, machine learning and predictive analytics to build digital signatures of consumers in real time, based on their behavior on the site. That kind of insight makes it possible to give each customer only what is relevant, rather than one-size-fits-all promotional materials.
Deep Varma, Trulia’s vice president of data engineering, explains how the company combines the vast quantities of data it collects with machine learning capabilities to offer people a better search experience. The platform rests on three pillars: computer vision, a recommender system and click-through models.
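One way to picture a digital signature is as a set of normalized weights over home features, accumulated from behavioral signals. The Python sketch below is an assumption-heavy illustration: the signal tuple layout, the `build_signature` function and the weights given to clicks and viewing time are all hypothetical, since the article does not describe how Trulia scores signals.

```python
from collections import defaultdict

# Hypothetical weights: one point per click, 0.1 point per second of viewing.
CLICK_WEIGHT = 1.0
DWELL_WEIGHT = 0.1


def build_signature(signals):
    """Turn (feature, kind, value) signals into normalized feature weights."""
    scores = defaultdict(float)
    for feature, kind, value in signals:
        if kind == "click":
            scores[feature] += CLICK_WEIGHT * value
        elif kind == "dwell_seconds":
            scores[feature] += DWELL_WEIGHT * value
    total = sum(scores.values())
    if total == 0:
        return {}
    return {feature: score / total for feature, score in scores.items()}


signals = [
    ("hardwood_floors", "click", 2),
    ("granite_countertops", "dwell_seconds", 10),
    ("hardwood_floors", "dwell_seconds", 10),
]
signature = build_signature(signals)
```

In this toy session the visitor’s signature leans three to one toward hardwood floors over granite countertops, and it can be recomputed as each new signal arrives.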
Computer vision enables the system to recognize what uploaded photos of homes depict. It can identify the kind of room shown in a photograph (a kitchen or a bathroom, for instance) and the features it includes, such as granite countertops or wood floors. Those features can then be used to group the results of customer searches.
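The grouping step can be sketched in Python once photo labels exist. The sketch assumes the labels have already been produced by some vision model, which is not shown; the `photo_labels` data, listing IDs and the `group_by_feature` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical output of a vision model: (room type, feature) per photo.
photo_labels = {
    "listing-1": [("kitchen", "granite_countertops"), ("living_room", "wood_floors")],
    "listing-2": [("kitchen", "granite_countertops")],
    "listing-3": [("bathroom", "tile_floors")],
}


def group_by_feature(labels):
    """Map each detected feature to the sorted listings whose photos show it."""
    groups = defaultdict(set)
    for listing_id, tags in labels.items():
        for _room, feature in tags:
            groups[feature].add(listing_id)
    return {feature: sorted(ids) for feature, ids in groups.items()}


groups = group_by_feature(photo_labels)
```

A search for granite countertops could then return `groups["granite_countertops"]` directly, without relying on the text of the listing.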
The recommender system uses behavior data to find relevant listings as part of a collaborative search process. So, if a prospective customer has indicated a preference for colonial homes with hardwood floors, the system can put together a collection of homes with those features in the neighborhood of interest.
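The article does not say which algorithm sits behind the recommender. As a stand-in, the Python sketch below uses a simple form of user-based collaborative filtering: homes viewed by people with overlapping viewing histories are scored by the size of that overlap. The `views` data and the `recommend` function are hypothetical.

```python
from collections import Counter

# Hypothetical viewing histories keyed by user.
views = {
    "user_a": {"home_1", "home_2", "home_3"},
    "user_b": {"home_2", "home_3", "home_4"},
    "user_c": {"home_1", "home_5"},
}


def recommend(target, views, top_n=2):
    """Recommend unseen homes viewed by users who share views with target."""
    seen = views[target]
    scores = Counter()
    for user, homes in views.items():
        if user == target:
            continue
        overlap = len(seen & homes)
        if overlap == 0:
            continue
        for home in homes - seen:
            scores[home] += overlap  # weight by how similar the other user is
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [home for home, _score in ranked[:top_n]]


suggestions = recommend("user_a", views)
```

Here `user_b` shares two homes with `user_a` and `user_c` shares one, so `home_4` ranks ahead of `home_5`. A production system would also filter by neighborhood and listed features, as the paragraph above describes.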
Click-through models apply machine learning to the data on consumer behavior. If customers engage with a piece of content, Trulia sends more information of a similar type, but it does not spam them with content that fails to engage them.
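The send-or-suppress decision can be illustrated with a deliberately simple stand-in for a learned click-through model: a smoothed click-through rate per content type, compared against a threshold. The content types, prior values, threshold and function names in this Python sketch are assumptions, not Trulia’s model.

```python
def smoothed_ctr(clicks, impressions, prior_clicks=1, prior_impressions=10):
    """Estimate click-through rate, pulled toward a prior when data is sparse."""
    return (clicks + prior_clicks) / (impressions + prior_impressions)


def content_to_send(history, threshold=0.2):
    """Return the content types whose estimated CTR meets the threshold."""
    return sorted(
        kind
        for kind, (clicks, shown) in history.items()
        if smoothed_ctr(clicks, shown) >= threshold
    )


# Hypothetical per-user history: content type -> (clicks, times shown).
history = {
    "price_drop_alerts": (8, 20),
    "new_listings": (5, 10),
    "mortgage_offers": (0, 30),
}
selected = content_to_send(history)
```

With these numbers the user keeps receiving price-drop alerts and new listings, while mortgage offers, shown 30 times without a click, are suppressed.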
The result is a smarter system that makes it easier and faster for people to find the information they want about the neighborhoods they’re interested in, along with their preferred features in homes, without being inundated with irrelevant information.
Varma sums up the value of all the technology working together in terms of the improved user experience: “What it all means is that Trulia is helping consumers find their perfect home.”