Each week, I'll bring you the most relevant and insightful tech stories, saving you time and keeping you informed.
Inside Airbnb's AI-Powered Search Upgrade
Finding the perfect stay on Airbnb, whether it's a unique cabin for a weekend getaway or an apartment for a month-long trip, is central to the user experience. However, delivering highly relevant search results becomes incredibly challenging when dealing with millions of listings, broad geographical searches, high-demand destinations, and flexible date queries. The system needs to be both highly relevant and exceptionally efficient (low latency, low compute cost) to sift through a massive pool of options.
To tackle this, Airbnb's engineering team implemented a sophisticated Embedding-Based Retrieval (EBR) system designed to act early in the search ranking pipeline. Its primary goal: intelligently narrow down the vast number of eligible listings to a smaller, high-quality candidate set, which can then be processed by more complex and computationally expensive machine learning ranking models.
Here's a breakdown of their approach:
1. The Challenge: Scaling search relevance across millions of diverse listings while handling complex user queries (location scope, flexible dates, group size) demanded a system that could efficiently pre-filter candidates without sacrificing quality.
2. Smart Training Data Strategy - Contrastive Learning:
Airbnb leveraged user "journeys"—sequences of searches for the same location, guest count, and trip duration—to build training data.
Positive Samples: Listings that the user actually booked within that journey.
Negative Samples: Listings the user saw or interacted with (e.g., clicked, wishlisted) but ultimately did not book. This targeted negative sampling proved much more effective than random sampling for improving model performance (a rough sketch of the sampling scheme follows below).
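To make the sampling strategy concrete, here is a rough Python sketch of how journey-level training examples might be assembled. The field names (journey_id, booked, interacted, query_features) are hypothetical placeholders, not Airbnb's actual log schema.

```python
# Hypothetical sketch: build (query, positive, negative) triplets from user journeys.
# Positives are booked listings; negatives are listings the user saw or interacted
# with in the same journey but did not book (targeted negatives, not random ones).
import random
from collections import defaultdict

def build_training_triplets(search_logs):
    journeys = defaultdict(list)
    for row in search_logs:
        journeys[row["journey_id"]].append(row)

    triplets = []
    for rows in journeys.values():
        positives = [r["listing_id"] for r in rows if r["booked"]]
        negatives = [r["listing_id"] for r in rows if r["interacted"] and not r["booked"]]
        if not positives or not negatives:
            continue
        query = rows[0]["query_features"]  # location, guest count, trip length, ...
        for pos in positives:
            neg = random.choice(negatives)  # in-journey negative, not a random listing
            triplets.append((query, pos, neg))
    return triplets
```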
3. Model Architecture - The Two-Tower Model:
A classic Two-Tower architecture was employed (a minimal sketch follows below).
Listing Tower: Processes listing features (historical interactions, amenities, capacity, etc.). Crucially, because these features do not depend on the query, listing embeddings can be pre-computed offline in a daily batch, significantly reducing online serving latency.
Query Tower: Processes query features (geography, guest count, trip duration, etc.).
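For illustration, here is a minimal two-tower sketch in PyTorch. The feature dimensions, layer sizes, and cosine-style scoring are assumptions for the example, not Airbnb's production architecture.

```python
# Minimal two-tower sketch: separate encoders for listings and queries,
# scored by a dot product between their embeddings.
import torch
import torch.nn as nn

class Tower(nn.Module):
    def __init__(self, in_dim, emb_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim))

    def forward(self, x):
        # L2-normalize so the dot product behaves like cosine similarity.
        return nn.functional.normalize(self.net(x), dim=-1)

listing_tower = Tower(in_dim=40)  # amenities, capacity, historical interactions, ...
query_tower = Tower(in_dim=12)    # geography, guest count, trip duration, ...

# Listing embeddings are query-independent, so they can be computed in a daily
# offline batch; only the query tower runs at request time.
listing_emb = listing_tower(torch.randn(1000, 40))  # offline
query_emb = query_tower(torch.randn(1, 12))         # online
scores = query_emb @ listing_emb.T                  # candidate scores
```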
4. Efficient Online Serving - ANN with IVF:
To compare query embeddings with millions of listing embeddings quickly, Approximate Nearest Neighbor (ANN) techniques were essential.
Airbnb evaluated both IVF (Inverted File Index) and HNSW (Hierarchical Navigable Small World).
IVF was chosen over HNSW primarily because:
Frequent updates to listing price and availability led to significant memory overhead issues with HNSW indexes.
HNSW showed higher latency when performing parallel retrieval alongside essential filters (like geographic constraints).
IVF's pre-clustering approach was easier to integrate into the existing infrastructure, allowing retrieval to focus only on the most relevant listing clusters for a given query (see the sketch below).
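To make the IVF idea concrete, here is a minimal retrieval sketch using the open-source FAISS library. Airbnb's serving stack is its own infrastructure, so the parameters below (nlist, nprobe, the candidate count) are purely illustrative.

```python
# IVF sketch with FAISS: listings are clustered offline; at query time only the
# few clusters closest to the query embedding are scanned.
import faiss
import numpy as np

d, nlist = 64, 100                                  # embedding dim, number of clusters
listing_embs = np.random.rand(100_000, d).astype("float32")

quantizer = faiss.IndexFlatIP(d)                    # coarse index over cluster centroids
index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_INNER_PRODUCT)
index.train(listing_embs)                           # learn the cluster centroids
index.add(listing_embs)                             # assign listings to clusters

index.nprobe = 8                                    # scan only the 8 closest clusters
query_emb = np.random.rand(1, d).astype("float32")
scores, listing_ids = index.search(query_emb, 200)  # top-200 candidate set
```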
5. The Impact:
The new EBR system has been successfully rolled out in both Airbnb Search and Email Marketing campaigns.
A/B testing demonstrated a significant increase in overall bookings, with improvements comparable to the cumulative gains achieved through various ML ranking enhancements over the previous two years.
The primary driver of this success is EBR's ability to effectively integrate query context early in the process, dramatically improving the relevance and ranking accuracy at the initial retrieval stage, especially for queries that generate a large number of potential candidates.
In essence, Airbnb's adoption of EBR represents a major step forward in efficiently connecting travelers with the right listings at scale, proving the power of sophisticated retrieval techniques in complex, real-world applications.
AI
Deep-Live-Cam
Real-time face swap and video deepfake with a single click and only a single image.
https://github.com/hacksider/Deep-Live-Cam
DeepSeek-V3 Technical Report
https://arxiv.org/pdf/2412.19437
Open Source Catches Up (Again): DeepSeek-V3 marks a significant milestone for open-source AI. It demonstrates performance competitive with top-tier proprietary models like GPT-4o and Claude 3.5 Sonnet, particularly excelling in demanding code and math reasoning tasks. This release significantly narrows the gap between open and closed ecosystems.
Efficiency is the New Frontier: Beyond raw power, DeepSeek-V3 highlights the critical focus on training and inference efficiency. Utilizing novel techniques like auxiliary-loss-free MoE balancing, Multi-Token Prediction (enabling ~1.8x faster inference), and pioneering large-scale FP8 precision training, DeepSeek-AI achieved state-of-the-art results with a remarkably low training cost (estimated ~$5.6M).
Architecture Matters: The report underscores the success of DeepSeek's architectural choices (MLA and DeepSeekMoE) combined with training innovations like Multi-Token Prediction. These aren't just theoretical gains; they translate to tangible benefits in performance and cost-effectiveness.
Beyond Next-Token Prediction: The success of Multi-Token Prediction suggests a potential shift in training paradigms. Predicting multiple future tokens not only improves model performance during training but directly enables faster inference via speculative decoding, tackling a major LLM bottleneck.
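As a rough illustration of the idea, here is a toy multi-token-prediction loss in PyTorch where extra heads are trained to predict tokens further ahead. This is a sketch of the general technique, not DeepSeek-V3's exact formulation (the report uses sequential MTP modules); the vocabulary size, model width, and single shared trunk are assumptions.

```python
# Toy multi-token prediction: head i is trained to predict the token i steps
# ahead of each position, instead of only the immediate next token.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, d_model, k = 32_000, 512, 2

trunk = nn.Embedding(vocab, d_model)  # stand-in for the transformer trunk
heads = nn.ModuleList([nn.Linear(d_model, vocab) for _ in range(k)])

def mtp_loss(tokens):
    """tokens: (batch, seq); averages cross-entropy over the k prediction depths."""
    hidden = trunk(tokens)
    loss = 0.0
    for i, head in enumerate(heads, start=1):
        logits = head(hidden[:, :-i])              # positions that have an i-step-ahead target
        targets = tokens[:, i:]
        loss = loss + F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
    return loss / k

loss = mtp_loss(torch.randint(0, vocab, (4, 128)))
```

At inference time the extra prediction depths can be dropped, or reused to draft several tokens at once for speculative decoding, which is how the ~1.8x inference speedup mentioned above is achieved.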
Specialization through Distillation: DeepSeek-V3's impressive math and code abilities were significantly boosted by distilling knowledge from specialized DeepSeek-R1 models. This highlights the power of targeted post-training techniques to enhance specific capabilities in generalist models.