Taxonomy-based Negative Sampling In Personalized Semantic Search for E-commerce
By: Uthman Jinadu, Siawpeng Er, Le Yu and more
Potential Business Impact:
Finds you the exact online stuff you want.
Large retail outlets offer products that may be domain-specific, which requires a model that can understand subtle differences between similar items. The sampling techniques used to train such models are often computationally expensive or logistically challenging. These models also tend to ignore users' previous purchase patterns and behavior, and so retrieve items that are irrelevant to them. We present a semantic retrieval model for e-commerce search that embeds queries and products into a shared vector space and leverages a novel taxonomy-based hard-negative sampling (TB-HNS) strategy to mine contextually relevant yet challenging negatives. To further tailor retrievals, we incorporate user-level personalization by modeling each customer's past purchase history and behavior. In offline experiments, our approach outperforms BM25, ANCE, and leading neural baselines on Recall@K, while live A/B testing shows substantial uplifts in conversion rate, add-to-cart rate, and average order value. We also demonstrate that our taxonomy-driven negatives reduce training overhead and accelerate convergence, and we share practical lessons from deploying this system at scale.
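The abstract does not spell out how TB-HNS picks its negatives, so the following is only a minimal sketch of one plausible reading: for a purchased (positive) product, negatives are drawn from products that share its parent category but sit in a different leaf category. The function names, the tuple-based taxonomy paths, and the sample catalog are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def build_sibling_index(product_taxonomy):
    """Group product ids by parent category path, then by leaf category.

    product_taxonomy maps product id -> tuple of category names, root first.
    (Hypothetical data layout, assumed for this sketch.)
    """
    index = defaultdict(lambda: defaultdict(list))
    for pid, path in product_taxonomy.items():
        parent, leaf = path[:-1], path[-1]
        index[parent][leaf].append(pid)
    return index


def sample_hard_negatives(positive_id, product_taxonomy, index, k, rng):
    """Return up to k products from sibling leaf categories of the positive.

    Assumption: "hard" means same parent category, different leaf category.
    """
    path = product_taxonomy[positive_id]
    parent, leaf = path[:-1], path[-1]
    candidates = [
        pid
        for other_leaf, pids in index[parent].items()
        if other_leaf != leaf
        for pid in pids
    ]
    if len(candidates) <= k:
        return candidates
    return rng.sample(candidates, k)


# Toy catalog, invented purely for illustration.
product_taxonomy = {
    "p1": ("Tools", "Drills", "Cordless Drills"),
    "p2": ("Tools", "Drills", "Corded Drills"),
    "p3": ("Tools", "Drills", "Hammer Drills"),
    "p4": ("Tools", "Drills", "Cordless Drills"),
    "p5": ("Garden", "Hoses", "Soaker Hoses"),
}
index = build_sibling_index(product_taxonomy)
negatives = sample_hard_negatives("p1", product_taxonomy, index, k=2, rng=random.Random(0))
```

In this toy catalog, the negatives for `p1` come from the other drill subcategories; `p4` (same leaf as the positive) and `p5` (unrelated branch) are excluded. Because the candidates come from a precomputed taxonomy lookup rather than a nearest-neighbor search over the current embeddings, a scheme of this shape would avoid periodic index refreshes during training, which is consistent with the reduced training overhead the abstract reports.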
Similar Papers
Embedding based retrieval for long tail search queries in ecommerce
Information Retrieval
Helps shoppers find rare items online.
BiCA: Effective Biomedical Dense Retrieval with Citation-Aware Hard Negatives
Information Retrieval
Helps computers find science papers better.
Can LLM-Driven Hard Negative Sampling Empower Collaborative Filtering? Findings and Potentials
Information Retrieval
Makes movie suggestions better by finding tricky examples.