Inferring Customer Preferences
LLMs boost shopping recommendations by decoding what users want

Diagram: LLM-based preference approximation and multimodal sequential recommendation for personalized product suggestions.

Large language models can improve systems that recommend items to purchase by inferring customer preferences.

What’s new: Fabian Paischer and colleagues at Johannes Kepler University Linz, University of Wisconsin, and Meta introduced Multimodal Preference Discerner (Mender), a recommender that integrates a large language model (LLM).

Key insight: Text that attracts customers, such as product descriptions, and text they write, such as product reviews, may contain information that indicates their preferences, such as the craft projects that required a particular power tool. But it also may include irrelevant information, such as a complaint that the tool was delivered late, which can throw recommendation systems off track. An LLM can derive preferences from text, providing a clearer signal of what a customer wants.

How it works: Mender comprises an LLM (Llama 3 70B-Instruct), an encoder (Flan-T5 pretrained on a wide variety of text and frozen) that embeds customer data, and a decoder (a transformer trained from scratch) that predicts the next item a customer will buy. The system learned to predict the next item based on descriptions of items a customer purchased, the customer’s ratings and reviews of those products (drawn from datasets of Steam reviews of video games and Amazon reviews of items related to beauty, toys-and-games, and sports-and-outdoors), and customer preferences inferred by the LLM from the foregoing data.

  • The authors started with a list of products a given customer had purchased and reviewed. Given an item’s description and all reviews up to that point, the LLM inferred five customer preferences in the form of instructions such as, “Look for products with vibrant, bold colors.” (A prompting sketch of this step follows this list.)
  • The authors built a dataset in which each example included a sequence of items a customer had purchased and one inferred preference that matched the next purchase. To choose the matching preference, they separately embedded all prior preferences and item descriptions using a pretrained Sentence-T5 embedding model. They chose the preference whose embedding was most similar to that of the next purchase (second sketch below).
  • The encoder embedded the list of purchases and the selected preference. Given the embeddings, the decoder learned to predict the next purchase.
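
The preference-inference step (first bullet above) can be approximated with an off-the-shelf instruction-tuned model. Here’s a minimal sketch, assuming the gated meta-llama/Meta-Llama-3-70B-Instruct checkpoint is available through Hugging Face transformers (any instruction-tuned chat model works for a quick test); the prompt wording is our illustration, not the authors’ prompt:

```python
from transformers import pipeline

# Assumption: the gated Llama 3 70B Instruct checkpoint is available locally.
# device_map="auto" requires the accelerate package.
generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    device_map="auto",
)

def infer_preferences(item_description: str, reviews: list[str]) -> str:
    """Ask the LLM for five preference instructions (illustrative prompt)."""
    messages = [
        {
            "role": "user",
            "content": (
                "Product description:\n" + item_description + "\n\n"
                "Customer reviews so far:\n" + "\n".join(reviews) + "\n\n"
                "Infer five customer preferences, each phrased as an "
                "instruction such as 'Look for products with vibrant, bold "
                "colors.' Ignore details unrelated to preferences, such as "
                "shipping complaints."
            ),
        }
    ]
    out = generator(messages, max_new_tokens=256)
    # With chat-style input, generated_text holds the full message list;
    # the last entry is the assistant's reply.
    return out[0]["generated_text"][-1]["content"]
```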
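
The matching step (second bullet) is a nearest-neighbor lookup in embedding space. A sketch assuming the sentence-transformers port of Sentence-T5; the article says only “most similar,” so cosine similarity here is our assumption:

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("sentence-transformers/sentence-t5-base")

def select_preference(preferences: list[str], next_item_description: str) -> str:
    """Return the inferred preference closest to the next purchase's description."""
    pref_emb = embedder.encode(preferences, convert_to_tensor=True)
    item_emb = embedder.encode(next_item_description, convert_to_tensor=True)
    scores = util.cos_sim(item_emb, pref_emb)  # shape: (1, num_preferences)
    return preferences[scores.argmax().item()]
```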

Results: The authors compared Mender to TIGER, a recommender that also takes a purchase history and predicts the next purchase, on the Steam and Amazon datasets. They scored the results using recall@5, a measure of how often the correct item is among the model’s top five most likely predictions (a sketch of the metric follows the results below).

  • Mender produced the best recommendations for all datasets.
  • On Steam, TIGER was close: Mender achieved 16.8 percent recall@5, while TIGER achieved 16.3 percent.
  • The difference was most pronounced on the Amazon toys-and-games dataset. Mender achieved 5.3 percent recall@5, while TIGER achieved 3.75 percent.
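
For reference, recall@5 over a test set can be computed as below. This is a generic sketch of the metric, not the authors’ evaluation code:

```python
def recall_at_5(ranked_predictions: list[list[str]], targets: list[str]) -> float:
    """Fraction of cases where the true next item is in the top-5 predictions."""
    hits = sum(
        target in preds[:5]
        for preds, target in zip(ranked_predictions, targets)
    )
    return hits / len(targets)

# Example: 1 hit out of 2 cases -> 0.5
print(recall_at_5([["a", "b", "c", "d", "e"], ["x", "y", "z", "w", "v"]],
                  ["c", "q"]))
```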

Why it matters: Drawing inferences from text information like customer reviews and item descriptions boosts a recommender’s signal, making it clearer what a given customer is likely to want. Previous systems used customer reviews or item descriptions directly; Mender uses customer preferences extracted from that information.

We’re thinking: Be on the lookout for innovative ways to use LLMs. We recommend it!
