Search engines have been with us for several decades already, and we are using them daily in our digital lives, so why are we still talking about search? Isn’t search a solved problem already?
Search probably would have been a long solved problem if human language were a well-defined, thought-through, and strict construct. If it were, our daily communication would be fast, clear, precise -- and extremely colorless and dull. So, unfortunately for search engineers and fortunately for everybody else, human language is an extremely complicated system riddled with redundancies and ambiguities. For example, consider this:
One is a dress shirt, and the other is a shirt dress.
Same words, completely different meanings just by changing the order of words!
People quickly navigate this complexity, relying on the context of interaction and vast background knowledge about the world. They expect modern search engines to be as smart in understanding their requests, to base their understanding not just on the accidental matching of words, but on their meaning, also known as semantics. In other words, people want a semantic search.
In this blog post, we discuss what language phenomena semantic search has to understand and how to approach semantic search with some of the powerful information retrieval techniques.
Semantic search extends the classical Boolean retrieval model, which served as a baseline for full text search engine implementations. It addresses many limitations of that term-centric approach with better support for the peculiarities of natural language.
In general, semantic search aims to match documents that correspond to the meaning and intent of the query, not just its words like the full text search approach used to do. Because of this, semantic search departs from traditional ways of matching and counting words and focuses on matching concepts. Its advanced techniques allow it to improve precision compared to full text search without worsening its recall.
Originally full text search was designed to work with text documents, and its user interface allowed only one way of navigating search results – scrolling from top to bottom. Under these circumstances, a search engine could sacrifice precision in favor of recall. Suboptimal precision can be mitigated by advanced ranking, which boosts the most relevant documents to the top. This approach was generally acceptable, because the majority of users rarely went through enough pages to see irrelevant ones. This is what we still see in web search engine interfaces (even though their internal implementation has moved far away from classical full text search) - millions of matched results, but we rarely go beyond the first few pages.
As for e-commerce search, it contains additional elements of customer interaction, which can easily reveal irrelevant products and spoil the experience, even if the relevance-based ranking is doing a very good job:
Even if your customers prefer just scrolling down without filtering or re-sorting (which is very likely on mobile or for products where the visual factor matters a lot, like fashion or home decor), poor precision still remains a frustrating issue. In the e-commerce domain, it is often very easy for a customer to detect irrelevant products just by looking at their images. If a customer sees a lot of such products, she will soon become disappointed by the time wasted scrolling through poor-precision results.
Full text search algorithms were designed to handle loosely structured texts (articles, books, etc.), which contain a few headers and many paragraphs of texts.
On the opposite side, e-commerce search has quite a different data source: products are often well-attributed and each searchable value has a key which explains its meaning (brand, type, color, style, occasion, material, etc.)
This kind of input data becomes a great resource in semantic search implementation. To be honest, as a customer, I feel stunned when I see e-commerce sites with rich attribution (exposed to me via facets), but with poor search implementation.
Unfortunately, we cannot expect all aspects of customer intent to always be fully discoverable in key-value product attributes (even with the help of synonyms). In reality there will always be scenarios where customers search for concepts located only in unstructured data (product names, descriptions or reviews). Because of this, the eventual solution is always hybrid, but if attributes are good enough to serve the majority of search requests, implementing basic semantic search techniques can bring significant improvement. If unstructured data dominates, or the quality of attribution varies across the catalog, it's worth investing in attribute extraction: not only search, but also other features (like facet filters and product recommendations) will benefit from it.
The main insight behind semantic search is that one cannot simply break a search query into tokens and use them to match products in an inverted index.
Instead, semantic search must adhere to the following principles:
Those principles can be applied by introducing the query understanding phase in the search pipeline.
This phase examines the query from different perspectives – and its findings help to build an optimal boolean query as well as ranking criteria for product search.
Depending on the available data, building a boolean query can be implemented in different ways.
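To make the query understanding phase more concrete, here is a minimal sketch of one possible way to turn a text query into attribute filters. It assumes a small hand-written vocabulary that maps known phrases to attribute names; the `VOCABULARY` table, the `understand_query` function and the greedy longest-match strategy are illustrative assumptions, not a description of any particular search engine.

```python
from typing import Dict, List, Tuple

# Hypothetical vocabulary: known phrase -> attribute it belongs to.
# In practice this would be derived from catalog attribution and synonyms.
VOCABULARY: Dict[str, str] = {
    "blue": "color",
    "black": "color",
    "shirt": "product_type",
    "dress": "product_type",
    "dress shirt": "product_type",
    "shirt dress": "product_type",
    "calvin klein": "brand",
}
MAX_PHRASE_LEN = 2  # longest phrase (in tokens) present in the vocabulary


def understand_query(query: str) -> Tuple[Dict[str, List[str]], List[str]]:
    """Tag query tokens with attributes using greedy longest-match lookup.

    Returns the attribute filters and the tokens that were not recognized.
    """
    tokens = query.lower().split()
    filters: Dict[str, List[str]] = {}
    unrecognized: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest phrase first so "dress shirt" wins over "dress".
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in VOCABULARY:
                filters.setdefault(VOCABULARY[phrase], []).append(phrase)
                i += n
                break
        else:
            # No known phrase starts here; leave the token for text matching.
            unrecognized.append(tokens[i])
            i += 1
    return filters, unrecognized
```

With this sketch, "blue dress shirt" and "blue shirt dress" produce different product type filters even though they consist of the same words. Unrecognized tokens are returned separately, so they can still be matched against unstructured text.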
Customer intent classification based on machine learning algorithms can be employed to convert customer search history data into predictions of search phrase category or product type and improve precision.
Semantic search relies on a business domain knowledge base, which contains different types of relations between terms relevant within a particular domain:
If you have a merchandising rule associated with a certain keyword (like a category redirect for couches), then this rule should fire not only for different spellings of this keyword (singular couch and its misspellings), but also for other keywords with the same semantics (sofas). In practice this is often achieved by explicitly listing all keywords, their forms and even popular misspellings in the rule itself, but such an approach doesn't scale well.
Once you have semantic search for products, it makes sense to extend it to the evaluation of rule keywords, so that all implemented techniques can be leveraged there. In hindsight this approach looks obvious, because in both use cases your final goal is to express customer intent using domain vocabulary.
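As an illustration of evaluating rule keywords semantically, the following sketch attaches a rule to a concept rather than to a list of raw keywords. The `CONCEPTS` and `RULES` tables, the redirect target and the `find_rule` function are hypothetical and only show the shape of the idea.

```python
from typing import Dict, Optional

# Hypothetical slice of the domain knowledge base: surface form -> concept.
# Word forms, synonyms and popular misspellings are maintained once, here.
CONCEPTS: Dict[str, str] = {
    "couch": "sofa",
    "couches": "sofa",
    "couchs": "sofa",  # assumed popular misspelling
    "sofa": "sofa",
    "sofas": "sofa",
}

# Hypothetical merchandising rules keyed by concept, not by raw keyword.
RULES: Dict[str, str] = {"sofa": "redirect:/category/sofas-and-couches"}


def find_rule(query: str) -> Optional[str]:
    """Return the rule action for a query, or None when no rule applies."""
    concept = CONCEPTS.get(query.strip().lower())
    if concept is None:
        return None
    return RULES.get(concept)
```

Because the rule is attached to the concept, adding a new synonym or misspelling to the knowledge base benefits both product search and rule evaluation without touching the rule itself.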
However, there is one common pitfall that you should be aware of in advance.
While hypernyms should be considered in product search, they are better ignored when evaluating merchandising rules if the rule is intended to override the whole result.
Once you implement efficient query parsing, which can convert a text query into attribute filters, you can use it in areas other than search. For example, it can be used to improve the corpus of autocomplete phrases: you can not only remove poorly matched or misspelled phrases, but also avoid showing semantically similar phrases.
If an attribute refers directly to some physical properties of a product, it can pose additional challenges for the semantic search engine, as people can use multiple ways to measure or explain the same physical characteristic.
Color search is a valuable use case in the domain of fashion, where the actual look of a product weighs heavily on customer decisions. However, the product-to-color relation is not simple. There are different shades of the same color, and a product can combine multiple colors (as a striped dress does). The example below shows how differently “blue dress” products can actually look.
Here one can rely not only on the attribute values, but also on computer vision techniques to resolve these issues. The model can extract the dominant and participating colors, and we can consider a color distance between the color mentioned in the query and the multiple colors associated with the product, giving the dominant color the maximum weight.
It can also include grouping products by their relation to this color: shade, coverage, patterns, etc.
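The weighted color distance described above can be sketched as follows. This is a simplified illustration: it assumes that a vision model has already extracted each product's colors together with their coverage share, and it uses plain Euclidean distance in RGB, whereas a perceptual color space would likely work better in practice. The `QUERY_COLORS` table and the function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]

# Hypothetical reference values for color names that may appear in a query.
QUERY_COLORS: Dict[str, RGB] = {"blue": (0, 0, 255), "red": (255, 0, 0)}


def rgb_distance(a: RGB, b: RGB) -> float:
    """Euclidean distance in RGB space (a simplification)."""
    return math.dist(a, b)


def color_score(query_color: str, product_colors: List[Tuple[RGB, float]]) -> float:
    """Coverage-weighted average distance to the query color; lower is closer.

    product_colors holds (color, coverage share) pairs, so the dominant
    color carries the largest weight.
    """
    target = QUERY_COLORS[query_color]
    total_weight = sum(weight for _, weight in product_colors)
    weighted = sum(rgb_distance(target, rgb) * weight for rgb, weight in product_colors)
    return weighted / total_weight


# A solid navy dress vs. a mostly white dress with blue stripes.
navy_dress = [((0, 0, 128), 1.0)]
striped_dress = [((255, 255, 255), 0.7), ((0, 0, 255), 0.3)]
```

For the query color "blue", the solid navy dress gets a lower (better) score than the mostly white striped dress, even though the stripes themselves are an exact match.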
Retailers have other use cases for color search. One of the more interesting examples is the home improvement requirement to use color search for paint matching. A customer who wants to repaint an object, such as an interior wall, may be looking for the paint used initially in order to match the existing color. Due to the variety of paint colors, the customer may not always use primary color names such as beige. They may use official name variations such as Tuscan beige or desert sand, or unofficial terminology, like light beige or off white. The search system should recognize those colors and return the closest paint products. In this case, color is a critical property, so it may be advisable to build distinctive UI elements specifically for paint selection on the site.
Size search has additional challenges in domains where real physical sizes are used. Semantic search should be able to handle the following aspects of “dimensional” size search:
In addition, semantic size search cannot expect every customer to know and obey all of the above rules. When a customer makes a mistake (like using a single apostrophe for something usually measured in inches), semantic size search needs to handle it gracefully.
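A minimal sketch of forgiving size parsing might look like this. It covers only feet and inches, and it assumes a hypothetical `USUAL_UNIT` table that records the unit in which a product type is normally measured; the unit aliases, the regular expression and the `parse_size` function are illustrative assumptions.

```python
import re
from typing import Dict, Optional, Tuple

# Different ways customers write the same unit.
UNIT_ALIASES: Dict[str, str] = {
    "'": "ft", "ft": "ft", "foot": "ft", "feet": "ft",
    '"': "in", "in": "in", "inch": "in", "inches": "in",
}

# Hypothetical domain knowledge: the unit a product type is usually measured in.
USUAL_UNIT: Dict[str, str] = {"tv": "in", "rug": "ft"}

# A number followed by a unit; the lookahead keeps "in" from matching "interior".
SIZE_RE = re.compile(
    r"(\d+(?:\.\d+)?)\s*('|\"|feet|foot|ft|inches|inch|in)(?![a-z])",
    re.IGNORECASE,
)


def parse_size(query: str, product_type: str) -> Optional[Tuple[float, str]]:
    """Extract (value, unit) from a query, forgiving a misplaced apostrophe."""
    match = SIZE_RE.search(query)
    if match is None:
        return None
    value = float(match.group(1))
    symbol = match.group(2).lower()
    unit = UNIT_ALIASES[symbol]
    # A single apostrophe on something usually measured in inches is most
    # likely a typo for a double quote, so reinterpret it as inches.
    if symbol == "'" and USUAL_UNIT.get(product_type) == "in":
        unit = "in"
    return value, unit
```

Here `parse_size("55' tv", "tv")` is interpreted as 55 inches, while `parse_size("8' rug", "rug")` stays at 8 feet, because rugs are assumed to be measured in feet.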
Age search becomes important for child-related products when using age as a measurement of different characteristics of a product, such as size for clothing or complexity for toys. Semantic search should have the ability to recognize ages formulated in different ways:
A customer may search not for a particular price, but for a price range. A typical example is a search for products below some price value using patterns like under $20 or below $50. Semantic search needs to recognize these patterns and convert them to the appropriate price range filters in a Boolean query.
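As a rough sketch, the two patterns mentioned above could be recognized with a regular expression and converted into a filter. The filter representation (`{"lt": 20.0}`) and the `extract_price_filter` function are assumptions made for illustration; a real implementation would likely cover more phrasings and currencies.

```python
import re
from typing import Dict, Optional, Tuple

# Recognizes "under $20", "below 50", "Under $19.99" and similar.
PRICE_RE = re.compile(r"\b(under|below)\s+\$?(\d+(?:\.\d{1,2})?)", re.IGNORECASE)


def extract_price_filter(query: str) -> Tuple[str, Optional[Dict[str, float]]]:
    """Split a query into the remaining text and an optional price filter."""
    match = PRICE_RE.search(query)
    if match is None:
        return query.strip(), None
    # Hypothetical filter shape: "lt" means "price lower than".
    price_filter = {"lt": float(match.group(2))}
    # Remove the price phrase so it is not matched as ordinary text.
    remaining = query[:match.start()] + query[match.end():]
    return " ".join(remaining.split()), price_filter
```

For example, "sandals under $20" becomes the text query "sandals" plus the filter `{"lt": 20.0}`, so the words "under" and "20" never reach text matching.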
A semantic partial match is another example of semantic search benefits.
When the query doesn’t match any product exactly, we have to relax matching requirements, which can lead to multiple alternative sub-queries to be evaluated.
In the above example, the query “black torn jeans” couldn’t match any products with all its words, so in the end different types of black products were returned (jeans, skirts and even rugs). As a customer, I can easily notice that rugs are completely irrelevant for this query, so I would expect them to be not just buried but filtered out.
One needs to decide which sub-queries are too irrelevant to be used. A full text search approach could filter out sub-queries based on word-related factors, such as the number of preserved words in a query or their index frequency. Both approaches are far from optimal. For example, if your site doesn’t carry “blue calvin klein shirts” products, but carries “blue shirts”, “calvin klein shirts” and “blue calvin klein” products, then “blue calvin klein” (3 preserved words) shouldn’t be considered as more relevant than “blue shirts” (2 preserved words), should it?
When deciding which words to omit from the query, semantic search has to consider the saliency of the attributes of the omitted words in each sub-query.
In our example, dropping the product type “shirt” is not a good idea (customers are very unlikely to accept a different product type just because it has the same color or brand), so the “blue calvin klein” interpretation should be filtered out. The other two options (“blue shirts” vs. “calvin klein shirts”) have similar relevance.
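The saliency-based choice of sub-queries can be sketched as below. The saliency weights and the threshold are made-up numbers chosen to reproduce the example; in a real system they would be tuned or learned, and the query is assumed to be already tagged with attributes.

```python
from itertools import combinations
from typing import Dict, List

# Hypothetical saliency: how strongly customers care about each attribute.
SALIENCY: Dict[str, float] = {"product_type": 1.0, "brand": 0.3, "color": 0.3}
# Sub-queries that drop more total saliency than this are filtered out.
MAX_DROPPED_SALIENCY = 0.5


def relaxed_subqueries(tagged_query: Dict[str, str]) -> List[Dict[str, str]]:
    """Return acceptable relaxations, least damaging first."""
    attributes = list(tagged_query)
    candidates = []
    # Try keeping fewer and fewer attributes, but never zero.
    for keep in range(len(attributes) - 1, 0, -1):
        for kept in combinations(attributes, keep):
            dropped = sum(SALIENCY[a] for a in attributes if a not in kept)
            if dropped <= MAX_DROPPED_SALIENCY:
                candidates.append((dropped, {a: tagged_query[a] for a in kept}))
    candidates.sort(key=lambda pair: pair[0])
    return [subquery for _, subquery in candidates]


query = {"color": "blue", "brand": "calvin klein", "product_type": "shirt"}
subqueries = relaxed_subqueries(query)
```

For the example query this keeps “blue shirts” and “calvin klein shirts” and rejects “blue calvin klein”, because dropping the product type costs more saliency than the threshold allows.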
As a part of customer-oriented search experience, it is recommended to explain which words were omitted (so the customer can try to rephrase them) as well as group products by sub-queries (if more than one was eventually used for returned products).
When a customer is searching for brands not carried by your site, then just omitting a brand name in a query is not efficient. Instead, you should still be able to understand the intent of the search request and suggest a reasonable alternative. For example:
It is a good practice to explain the substitution explicitly to the customer (we don’t have …, but we suggest …), so that the customer does not continue fruitlessly searching for other phrases or categories, but explores the provided suggestions instead.
When a customer is using a short query like “dress”, “watch” or “nike”, we have to deal with a situation when we have many products that seem to match the customer query perfectly. We may, after all, have hundreds of dresses, watches, or Nike products in the catalog. Which products should display on the first page? A short query is a common example of “head queries”, which are very frequent, quite short, and match many products equally well, so they get an equal relevance score from the search engine.
A popular approach is to break this tie by employing site-wide product-level business metrics such as sales, margins, inventory, rating, and combinations thereof. Those metrics can produce a ranking score used as a secondary ranking criterion to break ties between equally relevant products.
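A tiered ranking of this kind can be sketched as a two-level sort key: the relevance tier first, the business score second. The `Hit` structure, the metric names and the weights below are hypothetical, and the metrics are assumed to be normalized to the 0..1 range.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hit:
    product_id: str
    relevance_tier: int  # higher means more relevant to the query
    sales: float         # business metrics, assumed normalized to 0..1
    margin: float
    rating: float


# Hypothetical weights for combining the metrics into one score.
WEIGHTS: Dict[str, float] = {"sales": 0.5, "margin": 0.3, "rating": 0.2}


def business_score(hit: Hit) -> float:
    return (WEIGHTS["sales"] * hit.sales
            + WEIGHTS["margin"] * hit.margin
            + WEIGHTS["rating"] * hit.rating)


def rank(hits: List[Hit]) -> List[Hit]:
    """Sort by relevance tier first; the business score only breaks ties."""
    return sorted(hits, key=lambda h: (h.relevance_tier, business_score(h)), reverse=True)
```

Because the business score is only the second element of the sort key, a best-selling but less relevant product can never outrank a more relevant one.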
However, while providing a solid baseline for the tiered relevancy ranking, it is not always the best thing we can do. Consider the following:
To resolve such ambiguities, it is essential to employ an understanding of customer intent obtained from the clickstream. Using clickstream data, we can train both catalog-wide and personalized ML models to predict product types and resolve the query ambiguity.
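Before training a full ML model, a simple frequency baseline over clickstream data can already illustrate the idea. The sketch below assumes the clickstream is available as (query, clicked product type) pairs; the `min_share` threshold and the sample data are hypothetical, and this is a counting baseline rather than a trained model.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple


def train(clicks: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count clicked product types per query."""
    model: Dict[str, Counter] = defaultdict(Counter)
    for query, product_type in clicks:
        model[query.lower()][product_type] += 1
    return model


def predict_product_type(model: Dict[str, Counter], query: str,
                         min_share: float = 0.6) -> Optional[str]:
    """Return the dominant product type, or None if the query stays ambiguous."""
    counts = model.get(query.lower())
    if not counts:
        return None
    product_type, clicks = counts.most_common(1)[0]
    share = clicks / sum(counts.values())
    return product_type if share >= min_share else None


# Hypothetical clickstream sample.
sample_clicks = (
    [("nike", "sneakers")] * 8 + [("nike", "socks")] * 2
    + [("watch", "smart watch")] * 5 + [("watch", "wrist watch")] * 5
)
model = train(sample_clicks)
```

With this sample, "nike" resolves to sneakers, while "watch" stays ambiguous because neither product type reaches the required share of clicks. In that case the engine can fall back to the business-metric ranking.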
For spell correction, it is not enough to merely identify an unknown word and replace it with the known word at the closest edit distance. The replacement must fit the context of the adjacent words in the query. Moreover, such correction is required even when the site catalog knows the original word, if that word doesn’t fit the phrase context.
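One possible way to sketch context-aware correction is to generate similar known words and then pick the one that co-occurs most often with its neighbors. The vocabulary, the bigram counts and the similarity cutoff below are invented for illustration, and candidate generation uses Python's standard `difflib` similarity rather than a true edit distance.

```python
import difflib
from typing import Dict, List, Tuple

# Hypothetical catalog vocabulary and phrase statistics.
VOCABULARY: List[str] = ["dessert", "desert", "plates", "sand", "boots"]
BIGRAM_COUNTS: Dict[Tuple[str, str], int] = {
    ("dessert", "plates"): 120,
    ("desert", "boots"): 80,
    ("desert", "sand"): 40,
}


def context_score(prev: str, word: str, nxt: str) -> int:
    """How often the word appears next to its neighbors in known phrases."""
    return BIGRAM_COUNTS.get((prev, word), 0) + BIGRAM_COUNTS.get((word, nxt), 0)


def correct(query: str) -> str:
    tokens = query.lower().split()
    corrected: List[str] = []
    for i, token in enumerate(tokens):
        prev = corrected[-1] if corrected else ""
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        # Candidates come back most similar first, so a known word is
        # listed before its look-alikes.
        candidates = difflib.get_close_matches(token, VOCABULARY, n=5, cutoff=0.8)
        if not candidates:
            corrected.append(token)  # nothing similar is known; keep as typed
            continue
        # max() keeps the first candidate on ties, i.e. the most similar one.
        corrected.append(max(candidates, key=lambda c: context_score(prev, c, nxt)))
    return " ".join(corrected)
```

In this sketch "desert plates" is corrected to "dessert plates" even though "desert" is a known word, while "desert boots" is left untouched. One limitation is that the right-hand neighbor is taken as typed, so two adjacent misspellings are not handled well.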
A proper implementation of semantic search helps you establish a more efficient balance between precision and recall, improve query understanding and make the customer search journey seamless and delightful.
However, the power of semantic search largely depends on the richness and quality of the domain data - product attribution as well as synonyms.
If your customers often perform out-of-dictionary search, then semantic search quality will suffer. It can include
Even though such use cases can be handled with the help of business rules and special synonyms, this approach is generally hard to scale. To deal with such tricky queries, we recommend looking into semantic vector search, which can be used to augment the capabilities of a semantic search implementation.
Happy semantic searching!