Semantic Search
Semantic search uses vector search to match on the meaning of words instead of the literal text. A search for "clothing", for example, returns results even if that word isn't used in any product information, because "shirt" and "pants" are understood to be related to "clothing". There are also multilingual machine learning models, so it's possible to search in multiple languages without translating anything. For example, the Dutch queries "kleding" (clothing) or "tas voor op je rug" (bag for on your back) will also return results.
Preparation
Make sure you're familiar with InstantSearch, as it's used for the autocomplete and for product listings such as category pages, the search page and product sliders. Indexing is handled by Laravel Scout, and Searchkit is used to make InstantSearch compatible with Elasticsearch / OpenSearch. More docs on related topics:
- Requirements
- Elasticsearch: configuration, CORS and secure
- OpenSearch: configuration, CORS and secure
- Listing component
- Extending the autocomplete
- Eventy filters
Requirements
It's possible to use semantic search with both Elasticsearch and OpenSearch. With OpenSearch, loading a machine learning model that creates the vectors is a free feature. With Elasticsearch, you need to handle vector creation yourself or use a paid subscription.
Configuration
Elasticsearch
Have a look at the semantic search Searchkit docs. Keep in mind this requires a paid subscription! More info in the Elasticsearch semantic search docs.
OpenSearch
Follow this gist, read the linked blog or check out this YouTube video. Also check the OpenSearch docs about semantic search and the semantic search tutorial.
Sentence Transformers
While configuring Elasticsearch / OpenSearch you'll need to pick a sentence transformer model. The defaults in the tutorials are pretty good, but you have to experiment with different models to see which one returns the best results for you. More info on pretrained models. If you're looking for a multilingual model, switch to a model that supports multiple languages.
Installation
NOTE
These examples are using OpenSearch!
Backend
The easiest way to install semantic search capabilities within Rapidez is with Eventy filters from your AppServiceProvider. First you need to add these index settings:
Eventy::addFilter('index.product.settings', fn ($settings) => array_merge_recursive($settings, [
    'index.knn' => true,
    'default_pipeline' => 'embedding-ingest-pipeline',
    'index.search.default_pipeline' => 'hybrid-search-pipeline'
]));
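The two pipeline names in these settings refer to pipelines that are expected to exist in OpenSearch already; the linked gist and tutorial cover creating them. As a rough sketch of what the request bodies could look like, the example below assumes the pipeline names from the settings above, a `content` source field, a `content_embedding` target field and a placeholder model ID. The function names are hypothetical and the normalization and combination techniques are just starting points to experiment with.

```javascript
// Sketch only: request bodies for the two pipelines referenced in the index settings.
// MODEL_ID is a placeholder; use the ID of the model you deployed in OpenSearch.
const MODEL_ID = 'YOUR-MODEL-ID'

// Body for: PUT /_ingest/pipeline/embedding-ingest-pipeline
// Creates a vector from the "content" field and stores it in "content_embedding".
function embeddingIngestPipeline(modelId) {
    return {
        description: 'Create embeddings for product content',
        processors: [
            {
                text_embedding: {
                    model_id: modelId,
                    field_map: { content: 'content_embedding' },
                },
            },
        ],
    }
}

// Body for: PUT /_search/pipeline/hybrid-search-pipeline
// The normalization processor normalizes and combines the scores
// of the sub-queries of a hybrid query.
function hybridSearchPipeline() {
    return {
        description: 'Normalize and combine keyword and vector scores',
        phase_results_processors: [
            {
                'normalization-processor': {
                    normalization: { technique: 'min_max' },
                    combination: { technique: 'arithmetic_mean' },
                },
            },
        ],
    }
}

console.log(JSON.stringify(embeddingIngestPipeline(MODEL_ID), null, 2))
console.log(JSON.stringify(hybridSearchPipeline(), null, 2))
```

Send these bodies to your OpenSearch instance with the tool of your choice, for example the Dev Tools console in OpenSearch Dashboards.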
Create an extra field with all data you'd like to be used for the vector. With the Magento sample data you could do something like this:
Eventy::addFilter('index.product.data', fn ($data) => array_merge_recursive($data, [
    // Combine the relevant attributes into one text field that is used to create the vector.
    'content' => implode(' - ', [
        'Product name: '.$data['name'],
        'SKU: '.$data['sku'],
        'Price: '.$data['price'].' euro',
        'Activity: '.implode(', ', $data['activity'] ?? []),
        'Material: '.implode(', ', $data['material'] ?? []),
        'Style general: '.implode(', ', $data['style_general'] ?? []),
        'Style bottom: '.implode(', ', $data['style_bottom'] ?? []),
        'Climate: '.implode(', ', $data['climate'] ?? []),
        'Pattern: '.implode(', ', $data['pattern'] ?? []),
        'Gender: '.implode(', ', $data['gender'] ?? []),
        // Fall back to an empty string for products without a description.
        'Description: '.strip_tags($data['description'] ?? ''),
    ]),
]));
Experiment with different data, or split the data into multiple fields so it's possible to search them independently and apply boosts to certain data. The vector field that holds the embedding of this content (content_embedding in this example) also needs a mapping:
Eventy::addFilter('index.product.mapping', fn ($mapping) => array_merge_recursive($mapping, [
    'properties' => [
        'content_embedding' => [
            'type' => 'knn_vector',
            'dimension' => 512,
            'method' => [
                'name' => 'hnsw',
                'space_type' => 'innerproduct',
                'engine' => 'faiss',
            ]
        ],
    ],
]));
NOTE
Keep in mind this needs to match with your model and configurations within OpenSearch!
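To illustrate what has to match: the `dimension` in the mapping must equal the size of the vectors your model produces. The sketch below lists the output sizes of a few commonly used sentence transformer models and compares them to a mapping dimension. The helper name `checkDimension` is hypothetical and the model list is only a small sample, so verify the value for your model in its documentation.

```javascript
// Embedding sizes of a few sentence transformer models (small sample, not exhaustive).
const MODEL_DIMENSIONS = {
    'all-MiniLM-L6-v2': 384,
    'all-mpnet-base-v2': 768,
    'paraphrase-multilingual-MiniLM-L12-v2': 384,
    'distiluse-base-multilingual-cased-v1': 512,
}

// Hypothetical helper: returns true when the mapping dimension fits the model.
function checkDimension(modelName, mappingDimension) {
    const expected = MODEL_DIMENSIONS[modelName]

    if (expected === undefined) {
        throw new Error(`Unknown model "${modelName}", look up its dimension first`)
    }

    return expected === mappingDimension
}

console.log(checkDimension('distiluse-base-multilingual-cased-v1', 512)) // true
console.log(checkDimension('all-MiniLM-L6-v2', 512)) // false
```

The mapping above uses 512, which would fit a model such as distiluse-base-multilingual-cased-v1. With a 384-dimensional model, the mapping needs 'dimension' => 384 instead.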
Now that everything is configured, run the indexer:
php artisan rapidez:index
TIP
Alternatively, if you've overwritten the product model, you can use these methods within the product model instead of using Eventy:
indexMapping()
indexSettings()
toSearchableArray()
See the Laravel Scout docs, the extending indexer docs and the current searchable trait used by the product model.
Frontend
The query used to get search results needs to include our new vector field. You can overwrite the default query using the query prop on the listing component. First publish the Blade template and edit: resources/views/vendor/rapidez/components/listing.blade.php
Pass the name of a new custom function to the query prop:
<listing
    ...
    :query="semanticSearch"
>
And create that function in resources/js/app.js:
Vue.prototype.semanticSearch = (query, searchAttributes, config) => {
    let finalQuery = Vue.prototype.relevanceQueryMatch(query, searchAttributes, config.fuzziness)
    finalQuery.bool.should.push({
        'neural': {
            'content_embedding': {
                'query_text': query,
                'model_id': 'YOUR-MODEL-ID',
                'min_score': 0.9,
            }
        }
    })
    return finalQuery
}
TIP
Refer to the OpenSearch docs on neural queries
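If you'd like to experiment with different models or score thresholds, the neural clause can be pulled into a small helper. This is a sketch and not part of Rapidez: `withNeuralClause` and its option names are hypothetical, and it assumes the base query has a `bool.should` array like the one returned by `relevanceQueryMatch`.

```javascript
// Hypothetical helper: returns a copy of baseQuery with a neural clause appended.
function withNeuralClause(baseQuery, queryText, { field = 'content_embedding', modelId, minScore = 0.9 } = {}) {
    if (!modelId) {
        throw new Error('modelId is required')
    }

    return {
        ...baseQuery,
        bool: {
            ...baseQuery.bool,
            should: [
                ...(baseQuery.bool?.should ?? []),
                {
                    neural: {
                        [field]: {
                            query_text: queryText,
                            model_id: modelId,
                            min_score: minScore,
                        },
                    },
                },
            ],
        },
    }
}

// Example with a simplified keyword query as the base.
const base = { bool: { should: [{ multi_match: { query: 'kleding', fields: ['name'] } }] } }
const result = withNeuralClause(base, 'kleding', { modelId: 'YOUR-MODEL-ID', minScore: 0.8 })

console.log(result.bool.should.length) // 2
console.log(base.bool.should.length) // 1, the base query is not mutated
```

Lowering `minScore` lets more loosely related products through, while raising it keeps only close matches. Which value works best depends on your model and data.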
Temporary workaround
Until this pull request is merged, you also need to add the function below so the default query can be reused.
relevanceQueryMatch function
// From: https://github.com/searchkit/searchkit/blob/main/packages/searchkit/src/transformRequest.ts
// See: https://github.com/searchkit/searchkit/pull/1408
Vue.prototype.relevanceQueryMatch = (query, search_attributes, fuzziness = "AUTO:4,8") => {
    const getFieldsMap = (boostMultiplier) => {
        return search_attributes.map((attribute) => {
            return typeof attribute === "string" ? attribute : `${attribute.field}^${(attribute.weight || 1) * boostMultiplier}`;
        });
    };
    return {
        bool: {
            should: [
                {
                    bool: {
                        should: [
                            {
                                multi_match: {
                                    query,
                                    fields: getFieldsMap(1),
                                    fuzziness
                                }
                            },
                            {
                                multi_match: {
                                    query,
                                    fields: getFieldsMap(0.5),
                                    type: "bool_prefix"
                                }
                            }
                        ]
                    }
                },
                {
                    multi_match: {
                        query,
                        type: "phrase",
                        fields: getFieldsMap(2)
                    }
                }
            ]
        }
    };
}
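To make the boosting in this query easier to follow, here is the field mapping logic on its own. It's a standalone sketch (the `fieldsWithBoost` name and the sample attributes are made up for illustration) that mirrors `getFieldsMap`: plain strings are passed through unchanged, while objects get their weight multiplied per clause.

```javascript
// Standalone copy of the getFieldsMap logic from relevanceQueryMatch.
function fieldsWithBoost(searchAttributes, boostMultiplier) {
    return searchAttributes.map((attribute) =>
        typeof attribute === 'string'
            ? attribute
            : `${attribute.field}^${(attribute.weight || 1) * boostMultiplier}`
    )
}

// Sample attributes, purely illustrative.
const searchAttributes = [{ field: 'name', weight: 3 }, { field: 'description' }, 'sku']

console.log(fieldsWithBoost(searchAttributes, 1))   // [ 'name^3', 'description^1', 'sku' ]
console.log(fieldsWithBoost(searchAttributes, 0.5)) // [ 'name^1.5', 'description^0.5', 'sku' ]
console.log(fieldsWithBoost(searchAttributes, 2))   // [ 'name^6', 'description^2', 'sku' ]
```

So the fuzzy match uses the configured weights, the prefix match counts for half, and an exact phrase match counts double.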