Improving Elasticsuite Autocomplete functionality

This blog post was published more than one year ago and might be outdated!
Stephan Hochdörfer
Head of IT Business Operations

In a current Magento project we make heavy use of Elasticsearch via the smile/elasticsuite module. Elasticsearch is basically the first point of contact of the single-page application we have built on top of Magento. When we tried to make use of the existing Autocomplete features, we realized that partial word matching was not supported.

Let's step back first and look at what Elasticsearch offers for building autocomplete functionality. There are basically two options: the edge ngram analyser and the completion suggester. While the latter seems to be the more performant option because it uses an in-memory data structure called a Finite State Transducer (FST), we decided to go for the edge ngram analyser. Modifying elasticsuite to support completion suggester queries looked like a huge task. Including the edge ngram analyser seemed a lot easier, and it was. It just took a while to put all the pieces of the puzzle together. In the end, this is what needs to be done:

First, we need to define an analyzer. The analyzer is called autocomplete_analyzer; it uses the autocomplete_tokenizer (defined below) and a lowercase filter. The filter is important, because without it the search results would be case-sensitive. That did not make much sense in our use case, but it might be different for you:

$config['analyzer']['autocomplete_analyzer'] = [
    'tokenizer' => 'autocomplete_tokenizer',
    'filter' => 'lowercase'
];

$config['tokenizer']['autocomplete_tokenizer'] = [
    'type' => 'ngram',
    'min_gram' => 3,
    'max_gram' => 3,
    'token_chars' => [
        // only letters and digits form tokens;
        // whitespace, punctuation and symbols are ignored
        'letter',
        'digit'
    ]
];

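The tokenizer is based on the ngram tokenizer and is configured to emit grams of exactly 3 characters. It first splits the text into words at every character that does not belong to one of the classes listed in token_chars. In our case only letters and digits are kept, so whitespace, punctuation and symbols act as separators. The tokenizer then emits the N-grams of each word. Note that the ngram type emits every 3-character substring of a word, which also allows matches in the middle of a word; only the edge_ngram type anchors the start of each N-gram to the beginning of the word. Words shorter than 3 characters produce no gram at all.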
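To make the difference between the ngram configuration and an edge ngram more concrete, here is a small sketch in plain PHP. It is not Elasticsearch code, only an approximation of the behaviour described above; the function name sketchNgrams and its parameters are made up for this illustration, and it assumes the mbstring extension is available.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative approximation only (hypothetical helper, not part of
 * Elasticsearch or elasticsuite): lowercase the text, split it at every
 * character that is neither a letter nor a digit, then emit fixed-size grams.
 *
 * @return string[]
 */
function sketchNgrams(string $text, int $size = 3, bool $edgeOnly = false): array
{
    // token_chars = [letter, digit]: every other character acts as a separator.
    $words = preg_split('/[^\p{L}\p{N}]+/u', mb_strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    $grams = [];
    foreach ($words as $word) {
        $length = mb_strlen($word);
        // ngram: slide over the whole word; edge ngram: only start at position 0.
        $lastStart = $edgeOnly ? 0 : $length - $size;
        for ($start = 0; $start <= $lastStart && $start + $size <= $length; $start++) {
            $grams[] = mb_substr($word, $start, $size);
        }
    }

    return $grams;
}

echo implode(', ', sketchNgrams('Red T-Shirt')), "\n";          // red, shi, hir, irt
echo implode(', ', sketchNgrams('Red T-Shirt', 3, true)), "\n"; // red, shi
```

With the sliding variant a query for "irt" can match "T-Shirt", while the anchored variant only matches from the beginning of a word. The single "T" is dropped in both cases because it is shorter than 3 characters.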

How can we configure elasticsuite to use that configuration when indexes are created? Unfortunately there is no way of configuring this directly in the module. But thanks to the Magento 2 Dependency Injection magic, we can make use of an After plugin to achieve this:

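namespace My\Module\Plugin\Index;

use Smile\ElasticsuiteCore\Api\Index\IndexSettingsInterface;

class AddAutcompleteAnalyzerToIndexConfiguration
{
    /**
     * Adds the custom analyzer and tokenizer to the analysis settings
     * that elasticsuite has already generated.
     */
    public function afterGetAnalysisSettings(
        IndexSettingsInterface $subject,
        array $config
    ): array {
        $config['analyzer']['autocomplete_analyzer'] = [
            'tokenizer' => 'autocomplete_tokenizer',
            'filter' => 'lowercase'
        ];

        $config['tokenizer']['autocomplete_tokenizer'] = [
            'type' => 'ngram',
            'min_gram' => 3,
            'max_gram' => 3,
            'token_chars' => [
                // only letters and digits form tokens;
                // whitespace, punctuation and symbols are ignored
                'letter',
                'digit'
            ]
        ];

        return $config;
    }
}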

In the di.xml of your module, add the following lines:

<type name="Smile\ElasticsuiteCore\Index\IndexSettings">
    <plugin name="index_add_autocomplete_analyzer_config"
            type="My\Module\Plugin\Index\AddAutcompleteAnalyzerToIndexConfiguration"
    />
</type>

How do we configure fields to make use of the autocomplete_analyzer during indexing? Simply refer to the custom autocomplete_analyzer in the respective field configuration. Let's assume you want the product name to be indexed (or analyzed) via our newly created autocomplete_analyzer. In that case, add the following field configuration to your elasticsuite_indices.xml configuration:

<field name="name" type="string">
    <isSearchable>1</isSearchable>
    <defaultSearchAnalyzer>autocomplete_analyzer</defaultSearchAnalyzer>
</field>

To test this, simply run a search with the following search criteria and the like condition:

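use Magento\Framework\Api\Filter;
use Magento\Framework\Api\Search\FilterGroup;
use Magento\Framework\Api\Search\SearchCriteria;

// Filter the "name" field with the like condition
$filter = new Filter();
$filter->setField('name');
$filter->setValue('your search string here');
$filter->setConditionType('like');

$filterGroup = new FilterGroup();
$filterGroup->setFilters([$filter]);

$searchCriteria = new SearchCriteria();
$searchCriteria->setFilterGroups([$filterGroup]);
$searchCriteria->setRequestName('quick_search_container');

// $searchEngine, $requestBuilder and $responseBuilder are dependencies
// that are not shown here, e.g. injected via the constructor of your class
$searchApi = new \Smile\ElasticsuiteCore\Model\Search(
    $searchEngine,
    $requestBuilder,
    $responseBuilder
);
$result = $searchApi->search($searchCriteria);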
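If the search does not return what you expect, it can help to check the analyzer independently of Magento via Elasticsearch's _analyze API, which returns the tokens an analyzer produces for a given text. The following sketch only builds the request body; the helper name buildAnalyzeBody and the index name are assumptions for illustration, so replace the index name with the one elasticsuite created in your installation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds the JSON body for a request to the
 * Elasticsearch _analyze API.
 */
function buildAnalyzeBody(string $analyzer, string $text): string
{
    return json_encode(['analyzer' => $analyzer, 'text' => $text], JSON_THROW_ON_ERROR);
}

// Assumed index name, for illustration only. List the real ones with GET /_cat/indices.
$index = 'magento2_default_catalog_product';
$body = buildAnalyzeBody('autocomplete_analyzer', 'Red T-Shirt');

// Prints the request line and body, ready to be sent with curl or a similar tool.
echo "POST /{$index}/_analyze\n{$body}\n";
```

The request has to target an index that was created after the plugin was enabled, because a custom analyzer is only known to indices whose settings contain it.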