Tokenization, the process of breaking a query into individual words or tokens, is deceptively hard because of edge cases such as punctuation, special characters, and non-standard language use. To handle these cases, search engines apply a range of techniques, including typo tolerance, concatenation, splitting, transliteration, lemmatisation, and synonyms. The Algolia engine has improved its tokenization over the years, adding new kinds of alternatives to enrich query processing while preserving relevance and performance. Together, these approaches let a search engine interpret user queries more accurately and return better results despite the complexities of natural language.
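
To make the idea concrete, here is a minimal sketch, in Python, of how a tokenizer might split a query on punctuation and then generate alternatives through concatenation and dictionary-backed splitting. This is illustrative only and not Algolia's implementation; the `DICTIONARY`, `tokenize`, and `alternatives` names are hypothetical.

```python
import re

# Hypothetical word list used to validate concatenations and splits.
DICTIONARY = {"note", "book", "notebook", "wifi", "wi", "fi"}

def tokenize(query: str) -> list[str]:
    """Split a query into lowercase tokens, treating punctuation and
    special characters as separators."""
    return [t for t in re.split(r"[^\w]+", query.lower()) if t]

def alternatives(tokens: list[str]) -> set[str]:
    """Generate alternative forms of the query terms:
    - concatenation: adjacent tokens joined if the result is a known word
    - splitting: a single token split into two known words."""
    alts = set(tokens)
    # Concatenation, e.g. "wi fi" -> "wifi"
    for a, b in zip(tokens, tokens[1:]):
        if a + b in DICTIONARY:
            alts.add(a + b)
    # Splitting, e.g. "notebook" -> "note book"
    for token in tokens:
        for i in range(1, len(token)):
            left, right = token[:i], token[i:]
            if left in DICTIONARY and right in DICTIONARY:
                alts.add(f"{left} {right}")
    return alts

if __name__ == "__main__":
    tokens = tokenize("Wi-Fi notebook!")
    print(tokens)                # ['wi', 'fi', 'notebook']
    print(alternatives(tokens))  # includes 'wifi' and 'note book'
```

A real engine would of course go much further, adding typo-tolerant matches, transliterated and lemmatised forms, and synonyms as additional alternatives, and would rank them so that exact matches still win.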