Company
Date Published
Author
Peter Villani
Word count
1480
Language
English
Hacker News points
None

Summary

A search index maps the terms of a query to the content in a corpus. It takes the form of an inverted list of words that a search engine uses to locate every word in every document of the corpus. The book metaphor for an index is useful because it underscores the general idea that an index is a separate object from the underlying content, used to find specific parts of that content efficiently. However, the metaphor doesn't fully capture the capabilities and mechanisms of a search engine index, which relies on attributes that describe objects well enough for a searcher to find what they are looking for with a reasonably small set of well-chosen keywords. A successful object-based search requires both accuracy and relevance, and in many cases relevance matters more. The search engine identifies the documents that match a user's query by consulting the index, which is built before the user searches and stored separately on the server. The structure used is called an inverted index, which enables fast lookup of words to find the matching documents.
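
To make the idea of an inverted index concrete, here is a minimal Python sketch. It is illustrative only and not taken from the article: the function names (`tokenize`, `build_inverted_index`, `search`), the sample documents, and the choice to require every query word to match (AND semantics) are all assumptions. It shows the two phases described above: the index is built once, ahead of time, and each query is then answered by looking words up in that index instead of scanning the documents.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it.

    `documents` is a hypothetical dict of {doc_id: text}.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query.

    AND semantics is an assumption made for this sketch.
    """
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    matches = set(index.get(words[0], ()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return matches


# Hypothetical corpus, indexed before any query arrives.
docs = {
    1: "The quick brown fox",
    2: "The lazy brown dog",
    3: "A quick red fox",
}
index = build_inverted_index(docs)
print(sorted(search(index, "quick fox")))  # [1, 3]
```

Each lookup touches only the posting sets for the query words, which is why retrieval stays fast as the corpus grows. This sketch handles matching only; ranking the matches by relevance would be a separate step on top of it.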