Company
Date Published
Author
Alessandro Negro
Word count
2569
Language
English
Hacker News points
None

Summary

Building search infrastructure that delivers relevant information to users starts with the creation of a knowledge graph: a multi-relational graph whose nodes are entities and whose edges are typed relationships. The knowledge graph model is designed to handle highly heterogeneous data in terms of sources, schema, volume, and speed of generation, and Natural Language Processing (NLP) plays an important role in extracting "knowledge" from large datasets. The search architecture must be able to navigate this data in real time, giving users efficient access to the information they need. A relevant search application built on top of the knowledge graph combines text extraction and NLP, user modeling and recommendation engines, contextual information, and business goals.

The infrastructure consists of a Neo4j database, an Elasticsearch cluster, and Apache Kafka, which work together to provide real-time data processing and storage. Neo4j stores the entire knowledge graph on which all searches and navigations are performed, while Elasticsearch provides advanced text search capabilities and faceting.
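
As an illustration of how these components might fit together, the sketch below shows a minimal ingestion loop in Python: a Kafka consumer reads documents whose entities have already been extracted by an NLP step, merges them into the Neo4j knowledge graph, and indexes the raw text in Elasticsearch for full-text search and faceting. The topic name, connection settings, message schema, and node/relationship labels are hypothetical, and the snippet assumes the kafka-python, neo4j (5.x), and elasticsearch (8.x) client libraries.

```python
import json

from kafka import KafkaConsumer          # kafka-python
from neo4j import GraphDatabase          # official Neo4j Python driver (5.x)
from elasticsearch import Elasticsearch  # elasticsearch-py (8.x API assumed)

# Placeholder connection details and topic/index names.
NEO4J_URI = "bolt://localhost:7687"
NEO4J_AUTH = ("neo4j", "password")
ES_URL = "http://localhost:9200"
KAFKA_BROKERS = ["localhost:9092"]
TOPIC = "documents"  # hypothetical topic carrying text plus extracted entities

driver = GraphDatabase.driver(NEO4J_URI, auth=NEO4J_AUTH)
es = Elasticsearch(ES_URL)

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=KAFKA_BROKERS,
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)


def store_in_graph(tx, doc):
    """Merge the document node and its MENTIONS edges to entity nodes."""
    tx.run(
        """
        MERGE (d:Document {id: $id})
        SET d.title = $title
        WITH d
        UNWIND $entities AS e
        MERGE (n:Entity {name: e.name})
        SET n.type = e.type
        MERGE (d)-[:MENTIONS]->(n)
        """,
        id=doc["id"], title=doc["title"], entities=doc["entities"],
    )


for message in consumer:
    # Assumed message shape: {"id": ..., "title": ..., "text": ..., "entities": [...]}
    doc = message.value

    # 1. Persist entities and typed relationships in the Neo4j knowledge graph.
    with driver.session() as session:
        session.execute_write(store_in_graph, doc)

    # 2. Index the raw text in Elasticsearch for full-text search and faceting.
    es.index(
        index="documents",
        id=doc["id"],
        document={
            "title": doc["title"],
            "text": doc["text"],
            "entities": [e["name"] for e in doc["entities"]],  # usable as a facet field
        },
    )
```

In a real deployment the synchronization between Neo4j and Elasticsearch would need to be made more robust (for example via transactional retries or a dedicated connector), but the sketch shows the role each component plays: Kafka carries the stream, Neo4j holds the graph, and Elasticsearch serves text search and facets.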