Company
Date Published
Oct. 20, 2023
Author
Vikram Chatterji
Word count
287
Language
English
Hacker News points
None

Summary

Unstructured data is estimated to reach over 175 zettabytes by 2025, with 80% of it being unstructured. Vector embeddings are a numerical representation of complex data such as images and text, allowing for efficient comparison and storage. These embeddings can be extracted from trained machine-learning models, typically using the output of the second-to-last layer of a neural network. The size of the embeddings, training data quality, and model architecture are key factors to consider when generating vector embeddings.