Research Insights – Learning to Retrieve Passages without Supervision
The paper "Learning to Retrieve Passages without Supervision" by Ori Ram et al. explores Self-Supervised Learning as a way to train retrieval models without human-labeled data. The authors introduce the Span-based Unsupervised Dense Retriever (Spider), a Self-Supervised representation learning method for text retrieval in search. Their results show that Spider performs comparably to Supervised models on unseen data distributions, demonstrating strong Zero-Shot Generalization. The authors further show how Spider can be adapted to a particular data distribution with Transfer Learning, using only 128 labeled examples. This research opens up new possibilities for training custom retrieval models without large labeled datasets.
Company
Weaviate
Date published
Aug. 30, 2022
Author(s)
Connor Shorten
Word count
2957
Hacker News points
None found.
Language
English