Company
Date Published
Jan. 9, 2025
Author
-
Word count
611
Language
English
Hacker News points
12

Summary

The latest commercial vector embedding models, including proprietary and open-source options such as Gemini, OpenAI, Jina, Cohere, Voyage, Stella, ModernBert Embed, and TabFQuAD/Shift Project, were tested on the ViDoRe image search benchmark and compared on relevance, cost, and performance. Voyage-3-large emerged as the top-performing model, with a wide gap between it and the second-place group. Voyage-3-lite was found to be a strong option for those who want high relevance at a lower cost. Stella, an open-source model, performed well out of the box and can be fine-tuned for better results. Astra DB provides a complete data API and integrations that make it easier to build production applications with high relevance and low latency. The results suggest that there is no single "best" model; the right choice depends on the specific use case and requirements.