Turbocharge LangChain: guide to 20x faster embedding

Company

Anyscale

Date Published

May 3, 2023

Author

Amog Kamsetty, Philipp Moritz

Word count

1934

Language

English

Hacker News points

None

URL

www.anyscale.com/blog/turbocharge-langchain-now-guide-to-20x-faster-embedding

Summary

This blog post shows how to speed up embedding generation in LangChain using Ray, a framework for distributed computing. The authors scale document embedding out across 20 GPUs with Ray Data, the distributed data processing library that is part of Ray. They use LangChain to load the documents, split the text into chunks, embed the chunks, and store the embeddings in a FAISS vector store. With Ray Data, they generate and store embeddings for 2,000 PDF documents read from cloud storage in under 4 minutes and in less than 100 lines of code. The authors also show how to run the workload on Ray clusters on AWS or other cloud providers, and they explore combining a vector database with an LLM to build a fact-based question answering service.
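
The pipeline described in the summary (split documents into chunks, shard the chunks across workers, embed each shard in parallel, collect the vectors into one store) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: standard-library threads stand in for Ray Data, a hash-based `toy_embed` stands in for the real embedding model, and a plain list stands in for the FAISS store. The names `split_text`, `toy_embed`, `shard`, `embed_shard`, and `build_index` are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def split_text(text, chunk_size=100, overlap=20):
    """Split text into fixed-size character chunks that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_embed(chunk, dim=8):
    """Placeholder embedding (assumption): a deterministic vector from a hash."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def shard(items, num_shards):
    """Distribute items round-robin into num_shards lists."""
    return [items[i::num_shards] for i in range(num_shards)]


def embed_shard(chunks):
    """Embed every chunk in one shard; each worker handles one shard."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]


def build_index(documents, num_workers=4):
    """Chunk all documents, embed the shards in parallel, merge the results."""
    chunks = [c for doc in documents for c in split_text(doc)]
    shards = [s for s in shard(chunks, num_workers) if s]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(embed_shard, shards))
    # A flat list of (chunk, vector) pairs stands in for the vector store.
    return [pair for shard_result in results for pair in shard_result]


if __name__ == "__main__":
    index = build_index(["a" * 250, "b" * 100], num_workers=3)
    print(len(index), "chunks embedded")
```

In this sketch each worker receives a whole shard rather than a single chunk, which mirrors the batch-per-worker pattern used when each worker holds its own copy of a model. Threads give no real speedup for this toy function; the point is the shape of the pipeline.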