Caching LLM Queries for Performance and Cost Improvements
GPTCache is an open-source semantic cache that improves the efficiency and speed of GPT-based applications by storing responses generated by language models. Users can customize the cache to their needs, with configurable embedding functions, similarity evaluation functions, storage backends, and eviction policies. The tool supports several popular databases for cache storage and offers a range of vector store options for finding the cached request most similar to an incoming one, based on embeddings extracted from the request text. By supporting multiple APIs and vector stores, GPTCache aims to stay flexible and cover a wide range of use cases.
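The core lookup flow described above (embed the incoming request, compare against cached embeddings, return the stored response if similarity clears a threshold) can be sketched in plain Python. This is an illustrative toy, not GPTCache's actual API: the bag-of-words embedding, the `SemanticCache` class, and the 0.7 threshold are all assumptions for demonstration, whereas GPTCache plugs in real embedding models, vector stores, and eviction policies.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" for illustration only; a real semantic
    # cache such as GPTCache would use a learned embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Hypothetical minimal semantic cache: linear scan over stored
    (embedding, response) pairs, hit if similarity >= threshold."""

    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached_response)

    def get(self, query: str):
        q = embed(query)
        best_resp, best_sim = None, 0.0
        for emb, resp in self.entries:
            sim = cosine(q, emb)
            if sim > best_sim:
                best_resp, best_sim = resp, sim
        return best_resp if best_sim >= self.threshold else None

    def put(self, query: str, response: str):
        self.entries.append((embed(query), response))

cache = SemanticCache(threshold=0.7)
cache.put("what is a vector database", "A vector database stores embeddings.")

# A near-identical rephrasing hits the cache, skipping the LLM call entirely.
hit = cache.get("what is a vector database?")
# An unrelated query misses, so the application would fall through to the LLM.
miss = cache.get("how do I bake bread")
```

In a production cache the linear scan would be replaced by an approximate-nearest-neighbor search in a vector store, which is exactly the pluggable component GPTCache exposes.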
Company
Zilliz
Date published
April 10, 2023
Author(s)
Chris Churilo
Word count
1079
Language
English
Hacker News points
None found.