
Caching LLM Queries for Performance & Cost Improvements

What's this blog post about?

GPTCache is an open-source semantic cache that improves the efficiency and speed of GPT-based applications by storing and reusing responses generated by language models. Users can customize the cache to their needs, with configurable embedding functions, similarity evaluation functions, storage backends, and eviction policies. It supports several popular databases for cache storage and offers a range of vector store options for finding the cached requests most similar to an incoming one, based on embeddings extracted from the input. By supporting multiple APIs and vector stores, GPTCache aims to stay flexible and serve a wide range of use cases.
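The semantic-caching idea described above can be sketched in a few lines: embed each query, and on a new query return a cached response when the most similar cached embedding exceeds a threshold. This is a minimal, self-contained illustration of the concept, not GPTCache's actual API; the names (`SemanticCache`, `toy_embed`) and the character-frequency "embedding" are stand-ins for a real embedding model and vector store.

```python
import math

def toy_embed(text):
    # Toy embedding: character-frequency vector over a-z.
    # A real system would call an embedding model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold=0.95):
        self.threshold = threshold  # minimum similarity to count as a hit
        self.entries = []           # list of (embedding, response) pairs

    def get(self, query):
        # Find the most similar cached entry; return it only above threshold.
        emb = toy_embed(query)
        best = max(self.entries, key=lambda e: cosine(emb, e[0]), default=None)
        if best is not None and cosine(emb, best[0]) >= self.threshold:
            return best[1]  # cache hit: reuse the stored LLM response
        return None         # cache miss: caller would query the LLM

    def put(self, query, response):
        self.entries.append((toy_embed(query), response))

cache = SemanticCache(threshold=0.95)
cache.put("what is a vector database?", "A vector database stores embeddings...")

hit = cache.get("what is a vector database")   # near-duplicate phrasing
miss = cache.get("how do I bake bread?")       # unrelated query
```

In a production system, the linear scan over `entries` is replaced by an approximate-nearest-neighbor lookup in a vector store, which is exactly the role the post describes for GPTCache's pluggable vector-store backends.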

Company
Zilliz

Date published
April 10, 2023

Author(s)
Chris Churilo

Word count
1079

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.