
Dissecting OpenAI's Built-in Retrieval: Unveiling Storage Constraints, Performance Gaps, and Cost Concerns

What's this blog post about?

OpenAI's built-in retrieval feature for Assistants suffers from storage constraints, performance gaps, and cost concerns. The current pricing of $0.20 per GB per day is expensive compared to traditional document services such as Office 365 and Google Workspace, yet the underlying server cost of serving the vectors is only about $0.30 per day, a small fraction of what users are billed.

The architecture also imposes hard limits: a maximum of 20 files per assistant, a cap of 512 MB per file, and a hidden limit of 2 million tokens per file. As a result, it may not scale well enough to support larger businesses with more extensive data requirements.

To address these challenges and reduce costs, the service's architecture needs optimization. Suggested improvements include a refined vector database solution, hybrid disk/memory vector storage, streamlined disaster recovery through pooled system data, and multi-tenancy support for a diverse user base.

Among popular vector databases, Milvus is considered the most mature open-source option, offering effective separation of system and query components, isolation of query workloads through its Resource Group feature, a hybrid memory/disk architecture, and application-level multi-tenancy via its RBAC and Partition features. Still, no single vector database can address every challenge and design requirement of upcoming infrastructure, so the choice should be tailored to specific needs when optimizing the architecture of OpenAI Assistants' retrieval.
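As a rough illustration of the cost argument, here is a back-of-envelope sketch using only the figures quoted above; the 30-day month and the 10 GB example workload are illustrative assumptions, not numbers from the post:

    # Back-of-envelope cost sketch using the figures cited in the post.
    # The 30-day month and the 10 GB workload are illustrative assumptions.
    PRICE_PER_GB_PER_DAY = 0.20   # OpenAI Assistants retrieval storage pricing
    SERVER_COST_PER_DAY = 0.30    # post's estimate for serving the vectors
    DAYS_PER_MONTH = 30
    STORED_GB = 10                # hypothetical assistant with 10 GB of files

    billed = PRICE_PER_GB_PER_DAY * STORED_GB * DAYS_PER_MONTH
    serving = SERVER_COST_PER_DAY * DAYS_PER_MONTH

    print(f"Billed for storage: ${billed:.2f}/month")   # $60.00
    print(f"Estimated serving:  ${serving:.2f}/month")  # $9.00

The gap between the two figures is the core of the cost concern: the infrastructure needed to serve the vectors costs roughly what a single month of storage fees would for just 1.5 GB of files.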
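To make the multi-tenancy point concrete, here is a minimal sketch of partition-based tenant isolation with pymilvus. The collection name, field names, vector dimension, and tenant label are hypothetical, and the collection is assumed to already exist with an indexed "embedding" field:

    from pymilvus import connections, Collection

    # Connect to a Milvus instance (host/port assume a local deployment).
    connections.connect(alias="default", host="localhost", port="19530")

    # Assumes a collection named "documents" already exists with an indexed
    # vector field called "embedding"; both names are hypothetical.
    collection = Collection("documents")

    # One partition per tenant gives application-level isolation.
    tenant = "tenant_acme"
    if not collection.has_partition(tenant):
        collection.create_partition(tenant)

    # Placeholder query embedding; the dimension must match the schema.
    query_vector = [0.0] * 768

    # Searches scoped to a tenant's partition never touch other tenants' data.
    collection.load()
    results = collection.search(
        data=[query_vector],
        anns_field="embedding",
        param={"metric_type": "L2", "params": {"nprobe": 16}},
        limit=5,
        partition_names=[tenant],
    )

Combined with Milvus's RBAC, this pattern keeps each tenant's documents and queries logically separated inside a single collection, which is what the post means by application-level multi-tenancy.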

Company
Zilliz

Date published
Jan. 9, 2024

Author(s)
Robert Guo

Word count
2284

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.