
Dissecting OpenAI's Built-in Retrieval: Unveiling Storage Constraints, Performance Gaps, and Cost Concerns

What's this blog post about?

OpenAI's built-in retrieval feature suffers from storage constraints, performance gaps, and cost concerns. The current pricing of $0.20 per GB per day is expensive compared to traditional document services such as Office 365 and Google Workspace, while the underlying server cost for serving vectors is only about $0.30 per day, a bargain relative to what is charged. The architecture of the OpenAI Assistants retrieval feature also imposes limits: a maximum of 20 files per assistant, a cap of 512 MB per file, and a hidden limit of 2 million tokens per file. As it stands, the architecture may not scale well enough to support larger businesses with more extensive data requirements.

To address these challenges and reduce costs, the service's architecture needs to be optimized. Suggested improvements include a refined vector database solution, hybrid disk/memory vector storage, streamlined disaster recovery through pooled system data, and multi-tenancy support for a diverse user base. Among popular vector databases, Milvus is considered the most mature open-source option, offering effective separation of system and query components, isolation of query workloads through its Resource Group feature, a hybrid memory/disk architecture, and application-level multi-tenancy via its RBAC and Partition features. Still, no single vector database can comprehensively address every challenge and design requirement for upcoming infrastructure development; the choice of vector database should be tailored to specific requirements to navigate the complexities of optimizing the OpenAI Assistants architecture.
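The pricing gap described above can be made concrete with some back-of-the-envelope arithmetic. The $0.20 per GB per day rate and the roughly $0.30 per day serving cost are the figures quoted in the post; the 100 GB workload below is purely an illustrative assumption:

```python
# Rough comparison of OpenAI's retrieval pricing against the
# quoted server cost for serving vectors. Figures from the post;
# the workload size is a hypothetical example.

OPENAI_RATE_PER_GB_DAY = 0.20  # USD per GB per day, quoted in the post
SERVER_COST_PER_DAY = 0.30     # USD per day, rough serving cost from the post

def daily_retrieval_cost(gb_stored: float) -> float:
    """Daily charge at the per-GB retrieval pricing."""
    return gb_stored * OPENAI_RATE_PER_GB_DAY

stored_gb = 100  # hypothetical workload size
charge = daily_retrieval_cost(stored_gb)
markup = charge / SERVER_COST_PER_DAY

print(f"Daily charge for {stored_gb} GB: ${charge:.2f}")   # $20.00
print(f"Roughly {markup:.0f}x the quoted serving cost")    # ~67x
```

Even at this modest scale, the charged rate is nearly two orders of magnitude above the quoted serving cost, which is the gap the suggested architectural optimizations aim to close.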

Company
Zilliz

Date published
Jan. 9, 2024

Author(s)
Robert Guo

Word count
2284

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.