Dissecting OpenAI's Built-in Retrieval: Unveiling Storage Constraints, Performance Gaps, and Cost Concerns
OpenAI's built-in retrieval feature has storage constraints, performance gaps, and cost concerns. The current pricing model of $0.2 per GB per day is expensive compared to traditional document services like Office 365 and Google Workspace. However, the server cost for serving vectors is only about $0.30 per day, which is a bargain compared to the pricing. The architecture of OpenAI Assistants' retrieval feature has limitations such as a maximum of 20 files per assistant, a cap of 512MB per file, and a hidden limitation of 2 million tokens per file. The current architecture may not scale well enough to support larger businesses with more extensive data requirements. To address these challenges and reduce costs, the service's architecture needs to be optimized. A refined vector database solution, hybrid disk/memory vector storage, streamlining disaster recovery by pooling system data, and multi-tenancy support for diverse user base are suggested improvements. Among popular vector databases, Milvus is considered the most mature open-source option with effective separation of system and query components, isolation of query components through Resource Group feature, hybrid memory/disk architecture, and application-level multi-tenancy facilitated by RBAC and Partition features. However, no single vector database solution can comprehensively address all challenges and meet every design requirement for imminent infrastructure development. The choice of vector databases should be tailored to specific requirements to effectively navigate the complexities of optimizing OpenAI Assistants' architecture.
Company
Zilliz
Date published
Jan. 9, 2024
Author(s)
Robert Guo
Word count
2284
Language
English
Hacker News points
None found.