The latest pgvector release introduces quantization features that shrink the footprint of vectors and indexes and shorten index build and prewarming times for RAG applications. The new half-precision vector type, halvec, stores each component as a 16-bit floating-point number instead of a 32-bit one, cutting vector storage, and with it storage cost, roughly in half without compromising performance. The article also tests binary quantization, which reduces each component of the vector to a single bit (0 or 1). This yields significant gains in index size and build time, but recall is lower than with scalar quantization. The author encourages experimenting with halvec before migrating, since results may depend on the dataset, and suggests further binary quantization experiments with other embedding models and vector lengths.
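The storage arithmetic behind these claims can be illustrated with a minimal sketch in plain Python. It is not pgvector's implementation and does not reproduce its on-disk format (which adds per-value headers); it only compares raw payload sizes for one vector. The 1536-dimension length, the random test vector, and all function names are assumptions made for illustration, and the "greater than zero becomes 1" rule is one common thresholding choice for binary quantization.

```python
import random
import struct


def to_single_bytes(vec):
    """Pack components as 32-bit floats (4 bytes each)."""
    return struct.pack(f"<{len(vec)}f", *vec)


def to_half_bytes(vec):
    """Pack components as 16-bit half-precision floats (2 bytes each)."""
    return struct.pack(f"<{len(vec)}e", *vec)


def from_half_bytes(buf):
    """Unpack a buffer of 16-bit floats back into Python floats."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))


def binary_quantize(vec):
    """Map each component to 1 if it is greater than zero, else 0."""
    return [1 if x > 0 else 0 for x in vec]


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (7 - i % 8)
    return bytes(out)


random.seed(0)
dim = 1536  # hypothetical embedding length, chosen only for illustration
vec = [random.uniform(-1.0, 1.0) for _ in range(dim)]

single = to_single_bytes(vec)
half = to_half_bytes(vec)
bits = pack_bits(binary_quantize(vec))

# Largest per-component error introduced by the 16-bit round trip.
max_err = max(abs(a - b) for a, b in zip(vec, from_half_bytes(half)))

print(f"32-bit payload: {len(single)} bytes")
print(f"16-bit payload: {len(half)} bytes")
print(f"binary payload: {len(bits)} bytes")
print(f"max half-precision round-trip error: {max_err:.6f}")
```

For this vector the half-precision payload is exactly half the 32-bit one, and the binary payload is one bit per dimension. The round-trip error gives a rough sense of why half precision tends to preserve distances well, while collapsing each component to one bit discards magnitude entirely, which is consistent with the lower recall reported for binary quantization.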