Husky: Exactly-Once Ingestion and Multi-Tenancy at Scale
Datadog's third-generation event store, Husky, is a distributed, time-series oriented, columnar store optimized for streaming ingestion and hybrid analytical and search queries. To ensure exactly once ingestion of every event into Husky’s storage engine, the company developed auto-scaling, multi-tenant data ingestion pipelines. They introduced locality by deterministically mapping events to groups of partitions called shards by their ID and timestamp. This allowed for efficient deduplication within a shard and reduced storage costs and improved performance. The Sharding Allocator ensures all Shard Router nodes have a consistent view of allocated Shard Placements, while the Autosharder periodically adjusts configured shard counts on a tenant-by-tenant basis to better fit observed traffic volume. Load balancing is achieved by shifting tenant Shard Placements around using a salting technique and a balancing algorithm that shifts placements around until all shards are roughly balanced. The Writers, responsible for exactly-once ingestion to Husky, persist event IDs in separate Husky tables from the raw event data to ensure consistency between the event data itself and the event IDs once they’ve been committed to the Metadata store.
Company
Datadog
Date published
Feb. 22, 2023
Author(s)
Daniel Intskirveli, Cecilia Watt
Word count
4354
Language
English
Hacker News points
22