Four Reasons Why Apache Pulsar Is Essential to Your Modern Data Stack
DataStax, a company known for its distributed database technologies, has announced that it will build a messaging solution to complement its existing offerings. The company has spent several years exploring how to integrate messaging and streaming capabilities into its ecosystem, driven by the growing popularity of microservices-based architectures. These architectures often use a message bus to decouple communication between services and to simplify message replay, error handling, and peak-load handling. DataStax evaluated several popular options, including Apache Kafka, but found Kafka lacking in four key areas: geo-replication, scaling, multi-tenancy, and queuing. Apache Pulsar, on the other hand, meets all four requirements to DataStax's satisfaction.
Geo-replication is a crucial feature for global companies like Netflix that need to serve customers worldwide with minimal latency while complying with data sovereignty regulations. Cassandra and Pulsar both support geo-replication natively, letting users choose between synchronous and asynchronous replication and configure replication at the topic level.
Scalability is another area where Kafka falls short compared to Pulsar. In Kafka, adding capacity requires copying some partitions over to the new node before it can participate in load balancing, which temporarily slows down the cluster. Pulsar adds a layer of indirection and separates storage from compute, so each can be scaled independently without moving existing data or imposing extra work on the cluster.
Multi-tenancy is a third area where Kafka struggles, because of its single-tenant design. Pulsar, in contrast, supports multi-tenancy natively: administrators can manage multiple tenants across different regions from a single interface that covers authentication and authorization, isolation policies, and storage quotas. CapitalOne has written an informative article on Pulsar's multi-tenancy capabilities.
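To make the multi-tenancy point more concrete: Pulsar topic names carry the tenant and namespace, in the form `persistent://tenant/namespace/topic`. The sketch below is a plain-Python helper, not part of any Pulsar client library; `TopicName`, `parse_topic`, and the example tenant names are hypothetical and only illustrate how a tenant could be read from a topic name, for instance to look up per-tenant policies.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    """Parts of a fully qualified topic name (hypothetical helper type)."""
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str


def parse_topic(name: str) -> TopicName:
    """Split 'persistent://tenant/namespace/topic' into its four parts."""
    domain, sep, rest = name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unsupported topic name: {name!r}")
    # Split at most twice so the topic part may itself contain slashes.
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {name!r}")
    tenant, namespace, topic = parts
    return TopicName(domain, tenant, namespace, topic)
```

For example, `parse_topic("persistent://acme/billing/invoices").tenant` yields `"acme"`, where `acme` and `billing` are made-up names.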
Lastly, while Kafka offers a traditional pub/sub messaging model, it does not natively support queuing or dead-letter queues. Pulsar supports both the pub/sub and queuing models, so users can balance message load across multiple consumers without requiring them to process messages in the order they were published. This versatility opens up opportunities to cut costs by replacing existing AMQP and JMS systems with a single solution.
In conclusion, Pulsar's architecture offers significant advantages over Kafka in geo-replication, scaling, multi-tenancy, and queuing. DataStax is excited to join the Pulsar community through its acquisition of Kesque, a PaaS for Apache Pulsar, and by open-sourcing the management and monitoring tools built by the Kesque team in its new Luna Streaming distribution of Pulsar.
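The difference between the pub/sub and queuing models described above can be sketched with a small in-memory model. This is a toy illustration under simplifying assumptions, not the Pulsar client API: `ToyTopic` and the subscription names are hypothetical, delivery is plain round-robin, and there is no acknowledgement or redelivery. Every subscription receives each message (pub/sub), while the consumers inside one subscription share the messages between them (queuing).

```python
from itertools import cycle


class ToyTopic:
    """In-memory stand-in for a topic; NOT a real messaging client."""

    def __init__(self):
        self._inboxes = {}   # subscription name -> list of consumer inboxes
        self._next = {}      # subscription name -> round-robin consumer index

    def subscribe(self, subscription, consumer_count):
        """Register a subscription with N consumers and return their inboxes."""
        inboxes = [[] for _ in range(consumer_count)]
        self._inboxes[subscription] = inboxes
        self._next[subscription] = cycle(range(consumer_count))
        return inboxes

    def publish(self, message):
        # Pub/sub: every subscription sees the message.
        # Queuing: within a subscription, exactly one consumer receives it.
        for name, inboxes in self._inboxes.items():
            inboxes[next(self._next[name])].append(message)
```

With one consumer on an `audit` subscription and two on a `workers` subscription, publishing four messages gives the audit consumer all four and each worker two of them, so the workers split the load without any one of them seeing the full ordered stream.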
Company
DataStax
Date published
April 8, 2021
Author(s)
Yahya JARRAYA
Word count
1203
Hacker News points
None found.
Language
French