
Understanding Pulsar Message TTL, Backlog, and Retention

What's this blog post about?

Apache Pulsar manages the message life cycle through several mechanisms, including message TTL, backlog quota, and retention policy. In an ideal scenario with unlimited storage, Pulsar would keep unacknowledged messages in a backlog until they are acknowledged, at which point they can be deleted. Real-world deployments need additional measures to keep backlogs from growing without bound. Pulsar uses two mechanisms to manage backlogs: a time-to-live (TTL) parameter for individual messages and a backlog quota for the backlog itself. TTL defines how long a message may stay in the unacknowledged state, while the backlog quota enforces a hard limit on the logical size of the backlogs in a topic. To prevent backlog overflow, Pulsar offers three policies: holding producer requests without persisting them (producer_request_hold), rejecting producers with an exception (producer_exception), and evicting messages from the backlog so that the slowest consumers lose 10% of the oldest messages (consumer_backlog_eviction). The broker default is producer_request_hold, which relies on consumers to drain the backlog. Pulsar's retention policy tells it to retain acknowledged messages, as well as messages on a topic with no subscription. It sets the limit for keeping acknowledged messages and marks messages over that limit for deletion. Retention policies can be specified at the namespace level, so teams using different namespaces can apply different policies. To manage message retention in Pulsar, consider the backlog quota and the retention policy together, and adjust retention policies accordingly when offloading closed ledgers to lower-cost storage such as AWS S3.
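The interaction between TTL, backlog quota, and retention described above can be sketched as a small toy model. This is not Pulsar's API or its implementation: `NamespacePolicies`, `message_state`, and `on_publish` are hypothetical names invented for illustration, the returned action strings are informal descriptions, and retention is simplified to a limit on message age. The only identifiers taken from Pulsar are the three backlog quota policy names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NamespacePolicies:
    """Hypothetical container for the three settings discussed in the summary."""
    ttl_seconds: Optional[int]    # None: unacknowledged messages never expire
    retention_seconds: int        # how long acknowledged messages are kept
    backlog_quota_bytes: int      # hard limit on the backlog size
    quota_policy: str = "producer_request_hold"


def message_state(age_seconds: int, acked: bool, p: NamespacePolicies) -> str:
    """Classify one message as 'backlog', 'retained', or 'deletable'."""
    # A message older than the TTL is modeled as automatically acknowledged.
    if not acked and p.ttl_seconds is not None and age_seconds > p.ttl_seconds:
        acked = True
    if not acked:
        return "backlog"
    # Simplifying assumption: retention is checked against message age only.
    return "retained" if age_seconds <= p.retention_seconds else "deletable"


def on_publish(backlog_bytes: int, p: NamespacePolicies) -> str:
    """Describe what happens to a new message given the current backlog size."""
    if backlog_bytes <= p.backlog_quota_bytes:
        return "accept"
    actions = {
        "producer_request_hold": "hold producer request",
        "producer_exception": "reject producer with exception",
        "consumer_backlog_eviction": "evict oldest backlog messages",
    }
    return actions[p.quota_policy]


policies = NamespacePolicies(
    ttl_seconds=3600, retention_seconds=7200, backlog_quota_bytes=1000
)
print(message_state(100, acked=False, p=policies))
print(message_state(4000, acked=False, p=policies))
print(message_state(8000, acked=True, p=policies))
print(on_publish(2000, policies))
```

In this model an unacknowledged message leaves the backlog only through acknowledgment or TTL expiry, after which the retention limit alone decides whether it is kept. That is why the summary advises configuring backlog quota and retention together.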

Company
DataStax

Date published
June 22, 2020

Author(s)
Ming Luo

Word count
1098

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.