[SPIKE] Different ways of handling data retention in Kafka
Describe the solution you'd like
In this spike we would like to define the best default settings for data retention in the Kafka module.
Describe alternatives you've considered
Check whether changing log_retention_bytes from the default -1 (unlimited) to a limit based on available disk space is a better option.
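For context, a minimal sizing sketch of how a disk-derived value for log_retention_bytes could be computed. The helper name, the 80% safety margin, and the example numbers are illustrative assumptions, not Epiphany defaults:

```python
# Hypothetical sizing helper: derive a per-partition retention cap from
# available disk space, leaving headroom for open segments and indexes.
# The 0.8 safety factor and all numbers below are illustrative.

def retention_bytes_per_partition(disk_bytes: int,
                                  partitions_per_broker: int,
                                  safety_factor: float = 0.8) -> int:
    usable = int(disk_bytes * safety_factor)
    return usable // partitions_per_broker

# Example: a 500 GiB data disk hosting 100 partition replicas
# yields roughly a 4 GiB cap per partition.
cap = retention_bytes_per_partition(500 * 1024**3, 100)
print(cap)  # 4294967296
```

Note that Kafka applies log.retention.bytes per partition, not per topic, which is why any disk-based cap has to account for the number of partition replicas hosted on the broker.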
Issue Analytics
- Created 4 years ago
- Comments: 6 (5 by maintainers)
Top Results From Across the Web
- Managing Kafka Offsets to Avoid Data Loss (LotusFlare): There are different variables that could cause data loss in Kafka, including data offsets, consumer auto-commit configuration, etc.
- Lessons Learned From Running Kafka at Datadog: Kafka's approach to segment-level retention. Kafka organizes data into partitions to support parallel processing and to distribute workload.
- 20 best practices for Apache Kafka at scale (New Relic): Apache Kafka simplifies working with data streams, but it might get complex at scale. Learn best practices to help simplify that complexity.
- Apache Kafka: Ten Best Practices to Optimize Your Deployment: Compaction is a process by which Kafka ensures retention of at least the last known value for each message key (within the log...).
- Documentation (Apache Kafka): Store streams of records in a fault-tolerant, durable way. What is different about Kafka is that it is a very good storage...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I've read the docs about Kafka data retention policies, and it seems that using Kafka's default values in Epiphany is the best option. There are two retention policies that can be configured at the broker or topic level: by time and by size. We use the same default value for size retention as Kafka does, -1, which means that by default the log size is unlimited (only a retention time is configured).
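To illustrate the topic-level counterparts of these broker settings, here is a minimal sketch assuming the kafka-python client, a broker at localhost:9092, and a hypothetical topic named events; the values are examples, not recommendations:

```python
# Sketch: override both retention policies for one topic, leaving the
# broker-wide defaults (log.retention.hours=168, log.retention.bytes=-1)
# untouched. The topic name and values are illustrative.
from kafka.admin import KafkaAdminClient, ConfigResource, ConfigResourceType

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

resource = ConfigResource(
    ConfigResourceType.TOPIC,
    "events",  # hypothetical topic
    configs={
        "retention.ms": str(7 * 24 * 60 * 60 * 1000),  # time policy: 7 days
        "retention.bytes": str(1024**3),               # size policy: 1 GiB per partition
    },
)
admin.alter_configs([resource])
admin.close()
```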
We could use a size-based retention policy, but if the user does not change it and someone starts spamming with a large volume of messages, old ones will be lost, which is no better than a disk overflow. In my opinion, the right way is to have Kafka/disk monitoring and let the user decide which config values should be defined.
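To make the monitoring idea concrete, here is an illustrative disk check; the /var/lib/kafka path and the 85% threshold are assumptions, and a real deployment would more likely rely on a monitoring stack than a script:

```python
# Illustrative disk check: warn before an unlimited log.retention.bytes
# fills the volume that holds the Kafka log directories.
import shutil

def kafka_disk_alert(log_dir: str = "/var/lib/kafka",
                     threshold: float = 0.85) -> bool:
    usage = shutil.disk_usage(log_dir)
    used_ratio = (usage.total - usage.free) / usage.total
    if used_ratio >= threshold:
        print(f"WARNING: {log_dir} is {used_ratio:.0%} full")
        return True
    return False

kafka_disk_alert()
```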
I agree with @atsikham. I would be in favor of the default settings as described in the official documentation, which is what we have now in Epiphany. The user can change these parameters by modifying the specification and overriding the original values. Disk monitoring could help solve possible problems here.