Retention policy is applied 1 hour earlier
Describe the bug
As @Tho-Mat pointed out in https://github.com/corona-warn-app/cwa-server/issues/699#issuecomment-671358818, the retention policy is applied 1 hour earlier than it should be:
> I also noticed that the index for (for example) the hour file
> 2020-07-26-hour-06.zip created 26.07.2020 09:05 (German Time)
> was removed at 09.08.2020 08:05 (German Time)
>
> 2020-07-26-hour-08.zip created 26.07.2020 11:05 (German Time)
> was removed at 09.08.2020 10:05 (German Time)
>
> Shouldn't they be removed one hour later, so that they stay on the server for 14 days?
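
The gap in the quoted timestamps can be checked with a little date arithmetic. This is only a minimal Python sketch: the variable names are illustrative, and it assumes both timestamps are in the same local time zone (both dates fall within German summer time, so naive datetimes are enough here).

```python
from datetime import datetime, timedelta

# Timestamps quoted above for 2020-07-26-hour-06.zip (German time).
created = datetime(2020, 7, 26, 9, 5)
removed = datetime(2020, 8, 9, 8, 5)

# Retention period stated under "Expected behavior".
expected_retention = timedelta(days=14)

actual_retention = removed - created
shortfall = expected_retention - actual_retention

print(actual_retention)  # 13 days, 23:00:00
print(shortfall)         # 1:00:00
```

The file was available for 13 days and 23 hours, one hour short of the expected 14 days.
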
Expected behavior
Retention policy should apply after 14 days.
Update from our side: release 1.4 is currently planned for 30.09.2020.
I figured out what is going on. Your analysis is correct, the index file deletes the earliest hour one hour too early.
The problem we're facing is caused by the retention policy of the diagnosis keys.
To give an overview of what happens and why:
During the retention policy step, we delete all keys from the database that match the query `submission_timestamp <= threshold`, where `threshold` is `now - retention_policy`. This means that if we run the retention policy now (14.08 at 14:00), it will delete any entries before or equal to 31.07 14:00. Keys generated for 31.07 14:00 will therefore be deleted, which is not correct, since we are now distributing keys generated between 13:00 and 14:00 today.

To illustrate this better, we can assume a scenario with a single-day retention policy, running distribution on day 2 at 8 am.
Wrong behavior
This is wrong because we are only holding 23 hours worth of data. Hour 7 is created at 8:00 but removed at 7:00 the next day.
Correct behavior
We should delete everything before 8 am (not inclusive), to ensure that we always have 24 hours of keys in the system.
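
The difference between the two comparisons can be sketched in a few lines of Python. This is an illustration of the single-day scenario above, not the actual cwa-server code: the function name `retained_hours`, the representation of submission timestamps as whole hours counted from day 1 at 0:00, and the constants are all assumptions made for the example.

```python
def retained_hours(submission_hours, now_hour, retention_hours, inclusive):
    """Return the submission hours that survive the retention step.

    inclusive=True  deletes entries with submission_hour <= threshold (buggy).
    inclusive=False deletes entries with submission_hour <  threshold (fixed).
    """
    threshold = now_hour - retention_hours
    if inclusive:
        return [h for h in submission_hours if h > threshold]
    return [h for h in submission_hours if h >= threshold]


# Hypothetical scenario: retention of one day, retention step runs on day 2 at 8 am.
RETENTION_HOURS = 24
DAY_2_8AM = 24 + 8

# Every completed hour so far, counted from day 1 at 0:00 (hours 0..31).
completed_hours = list(range(DAY_2_8AM))

buggy = retained_hours(completed_hours, DAY_2_8AM, RETENTION_HOURS, inclusive=True)
fixed = retained_hours(completed_hours, DAY_2_8AM, RETENTION_HOURS, inclusive=False)

print(len(buggy), buggy[0])  # 23 9
print(len(fixed), fixed[0])  # 24 8
```

With the inclusive comparison, the hour at the threshold itself (8 am on day 1) is deleted, leaving only 23 hours of keys. With the exclusive comparison, that hour is kept and the system holds the full 24 hours.
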
There's already a fix for that; I will open the PR in a couple of minutes. @Tho-Mat, I would also appreciate your comments there. Thanks again for your help investigating and explaining this issue.