Unusually large snapshots size

CrateDB version: v3.3.4

Environment description:

JVM version:
openjdk version “11.0.4” 2019-07-16
OpenJDK Runtime Environment (build 11.0.4+11-post-Ubuntu-1ubuntu218.04.3)
OpenJDK 64-Bit Server VM (build 11.0.4+11-post-Ubuntu-1ubuntu218.04.3, mixed mode, sharing)

Kernel: Linux chiphub 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Distribution: 18.04.3 LTS (Bionic Beaver)

Problem description:

I have set up a Python script that creates a CrateDB snapshot every day at noon.

The query I ran to initially set up the repository is:
CREATE REPOSITORY repo_name TYPE FS WITH (LOCATION='/path/to/folder', compress=true);

The query I run every day to create the snapshot is:
CREATE SNAPSHOT repo_name.{} ALL WITH (wait_for_completion=true, ignore_unavailable=true);

On the initial run, the snapshot directory was the same size as the database (30GB). After about a month, the database has grown to 40GB, while the snapshot directory has grown to ~120GB (almost three times the size of the database!). Is this normal? If so, are there any options or optimizations I can try to reduce the size of the snapshots?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
kunz07 commented, Jan 17, 2020

Hey @marregui @mfussenegger,

Apologies for the delayed response. Thanks a lot for your input. As suggested, I’ll fine-tune my Python script to delete the prior snapshots once a new one is created and see how it goes. I will reopen the issue in case I run into the problem again.

Regards, Kunal

1 reaction
mfussenegger commented, Jan 15, 2020

To add to what @marregui already mentioned:

Snapshots are incremental. If you keep all of them, you can also restore an earlier snapshot, which is why the repository has to keep the older data around.

Instead of creating a new repository, an alternative is to use DROP SNAPSHOT on older snapshots, so that files that become unreferenced can be cleaned up.
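
For anyone scripting this, here is a minimal Python sketch of the DROP SNAPSHOT approach: keep the newest few snapshots and drop the rest. It is not taken from the thread. The function names, the `keep` parameter, and the quoting helper are hypothetical, and it assumes a DB-API style cursor (such as the one from the `crate` Python client, which uses `?` placeholders), that `sys.snapshots` exposes `name`, `started`, and `repository`, and that DROP SNAPSHOT accepts double-quoted identifiers.

```python
from typing import Iterable, List, Sequence


def snapshots_to_drop(snapshots: Iterable[Sequence], keep: int) -> List[str]:
    """Return the names of all but the `keep` most recently started snapshots.

    `snapshots` holds (name, started) pairs, for example rows selected
    from sys.snapshots. `started` only needs to be sortable.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    newest_first = sorted(snapshots, key=lambda row: row[1], reverse=True)
    return [row[0] for row in newest_first[keep:]]


def quote_ident(name: str) -> str:
    """Double-quote an identifier (hypothetical helper, assumes SQL-style quoting)."""
    return '"' + name.replace('"', '""') + '"'


def drop_statements(repository: str, names: Iterable[str]) -> List[str]:
    """Build one DROP SNAPSHOT statement per snapshot name."""
    return [
        f"DROP SNAPSHOT {quote_ident(repository)}.{quote_ident(name)}"
        for name in names
    ]


def prune_snapshots(cursor, repository: str, keep: int = 7) -> List[str]:
    """Drop all but the newest `keep` snapshots; return the statements that ran."""
    cursor.execute(
        "SELECT name, started FROM sys.snapshots WHERE repository = ?",
        (repository,),
    )
    statements = drop_statements(
        repository, snapshots_to_drop(cursor.fetchall(), keep)
    )
    for statement in statements:
        cursor.execute(statement)
    return statements
```

Calling `prune_snapshots(cursor, "repo_name", keep=7)` right after the daily CREATE SNAPSHOT would keep roughly a week of restore points. The value 7 is only an illustration; choose it based on how far back you need to be able to restore.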
