[BUG] `pragma synchronous=OFF` causes database corruption on crash
Version 1.2.8/9 began issuing `pragma synchronous=OFF` every time a connection to the sqlite database is opened. This is unsafe if the machine crashes and will result in users having to resync the blockchain/wallet from scratch. A resync can cost users days of farming time while they wait for the database to rebuild, wastes network bandwidth, and may eat into a capped data plan. The sqlite page on pragmas details why this is unsafe (sorry, you'll have to search for `synchronous` as I can't link directly to it). Namely, with `synchronous=OFF` sqlite does not wait for the kernel to confirm that data is persisted, only that it has been submitted to the kernel.
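For reference, the reported setting amounts to something like the following. This is a minimal illustration, not chia's actual code; an in-memory database stands in for the real db file:

```python
import sqlite3

# Minimal illustration of the reported setting (not chia's actual code;
# an in-memory database stands in for the real db file).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA synchronous=OFF")  # no fsync on commit: fast but unsafe

# sqlite reports the level back as an integer: 0=OFF, 1=NORMAL, 2=FULL.
level = conn.execute("PRAGMA synchronous").fetchone()[0]
print(level)  # -> 0
conn.close()
```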
To Reproduce
- launch any chia component that writes data to the sqlite database (full node, wallet), preferably something that needs to sync
- wait for chia to begin syncing data
- randomly unplug/hard shutdown the machine without waiting for chia to properly stop
- restart chia, database may be reported as corrupted
As the above depends on the precise timing of when the machine was powered off/crashed relative to I/O submitted by sqlite, this may require several attempts to see a corrupted database.
Expected behavior
Database should not become corrupted if the machine crashes. Users should not have to resync the entire blockchain and their wallet if their machine crashes.
Desktop
Any machine that writes data to any sort of storage device
Additional context
While this doesn't directly explain the numerous database corruption issues seen by people recently (including some in #8694, as those were not caused by a computer crash), there's a good chance that either this, or this combined with some transaction management issue, could cause database corruption if chia is stopped improperly.
The changelog for 1.2.8 indicates that this pragma change was added to improve disk performance. If faster transaction commits are desired, multiple operations should be batched into a single transaction instead of completely disabling the crash consistency mechanism of the database. Since sqlite is run in WAL mode, sqlite could be set to use `pragma synchronous=NORMAL`, which the documentation indicates provides sufficient crash consistency while being faster than `pragma synchronous=FULL`.
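A sketch of that safer configuration, assuming a plain `sqlite3` connection (the file path and connection handling here are illustrative, not chia's actual code):

```python
import os
import sqlite3
import tempfile

# Illustrative setup; the path and connection handling are assumptions,
# not chia's actual code. WAL mode requires an on-disk database.
path = os.path.join(tempfile.mkdtemp(), "node.sqlite")
conn = sqlite3.connect(path)

# In WAL mode, synchronous=NORMAL only syncs the WAL at checkpoints:
# a crash may lose the most recent commits, but the database file
# itself stays consistent, unlike with synchronous=OFF.
assert conn.execute("PRAGMA journal_mode=WAL").fetchone()[0] == "wal"
conn.execute("PRAGMA synchronous=NORMAL")
print(conn.execute("PRAGMA synchronous").fetchone()[0])  # -> 1 (NORMAL)
conn.close()
```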
Issue Analytics
- State:
- Created: 2 years ago
- Reactions: 6
- Comments: 35 (9 by maintainers)
Top GitHub Comments
“Catastrophic” is a very subjective view. Sure, the data can be fetched from the blockchain again, but that ignores the time lost actually farming and attempting to either submit blocks to the chain or submit partials on a pool. A pooler in the pool I'm a part of had their database trashed and missed out on a lucky day yesterday; I'd say that's unpleasant/unfortunate at best. The whole "just resync the chain again" also ignores the fact that the database is big and will keep getting bigger. I don't want to have to download 10s-100s of GB of blockchain data on a semi-regular basis just because something went wrong with my power for a second.

As for low-end drives having extra caching, sqlite is built to take care of most of that already. The WAL in sqlite is a standard way to handle crash consistency on devices whose atomic powerfail write unit is smaller than the total data that needs to be written for a transaction. There are `fsync`s in sqlite to ensure the WAL is actually persisted before it starts mucking with the actual transaction execution, so that it can recover if things go wrong. The only time this would fail on a low-end device is if the USB controller flat-out lied about doing an `fsync`. In that case, the person should probably get a better device, as it's not just sqlite that would have problems in the event of a crash. Drives themselves have gotten much better at actually honoring the low-level commands that `fsync` triggers, so if there's a problem, it's going to be the USB controller, not the drive itself.

For problematic setups, I think a simple question would be: is `synchronous=OFF` safe when Windows Update decides to restart a machine randomly? Apart from that, you'll end up with a lot of disparate answers, as something as simple as a BSOD or power failure can break the database. There does not seem to be one single "right" hardware that people in the community use to run chia, so I don't think there will be many patterns in this.

As far as heuristics, the bulk of it could probably be covered by running with `synchronous=OFF` until 1/2 or 3/4 of the blockchain is synced, then switching to `synchronous=NORMAL`. `synchronous=OFF` is most beneficial when there are lots of transactions going into the database anyway, which is mostly the case during the initial sync. If normal operation also causes load, then like I said earlier, other routes like better transaction management should be used to increase performance.

With the technical stuff aside, this should absolutely be a config option if the chia team wants to keep pushing for the use of `synchronous=OFF`, so that those who don't want to worry about it don't have to patch the codebase every version. From a UX perspective, the default should be some crash-consistent option like `synchronous=NORMAL`, because many people 1. won't change the defaults, 2. probably won't have a good enough understanding of the intimate workings of storage to know why `synchronous=OFF` is unsafe, and 3. will be very unhappy/may leave the project if their database files appear to randomly corrupt themselves. All of the above are the same reasons why all semi-recent major file systems run with some sort of crash consistency mechanism by default. It turns out that people are really unhappy when they have to run `fsck` on their multi-TB drive due to a crash, or when they find they can no longer boot their machine after one.

As a user, I think that sounds great! My humble suggestion would be to default to the safest option (the SQLite docs seem to indicate 'normal' would be totally safe), whilst allowing advanced users to 'take off the seatbelt' if they choose to go faster (perhaps in lieu of, or complementing, an 'auto' setting). That way the risk is acknowledged and adverse outcomes can be expected.
On the general performance front, there may be an opportunity to SQL-tune the INSERT statements for significant gains, which could make up the difference too. Some preliminary glances at the coin_store.py code show individual INSERT statements being executed; a batch of those could (should?) be wrapped into a single transaction and executed instead. There is some in-depth analysis here on SQLite INSERT performance tricks, but that's probably best dropped into a different GH issue altogether.
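To make the batching point concrete, here is a rough sketch of the two approaches. The table and column names are invented for illustration and do not match coin_store.py's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented schema for illustration; not coin_store.py's actual table.
conn.execute("CREATE TABLE coin_record (name TEXT PRIMARY KEY, amount INT)")
rows = [(f"coin{i}", i) for i in range(10_000)]

# Slow path: one INSERT per commit, so every row pays the full
# journal/commit overhead on its own.
for name, amount in rows[:100]:
    conn.execute("INSERT INTO coin_record VALUES (?, ?)", (name, amount))
    conn.commit()

# Batched path: many INSERTs share a single transaction and a single
# commit, which is where most of the speedup that synchronous=OFF
# provides actually comes from.
with conn:  # opens a transaction, commits on exit
    conn.executemany("INSERT INTO coin_record VALUES (?, ?)", rows[100:])

print(conn.execute("SELECT COUNT(*) FROM coin_record").fetchone()[0])  # -> 10000
conn.close()
```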