Shall we move from pickles to SQLite3 for local data storage?

We’re currently running all kinds of local storage on Python pickles. While there haven’t been significant problems with them, there’s some room for improvement.

I came up with the idea of employing a lightweight, easy-to-use DB (I vote for SQLite 3). This way we control the local storage better, since there’s only one .db file. SQLite 3 is also easy to set up - no setup needed, actually 😃.

We aren’t going to make it too complex - one table may contain only 3 or so columns. As I initially thought, we may be using only SELECT, INSERT, UPDATE and DELETE. Smokey’s data isn’t complex at all and doesn’t need JOIN or similar queries, which makes this easy for people with virtually zero DB knowledge to maintain (I’m one of them).
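
As a rough illustration of how small that surface could be, here is a minimal sketch using Python's standard `sqlite3` module. The table name `blacklisted_users`, its three columns, and all function names are assumptions made up for this example; they are not taken from the db branch.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the (hypothetical) table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blacklisted_users ("
        "user_id INTEGER NOT NULL, "
        "site TEXT NOT NULL, "
        "reason TEXT, "
        "PRIMARY KEY (user_id, site))"
    )
    return conn


def add_user(conn, user_id, site, reason):
    # "with conn" commits on success and rolls back on an exception.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO blacklisted_users VALUES (?, ?, ?)",
            (user_id, site, reason),
        )


def is_blacklisted(conn, user_id, site):
    row = conn.execute(
        "SELECT 1 FROM blacklisted_users WHERE user_id = ? AND site = ?",
        (user_id, site),
    ).fetchone()
    return row is not None


def update_reason(conn, user_id, site, reason):
    with conn:
        conn.execute(
            "UPDATE blacklisted_users SET reason = ? WHERE user_id = ? AND site = ?",
            (reason, user_id, site),
        )


def remove_user(conn, user_id, site):
    with conn:
        conn.execute(
            "DELETE FROM blacklisted_users WHERE user_id = ? AND site = ?",
            (user_id, site),
        )
```

Every query uses `?` placeholders, so values are never spliced into the SQL string by hand.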

Benefits we’re getting:

  • Less RAM usage
  • Easier data inspection and maintenance (use sqlite3 CLI tool)
  • Easier backup, migration, adding new stored content (like logs), and intra-instance transfer
  • Potentially less disk I/O (dubious)
  • No more pickling/unpickling errors, unless the whole .db file is corrupt, which happens far less often than a pickle file getting corrupted
  • Potentially easier migration to Helios (if it’s still alive)
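
To make the backup and inspection points concrete, here is a small sketch built on the standard `sqlite3` module. The function names are hypothetical, and `Connection.backup` requires Python 3.7 or newer, which is an assumption about the runtime.

```python
import sqlite3


def backup_database(src_path, dest_path):
    """Copy a SQLite database file to dest_path via the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        with dest:
            src.backup(dest)  # available since Python 3.7
    finally:
        dest.close()
        src.close()


def list_tables(db_path):
    """Return the table names stored in a database file, for quick inspection."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

The same information is available interactively from the `sqlite3` CLI tool with its `.tables` and `.schema` commands.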

What we’re paying for this:

  • More frequent disk I/O
  • Potential difficulty in expanding columns
  • Potential performance degradation (disk is always slower than RAM). I don’t think this one will matter much - the majority of CPU time is spent running regexes, and the majority of idle time is spent waiting for network responses. On a server with a slow disk this could be an issue; otherwise, not much.

An early draft is in the db branch, which contains the infrastructure as well as the blacklisted users migrated from pickles to the DB. CI passes and it can be safely merged now.
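
For readers who have not seen such a migration, a one-off conversion could look roughly like the sketch below. It is not the code from the db branch: it assumes the pickle holds a list of `(user_id, site)` tuples, and the table and function names are hypothetical.

```python
import pickle
import sqlite3


def migrate_pickle(pickle_path, conn):
    """Load (user_id, site) tuples from a pickle and insert them into SQLite.

    Assumption: the pickle contains a list of 2-tuples. Only unpickle files
    you trust, since pickle can execute arbitrary code while loading.
    """
    with open(pickle_path, "rb") as f:
        entries = pickle.load(f)

    with conn:  # one transaction for the whole migration
        conn.execute(
            "CREATE TABLE IF NOT EXISTS blacklisted_users ("
            "user_id TEXT NOT NULL, site TEXT NOT NULL, "
            "PRIMARY KEY (user_id, site))"
        )
        # INSERT OR IGNORE skips duplicates, so rerunning the migration is harmless.
        conn.executemany(
            "INSERT OR IGNORE INTO blacklisted_users (user_id, site) VALUES (?, ?)",
            entries,
        )
    return conn.execute("SELECT COUNT(*) FROM blacklisted_users").fetchone()[0]
```

Running the whole migration in a single transaction means a failure halfway through leaves the database unchanged.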

Is it a good idea?

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 14 (14 by maintainers)

Top GitHub Comments

1 reaction
tripleee commented, Nov 15, 2018

To the extent that this adds benefits, those seem to be overshadowed by the benefits that Helios would bring. So let’s try to revive the Helios branch instead, shall we?

1 reaction
AWegnerGitHub commented, Nov 9, 2018

Chiming in on the Helios points:

I’d love to move over to Helios. It’s been close to a year since the initial framework was set up, and that’s where it (mostly) died. Interest in converting to it seems to have come to a halt. I have no problem with this - it is a rather large change (and even more so now that my Smokey branch is a year out of date), it requires changes to both Smokey and Metasmoke, and it effectively adds a third peg on which the entire solution stands.

If we aren’t moving to a cloud-based database (a.k.a. Helios), I am fully in support of moving away from pickles (and config data, but I already argued that point too). @Undo1 is right that there are some synchronization issues, but they are going to be related to the config data I mentioned. In terms of keeping the blacklists in sync, I don’t think we’d see anything different than we do today.

@iBug Something that would be helpful to see is a comparison in terms of file sizes. SQLite has the VACUUM command that can reclaim space, but I don’t know how well it works, especially after adding/removing “a lot” (trademark pending) of entries. How does this single database compare in terms of on-disk usage, and in terms of memory usage when loading data into memory?
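
One way to get a feel for the on-disk side of that question is a throwaway measurement like the sketch below. The table layout, row count, and payload size are arbitrary assumptions, and it does not measure memory usage at all.

```python
import os
import sqlite3


def measure_vacuum(db_path, n_rows=10000):
    """Return file sizes (bytes) after filling, after deleting, and after VACUUM.

    db_path must point to a file that does not exist yet.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, payload TEXT)")

    with conn:  # commit the bulk insert
        conn.executemany(
            "INSERT INTO entries (payload) VALUES (?)",
            (("x" * 100,) for _ in range(n_rows)),
        )
    size_full = os.path.getsize(db_path)

    with conn:  # commit the bulk delete; freed pages stay in the file
        conn.execute("DELETE FROM entries")
    size_after_delete = os.path.getsize(db_path)

    # VACUUM rebuilds the file and must run outside a transaction.
    conn.execute("VACUUM")
    size_after_vacuum = os.path.getsize(db_path)

    conn.close()
    return size_full, size_after_delete, size_after_vacuum
```

With SQLite's default settings (auto-vacuum off), deleting rows generally leaves the file size unchanged until `VACUUM` runs, so the middle number is the one to watch after heavy add/remove churn.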
