[SIP-33] Proposal for Removing SQLite Support for Metadata Databases
Motivation
Historically, SQLite has been used as a starting point for development for new contributors to Superset. While the need for an easy onramp still exists, I believe that it is fully covered by the new Docker Compose file contributed by Craig Rueda. Deprecating SQLite will encourage docker-compose as the primary starting point for new contributors to the project. SQLite support causes multiple pain points for developers, including bad migration practices with unnecessary batch migrations, and database schema drift due to SQLite’s inability to alter constraints on existing tables. Issues filed on GitHub, referenced below, indicate that SQLite is in some instances being used in a production-like environment. This is a strong anti-pattern that we would like to avoid, as it makes supporting Superset in the community more difficult.
Proposed Change
Immediate:
- Add logging to indicate SQLite’s deprecation in the next version release
- Add release notes in the upgrading.md file relating to deprecation
- Announce deprecation on the mailing lists and in Slack
- Update documentation to recommend Docker Compose with Postgres for new users
In 2 minor versions:
- Update configuration to use a Postgres database by default
- Update the build matrix to stop building Superset against SQLite
- Update Cypress test configuration to use Postgres instead of SQLite
- Add release notes in the upgrading.md file relating to removal
- Announce removal on the mailing lists and in Slack
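As a rough illustration of the first immediate step (logging the deprecation), the check could look something like the sketch below. It assumes the metadata connection string is a SQLAlchemy-style URI, as configured through `SQLALCHEMY_DATABASE_URI` in Superset. The helper names `is_sqlite_uri` and `warn_if_sqlite` are hypothetical and are not part of Superset.

```python
import logging

logger = logging.getLogger(__name__)


def is_sqlite_uri(uri: str) -> bool:
    """Return True if a SQLAlchemy-style URI targets SQLite.

    Hypothetical helper: only inspects the scheme, e.g.
    "sqlite:///x.db" -> "sqlite", "postgresql+psycopg2://..." -> "postgresql".
    """
    backend = uri.split(":", 1)[0].split("+", 1)[0]
    return backend.lower() == "sqlite"


def warn_if_sqlite(uri: str) -> bool:
    """Log a deprecation warning for SQLite metadata databases.

    Returns True when a warning was emitted, False otherwise.
    """
    if is_sqlite_uri(uri):
        logger.warning(
            "SQLite as a metadata database is deprecated and will be "
            "removed in a future release; please migrate to Postgres or MySQL."
        )
        return True
    return False
```

A call such as `warn_if_sqlite("sqlite:////tmp/superset.db")` would log the warning, while a Postgres URI would pass silently. Where such a check would live in the application startup is left open here.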
New or Changed Public Interfaces
Support for SQLite as a metadata database will be officially removed. SQLite may still be usable, but it will not be supported in future Alembic migrations.
Migration Plan and Compatibility
Users who leverage SQLite in production will be required to migrate their data to a different supported metadata database, either MySQL or Postgres. This can be accomplished via third-party tools.
Rejected Alternatives
The main alternative is to not deprecate SQLite and to keep it in the build matrix. This is really a binary decision.
Top GitHub Comments
The above thread says:
Current Apache Superset documentation still lists SQLite as the default metadata database.
But in reality, Apache Superset already doesn’t work with SQLite. We can’t get started with Apache Superset + SQLite as the metadata store today (2021-11-25), using Apache Superset version 1.3.2.
I have raised a new issue in this regard. Please comment there if you are also facing this issue.
The most amazing feature of Apache Superset was how easy it was to get started quickly, and SQLite as the metadata store helped with this.
In production, if someone uses SQLite, I believe that is their responsibility. Drawing a parallel: if someone in production has "password", or some other easy-to-hack password, as the Superset admin password, are we going to remove support for all common hackable passwords in Apache Superset?
For small, simple applications, SQLite was super useful as a Superset metadata store. Many applications start small and grow big. I use Superset with SQLite as the metadata database in all my personal data science projects, experimental applications, etc., until the application justifies its existence.
As @xinbinhuang pointed out, Alembic supports both SQLite and non-SQLite flows. There is a small additional step that we need to take when writing database migration scripts.
I tried finding an alternative to SQLite. In the Python + SQLAlchemy world, there is no good alternative to SQLite, as far as I know. H2 and HyperSQL are Java-based and provide only JDBC drivers. SQLAlchemy dialects that bridge Python to JDBC drivers were not mature enough as of November 2021.
Hi @willbarrett,
I am pretty new to Superset, so my comment may not be accurate, but regarding the batch migrations:
This should not have an actual impact on the migration process. The method batch_alter_table only does an actual batch migration on SQLite, while keeping the normal ALTER operations on other databases (see the Alembic batch documentation).
As for SQLite in production, I am neither for nor against it, as it does give you an easy deployment for a small user base. However, as we are living in a world of containerization, deployment is not an issue either.
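To make the batch-migration discussion concrete, here is a minimal sketch of the problem that batch mode works around, using only Python's standard `sqlite3` module rather than Alembic itself. SQLite rejects adding a constraint to an existing table, so the table has to be rebuilt: create a new table with the constraint, copy the rows, drop the old table, and rename. This is a hand-written approximation of that copy-and-move approach, not Alembic's actual implementation. The table name `dashboards`, the column `slug`, and the function name are hypothetical.

```python
import sqlite3


def add_unique_slug_by_rebuild(conn: sqlite3.Connection) -> None:
    """Rebuild the hypothetical `dashboards` table with UNIQUE(slug)."""
    with conn:  # commits at the end of the block
        conn.execute(
            "CREATE TABLE _tmp_dashboards ("
            "id INTEGER PRIMARY KEY, slug TEXT, UNIQUE (slug))"
        )
        conn.execute(
            "INSERT INTO _tmp_dashboards (id, slug) "
            "SELECT id, slug FROM dashboards"
        )
        conn.execute("DROP TABLE dashboards")
        conn.execute("ALTER TABLE _tmp_dashboards RENAME TO dashboards")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dashboards (id INTEGER PRIMARY KEY, slug TEXT)")
conn.execute("INSERT INTO dashboards (slug) VALUES ('sales')")
conn.commit()

# A direct ALTER that would work on Postgres or MySQL fails on SQLite.
try:
    conn.execute("ALTER TABLE dashboards ADD CONSTRAINT uq_slug UNIQUE (slug)")
    direct_alter_ok = True
except sqlite3.OperationalError:
    direct_alter_ok = False

# Fall back to rebuilding the table, which is the SQLite-only code path.
add_unique_slug_by_rebuild(conn)
```

On Postgres or MySQL the single `ALTER TABLE` statement would be enough, which is why the rebuild path only matters for SQLite and why migrations written solely for other databases may not run on it.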