[SIP-27] Proposal for Paranoid Deletes

[SIP] Proposal for Paranoid Deletes

Motivation

At Airbnb we have a vast number of entities housed within Superset. Our deployment has tens of thousands of charts (both manually and procedurally generated), thousands of dashboards, and tens of thousands of registered datasources and tables (both physical and virtual).

In a recent analysis of a specific Druid NoSQL (native) cluster, only 34% of a sample of ~5k charts rendered, i.e., returned a 200 status code from the /superset/slice_json route.

The following chart shows the renderability of charts as a function of when they were last saved. It shows that a chart’s viability often decays over time due to drift in the datasource metadata and the saved chart parameters.

[Figure: chart renderability as a function of last-saved date]

Ideally we would like to have a mechanism to clean up obsolete resources (charts, dashboards, or datasources) in a somewhat paranoid manner, i.e., using soft deletes. This should help keep our deployment at a manageable size and improve the perceived reliability and quality (from a usability standpoint) of Superset assets.

Proposed Change

The proposed solution was originally mentioned by @etr2460, but I thought it was worthwhile formalizing this as a SIP. This borrows an idea from Ruby where we first soft delete records by marking them as deleted (with an associated timestamp) before performing a hard delete (deleting the record n days later). Users could be notified that their charts are being deleted, and they could take corrective action to undelete them if they see fit.

There’s actually a Python package sqla-paranoid which brings transparent soft deletes to SQLAlchemy which we could use or replicate. The TL;DR is this would add a deleted_at (or deleted_on for consistency) column which would track soft deleted records. Records which are soft deleted wouldn’t show up in the CRUD views by default unless the filter was enabled (not unlike how SQL Lab Views are ignored by default in the tablemodelview).
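A minimal sketch of the soft-delete column and default filtering described above. It uses the standard library's sqlite3 rather than SQLAlchemy or sqla-paranoid so that it stays self-contained; the toy `slices` table and the helpers `soft_delete` and `list_slices` are assumptions for illustration, not Superset or sqla-paranoid APIs.

```python
import sqlite3
from datetime import datetime, timezone

# Toy schema: a nullable deleted_at column marks soft deleted rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE slices (id INTEGER PRIMARY KEY, slice_name TEXT, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO slices (slice_name) VALUES (?)",
    [("Revenue",), ("Signups",)],
)


def soft_delete(conn, slice_id, now=None):
    """Mark a row as deleted by stamping deleted_at; the row itself is kept."""
    now = now or datetime.now(timezone.utc)
    conn.execute(
        "UPDATE slices SET deleted_at = ? WHERE id = ?",
        (now.isoformat(), slice_id),
    )


def list_slices(conn, include_deleted=False):
    """List chart names, hiding soft deleted rows unless the filter is enabled."""
    sql = "SELECT slice_name FROM slices"
    if not include_deleted:
        sql += " WHERE deleted_at IS NULL"
    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


soft_delete(conn, 1)
visible = list_slices(conn)
everything = list_slices(conn, include_deleted=True)
```

Undeleting a record in this scheme would just mean setting `deleted_at` back to NULL.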

Records could be marked as deletable by a hook, trigger, or cron job, based on various criteria applied in a cascading order:

Charts

  • Consistently† returns an error.
  • Consistently† returns no data.
  • Has not been viewed for n days.

† Note this could leverage a cron job (or similar) and be based on customizable rules, e.g., failing on x of the last n days.
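As a rough sketch of the "x of the last n days" rule mentioned in the footnote, assuming a cron job has already recorded one pass/fail result per day for each chart. The function name `consistently_failing` and the input format are hypothetical.

```python
from typing import Sequence


def consistently_failing(daily_ok: Sequence[bool], x: int, n: int) -> bool:
    """Return True if at least x of the most recent n daily checks failed.

    daily_ok is ordered oldest to newest; True means the chart rendered.
    """
    if n <= 0 or x <= 0:
        raise ValueError("x and n must be positive")
    window = daily_ok[-n:]
    if len(window) < n:
        # Not enough history yet, so do not flag the chart.
        return False
    failures = sum(1 for ok in window if not ok)
    return failures >= x


history = [True, False, False, True, False, False, False]
flag_4_of_5 = consistently_failing(history, x=4, n=5)
flag_5_of_5 = consistently_failing(history, x=5, n=5)
flag_short = consistently_failing([False, False], x=2, n=5)
```

Treating a short history as "not failing" is a deliberately conservative choice here; a deployment could just as well choose the opposite.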

Dashboards

  • Contains no charts.

Tables/Datasources

  • Not referenced by any charts.
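One way to read the cascade in the criteria above: once charts are soft deleted, dashboards left with no live charts and datasources no longer referenced by a live chart become candidates too. The sketch below assumes simple in-memory dictionaries; the field names (`datasource_id`, `chart_ids`, `deleted`) are illustrative and not Superset's actual model.

```python
def find_orphans(charts, dashboards, datasource_ids):
    """Return (empty dashboard ids, unreferenced datasource ids)."""
    live_charts = [c for c in charts if not c["deleted"]]
    live_chart_ids = {c["id"] for c in live_charts}
    used_datasources = {c["datasource_id"] for c in live_charts}

    # A dashboard is a candidate when none of its charts are still live.
    empty_dashboards = sorted(
        d["id"] for d in dashboards if not live_chart_ids & set(d["chart_ids"])
    )
    # A datasource is a candidate when no live chart references it.
    unused_datasources = sorted(
        ds for ds in datasource_ids if ds not in used_datasources
    )
    return empty_dashboards, unused_datasources


charts = [
    {"id": 1, "datasource_id": 10, "deleted": True},
    {"id": 2, "datasource_id": 11, "deleted": False},
]
dashboards = [
    {"id": 100, "chart_ids": [1]},
    {"id": 101, "chart_ids": [1, 2]},
    {"id": 102, "chart_ids": []},
]
empty_dashboards, unused_datasources = find_orphans(charts, dashboards, [10, 11, 12])
```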

New or Changed Public Interfaces

We would need to update the data model and leverage sqla-paranoid (or similar) to enable soft deletes. We would also need to update the FAB views to handle filtering/exclusion of soft deleted records. Finally, we would need to implement triggers or similar to i) soft delete records, and ii) hard delete records.
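The hard-delete step could look roughly like the following periodic purge, which removes rows that were soft deleted more than a retention window ago. This is a sketch using sqlite3 and a toy `slices` table; the function `purge_expired` and the 30-day window are assumptions, not part of the proposal.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_expired(conn, retention_days, now):
    """Hard delete rows that were soft deleted more than retention_days ago."""
    # ISO-8601 strings written in one timezone and format compare chronologically.
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cur = conn.execute(
        "DELETE FROM slices WHERE deleted_at IS NOT NULL AND deleted_at < ?",
        (cutoff,),
    )
    return cur.rowcount


now = datetime(2019, 12, 1, tzinfo=timezone.utc)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slices (id INTEGER PRIMARY KEY, deleted_at TEXT)")
conn.executemany(
    "INSERT INTO slices (id, deleted_at) VALUES (?, ?)",
    [
        (1, (now - timedelta(days=61)).isoformat()),  # past the window
        (2, (now - timedelta(days=6)).isoformat()),   # still recoverable
        (3, None),                                    # never deleted
    ],
)
purged = purge_expired(conn, retention_days=30, now=now)
remaining = [row[0] for row in conn.execute("SELECT id FROM slices ORDER BY id")]
```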

New dependencies

The only new dependency would be sqla-paranoid (no public license) if we decided not to write this ourselves. Note the package only contains several hundred lines of code.

Migration Plan and Compatibility

We would need to update the schema to include the deleted_at (or deleted_on) column for certain tables. Note I think we only need this for charts, dashboards, and datasources (the cascade deletes should handle the cleanup of columns and metrics).

Rejected Alternatives

None.

to: @etr2460 @mistercrunch @villebro @willbarrett cc: @vylc

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 6
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

1 reaction
junlincc commented, Mar 17, 2021

Thanks both, @etr2460 and @bkyryliuk! To summarize:

  • Superset currently supports only hard deletes of entities, which needs to change to soft deletes
  • Entities include datasets, charts, dashboards, and saved queries
  • Soft deleted records will not appear in search results or in list views, but they are not permanently deleted until a set period of time (the time to live) has passed
  • Only a system admin can set the time to live, at the database level, for users
  • Soft deleted records are temporarily stored in a trash can
  • Soft deleted records can be restored from the trash can by both admins and users
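To make the restore rule in the summary concrete, here is a small sketch of restoring a record from the trash can only while its time to live has not elapsed. The `Record` class and `restore` function are hypothetical names for illustration, and permission checks are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Record:
    name: str
    deleted_at: Optional[datetime] = None  # None means the record is live


def restore(record: Record, ttl: timedelta, now: datetime) -> bool:
    """Clear deleted_at if the record is in the trash and still within its TTL."""
    if record.deleted_at is None:
        return False  # nothing to restore
    if now - record.deleted_at > ttl:
        return False  # TTL elapsed; the record is due for hard deletion
    record.deleted_at = None
    return True


now = datetime(2021, 3, 17, tzinfo=timezone.utc)
ttl = timedelta(days=30)
recent = Record("Sales dashboard", deleted_at=now - timedelta(days=5))
stale = Record("Old chart", deleted_at=now - timedelta(days=45))
recent_restored = restore(recent, ttl, now)
stale_restored = restore(stale, ttl, now)
```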

I’m not very familiar with role permissions in Superset tbh. From reading the documentation, it seems like we should change the logic of can_delete and add something like can_restore?

Model & Action: models are entities like Dashboard, Slice, or User. Each model has a fixed set of permissions, like can_edit, can_show, can_delete, can_list, can_add, and so on. For example, you can allow a user to delete dashboards by adding can_delete on Dashboard entity to a role and granting this user that role.

Please educate me or lmk if I missed anything 😅 @amitmiran137 thoughts?

0 reactions
bkyryliuk commented, Mar 17, 2021

+1 to @etr2460’s point

Time to live should be configurable in the Superset config or per database, and yeah, it should be set by admins.

As for implementation, it is unlikely that we would be able to commit to it in the next couple quarters.
