[SIP-38] Visualization plugin refactoring

Motivation

One of the most common recurring questions in the Superset community, on Slack and elsewhere, is how to add a new data visualization. The answer, in short, has been “it’s hard.” While that may be true, the goal of this SIP is both to lay out the tactical refactoring the current implementation needs in order to mature and to propose a handful of roadmap features that make plugin development significantly easier. These changes will make upcoming modifications of existing plugins (see SIP-34) drastically simpler, and steer toward opening an ecosystem of Superset visualization plugins.

Much planning and work has already been done to address the difficulty of adding/editing plugins, including a new query API endpoint, but there are many blocking issues and code migrations remaining to complete this process. Special thanks to @kristw, @williaster, @xtinec, and @conglei for their significant contributions to the frontend and API work thus far. These issues, and proposed solutions for them, are enumerated below. Additional suggestions are welcome.

Proposed Changes

General Goals:

  • As much code and configuration as possible for individual visualization plugins should be moved out of incubator-superset and into the individual plugin’s repos (in a perfect world, a new plugin wouldn’t require touching two repos and opening two PRs).
  • Reduce frustration in working on plugin repos, allowing people to more easily see changes as they make them.

Issue: Control panel configurations for visualizations are centralized in a difficult-to-maintain controls.jsx file. All controls are located in incubator-superset, necessitating writing code in two PRs for two repos.

Proposal: Control configurations (particularly the ones that are unique to a given plugin) should be migrated into the corresponding individual control panel config files. An example of this can be found in this PR. These individual configs should then be migrated to the individual plugins, and their references removed from setupPlugins.ts.

Issue: Plugins (particularly when using the legacy API, /explore_json) require an entry in viz.py. In addition to requiring code changes to two repos, the logic in viz.py has proven to be fragile and cumbersome to maintain.

Proposal: Use of viz.py should be deprecated in favor of the viz-agnostic api/v1/query endpoint. To decouple the two, the viz.py logic (data transformations) should be broken out into individual modules and/or reusable methods, which the new endpoint should invoke. This will additionally require that controls be consolidated wherever possible, e.g. use a single control for metric, metrics, metric_2, secondary_metric, etc.
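To illustrate what consolidating those controls could look like on the frontend, here is a minimal sketch. Only the four control names (metric, metrics, metric_2, secondary_metric) come from the proposal; the `LegacyFormData` shape and the `consolidateMetrics` helper are hypothetical and not part of Superset.

```typescript
// Hypothetical shape of legacy form data. Only the four control names
// are taken from the proposal; everything else is an assumption.
interface LegacyFormData {
  metric?: string;
  metrics?: string[];
  metric_2?: string;
  secondary_metric?: string;
}

// Collapse the legacy metric controls into one ordered, de-duplicated list.
function consolidateMetrics(formData: LegacyFormData): string[] {
  const candidates: Array<string | undefined> = [
    formData.metric,
    ...(formData.metrics || []),
    formData.metric_2,
    formData.secondary_metric,
  ];
  const seen = new Set<string>();
  const result: string[] = [];
  for (const m of candidates) {
    // Skip unset or empty controls and anything already collected.
    if (m !== undefined && m !== '' && !seen.has(m)) {
      seen.add(m);
      result.push(m);
    }
  }
  return result;
}
```

A single list like this is one possible "common data model": every plugin would read the same field regardless of which legacy control originally supplied the value.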

Issue: New and existing plugins cannot yet fully utilize the new api/v1/query endpoint due to the following issues:

  • Superset does not yet respect a plugin’s useLegacy flag to call the correct endpoint when required
  • The API has no means to accept data transformation options needed for post-processing (e.g. Pandas) to reach feature parity with the legacy API.
  • The API does not have unit tests
  • The API is not documented

Proposal:

  • Modify exploreUtils.js such that the getURIDirectory method calls the right endpoint depending on the useLegacy flag
  • Add configuration options to the API call to invoke backend post-processing operations, returning transformed data
  • Write unit tests and documentation
  • Deprecate the explore_json endpoint
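The first item in the proposal above can be sketched as a small selection function. This is not the actual getURIDirectory implementation, whose signature is not shown here; `PluginMetadata` and `chooseEndpoint` are hypothetical names, and defaulting to the new endpoint when the flag is absent is an assumption.

```typescript
// Hypothetical subset of a plugin's metadata; only `useLegacy` is named in the proposal.
interface PluginMetadata {
  useLegacy?: boolean;
}

// Endpoint paths as written in this SIP.
const LEGACY_ENDPOINT = '/explore_json';
const QUERY_ENDPOINT = '/api/v1/query';

// Pick the endpoint from the plugin's flag.
// Assumption: plugins that do not set the flag use the new endpoint.
function chooseEndpoint(metadata: PluginMetadata): string {
  return metadata.useLegacy === true ? LEGACY_ENDPOINT : QUERY_ENDPOINT;
}
```

Keeping the decision in one function would make the later deprecation of explore_json a one-line change rather than a search across call sites.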

Issue: Each plugin must be registered manually in incubator-superset’s MainPreset.js file. Additionally, customizing the plugins loaded for a deployment (e.g. disabling some) is done via setupPluginsExtra.js, meaning the plugins are still loaded as dependencies. This method also only supports removing plugins; it does not let you add new plugins that are not listed in package.json on master.

Proposal: Attempt to load plugins via ES2020’s dynamic imports. The exact implementation of this is still TBD, but the idea would be to move the responsibility for registering/loading plugins away from MainPreset.js. Instead, the plugin paths/packages (and their associated keys) could be bootstrapped as an overridable configuration file, and Superset could lazy-load the plugins accordingly. (Note: dynamic imports are not supported natively by IE, but Babel provides potential recourse for that.)
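Since the implementation is explicitly TBD, the following is only a sketch of the lazy-loading idea under stated assumptions. `ChartPlugin`, `PluginLoader`, and `LazyPluginRegistry` are hypothetical names; in a real configuration each loader would presumably wrap a dynamic `import()` of a plugin package, whereas this sketch takes loader functions directly so that it stays self-contained.

```typescript
// Hypothetical minimal plugin shape, for illustration only.
type ChartPlugin = { name: string };

// A loader resolves to a module whose default export is the plugin.
// In a real config this could be: () => import('<plugin-package>')
type PluginLoader = () => Promise<{ default: ChartPlugin }>;

class LazyPluginRegistry {
  private loaders: Record<string, PluginLoader>;
  private cache = new Map<string, Promise<ChartPlugin>>();

  constructor(loaders: Record<string, PluginLoader>) {
    this.loaders = loaders;
  }

  // Load a plugin on first use; later calls reuse the same promise.
  load(key: string): Promise<ChartPlugin> {
    const cached = this.cache.get(key);
    if (cached) {
      return cached;
    }
    if (!Object.prototype.hasOwnProperty.call(this.loaders, key)) {
      return Promise.reject(new Error(`Unknown plugin key: ${key}`));
    }
    const pending = this.loaders[key]().then(mod => mod.default);
    this.cache.set(key, pending);
    return pending;
  }
}
```

With this shape, the key-to-loader map plays the role of the overridable configuration file: a deployment could add or drop entries without touching a central preset, and nothing is fetched until a chart of that type is requested.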

Issue: Development work on plugins requires manually running an npm link operation to load the local plugin, and thus see updates/edits in Superset. This is troublesome in that it is both fragile and difficult for many developers to discover, as it’s not a common pattern.

Proposal: Automate the process! Create a “plugin dev mode” NPM script that automatically links (or unlinks) viz plugin packages. See a working example of this concept in this PR. This would involve refactoring the NVD3 plugins to not rely on the /lib path and preset-chart-xy to not rely on the /esm path; all plugins should follow the same build and source directory pattern.

Additional (follow-up) refactoring tasks

  • Follow CSS-in-JS patterns (see SIP-37) in viz components, sharing common theme styles/variables with incubator-superset. Theme variables may need to be moved to superset-ui to be consumed by both superset-ui-plugins and incubator-superset.
  • Audit and address issues with, and completeness of, i18n of plugin text.
  • Convert all viz components to TypeScript (see SIP-36)

New or Changed Public Interfaces

The query endpoint at /api/v1/query needs significant enhancement, as laid out in the proposals above (post-processing options, tests, docs).

New dependencies

N/A

Migration Plan and Compatibility

N/A

Rejected Alternatives

  • Reintroducing viz plugins into incubator-superset: Having the plugins be in their own repos is troublesome from a workflow perspective (due to the multiple PRs required, npm link work needed, and separate build processes required). The proposals laid out above seek to minimize this difficulty. While it is certainly possible (and indeed likely easier) to move the plugins back into Superset itself (like Redash and Metabase do), solving these more difficult problems seems more likely to open the door to a true plugin ecosystem for Superset.
  • Moving data transformations to plugins (JS), deprecating Pandas: The idea has been floated that perhaps data transformation (at least in some cases) might be more the responsibility of the viz plugin itself than the backend, and maybe if we moved that logic, we could deprecate Pandas. To test the theory, some basic benchmarking attempts were made on large rollup and pivot tasks, to compare the performance of Pandas against Zebras, Datalib, Ramda, and Lodash. This approach, at least as a global migration, was decided against for these reasons:
    • Sending an entire dataset over the wire, if the frontend just needs a rollup, is a waste of resources
    • If post-processing is done on the backend, the result can be cached for use by multiple charts (or multiple clients and reloads)
    • Neither Zebras nor Datalib provides an out-of-the-box pivot function on par with Pandas, and the pivotWith “recipe” from the Ramda cookbook looked to be significantly slower than Pandas (approx 10x).
    • All these libraries provide grouping, sorting, map/reduce functionality, so you can pivot the data manually. But then, so does Lodash, which matched (or slightly beat) the other JS libraries’ performance. This was still about 2x slower than Pandas.
    • TL;DR: If you want to avoid writing Python for a new viz or calling it through the new API, and want to do a little data munging on the frontend, just use lodash or vanilla JS for best results.
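For the "little data munging on the frontend" case mentioned in the TL;DR, a vanilla rollup might look like the sketch below. The `Row` type, the `rollupSum` helper, and the sample column names are hypothetical; this is not the code used in the benchmarks described above.

```typescript
// Hypothetical row shape: flat records of strings and numbers.
type Row = Record<string, string | number>;

// Group rows by one column and sum another, using only built-ins.
function rollupSum(rows: Row[], groupKey: string, valueKey: string): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const group = String(row[groupKey]);
    const value = Number(row[valueKey]);
    totals.set(group, (totals.get(group) || 0) + value);
  }
  return totals;
}

// Usage with made-up sample data.
const sampleRows: Row[] = [
  { gender: 'boy', num: 10 },
  { gender: 'girl', num: 5 },
  { gender: 'boy', num: 7 },
];
const totalsByGender = rollupSum(sampleRows, 'gender', 'num');
```

This only suits small result sets already present in the browser; per the reasons listed above, rollups over large datasets are better done on the backend, where the result can also be cached.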

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 15
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

ktmud commented, Mar 26, 2020 (1 reaction)

I think dynamically generating a JS file might not be very practical if we want users (Superset admins) to manage plugins in a future UI. I’d imagine all plugins are loaded dynamically by just checking whether a file exists in some folder. There shouldn’t be a need to pre-register a chart type. You just load it as you need it.

betodealmeida commented, Mar 26, 2020 (1 reaction)

I love this, especially the part about consolidating the data model in api/v1/query so that we don’t have metric vs metrics. Having a common data model for the different plugins will make it much easier to switch visualizations without losing context, which is a very powerful feature IMHO.
