[chart] Columns with all NA values should not be silently dropped


#8040 restored dropping columns that are all NA, but it is problematic in some visualization types. There is already #13612 for this issue in Bar Charts, for which I opened a PR, but the same applies to dual line charts if one of the lines has only null values. The error is also not informative, since it only shows the name of the column.

Expected results

The null values should be shown as nulls, without data points on the charts.

Actual results

The null series is dropped by the pandas.pivot_table calls, which use dropna=True by default, and the user sees only the text of a KeyError, which is just the name of the missing column.

The stacktrace for the dual line chart:

ERROR:superset.views.base:'metric1'
Traceback (most recent call last):
  File "/app/superset/views/base.py", line 181, in wraps
    return f(self, *args, **kwargs)
  File "/app/superset/utils/log.py", line 164, in wrapper
    value = f(*args, **kwargs)
  File "/app/superset/utils/cache.py", line 152, in wrapper
    return f(*args, **kwargs)
  File "/app/superset/views/utils.py", line 451, in wrapper
    return f(*args, **kwargs)
  File "/app/superset/views/core.py", line 619, in explore_json
    return self.generate_json(viz_obj, response_type)
  File "/app/superset/views/core.py", line 456, in generate_json
    payload = viz_obj.get_payload()
  File "/app/superset/viz.py", line 464, in get_payload
    payload["data"] = self.get_data(df)
  File "/app/superset/viz.py", line 1531, in get_data
    chart_data = self.to_series(df)
  File "/app/superset/viz.py", line 1500, in to_series
    ys = series[m]
KeyError: 'metric1'
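The dropping behaviour can be reproduced outside Superset. The following is a minimal sketch, not the code in viz.py: the DataFrame and its column names (ts, metric1, metric2) are made up for illustration, and only the documented dropna parameter of pivot_table is used.

```python
import numpy as np
import pandas as pd

# Hypothetical query result: metric1 is null for every row in the period.
df = pd.DataFrame(
    {
        "ts": pd.to_datetime(["2021-01-01", "2021-01-02"]),
        "metric1": [np.nan, np.nan],
        "metric2": [1.0, 2.0],
    }
)

# Default dropna=True: the all-NaN column is removed from the result,
# so a later lookup such as dropped["metric1"] raises KeyError.
dropped = df.pivot_table(index="ts", values=["metric1", "metric2"])

# dropna=False keeps the column, with NaN in every row.
kept = df.pivot_table(index="ts", values=["metric1", "metric2"], dropna=False)

print(list(dropped.columns))
print(list(kept.columns))
```

Passing dropna=False is only one possible direction for a fix; whether it suits every chart type in viz.py is not established here.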

How to reproduce the bug

  1. Open a bar chart, a dual line chart, or other charts that use pandas.pivot_table in viz.py
  2. Choose a metric that returns only null values for the selected period. For example, sum(1)+null evaluates to null in most SQL dialects.
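To check that such a metric really is null, here is a small sketch using Python's built-in sqlite3 module. SQLite stands in for "most SQL dialects" here, and the events table is a hypothetical example, not part of the Superset example data.

```python
import sqlite3

# In-memory database with a hypothetical three-row table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER)")
conn.executemany("INSERT INTO events VALUES (?)", [(1,), (2,), (3,)])

# Adding NULL to an aggregate makes the whole expression NULL.
(value,) = conn.execute("SELECT SUM(1) + NULL FROM events").fetchone()

# The plain aggregate, for comparison.
(plain,) = conn.execute("SELECT SUM(1) FROM events").fetchone()
conn.close()

print(value, plain)
```

A metric defined this way yields a column that is null in every row, which is the case that pivot_table then drops.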

Environment

I used the latest master with Docker.

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven’t found one similar.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
octaviancorlade commented, Apr 7, 2021

@junlincc @kgabryje Here are some screenshots and links to reproduce locally with the example data. Of the charts using df.pivot_table() in viz.py, only Big Number and Time-series Period Pivot seem to handle the situation currently. I am not sure if I’m missing something, since the Pivot Table generates errors here. Maybe the screenshots by @kgabryje are both not yet in master?

  • Time-series Table
  • Pivot Table
  • Big Number (looks OK to me)
  • Dual Line Chart
  • Time-series Period Pivot (also looks OK to me)
  • Bar Chart
  • Paired t-test Table

1 reaction
kgabryje commented, Apr 7, 2021

In the case of ECharts, Timeseries displays null values as zeros. Pie and Graph charts display nothing (just a white background), and the Box chart displays an error: "Error: No numeric types to aggregate".
