
KeyError when using Redshift


I’m trying to create line charts and time series charts with aggregates. In this case, I am summing the itemsales column over a date column, day. However, I keep getting this error:

[Two screenshots, taken 2018-06-28, not reproduced here]

- [x] I have checked the Superset logs for Python stack traces and included them here as text, if any

2018-06-28 12:07:29,604:INFO:root:Database.get_sqla_engine(). Masked URL: redshift+psycopg2://user:XXXXXXXXXX@my-redshift.host.at.amazonaws.com:5439/testdb
2018-06-28 12:07:30,077:DEBUG:root:[stats_logger] (incr) loaded_from_source
2018-06-28 12:07:30,077:ERROR:root:u'SUM(itemsales)'
Traceback (most recent call last):
  File "/Users/minhmai/envs/py2/lib/python2.7/site-packages/superset/views/core.py", line 1107, in generate_json
    payload = viz_obj.get_payload()
  File "/Users/minhmai/envs/py2/lib/python2.7/site-packages/superset/viz.py", line 329, in get_payload
    payload['data'] = self.get_data(df)
  File "/Users/minhmai/envs/py2/lib/python2.7/site-packages/superset/viz.py", line 580, in get_data
    values=values)
  File "/Users/minhmai/envs/py2/lib/python2.7/site-packages/pandas/core/frame.py", line 4468, in pivot_table
    margins_name=margins_name)
  File "/Users/minhmai/envs/py2/lib/python2.7/site-packages/pandas/core/reshape/pivot.py", line 58, in pivot_table
    raise KeyError(i)
KeyError: u'SUM(itemsales)'

A bit of digging showed that the column names are lowercased when the query result is turned into a pandas data frame, while the metric name keeps its capitalization, as the logs above show. I set a trace, and it confirmed exactly that:

(Pdb) l
585  	                records=pt.to_dict(orient='index'),
586  	                columns=list(pt.columns),
587  	                is_group_by=len(fd.get('groupby')) > 0,
588  	            )
589  	        except:
590  ->	            import pdb; pdb.post_mortem()
591
592
593  	class PivotTableViz(BaseViz):
594
595  	    """A pivot table view, define your rows, columns and metrics"""
(Pdb) values
[u'SUM(itemsales)']
(Pdb) df.head()
                __timestamp  sum(itemsales)
0 2018-06-15 00:00:00+00:00             0.0
1 2018-06-11 00:00:00+00:00             0.0
2 2018-06-13 00:00:00+00:00             0.0
3 2018-06-09 00:00:00+00:00             0.0
4 2018-06-07 00:00:00+00:00             0.0
(Pdb) self.metrics
[u'SUM(itemsales)']
(Pdb) df.columns
Index([u'__timestamp', u'sum(itemsales)'], dtype='object')

The error occurred at line 578 of viz.py:

            pt = df.pivot_table(
                index=DTTM_ALIAS,
                columns=columns,
                values=values)
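The mismatch can be reproduced outside Superset. The following is a minimal sketch, assuming a small hand-built data frame as a stand-in for the Redshift result (the rows are illustrative, not the real query output) and a plain string "__timestamp" in place of DTTM_ALIAS:

```python
import pandas as pd

# Stand-in for the query result: the aggregate column comes back
# lowercased, matching the df.columns output shown in the pdb session.
df = pd.DataFrame({
    "__timestamp": pd.to_datetime(["2018-06-15", "2018-06-11", "2018-06-13"]),
    "sum(itemsales)": [0.0, 0.0, 0.0],
})

# The metric name keeps the capitalization used in the chart definition.
values = ["SUM(itemsales)"]

try:
    df.pivot_table(index="__timestamp", values=values)
    missing_key = None
except KeyError as exc:
    # pivot_table checks that every requested value label is a column.
    missing_key = exc.args[0]

print(missing_key)
```

Passing the lowercased label, `values=["sum(itemsales)"]`, to the same call pivots without raising, which suggests that the casing is the only difference.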

Make sure these boxes are checked before submitting your issue - thank you!

- [x] I have reproduced the issue with at least the latest released version of superset
- [x] I have checked the issue tracker for the same issue and I haven’t found one similar

Superset version

superset==0.25.6

Expected results

I expect either the metrics to be all lowercased, or the column names of the resulting data frame to match the casing of the aggregate in the query.

Actual results

The data frame has its column names lowercased, while the metrics retain their original casing.

Steps to reproduce

This happens on test data produced by a random number generator. I see the error every time I use the SUM aggregation. The database is on Redshift, and I have confirmed that I am using pandas==0.22.0.

I can push a fix that either lowercases the metrics or renames the data frame columns to match the metrics, but I’m not sure whether either is the best way to approach this.
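As a rough illustration of the second option, here is a minimal sketch of mapping the returned column names back to the metric names when they differ only by case. The helper name `case_insensitive_renames` is hypothetical and is not part of Superset; this is not necessarily the change that was eventually made.

```python
def case_insensitive_renames(columns, metrics):
    """Map returned column names to metric names that differ only by case."""
    by_lower = {metric.lower(): metric for metric in metrics}
    renames = {}
    for col in columns:
        metric = by_lower.get(col.lower())
        # Only record columns whose casing actually differs from the metric.
        if metric is not None and metric != col:
            renames[col] = metric
    return renames


renames = case_insensitive_renames(
    ["__timestamp", "sum(itemsales)"],  # df.columns from the pdb session
    ["SUM(itemsales)"],                 # self.metrics from the pdb session
)
print(renames)
```

The resulting dictionary could then be passed to `df.rename(columns=renames)` before the `pivot_table` call, so the value labels match again. Note that this approach would be ambiguous if two metrics differed only by case.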

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

minh5 commented, Jul 3, 2018

Thanks, @villebro. I tested it out and it worked beautifully. I pushed up a PR.

Original issue: KeyError when using Redshift #5308 - apache/superset - GitHub