This article is about fixing Running
  • 29-Jan-2023
Lightrun Team

Running “docker-compose up” fails to compile successfully (Error on importing STR_NA_VALUES from pandas) in Apache Superset

Explanation of the problem

When running the command “docker-compose up” on version 1.3.2 (and 1.4.0rc3) of Apache Superset, the application fails to start. Most containers log the error “ImportError: cannot import name ‘STR_NA_VALUES’ from ‘pandas.io.parsers’ (/usr/local/lib/python3.8/site-packages/pandas/io/parsers/__init__.py)”.

Environment: The environment in which this issue was encountered includes the following components:

  • Docker version 20.10.12, build e91ed57
  • Python version 3.8.12 (as determined by running the command in the Docker container)
  • Node.js version: not available. Running the version command in the Docker container failed with “OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: “node”: executable file not found in $PATH: unknown”, which indicates that the container has no node executable.

Reproduction Steps: The issue can be reproduced by following these steps:

  1. Go to the release on the GitHub repository, download the 1.3.2 zip (https://github.com/apache/superset/archive/refs/tags/1.3.2.zip)
  2. Extract the folder
  3. Enter the superset-1.3.2 folder in a terminal
  4. Run the command “docker-compose up”

Expected Results:

  • The Superset login screen / welcome page is displayed when viewing on a browser.
  • The compilation process completes successfully without error.

Actual Results:

  • An error stack trace is displayed in the browser.
  • In most containers, the Docker logs contain the error message “ImportError: cannot import name ‘STR_NA_VALUES’ from ‘pandas.io.parsers’ (/usr/local/lib/python3.8/site-packages/pandas/io/parsers/__init__.py)”.

Additional context:

  • The version of pandas installed in the Superset app Docker container is 1.3.4. However, the requirements file base.txt pins version 1.2.2. It is unclear how the newer version of pandas came to be installed.
  • This issue may be related to a pull request on the GitHub repository: https://github.com/apache/superset/pull/16400
  • Another user encountered a similar issue (https://github.com/apache/superset/issues/17333), but the resolution is not clear.

Problem solution for Running “docker-compose up” fails to compile successfully (Error on importing STR_NA_VALUES from pandas) in Apache Superset

When attempting to run the command “docker-compose up” on version 1.3.2 (and 1.4.0rc3) of Apache Superset, the compilation process fails and presents an error related to the import of the variable “STR_NA_VALUES” from the “pandas.io.parsers” module.

This error can be reproduced by following these steps:

  1. Download the 1.3.2 zip from the release section of the Apache Superset GitHub repository (https://github.com/apache/superset/archive/refs/tags/1.3.2.zip)
  2. Extract the folder
  3. Enter the “superset-1.3.2” folder in a terminal
  4. Run the command “docker-compose up”

The expected result is a successful startup with the Superset login screen or welcome page displayed in the browser. Instead, the browser shows an error stack trace, and the Docker logs of most containers contain the following message:

ImportError: cannot import name 'STR_NA_VALUES' from 'pandas.io.parsers' (/usr/local/lib/python3.8/site-packages/pandas/io/parsers/__init__.py)

Additionally, the environment in which this error was observed includes the following specifications:

  • Docker version 20.10.12
  • Superset version: 1.3.2 / Superset 0.0.0dev (as per running command in docker container)
  • Python version: Python 3.8.12 (as per running command in docker container)
  • Node.js version: not available. Running the version command in the Docker container failed with “OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: “node”: executable file not found in $PATH: unknown”

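It is also worth noting that the pandas versions do not match. Running pip freeze in the Superset app Docker container shows pandas 1.3.4, while the requirements file base.txt pins version 1.2.2. The path in the error message ends in pandas/io/parsers/__init__.py, which shows that pandas.io.parsers is a package in the installed version. This is consistent with the newer pandas no longer exposing STR_NA_VALUES at the path the Superset 1.3.2 code imports it from, so the mismatch is the likely cause of the error.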
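To see where a name such as STR_NA_VALUES is actually importable from in a given environment, a small probe can help. The following is a minimal sketch, not part of Superset or pandas: the helper name first_module_exposing is hypothetical, and the second candidate path (pandas._libs.parsers) is an assumption about a private pandas module, not a documented contract.

```python
import importlib
from typing import Iterable, Optional


def first_module_exposing(name: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the first module path in `candidates` that exposes `name`, or None."""
    for module_path in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            # The module (or the whole package) is not installed here.
            continue
        if hasattr(module, name):
            return module_path
    return None


if __name__ == "__main__":
    # Candidate locations are assumptions; adjust them to your environment.
    found = first_module_exposing(
        "STR_NA_VALUES",
        ["pandas.io.parsers", "pandas._libs.parsers"],
    )
    print("STR_NA_VALUES importable from:", found)
```

Run inside the affected container, a result other than pandas.io.parsers would support the idea that the installed pandas is newer than the one the code was written against.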

Other popular problems with Apache Superset

Problem: Error when importing STR_NA_VALUES from pandas when running “docker-compose up”

When attempting to run the “docker-compose up” command on version 1.3.2 (and 1.4.0rc3) of Apache Superset, the following error is encountered:

ImportError: cannot import name 'STR_NA_VALUES' from 'pandas.io.parsers' 

This error occurs in most containers, including app, worker, and worker-beat.

Solution:

The issue is related to a version mismatch between the installed pandas library and the version specified in the requirements file. The pip freeze command in the Superset app Docker container reveals that pandas is on version 1.3.4, while the requirements file, base.txt, pins version 1.2.2. The fix is to bring the two into agreement. Because the 1.3.2 code imports a name that the installed pandas does not expose at that path, raising the pin in base.txt alone is unlikely to remove the error; installing the pinned pandas 1.2.2 in the container is the more likely remedy.
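The comparison between the pinned and the installed version can be automated. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the sample requirements text is made up to mirror the pin reported above, and only exact "==" pins are recognized.

```python
import re
from importlib import metadata
from typing import Optional


def pinned_version(requirements_text: str, package: str) -> Optional[str]:
    """Return the version pinned with `==` for `package`, or None if not pinned."""
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*==\s*([^\s;#]+)", re.IGNORECASE
    )
    for line in requirements_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Sample text standing in for the contents of base.txt (assumed layout).
    sample = "numpy==1.19.5\npandas==1.2.2    # via apache-superset\n"
    pinned = pinned_version(sample, "pandas")
    installed = installed_version("pandas")
    print(f"pinned={pinned} installed={installed} match={pinned == installed}")
```

In practice the sample string would be replaced by the text of the real requirements file, and the script would be run with the container's Python interpreter so that the installed version reflects the container rather than the host.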

Problem: “node” executable not found in the container

When attempting to run node inside the Superset Docker container (for example, to check its version), the following error is encountered:

OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "node": executable file not found in $PATH: unknown

Solution:

This error means that the container has no node executable on its PATH. It comes from executing a command in the running container rather than from “docker-compose up” itself, and it appears to be unrelated to the pandas import error. If Node.js is needed in that container, make sure that it is installed in the image and that its executable is on the PATH.
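Whether an executable is reachable through the PATH can be checked from Python with the standard library. This is a minimal sketch; the wrapper name executable_path is hypothetical, and the check applies to whichever environment runs the script (host or container).

```python
import shutil
from typing import Optional


def executable_path(name: str) -> Optional[str]:
    """Return the full path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = executable_path("node")
    if path is None:
        print("node is not on PATH in this environment")
    else:
        print(f"node found at {path}")
```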

Problem: Error when viewing the Superset login screen / welcome page

When attempting to view the Superset login screen / welcome page in the browser, the following error is encountered:

Presented with stack trace of error

Solution:

This error can have various causes, such as a misconfiguration in the superset_config.py file or missing or mismatched dependencies. To troubleshoot it, check the Superset logs for Python stack traces, ensure that all dependencies are installed at the expected versions, and review the settings in superset_config.py. Checking the Apache Superset issue tracker for similar issues can also be helpful.
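When the logs are long, it can help to pull out just the failed imports. The following sketch scans log text for ImportError lines of the form shown in this article; the helper name, the FailedImport type, and the container prefixes in the sample log are assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class FailedImport(NamedTuple):
    name: str
    module: str


_IMPORT_ERROR = re.compile(
    r"ImportError: cannot import name '(?P<name>[^']+)' from '(?P<module>[^']+)'"
)


def failed_imports(log_text: str) -> List[FailedImport]:
    """Return each distinct (name, module) pair reported by an ImportError line."""
    seen: List[FailedImport] = []
    for match in _IMPORT_ERROR.finditer(log_text):
        item = FailedImport(match.group("name"), match.group("module"))
        if item not in seen:
            seen.append(item)
    return seen


if __name__ == "__main__":
    # Sample log; the "superset_app |" style prefixes are illustrative only.
    sample_log = (
        "superset_app | ImportError: cannot import name 'STR_NA_VALUES' "
        "from 'pandas.io.parsers' (/usr/local/lib/python3.8/site-packages/"
        "pandas/io/parsers/__init__.py)\n"
        "superset_worker | ImportError: cannot import name 'STR_NA_VALUES' "
        "from 'pandas.io.parsers' (/usr/local/lib/python3.8/site-packages/"
        "pandas/io/parsers/__init__.py)\n"
    )
    for item in failed_imports(sample_log):
        print(f"{item.name} could not be imported from {item.module}")
```

The output of “docker-compose logs” can be saved to a file and passed to failed_imports to get a short list of the names to investigate.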

A brief introduction to Apache Superset

Apache Superset is an open-source business intelligence web application that allows users to create and share interactive dashboards and visualizations. It is built using Python, Flask, and React and utilizes data visualization libraries such as D3.js and Apache ECharts. Superset provides a user-friendly interface for exploring and analyzing data, as well as the ability to create and save custom visualizations and dashboards.

One of the key features of Apache Superset is its support for a wide range of data sources, including SQL databases, Druid, and Google BigQuery. It also includes support for data exploration through SQL Lab, which allows users to write and execute SQL queries, as well as the ability to schedule and share dashboards with other users. Additionally, Superset supports a variety of authentication methods, including database authentication, LDAP, and OAuth.

Most popular use cases for Apache Superset

  1. Apache Superset is an open-source data visualization and business intelligence platform that allows users to create and share interactive dashboards and charts. It can be used to create visual representations of data from various sources such as SQL databases, CSV files, and others.
  2. One popular use case of Apache Superset is to create interactive dashboards that allow users to explore and analyze data in real-time. This can be done by creating charts, such as line, bar, and pie charts, and adding filters and aggregate functions to the data.
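import json

from superset import db
from superset.models.slice import Slice

# This must run inside a Superset application context.
# Creating a new slice (chart); datasource_id=1 is a placeholder
# for the id of an existing dataset.
chart = Slice(
    slice_name="My Chart",
    viz_type="line",
    datasource_type="table",
    datasource_id=1,
    # Slice.params is stored as a JSON string, so serialize the dict
    params=json.dumps({
        "groupby": ["date"],
        "metric": "sum__num",
    }),
)

# Saving the slice
db.session.add(chart)
db.session.commit()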
  3. Superset also offers the ability to schedule and send reports and alerts to specific users or groups. This can be done by creating a new dashboard and scheduling the report to be sent at a specific time or when certain conditions are met. Users can also receive alerts for specific data changes or when certain thresholds are reached.