st.write (df) gives problem when df is a pivot table

Summary

Displaying a pivot table with st.write raises ArrowInvalid: (‘Could not convert All with type str: tried to convert to int’, ‘Conversion failed for column MM with type object’)
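
The message points at the likely cause: with margins=True, pivot_table appends an "All" row, so the MM index holds integers and one string. The following is a minimal sketch with made-up data (not the Garmin CSV from the issue), and it only inspects the pandas side; it does not call Streamlit or Arrow.

```python
import pandas as pd

# Made-up sample data; column names mirror the issue, the values are invented.
df = pd.DataFrame({
    "MM": [1, 1, 2, 2],
    "YYYY": [2020, 2021, 2020, 2021],
    "count": [1, 1, 1, 1],
})

# Same pivot call as in the issue: margins=True adds an "All" row and column.
pivot = df.pivot_table(index="MM", columns="YYYY", values="count",
                       aggfunc="sum", fill_value=0, margins=True)

index_labels = list(pivot.index)

# The index is "mixed" if some, but not all, of its labels are strings.
string_flags = [isinstance(label, str) for label in index_labels]
has_mixed_index = any(string_flags) and not all(string_flags)

print(index_labels)
print(pivot.index.dtype)
print("mixed int/str index:", has_mixed_index)
```

print() handles such an index without complaint, which would explain why only st.write fails.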

Steps to reproduce

Code snippet:

import pandas as pd
import streamlit as st


def get_data():
    url = "https://raw.githubusercontent.com/rcsmit/streamlit_scripts/main/input/garminactivities_new.csv"
    df = pd.read_csv(url, delimiter=';')
    df["Datum"] = pd.to_datetime(df["Datum"], format="%d-%m-%Y")
    df = df.sort_values(by=['Datum'])
    df["YYYY"] = df["Datum"].dt.year
    df["MM"] = df["Datum"].dt.month
    df["DD"] = df["Datum"].dt.day
    df["count"] = 1
    df = df[df["Activiteittype"] == "Hardlopen"].copy(deep=False)
    df = df[["Datum","Titel", "Afstand","Tijd", "gem_snelh", "count", "MM", "YYYY"]]
    return df

def find_nr_activities_per_month_per_year(df):
    # Number of activities per month (per year)
    df_pivot = df.pivot_table(index='MM', columns='YYYY', values='count', aggfunc='sum', fill_value=0, margins=True)
    st.write(df_pivot)  # raises ArrowInvalid
    print(df_pivot)  # works without error

def main():
    df = get_data().copy(deep=False)

    find_nr_activities_per_month_per_year(df)


if __name__ == "__main__":
    main()

Dtypes of the dataframe:

Datum        datetime64[ns]
Titel                object
Afstand             float64
Tijd                 object
gem_snelh           float64
count                 int64
MM                    int64
YYYY                  int64

Expected behavior: The dataframe is displayed. print(df_pivot) shows it without throwing an error.

Actual behavior: st.write(df_pivot) raises the ArrowInvalid error quoted in the summary.

Is this a regression?

I don’t know

Debug info

Streamlit version: 0.85
Python version: 3.8
Using Conda? PipEnv? PyEnv? Pex? None of these
OS version: Windows
Browser version: Chrome, latest

Additional information

n/a

Issue Analytics

State:
Created 2 years ago
Comments:6

Top GitHub Comments

vdonato commented, Aug 26, 2021

Hi @rcsmit, sorry for the delayed reply – I missed your last reply until now. The error seems to be related / probably has the same root cause, so I don’t think there’s a need to open a second issue.

rcsmit commented, Aug 11, 2021

"One thing that could be done here would be to cast the integers in the MM column to all have type str."

Thanks a lot for your reply and help.

For those who arrive here after googling

df["MM"] = df["MM"].astype(str).str.zfill(2)

does the job (.str.zfill(2) pads each month to two digits, which prevents the string sort order 1, 10, 11, 12, 2, 3, etc.)
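
As a rough illustration of that workaround, the sketch below applies the cast before pivoting and checks that every index label, including the "All" margin, ends up as a string. The sample data is made up, and the sketch only checks the pandas result; the st.write call itself is left out.

```python
import pandas as pd

# Made-up sample data; column names mirror the issue, the values are invented.
df = pd.DataFrame({
    "MM": [1, 2, 10, 10],
    "YYYY": [2020, 2020, 2020, 2021],
    "count": [1, 1, 1, 1],
})

# Workaround from the comment above: months become zero-padded strings.
df["MM"] = df["MM"].astype(str).str.zfill(2)

pivot = df.pivot_table(index="MM", columns="YYYY", values="count",
                       aggfunc="sum", fill_value=0, margins=True)

index_labels = list(pivot.index)

# Every label is now a string, so the index no longer mixes int and str.
all_strings = all(isinstance(label, str) for label in index_labels)

print(index_labels)
print("all index labels are str:", all_strings)
# In a Streamlit app, st.write(pivot) would go here.
```

Without the zero padding, "10" would sort before "2", which is the ordering problem the comment mentions.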

Top Results From Across the Web

Apply automatic column type fixes for additional errors #5477

Context We already transform some dataframe column types (mixed columns) that are ... Closes st.write (df) gives problem when df is a pivot...