chunksize argument removed from read_excel?
See original GitHub issue

Code Sample
import pandas as pd

excel = pd.ExcelFile("test.xlsx")
for sheet in excel.sheet_names:
    # Raises TypeError in recent pandas: parse() no longer accepts chunksize
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        ...  # process chunk
Problem description
In version 0.16.1 the chunksize argument was available.
See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html
But in the latest version it is not available:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html
Why was it removed?
Also, how should I process an Excel file in chunks in the latest version?
Issue Analytics
- Created 6 years ago
- Comments:6 (4 by maintainers)
Top Results From Across the Web
- Is there a chunksize argument for read_excel in pandas?
  Edit: I've read the question re: reading an excel file in chunks (Reading a portion of a large xlsx file with python), however,...
- [Code]-Is there a chunksize argument for read_excel in pandas?
  I'm trying to create a progress bar for reading excel data into pandas using tqdm. I can do this easily with a csv...
- Big Data from Excel to Pandas | Python Charmers
  Similarly to nrows above, the argument chunksize defines how many rows will be read from the top.
- pandas.read_excel — pandas 0.24.0rc1 documentation
  Optional keyword arguments can be passed to TextFileReader. Returns: DataFrame or dict of DataFrames. DataFrame from the passed in Excel file. See...
- How to Load a Massive File as small chunks in Pandas?
  Output: We have a total of 159571 non-null rows. Example 2: Loading a massive amounts of data using chunksize argument.
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Yes, that feature request is covered by #8011
@EugeneKovalev It was removed because, due to the nature of the XLSX file format, an Excel file is read into memory in its entirety during parsing. A large file would therefore raise a MemoryError regardless, and chunksize could not change that behavior for Excel files. It works well for CSVs because they can be loaded into memory in parts.
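Since the whole sheet ends up in memory anyway, a common workaround is to parse each sheet once and then iterate over row slices of the resulting DataFrame. A minimal sketch (the filename "test.xlsx", the helper name iter_chunks, and the chunk size are illustrative, not part of the pandas API):

```python
import pandas as pd

def iter_chunks(df, chunk_size=1000):
    """Yield successive row slices of a DataFrame."""
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]

# With an Excel file the pattern would be (assuming "test.xlsx" exists):
#   excel = pd.ExcelFile("test.xlsx")
#   for sheet in excel.sheet_names:
#       for chunk in iter_chunks(excel.parse(sheet)):
#           ...  # process chunk

# Small demonstration with an in-memory frame instead of a file:
df = pd.DataFrame({"x": range(2500)})
sizes = [len(chunk) for chunk in iter_chunks(df, chunk_size=1000)]
print(sizes)  # [1000, 1000, 500]
```

Note this only helps structure the processing; it does not reduce peak memory use, which is exactly why chunksize was dropped for Excel input.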