chunksize argument removed from read_excel?

Code Sample

import pandas as pd

excel = pd.ExcelFile("test.xlsx")

for sheet in excel.sheet_names:
    # chunksize is the argument documented for ExcelFile.parse in 0.16.1
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        pass  # process chunk

Problem description

In version 0.16.1 the chunksize argument was available.

See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html

But in the latest version it is not available.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html

What was the reason that it was removed?

Also, how should I process an Excel file in chunks in the latest version?

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
jorisvandenbossche commented, Jul 27, 2017

Yes, that feature request is covered by #8011

0 reactions
swarits commented, Jun 22, 2020

@EugeneKovalev It was removed because, owing to the nature of the XLSX file format, an Excel file is read into memory as a whole during parsing. A large file could therefore raise a 'MemoryError', and chunksize would not change this behavior for Excel files. It works well for CSVs because they can be loaded into memory in parts.
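
For the "how do I process it in chunks now" part, here is a minimal sketch rather than anything pandas provides. `iter_chunks` and `iter_sheet_chunks` are hypothetical names made up for this example. The second helper assumes openpyxl is installed and opens the workbook in its read-only mode; the thread does not confirm how much memory this saves for a given file, so treat it as something to measure.

```python
from itertools import islice


def iter_chunks(rows, chunksize):
    """Yield lists of at most `chunksize` items from any iterable of rows."""
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


def iter_sheet_chunks(path, sheet_name, chunksize):
    """Hypothetical helper: yield one worksheet's rows in chunks via openpyxl."""
    # Imported lazily so iter_chunks stays usable without openpyxl (assumed installed here).
    from openpyxl import load_workbook

    workbook = load_workbook(path, read_only=True)
    try:
        worksheet = workbook[sheet_name]
        # values_only=True yields plain tuples of cell values instead of cell objects.
        yield from iter_chunks(worksheet.iter_rows(values_only=True), chunksize)
    finally:
        workbook.close()
```

Each chunk is a list of row tuples, so `pd.DataFrame(chunk)` turns it into a frame if the rest of the processing expects one. Note that the header row, if any, arrives as the first tuple of the first chunk.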
