chunksize argument removed from read_excel?
See original GitHub issue

Code Sample
import pandas as pd

excel = pd.ExcelFile("test.xlsx")
for sheet in excel.sheet_names:
    # Raises TypeError in recent pandas: parse() no longer accepts chunksize
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        ...  # process chunk
Problem description
In version 0.16.1 the chunksize argument was available.
See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html
But in the latest version it is not available:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html
Why was it removed?
Also, how should I process an Excel file in chunks in the latest version?
Issue Analytics
- Created 6 years ago
- Comments:6 (4 by maintainers)
Top Results From Across the Web
- Is there a chunksize argument for read_excel in pandas?
  Edit: I've read the question re: reading an excel file in chunks (Reading a portion of a large xlsx file with python), however,...
- [Code]-Is there a chunksize argument for read_excel in pandas?
  I'm trying to create a progress bar for reading excel data into pandas using tqdm. I can do this easily with a csv...
- Big Data from Excel to Pandas | Python Charmers
  Similarly to nrows above, the argument chunksize defines how many rows will be read from the top.
- pandas.read_excel — pandas 0.24.0rc1 documentation
  Optional keyword arguments can be passed to TextFileReader. Returns: DataFrame or dict of DataFrames. DataFrame from the passed in Excel file. See...
- How to Load a Massive File as small chunks in Pandas?
  Output: We have a total of 159571 non-null rows. Example 2: Loading a massive amounts of data using chunksize argument.
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Yes, that feature request is covered by #8011
@EugeneKovalev It was removed because, due to the nature of the XLSX file format, an Excel file is read into memory in its entirety during parsing. A large file would therefore raise a MemoryError regardless, and chunksize could not change that behavior for Excel files. It works well for CSVs because they can be loaded into memory in parts.
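Since the whole sheet ends up in memory anyway, a common workaround is to parse each sheet once and then iterate over row slices of the resulting DataFrame. A minimal sketch (the filename "test.xlsx", the helper name iter_chunks, and the chunk size are illustrative, not part of the pandas API):

```python
import pandas as pd

def iter_chunks(df, chunk_size=1000):
    """Yield successive row slices of a DataFrame."""
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]

# With an Excel file the pattern would be (assuming "test.xlsx" exists):
#   excel = pd.ExcelFile("test.xlsx")
#   for sheet in excel.sheet_names:
#       for chunk in iter_chunks(excel.parse(sheet)):
#           ...  # process chunk

# Small demonstration with an in-memory frame instead of a file:
df = pd.DataFrame({"x": range(2500)})
sizes = [len(chunk) for chunk in iter_chunks(df, chunk_size=1000)]
print(sizes)  # [1000, 1000, 500]
```

Note this only helps structure the processing; it does not reduce peak memory use, which is exactly why chunksize was dropped for Excel input.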