MemoryError: While reading csv file
Code Sample, a copy-pastable example if possible
# Your code here
import pandas as pd
data = pd.read_csv("stochastic_data_2015.01.01 00_00_.csv", encoding="utf-16", header=0, sep="\t")
data.head(10)
The file size is 27 GB. The RAM I have is 8 GB, with a 250 GB SSD on Windows 10.
Problem description
As you can see, my RAM is limited, so I thought the MemoryError might have occurred because of that. But sometimes the file is read successfully, and the next time I try I get the following error:
gist of complete error
Kindly suggest what I can do, as this is impacting my data analysis. Data analysis is one of the main things pandas is used for; if it cannot handle this data with limited resources, then I am afraid I need to leave it and try something else, which would be a time-consuming process since I am used to pandas.
Do let me know the solution. Is there no way to release the memory of a DataFrame that is no longer in use?
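For reference, a workaround often suggested for this situation (not from the original issue text) is to stream the file with `read_csv`'s `chunksize` parameter, so only one piece of the file is in memory at a time, and to drop each chunk once it has been processed. A minimal sketch, using a tiny in-memory file instead of the 27 GB UTF-16 file above:

```python
import io
import pandas as pd

# Tiny in-memory stand-in for a large tab-separated file.
csv_text = "id\tvalue\n1\t10\n2\t20\n3\t30\n"

total = 0
rows = 0
# chunksize=... makes read_csv return an iterator of DataFrames,
# so only one chunk is held in memory at a time.
for chunk in pd.read_csv(io.StringIO(csv_text), sep="\t", chunksize=2):
    total += int(chunk["value"].sum())
    rows += len(chunk)
    del chunk  # drop the reference so the chunk can be garbage-collected

print(rows, total)  # 3 60
```

For the real file you would pass the filename plus `encoding="utf-16"` and aggregate per chunk instead of keeping all rows.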
Issue Analytics
- State:
- Created 5 years ago
- Reactions: 1
- Comments: 11
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I suggest you do
data = pd.read_csv("stochastic_data_2015.01.01 00_00_.csv", encoding="utf-16", header=0, sep="\t", nrows=10)
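The point of `nrows` is that only the first N rows are parsed, which is a cheap way to check that the file's encoding, separator, and columns are right before attempting a full load. A small self-contained illustration (synthetic data, not the issue's file):

```python
import io
import pandas as pd

# 100 data rows, tab-separated, two columns.
csv_text = "a\tb\n" + "\n".join(f"{i}\t{i * 2}" for i in range(100))

# Only the first 10 data rows are parsed and held in memory.
preview = pd.read_csv(io.StringIO(csv_text), sep="\t", nrows=10)
print(preview.shape)  # (10, 2)
```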
Consider using a more powerful machine (for example, HPC). You can’t blame a Honda for not running as fast as a Ferrari, right?
Memory errors happen a lot with Python when using the 32-bit Windows version. This is because a 32-bit process only gets 2 GB of memory to play with by default.
The solution for this error is that the pandas.read_csv() function takes an option called dtype. This lets pandas know what types exist inside your CSV data.
For example, specifying dtype={'age': int} as an option to .read_csv() will let pandas know that the age column should be interpreted as a number. This saves you lots of memory.
pd.read_csv('data.csv', dtype={'age': int})
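To make the saving concrete (the numbers below come from a small synthetic frame, not the issue's data): declaring a narrow integer dtype such as int8 instead of the inferred default int64 cuts that column's footprint by 8x, since each value takes 1 byte instead of 8:

```python
import io
import pandas as pd

# 1000 small integers in one column.
csv_text = "age\n" + "\n".join(str(i % 90) for i in range(1000))

df64 = pd.read_csv(io.StringIO(csv_text))                       # inferred int64
df8 = pd.read_csv(io.StringIO(csv_text), dtype={"age": "int8"})  # explicit int8

print(df64["age"].memory_usage(index=False))  # 8000 bytes
print(df8["age"].memory_usage(index=False))   # 1000 bytes
```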
Or try the solution below:
pd.read_csv('data.csv', sep='\t', low_memory=False)