MemoryError: While reading csv file
Code Sample, a copy-pastable example if possible
# Your code here
import pandas as pd
data = pd.read_csv("stochastic_data_2015.01.01 00_00_.csv", encoding="utf-16", header=0, sep="\t")
data.head(10)
The file size is 27 GB. The RAM I have is 8 GB, with a 250 GB SSD on Windows 10.
Problem description
As you can see, my RAM is limited, so I thought the MemoryError might have occurred because of that. But sometimes the file is read successfully, and the next time I try I get the following error:
gist of complete error
Kindly suggest what I can do, as this is impacting my data analysis. Data analysis is one of the main things pandas is used for; if it cannot handle this data with limited resources, then I am afraid I need to leave it and try something else, which would be a time-consuming process since I am used to pandas.
Do let me know the solution. Is there no way to release the memory of a DataFrame that is no longer in use?
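For reference, a workaround often suggested for this situation (not from the original issue text) is to stream the file with `read_csv`'s `chunksize` parameter, so only one piece of the file is in memory at a time, and to drop each chunk once it has been processed. A minimal sketch, using a tiny in-memory file instead of the 27 GB UTF-16 file above:

```python
import io
import pandas as pd

# Tiny in-memory stand-in for a large tab-separated file.
csv_text = "id\tvalue\n1\t10\n2\t20\n3\t30\n"

total = 0
rows = 0
# chunksize=... makes read_csv return an iterator of DataFrames,
# so only one chunk is held in memory at a time.
for chunk in pd.read_csv(io.StringIO(csv_text), sep="\t", chunksize=2):
    total += int(chunk["value"].sum())
    rows += len(chunk)
    del chunk  # drop the reference so the chunk can be garbage-collected

print(rows, total)  # 3 60
```

For the real file you would pass the filename plus `encoding="utf-16"` and aggregate per chunk instead of keeping all rows.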
Issue Analytics
- State:
- Created 5 years ago
- Reactions: 1
- Comments: 11
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I suggest you do
data = pd.read_csv("stochastic_data_2015.01.01 00_00_.csv", encoding="utf-16", header=0, sep="\t", nrows=10)
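The point of `nrows` is that only the first N rows are parsed, which is a cheap way to check that the file's encoding, separator, and columns are right before attempting a full load. A small self-contained illustration (synthetic data, not the issue's file):

```python
import io
import pandas as pd

# 100 data rows, tab-separated, two columns.
csv_text = "a\tb\n" + "\n".join(f"{i}\t{i * 2}" for i in range(100))

# Only the first 10 data rows are parsed and held in memory.
preview = pd.read_csv(io.StringIO(csv_text), sep="\t", nrows=10)
print(preview.shape)  # (10, 2)
```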
Consider using a more powerful machine (for example, HPC). You can’t blame a Honda for not running as fast as a Ferrari, right?
Memory errors happen a lot with Python when using the 32-bit Windows version. This is because a 32-bit process only gets 2 GB of memory to play with by default.
The solution for this error is that the pandas.read_csv() function takes an option called dtype. This lets pandas know what types exist inside your CSV data.
For example, specifying dtype={'age': int} as an option to .read_csv() will let pandas know that the age column should be interpreted as a number. This saves you lots of memory.
pd.read_csv('data.csv', dtype={'age': int})
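To make the saving concrete (the numbers below come from a small synthetic frame, not the issue's data): declaring a narrow integer dtype such as int8 instead of the inferred default int64 cuts that column's footprint by 8x, since each value takes 1 byte instead of 8:

```python
import io
import pandas as pd

# 1000 small integers in one column.
csv_text = "age\n" + "\n".join(str(i % 90) for i in range(1000))

df64 = pd.read_csv(io.StringIO(csv_text))                       # inferred int64
df8 = pd.read_csv(io.StringIO(csv_text), dtype={"age": "int8"})  # explicit int8

print(df64["age"].memory_usage(index=False))  # 8000 bytes
print(df8["age"].memory_usage(index=False))   # 1000 bytes
```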
Or try the solution below:
pd.read_csv('data.csv', sep='\t', low_memory=False)