Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

MemoryError: While reading csv file

See original GitHub issue

Code Sample, a copy-pastable example if possible

# Your code here
import pandas as pd
data = pd.read_csv("stochastic_data_2015.01.01 00_00_.csv", encoding="utf-16", header=0, sep="\t")
data.head(10)

The file size is 27 GB. I have 8 GB of RAM and a 250 GB SSD, on Windows 10.

Problem description

As you can see, my RAM is limited, so I thought the MemoryError might be caused by that. But sometimes the file is read successfully, and then the next attempt to read it fails with the following error:
gist of complete error

Kindly suggest what I can do, as this is impacting my data analysis. Data analysis is one of the main functions performed with pandas; if it cannot handle this data with limited resources, then I am afraid I need to leave it and try something else, which would be a time-consuming process, as I am used to pandas.

Do let me know the solution. Is there no way to release the memory of a DataFrame that is no longer in use?
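
On that last question (releasing the memory of a DataFrame that is no longer needed), dropping every reference to it and invoking the garbage collector is the usual route in plain Python. A minimal sketch, with a hypothetical file name standing in for the real data:

import gc
import pandas as pd

data = pd.read_csv("sample.csv", sep="\t")  # hypothetical small file

# ... analysis using `data` ...

# Drop the last reference, then prompt the collector; the
# DataFrame's memory can be reclaimed once nothing refers to it.
del data
gc.collect()

When exactly the operating system sees the memory returned depends on the allocator, but inside the Python process the DataFrame is gone after del.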

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 11

Top GitHub Comments

2 reactions
ginward commented, Jul 6, 2018

I suggest you do data = pd.read_csv("stochastic_data_2015.01.01 00_00_.csv", encoding="utf-16", header=0, sep="\t", nrows=10), which loads only the first 10 rows.

Consider using a more powerful machine (for example, an HPC cluster). You can’t blame a Honda for not running as fast as a Ferrari, right?
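
Beyond sampling with nrows, a chunked read can process the whole 27 GB file without ever holding it in memory at once, which fits the 8 GB constraint here. A minimal sketch using the file name and options from the question; the row count is a stand-in for whatever per-chunk analysis is actually needed:

import pandas as pd

reader = pd.read_csv(
    "stochastic_data_2015.01.01 00_00_.csv",
    encoding="utf-16",
    header=0,
    sep="\t",
    chunksize=100_000,  # rows per chunk; tune to available RAM
)

# Each iteration yields one DataFrame chunk; the previous
# chunk becomes unreferenced and can be garbage-collected.
total_rows = 0
for chunk in reader:
    total_rows += len(chunk)
print(total_rows)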

1 reaction
linehammer commented, May 10, 2021

Memory errors happen a lot with Python on 32-bit Windows, because a 32-bit process only gets about 2 GB of memory to work with by default.
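
A quick way to confirm whether the interpreter itself is a 32-bit build (and therefore subject to that cap) is to check the pointer size, using only the standard library:

import struct
import sys

print(struct.calcsize("P") * 8)  # 32 on a 32-bit build, 64 on a 64-bit build
print(sys.maxsize > 2**32)       # True only on a 64-bit interpreter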

One fix for this error is that the pandas.read_csv() function takes an option called dtype, which lets pandas know what types exist inside your CSV data.

For example, specifying dtype={'age': int} as an option to .read_csv() lets pandas know that age should be interpreted as a number. This saves you lots of memory.

pd.read_csv('data.csv', dtype={'age': int})

Or try the solution below:

pd.read_csv('data.csv', sep='\t', low_memory=False)
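
To see what a dtype map actually saves, pandas can report per-column memory after the load. A sketch with hypothetical column names (age, score) and narrower numeric types than the 64-bit defaults:

import pandas as pd

df = pd.read_csv(
    "data.csv",
    sep="\t",
    usecols=["age", "score"],                    # load only the columns needed
    dtype={"age": "int32", "score": "float32"},  # half the default 64-bit width
)
print(df.memory_usage(deep=True))                # bytes per column

Combining usecols with dtype usually cuts the footprint far more than low_memory, which only changes how the parser chunks its type inference, not the size of the resulting frame.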

Read more comments on GitHub >

Top Results From Across the Web

  • Memory error when using pandas read_csv - Stack Overflow
    "Definitely pandas should not be having issues with csvs that size." ... "You can..."
  • How to handle "Memory Error" while loading a huge file in Pandas-Python
    How to read a sample of data from a csv file...
  • Reading Large File as Pandas DataFrame Memory Error Issue
    Code: https://soumilshah1995.blogspot.com/
  • MemoryError when calling pd.read_csv - Google Groups
    It works fine on the same machine using: import pandas as pd; df = pd.read_csv(file, quotechar='"', encoding='latin-1', index_col=False, ...
  • How to avoid Memory errors with Pandas
    One strategy for solving this kind of problem is to decrease the amount of data by either reducing the number of rows or...
