read_sas crashes when reading time data

See original GitHub issue

read_sas works just fine with my other tables. I suspect this may have something to do with converting timeseries or timedeltas, given the nature of the error message and the fact that this only happens with a table that contains time series.

> df = pd.read_sas(filename,encoding='utf-8')

  File "<ipython-input-119-402090ec994d>", line 1, in <module>
    vehicle = pd.read_sas('C:/Users/aliceell/Documents/schemas/new_mexico/dr527_veh_1214.sas7bdat',format='sas7bdat',encoding='ISO-8859-1')

  File "C:\Users\aliceell\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\sas\sasreader.py", line 61, in read_sas
    return reader.read()

  File "C:\Users\aliceell\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\sas\sas7bdat.py", line 589, in read
    rslt = self._chunk_to_dataframe()

  File "C:\Users\aliceell\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\sas\sas7bdat.py", line 634, in _chunk_to_dataframe
    rslt[name] = epoch + pd.to_timedelta(rslt[name], unit='d')

  File "C:\Users\aliceell\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\util\decorators.py", line 91, in wrapper
    return func(*args, **kwargs)

  File "C:\Users\aliceell\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\tseries\timedeltas.py", line 90, in to_timedelta
    values = _convert_listlike(arg._values, box=False, unit=unit)

  File "C:\Users\aliceell\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\tseries\timedeltas.py", line 78, in _convert_listlike
    _ensure_object(arg), unit=unit, errors=errors)

  File "pandas\tslib.pyx", line 2933, in pandas.tslib.array_to_timedelta64 (pandas\tslib.c:52393)

  File "pandas\tslib.pyx", line 3253, in pandas.tslib.convert_to_timedelta64 (pandas\tslib.c:55946)

  File "pandas\tslib.pyx", line 3598, in pandas.tslib.cast_from_unit (pandas\tslib.c:61059)

OverflowError: int too big to convert

Output of pd.show_versions()

pd.show_versions()
C:\Users\aliceell\AppData\Local\Continuum\Anaconda3\lib\site-packages\matplotlib\__init__.py:1401: UserWarning: This call to matplotlib.use() has no effect because the backend has already been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot, or matplotlib.backends is imported for the first time.
  warnings.warn(_use_error_msg)

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 37 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en

pandas: 0.18.1
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None

I’m on Windows 7 64-bit.

Google’s giving me nothing. It’s difficult for me to provide sample data because I don’t actually have SAS and so anything I provide would have to be run through pandas or R or something similar to convert it to a CSV, potentially overwriting whatever’s causing the bug. But if you think it would be helpful, I can do that.

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
aliceell commented, Apr 27, 2017

Thanks! I had looked at the docs, but didn’t even know where to start…

Here’s the problem. I found the column where it breaks, and it’s not the column I was thinking of (unsurprisingly). Apparently the data contains more than one datetime column, and one of them has several invalid dates like 2316-01-05, 3015-12-10, and 9988-09-07. So I guess I’ll just read it into R for now…

Thanks for your help! I would not have been able to find that problematic column without your help on the debugger.
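The dates in that comment are consistent with the OverflowError in the traceback, which comes from converting day counts with pd.to_timedelta(..., unit='d'). The following is a minimal sketch, using only the standard library, of why such dates could overflow. It assumes that SAS dates are day counts from 1960-01-01 and that the timedelta is held as a signed 64-bit count of nanoseconds. The names sas_days, overflows_ns, and MAX_SAFE_DAYS are made up for illustration and are not part of pandas.

```python
from datetime import date

# Assumption: SAS stores dates as day counts from 1960-01-01.
SAS_EPOCH = date(1960, 1, 1)

# Assumption: timedeltas are held as signed 64-bit nanosecond counts,
# so this is the largest whole number of days that still fits.
NS_PER_DAY = 86400 * 10**9
MAX_SAFE_DAYS = (2**63 - 1) // NS_PER_DAY


def sas_days(d):
    """Return the SAS day count for a calendar date."""
    return (d - SAS_EPOCH).days


def overflows_ns(days):
    """True if a day count cannot be expressed in 64-bit nanoseconds."""
    return abs(days) > MAX_SAFE_DAYS


# The three dates reported in the comment, plus one ordinary date.
samples = [date(2316, 1, 5), date(3015, 12, 10), date(9988, 9, 7), date(2017, 4, 27)]
flagged = [d.isoformat() for d in samples if overflows_ns(sas_days(d))]
print(flagged)
```

Under these assumptions the limit works out to roughly 292 years on either side of the epoch, so all three reported dates are flagged while an ordinary 2017 date is not.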

0 reactions
TomAugspurger commented, Apr 27, 2017

Commands are here, but basically press u a few times to go up frames to the interesting ones. I think the frame around

File "C:\Users\aliceell\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\sas\sas7bdat.py", line 634, in _chunk_to_dataframe

should be informative. Press l to show the lines of code around there, and p <var> to print variables. Like p name to see the column, and p rslt to see the dataframe. Make sure not to paste anything here that’s sensitive 😉
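The debugger steps above (going up frames, then printing name) can also be approximated without an interactive session by walking the traceback and reading the local variable from the frame of interest. This is only a sketch: chunk_to_dataframe and convert_days are hypothetical stand-ins for the pandas internals in the traceback, the column names are invented, and the 106751-day limit is an assumed bound for 64-bit nanosecond timedeltas.

```python
MAX_SAFE_DAYS = 106751  # assumed limit for 64-bit nanosecond timedeltas


def convert_days(days):
    """Hypothetical stand-in for the timedelta conversion that overflowed."""
    if abs(days) > MAX_SAFE_DAYS:
        raise OverflowError("int too big to convert")
    return days


def chunk_to_dataframe(columns):
    """Hypothetical stand-in for the reader loop that converts each column."""
    result = {}
    for name, values in columns.items():
        result[name] = [convert_days(v) for v in values]
    return result


def local_from_traceback(tb, func_name, var_name):
    """Walk the traceback and return var_name from the frame of func_name."""
    while tb is not None:
        frame = tb.tb_frame
        if frame.f_code.co_name == func_name and var_name in frame.f_locals:
            return frame.f_locals[var_name]
        tb = tb.tb_next
    return None


# Invented data: the second column holds one out-of-range day count.
columns = {"crash_date": [20000, 20001], "report_date": [20000, 130033]}
try:
    chunk_to_dataframe(columns)
except OverflowError as exc:
    bad_column = local_from_traceback(exc.__traceback__, "chunk_to_dataframe", "name")
    print("conversion failed in column:", bad_column)
```

With the real reader, the same idea would mean catching the exception raised by read_sas and looking for the frame named _chunk_to_dataframe, though the interactive debugger is usually the quicker route.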
