HS2Error when running as_pandas

I’m running a smallish query (the result is 8MB of data), and getting an HS2Error when I try to read the data. as_pandas is working on smaller queries. Any idea what could be going on here?

Here’s what I’m running:

import impala.dbapi
from impala.util import as_pandas
c = impala.dbapi.connect(port=21050).cursor() # works fine
c.execute("[my query]") # works fine
df = as_pandas(c) # oh no!

and the error:

---------------------------------------------------------------------------
HS2Error                                  Traceback (most recent call last)
<ipython-input-5-bee92ca13acd> in <module>()
----> 1 df = as_pandas(c)

/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/util.pyc in as_pandas(cursor)
     21     def as_pandas(cursor):
     22         names = [metadata[0] for metadata in cursor.description]
---> 23         return pd.DataFrame([dict(zip(names, row)) for row in cursor], columns=names)
     24 except ImportError:
     25     print "Failed to import pandas"

/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/dbapi.pyc in next(self)
    246             rows = impala.rpc.fetch_results(self.service,
    247                     self._last_operation_handle, self.description,
--> 248                     self.buffersize)
    249             self._buffer.extend(rows)
    250             if len(self._buffer) == 0:

/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/rpc.pyc in wrapper(*args, **kwargs)
    116                 if not transport.isOpen():
    117                     transport.open()
--> 118                 return func(*args, **kwargs)
    119             except socket.error as e:
    120                 pass

/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/rpc.pyc in fetch_results(service, operation_handle, schema, max_rows, orientation)
    235                            maxRows=max_rows)
    236     resp = service.FetchResults(req)
--> 237     err_if_rpc_not_ok(resp)
    238
    239     rows = []

/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/error.pyc in err_if_rpc_not_ok(resp)
     55     if (resp.status.statusCode != TStatusCode._NAMES_TO_VALUES['SUCCESS_STATUS'] and
     56             resp.status.statusCode != TStatusCode._NAMES_TO_VALUES['SUCCESS_WITH_INFO_STATUS']):
---> 57         raise HS2Error(resp.status.errorMessage)

HS2Error: Invalid session id
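
Since the traceback shows the failure inside a fetch that runs while the cursor is being iterated, one way to narrow it down is to drain the cursor in explicit batches and note how many rows arrive before the error. The following is only a sketch: `fetch_in_batches` is a hypothetical helper, it assumes the cursor implements the standard DB-API `fetchmany` method, and `sqlite3` stands in for the real connection so the example runs anywhere.

```python
import sqlite3


def fetch_in_batches(cursor, batch_size=1000):
    """Drain a DB-API cursor with fetchmany.

    Returns (column_names, rows, error). If a fetch raises, the rows
    collected so far are returned together with the exception, which
    shows how far the read got before failing.
    """
    names = [col[0] for col in cursor.description]
    rows = []
    try:
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            rows.extend(batch)
    except Exception as exc:  # in the report above this would be HS2Error
        return names, rows, exc
    return names, rows, None


# Stand-in demo: sqlite3 replaces the Impala connection (an assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER, label TEXT)")
cur.executemany("INSERT INTO t VALUES (?, ?)",
                [(i, "row%d" % i) for i in range(2500)])
cur.execute("SELECT id, label FROM t ORDER BY id")

names, rows, error = fetch_in_batches(cur, batch_size=1000)
print(names, len(rows), error)
```

If the error only appears after many batches, that would be consistent with a session expiring mid-read rather than a problem with the query itself.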

Issue Analytics

  • State: closed
  • Created 9 years ago
  • Comments: 15 (10 by maintainers)

Top GitHub Comments

2 reactions
mariusvniekerk commented, Aug 23, 2015

Sounds like a timeout. You may want to increase your Connection’s timeout.

After invalidating a table it can take quite a lot of hive metastore calls before it becomes operational again.
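
If the session really is expiring, a complementary workaround to a longer timeout is to reconnect and rerun the query when the read fails. This is a minimal sketch under stated assumptions, not impyla's own API: `run_with_reconnect` is a hypothetical helper, `connect` is any zero-argument callable returning a DB-API connection (it is also where a longer timeout would be configured), and `sqlite3` is used as a stand-in.

```python
import sqlite3
import time


def run_with_reconnect(connect, query, attempts=3, delay=0.0):
    """Run query on a fresh connection, retrying if execute/fetch fails.

    connect: zero-argument callable returning a DB-API connection.
    Returns (column_names, rows) from the first attempt that succeeds.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute(query)
            names = [col[0] for col in cur.description]
            return names, cur.fetchall()
        except Exception as exc:  # narrow to the driver's error type in real use
            last_error = exc
            time.sleep(delay)
        finally:
            conn.close()
    raise last_error


# Stand-in demo: sqlite3 replaces the Impala connection (an assumption).
names, rows = run_with_reconnect(lambda: sqlite3.connect(":memory:"),
                                 "SELECT 1 AS one")
print(names, rows)
```

Rerunning the whole query is only reasonable for small results like the one described in the question; it does not address why the session became invalid.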

0 reactions
wilberh commented, Jul 10, 2020

Adding a timeout to the connection didn’t help. I had to switch to pyhive. 😕
