HS2Error when running as_pandas
I’m running a smallish query (the result is 8 MB of data) and getting an HS2Error when I try to read the data with as_pandas. as_pandas works on smaller queries. Any idea what could be going on here?
Here’s what I’m running:
import impala.dbapi
from impala.util import as_pandas
c = impala.dbapi.connect(port=21050).cursor() # works fine
c.execute("[my query]") # works fine
df = as_pandas(c) # oh no!
and the error:
---------------------------------------------------------------------------
HS2Error Traceback (most recent call last)
<ipython-input-5-bee92ca13acd> in <module>()
----> 1 df = as_pandas(c)
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/util.pyc in as_pandas(cursor)
21 def as_pandas(cursor):
22 names = [metadata[0] for metadata in cursor.description]
---> 23 return pd.DataFrame([dict(zip(names, row)) for row in cursor], columns=names)
24 except ImportError:
25 print "Failed to import pandas"
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/dbapi.pyc in next(self)
246 rows = impala.rpc.fetch_results(self.service,
247 self._last_operation_handle, self.description,
--> 248 self.buffersize)
249 self._buffer.extend(rows)
250 if len(self._buffer) == 0:
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/rpc.pyc in wrapper(*args, **kwargs)
116 if not transport.isOpen():
117 transport.open()
--> 118 return func(*args, **kwargs)
119 except socket.error as e:
120 pass
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/rpc.pyc in fetch_results(service, operation_handle, schema, max_rows, orientation)
235 maxRows=max_rows)
236 resp = service.FetchResults(req)
--> 237 err_if_rpc_not_ok(resp)
238
239 rows = []
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/error.pyc in err_if_rpc_not_ok(resp)
55 if (resp.status.statusCode != TStatusCode._NAMES_TO_VALUES['SUCCESS_STATUS'] and
56 resp.status.statusCode != TStatusCode._NAMES_TO_VALUES['SUCCESS_WITH_INFO_STATUS']):
---> 57 raise HS2Error(resp.status.errorMessage)
HS2Error: Invalid session id
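The traceback shows the error surfacing while the cursor is being iterated, inside a FetchResults call, rather than in execute. One way to narrow this down might be to pull the rows in explicit batches and record how many arrive before the failure. Below is a minimal sketch that assumes only a standard DB-API 2.0 cursor with fetchmany; sqlite3 stands in for the Impala connection so the snippet runs on its own, and iter_batches is a hypothetical helper, not part of impyla.

```python
import sqlite3


def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows from a DB-API cursor until it is exhausted."""
    while True:
        chunk = cursor.fetchmany(batch_size)
        if not chunk:
            return
        yield chunk


# Stand-in for impala.dbapi.connect(...).cursor(); this is an assumption
# made so the sketch is runnable without a cluster.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER, label TEXT)")
cur.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(i, "row%d" % i) for i in range(2500)],
)
cur.execute("SELECT id, label FROM t ORDER BY id")

# Column names come from cursor.description, as in as_pandas.
names = [meta[0] for meta in cur.description]
rows = []
batch_sizes = []
for chunk in iter_batches(cur, batch_size=1000):
    rows.extend(chunk)
    batch_sizes.append(len(chunk))  # progress so far if a later fetch fails

print(names, len(rows), batch_sizes)
```

With the real cursor, passing rows and names to pd.DataFrame(rows, columns=names) should give a frame equivalent to what as_pandas builds, and the length of batch_sizes at the point of failure would show whether any batches came through before the session became invalid.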
Issue Analytics
- Created 9 years ago
- Comments:15 (10 by maintainers)
Top GitHub Comments
Sounds like a timeout. You may want to increase your connection’s timeout.
After invalidating a table, it can take quite a lot of Hive metastore calls before the table becomes operational again.
Adding a timeout to the connection didn’t help. I had to switch to pyhive. 😕