Client hang for Presto query
Alluxio Version: 2.0.1
Describe the bug
We run the same query repeatedly, and after several days of running it hangs.
The query is
select count(1) from a table.
The table was loaded from HDFS into Alluxio ahead of time. During the run, some Alluxio workers may go down, and the data lost with them may need to be read directly from HDFS.
From jstack, we find 30 threads (we set Presto concurrency to 30) stuck at:
"20190919_081107_00969_nmmwc.1.7-19-4862" #4862 prio=5 os_prio=0 tid=0x00007f9618009dd0 nid=0xa86a2 waiting on condition [0x00007f925c2f1000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007f9bf8c94008> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at alluxio.resource.ResourcePool.acquire(ResourcePool.java:132)
at alluxio.resource.ResourcePool.acquire(ResourcePool.java:77)
at alluxio.client.file.FileSystemContext.acquireBlockMasterClient(FileSystemContext.java:421)
at alluxio.client.file.FileSystemContext.acquireBlockMasterClientResource(FileSystemContext.java:441)
at alluxio.client.block.AlluxioBlockStore.getInStream(AlluxioBlockStore.java:169)
at alluxio.client.file.FileInStream.positionedReadInternal(FileInStream.java:252)
at alluxio.client.file.FileInStream.positionedRead(FileInStream.java:232)
at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:128)
at alluxio.hadoop.HdfsFileInputStream.readFully(HdfsFileInputStream.java:145)
at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:111)
at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:120)
at io.prestosql.parquet.reader.MetadataReader.readFully(MetadataReader.java:313)
at io.prestosql.parquet.reader.MetadataReader.readFooter(MetadataReader.java:92)
at io.prestosql.plugin.hive.parquet.ParquetPageSourceFactory.createParquetPageSource(ParquetPageSourceFactory.java:161)
at io.prestosql.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:120)
at io.prestosql.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:162)
at io.prestosql.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:96)
at io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:53)
at io.prestosql.split.PageSourceManager.createPageSource(PageSourceManager.java:56)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:271)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver$$Lambda$2810/201986911.get(Unknown Source)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_shopee_101_dirty__shopee102____20190918_090137_1.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
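The trace above shows every thread parked in a condition wait under ResourcePool.acquire, which is consistent with all pooled block master clients being checked out and none being returned. Whether a client is actually leaked is an assumption here, not something the trace proves. A minimal sketch of the failure mode and of a bounded wait, using a hypothetical BoundedPoolSketch class that is not Alluxio's ResourcePool:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical bounded pool for illustration only; NOT Alluxio's ResourcePool. */
public class BoundedPoolSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Deque<String> idle = new ArrayDeque<>();

    public BoundedPoolSketch(int capacity) {
        // Pre-create a fixed number of placeholder "clients".
        for (int i = 0; i < capacity; i++) {
            idle.push("client-" + i);
        }
    }

    /** Waits up to timeoutMs for a resource; returns null if none is released in time. */
    public String acquire(long timeoutMs) throws InterruptedException {
        long remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (idle.isEmpty()) {
                if (remaining <= 0) {
                    return null; // give up instead of parking forever
                }
                // An untimed await() here would park until some thread calls release().
                remaining = notEmpty.awaitNanos(remaining);
            }
            return idle.pop();
        } finally {
            lock.unlock();
        }
    }

    /** Returns a resource to the pool and wakes one waiter. */
    public void release(String resource) {
        lock.lock();
        try {
            idle.push(resource);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedPoolSketch pool = new BoundedPoolSketch(1);
        String held = pool.acquire(100);   // checked out and not yet released
        String second = pool.acquire(200); // pool is exhausted, so this times out
        System.out.println("first=" + held + ", second=" + second);
        pool.release(held);
        System.out.println("after release=" + pool.acquire(100));
    }
}
```

With an untimed wait in place of awaitNanos, the second acquire would never return unless the first client were released, which matches the WAITING (parking) state of the 30 threads.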
To Reproduce
Currently not clear.
Expected behavior
No hanging.
Urgency
High
Additional context
Issue Analytics
- State:
- Created 4 years ago
- Comments:5 (2 by maintainers)
Top GitHub Comments
@cabhi we will patch the system with #10719 for this issue
Is there any update on this?