Client hang for Presto query


Alluxio Version: 2.0.1

Describe the bug

We run the same query repeatedly, and after running for several days it hangs.

The query is

select count(1) from a table. 

We had already loaded this table from HDFS into Alluxio ahead of time. During the run, some Alluxio workers may go down, and data lost with them may need to be read directly from HDFS.

From jstack, we find 30 threads (we set the Presto concurrency to 30) stuck at:

"20190919_081107_00969_nmmwc.1.7-19-4862" #4862 prio=5 os_prio=0 tid=0x00007f9618009dd0 nid=0xa86a2 waiting on condition [0x00007f925c2f1000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00007f9bf8c94008> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at alluxio.resource.ResourcePool.acquire(ResourcePool.java:132)
        at alluxio.resource.ResourcePool.acquire(ResourcePool.java:77)
        at alluxio.client.file.FileSystemContext.acquireBlockMasterClient(FileSystemContext.java:421)
        at alluxio.client.file.FileSystemContext.acquireBlockMasterClientResource(FileSystemContext.java:441)
        at alluxio.client.block.AlluxioBlockStore.getInStream(AlluxioBlockStore.java:169)
        at alluxio.client.file.FileInStream.positionedReadInternal(FileInStream.java:252)
        at alluxio.client.file.FileInStream.positionedRead(FileInStream.java:232)
        at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:128)
        at alluxio.hadoop.HdfsFileInputStream.readFully(HdfsFileInputStream.java:145)
        at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:111)
        at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:120)
        at io.prestosql.parquet.reader.MetadataReader.readFully(MetadataReader.java:313)
        at io.prestosql.parquet.reader.MetadataReader.readFooter(MetadataReader.java:92)
        at io.prestosql.plugin.hive.parquet.ParquetPageSourceFactory.createParquetPageSource(ParquetPageSourceFactory.java:161)
        at io.prestosql.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:120)
        at io.prestosql.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:162)
        at io.prestosql.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:96)
        at io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:53)
        at io.prestosql.split.PageSourceManager.createPageSource(PageSourceManager.java:56)
        at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:271)
        at io.prestosql.operator.Driver.processInternal(Driver.java:379)
        at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
        at io.prestosql.operator.Driver$$Lambda$2810/201986911.get(Unknown Source)
        at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
        at io.prestosql.operator.Driver.processFor(Driver.java:276)
        at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
        at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
        at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
        at io.prestosql.$gen.Presto_shopee_101_dirty__shopee102____20190918_090137_1.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
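
The trace shows every thread parked in an untimed ConditionObject.await() inside ResourcePool.acquire, which means each caller waits indefinitely for a pooled block master client to be returned. The following is a minimal sketch of that general pattern, not Alluxio's actual ResourcePool: the class BoundedPoolSketch, its nested Pool, and the integer "resources" are all hypothetical names invented for illustration. It assumes a fixed-capacity pool and shows how a single resource that is never released makes later acquires wait, and how a timed wait at least turns a silent hang into a visible failure. Whether leaked clients are the actual cause of this issue is not confirmed here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedPoolSketch {

    /** Hypothetical fixed-size pool of integer "resources"; not Alluxio code. */
    static final class Pool {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition notEmpty = lock.newCondition();
        private final Deque<Integer> free = new ArrayDeque<>();

        Pool(int capacity) {
            for (int i = 0; i < capacity; i++) {
                free.push(i);
            }
        }

        /**
         * Waits up to timeoutMs for a free resource and returns null on timeout.
         * An untimed notEmpty.await() here would instead park the caller until
         * some other thread calls release(), which is the state in the dump.
         */
        Integer acquire(long timeoutMs) throws InterruptedException {
            long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            lock.lock();
            try {
                while (free.isEmpty()) {
                    if (nanos <= 0) {
                        return null; // gave up waiting
                    }
                    nanos = notEmpty.awaitNanos(nanos);
                }
                return free.pop();
            } finally {
                lock.unlock();
            }
        }

        /** Returns a resource to the pool and wakes one waiting thread. */
        void release(Integer resource) {
            lock.lock();
            try {
                free.push(resource);
                notEmpty.signal();
            } finally {
                lock.unlock();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Pool pool = new Pool(1);
        Integer held = pool.acquire(100);    // taken and, for now, not released
        Integer second = pool.acquire(200);  // pool is empty, so this wait expires
        System.out.println("second acquire: " + (second == null ? "timed out" : "ok"));
        pool.release(held);                  // returning the resource unblocks callers
        Integer third = pool.acquire(200);
        System.out.println("after release: " + (third == null ? "timed out" : "ok"));
    }
}
```

When a dump looks like the one above, it can help to check whether any thread in the same dump is holding a pooled client without returning it, since the waiters only make progress once release is called.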

To Reproduce

Currently not clear.

Expected behavior

No hanging.

Urgency

High

Additional context

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

apc999 commented, Jan 10, 2020 (1 reaction)

@cabhi we will patch the system with #10719 for this issue

cabhi commented, Dec 11, 2019 (0 reactions)

Is there any update on this?


