.collect() call hangs indefinitely when spline lineage tracing is enabled

Describe the bug

Calling .collect() on a Dataset obtained via JDBC hangs indefinitely. The same code works fine when Spline lineage tracking is turned off. It also works on Spline 0.5.3, but not on Spline 0.5.5.

Versions

  • Scala 2.11
  • Spark 2.4.6
  • Spline 0.5.5 (the hang doesn’t occur on Spline 0.5.3)

Components State

  • ArangoDB running without errors
  • ArangoDB spline database initialized
  • Rest Gateway running and
    • connects to ArangoDB
    • there are no errors in logs
  • Spline UI running and
    • connects to Rest Gateway consumer
    • there are no errors in logs

To Reproduce

Steps to reproduce the behavior OR commands run:

  1. Create a relational table with, say, two columns (Postgres is used below, but the issue occurs on any database)
  2. Add some dummy rows
  3. Write some very basic Java Spark code to load the table into a Dataset and call .collect() on it
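    import java.util.Properties;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import za.co.absa.spline.harvester.SparkLineageInitializer;

    SparkSession spark = SparkSession
      .builder()
      .appName("Java Spark SQL basic example")
      .master("local")
      .getOrCreate();

    // Register the Spline listener; without this call the code below does not hang
    SparkLineageInitializer.enableLineageTracking(spark);

    String dbConnectionUrl = "jdbc:postgresql://postgresserver/spark_labs";
    Properties prop = new Properties();
    prop.setProperty("driver", "org.postgresql.Driver");
    prop.setProperty("user", "****");
    prop.setProperty("password", "****");

    // Load the table over JDBC
    Dataset<Row> dsp = spark.read().jdbc(dbConnectionUrl, "<tablename>", prop);
    // Hangs here when Spline is turned on, works when Spline is turned off
    Object rows = dsp.collect();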

Expected behaviour

The rows should be returned fairly quickly, without hanging. The example above is in Java, but the same hang occurs when similar code is written in Scala.

Desktop

  • OS: Windows Server 2016
  • Java 8

Additional context

The same example doesn’t hang in Spline 0.5.3.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

1 reaction
metametametameta commented, Nov 13, 2020

I verified the following with the 0.5.6 agent (running against the 0.5.5 Spline REST server):

In the log4j properties, with

log4j.logger.za.co.absa.spline.harvester=debug

the hang no longer occurs.

With

log4j.logger.za.co.absa.spline.harvester=trace

the hang still occurs (as expected in 0.5.6).

So, basically, unless someone wants trace-level logging in 0.5.6, the fix works. Thanks!
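
The debug versus trace behaviour described above is consistent with an expensive dump that is only built when trace logging is enabled. The actual change in the fix is not shown on this page, so the following is only a minimal sketch of that general pattern, assuming a level check guards the costly call. It uses java.util.logging instead of log4j to stay self-contained (FINE and FINEST stand in for debug and trace), and GuardedTraceLogging, expensiveDump, and traceLazily are hypothetical names, not Spline APIs.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedTraceLogging {
    // Counts how many times the costly dump was actually built.
    static int dumpCalls = 0;

    // Hypothetical stand-in for an expensive object-structure dump.
    static String expensiveDump(Object obj) {
        dumpCalls++;
        return "dump of " + obj;
    }

    // Evaluates the message supplier only if the logger would emit FINEST records.
    static void traceLazily(Logger logger, Supplier<String> message) {
        if (logger.isLoggable(Level.FINEST)) {
            logger.finest(message.get());
        }
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("demo.harvester");

        logger.setLevel(Level.FINE);   // roughly "debug": the dump is skipped
        traceLazily(logger, () -> expensiveDump("plan"));
        System.out.println("dumps built at FINE: " + dumpCalls);

        logger.setLevel(Level.FINEST); // roughly "trace": the dump is built
        traceLazily(logger, () -> expensiveDump("plan"));
        System.out.println("dumps built at FINEST: " + dumpCalls);
    }
}
```

With a guard like this, the cost of the dump is paid only at the most verbose level, which matches the observation that the hang disappears at debug and remains at trace.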

1 reaction
metametametameta commented, Nov 12, 2020

@metametametameta Can you try to capture a thread dump to see which class or method it’s stuck in?

I have reasons to suspect ObjectStructureDumper. If so, PR #150 should temporarily fix the issue as long as TRACE-level logging is disabled.

Here is the thread dump I see in jvisualvm (ObjectStructureDumper does seem to be involved, as you suspected):

  java.lang.Thread.State: RUNNABLE
       at java.util.Arrays.copyOf(Arrays.java:3332)
       at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
       at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
       at java.lang.StringBuilder.append(StringBuilder.java:136)
       at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:200)
       at za.co.absa.spline.harvester.logging.ObjectStructureDumper$.objectToStringRec(ObjectStructureDumper.scala:85)
       at za.co.absa.spline.harvester.logging.ObjectStructureDumper$.dump(ObjectStructureDumper.scala:32)
       at za.co.absa.spline.harvester.LineageHarvester.harvest(LineageHarvester.scala:79)
       at za.co.absa.spline.harvester.QueryExecutionEventHandler.onSuccess(QueryExecutionEventHandler.scala:42)
       at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener$$anonfun$onSuccess$1$$anonfun$apply$mcV$sp$1.apply(SplineQueryExecutionListener.scala:40)
       at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener$$anonfun$onSuccess$1$$anonfun$apply$mcV$sp$1.apply(SplineQueryExecutionListener.scala:40)
       at scala.Option.foreach(Option.scala:257)
       at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener$$anonfun$onSuccess$1.apply$mcV$sp(SplineQueryExecutionListener.scala:40)
       at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.withErrorHandling(SplineQueryExecutionListener.scala:49)
       at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.onSuccess(SplineQueryExecutionListener.scala:39)
       at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$onSuccess$1$$anonfun$apply$mcV$sp$1.apply(QueryExecutionListener.scala:129)
       at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$onSuccess$1$$anonfun$apply$mcV$sp$1.apply(QueryExecutionListener.scala:128)
       at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$org$apache$spark$sql$util$ExecutionListenerManager$$withErrorHandling$1.apply(QueryExecutionListener.scala:157)
       at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$org$apache$spark$sql$util$ExecutionListenerManager$$withErrorHandling$1.apply(QueryExecutionListener.scala:155)
       at scala.collection.immutable.List.foreach(List.scala:392)
       at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
       at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
       at org.apache.spark.sql.util.ExecutionListenerManager.org$apache$spark$sql$util$ExecutionListenerManager$$withErrorHandling(QueryExecutionListener.scala:155)
       at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$onSuccess$1.apply$mcV$sp(QueryExecutionListener.scala:128)
       at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$onSuccess$1.apply(QueryExecutionListener.scala:128)
       at org.apache.spark.sql.util.ExecutionListenerManager$$anonfun$onSuccess$1.apply(QueryExecutionListener.scala:128)
       at org.apache.spark.sql.util.ExecutionListenerManager.readLock(QueryExecutionListener.scala:168)
       at org.apache.spark.sql.util.ExecutionListenerManager.onSuccess(QueryExecutionListener.scala:127)
       at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3373)
       at org.apache.spark.sql.Dataset.collect(Dataset.scala:2788)
       at net.jgp.books.spark.ch02.lab100_csv_to_db.JavaSparkSQLExample.runDatasetCreationExample(JavaSparkSQLExample.java:270)
       at net.jgp.books.spark.ch02.lab100_csv_to_db.JavaSparkSQLExample.main(JavaSparkSQLExample.java:115)
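
A dump like the one above can be taken with jvisualvm, as here, or produced from inside the JVM. The following is a minimal sketch using only the standard Thread API; ThreadDumpSketch and its methods are hypothetical names for illustration and are not part of Spark or Spline.

```java
import java.util.Map;

public class ThreadDumpSketch {
    // Formats one thread in a layout similar to a jstack/jvisualvm dump.
    static String format(Thread thread, StackTraceElement[] frames) {
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(thread.getName()).append("\"\n");
        sb.append("   java.lang.Thread.State: ").append(thread.getState()).append('\n');
        for (StackTraceElement frame : frames) {
            sb.append("        at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    // Collects the stack traces of all live threads into one string.
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            sb.append(format(entry.getKey(), entry.getValue())).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

One way to use this while reproducing a hang is to call dumpAllThreads() from a separate watchdog thread a few seconds after starting the suspect call, then look for the frames in which the calling thread is spending its time.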