ElasticSearch :: No lineage is captured
Environment
- Apache Tomcat 9.0.58
- Delta Lake 1.0.0
- Elasticsearch-7.13.4
- hadoop-3.2.2
- Java-11.0.10
- spark-3.1.2-bin-hadoop3.2
- spark-3.1-spline-agent-bundle_2.12-0.7.3
- spline-web-ui-0.7.3
- spline-rest-server-0.7.5
Content of spark-defaults.conf
spark.master spark://h8:7077,h5:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://masters/spark/eventLog
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.delta.logStore.class org.apache.spark.sql.delta.storage.HDFSLogStore
spark.executor.extraClassPath /home/hadoop/SW/extra-libs/*
spark.driver.extraClassPath /home/hadoop/SW/extra-libs/*
spark.hive.metastore.uris thrift://192.168.21.8:9083
spark.sql.warehouse.dir hdfs://masters/
spark.sql.queryExecutionListeners za.co.absa.spline.harvester.listener.SplineQueryExecutionListener
spark.spline.producer.url http://h8:9090/spline-rest/producer
Spline init type codeless
Question
Dear team, I have run into a tricky problem. I was running Spline as a Java application, and the Spline Spark agent initialized successfully. I created a job that reads data stored in Delta Lake format on HDFS and writes it to Elasticsearch, and submitted it to YARN. Everything looked fine: Spline initialized successfully and the data was written to Elasticsearch. However, no lineage data showed up in the Spline web UI, and so far I have not been able to figure out why. Could you please help me find out what is wrong or what I missed? @cerveada @wajda
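For readers trying to reproduce the setup, a job of the kind described above might look roughly like the following PySpark sketch. This is not the original poster's code: the Delta path, the index name, and the Elasticsearch host are hypothetical placeholders, and it assumes the Delta Lake and elasticsearch-spark jars are already on the Spark classpath.

```python
def es_write_options(nodes, port=9200):
    """Build connector options for the Elasticsearch Spark SQL data source.

    'es.nodes' and 'es.port' are standard elasticsearch-hadoop settings;
    the values passed in are placeholders chosen by the caller.
    """
    return {"es.nodes": nodes, "es.port": str(port)}


def copy_delta_to_es(spark, delta_path, es_index, options):
    """Read a Delta table and append its rows to an Elasticsearch index.

    'spark' is an existing SparkSession. With the codeless Spline init,
    the listener configured in spark-defaults.conf is expected to observe
    this write; no Spline-specific code is needed here.
    """
    df = spark.read.format("delta").load(delta_path)
    (
        df.write.format("org.elasticsearch.spark.sql")
        .options(**options)
        .mode("append")
        .save(es_index)
    )
    return df


# Example call (hypothetical path, index, and host):
# copy_delta_to_es(spark, "hdfs://masters/data/events", "events",
#                  es_write_options("h8"))
```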
- Created a year ago
- Comments:20 (10 by maintainers)
Top GitHub Comments
I found that I had made a very stupid mistake. After plenty of experiments, I finally figured out what the real problem was.
First, elasticsearch-hadoop might not be compatible, possibly because of the Spark or Scala version, so I replaced it with elasticsearch-spark-30_2.12-7.13.4. And here comes the most important part: I forgot to place the jar under spark.executor.extraClassPath. That's why ElasticsearchPlugin didn't recognize the output source. Sorry for my mistake. I think this issue can be closed now. Thank you again for your support.
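A quick way to catch this kind of mistake is to check whether the connector jar is actually present in the directories named by the classpath settings. The sketch below is only an illustration: the helper names are made up for this example, the "elasticsearch-spark-" prefix is an assumption based on the jar named above, and the check only sees the machine it runs on, so it would have to be run on each worker node to cover the executors.

```python
import glob
import os


def parse_spark_defaults(text):
    """Parse whitespace-separated 'key value' lines of spark-defaults.conf."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def jars_on_classpath(classpath, name_prefix):
    """Return jars on the classpath whose file name starts with name_prefix.

    Handles the Java 'dir/*' wildcard (all jars in dir) and explicit jar paths.
    """
    found = []
    for entry in classpath.split(os.pathsep):
        if entry.endswith("*"):
            candidates = glob.glob(os.path.join(os.path.dirname(entry), "*.jar"))
        elif os.path.isfile(entry):
            candidates = [entry]
        else:
            candidates = []
        found += [c for c in candidates if os.path.basename(c).startswith(name_prefix)]
    return sorted(found)


def settings_missing_connector(conf, name_prefix="elasticsearch-spark-"):
    """List the classpath settings under which no matching jar was found."""
    keys = ("spark.driver.extraClassPath", "spark.executor.extraClassPath")
    return [k for k in keys if not jars_on_classpath(conf.get(k, ""), name_prefix)]


# Example call (path to the config file is a placeholder):
# with open("/path/to/spark-defaults.conf") as f:
#     print(settings_missing_connector(parse_spark_defaults(f.read())))
```

An empty result means a matching jar was found under both settings on the local machine; any setting listed in the result is worth inspecting by hand.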
Hi @wajda, I’ve been working on this, but I’m not sure I can make it work. One thing I can share: embedded-elasticsearch is no longer maintained and does not support ES 7.x. Its GitHub page recommends Testcontainers instead. I’m still learning how to use it.