Cannot send lineage data
I have a very simple test that creates and materializes one DataFrame using the following SQL query: select 123 as id
I get the following error:
15:25:49 [ScalaTest-run-running-LineageTest] WARN org.apache.spark.sql.util.ExecutionListenerManager Error executing query execution listener
java.lang.RuntimeException: Cannot send lineage data to http://localhost:8080/producer/execution-plans
at za.co.absa.spline.harvester.dispatcher.HttpLineageDispatcher.sendJson(HttpLineageDispatcher.scala:57)
The REST gateway returns:
{
  "error": "JSON parse error: ; nested exception is com.twitter.finatra.json.internal.caseclass.exceptions.CaseClassMappingException: \nErrors:\t\tcom.twitter.finatra.json.internal.caseclass.exceptions.CaseClassValidationException: operations.other.childIds: field is required\n\n"
}
The execution plan that the Spline agent sends to the REST gateway is indeed missing the childIds field for the operations.other entry with id=3:
{
  "id": "070e41dd-7d70-422e-a8f7-305ab4fcdd92",
  "operations": {
    "write": {
      "outputSource": "file:/c:/tmp/instrument",
      "append": false,
      "id": 0,
      "childIds": [
        1
      ],
      "params": {
        "path": "c:\\tmp\\instrument"
      },
      "extra": {
        "name": "InsertIntoHadoopFsRelationCommand",
        "destinationType": "Parquet"
      }
    },
    "other": [
      {
        "id": 3,
        "extra": {
          "name": "OneRowRelation"
        }
      },
      {
        "id": 2,
        "childIds": [
          3
        ],
        "schema": [
          "ae29dfe3-39fe-4271-8482-f6826b2c00b5"
        ],
        "params": {
          "projectList": [
            {
              "_typeHint": "expr.Alias",
              "alias": "id",
              "child": {
                "_typeHint": "expr.Literal",
                "value": 123,
                "dataTypeId": "129f2969-214f-43dd-8f13-ebf285c6cb5f"
              }
            }
          ]
        },
        "extra": {
          "name": "Project"
        }
      },
      {
        "id": 1,
        "childIds": [
          2
        ],
        "schema": [
          "1e8bf5b0-1863-4674-b802-f6c238a5cf90"
        ],
        "params": {
          "name": "`instrument`"
        },
        "extra": {
          "name": "SubqueryAlias"
        }
      }
    ]
  },
  "systemInfo": {
    "name": "spark",
    "version": "2.4.1"
  },
  "agentInfo": {
    "name": "spline",
    "version": "0.4.0"
  },
  "extraInfo": {
    "appName": "runner_test",
    "dataTypes": [
      {
        "_typeHint": "dt.Simple",
        "id": "129f2969-214f-43dd-8f13-ebf285c6cb5f",
        "name": "integer",
        "nullable": false
      }
    ],
    "attributes": [
      {
        "id": "ae29dfe3-39fe-4271-8482-f6826b2c00b5",
        "name": "id",
        "dataTypeId": "129f2969-214f-43dd-8f13-ebf285c6cb5f"
      },
      {
        "id": "1e8bf5b0-1863-4674-b802-f6c238a5cf90",
        "name": "id",
        "dataTypeId": "129f2969-214f-43dd-8f13-ebf285c6cb5f"
      }
    ]
  }
}
Adding a childIds field with an empty array to that operation and sending the same JSON with an API testing tool works fine.
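For anyone who wants to reproduce that manual workaround, here is a minimal Python sketch of the patching step. It is only an illustration of the idea described above, not a fix in the Spline agent itself: the helper name add_missing_child_ids is hypothetical, and the plan below is a trimmed copy of the JSON from this report.

```python
import json


def add_missing_child_ids(plan: dict) -> list:
    """Give every operations.other entry without childIds an empty list.

    Hypothetical helper for illustration; returns the ids of patched operations.
    """
    patched = []
    for op in plan.get("operations", {}).get("other", []):
        if "childIds" not in op:
            op["childIds"] = []  # leaf operation: no children
            patched.append(op["id"])
    return patched


# Trimmed version of the execution plan shown in this report.
plan = json.loads("""
{"operations": {"other": [
  {"id": 3, "extra": {"name": "OneRowRelation"}},
  {"id": 2, "childIds": [3], "extra": {"name": "Project"}}
]}}
""")

patched_ids = add_missing_child_ids(plan)
print(patched_ids)  # ids of the operations that were missing childIds
print(json.dumps(plan, indent=2))
```

The patched JSON can then be posted to the producer endpoint with whatever HTTP client is at hand; operations that already carry childIds are left unchanged.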
Issue Analytics
- State:
- Created 4 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
@shubhluck, from your logs it looks like you are using an old snapshot version of Spline. The config parameter spline.server.rest_endpoint doesn't exist in Spline 0.4.0. Please make sure you are running the correct version.

Our tests passed. Closing the issue. Please open another one if the error persists.