
Bug: 404 error on retrieving /api/v1/lineage

When we manually add a dataset and then fetch it through the lineage endpoint, the UI shows “Something went wrong while fetching lineage”. The lineage endpoint expects a job to already exist and tries to fetch its jobId; because we only have a dataset and no job yet, it returns 404.

Requests to reproduce:

--Create a namespace
PUT http://localhost:5000/api/v1/namespaces/postgres%3A%2F%2Flocalhost%3A6432
Content-Type: application/json

{
  "ownerName": "Me"
}

###
--Create a source
PUT http://localhost:5000/api/v1/sources/postgres%3A%2F%2Flocalhost%3A6432
Content-Type: application/json

{
  "type": "DB_TABLE",
  "connectionUrl": "postgres://localhost:6432"
}

###

--Create a dataset
PUT http://localhost:5000/api/v1/namespaces/postgres%3A%2F%2Flocalhost%3A6432/datasets/dvdrental.public.actor_info
Content-Type: application/json

{
  "type": "DB_TABLE",
  "physicalName": "dvdrental.public.actor_info",
  "sourceName": "postgres://localhost:6432",
  "fields": [
    {
      "name": "value",
      "type": "string",
      "nullable": true,
      "metadata": {}
    }
  ]
}

###
--Retrieve info
GET http://localhost:3000/api/v1/lineage/?nodeId=dataset:postgres://localhost:6432:dvdrental.public.actor_info

404 response 
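
The percent-encoded path segments in the PUT requests above can be derived from the raw namespace and dataset names. A minimal Python sketch using only the standard library; `encode_segment` is a hypothetical helper name for illustration, not part of Marquez:

```python
from urllib.parse import quote


def encode_segment(value: str) -> str:
    """Percent-encode a value so it can be used as a single URL path segment."""
    # safe="" makes quote() encode "/" as well, which it leaves alone by default.
    return quote(value, safe="")


namespace = "postgres://localhost:6432"
dataset = "dvdrental.public.actor_info"

# Same host and path layout as the "Create a dataset" request above.
dataset_url = (
    "http://localhost:5000/api/v1/namespaces/"
    f"{encode_segment(namespace)}/datasets/{encode_segment(dataset)}"
)
print(dataset_url)
```

Note that the namespace and dataset requests encode the namespace, while the final GET passes it raw inside `nodeId`.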

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 2
  • Comments: 8 (1 by maintainers)

Top GitHub Comments

collado-mike commented, Sep 29, 2021 (1 reaction)

I think the core issue is here: https://github.com/MarquezProject/marquez/blob/main/api/src/main/java/marquez/service/LineageService.java#L40-L43 . The LineageService only queries jobs to determine lineage. If the node in the query param is a dataset, it’ll find the first job connected to that dataset, then determine lineage from that job. If there’s a dataset with no jobs, it’ll throw a NodeIdNotFoundException, which ends up returning a 404.
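
The behavior described in this comment can be illustrated with a toy model. This is a rough Python sketch, not the Marquez implementation (which is Java): the in-memory `JOBS_BY_DATASET` table, the `food_delivery` entries, and the function names are all made up for illustration.

```python
class NodeIdNotFound(Exception):
    """Stand-in for the NodeIdNotFoundException named in the comment."""


# Hypothetical data: dataset node IDs mapped to the jobs connected to them.
JOBS_BY_DATASET = {
    "dataset:food_delivery:public.orders": ["job:food_delivery:etl_orders"],
    # The manually created dataset from this issue has no jobs at all.
    "dataset:postgres://localhost:6432:dvdrental.public.actor_info": [],
}


def lineage_start_job(node_id: str) -> str:
    """Return the job that a lineage lookup would start from."""
    if node_id.startswith("job:"):
        return node_id
    jobs = JOBS_BY_DATASET.get(node_id, [])
    if not jobs:
        # No job is connected to the dataset, so there is nothing to start from.
        raise NodeIdNotFound(node_id)
    return jobs[0]


def status_for(node_id: str) -> int:
    """Map the lookup outcome to the HTTP status a client would see."""
    try:
        lineage_start_job(node_id)
        return 200
    except NodeIdNotFound:
        return 404


print(status_for("dataset:postgres://localhost:6432:dvdrental.public.actor_info"))
```

In this model a dataset that exists but has no jobs is indistinguishable from a node that does not exist, which matches the 404 reported above.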

phixMe commented, Sep 29, 2021 (1 reaction)

It appears that some special characters can cause problems with the way we parse the nodeId out of the query parameters.

We won’t be able to transmit this http://localhost:3000/api/v1/lineage/?nodeId=dataset:postgres://localhost:6432:dvdrental.public.actor_info as is because we have no way of knowing what is a delimiter and what is part of the dataset name or namespace.

I will open up an issue to fix this in our web project that will encode the segments of the nodeId that contain names.
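
The delimiter ambiguity is easy to see by splitting the nodeId on `:`. The sketch below is only an illustration in Python using the standard library; it assumes a simple split-on-colon parser and is not the parser or the fix used by Marquez.

```python
from urllib.parse import quote, unquote

namespace = "postgres://localhost:6432"
dataset = "dvdrental.public.actor_info"

# Raw form from the failing request: the namespace itself contains ":".
raw_node_id = f"dataset:{namespace}:{dataset}"
raw_parts = raw_node_id.split(":")
print(raw_parts)  # five pieces instead of the expected three

# Encoding each name segment first leaves ":" only as the delimiter.
encoded_node_id = f"dataset:{quote(namespace, safe='')}:{quote(dataset, safe='')}"
node_type, enc_namespace, enc_dataset = encoded_node_id.split(":")
print(node_type, unquote(enc_namespace), unquote(enc_dataset))
```

A server typically decodes query parameters once before the application sees them, so the encoded nodeId may need a further round of encoding when placed in the URL; how that is handled depends on the actual fix.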

