A SPARQL LOAD with a presigned URL doesn't work
Version
4.6.1
What happened?
I tried to run a SPARQL LOAD against a MinIO presigned URL. Instead of loading the remote data, Fuseki returned an error and a stack trace with the message: Failed to determine the content type.
Looking into the problem, the cause seems to be the implementation of org.apache.jena.util.FileUtils#getFilenameExt. The method makes assumptions that hold for an ordinary file path, but SPARQL LOAD passes it a URL, and a URL, especially a presigned one, does not necessarily end with the file extension.
As a quick test, I overrode the class/method so that it checks the filename parameter for a question mark and, if one is found, determines the file extension differently:
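To illustrate the failure mode, the sketch below applies a simple "everything after the last dot" rule to a URL that carries a query string. The rule is an assumption made for illustration and is not copied from Jena's FileUtils#getFilenameExt; the class name NaiveExtension and the shortened example URL in the comment are hypothetical.

```java
// Illustration only: a hypothetical last-dot heuristic, not Jena's actual code.
final class NaiveExtension {
    /** Returns the text after the last '.' that follows the last '/', or "" if there is none. */
    static String lastDotExtension(String name) {
        int slash = name.lastIndexOf('/');
        int dot = name.lastIndexOf('.');
        return dot > slash ? name.substring(dot + 1) : "";
    }
}

// Example call (hypothetical URL):
//   NaiveExtension.lastDotExtension(
//       "http://example.org/bucket/test.ttl?X-Amz-Expires=604800&versionId=null")
```

For a plain path such as /data/test.ttl this rule yields ttl, but for the URL in the comment the result also contains the whole query string, so it cannot match any known RDF file extension.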
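if (filename.contains("?")) {
    try {
        // Parse the argument as a URL so that only the path component,
        // without the query string, is used to find the extension.
        URL fileIneed = new URL(filename);
        String path = fileIneed.getPath();
        // FilenameUtils is org.apache.commons.io.FilenameUtils
        return FilenameUtils.getExtension(path);
    } catch (MalformedURLException e) {
        // Not a valid URL: report it and fall through to the original code.
        e.printStackTrace();
    }
}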
With this hack, and a fallback to the original code, the SPARQL LOAD works as expected.
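The same idea can be sketched without Commons IO by parsing the string as a java.net.URI and looking only at its path component. This is a minimal sketch under stated assumptions, not the fix adopted in Jena: the class and method names (UrlExtension, extensionOf) are hypothetical, and returning an empty string for input that cannot be parsed is an assumption (a real override would fall back to the original code instead).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper: derives a file extension from the path of a URI,
// ignoring any query string or fragment.
final class UrlExtension {
    static String extensionOf(String location) {
        String path;
        try {
            // getPath() excludes the query ("?...") and fragment ("#...") parts.
            path = new URI(location).getPath();
        } catch (URISyntaxException e) {
            // Assumption: unparsable input simply has no extension here.
            return "";
        }
        if (path == null) {
            // Opaque URIs such as "mailto:a@b.org" have no path.
            return "";
        }
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // Only accept a dot that belongs to the last path segment.
        return dot > slash ? path.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
    }
}
```

With a URL shaped like the presigned one in the log below, for example http://example.org/bucket/test.ttl?X-Amz-Signature=abc.def, the path is /bucket/test.ttl, so the dots and other characters in the query parameters no longer affect the result.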
Relevant output and stacktrace
[2022-12-12 10:42:18] INFO Fuseki :: [5] POST http://fuseki.localhost/example-data-product.example-data-product.sparql/update
[2022-12-12 10:42:18] WARN Fuseki :: [5] ActionErrorException with cause
org.apache.jena.fuseki.servlets.ActionErrorException: Failed to LOAD 'http://minio-service.minio-dev.svc.cluster.local:9000/example-data-product.ddt.tst.212765240740/example-data-product/s3/test.ttl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=DZOTE2HTDZ0N2O3R252P%2F20221212%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221212T104141Z&X-Amz-Expires=604800&X-Amz-Security-Token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3NLZXkiOiJEWk9URTJIVERaME4yTzNSMjUyUCIsImV4cCI6MTY3MDg4NDg4NCwicGFyZW50IjoibWluaW9hZG1pbiJ9.-xyMI3oAnQ82xBW4j2vyCHdvzUC33pKIR_YsRO9am6KDus9qisodrCVqHOR9Xc4D4h539MSKDfdqyv70DKFYbg&X-Amz-SignedHeaders=host&versionId=null&X-Amz-Signature=746bfd4562f9fd7cac122dc3e201eea3cdfb208c671a6139dc642719ead1af64' :: Failed to determine the content type: (URI=http://minio-service.minio-dev.svc.cluster.local:9000/example-data-product.ddt.tst.212765240740/example-data-product/s3/test.ttl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=DZOTE2HTDZ0N2O3R252P%2F20221212%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221212T104141Z&X-Amz-Expires=604800&X-Amz-Security-Token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3NLZXkiOiJEWk9URTJIVERaME4yTzNSMjUyUCIsImV4cCI6MTY3MDg4NDg4NCwicGFyZW50IjoibWluaW9hZG1pbiJ9.-xyMI3oAnQ82xBW4j2vyCHdvzUC33pKIR_YsRO9am6KDus9qisodrCVqHOR9Xc4D4h539MSKDfdqyv70DKFYbg&X-Amz-SignedHeaders=host&versionId=null&X-Amz-Signature=746bfd4562f9fd7cac122dc3e201eea3cdfb208c671a6139dc642719ead1af64 : stream=application/octet-stream)
at org.apache.jena.fuseki.servlets.ServletOps.errorOccurred(ServletOps.java:275) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:259) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:207) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:110) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.fuseki.servlets.ActionService.executeLifecycle(ActionService.java:58) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.fuseki.servlets.SPARQL_Update.execPost(SPARQL_Update.java:91) ~[fuseki-server.jar:4.6.1]
...
Caused by: org.apache.jena.riot.RiotException: Failed to determine the content type: (URI=http://minio-service.minio-dev.svc.cluster.local:9000/example-data-product.ddt.tst.212765240740/example-data-product/s3/test.ttl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=DZOTE2HTDZ0N2O3R252P%2F20221212%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221212T104141Z&X-Amz-Expires=604800&X-Amz-Security-Token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3NLZXkiOiJEWk9URTJIVERaME4yTzNSMjUyUCIsImV4cCI6MTY3MDg4NDg4NCwicGFyZW50IjoibWluaW9hZG1pbiJ9.-xyMI3oAnQ82xBW4j2vyCHdvzUC33pKIR_YsRO9am6KDus9qisodrCVqHOR9Xc4D4h539MSKDfdqyv70DKFYbg&X-Amz-SignedHeaders=host&versionId=null&X-Amz-Signature=746bfd4562f9fd7cac122dc3e201eea3cdfb208c671a6139dc642719ead1af64 : stream=application/octet-stream)
at org.apache.jena.riot.RDFParser.parseURI(RDFParser.java:380) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.riot.RDFParser.parse(RDFParser.java:360) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.riot.RDFParserBuilder.parse(RDFParserBuilder.java:568) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.riot.RDFDataMgr.parseFromURI(RDFDataMgr.java:737) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:464) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:441) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:421) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.sparql.modify.UpdateEngineWorker.lambda$visit$2(UpdateEngineWorker.java:172) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.sparql.modify.UpdateEngineWorker.executeOperation(UpdateEngineWorker.java:550) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.sparql.modify.UpdateEngineWorker.visit(UpdateEngineWorker.java:157) ~[fuseki-server.jar:4.6.1]
at org.apache.jena.sparql.modify.request.UpdateLoad.visit(UpdateLoad.java:65) ~[fuseki-server.jar:4.6.1]
...
[2022-12-12 10:42:18] INFO Fuseki :: [5] 500 Server Error (106 ms)
Are you interested in making a pull request?
Maybe
Top GitHub Comments
I’ve shortened the stack trace.
The URI is:
in essence
Full stacktrace: stacktrace.txt