Support GCS files without credentials
Problem description
Be able to read public GCS files without providing credentials.
Steps/code to reproduce the problem
path = "gs://tensorflow-nightly/prod/tensorflow/release/ubuntu_16/gpu_py37_full/nightly_release/18/20190813-010608/github/tensorflow/pip_pkg/tf_nightly_gpu-1.15.0.dev20190813-cp37-cp37m-linux_x86_64.whl"
import smart_open
try:
    f = smart_open.smart_open(path)
except Exception as e:
    print(e)
import tensorflow as tf
f = tf.io.gfile.GFile(path, "rb")
with open("out.whl", "wb") as fout:
    fout.write(f.read())
Running the code above, smart_open fails with:
Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
while tf.io downloads the public file successfully, although with a warning:
W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata".
Since the file can be downloaded without credentials, smart_open should not require them either, so that anyone can easily download public files.
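As a possible workaround in the meantime, public objects can also be fetched over plain HTTPS, since Cloud Storage serves them at `https://storage.googleapis.com/<bucket>/<object>`. The sketch below only converts a `gs://` URI into that form; the helper name `gcs_public_url` is hypothetical and not part of smart_open.

```python
from urllib.parse import quote


def gcs_public_url(gs_uri):
    """Map gs://bucket/object to its public HTTPS URL (hypothetical helper)."""
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    # Everything up to the first slash is the bucket, the rest is the object name.
    bucket, _, object_name = gs_uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"expected gs://<bucket>/<object>, got {gs_uri!r}")
    # Percent-encode the object name but keep "/" so the path stays readable.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name, safe='/')}"
```

The resulting URL can then be passed to `smart_open.open(url, "rb")` or any HTTP client. This only works for objects that are publicly readable.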
Versions
Linux-5.4.63-1-lts-x86_64-with-glibc2.2.5
Python 3.8.5 (default, Sep 17 2020, 00:56:56)
smart_open 2.1.1
Checklist
Before you create the issue, please make sure you have:
- Described the problem clearly
- Provided a minimal reproducible example, including any required data
- Provided the version numbers of the relevant software
Issue Analytics
- Created 3 years ago
- Reactions:1
- Comments:10 (6 by maintainers)
@piskvorky We already have a how-to guide explicitly for capturing edge cases like this.
https://github.com/RaRe-Technologies/smart_open/blob/develop/howto.md
@petedannemann I agree, let’s deal with this in documentation for now.
@ppwwyyxx Please feel free to add to that guide using a PR.
My understanding is that the goal of this project is to provide a unified API for file-like objects. I thought authentication for the “file systems” behind these objects was expected to differ so much from system to system that smart_open defers to the underlying Python package for each file system. That is why our transport_params kwarg exists. I defer to the maintainers of this project on this topic, though.
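For illustration, here is a minimal sketch of how transport_params could cover the public-file case. It assumes that smart_open's GCS transport accepts a `client` entry in transport_params and that the google-cloud-storage package is installed; the wrapper name `open_public_gcs` is hypothetical. I have not checked this against smart_open 2.1.1 specifically.

```python
def open_public_gcs(uri, mode="rb"):
    """Open a public gs:// object via smart_open without credentials (sketch)."""
    # Imported lazily so the helper can be defined even when the GCS
    # dependencies are not installed.
    from google.cloud import storage
    import smart_open

    # An anonymous client sends unauthenticated requests, so the default
    # credential lookup is skipped entirely.
    client = storage.Client.create_anonymous_client()
    # Assumption: the GCS transport reads the client from transport_params.
    return smart_open.open(uri, mode, transport_params={"client": client})


# Usage (needs network access and a publicly readable object):
# with open_public_gcs(path) as fin, open("out.whl", "wb") as fout:
#     fout.write(fin.read())
```

This keeps authentication in the hands of the underlying package, which matches the design described above: the caller chooses the client, and smart_open only passes it through.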