Expose ability to set blob metadata in GCSHook
See original GitHub issue.

Description
Expose the ability to set blob metadata in the GCSHook, probably via the upload method with a new parameter: metadata: Optional[Dict[str, str]] = None.
Use case / motivation
As a best practice, I always set blob metadata attributes that provide useful context about the blob, such as tracking information, foreign keys, job information, etc.
Using the Google Cloud Storage Python SDK, this is what it looks like:

from google.cloud import storage

# Arbitrary key/value pairs to attach to the blob
metadata = {
    "job-id": job_id,
    "user-id": user_id,
    "batch-size": batch_size,
    # ...
}

storage_client = storage.Client()
bucket = storage_client.bucket(gcp_bucket_name)
blob = bucket.blob(destination_blob_name)
blob.content_type = content_type
blob.metadata = metadata
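For comparison, a minimal sketch of how this can be done through the hook today, attaching metadata after the upload. The wrapper function is hypothetical; it only assumes the existing GCSHook.upload and GCSHook.get_conn() methods:

from typing import Dict, Optional

from airflow.providers.google.cloud.hooks.gcs import GCSHook


def upload_with_metadata(
    bucket_name: str,
    object_name: str,
    filename: str,
    metadata: Optional[Dict[str, str]] = None,
) -> None:
    # Hypothetical helper: upload via the hook, then attach custom metadata
    # to the resulting blob using the underlying storage client.
    hook = GCSHook()
    hook.upload(bucket_name=bucket_name, object_name=object_name, filename=filename)
    if metadata:
        blob = hook.get_conn().bucket(bucket_name).blob(object_name)
        blob.metadata = metadata
        blob.patch()  # persists the metadata change on the existing object

If the parameter were added to upload itself, callers could pass metadata=... in a single call instead of patching the blob afterwards.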
Issue Analytics
- State:
- Created: 2 years ago
- Comments: 7 (4 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
And I believe you can make it in a non-breaking way. Airflow requires all operator init args to be keyword args, so adding a new keyword argument with a default (see the sketch below) should not be a breaking change.
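To illustrate that point, a rough sketch; the class below is a simplified stand-in for the real operator, not its actual code:

from typing import Dict, Optional


class LocalFilesystemToGCSOperator:
    # Simplified stand-in, only to show the signature change.
    def __init__(
        self,
        *,
        src: str,
        dst: str,
        bucket: str,
        metadata: Optional[Dict[str, str]] = None,  # new keyword arg with a default
    ) -> None:
        self.src = src
        self.dst = dst
        self.bucket = bucket
        self.metadata = metadata


# Existing callers that never pass metadata keep working unchanged:
op = LocalFilesystemToGCSOperator(src="data.csv", dst="data.csv", bucket="my-bucket")
# New callers can opt in:
op2 = LocalFilesystemToGCSOperator(
    src="data.csv", dst="data.csv", bucket="my-bucket", metadata={"job-id": "42"}
)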
Airflow is community driven, so it is normal for users like you to contribute. Many do. It’s a nice way to give back to the community.
See CONTRIBUTING.rst in the root of the repo (the repo is exactly the one we are discussing).