Expose ability to set blob metadata in GCSHook

See original GitHub issue

Description

Expose ability to set blob metadata in the GCSHook. Probably via the upload method with a new parameter: metadata: Optional[Dict[str, str]] = None.

Use case / motivation

As a best practice, I always set blob metadata attributes that are useful or provide more information about the blob, such as tracking information, foreign keys, or job information.

Using the Google SDK, this is what it looks like:

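from google.cloud import storage

# Custom metadata is stored as string key/value pairs.
metadata = {
    "job-id": job_id,
    "user-id": user_id,
    "batch-size": batch_size,
    # ...
}

storage_client = storage.Client()
bucket = storage_client.bucket(gcp_bucket_name)
blob = bucket.blob(destination_blob_name)
blob.content_type = content_type
# Set metadata before uploading so it is sent along with the object.
blob.metadata = metadata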
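
A minimal sketch of how the requested parameter could behave, written as a standalone helper rather than as a change to the real GCSHook.upload, whose actual signature is not shown in this issue. The helper name upload_with_metadata and all of its parameters are hypothetical; bucket(), blob(), the metadata property, and upload_from_string() are the google-cloud-storage client calls. The client is passed in, so the sketch does not depend on Airflow internals.

```python
from typing import Any, Dict, Optional


def upload_with_metadata(
    client: Any,
    bucket_name: str,
    object_name: str,
    data: str,
    content_type: str = "text/plain",
    metadata: Optional[Dict[str, str]] = None,
) -> Any:
    """Upload `data` and, if given, attach custom metadata to the blob.

    Hypothetical helper, not part of Airflow. `client` is expected to be a
    google.cloud.storage.Client (or any object exposing the same methods).
    """
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(object_name)
    if metadata:
        # Set before uploading so the metadata is sent with the object.
        blob.metadata = metadata
    blob.upload_from_string(data, content_type=content_type)
    return blob
```

With a real client this might be called as upload_with_metadata(storage.Client(), gcp_bucket_name, destination_blob_name, "payload", metadata={"job-id": "42"}). Because metadata defaults to None, callers that never pass it get the same behavior as before.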

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
potiuk commented, Jun 16, 2021

And I believe you can do it in a non-breaking way. Airflow requires all operator init args to be keyword args, so adding a new keyword argument with a default should not be breaking.
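
To illustrate the point about keyword arguments, here is a small sketch using a plain Python class as a stand-in for an operator. ExampleUploadOperator and its arguments are hypothetical and are not taken from Airflow; the sketch only shows why a new keyword-only argument with a default leaves existing call sites untouched.

```python
from typing import Dict, Optional


class ExampleUploadOperator:
    """Hypothetical stand-in for an operator; not a real Airflow class."""

    def __init__(
        self,
        *,  # everything after this marker must be passed by keyword
        bucket_name: str,
        object_name: str,
        metadata: Optional[Dict[str, str]] = None,  # newly added argument
    ) -> None:
        self.bucket_name = bucket_name
        self.object_name = object_name
        self.metadata = metadata or {}


# A call written before `metadata` existed still works unchanged.
old_style = ExampleUploadOperator(bucket_name="my-bucket", object_name="data.csv")

# New callers opt in explicitly.
new_style = ExampleUploadOperator(
    bucket_name="my-bucket",
    object_name="data.csv",
    metadata={"job-id": "42"},
)
```

Since no argument is positional, the position of the new parameter in the signature cannot shift the meaning of an existing call.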

1 reaction
potiuk commented, Jun 16, 2021

Airflow is community driven, so it is normal for users like you to contribute. Many do. It’s a nice way to give back to the community.

See CONTRIBUTING.rst in the root of the repo (the same repo we are discussing).
