
PROPOSAL: Storage API consistency fix


Guiding principles:

  • Getters and setters should never make HTTP requests. Lazy loading is OK, but only when it involves instance creation or other local (non-network-bound) behavior. For example, Bucket.acl already does this:
@property
def acl(self):
    """Create our ACL on demand."""
    if self._acl is None:
        self._acl = BucketACL(self)
    return self._acl
  • More generally, HTTP requests should be limited to explicit method calls. This also rules out constructors that load data.
  • Blob, Bucket, and *ACL (the only nouns) instances should have load(), exists(), create(), delete(), and update() methods. This design gives rise to code like
blob = Blob('/remote/path.txt', bucket=bucket, properties=properties)
try:
    blob.load()  # Just metadata
except NotFound:
    blob.upload_from_file(filename)  # Sends metadata from properties

(This maybe screams for get_or_create(); we’ll feel it out as we develop. A sketch of how that might look follows this list.) It’s unclear whether it’s worth making a distinction between storage.NOUN.update <--> PUT and storage.NOUN.patch <--> PATCH. (As of right now, we don’t implement PUT / update anywhere.)

  • exists() should use the fields query parameter in its requests to minimize the payload.
  • A Connection should not be required to be bound to any object (one of the nouns Bucket, ACL, or Blob) but should be an optional argument to methods which actually talk to the API.
  • We should strongly consider adding a last_updated field to classes to indicate the last time the values were updated (from the server).
  • For list methods: list all buckets at the top level, e.g.
storage.get_all_buckets(connection=optional_connection)

and then bucket.get_all_objects(). It’s unclear how the other 3 nouns (objectAccessControls, bucketAccessControls and defaultObjectAccessControls) will handle this. Right now they are handled via ObjectACL.reload() and BucketACL.reload() (a superclass of DefaultObjectACL).

  • Implicit behavior (default project, default bucket, and/or default connection) should be used wherever possible (and documented).
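
Pulling the principles above together, here is a minimal sketch of how the noun interface, the fields-limited exists(), the optional connection argument, and a possible get_or_create() might hang together. It is illustrative only: Blob here is a stand-in rather than the real class, NotFound is assumed to come from gcloud.exceptions, Connection.api_request() is assumed to behave roughly as it does today, and every other name and signature is made up for the example.

from gcloud.exceptions import NotFound


class Blob(object):
    """Sketch of the proposed noun interface (not the real class)."""

    def __init__(self, path, bucket=None, properties=None):
        # No HTTP here: constructors never load data.
        self.path = path
        self.bucket = bucket
        self.properties = dict(properties or {})

    @staticmethod
    def _conn(connection):
        # A Connection is optional on methods that talk to the API; the
        # real library would fall back to an implicit default here.
        if connection is None:
            raise ValueError('No connection given and no default configured.')
        return connection

    def load(self, connection=None, fields=None):
        """Explicit GET for metadata only (never triggered by a getter)."""
        params = {'fields': fields} if fields else {}
        self.properties = self._conn(connection).api_request(
            method='GET', path=self.path, query_params=params)

    def exists(self, connection=None):
        """Cheap existence check: ask only for the object's name."""
        try:
            self.load(connection=connection, fields='name')
            return True
        except NotFound:
            return False

    def create(self, connection=None):
        """Explicit POST, sending metadata from self.properties."""
        self.properties = self._conn(connection).api_request(
            method='POST', path=self.path, data=self.properties)


def get_or_create(blob, connection=None):
    """The convenience wrapper hinted at above."""
    try:
        blob.load(connection=connection)
    except NotFound:
        blob.create(connection=connection)
    return blob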

@tseaver Please weigh in. This was inspired by our discussion at the end of #604.

/cc @thobrla I’d like to close #545 and focus on this (possibly broken down into sub-bugs). Does that sound OK?


Top GitHub Comments

jgrahn commented, Jun 21, 2016

Hi. Is there any progress on the part of this issue that says blob.upload_from_file(filename)  # Sends metadata from properties (or is there a separate issue for it that I haven’t found)?

Having to use patch() after upload_...() unfortunately has several more or less serious drawbacks:

  1. It requires the client to have credentials allowing PATCH requests, preventing immutable append-only semantics.
  2. It is not atomic, so if the PATCH operation fails, or the client fails between the calls, the data in the storage service might be left in an inconsistent state.
  3. It lacks consistency: there will be a window in which the blob data has been uploaded but the metadata is still incorrect or missing, leading to potential race conditions.
  4. The metadata-generation will never be 1.

These are currently blocking me from using gcloud-python.
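
For reference, a minimal sketch of the two-step pattern behind the drawbacks above, written against the gcloud-python Blob API of the time (upload_from_file() followed by patch()); the project, bucket, object, and metadata values are made up.

from gcloud import storage

client = storage.Client(project='example-project')
bucket = client.get_bucket('example-bucket')
blob = bucket.blob('remote/path.txt')

# Step 1: upload the data.
with open('local.txt', 'rb') as file_obj:
    blob.upload_from_file(file_obj)

# Step 2: attach metadata in a separate PATCH request.  The caller needs
# PATCH permission (1), a failure here leaves the object inconsistent (2),
# there is a window in which the data exists without its metadata (3),
# and the metadata generation ends up greater than 1 (4).
blob.metadata = {'owner': 'example'}
blob.patch()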

User @pdknsk seems to have provided a patch in #536. I think it would make sense to implement that even before/without the “load(), exists(), create(), delete(), and update()” interface described in this issue.

I’d be happy to provide a pull request if that helps.
