
Make Cloud Storage client retry on backend error

See original GitHub issue

We’re operating at scale on GCS and regularly see transient HTTP 410 status codes when accessing Cloud Storage. These 410 responses are misleading, though: they effectively hide an internal backend error on GCS, which is reflected in the error details:

Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone
{
  "code" : 503,
  "errors" : [ {
    "domain" : "global",
    "message" : "Backend Error",
    "reason" : "backendError"
  } ],
  "message" : "Backend Error"
}

The google-cloud-storage client does not treat the 410 status code as retryable, understandably so. It should retry on backend errors, though, which are typically surfaced with status code 500 or 503. I’m suggesting that the client treat backend errors the same way it treats internal errors, namely by matching on reason == backendError independently of the HTTP status code.

Note that we’re not the first ones to experience this, and the client should be resilient against these transient GCS errors.
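A minimal sketch of the suggested reason-based handling, written as an application-level wrapper rather than a change inside the client. The helper names (isBackendError, withBackendErrorRetry), the attempt count, and the backoff values are illustrative assumptions, not part of the google-cloud-storage API:

import com.google.api.client.googleapis.json.GoogleJsonError;
import com.google.api.client.googleapis.json.GoogleJsonResponseException;
import com.google.cloud.storage.StorageException;

import java.util.concurrent.Callable;

public class BackendErrorRetry {

  // True if the exception carries reason == "backendError", independently of
  // the HTTP status code (410, 500, 503, ...).
  static boolean isBackendError(StorageException e) {
    if ("backendError".equals(e.getReason())) {
      return true;
    }
    Throwable cause = e.getCause();
    if (cause instanceof GoogleJsonResponseException) {
      GoogleJsonError details = ((GoogleJsonResponseException) cause).getDetails();
      if (details != null && details.getErrors() != null) {
        return details.getErrors().stream()
            .anyMatch(err -> "backendError".equals(err.getReason()));
      }
    }
    return false;
  }

  // Retries the call with simple exponential backoff when a backendError is
  // detected; any other failure is rethrown immediately.
  static <T> T withBackendErrorRetry(Callable<T> call) throws Exception {
    long delayMillis = 1000;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.call();
      } catch (StorageException e) {
        if (attempt >= 4 || !isBackendError(e)) {
          throw e;
        }
        Thread.sleep(delayMillis);
        delayMillis *= 3;
      }
    }
  }
}

Usage would be along the lines of withBackendErrorRetry(() -> storage.get(BlobId.of("my-bucket", "my-object"))); only idempotent operations should be wrapped this way.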

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

3 reactions
yihanzhen commented, Sep 4, 2018

Based on the discussion with the storage backend team, the 410 happens during a JSON API resumable upload session. The error likely indicates that the upload session has already been terminated, so retrying the individual HTTP request would not work; the entire upload session has to be restarted. An internal bug has been filed and the storage team is actively working on it.
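Given that explanation, a workaround on the caller's side is to restart the whole upload rather than retry the failed request inside the dead session. A rough sketch, assuming the payload is small enough to keep in memory and re-send in full; the method name, attempt count, and backoff are illustrative:

import com.google.cloud.WriteChannel;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageException;

import java.nio.ByteBuffer;

public class UploadRestart {

  // On a 410 the existing resumable session is gone, so a fresh writer (and
  // therefore a fresh upload session) is opened and the payload is re-sent.
  static void uploadWithRestart(Storage storage, BlobInfo blobInfo, byte[] content)
      throws Exception {
    int maxAttempts = 3;
    for (int attempt = 1; ; attempt++) {
      try (WriteChannel writer = storage.writer(blobInfo)) {
        writer.write(ByteBuffer.wrap(content)); // whole payload rewritten on each attempt
        return;
      } catch (StorageException e) {
        if (e.getCode() != 410 || attempt >= maxAttempts) {
          throw e;
        }
        Thread.sleep(1000L * attempt); // brief backoff before starting a new session
      }
    }
  }
}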

1 reaction
yihanzhen commented, Aug 24, 2018

I’ve contacted the storage backend team and if they aren’t against it I’ll add the retry logic.

Read more comments on GitHub.

Top Results From Across the Web

Demonstrate retry configuration | Cloud Storage
Demonstrate retry configuration. ... Create the client configuration: ... On error, it backs off for 1 second, then 3 seconds, then 9 seconds,... (a configuration sketch follows these results)

Automatic retries in client libraries | Data Integration
Most Cloud Storage client libraries perform retries automatically. Retries are helpful in case of temporary server errors or connectivity ...

GCS: Retry on 410 gone errors in BlobWriteChannel.flushBuffer
This is the error 410 gone. And according to the GCP documentation, “410 gone” means you are trying to upload a file by...

Retry asynchronous functions | Cloud Functions for Firebase
If your function has retries enabled, any unhandled error will trigger a retry. Make sure that your code captures any errors that should...

Retry pattern - Azure Architecture Center | Microsoft Learn
Context and problem. An application that communicates with elements running in the cloud has to be sensitive to the transient faults that can...
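The first result above describes a retry configuration that backs off for 1 second, then 3, then 9. A sketch of a comparable setup with the Java client's RetrySettings follows; the values are illustrative, and builder method names may vary slightly across gax versions. Note that these settings only govern errors the client already classifies as retryable, which is why the 410/backendError case in this issue needed separate treatment:

import com.google.api.gax.retrying.RetrySettings;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import org.threeten.bp.Duration;

public class RetryConfig {
  public static void main(String[] args) {
    // Back off 1s, then 3s, then 9s (multiplier 3), for up to 5 attempts overall.
    RetrySettings retrySettings =
        RetrySettings.newBuilder()
            .setInitialRetryDelay(Duration.ofSeconds(1))
            .setRetryDelayMultiplier(3.0)
            .setMaxRetryDelay(Duration.ofSeconds(30))
            .setMaxAttempts(5)
            .setTotalTimeout(Duration.ofMinutes(5))
            .build();

    Storage storage =
        StorageOptions.newBuilder()
            .setRetrySettings(retrySettings)
            .build()
            .getService();

    System.out.println("Retry settings in use: " + storage.getOptions().getRetrySettings());
  }
}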
