Support Google Storage Buckets as output directory (Issue #652)

  • Weights and Biases version: 0.8.13
  • Python version: 3.7
  • Operating System: Debian

Description

On Google Cloud, gs:// buckets are not recognised by wandb.init(..., dir="gs://etc/etc").

Reading from and writing to buckets requires doing something like this: https://cloud.google.com/appengine/docs/standard/python/googlecloudstorageclient/read-write-to-cloud-storage#writing_to_cloud_storage

Alternatively, tf.gfile can be used.

Since TPUs can only read and write checkpoints to gs:// buckets, this prevents us from using wandb whenever we want to use TPU hardware.
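
As an illustration of one possible workaround (not something wandb provides), a script could detect a gs:// output location itself, hand wandb a local staging directory instead, and remember the bucket target for a later upload. The helper names `split_gcs_uri` and `resolve_run_dir` below are hypothetical, and the sketch only uses the Python standard library:

```python
import tempfile
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split 'gs://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def resolve_run_dir(output_dir):
    """Return (local_dir, gcs_target) for a requested output directory.

    Local paths are returned unchanged with gcs_target set to None.
    For gs:// URIs a fresh local staging directory is created and the
    parsed (bucket, prefix) pair is returned for a later upload step.
    """
    if output_dir.startswith("gs://"):
        staging = tempfile.mkdtemp(prefix="wandb-staging-")
        return staging, split_gcs_uri(output_dir)
    return output_dir, None


local_dir, gcs_target = resolve_run_dir("gs://etc/etc")
# local_dir is a real directory on disk, so it could be passed as
# wandb.init(..., dir=local_dir); gcs_target holds the bucket and prefix.
```

This only sidesteps the unrecognised path. Files written to the staging directory would still need to be copied to the bucket by some other means.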

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 6
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

4 reactions
aidangomez commented, Nov 12, 2019

When using TF with TPUs, you actually can’t read or write data and checkpoints locally; it’s all done through gs:// (there isn’t an alternative).

We could transfer checkpoints and summaries from the bucket to the local machine and then sync with wandb. But with a model that has 5GB checkpoint files, I don’t want to pause training for a sync that large, so we’d need to write logic to do this asynchronously.

I think that since reading from and writing to gs:// buckets is considered standard for TF, wandb stands to gain a great deal from supporting it. The overhead of writing a daemon that syncs between gs:// and the local fs, and incorporating it into a training script, means that a lot of people simply won’t use wandb.

Hopefully it will be prioritised at some point to improve usability of wandb for TPU users.
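
To give a sense of the asynchronous sync logic described in this comment, here is a minimal sketch of a background copier built on a worker thread and a queue. The `BackgroundSync` class is hypothetical and is not part of wandb or TensorFlow. The copy function is injected: `shutil.copy` works for local paths, and a gs://-aware function such as `tf.io.gfile.copy` could presumably be substituted, though that is an assumption and is not exercised here.

```python
import queue
import threading


class BackgroundSync:
    """Copy files on a worker thread so the caller never blocks on a transfer."""

    def __init__(self, copy_fn):
        # copy_fn(src, dst) performs a single transfer; it is supplied by
        # the caller so the storage backend stays pluggable.
        self._copy_fn = copy_fn
        self._jobs = queue.Queue()
        self.errors = []  # (job, exception) pairs for failed transfers
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, src, dst):
        """Queue one transfer and return immediately."""
        self._jobs.put((src, dst))

    def _run(self):
        while True:
            job = self._jobs.get()
            try:
                if job is None:  # sentinel queued by close(): stop the worker
                    return
                self._copy_fn(*job)
            except Exception as exc:
                # Record the failure instead of letting it kill the thread.
                self.errors.append((job, exc))
            finally:
                self._jobs.task_done()

    def close(self):
        """Wait for already queued transfers to finish, then stop the worker."""
        self._jobs.put(None)
        self._worker.join()
```

In a training loop, `submit()` would be called after each checkpoint is written and `close()` once at the end of training, so that a large transfer overlaps with further training steps rather than pausing them.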

2 reactions
issue-label-bot[bot] commented, Oct 31, 2019

Issue-Label Bot is automatically applying the label enhancement to this issue, with a confidence of 0.91. Please mark this comment with 👍 or 👎 to give our bot feedback!

Links: app homepage, dashboard and code for this bot.
