[CLI]: Wandb tries to write to /wandb/ even when dir argument is specified

Describe the bug

In a distributed setup, wandb appears to write some files to the system temp directory (/tmp) in addition to the directory specified in the dir argument.

Here is how I initialize the wandb run:

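import os

# global_rank and args come from the surrounding training script.
if global_rank == 0:  # only the rank-0 process initializes wandb
    import wandb
    os.makedirs(args.output_dir, exist_ok=True)
    wandb.init(project="geo-pretrain", name=args.experiment, dir=args.output_dir, config=args.__dict__, resume=True)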

However, when I run it I get:

[default0]:wandb: WARNING Path /wandb/ wasn't writable, using system temp directory.
[default0]:wandb: Currently logged in as: dvd42. Use `wandb login --relogin` to force relogin
[default0]:wandb: Tracking run with wandb version 0.13.6
[default0]:wandb: Run data is saved locally in /mnt/home/git/geo-pretrain/output/task_0_warmup/wandb/run-20221209_170413-r4d8mrji

wandb then populates both my system temp directory and the specified folder with files.

Additional Files

No response

Environment

WandB version:

OS:

Python version:

Versions of relevant libraries:

Additional Context

No response

Issue Analytics

  • State: open
  • Created 9 months ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
nate-wandb commented, Dec 15, 2022

Hi @dvd42, it sounds like /mnt/ is a network-attached drive, correct? Would it be possible to create the wandb directory on a drive that is local to each training machine?

I've seen this on clusters where the NAS becomes temporarily unavailable, for example because of a network hiccup or too many machines trying to write at the same time, so the machines write to the system temp directory until they can access the NAS again. This behavior may be configurable through whichever service you use to run your cluster, but I don't know much about your setup.
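One way to act on the suggestion above is to check, before initializing the run, whether the preferred directory is usable and otherwise switch to a machine-local one. The following is only a minimal sketch using the Python standard library: the helper name pick_wandb_dir and the fallback path are hypothetical and not part of wandb, and a single up-front check cannot catch a NAS that drops out later in the run.

```python
import os


def pick_wandb_dir(preferred: str, local_fallback: str) -> str:
    """Return the first of the two directories that can be created and written to.

    Hypothetical helper for illustration; it is not part of the wandb API.
    """
    for candidate in (preferred, local_fallback):
        try:
            # Create the directory if it does not exist yet.
            os.makedirs(candidate, exist_ok=True)
        except OSError:
            # Creation failed (e.g. missing mount, permission error): try the next one.
            continue
        # Check that the current process may write to the directory.
        if os.access(candidate, os.W_OK):
            return candidate
    raise RuntimeError("no writable directory available for wandb run data")
```

The returned path could then be passed as the dir argument of wandb.init, as in the original snippet, for example dir=pick_wandb_dir(args.output_dir, "/scratch/wandb"), where "/scratch/wandb" stands in for whatever local disk the training machine actually has.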

0 reactions
dvd42 commented, Dec 15, 2022

Hi, yeah, I guess this might be it. When I run the script on my local machine, the issue disappears.
