[CLI]: Wandb tries to write to /wandb/ even when dir argument is specified
Describe the bug
When working on a distributed setup, wandb seems to store some files in the /tmp directory in addition to the directory specified in the dir argument.
Here is how I initialize the wandb run:
import os
if global_rank == 0:
    import wandb
    # Only rank 0 creates the output directory and starts the run.
    os.makedirs(args.output_dir, exist_ok=True)
    wandb.init(project="geo-pretrain", name=args.experiment, dir=args.output_dir, config=args.__dict__, resume=True)
However, when running it I get:
[default0]:wandb: WARNING Path /wandb/ wasn't writable, using system temp directory.
[default0]:wandb: Currently logged in as: dvd42. Use `wandb login --relogin` to force relogin
[default0]:wandb: Tracking run with wandb version 0.13.6
[default0]:wandb: Run data is saved locally in /mnt/home/git/geo-pretrain/output/task_0_warmup/wandb/run-20221209_170413-r4d8mrji
As a result, wandb populates my system temp directory with files, as well as the specified folder.
Hi @dvd42, it sounds like /mnt/ is a network attached drive, correct? Would it be possible to make the wandb directory on a drive that is local to each training machine? I've seen this on clusters where the NAS becomes temporarily unavailable through a network hiccup, too many machines trying to write at the same time, etc., so the machines write to /temp until they can access the NAS. This may be a behavior that can be changed through whichever service you are using to run your cluster, but I don't know much about your setup.

Hi, yeah, I guess this might be it. When I run the script on my local machine the issue disappears.
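
For anyone landing here later, the suggestion above (keeping the wandb directory on a disk that is local to each machine) could be sketched roughly as follows. This is only a sketch under assumptions, not a confirmed fix for the warning: the helper names first_writable_dir and init_wandb_local and any candidate paths you pass in are hypothetical. The only wandb pieces used are the dir argument of wandb.init and the WANDB_DIR environment variable.

```python
import os
import tempfile


def first_writable_dir(candidates):
    """Return the first candidate directory that can be created and written to."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
            # Probe with a real file, since permission checks alone can be
            # misleading on network mounts.
            with tempfile.TemporaryFile(dir=path):
                pass
            return path
        except OSError:
            continue
    # Nothing was writable: fall back to a fresh directory under the system temp dir.
    return tempfile.mkdtemp(prefix="wandb-")


def init_wandb_local(project, name, config, candidates):
    """Hypothetical wrapper: start a run whose files live on a node-local disk."""
    import wandb  # imported lazily so that only the logging rank needs it

    run_dir = first_writable_dir(candidates)
    # Assumption: exporting WANDB_DIR as well keeps other wandb files next to the run.
    os.environ["WANDB_DIR"] = run_dir
    return wandb.init(project=project, name=name, dir=run_dir, config=config)
```

On rank 0 you would call init_wandb_local with a list of node-local paths that exist on your cluster, ordered by preference. Run files written this way stay on that machine, so they need to be copied or synced if you want them on the shared drive afterwards.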