[CLI] With WANDB_MODE=offline, the Python client still tries to sync results
Description
I use W&B on a server without an Internet connection. To work with W&B, I set WANDB_MODE=offline and then sync my runs from another machine that has an Internet connection. Everything worked fine until I updated to wandb version 0.12.2. With version 0.12.2, when the program ends, I get the message “… wandb: Network error (ConnectTimeout), entering retry loop.” and the program cannot terminate.
Code to reproduce (run on a machine without an internet connection):
# test_wandb_offline.py
import os

# Set offline mode before wandb is imported.
os.environ["WANDB_MODE"] = "offline"

import numpy as np
import pandas as pd
import wandb

if __name__ == "__main__":
    print(wandb.__version__)
    wandb.init(project="my_wandb_test_project")
    wandb.config["run_1"] = 1
    wandb.log({"val_1": 1})
    wandb.log({"val_1": 2})
    wandb.log({"val_1": 3})
    wandb.summary.update({"val_2": 123})
    print("Done!")
With wandb version 0.12.2, the program cannot terminate and the output looks like:
0.12.2
wandb: W&B syncing is set to `offline` in this directory. Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
~/anaconda/lib/python3.8/site-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Done!
wandb: Waiting for W&B process to finish, PID 24846
wandb: Program ended successfully.
wandb: Network error (ConnectTimeout), entering retry loop.
With wandb version 0.12.1 and earlier, the program terminates normally.
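For the sync step mentioned in the description, here is a minimal sketch that lists offline run directories and prints a `wandb sync` command for each. It assumes the default layout, where offline runs are stored as `offline-run-*` directories under a local `wandb` folder. The helper name `find_offline_runs` is hypothetical and not part of the wandb API.

```python
from pathlib import Path


def find_offline_runs(wandb_dir):
    """Return the offline run directories under wandb_dir, sorted by name.

    Assumes offline runs are stored as 'offline-run-*' directories,
    which is an assumption about the default wandb layout.
    """
    root = Path(wandb_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.glob("offline-run-*") if p.is_dir())


if __name__ == "__main__":
    # Print one sync command per offline run; run these on a machine
    # that has an Internet connection.
    for run_dir in find_offline_runs("wandb"):
        print(f"wandb sync {run_dir}")
```

The script only prints the commands and never contacts the network itself, so it is safe to run on the offline node.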
Environment
- OS: CentOS Linux release 7.9.2009 (Core)
- Environment: n/a
- Python Version: 3.7.0
Issue Analytics
- State:
- Created 2 years ago
- Comments:10 (2 by maintainers)
Thanks for the report, we’re tracking this and will get a fix in the next release. Until then, please use 0.12.1 and WANDB_MODE=offline on nodes without internet.

Hi, I’m still having the same problem on 0.12.14 on an HPC system that doesn’t have internet access. More specifically, I’m running PyTorch Lightning 1.6.0; the first training logging works without problems, but the error appears when the first validation epoch finishes. When I run the same code in offline mode on a computer that has internet access, the problem doesn’t appear. I didn’t have the problem previously on 0.12.4.
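As a rough illustration of the maintainers' workaround (stay on 0.12.1 on nodes without internet), the sketch below checks the installed wandb version before a job starts. The helper names and the `KNOWN_AFFECTED` set are hypothetical. The set lists only 0.12.2, the version confirmed in the original report; the later 0.12.14 comment is left out because it was not confirmed in this thread. It assumes Python 3.8 or newer for `importlib.metadata`.

```python
from importlib import metadata

# Assumption: only the version confirmed in the original report is listed.
KNOWN_AFFECTED = {"0.12.2"}


def is_known_affected(version):
    """Return True if the version string is in the known-affected set."""
    return version.strip() in KNOWN_AFFECTED


def installed_wandb_version():
    """Return the installed wandb version string, or None if not installed."""
    try:
        return metadata.version("wandb")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_wandb_version()
    if version is None:
        print("wandb is not installed")
    elif is_known_affected(version):
        print(f"wandb {version} is reported to hang in offline mode; "
              "consider pinning 0.12.1 on nodes without internet")
    else:
        print(f"wandb {version} is not in the known-affected set")
```

Reading the version from package metadata avoids importing wandb itself, so the check can run before any W&B process is started.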