BUG: Syncing takes forever.
Describe the bug
I am using PyTorch Lightning for training. Once training completes, I get the message
Still waiting for the remaining operations to complete
and it keeps going forever (at least 2 days now).
Reproduction
This isn't reproducible every time, only under certain conditions. In the past 2 months I have hit this 3-4 times.
Traceback
No traceback; the run just hangs with the message above.
### Environment
Collecting environment information...
PyTorch version: 1.10.0+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.31
Python version: 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.11.0-38-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: RTX A6000
GPU 1: RTX A6000
GPU 2: RTX A6000
GPU 3: RTX A6000
GPU 4: RTX A6000
GPU 5: RTX A6000
GPU 6: RTX A6000
GPU 7: RTX A6000
Nvidia driver version: 460.91.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] mypy==0.910
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.21.1
[pip3] pytorch-lightning==1.5.7
[pip3] torch==1.10.0+cu111
[pip3] torch-poly-lr-decay==0.0.1
[pip3] torchaudio==0.10.0+cu111
[pip3] torchmetrics==0.6.2
[pip3] neptune-client==0.12.1
Top GitHub Comments
Hey @stonelazy
Any updates?
Additionally, I also learned that if you write a lot of data to stdout (especially progress bars such as tqdm), Neptune might take a long time to post-process it.
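If reducing stdout volume is the goal, here is a minimal sketch of how that could look. It assumes the PyTorch Lightning `NeptuneLogger` (pytorch-lightning 1.5.x with neptune-client 0.12.x) forwards extra keyword arguments to `neptune.init()`, and the project name is purely hypothetical:

```python
# Sketch: cut down the stdout/stderr stream Neptune has to sync after training.
# Assumes pytorch-lightning 1.5.x + neptune-client 0.12.x, where extra kwargs
# given to NeptuneLogger are passed through to neptune.init().
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import NeptuneLogger

neptune_logger = NeptuneLogger(
    project="my-workspace/my-project",  # hypothetical project name
    capture_stdout=False,               # don't upload everything printed to stdout
    capture_stderr=False,               # tqdm writes its progress bars to stderr
)

trainer = Trainer(
    logger=neptune_logger,
    enable_progress_bar=False,  # avoid streaming tqdm output in the first place
    max_epochs=10,
)
```

With stdout/stderr capture off and the progress bar disabled, there is far less console output queued for upload when the run ends, which should shorten the "remaining operations" phase.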