Error during training (Assertion input_val >= zero && input_val <= one failed.)

Problem

Thank you for your contribution. I encountered exploding gradients while training the model tood_r50_fpn_1x_coco.

  • I tried to train this model with mixed-precision training, with the loss scale set to 'dynamic'. Training soon stopped and raised RuntimeError: CUDA error: device-side assert triggered.

  • I also retrained the model in FP32 precision, but that did not help.

  • A lower learning rate did not fix the exploding gradients.

  • Gradient clipping (with mixed-precision training, loss scale = 512) avoids the training failure, but the model does not converge.

    I searched for this issue online. I do not think it is an out-of-memory (OOM) error. It seems to be related to NaN values in the prediction head, which then cause the error when the loss is computed. I do not know whether the environment (MMDetection 2.15.0) affects training.
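To illustrate why NaN values in the prediction head would trip this particular assertion, here is a minimal pure-Python sketch of the range check quoted in the error message (`input_val >= zero && input_val <= one`). The helper names `in_unit_interval` and `binary_cross_entropy` are hypothetical and are not part of PyTorch or MMDetection. Every comparison against NaN evaluates to false, so a NaN score fails the check in the same way an out-of-range score does.

```python
import math


def in_unit_interval(p: float) -> bool:
    """Mirror of the check in the error message: input_val >= zero && input_val <= one."""
    return p >= 0.0 and p <= 1.0


def binary_cross_entropy(p: float, target: float) -> float:
    """Scalar binary cross-entropy that rejects inputs outside [0, 1], including NaN."""
    if not in_unit_interval(p):
        raise ValueError(f"BCE input must lie in [0, 1], got {p!r}")
    eps = 1e-12  # keeps log() finite at the ends of the interval
    return -(target * math.log(max(p, eps))
             + (1.0 - target) * math.log(max(1.0 - p, eps)))


print(in_unit_interval(0.7))           # True: a valid probability
print(in_unit_interval(1.5))           # False: out of range
print(in_unit_interval(float("nan")))  # False: every comparison with NaN is false
```

Under that reading, the assertion is a symptom rather than the cause: the values reaching the loss have already become NaN or left [0, 1] earlier in the forward pass.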

My modification

  • I ported the TOOD code to my working environment (MMDetection 2.15.0) without modification.
  • I edited the training config to train on my own dataset.
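For readers unfamiliar with how the mixed-precision and gradient-clipping settings mentioned in the Problem section are usually expressed, the fragment below is a sketch of the relevant entries in an MMDetection 2.x style config file. It is not the config used in this issue, and the `max_norm` and `norm_type` values are placeholder assumptions.

```python
# Hypothetical excerpt of an MMDetection 2.x training config (a plain Python file).
# Values are illustrative placeholders, not the settings from this issue.

# Mixed-precision training with a dynamically adjusted loss scale.
# A fixed scale would be written as, for example, dict(loss_scale=512.).
fp16 = dict(loss_scale='dynamic')

# Clip the gradient norm before each optimizer step.
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
```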

Environment

2021-12-09 16:50:01,643 - mmdet - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 2070
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.4.r11.4/compiler.30033411_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.3-Product Build 20210617 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0
OpenCV: 4.5.3
MMCV: 1.3.10
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.15.0+87eda06
------------------------------------------------------------

Error Report

/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [32,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [33,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [34,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [35,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [36,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [37,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [38,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [39,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [40,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [41,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [42,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [43,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [44,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [45,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [46,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [47,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [48,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [49,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [50,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [51,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [52,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [53,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [54,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [55,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [56,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [57,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [58,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [59,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [60,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [61,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [62,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [19,0,0], thread: [63,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [32,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [33,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [34,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [35,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [36,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [37,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [38,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [39,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [40,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [41,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [42,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [43,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [44,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [45,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [46,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [47,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [48,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [49,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [50,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [51,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [52,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [53,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [54,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [55,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [56,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [57,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [58,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [59,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [60,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [61,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [62,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [63,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [1,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [2,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [3,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [4,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [5,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [6,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [7,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [8,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [9,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [10,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [11,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [12,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [13,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [14,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [15,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [16,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [17,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [18,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [19,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [20,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [21,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [22,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [23,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [24,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [25,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [26,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [27,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [28,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [29,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [30,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [33,0,0], thread: [31,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [1,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [2,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [3,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [4,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [5,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [6,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [7,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [8,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [9,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [10,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [11,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [12,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [13,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [14,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [15,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [16,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [17,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [18,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [19,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [20,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [21,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [22,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [23,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [24,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [25,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [26,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [27,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [28,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [29,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [30,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [103,0,0], thread: [31,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [1,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [2,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [3,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [4,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [5,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [6,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [7,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [8,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [9,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [10,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [11,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [12,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [13,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [14,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [15,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [16,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [17,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [18,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [19,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [20,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [21,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [22,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [23,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [24,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [25,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [26,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [27,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [28,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [29,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [30,0,0] Assertion `input_val >= zero && input_val <= one` failed.
/opt/conda/conda-bld/pytorch_1623448265233/work/aten/src/ATen/native/cuda/Loss.cu:111: operator(): block: [31,0,0], thread: [31,0,0] Assertion `input_val >= zero && input_val <= one` failed.
Traceback (most recent call last):
  File "tools/train.py", line 188, in <module>
    main()
  File "tools/train.py", line 184, in main
    meta=meta)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 97, in new_func
    return old_func(*args, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/detectors/base.py", line 171, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/detectors/single_stage.py", line 83, in forward_train
    gt_labels, gt_bboxes_ignore)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/dense_heads/base_dense_head.py", line 54, in forward_train
    losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 185, in new_func
    return old_func(*args, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/dense_heads/tood_head.py", line 426, in loss
    num_total_samples=num_total_samples)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/core/utils/misc.py", line 29, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet-2.15.0-py3.7.egg/mmdet/models/dense_heads/tood_head.py", line 333, in loss_single
    & (labels < bg_class_ind)).nonzero().squeeze(1)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
terminate called after throwing an instance of 'c10::CUDAError'
  what():  CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from create_event_internal at /opt/conda/conda-bld/pytorch_1623448265233/work/c10/cuda/CUDACachingAllocator.cpp:1055 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f12c21efa22 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x10ac3 (0x7f12c2451ac3 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x1a7 (0x7f12c2453167 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7f12c21d95a4 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #4: <unknown function> + 0xa2bb12 (0x7f133bad0b12 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xa2bbb1 (0x7f133bad0bb1 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #24: __libc_start_main + 0xe7 (0x7f1376d75bf7 in /lib/x86_64-linux-gnu/libc.so.6)

Aborted
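
For readers puzzling over the assertion text: as far as I can tell, `Loss.cu` holds PyTorch's CUDA binary cross-entropy kernel, and the check `input_val >= zero && input_val <= one` fails for any input outside [0, 1]. That includes NaN, because every comparison with NaN is false. Below is a plain-Python sketch of that predicate, not the kernel itself; `failing_inputs` and `describe` are hypothetical helper names introduced here for illustration.

```python
import math


def failing_inputs(values):
    """Return (index, value) pairs that would fail the check 0 <= v <= 1.

    NaN fails every comparison, so NaN entries are reported as well.
    """
    return [(i, v) for i, v in enumerate(values) if not (0.0 <= v <= 1.0)]


def describe(values):
    """Summarise why a batch of probabilities is not a valid BCE input."""
    bad = failing_inputs(values)
    n_nan = sum(1 for _, v in bad if math.isnan(v))
    return {"failing": len(bad), "nan": n_nan, "out_of_range": len(bad) - n_nan}
```

With real tensors the analogous check would be something like `torch.isfinite(pred).all()` plus a min/max test just before the loss call. Since the traceback above ends in a `nonzero()` call rather than in a loss function, rerunning with `CUDA_LAUNCH_BLOCKING=1`, as the error message suggests, should make the reported line match the kernel that actually failed.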

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

fcjian commented, Dec 25, 2021 (2 reactions)

MeteoriteWeny commented, Dec 29, 2021 (1 reaction)

@fcjian Thanks for the reply! It solves the CUDA error, but the model still does not converge. During training, a problem similar to the one I saw with gradient clipping occurred: the log shows a sudden increase in loss (loss_cls jumps from 0.6210 to 1.1693 between iterations 900 and 950 of epoch 1), after which the loss only fluctuates within a narrow range. I'll try again with the original TOOD code, without porting it to a higher mmdet version.

2021-12-29 09:32:52,217 - mmdet - INFO - Epoch [1][600/1162]	lr: 2.000e-03, eta: 8:39:18, time: 0.544, data_time: 0.013, memory: 5142, loss_cls: 0.6940, loss_bbox: 1.2061, loss: 1.9001
2021-12-29 09:33:18,832 - mmdet - INFO - Epoch [1][650/1162]	lr: 2.000e-03, eta: 8:38:09, time: 0.532, data_time: 0.013, memory: 5142, loss_cls: 0.6794, loss_bbox: 1.1886, loss: 1.8680
2021-12-29 09:33:45,535 - mmdet - INFO - Epoch [1][700/1162]	lr: 2.000e-03, eta: 8:37:13, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 0.6674, loss_bbox: 1.0485, loss: 1.7159
2021-12-29 09:34:12,217 - mmdet - INFO - Epoch [1][750/1162]	lr: 2.000e-03, eta: 8:36:19, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 0.6646, loss_bbox: 1.0119, loss: 1.6765
2021-12-29 09:34:38,781 - mmdet - INFO - Epoch [1][800/1162]	lr: 2.000e-03, eta: 8:35:20, time: 0.531, data_time: 0.013, memory: 5142, loss_cls: 0.6487, loss_bbox: 0.9564, loss: 1.6051
2021-12-29 09:35:05,190 - mmdet - INFO - Epoch [1][850/1162]	lr: 2.000e-03, eta: 8:34:14, time: 0.528, data_time: 0.013, memory: 5142, loss_cls: 0.6176, loss_bbox: 0.8406, loss: 1.4582
2021-12-29 09:35:31,799 - mmdet - INFO - Epoch [1][900/1162]	lr: 2.000e-03, eta: 8:33:26, time: 0.532, data_time: 0.013, memory: 5142, loss_cls: 0.6210, loss_bbox: 0.9229, loss: 1.5439
2021-12-29 09:35:58,144 - mmdet - INFO - Epoch [1][950/1162]	lr: 2.000e-03, eta: 8:32:24, time: 0.527, data_time: 0.013, memory: 5142, loss_cls: 1.1693, loss_bbox: 1.1850, loss: 2.3543
2021-12-29 09:36:25,339 - mmdet - INFO - Exp name: tood_r50_fpn_on_input_1x_coco_cloth.py
2021-12-29 09:36:25,340 - mmdet - INFO - Epoch [1][1000/1162]	lr: 2.000e-03, eta: 8:32:14, time: 0.544, data_time: 0.013, memory: 5142, loss_cls: 1.2817, loss_bbox: 1.3174, loss: 2.5991
2021-12-29 09:36:52,114 - mmdet - INFO - Epoch [1][1050/1162]	lr: 2.000e-03, eta: 8:31:39, time: 0.535, data_time: 0.013, memory: 5142, loss_cls: 1.2358, loss_bbox: 1.2847, loss: 2.5205
2021-12-29 09:37:18,908 - mmdet - INFO - Epoch [1][1100/1162]	lr: 2.000e-03, eta: 8:31:07, time: 0.536, data_time: 0.013, memory: 5142, loss_cls: 1.2365, loss_bbox: 1.3173, loss: 2.5538
2021-12-29 09:37:45,867 - mmdet - INFO - Epoch [1][1150/1162]	lr: 2.000e-03, eta: 8:30:43, time: 0.539, data_time: 0.013, memory: 5142, loss_cls: 1.2022, loss_bbox: 1.2296, loss: 2.4319
2021-12-29 09:37:52,329 - mmdet - INFO - Saving checkpoint at 1 epochs
2021-12-29 09:38:47,804 - mmdet - INFO - Evaluating bbox...
2021-12-29 09:38:51,494 - mmdet - INFO - Exp name: tood_r50_fpn_on_input_1x_coco_cloth.py
2021-12-29 09:38:51,495 - mmdet - INFO - Epoch(val) [1][793]	bbox_mAP: 0.0170, bbox_mAP_50: 0.0560, bbox_mAP_75: 0.0090, bbox_mAP_s: -1.0000, bbox_mAP_m: 0.0240, bbox_mAP_l: 0.0190, bbox_mAP_copypaste: 0.017 0.056 0.009 -1.000 0.024 0.019
2021-12-29 09:39:21,128 - mmdet - INFO - Epoch [2][50/1162]	lr: 2.000e-03, eta: 8:27:14, time: 0.592, data_time: 0.062, memory: 5142, loss_cls: 1.2236, loss_bbox: 1.2423, loss: 2.4659
2021-12-29 09:39:47,839 - mmdet - INFO - Epoch [2][100/1162]	lr: 2.000e-03, eta: 8:26:45, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 1.2410, loss_bbox: 1.2517, loss: 2.4927
2021-12-29 09:40:14,530 - mmdet - INFO - Epoch [2][150/1162]	lr: 2.000e-03, eta: 8:26:16, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 1.2827, loss_bbox: 1.2900, loss: 2.5726
2021-12-29 09:40:41,392 - mmdet - INFO - Epoch [2][200/1162]	lr: 2.000e-03, eta: 8:25:54, time: 0.537, data_time: 0.013, memory: 5142, loss_cls: 1.2351, loss_bbox: 1.2374, loss: 2.4725
2021-12-29 09:41:08,168 - mmdet - INFO - Epoch [2][250/1162]	lr: 2.000e-03, eta: 8:25:28, time: 0.536, data_time: 0.013, memory: 5142, loss_cls: 1.1736, loss_bbox: 1.1955, loss: 2.3691
2021-12-29 09:41:34,806 - mmdet - INFO - Epoch [2][300/1162]	lr: 2.000e-03, eta: 8:24:57, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.2357, loss_bbox: 1.2372, loss: 2.4729
2021-12-29 09:42:01,528 - mmdet - INFO - Epoch [2][350/1162]	lr: 2.000e-03, eta: 8:24:29, time: 0.534, data_time: 0.013, memory: 5142, loss_cls: 1.2839, loss_bbox: 1.2587, loss: 2.5425
2021-12-29 09:42:28,154 - mmdet - INFO - Epoch [2][400/1162]	lr: 2.000e-03, eta: 8:23:58, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.2595, loss_bbox: 1.2359, loss: 2.4954
2021-12-29 09:42:54,986 - mmdet - INFO - Epoch [2][450/1162]	lr: 2.000e-03, eta: 8:23:35, time: 0.537, data_time: 0.013, memory: 5142, loss_cls: 1.2725, loss_bbox: 1.3049, loss: 2.5773
2021-12-29 09:43:21,637 - mmdet - INFO - Epoch [2][500/1162]	lr: 2.000e-03, eta: 8:23:05, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.2867, loss_bbox: 1.2862, loss: 2.5730
2021-12-29 09:43:48,377 - mmdet - INFO - Epoch [2][550/1162]	lr: 2.000e-03, eta: 8:22:38, time: 0.535, data_time: 0.013, memory: 5142, loss_cls: 1.2554, loss_bbox: 1.2227, loss: 2.4781
2021-12-29 09:44:15,013 - mmdet - INFO - Epoch [2][600/1162]	lr: 2.000e-03, eta: 8:22:08, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.2519, loss_bbox: 1.2955, loss: 2.5474
2021-12-29 09:44:42,014 - mmdet - INFO - Epoch [2][650/1162]	lr: 2.000e-03, eta: 8:21:49, time: 0.540, data_time: 0.013, memory: 5142, loss_cls: 1.2472, loss_bbox: 1.2727, loss: 2.5199
2021-12-29 09:45:08,675 - mmdet - INFO - Epoch [2][700/1162]	lr: 2.000e-03, eta: 8:21:20, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.1740, loss_bbox: 1.2461, loss: 2.4200
2021-12-29 09:45:35,666 - mmdet - INFO - Epoch [2][750/1162]	lr: 2.000e-03, eta: 8:21:00, time: 0.540, data_time: 0.013, memory: 5142, loss_cls: 1.2391, loss_bbox: 1.2960, loss: 2.5351
2021-12-29 09:46:02,395 - mmdet - INFO - Epoch [2][800/1162]	lr: 2.000e-03, eta: 8:20:33, time: 0.535, data_time: 0.013, memory: 5142, loss_cls: 1.2462, loss_bbox: 1.2470, loss: 2.4933
2021-12-29 09:46:29,543 - mmdet - INFO - Epoch [2][850/1162]	lr: 2.000e-03, eta: 8:20:17, time: 0.543, data_time: 0.013, memory: 5142, loss_cls: 1.2525, loss_bbox: 1.3128, loss: 2.5653
2021-12-29 09:46:56,271 - mmdet - INFO - Epoch [2][900/1162]	lr: 2.000e-03, eta: 8:19:50, time: 0.535, data_time: 0.013, memory: 5142, loss_cls: 1.2501, loss_bbox: 1.2733, loss: 2.5234
2021-12-29 09:47:22,898 - mmdet - INFO - Epoch [2][950/1162]	lr: 2.000e-03, eta: 8:19:19, time: 0.533, data_time: 0.013, memory: 5142, loss_cls: 1.3215, loss_bbox: 1.2575, loss: 2.5790
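
To locate the jump described in the comment without reading the log by eye, a short script can scan the training lines for the first sharp rise in total loss. This is a minimal sketch that assumes the log format shown above; `parse_losses`, `first_jump`, and the `ratio` threshold of 1.3 are hypothetical choices made for illustration, not part of mmdet.

```python
import re

# Matches training lines such as "Epoch [1][950/1162] ... loss: 2.3543".
# Validation lines ("Epoch(val) [1][793]") do not match and are skipped.
LOG_RE = re.compile(
    r"Epoch \[(\d+)\]\[(\d+)/\d+\].*?"
    r"loss_cls: ([\d.]+), loss_bbox: ([\d.]+), loss: ([\d.]+)"
)


def parse_losses(lines):
    """Yield (epoch, iteration, total_loss) for each training log line."""
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            yield int(m.group(1)), int(m.group(2)), float(m.group(5))


def first_jump(lines, ratio=1.3):
    """Return (epoch, iteration, previous_loss, loss) for the first logged
    step whose loss exceeds `ratio` times the previous logged loss,
    or None if no such step exists."""
    prev = None
    for epoch, iteration, loss in parse_losses(lines):
        if prev is not None and loss > ratio * prev:
            return epoch, iteration, prev, loss
        prev = loss
    return None
```

Run over the lines above (for example `first_jump(open("train.log"))`, where the file name is a placeholder), this reports epoch 1, iteration 950, where the total loss goes from 1.5439 to 2.3543.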