custom classes
Hi! Could you please tell me how I can train your model on my own dataset with a different number of classes?
I created a dataset with the same directory structure as the PASCAL_VOC dataset.
Then, I changed the path to the dataset and the number of classes in the configuration file.
num_classes = 1
But I get an error:
/opt/conda/conda-bld/pytorch-nightly_1551849226410/work/aten/src/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [273,0,0], thread: [353,0,0] Assertion `indexValue >= 0 && indexValue < src.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch-nightly_1551849226410/work/aten/src/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [271,0,0], thread: [225,0,0] Assertion `indexValue >= 0 && indexValue < src.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch-nightly_1551849226410/work/aten/src/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [204,0,0], thread: [147,0,0] Assertion `indexValue >= 0 && indexValue < src.sizes[dim]` failed.
nightly_1551849226410/work/aten/src/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [31,0,0], thread: [61,0,0] Assertion `indexValue >= 0 && indexValue < src.sizes[dim]` failed.
Traceback (most recent call last):
  File "train.py", line 104, in <module>
    loss_l, loss_c = criterion(out, priors, targets)
  File "/home/maksim/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/data/code/object_detection/M2Det/layers/modules/multibox_loss.py", line 95, in forward
    loss_c[pos.view(-1,1)] = 0 # filter out pos boxes for now
RuntimeError: copy_if failed to synchronize: device-side assert triggered
I would be grateful for any help.
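The assertion in the log above checks that every index used in a gather is at least 0 and smaller than the size of the indexed dimension. A possible cause, offered as an assumption and not a confirmed diagnosis for this repository, is that SSD-style code commonly reserves class index 0 for background, so a single object class is encoded as label 1 and needs num_classes = 2. The sketch below only illustrates that range check in plain Python; the helper name and the sample labels are hypothetical.

```python
def out_of_range_labels(labels, num_classes):
    """Return the labels that cannot index a score vector of length num_classes."""
    return [label for label in labels if not 0 <= label < num_classes]


# Assumption: index 0 is reserved for background, so the one object
# class appears as label 1 in the matched targets.
matched_labels = [0, 1, 0, 1]  # hypothetical per-prior target labels

# With num_classes = 1 only index 0 is valid, so every label 1 is rejected.
bad_with_one = out_of_range_labels(matched_labels, num_classes=1)

# With num_classes = 2 (background + one object class) all labels fit.
bad_with_two = out_of_range_labels(matched_labels, num_classes=2)

print(bad_with_one)
print(bad_with_two)
```

Running the same kind of check on the real targets before they reach the loss, on the CPU, tends to give a readable Python error instead of a device-side assert.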
Issue Analytics
- State:
- Created 5 years ago
- Comments:16
@ufohuang98 I could not fix this problem. It seems the developers need to look into it and explain why it happens and how to solve it. @dshahrokhian can we hope to get a clear explanation from you about this problem? Or do you prefer to leave us alone with it? 😃
@dshahrokhian @Maxfashko @DeqiangWang @Xiehuaiqi @ufohuang98 @rw1995 Maybe you should adjust the weighting between "loss_l" and "loss_c". You can find "loss = loss_l + loss_c" in "train.py" of this project, and you can introduce a parameter lambda to change the loss, for example "loss = loss_l + 0.1*loss_c". With that, you may get correct results.
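The weighting suggested in this comment could be sketched as follows. This is only an illustration of the idea: the function name and the conf_weight parameter are hypothetical (lambda itself is a reserved word in Python, so it cannot be used as a variable name), and 0.1 is simply the value from the comment, not a tuned setting.

```python
def combine_losses(loss_l, loss_c, conf_weight=0.1):
    """Weighted sum of the localization loss and the confidence loss.

    conf_weight plays the role of the "lambda" parameter from the comment.
    Works for plain floats and for objects that support * and +, such as tensors.
    """
    return loss_l + conf_weight * loss_c


# Hypothetical loss values for one training step.
loss = combine_losses(loss_l=2.0, loss_c=10.0)
print(loss)
```

In train.py this would stand in for the line "loss = loss_l + loss_c"; whether it helps with the error reported above is not established in this thread.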