
RuntimeError when resuming a pretrained model


I want to fine-tune a model, but when I resume from a pretrained checkpoint, I get the error below:

Called with args:
Namespace(batch_size=1, checkepoch=20, checkpoint=3557, checkpoint_interval=10000, checksession=1, class_agnostic=False, cuda='--cuda', dataset='pascal_voc', disp_interval=100, large_scale=False, lr=0.0005, lr_decay_gamma=0.1, lr_decay_step=5, mGPUs=False, max_epochs=26, net='vgg16', num_workers=0, optimizer='sgd', resume=True, save_dir='/home/smartdsp/new_home/faster-rcnn.pytorch/models', session=1, start_epoch=1, use_tfboard=False)
Using config:
{'ANCHOR_RATIOS': [0.5, 1, 2],
 'ANCHOR_SCALES': [8, 16, 32],
 'CROP_RESIZE_WITH_MAX_POOL': False,
 'CUDA': False,
 'DATA_DIR': '/home/smartdsp/new_home/faster-rcnn.pytorch/data',
 'DEDUP_BOXES': 0.0625,
 'EPS': 1e-14,
 'EXP_DIR': 'vgg16',
 'FEAT_STRIDE': [16],
 'GPU_ID': 0,
 'MATLAB': 'matlab',
 'MAX_NUM_GT_BOXES': 20,
 'MOBILENET': {'DEPTH_MULTIPLIER': 1.0, 'FIXED_LAYERS': 5, 'REGU_DEPTH': False, 'WEIGHT_DECAY': 4e-05},
 'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
 'POOLING_MODE': 'align',
 'POOLING_SIZE': 7,
 'RESNET': {'FIXED_BLOCKS': 1, 'MAX_POOL': False},
 'RNG_SEED': 3,
 'ROOT_DIR': '/home/smartdsp/new_home/faster-rcnn.pytorch',
 'TEST': {'BBOX_REG': True, 'HAS_RPN': True, 'MAX_SIZE': 1000, 'MODE': 'nms', 'NMS': 0.3, 'PROPOSAL_METHOD': 'gt', 'RPN_MIN_SIZE': 16, 'RPN_NMS_THRESH': 0.7, 'RPN_POST_NMS_TOP_N': 300, 'RPN_PRE_NMS_TOP_N': 6000, 'RPN_TOP_N': 5000, 'SCALES': [600], 'SVM': False},
 'TRAIN': {'ASPECT_GROUPING': False, 'BATCH_SIZE': 256, 'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0], 'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2], 'BBOX_NORMALIZE_TARGETS': True, 'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True, 'BBOX_REG': True, 'BBOX_THRESH': 0.5, 'BG_THRESH_HI': 0.5, 'BG_THRESH_LO': 0.0, 'BIAS_DECAY': False, 'BN_TRAIN': False, 'DISPLAY': 10, 'DOUBLE_BIAS': True, 'FG_FRACTION': 0.25, 'FG_THRESH': 0.5, 'GAMMA': 0.1, 'HAS_RPN': True, 'IMS_PER_BATCH': 1, 'LEARNING_RATE': 0.01, 'MAX_SIZE': 1000, 'MOMENTUM': 0.9, 'PROPOSAL_METHOD': 'gt', 'RPN_BATCHSIZE': 256, 'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'RPN_CLOBBER_POSITIVES': False, 'RPN_FG_FRACTION': 0.5, 'RPN_MIN_SIZE': 8, 'RPN_NEGATIVE_OVERLAP': 0.3, 'RPN_NMS_THRESH': 0.7, 'RPN_POSITIVE_OVERLAP': 0.7, 'RPN_POSITIVE_WEIGHT': -1.0, 'RPN_POST_NMS_TOP_N': 2000, 'RPN_PRE_NMS_TOP_N': 12000, 'SCALES': [600], 'SNAPSHOT_ITERS': 5000, 'SNAPSHOT_KEPT': 3, 'SNAPSHOT_PREFIX': 'res101_faster_rcnn', 'STEPSIZE': [30000], 'SUMMARY_INTERVAL': 180, 'TRIM_HEIGHT': 600, 'TRIM_WIDTH': 600, 'TRUNCATED': False, 'USE_ALL_GT': True, 'USE_FLIPPED': True, 'USE_GT': False, 'WEIGHT_DECAY': 0.0005},
 'USE_GPU_NMS': True}
Loaded dataset voc_2007_trainval for training
Set proposal method: gt
Appending horizontally-flipped training examples...
voc_2007_trainval gt roidb loaded from /home/smartdsp/new_home/faster-rcnn.pytorch/data/cache/voc_2007_trainval_gt_roidb.pkl
done
Preparing training data...
done
before filtering, there are 2372 images...
after filtering, there are 2372 images...
2372 roidb entries
Loading pretrained weights from data/pretrained_model/vgg16_caffe.pth
loading checkpoint /home/smartdsp/new_home/faster-rcnn.pytorch/models/vgg16/pascal_voc/vgg16_baseline/faster_rcnn_1_20_3557.pth
loaded checkpoint /home/smartdsp/new_home/faster-rcnn.pytorch/models/vgg16/pascal_voc/vgg16_baseline/faster_rcnn_1_20_3557.pth
lib/model/rpn/rpn.py:68: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  rpn_cls_prob_reshape = F.softmax(rpn_cls_score_reshape)
lib/model/faster_rcnn/faster_rcnn.py:98: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  cls_prob = F.softmax(cls_score)
/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py:330: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  loss_temp += loss.data[0]
Traceback (most recent call last):

  File "<ipython-input-1-cbbe4ee4d4f2>", line 1, in <module>
    runfile('/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py', wdir='/home/smartdsp/new_home/faster-rcnn.pytorch')
  File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 94, in execfile
    builtins.execfile(filename, *where)
  File "/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py", line 337, in <module>
    optimizer.step()
  File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/torch/optim/sgd.py", line 101, in step
    buf.mul_(momentum).add_(1 - dampening, d_p)
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'

Issue Analytics

Created 5 years ago
Comments: 14 (1 by maintainers)

Top GitHub Comments

11 reactions

ljtruong commented, Jul 19, 2018

@wjx2 @babyjie57 This change is needed because of the new PyTorch 0.4.

You can reload the weights and move the restored optimizer state to the GPU manually like this:

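import torch
import torch.optim as optim

# `model` and `checkpoint` are assumed to be defined by the training script.
model.load_state_dict(checkpoint['model'])
model.cuda()
# lr is a placeholder (an assumption); load_state_dict() below restores the saved value.
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0001)
optimizer.load_state_dict(checkpoint['optimizer'])
# Move the restored optimizer state (e.g. momentum buffers) to the GPU as well.
for state in optimizer.state.values():
    for k, v in state.items():
        if isinstance(v, torch.Tensor):
            state[k] = v.cuda()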
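
For anyone who wants to try this outside the repo, here is a minimal, self-contained sketch of the same idea. It is not the repo's code: `move_optimizer_state` is a hypothetical helper name, the tiny `nn.Linear` model stands in for the detector, and the checkpoint is kept in memory instead of being read from a `.pth` file. It falls back to the CPU when no GPU is available, so it should run either way.

```python
import torch
import torch.nn as nn
import torch.optim as optim


def move_optimizer_state(optimizer, device):
    """Move every tensor in the optimizer's per-parameter state to `device`.

    Hypothetical helper: it wraps the loop from the comment above.
    """
    for state in optimizer.state.values():
        for key, value in state.items():
            if isinstance(value, torch.Tensor):
                state[key] = value.to(device)


# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the detector (an assumption for illustration only).
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Take one step so that SGD creates its momentum buffers.
model(torch.randn(3, 4)).sum().backward()
optimizer.step()

# In-memory stand-in for a checkpoint saved with torch.save().
checkpoint = {"model": model.state_dict(), "optimizer": optimizer.state_dict()}

# Resume: load the weights, move the model, then restore and move the optimizer state.
resumed = nn.Linear(4, 2)
resumed.load_state_dict(checkpoint["model"])
resumed.to(device)
resumed_optimizer = optim.SGD(resumed.parameters(), lr=0.01, momentum=0.9)
resumed_optimizer.load_state_dict(checkpoint["optimizer"])
move_optimizer_state(resumed_optimizer, device)

# A further training step now finds parameters and momentum buffers on the same device.
resumed_optimizer.zero_grad()
resumed(torch.randn(3, 4, device=device)).sum().backward()
resumed_optimizer.step()
```

The model is moved to the device before the optimizer is created, so the optimizer holds references to the parameters that are actually trained.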

10 reactions

jwyang commented, Jun 30, 2018

@wjx2 See the error in the last line of the traceback. It is caused by a mismatch between CPU data and GPU (CUDA) data. Use CUDA when you run the code.
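
One rough way to confirm such a mismatch is to list which devices the parameters and the optimizer state live on. The sketch below is only an illustration: `tensor_devices` is a hypothetical helper, and the small `nn.Linear` model is a stand-in for the real network. If the returned set contains both `cpu` and `cuda`, `optimizer.step()` can be expected to fail with a type or device error like the one in the traceback.

```python
import torch
import torch.nn as nn
import torch.optim as optim


def tensor_devices(model, optimizer):
    """Return the set of device types used by the parameters and the optimizer state.

    Hypothetical diagnostic helper, not part of PyTorch or the repo.
    """
    devices = {p.device.type for p in model.parameters()}
    for state in optimizer.state.values():
        for value in state.values():
            if isinstance(value, torch.Tensor):
                devices.add(value.device.type)
    return devices


# Use the GPU when there is one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model (an assumption); model and inputs go to the same device.
model = nn.Linear(4, 2).to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
inputs = torch.randn(3, 4).to(device)

# One training step creates the momentum buffers that the check inspects.
model(inputs).sum().backward()
optimizer.step()

found = tensor_devices(model, optimizer)
print(found)  # a single device type means everything is consistent
```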