RuntimeError when resuming a pretrained model


I want to fine-tune a model, but when I resume from a pretrained model, I get the error below:

Called with args:
Namespace(batch_size=1, checkepoch=20, checkpoint=3557, checkpoint_interval=10000, checksession=1, class_agnostic=False, cuda='--cuda', dataset='pascal_voc', disp_interval=100, large_scale=False, lr=0.0005, lr_decay_gamma=0.1, lr_decay_step=5, mGPUs=False, max_epochs=26, net='vgg16', num_workers=0, optimizer='sgd', resume=True, save_dir='/home/smartdsp/new_home/faster-rcnn.pytorch/models', session=1, start_epoch=1, use_tfboard=False)
Using config:
{'ANCHOR_RATIOS': [0.5, 1, 2], 'ANCHOR_SCALES': [8, 16, 32], 'CROP_RESIZE_WITH_MAX_POOL': False, 'CUDA': False, 'DATA_DIR': '/home/smartdsp/new_home/faster-rcnn.pytorch/data', 'DEDUP_BOXES': 0.0625, 'EPS': 1e-14, 'EXP_DIR': 'vgg16', 'FEAT_STRIDE': [16], 'GPU_ID': 0, 'MATLAB': 'matlab', 'MAX_NUM_GT_BOXES': 20, 'MOBILENET': {'DEPTH_MULTIPLIER': 1.0, 'FIXED_LAYERS': 5, 'REGU_DEPTH': False, 'WEIGHT_DECAY': 4e-05}, 'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]), 'POOLING_MODE': 'align', 'POOLING_SIZE': 7, 'RESNET': {'FIXED_BLOCKS': 1, 'MAX_POOL': False}, 'RNG_SEED': 3, 'ROOT_DIR': '/home/smartdsp/new_home/faster-rcnn.pytorch', 'TEST': {'BBOX_REG': True, 'HAS_RPN': True, 'MAX_SIZE': 1000, 'MODE': 'nms', 'NMS': 0.3, 'PROPOSAL_METHOD': 'gt', 'RPN_MIN_SIZE': 16, 'RPN_NMS_THRESH': 0.7, 'RPN_POST_NMS_TOP_N': 300, 'RPN_PRE_NMS_TOP_N': 6000, 'RPN_TOP_N': 5000, 'SCALES': [600], 'SVM': False}, 'TRAIN': {'ASPECT_GROUPING': False, 'BATCH_SIZE': 256, 'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0], 'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2], 'BBOX_NORMALIZE_TARGETS': True, 'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True, 'BBOX_REG': True, 'BBOX_THRESH': 0.5, 'BG_THRESH_HI': 0.5, 'BG_THRESH_LO': 0.0, 'BIAS_DECAY': False, 'BN_TRAIN': False, 'DISPLAY': 10, 'DOUBLE_BIAS': True, 'FG_FRACTION': 0.25, 'FG_THRESH': 0.5, 'GAMMA': 0.1, 'HAS_RPN': True, 'IMS_PER_BATCH': 1, 'LEARNING_RATE': 0.01, 'MAX_SIZE': 1000, 'MOMENTUM': 0.9, 'PROPOSAL_METHOD': 'gt', 'RPN_BATCHSIZE': 256, 'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'RPN_CLOBBER_POSITIVES': False, 'RPN_FG_FRACTION': 0.5, 'RPN_MIN_SIZE': 8, 'RPN_NEGATIVE_OVERLAP': 0.3, 'RPN_NMS_THRESH': 0.7, 'RPN_POSITIVE_OVERLAP': 0.7, 'RPN_POSITIVE_WEIGHT': -1.0, 'RPN_POST_NMS_TOP_N': 2000, 'RPN_PRE_NMS_TOP_N': 12000, 'SCALES': [600], 'SNAPSHOT_ITERS': 5000, 'SNAPSHOT_KEPT': 3, 'SNAPSHOT_PREFIX': 'res101_faster_rcnn', 'STEPSIZE': [30000], 'SUMMARY_INTERVAL': 180, 'TRIM_HEIGHT': 600, 'TRIM_WIDTH': 600, 'TRUNCATED': False, 'USE_ALL_GT': True, 'USE_FLIPPED': True, 'USE_GT': False, 'WEIGHT_DECAY': 0.0005}, 'USE_GPU_NMS': True}
Loaded dataset voc_2007_trainval for training
Set proposal method: gt
Appending horizontally-flipped training examples...
voc_2007_trainval gt roidb loaded from /home/smartdsp/new_home/faster-rcnn.pytorch/data/cache/voc_2007_trainval_gt_roidb.pkl
done
Preparing training data...
done
before filtering, there are 2372 images...
after filtering, there are 2372 images...
2372 roidb entries
Loading pretrained weights from data/pretrained_model/vgg16_caffe.pth
loading checkpoint /home/smartdsp/new_home/faster-rcnn.pytorch/models/vgg16/pascal_voc/vgg16_baseline/faster_rcnn_1_20_3557.pth
loaded checkpoint /home/smartdsp/new_home/faster-rcnn.pytorch/models/vgg16/pascal_voc/vgg16_baseline/faster_rcnn_1_20_3557.pth
lib/model/rpn/rpn.py:68: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  rpn_cls_prob_reshape = F.softmax(rpn_cls_score_reshape)
lib/model/faster_rcnn/faster_rcnn.py:98: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  cls_prob = F.softmax(cls_score)
/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py:330: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  loss_temp += loss.data[0]
Traceback (most recent call last):

  File "<ipython-input-1-cbbe4ee4d4f2>", line 1, in <module>
    runfile('/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py', wdir='/home/smartdsp/new_home/faster-rcnn.pytorch')
  File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 94, in execfile
    builtins.execfile(filename, *where)
  File "/home/smartdsp/new_home/faster-rcnn.pytorch/trainval_net_finetune.py", line 337, in <module>
    optimizer.step()
  File "/home/smartdsp/anaconda2/lib/python2.7/site-packages/torch/optim/sgd.py", line 101, in step
    buf.mul_(momentum).add_(1 - dampening, d_p)
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'
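
The UserWarnings in the log are separate from the crash, but they point at calls that later PyTorch versions reject. Below is a minimal sketch of the replacements the warnings themselves suggest. The tensor shape and the choice of `dim=1` are assumptions for illustration; the correct softmax dimension depends on the tensor layout in `rpn.py` and `faster_rcnn.py`.

```python
import torch
import torch.nn.functional as F

# Hypothetical class scores: 2 samples, 3 classes.
scores = torch.randn(2, 3)

# Pass the dimension explicitly instead of relying on the implicit choice.
probs = F.softmax(scores, dim=1)

# A 0-dim tensor standing in for the training loss.
loss = probs.sum()

# Use .item() instead of loss.data[0] to get a Python number.
loss_temp = 0.0
loss_temp += loss.item()
```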

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 14 (1 by maintainers)

Top GitHub Comments

11 reactions
ljtruong commented, Jul 19, 2018

@wjx2 @babyjie57 This is due to a change in the new PyTorch 0.4.

You can re-create the optimizer and move its restored state to the GPU manually like this:

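import torch
import torch.optim as optim

# Assumes `model` and `checkpoint` (from torch.load) already exist.
model.load_state_dict(checkpoint['model'])
model.cuda()
# lr is required by optim.SGD; 0.0005 matches the args shown above.
optimizer = optim.SGD(model.parameters(), lr=0.0005, momentum=0.9, weight_decay=0.0001)
optimizer.load_state_dict(checkpoint['optimizer'])
# Move the restored optimizer state (e.g. momentum buffers) to the GPU.
for state in optimizer.state.values():
    for k, v in state.items():
        if isinstance(v, torch.Tensor):
            state[k] = v.cuda()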
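
The same idea can be wrapped in a small helper that works for any device. This is a minimal sketch, not code from the repository: `move_optimizer_state` is a hypothetical name, and the tiny `nn.Linear` model stands in for the Faster R-CNN network so that the example also runs on a CPU-only machine.

```python
import torch
import torch.nn as nn
import torch.optim as optim


def move_optimizer_state(optimizer, device):
    """Move every tensor in the optimizer state (e.g. SGD momentum buffers) to `device`."""
    for state in optimizer.state.values():
        for key, value in state.items():
            if isinstance(value, torch.Tensor):
                state[key] = value.to(device)
    return optimizer


# Stand-in model and optimizer (assumptions for illustration only).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.0005, momentum=0.9)

# One step on the CPU so that momentum buffers exist, then build a checkpoint-like dict.
model(torch.randn(3, 4)).sum().backward()
optimizer.step()
checkpoint = {"model": model.state_dict(), "optimizer": optimizer.state_dict()}

# Resume: load weights, move the model, rebuild the optimizer, then move its state.
model.load_state_dict(checkpoint["model"])
model.to(device)
optimizer = optim.SGD(model.parameters(), lr=0.0005, momentum=0.9)
optimizer.load_state_dict(checkpoint["optimizer"])
move_optimizer_state(optimizer, device)

# A further training step now finds parameters, gradients and buffers on one device.
optimizer.zero_grad()
model(torch.randn(3, 4, device=device)).sum().backward()
optimizer.step()
```

Recent PyTorch releases may already place the loaded state on the parameters' device, in which case the helper does nothing harmful.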
10 reactions
jwyang commented, Jun 30, 2018

@wjx2 See the error in the last line. It is caused by a mismatch between CPU data and GPU data. Use CUDA when you run the code.
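
One way to confirm this kind of mismatch before calling `optimizer.step()` is to compare the device of each parameter with the device of its optimizer state. The sketch below is illustrative only: `find_device_mismatches` is a hypothetical helper, and the small `nn.Linear` model is a stand-in for the real network.

```python
import torch
import torch.nn as nn
import torch.optim as optim


def find_device_mismatches(optimizer):
    """Return (param_device, state_key, state_device) for every state tensor
    that lives on a different device than its parameter."""
    mismatches = []
    for param, state in optimizer.state.items():
        for key, value in state.items():
            if isinstance(value, torch.Tensor) and value.device != param.device:
                mismatches.append((param.device, key, value.device))
    return mismatches


# Stand-in model and optimizer (assumptions for illustration only).
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.0005, momentum=0.9)

# Take one step so that momentum buffers are created.
model(torch.randn(3, 4)).sum().backward()
optimizer.step()

# Everything lives on the CPU here, so no mismatches are reported.
mismatches = find_device_mismatches(optimizer)
print(mismatches)
```

In the situation from the traceback, the momentum buffer (`buf`) is a CPU tensor while the gradient (`d_p`) is a CUDA tensor, so a check like this would return a non-empty list.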
