Cannot find pseudo label for frame

I am getting an error when running train.py. It seems to have something to do with PSEUDO_LABEL not being updated. The traceback repeats for multiple frames, not just 002080 as seen below. I've also put the full output on this gist, in case the information below is not enough. Am I missing something? Thanks for any help!

Commands Run

$ NUM_GPUS=8
$ CONFIG_FILE=cfgs/da-waymo-kitti_models/pvrcnn_st3d/pvrcnn_st3d.yaml
$ bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file ${CONFIG_FILE}

Error

[2021-07-21 15:05:09,022  train.py 168  INFO]  **********************Start training da-waymo-kitti_models/pvrcnn_st3d/pvrcnn_st3d(default)**********************
generate_ps_e0: 100%|████████████████████| 232/232 [03:14<00:00,  1.19it/s, pos_ps_box=0.000(0.000), ign_ps_box=15.000(14.899)]
Traceback (most recent call last):                                                                                             
  File "train.py", line 199, in <module>
    main()
  File "train.py", line 191, in main
    ema_model=None
  File "/home/user5/open-mmlab/ST3D/tools/train_utils/train_st_utils.py", line 157, in train_model_st
    dataloader_iter=dataloader_iter, ema_model=ema_model
  File "/home/user5/open-mmlab/ST3D/tools/train_utils/train_st_utils.py", line 42, in train_one_epoch_st
    target_batch = next(dataloader_iter)
  File "/home/user5/anaconda3/envs/st3d7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/home/user5/anaconda3/envs/st3d7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "/home/user5/anaconda3/envs/st3d7/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/user5/anaconda3/envs/st3d7/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/user5/open-mmlab/ST3D/tools/../pcdet/datasets/kitti/kitti_dataset.py", line 413, in __getitem__
    self.fill_pseudo_labels(input_dict)
  File "/home/user5/open-mmlab/ST3D/tools/../pcdet/datasets/dataset.py", line 146, in fill_pseudo_labels
    gt_boxes = self_training_utils.load_ps_label(input_dict['frame_id'])
  File "/home/user5/open-mmlab/ST3D/tools/../pcdet/utils/self_training_utils.py", line 221, in load_ps_label
    raise ValueError('Cannot find pseudo label for frame: %s' % frame_id)
ValueError: Cannot find pseudo label for frame: 002080

epochs:   0%|                                                                                           | 0/30 [04:05<?, ?it/s]

Environment

Python 3.7
CUDA 10.0
PyTorch 1.1
spconv 1.0 (commit 8da6f96)
pcdet 0.2.0+73dda8c

Issue Analytics

Created 2 years ago
Comments: 17 (6 by maintainers)

Top GitHub Comments

4 reactions

hughjazzman commented, Aug 5, 2021

@Liz66666 I followed @AndyYuan96’s advice on using Manager().dict() for PSEUDO_LABEL in self_training_utils.py.

from multiprocessing import Manager

PSEUDO_LABEL = Manager().dict()

Before doing this, I had already managed to run the training by adding the pkl.load code from check_already_exsit_pseudo_label to load_ps_label. I then switched to the solution above and reran the training, as it should be better.
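For anyone who wants to try the first workaround mentioned above (reloading the pickle inside the loader), here is a minimal sketch of the idea. It is not the actual ST3D code: the function name load_ps_label_with_fallback, the ps_label_path argument, and the assumption that the pickle holds a dict mapping frame IDs to entries with a 'gt_boxes' key are all hypothetical and only chosen to match the output shown elsewhere in this thread.

```python
import os
import pickle

# Stand-in for the module-level cache in self_training_utils.py (assumption).
PSEUDO_LABEL = {}


def load_ps_label_with_fallback(frame_id, ps_label_path):
    """Return the pseudo-label boxes for a frame.

    If the frame is missing from the in-memory cache, try to refresh the
    cache from the pickle file on disk before giving up.
    """
    if frame_id not in PSEUDO_LABEL and os.path.exists(ps_label_path):
        # Hypothetical file layout: {frame_id: {'gt_boxes': ..., ...}, ...}
        with open(ps_label_path, "rb") as f:
            PSEUDO_LABEL.update(pickle.load(f))

    if frame_id in PSEUDO_LABEL:
        return PSEUDO_LABEL[frame_id]["gt_boxes"]

    raise ValueError("Cannot find pseudo label for frame: %s" % frame_id)
```

This only helps if the pseudo labels have already been written to disk by the time a worker asks for them, and every worker pays the cost of reading the file once, which is probably why the shared dict is the cleaner option.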

1 reaction

AndyYuan96 commented, Aug 4, 2021

Yes.

>>> len(data['002829']['gt_boxes'])
36
>>> data['002829']['gt_boxes'][0]
array([-12.67259693,   2.17882085,   0.4152911 ,   4.12861061,
         1.84807122,   1.557392  ,   3.2007041 ,   1.        ,
         0.81863701])
>>> data['002829']['cls_scores']
array([0.93731874, 0.92046964, 0.9030588 , 0.9078988 , 0.8054646 ,
       0.80939907, 0.88196886, 0.85936695, 0.86304134, 0.8979211 ,
       0.8379814 , 0.6361919 , 0.71232146, 0.8505274 , 0.70933247,
       0.7508229 , 0.6692671 , 0.4446245 , 0.46104938, 0.15859666,
       0.3290353 , 0.1491799 , 0.24889277, 0.13674273, 0.13740459,
       0.11474051, 0.14078389, 0.14664578, 0.11770795, 0.25883213,
       0.11457415, 0.12707922, 0.13901637, 0.12712085, 0.16028087,
       0.18077461], dtype=float32)
>>> data['002829']['iou_scores']
array([0.818637  , 0.8166671 , 0.8074603 , 0.8054869 , 0.79433286,
       0.7942598 , 0.7878299 , 0.78539443, 0.7752307 , 0.76748055,
       0.75878924, 0.7525175 , 0.7474871 , 0.74367374, 0.7227673 ,
       0.7154487 , 0.67805743, 0.64385206, 0.61908954, 0.55265254,
       0.5234711 , 0.4574411 , 0.34837985, 0.33632967, 0.32025087,
       0.2999313 , 0.26421076, 0.24447218, 0.24083728, 0.21878265,
       0.21566562, 0.19372286, 0.19200402, 0.16185848, 0.1382765 ,
       0.10244821], dtype=float32)
>>> data['002829']['memory_counter']
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0.])

Training with one GPU works normally. For 8 GPUs, you need to change the code: use the multiprocessing library, replace dict() with Manager().dict(), and then carefully update the pseudo-label save and load code.
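To illustrate why a plain dict() is not enough once several processes are involved, here is a small standalone sketch. It has nothing ST3D-specific in it: the helper names (_writer, child_update_visible, compare_plain_and_managed) and the placeholder label are made up for the demonstration. It writes one entry from a child process and checks whether the parent can see it, first with a plain dict and then with a Manager dict.

```python
import multiprocessing as mp


def _writer(store, frame_id):
    # Runs in a child process and records a placeholder label for one frame.
    store[frame_id] = {"gt_boxes": [[0.0] * 9]}


def child_update_visible(store, ctx, frame_id="002080"):
    """Return True if a write made in a child process is visible in the parent."""
    proc = ctx.Process(target=_writer, args=(store, frame_id))
    proc.start()
    proc.join()
    return frame_id in store


def compare_plain_and_managed():
    # Assumption: use the "fork" start method where the platform offers it,
    # otherwise fall back to the platform default.
    methods = mp.get_all_start_methods()
    ctx = mp.get_context("fork" if "fork" in methods else None)
    with ctx.Manager() as manager:
        plain = child_update_visible({}, ctx)              # child edits its own copy
        shared = child_update_visible(manager.dict(), ctx)  # child edits via a proxy
    return plain, shared


if __name__ == "__main__":
    print(compare_plain_and_managed())
```

The plain dict is copied into the child, so the parent never sees the new key, while the Manager dict lives in a separate server process and every process talks to it through a proxy. The trade-off is that each access to the shared dict goes through inter-process communication, so it is slower than a local dict.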
