Total time difference when training
Hi guys, is there any training-time difference between 300 images and 150 images? I tried training 300 images with 55 test images (epochs = 300), and Google Colab terminated the session at epoch 267/300; it took me over 7 hours to get that far. So I split the dataset in two, hoping it would cut the time by about 50%, but from what I can see below, each epoch takes the same time as before.
Using resnet50 as network backbone For Mask R-CNN model
Applying Default Augmentation on Dataset
Train 300 images
Validate 55 images
Checkpoint Path: /content/mask_rcnn_models
Selecting layers to train
Epoch 1/300
100/100 [==============================] - 205s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.9180 - rpn_class_loss: 0.0146 - rpn_bbox_loss: 0.3307 - mrcnn_class_loss: 0.0552 - mrcnn_bbox_loss: 0.3484 - mrcnn_mask_loss: 0.1690 - val_loss: 0.6101 - val_rpn_class_loss: 0.0078 - val_rpn_bbox_loss: 0.2722 - val_mrcnn_class_loss: 0.0228 - val_mrcnn_bbox_loss: 0.1798 - val_mrcnn_mask_loss: 0.1275
Epoch 2/300
100/100 [==============================] - 121s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.4959 - rpn_class_loss: 0.0057 - rpn_bbox_loss: 0.1984 - mrcnn_class_loss: 0.0274 - mrcnn_bbox_loss: 0.1425 - mrcnn_mask_loss: 0.1218 - val_loss: 0.5547 - val_rpn_class_loss: 0.0047 - val_rpn_bbox_loss: 0.2960 - val_mrcnn_class_loss: 0.0110 - val_mrcnn_bbox_loss: 0.1219 - val_mrcnn_mask_loss: 0.1212
Epoch 3/300
100/100 [==============================] - 126s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.4234 - rpn_class_loss: 0.0043 - rpn_bbox_loss: 0.1824 - mrcnn_class_loss: 0.0206 - mrcnn_bbox_loss: 0.1022 - mrcnn_mask_loss: 0.1140 - val_loss: 0.3582 - val_rpn_class_loss: 0.0029 - val_rpn_bbox_loss: 0.1576 - val_mrcnn_class_loss: 0.0124 - val_mrcnn_bbox_loss: 0.0807 - val_mrcnn_mask_loss: 0.1046
Epoch 4/300
100/100 [==============================] - 121s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.3597 - rpn_class_loss: 0.0033 - rpn_bbox_loss: 0.1438 - mrcnn_class_loss: 0.0164 - mrcnn_bbox_loss: 0.0839 - mrcnn_mask_loss: 0.1123 - val_loss: 0.3611 - val_rpn_class_loss: 0.0018 - val_rpn_bbox_loss: 0.1736 - val_mrcnn_class_loss: 0.0076 - val_mrcnn_bbox_loss: 0.0670 - val_mrcnn_mask_loss: 0.1111
Epoch 5/300
100/100 [==============================] - 121s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.3001 - rpn_class_loss: 0.0025 - rpn_bbox_loss: 0.1137 - mrcnn_class_loss: 0.0122 - mrcnn_bbox_loss: 0.0595 - mrcnn_mask_loss: 0.1123 - val_loss: 0.3264 - val_rpn_class_loss: 0.0020 - val_rpn_bbox_loss: 0.1344 - val_mrcnn_class_loss: 0.0089 - val_mrcnn_bbox_loss: 0.0771 - val_mrcnn_mask_loss: 0.1040
Epoch 6/300
100/100 [==============================] - 125s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.2718 - rpn_class_loss: 0.0019 - rpn_bbox_loss: 0.0992 - mrcnn_class_loss: 0.0095 - mrcnn_bbox_loss: 0.0556 - mrcnn_mask_loss: 0.1055 - val_loss: 0.2959 - val_rpn_class_loss: 0.0023 - val_rpn_bbox_loss: 0.1174 - val_mrcnn_class_loss: 0.0098 - val_mrcnn_bbox_loss: 0.0614 - val_mrcnn_mask_loss: 0.1050
Epoch 7/300
100/100 [==============================] - 120s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.2894 - rpn_class_loss: 0.0022 - rpn_bbox_loss: 0.1127 - mrcnn_class_loss: 0.0113 - mrcnn_bbox_loss: 0.0562 - mrcnn_mask_loss: 0.1071 - val_loss: 0.3831 - val_rpn_class_loss: 0.0028 - val_rpn_bbox_loss: 0.1883 - val_mrcnn_class_loss: 0.0095 - val_mrcnn_bbox_loss: 0.0698 - val_mrcnn_mask_loss: 0.1127
Epoch 8/300
Using resnet50 as network backbone For Mask R-CNN model
Applying Default Augmentation on Dataset
Train 150 images
Validate 28 images
Checkpoint Path: /content/mask_rcnn_models
Selecting layers to train
Epoch 1/300
100/100 [==============================] - 192s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.8760 - rpn_class_loss: 0.0113 - rpn_bbox_loss: 0.2987 - mrcnn_class_loss: 0.0464 - mrcnn_bbox_loss: 0.3224 - mrcnn_mask_loss: 0.1972 - val_loss: 0.6289 - val_rpn_class_loss: 0.0057 - val_rpn_bbox_loss: 0.2860 - val_mrcnn_class_loss: 0.0325 - val_mrcnn_bbox_loss: 0.1937 - val_mrcnn_mask_loss: 0.1110
Epoch 2/300
100/100 [==============================] - 121s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.4598 - rpn_class_loss: 0.0034 - rpn_bbox_loss: 0.1973 - mrcnn_class_loss: 0.0159 - mrcnn_bbox_loss: 0.1235 - mrcnn_mask_loss: 0.1197 - val_loss: 0.4465 - val_rpn_class_loss: 0.0044 - val_rpn_bbox_loss: 0.1912 - val_mrcnn_class_loss: 0.0217 - val_mrcnn_bbox_loss: 0.1230 - val_mrcnn_mask_loss: 0.1063
Epoch 3/300
100/100 [==============================] - 120s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.3609 - rpn_class_loss: 0.0029 - rpn_bbox_loss: 0.1573 - mrcnn_class_loss: 0.0120 - mrcnn_bbox_loss: 0.0824 - mrcnn_mask_loss: 0.1064 - val_loss: 0.4007 - val_rpn_class_loss: 0.0034 - val_rpn_bbox_loss: 0.1635 - val_mrcnn_class_loss: 0.0149 - val_mrcnn_bbox_loss: 0.1122 - val_mrcnn_mask_loss: 0.1068
Epoch 4/300
100/100 [==============================] - 121s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.3058 - rpn_class_loss: 0.0023 - rpn_bbox_loss: 0.1208 - mrcnn_class_loss: 0.0082 - mrcnn_bbox_loss: 0.0707 - mrcnn_mask_loss: 0.1039 - val_loss: 0.3454 - val_rpn_class_loss: 0.0027 - val_rpn_bbox_loss: 0.1386 - val_mrcnn_class_loss: 0.0125 - val_mrcnn_bbox_loss: 0.0861 - val_mrcnn_mask_loss: 0.1055
Epoch 5/300
100/100 [==============================] - 121s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.2659 - rpn_class_loss: 0.0020 - rpn_bbox_loss: 0.1040 - mrcnn_class_loss: 0.0076 - mrcnn_bbox_loss: 0.0521 - mrcnn_mask_loss: 0.1001 - val_loss: 0.3603 - val_rpn_class_loss: 0.0025 - val_rpn_bbox_loss: 0.1673 - val_mrcnn_class_loss: 0.0144 - val_mrcnn_bbox_loss: 0.0813 - val_mrcnn_mask_loss: 0.0949
Epoch 6/300
100/100 [==============================] - 120s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.2282 - rpn_class_loss: 0.0016 - rpn_bbox_loss: 0.0842 - mrcnn_class_loss: 0.0054 - mrcnn_bbox_loss: 0.0399 - mrcnn_mask_loss: 0.0971 - val_loss: 0.3388 - val_rpn_class_loss: 0.0021 - val_rpn_bbox_loss: 0.1219 - val_mrcnn_class_loss: 0.0145 - val_mrcnn_bbox_loss: 0.0944 - val_mrcnn_mask_loss: 0.1059
Epoch 7/300
100/100 [==============================] - 120s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.1911 - rpn_class_loss: 0.0014 - rpn_bbox_loss: 0.0589 - mrcnn_class_loss: 0.0052 - mrcnn_bbox_loss: 0.0340 - mrcnn_mask_loss: 0.0915 - val_loss: 0.3143 - val_rpn_class_loss: 0.0018 - val_rpn_bbox_loss: 0.1305 - val_mrcnn_class_loss: 0.0069 - val_mrcnn_bbox_loss: 0.0735 - val_mrcnn_mask_loss: 0.1016
Epoch 8/300
100/100 [==============================] - 122s 1s/step - batch: 49.5000 - size: 4.0000 - loss: 0.1919 - rpn_class_loss: 0.0012 - rpn_bbox_loss: 0.0599 - mrcnn_class_loss: 0.0048 - mrcnn_bbox_loss: 0.0326 - mrcnn_mask_loss: 0.0934 - val_loss: 0.2894 - val_rpn_class_loss: 0.0019 - val_rpn_bbox_loss: 0.1249 - val_mrcnn_class_loss: 0.0065 - val_mrcnn_bbox_loss: 0.0618 - val_mrcnn_mask_loss: 0.0943
This means there is no difference between them, so I am still stuck spending over 7 hours to train 150 images.
Is there any recommendation, or am I missing something? I would appreciate any help.
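The logs above also hint at why splitting the dataset does not change the wall-clock time: each epoch runs a fixed 100 steps with a batch size of 4, so per-epoch time is set by steps_per_epoch and per-step speed, not by how many images are in the dataset. A rough sketch of that arithmetic (the ~1.2 s/step figure is read off the logs; the actual speed will vary with whichever GPU Colab assigns):

```python
# Back-of-the-envelope epoch-time estimate, using numbers read from the logs
# above (illustrative only; real step time depends on the GPU Colab assigns).
steps_per_epoch = 100   # fixed at 100 for both the 300-image and 150-image runs
seconds_per_step = 1.2  # ~120 s per epoch / 100 steps, per the log output
num_epochs = 300

epoch_seconds = steps_per_epoch * seconds_per_step
total_hours = num_epochs * epoch_seconds / 3600
print(f"~{epoch_seconds:.0f} s per epoch, ~{total_hours:.1f} h for {num_epochs} epochs")

# Because steps_per_epoch stays at 100 regardless of dataset size, halving the
# dataset does not halve the training time; it only changes how often each
# image is revisited within an epoch.
```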
Top GitHub Comments
@alic-xc This is a weakness of the Mask R-CNN algorithm: training it consumes a lot of compute. If you want to train faster, you will have to use a GPU with a bigger capacity. The alternative is to train only the heads of the Mask R-CNN network; by default I set it to train all the layers.
In the train_model function, set the parameter layers to "heads".
Note: training only the heads of Mask R-CNN may not reach validation losses as low as training all the layers.
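For concreteness, here is a minimal sketch of what that change could look like. The log output in this issue matches PixelLib's instance_custom_training, but the library is not named explicitly here; the dataset folder, number of classes, and COCO weights file below are placeholders, and the layers parameter name is taken from the comment above and may depend on the PixelLib version:

```python
from pixellib.custom_train import instance_custom_training

# Hypothetical setup mirroring the logged run (resnet50 backbone, batch size 4).
train_maskrcnn = instance_custom_training()
train_maskrcnn.modelConfig(network_backbone="resnet50", num_classes=1, batch_size=4)
train_maskrcnn.load_pretrained_model("mask_rcnn_coco.h5")   # placeholder weights file
train_maskrcnn.load_dataset("your_dataset_folder")          # placeholder dataset path

# Train only the heads instead of all layers, as suggested above.
# ("layers" comes from the maintainer's comment; check your PixelLib version.)
train_maskrcnn.train_model(num_epochs=300,
                           augmentation=True,
                           path_trained_models="mask_rcnn_models",
                           layers="heads")
```

This should shorten each epoch, at the cost of potentially higher validation loss than training all layers, as noted above.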
Okay, I think I understand it better now. Thanks @ayoolaolafenwa @khanfarhan10