Step-by-step tutorial - How to train your own dataset
🚀 Feature: Training a custom dataset
In order to consolidate and unify all the issues about how to train a model using a custom dataset, here are the basic steps to do it. Since this implementation is a recent release, these steps may change in the future. I'm requesting feedback: these are the steps I followed to train my models, and they worked perfectly.
In my case, the context of my problem is completely different from the COCO dataset, with only 4 classes.
Related issues -> #15, #297, #372, etc.
Steps
1) COCO format
The easiest way to use this model is to label your dataset in the official COCO format, according to the problem you have. In my case it's instance segmentation, so I used the detection format.
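For reference, a minimal COCO-format annotation file has three top-level lists; the file below is a toy example with one image, one annotation, and one category (all names and values are illustrative):

```json
{
  "images": [
    {"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480}
  ],
  "annotations": [
    {
      "id": 1, "image_id": 1, "category_id": 1,
      "bbox": [100, 100, 50, 80],
      "area": 4000, "iscrowd": 0,
      "segmentation": [[100, 100, 150, 100, 150, 180, 100, 180]]
    }
  ],
  "categories": [
    {"id": 1, "name": "class_1", "supercategory": "none"}
  ]
}
```

Note that `bbox` is `[x, y, width, height]` and `segmentation` is a list of polygons, each a flat list of x/y coordinates.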
2) Creating a Dataset class for your data
Follow the example of `coco.py`: create a new class extending `torchvision.datasets.coco.CocoDetection` (you can find other classes in the official docs); this class encapsulates the `pycocotools` methods for managing your COCO dataset.
This class has to be created in the `maskrcnn-benchmark/maskrcnn_benchmark/data/datasets` folder, next to `coco.py`, and included in the `__init__.py`.
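As a rough sketch of this step (file name, class name, and constructor arguments below are illustrative, not from the repo; a plain base class stands in for `CocoDetection` so the sketch runs on its own):

```python
# Hypothetical new file, e.g. maskrcnn_benchmark/data/datasets/my_dataset.py,
# mirroring the structure of coco.py. In the repo the class would extend
# torchvision.datasets.coco.CocoDetection, which wraps pycocotools for you.

class MyDataset:  # in the repo: class MyDataset(CocoDetection)
    def __init__(self, ann_file, root, transforms=None):
        # ann_file: COCO-format JSON with your metadata (step 1)
        # root: folder containing the images
        self.ann_file = ann_file
        self.root = root
        self.transforms = transforms

# Then, in maskrcnn_benchmark/data/datasets/__init__.py, export the class
# alongside COCODataset so the data-loading machinery can build it by name:
#   from .my_dataset import MyDataset
#   __all__.append("MyDataset")
```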
3) Adding dataset paths
This class needs as parameters the path to the `JSON` file that contains the metadata of your dataset in COCO format, and the path to the folder where the images are. The engine automatically searches `paths_catalog.py` for these parameters; the easiest way is to add your paths to the `DATASETS` dict, following the existing format, and then add an `elif` statement to the `get` method.
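The edit to `paths_catalog.py` might look like the following sketch (dataset keys, paths, and the `factory`/`args` shape are written from memory of the repo's conventions, so verify them against your copy of the file):

```python
import os

class DatasetCatalog:
    DATA_DIR = "datasets"
    DATASETS = {
        # ... keep the repo's existing COCO entries, then add yours
        # (keys and paths below are examples):
        "my_dataset_train": {
            "img_dir": "my_dataset/images/train",
            "ann_file": "my_dataset/annotations/train.json",
        },
        "my_dataset_val": {
            "img_dir": "my_dataset/images/val",
            "ann_file": "my_dataset/annotations/val.json",
        },
    }

    @staticmethod
    def get(name):
        # ... the repo's existing "coco" / "voc" branches stay above ...
        if "my_dataset" in name:
            data_dir = DatasetCatalog.DATA_DIR
            attrs = DatasetCatalog.DATASETS[name]
            return dict(
                factory="MyDataset",  # the class you added in step 2
                args=dict(
                    root=os.path.join(data_dir, attrs["img_dir"]),
                    ann_file=os.path.join(data_dir, attrs["ann_file"]),
                ),
            )
        raise RuntimeError("Dataset not available: {}".format(name))
```

The `factory` string must match the class name exported from `data/datasets/__init__.py`.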
4) Evaluation file
This is where using the COCO format pays off: if your dataset has the same structure, you can reuse the evaluation file used for the `COCODataset` class; in the `__init__.py` file just add an `if` statement like the one for `COCODataset`.
This evaluation file follows the COCO evaluation standard, using the `pycocotools` evaluation methods. You can also create your own evaluation file, and you will have to if your dataset has a different structure.
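The resulting dispatch might look like this sketch (stand-in classes and a stub `coco_evaluation` keep it runnable here; in the repo those are imported, and the `evaluate()` signature may differ slightly):

```python
# Sketch of the dispatch in
# maskrcnn_benchmark/data/datasets/evaluation/__init__.py.

class COCODataset:
    pass

class MyDataset:  # the hypothetical custom dataset from step 2
    pass

def coco_evaluation(dataset, predictions, output_folder):
    return "coco-style evaluation"  # placeholder for the real function

def evaluate(dataset, predictions, output_folder):
    if isinstance(dataset, COCODataset):
        return coco_evaluation(dataset, predictions, output_folder)
    elif isinstance(dataset, MyDataset):
        # same COCO-format structure -> reuse the COCO evaluation
        return coco_evaluation(dataset, predictions, output_folder)
    raise NotImplementedError("Unsupported dataset: {}".format(type(dataset)))
```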
5) Training script
The repo provides the standard training and testing scripts for the model; add your own arguments, change the output dir (this one is very important), etc.
6) Changing the hyper-parameters
The engine uses yacs config files, in the repo you can find different ways to change the hyper-parameters.
If you are using a single GPU, look at the README: there is a section for this case. You have to change some hyper-parameters, because the defaults were written for multi-GPU training (8 GPUs). I trained my model on a single GPU with no problem at all; just change `SOLVER.IMS_PER_BATCH` and adjust the other `SOLVER` params.
Update `DATASETS.TRAIN` and `DATASETS.TEST` with the names you used in `paths_catalog.py`. Also consider changing the min/max input size hyper-parameters.
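Put together, the overrides for this step might look like the yacs config fragment below. The dataset names and input sizes are illustrative, and the solver values follow the README's linear-scaling guidance from the 8-GPU defaults; adjust them to your data:

```yaml
# Illustrative single-GPU overrides for a custom dataset
SOLVER:
  IMS_PER_BATCH: 2
  BASE_LR: 0.0025
  MAX_ITER: 720000
  STEPS: (480000, 640000)
DATASETS:
  TRAIN: ("my_dataset_train",)
  TEST: ("my_dataset_val",)
INPUT:
  MIN_SIZE_TRAIN: 600
  MAX_SIZE_TRAIN: 1000
OUTPUT_DIR: "my_output_dir"
```

The same keys can also be overridden on the training script's command line as `KEY VALUE` pairs.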
7) Finetuning the model
Issue #15 has the full explanation.
- Download the official weights for the model that you want to set up.
- In the config file, set `MODEL.ROI_BOX_HEAD.NUM_CLASSES = your_classes + background`.
- Use `trim_detectron_model.py` to remove the layers that are set up for the COCO dataset. If you run training before doing this, there will be trouble with the layers that expect 81 classes (80 COCO classes + background); those are the layers you have to remove.
- This script saves the new weights; point the `MODEL.WEIGHT` hyper-parameter to the new path.
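The core of the trimming step is just pruning the class-specific head weights from the checkpoint. Below is a hedged sketch of that idea: a plain dict stands in for the loaded torch state dict, and the key substrings are typical head names, so verify them against the keys actually present in your checkpoint:

```python
# Sketch of what trim_detectron_model.py does: drop the class-specific
# output layers so the checkpoint stops expecting COCO's 81 outputs.
def trim_model(state_dict,
               forbidden=("cls_score", "bbox_pred", "mask_fcn_logits")):
    """Return a copy of state_dict without the COCO-specific heads."""
    return {key: value for key, value in state_dict.items()
            if not any(name in key for name in forbidden)}
```

After trimming, the script saves the result (e.g. with `torch.save`), and that file is what `MODEL.WEIGHT` should point to.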
Now everything is ready for training!
These are the general modifications to the code for a custom dataset; I made more changes according to my needs.
Visualizing the results
Once the model finishes training and the weights are saved, you can use the `Mask_R-CNN_demo.ipynb` notebook to visualize the results of your model on the test dataset, but you have to change the class names in `predictor.py`: it has the COCO classes by default, so put yours in the same order used for the annotations.
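The edit in `predictor.py` amounts to replacing one list. A 4-class example (class names are placeholders; keep `"__background"` first and the rest in annotation category-id order):

```python
# Replaces the default COCO list held by the demo's predictor class
CATEGORIES = [
    "__background",
    "class_1",  # category_id 1 in your COCO-format JSON
    "class_2",  # category_id 2
    "class_3",  # category_id 3
    "class_4",  # category_id 4
]
```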
Issue Analytics
- Created: 5 years ago
- Reactions: 93
- Comments: 67 (15 by maintainers)
Top GitHub Comments
Isn’t it more straightforward to use a pretrained model from this repo’s Model Zoo instead of taking them from the detectron repo? They also have slightly higher reported accuracies.
The approach for trimming those models is very similar: https://gist.github.com/bernhardschaefer/01905b0fe83615f79e2928a2a10b6f28
Hi @AdanMora,
This is a very helpful, valuable tutorial! Among the many things that need to be modified to train the network on your dataset, I think the most crucial are the data loading and the evaluation. Both are quite strongly hardcoded at the moment to fit the requirements of training and validating on COCO; however, I am now refactoring a few parts to make custom-data training more feasible and convenient. In a week or two I am going to push these as PRs.
Datasets
First of all, here is a Dataset with all the necessary fields and methods implemented and documented. You could also use this dataset for unit tests on your training / validation: https://gist.github.com/botcs/72a221f8a95471155b25a9e655a654e1 Basically, for compatibility with the training script, you just need 4 things to be implemented, of which the first takes most of the effort; the remaining are quite trivial to have:
- `__getitem__`: a function returning the input image and the target `BoxList`, with additional fields like masks and labels
- `__len__`: which returns the number of entries in your dataset
- `classid_to_name`: a mapping between integers and strings
- `get_img_info`: which returns a dict of the input image metadata, with at least the `width` and `height`
This could also help in understanding which methods and fields are essential to the training (since the COCODataset has many convenience functions implemented as well).
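The four requirements above can be sketched as a stdlib-only skeleton. In the repo, `target` would be a maskrcnn_benchmark `BoxList` carrying `"labels"` (and optionally `"masks"`) fields; a plain dict stands in for it here, and all names are illustrative:

```python
class DebugDataset:
    # classid_to_name: a mapping between integers and strings
    CLASSES = {0: "__background", 1: "my_class"}

    def __init__(self, entries):
        # entries: list of dicts with "image", "boxes", "labels",
        # "width", and "height" keys
        self.entries = entries

    def __getitem__(self, index):
        # returns (image, target, index); target stands in for a BoxList
        entry = self.entries[index]
        target = {"boxes": entry["boxes"], "labels": entry["labels"]}
        return entry["image"], target, index

    def __len__(self):
        # number of entries in the dataset
        return len(self.entries)

    def get_img_info(self, index):
        # the engine needs at least the width and height
        entry = self.entries[index]
        return {"width": entry["width"], "height": entry["height"]}
```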
Evaluation
Currently only Pascal-VOC and COCO evaluation are supported, and a `cityscapes -> coco-style` converter script is available. However, making the evaluation work is way trickier than you might think. Both evaluation scripts make tremendous amounts of assumptions, while a simple mAP evaluation could be made with a dataset that only implements the bare minimum provided in `debugdataset.py`.
COCO style
The major issue with the COCO evaluation in the current version is that it requires your dataset to have a field called `coco` that actually does all the dirty work behind the scenes, imported from the official `pycocotools`. The problem with this approach is that the lib assumes your data structure is the same as COCO's, and your class labels will be ignored as well, etc. To avoid this I have implemented a `COCOWrapper` which handles all the requirements of `COCOeval` while working with a generic dataset like `debugdataset`. I have made attempts at validating this approach, which seems to be working fine, but it has a few issues with the unit test. What remains is to find out whether the original evaluation script returns the same answers for this unit test.
CityScapes style
I would like to call attention to the CityScapes instance-level evaluation script, which is quite amazing and way better documented than the COCO script. Similarly, it requires your data to be organized in the same directory structure as the original dataset, and each predicted instance must go into a separate binary mask file, which again causes a lot of headache. However, you can hijack the evaluation script from the point where all the annotations and predictions are loaded. Thankfully, it has passed the perfect-score unit test (visualized in this notebook), in which I feed the annotations as predictions. I am currently validating this approach, checking whether it gives the same / similar score to COCOeval's.
My plan is to submit some pull requests as soon as I am finished cleaning up these abstractions, which would help with the customization of the `maskrcnn_benchmark` lib (which is, AFAIK, the best-performing implementation available). Once it is out there, would you like to help make a step-by-step tutorial just like the Matterport Mask-RCNN implementation has, Splash of Colors?