How to train Custom Dataset
This guide explains how to train your own custom dataset with fastreid’s data loaders.
Before You Start
Follow Getting Started to set up the environment and install the requirements.txt dependencies.
Train on Custom Dataset
- Register your dataset (i.e., tell fastreid how to obtain your dataset).
To let fastreid know how to obtain a dataset named “MyOwnDataset”, users need to implement a class that inherits fastreid.data.datasets.bases.ImageDataset:

    from fastreid.data.datasets import DATASET_REGISTRY
    from fastreid.data.datasets.bases import ImageDataset


    @DATASET_REGISTRY.register()
    class MyOwnDataset(ImageDataset):
        def __init__(self, root='datasets', **kwargs):
            ...
            super().__init__(train, query, gallery)
Here, the snippet associates a dataset named “MyOwnDataset” with a class that prepares the train set, query set and gallery set and then passes them to the base class. The decorator on the class registers it.
The class can do arbitrary things, but it should generate a train list of type list(str, str, str), a query list of type list(str, int, int), and a gallery list of type list(str, int, int), as below:

    train_list = [
        (train_path1, pid1, camid1),
        (train_path2, pid2, camid2),
        ...
    ]
    query_list = [
        (query_path1, pid1, camid1),
        (query_path2, pid2, camid2),
        ...
    ]
    gallery_list = [
        (gallery_path1, pid1, camid1),
        (gallery_path2, pid2, camid2),
        ...
    ]
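As a hedged illustration of how such lists might be built, the sketch below scans a folder and parses identity and camera IDs from the filenames. The filename pattern `<pid>_c<camid>_<anything>.jpg`, the helper name `build_split`, and the `my_dataset` prefix are assumptions made for this example and are not part of fastreid. The string prefix on train labels only mirrors the (str, str, str) train type stated above.

```python
import os
import re
import tempfile

# Hypothetical filename pattern: "<pid>_c<camid>_<anything>.jpg", e.g. "0002_c1_000451.jpg".
FILENAME_PATTERN = re.compile(r"^(\d+)_c(\d+)_.*\.jpg$")


def build_split(image_dir, is_train, dataset_name="my_dataset"):
    """Return a list of (path, pid, camid) tuples for one split.

    Train labels are emitted as strings and test labels as ints,
    following the element types described in the guide.
    """
    samples = []
    for fname in sorted(os.listdir(image_dir)):
        match = FILENAME_PATTERN.match(fname)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        pid, camid = int(match.group(1)), int(match.group(2))
        path = os.path.join(image_dir, fname)
        if is_train:
            samples.append((path, f"{dataset_name}_{pid}", f"{dataset_name}_{camid}"))
        else:
            samples.append((path, pid, camid))
    return samples


# Small self-check using empty placeholder files (no real images needed).
with tempfile.TemporaryDirectory() as tmp:
    for name in ["0001_c1_a.jpg", "0002_c2_b.jpg", "notes.txt"]:
        open(os.path.join(tmp, name), "w").close()
    train = build_split(tmp, is_train=True)
    query = build_split(tmp, is_train=False)
```

Inside a dataset class, lists built this way would then be handed over with super().__init__(train, query, gallery).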
You can also pass an empty train_list to create a test-only dataset with super().__init__([], query, gallery).

Notice: query and gallery sets may share camera views, but for each query identity, the gallery samples of that identity taken from the same camera are excluded. So if your dataset has no camera annotations, set the camera number of every query sample to 0 and the camera number of every gallery sample to 1; you can then get the testing results.
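The workaround for datasets without camera annotations can be sketched as follows. The helper name `with_constant_camid` and the sample paths are hypothetical. The only point is that every query sample gets camera 0 and every gallery sample gets camera 1, so no gallery sample shares a camera with a query sample.

```python
def with_constant_camid(samples, camid):
    """Attach a fixed camera ID to (path, pid) pairs."""
    return [(path, pid, camid) for path, pid in samples]


# Hypothetical (path, pid) pairs for a dataset without camera labels.
query_pairs = [("query/a.jpg", 1), ("query/b.jpg", 2)]
gallery_pairs = [("gallery/c.jpg", 1), ("gallery/d.jpg", 2), ("gallery/e.jpg", 3)]

query = with_constant_camid(query_pairs, camid=0)      # all query samples: camera 0
gallery = with_constant_camid(gallery_pairs, camid=1)  # all gallery samples: camera 1
```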
- Import your dataset.
After registering your own dataset, you need to import it in train_net.py for the registration to take effect:

    from dataset_file import MyOwnDataset
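The import matters because decorator-based registration happens as a side effect of executing the class definition. The toy registry below is a stand-in written for this illustration, not fastreid's actual DATASET_REGISTRY. It only shows that a decorated class becomes available by name once the code defining it has run.

```python
class Registry:
    """Toy name-to-class registry (illustrative stand-in only)."""

    def __init__(self):
        self._classes = {}

    def register(self):
        def decorator(cls):
            self._classes[cls.__name__] = cls  # runs when the class is defined
            return cls
        return decorator

    def get(self, name):
        if name not in self._classes:
            raise KeyError(f"'{name}' is not registered; was its module imported?")
        return self._classes[name]


TOY_REGISTRY = Registry()

# Before the class definition runs, the name is unknown to the registry.
registered_before = "MyOwnDataset" in TOY_REGISTRY._classes


@TOY_REGISTRY.register()
class MyOwnDataset:
    pass


# After the definition has run (as it does on import), lookup by name works.
registered_after = TOY_REGISTRY.get("MyOwnDataset") is MyOwnDataset
```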
@AnhPC03 Yes, you are right! It doesn’t matter how your data is structured. The key idea is to prepare train, query and gallery as required and then pass them via super().__init__(train, query, gallery).

This issue was closed because it has been inactive for 14 days since being marked as stale.