Question about what is Full Labeled Training and Datasets

The required structure of the images is as follows:

# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#  coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct
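
Before running the preparation script, it may help to confirm that the symlinked `data` directory matches the layout above. The following is only a minimal sketch: `missing_coco_dirs` and `EXPECTED_SUBDIRS` are hypothetical names made up for this check, not part of the repository.

```python
from pathlib import Path

# Subdirectories listed in the layout above.
EXPECTED_SUBDIRS = ("train2017", "val2017", "unlabeled2017", "annotations")


def missing_coco_dirs(data_root):
    """Return the expected subdirectories that are absent under data_root/coco."""
    coco_dir = Path(data_root) / "coco"
    return [name for name in EXPECTED_SUBDIRS if not (coco_dir / name).is_dir()]


if __name__ == "__main__":
    # "data" is the symlink created by `ln -s ${YOUR_DATA} data`.
    missing = missing_coco_dirs("data")
    if missing:
        print("Missing under data/coco:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

An empty list means all four directories were found; it says nothing about whether the images or annotation files inside them are complete.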

My Questions are:

1. If my understanding is correct, unlabeled2017/ contains all the unlabeled images, right?
2. When you say X% labeled data (e.g. 5%, 10%, etc.), does that take X% from the train2017/ training data? What happens to the remaining (100-X)% of the training data? Does it get added to the unlabeled pool for training?
3. When you say full-labeled training, does it mean training on all the data in train2017/ (supervised) and then using the unlabeled2017/ data for the unsupervised part of the semi-supervised learning? Or is it just supervised training on the whole training dataset?
4. When using a custom dataset in COCO format, do I just follow the same instructions, or do I need to change something more?
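
To make the scenario in the second question concrete, here is a minimal sketch of an X% split in which the remaining images join the unlabeled pool. It is an illustration only and is not the logic of `tools/dataset/prepare_coco_data.sh`; the function name, the seed handling, and the integer stand-in IDs are all assumptions.

```python
import random


def split_labeled_unlabeled(image_ids, percent, seed=0):
    """Split image IDs into a labeled subset (percent %) and the remainder."""
    ids = sorted(image_ids)
    # A seeded generator keeps the split reproducible across runs.
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_labeled = round(len(ids) * percent / 100)
    labeled = ids[:n_labeled]
    unlabeled = ids[n_labeled:]  # the remaining (100 - percent)% of the images
    return labeled, unlabeled


train_ids = list(range(1000))  # stand-in for train2017 image IDs
labeled, unlabeled = split_labeled_unlabeled(train_ids, percent=10)
print(len(labeled), len(unlabeled))  # 100 900
```

The two lists are disjoint and together cover every ID, so no training image is dropped: each one is used either with its labels or as an unlabeled image.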

Issue Analytics

Created 2 years ago
Comments: 7

Top GitHub Comments

MendelXu commented, Sep 28, 2021

Q1: Yes.
Q2: Yes, the X% is taken from train2017, and yes, the remaining data is added to the unlabeled pool.
Q3: Yes, the supervised baseline is trained on all labeled data (train2017), and the semi-supervised method is trained on all labeled data (train2017) together with the unlabeled data (unlabeled2017).
Q4: There are two things to check before training: 1) Have you changed the annotation file path and the image file prefix in the config file to match your dataset? 2) Does your dataset share the same categories as COCO? If not, add the following snippet to the config file.

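# Placeholder values: replace with your dataset's category names.
YOUR_CLASS_LIST = ("class_a", "class_b")

data = dict(
    train=dict(
        # ... existing train settings ...
        classes=YOUR_CLASS_LIST,
    ),
    val=dict(
        # ... existing val settings ...
        classes=YOUR_CLASS_LIST,
    ),
    test=dict(
        # ... existing test settings ...
        classes=YOUR_CLASS_LIST,
    ),
)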
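
If the custom dataset is in COCO format, the class list can be read from the annotation file instead of typed by hand, since COCO annotation JSON stores a `categories` array whose entries have `id` and `name` fields. The helper below is a hypothetical sketch, not something the repository provides, and ordering the names by category id is an assumption; check it against the label order your config expects.

```python
import json


def classes_from_coco(ann_path):
    """Read category names from a COCO-format annotation file, ordered by category id."""
    with open(ann_path, "r", encoding="utf-8") as f:
        categories = json.load(f)["categories"]
    return tuple(c["name"] for c in sorted(categories, key=lambda c: c["id"]))


# Example use in a config file ('YOUR_ANN' is a placeholder path):
# YOUR_CLASS_LIST = classes_from_coco('YOUR_ANN')
```

Reading the names from the same file that is used for training avoids a mismatch between the config and the annotations.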

MendelXu commented, Sep 29, 2021

Yes. Just add something like:

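data = dict(
    # ... other settings ...
    val=dict(
        img_prefix='YOUR_PATH',  # directory containing your validation images
        ann_file='YOUR_ANN',  # path to your COCO-format annotation file
    ),
    # ... other settings ...
)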
