
Errors and losses


In log.txt, I see several errors and losses. I would like to know a bit more about some of them.

I would expect the log file to contain:

  • a total loss (train_loss),
  • the three components of the loss (train_loss_ce, train_loss_bbox, train_loss_giou), as detailed in section 3.1 of the paper (the Hungarian loss), in Table 4, and on page 12, where it is mentioned that:

There are three components to the loss: classification loss, L1 bounding box distance loss, and GIoU loss.

  1. First, there are these entries:
    "train_class_error": 0.801971435546875,
    "train_loss": 2.8446195403734844,
    "train_loss_ce": 0.02718701681296807,
    "train_loss_bbox": 0.14008397981524467,
    "train_loss_giou": 0.2731118065615495,

First, this is mostly fine, except that I am not sure what ce stands for. I assume it is the loss associated with the classification error. Is that correct?

Second, I would assume that “class error” is the “weighted fraction of misclassified observations”. However, I have seen cases where it was much higher than 1. Isn’t it supposed to be between 0 and 1?

  2. Then, with the suffix going from 0 to 4:
    "train_loss_ce_3": 0.026680172781925648,
    "train_loss_bbox_3": 0.138540934274594,
    "train_loss_giou_3": 0.26943153887987137,

This looks OK.

I assume these are layer-specific losses, which allow computing the Hungarian loss as mentioned in the “Auxiliary decoding losses” section of the paper:

We add prediction feed-forward networks (FFNs) and Hungarian loss after each decoder layer.

This is consistent with the fact that there are 6 decoding layers by default (see the sketch after this list for how the per-layer suffixes would line up):

    # * Transformer
    parser.add_argument('--dec_layers', default=6, type=int,
                        help="Number of decoding layers in the transformer")

    # Loss
    parser.add_argument('--no_aux_loss', dest='aux_loss', action='store_false',
                        help="Disables auxiliary decoding losses (loss at each layer)")
  3. Finally:
    "train_class_error_unscaled": 0.801971435546875,
    "train_loss_ce_unscaled": 0.02718701681296807,
    "train_loss_bbox_unscaled": 0.02801679583887259,
    "train_loss_giou_unscaled": 0.13655590328077474,
    "train_cardinality_error_unscaled": 0.85,

I don’t know what “cardinality error” is.

I also don’t know why train_class_error_unscaled appears, since the classification error is not normalized the way the losses are.

Apart from that, it looks OK, as I assume the suffix _unscaled means before the normalization mentioned in appendix A.2:

All losses are normalized by the number of objects inside the batch.
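
Coming back to point 2 above: a toy sketch of how those per-layer suffixes would be produced when auxiliary decoding losses are enabled. This is illustrative Python, not the actual DETR code; per_layer_loss_dict is a made-up helper name.

    import torch

    def per_layer_loss_dict(final_losses, aux_losses_per_layer):
        """Merge the final-decoder-layer losses with the auxiliary per-layer ones,
        adding the _0 ... _4 suffixes that show up in log.txt."""
        losses = dict(final_losses)  # e.g. {'loss_ce': ..., 'loss_bbox': ..., 'loss_giou': ...}
        for i, layer_losses in enumerate(aux_losses_per_layer):  # one dict per intermediate layer
            losses.update({f'{k}_{i}': v for k, v in layer_losses.items()})
        return losses

    # 6 decoder layers = 1 final set of losses + 5 auxiliary sets (suffixes _0 to _4)
    final = {'loss_ce': torch.tensor(0.0272)}
    aux = [{'loss_ce': torch.tensor(0.0267)} for _ in range(5)]
    print(sorted(per_layer_loss_dict(final, aux)))
    # ['loss_ce', 'loss_ce_0', 'loss_ce_1', 'loss_ce_2', 'loss_ce_3', 'loss_ce_4']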

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

3 reactions
woctezuma commented, Aug 26, 2020

Info is all in: https://github.com/facebookresearch/detr/blob/master/models/detr.py

    def loss_boxes(self, outputs, targets, indices, num_boxes):
        """Compute the losses related to the bounding boxes, the L1 regression loss and the GIoU loss
           targets dicts must contain the key "boxes" containing a tensor of dim [nb_target_boxes, 4]
           The target boxes are expected in format (center_x, center_y, w, h), normalized by the image size.
        """

Answer about the _ce suffix: it stands for cross-entropy, i.e. the classification loss.

    def loss_labels(self, outputs, targets, indices, num_boxes, log=True):
        """Classification loss (NLL)
        targets dicts must contain the key "labels" containing a tensor of dim [nb_target_boxes]
        """
        [...]
        loss_ce = F.cross_entropy(src_logits.transpose(1, 2), target_classes, self.empty_weight)
        losses = {'loss_ce': loss_ce}
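
For context, the self.empty_weight seen above is just a per-class weight vector for the cross-entropy, with a reduced weight on the extra “no object” class. A rough, self-contained sketch (the class count and the 0.1 value are illustrative; the latter corresponds to the --eos_coef argument, if I remember the defaults correctly):

    import torch
    import torch.nn.functional as F

    num_classes = 91                 # illustrative; DETR adds one extra "no object" class
    eos_coef = 0.1                   # relative weight of the "no object" class

    empty_weight = torch.ones(num_classes + 1)
    empty_weight[-1] = eos_coef      # down-weight the "no object" class

    # Toy usage: logits of shape [batch, num_classes + 1, num_queries]
    logits = torch.randn(2, num_classes + 1, 100)
    target_classes = torch.full((2, 100), num_classes, dtype=torch.long)  # all "no object"
    loss_ce = F.cross_entropy(logits, target_classes, empty_weight)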

Answer about class error:

        if log:
            # TODO this should probably be a separate loss, not hacked in this one here
            losses['class_error'] = 100 - accuracy(src_logits[idx], target_classes_o)[0]
        return losses

where:

def accuracy(output, target, topk=(1,)):
    """Computes the precision@k for the specified values of k"""
    if target.numel() == 0:
        return [torch.zeros([], device=output.device)]
    maxk = max(topk)
    batch_size = target.size(0)

    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))

    res = []
    for k in topk:
        correct_k = correct[:k].view(-1).float().sum(0)
        res.append(correct_k.mul_(100.0 / batch_size))
    return res
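
So class_error is 100 minus the top-1 accuracy expressed as a percentage, i.e. a number between 0 and 100 rather than between 0 and 1, which is why values above 1 show up in the logs. A tiny check reusing the accuracy helper above (toy logits for 4 predictions over 2 classes):

    import torch

    output = torch.tensor([[5.0, 0.0],   # predicts class 0 (correct)
                           [5.0, 0.0],   # predicts class 0 (correct)
                           [0.0, 5.0],   # predicts class 1 (correct)
                           [5.0, 0.0]])  # predicts class 0 (wrong, target is 1)
    target = torch.tensor([0, 0, 1, 1])

    class_error = 100 - accuracy(output, target)[0]  # accuracy() as defined above
    print(class_error)  # tensor(25.) -> 25% of the predictions are wrong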

Answer about cardinality error:

    def loss_cardinality(self, outputs, targets, indices, num_boxes):
        """ Compute the cardinality error, ie the absolute error in the number of predicted non-empty boxes
        This is not really a loss, it is intended for logging purposes only. It doesn't propagate gradients
        """

Answer about the “mask” (focal) loss and the “dice” loss:

    def loss_masks(self, outputs, targets, indices, num_boxes):
        """Compute the losses related to the masks: the focal loss and the dice loss.
           targets dicts must contain the key "masks" containing a tensor of dim [nb_target_boxes, h, w]
        """
        [...]
        losses = {
            "loss_mask": sigmoid_focal_loss(src_masks, target_masks, num_boxes),
            "loss_dice": dice_loss(src_masks, target_masks, num_boxes),
        }
2 reactions
fmassa commented, Aug 26, 2020

Hey, sorry for not getting back to you before.

Your findings are correct. One last point about the _unscaled losses: we have scaling coefficients for each loss, which you can see in https://github.com/facebookresearch/detr/blob/5e66b4cd15b2b182da347103dd16578d28b49d69/main.py#L74-L77. Those scaling coefficients are used to balance the contribution of each loss to the total loss. The _unscaled values you see in the logs are the original values of the losses, before being scaled by those coefficients, as you can see in https://github.com/facebookresearch/detr/blob/5e66b4cd15b2b182da347103dd16578d28b49d69/engine.py#L39-L42
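
To make that concrete, here is a small sketch of the scaled vs. _unscaled bookkeeping, using the values from the log excerpt above. The coefficients below are the ones that excerpt implies (0.0280 × 5 ≈ 0.1401 for the bbox loss and 0.1366 × 2 ≈ 0.2731 for the GIoU loss); in the real code they come from the weight_dict built in main.py.

    # Illustrative values; in DETR the coefficients live in a weight_dict built in main.py.
    weight_dict = {'loss_ce': 1, 'loss_bbox': 5, 'loss_giou': 2}

    loss_dict = {  # raw loss values, as returned by the criterion
        'loss_ce': 0.02718701681296807,
        'loss_bbox': 0.02801679583887259,
        'loss_giou': 0.13655590328077474,
    }

    # What gets summed into the total loss (the real sum also includes the auxiliary
    # per-layer losses, which is why train_loss is larger than these three terms).
    total = sum(loss_dict[k] * weight_dict[k] for k in loss_dict if k in weight_dict)

    # What gets logged: the scaled value under the plain name,
    # and the raw value under the *_unscaled name.
    logged = {k: v * weight_dict[k] for k, v in loss_dict.items()}
    logged.update({f'{k}_unscaled': v for k, v in loss_dict.items()})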
