Transferring Boxes to torch device changes their dtype to float32.
I am trying to support mixed-precision training in Detectron2 for my use case with NVIDIA Apex, and I am unable to use it as a drop-in replacement, as the library suggests, because of some internal wiring. I think this should be fixed regardless, because the current behavior is non-intuitive.
Instructions To Reproduce the Issue:
>>> import torch
>>> from detectron2.structures import Boxes
>>> f16_boxes = Boxes(torch.tensor([[1.0, 2.0, 3.0, 4.0]]).to(torch.float16))
>>> f16_boxes = f16_boxes.to(torch.device("cuda:0"))
>>> f16_boxes.tensor.dtype  # Expected: torch.float16
torch.float32
Location of Issue:
The problem lies in this method: https://github.com/facebookresearch/detectron2/blob/master/detectron2/structures/boxes.py#L154
This definition calls the constructor:
def to(self, device: str) -> "Boxes":
    return Boxes(self.tensor.to(device))
and the constructor forcibly casts everything to torch.float32:
def __init__(self, tensor: torch.Tensor):
    device = tensor.device if isinstance(tensor, torch.Tensor) else torch.device("cpu")
    tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)  # This line!
    ...
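A minimal standalone check of this behavior (plain torch.as_tensor with an explicit dtype, not Detectron2 code):

import torch

t = torch.tensor([[1.0, 2.0, 3.0, 4.0]], dtype=torch.float16)
# torch.as_tensor casts to the requested dtype, so an fp16 input
# silently becomes fp32 inside Boxes.__init__.
print(torch.as_tensor(t, dtype=torch.float32).dtype)  # torch.float32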
Proposed Solution:
The method body of to should look like:
def to(self, device: str) -> "Boxes":
    boxes_new = self.clone()
    boxes_new.tensor = self.tensor.to(device)
    return boxes_new
This method body keeps the current behavior in pure FP32 settings, makes mixed-precision training easier, and reads more intuitively.
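A minimal sketch of the expected behavior, using a stripped-down stand-in class (hypothetical, not the real detectron2 Boxes) whose constructor does not force float32 and whose to() uses the proposed body:

import torch

class _BoxesSketch:
    # Hypothetical stand-in for detectron2.structures.Boxes, without the forced fp32 cast.
    def __init__(self, tensor: torch.Tensor):
        self.tensor = tensor

    def clone(self) -> "_BoxesSketch":
        return _BoxesSketch(self.tensor.clone())

    def to(self, device: str) -> "_BoxesSketch":
        # Proposed method body: move the tensor, keep its dtype.
        boxes_new = self.clone()
        boxes_new.tensor = self.tensor.to(device)
        return boxes_new

f16 = _BoxesSketch(torch.tensor([[1.0, 2.0, 3.0, 4.0]], dtype=torch.float16))
print(f16.to("cpu").tensor.dtype)  # torch.float16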
[I trimmed down other parts from issue template because they aren’t needed here.]
Top GitHub Comments
It seems the original issue is no longer valid? Feel free to reopen if that’s not the case.
I don’t think it’s necessary to use float16 for boxes: I can’t think of any place where it would improve speed, and it would cause issues for box-related operations that might not support fp16.
It might be better to just keep using float32 and apply appropriate casting when needed (e.g. from predicted fp16 deltas to fp32 deltas).
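A sketch of that casting pattern (illustrative names, not Detectron2’s actual code): keep the box coordinates in float32 and upcast fp16 network outputs before doing any box arithmetic.

import torch

# Hypothetical fp16 deltas produced by a model head under mixed precision.
pred_deltas_fp16 = torch.randn(8, 4, dtype=torch.float16)
boxes_fp32 = torch.rand(8, 4)  # box coordinates stay in float32

# Cast the deltas up before combining them with fp32 boxes,
# so box-related ops never have to support fp16.
refined = boxes_fp32 + pred_deltas_fp16.float()
print(refined.dtype)  # torch.float32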
(Unrelated to your issue, but Apex’s claim of being a “drop-in replacement” is just not true for any sufficiently complicated model. Hacks might be required to make it work.)