Failed Cases When Testing with PyTorch v1.12
Describe the bug
Using the current main branch (without any changes to the code), several test cases fail.
To Reproduce
Steps to reproduce the behavior:
- Clone the project to your local machine and install the required packages (requirements.txt and requirements-dev.txt)
- Make sure the installed PyTorch is version 1.12 (the latest stable version)
- Go to the root folder (torchinfo) of the project
- Run pytest --overwrite (as suggested in README.md)
- Observe the failed cases
Expected behavior
I think it's supposed to produce no failed cases (maybe a few warnings).
Desktop:
- OS: Ubuntu 20.04 LTS
More details
After running pytest, the summary output is the following (detailed output below):
========================================================================================== short test summary info ==========================================================================================
FAILED tests/exceptions_test.py::test_input_size_half_precision - RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: []
FAILED tests/torchinfo_test.py::test_pack_padded - RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: [Embedding: 1]
FAILED tests/torchinfo_test.py::test_namedtuple - RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: []
FAILED tests/torchinfo_xl_test.py::test_eval_order_doesnt_matter - RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN te...
============================================================================ 4 failed, 65 passed, 1 skipped, 3 warnings in 8.47s ============================================================================
============================================================================================ test session starts ============================================================================================
platform linux -- Python 3.9.12, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/mertkurttutan/Desktop/main/software-dev/pytest/torchinfo
plugins: cov-3.0.0
collected 70 items
tests/exceptions_test.py ...F. [ 7%]
tests/gpu_test.py .. [ 10%]
tests/half_precision_test.py ... [ 14%]
tests/torchinfo_test.py ............................F......F.............. [ 85%]
tests/torchinfo_xl_test.py ..F...s... [100%]
================================================================================================= FAILURES ==================================================================================================
______________________________________________________________________________________ test_input_size_half_precision _______________________________________________________________________________________
model = Linear(in_features=2, out_features=5, bias=True)
x = [tensor([[0.6099, 0.2002],
[0.7334, 0.5176],
[0.0652, 0.5923],
[0.8931, 0.7656],
[0.12... [0.9878, 0.7974],
[0.8638, 0.2712],
[0.3899, 0.2676],
[0.9009, 0.7832]], dtype=torch.float16)]
batch_dim = None, cache_forward_pass = False, device = 'cpu', mode = <Mode.EVAL: 'eval'>, kwargs = {}, model_name = 'Linear', summary_list = [Linear: 0], global_layer_info = {140396508354400: Linear: 0}
hooks = {140396508354400: (<torch.utils.hooks.RemovableHandle object at 0x7fb09c021310>, <torch.utils.hooks.RemovableHandle object at 0x7fb09c0331f0>)}, saved_model_mode = True
def forward_pass(
model: nn.Module,
x: CORRECTED_INPUT_DATA_TYPE,
batch_dim: int | None,
cache_forward_pass: bool,
device: torch.device | str,
mode: Mode,
**kwargs: Any,
) -> list[LayerInfo]:
"""Perform a forward pass on the model using forward hooks."""
global _cached_forward_pass # pylint: disable=global-variable-not-assigned
model_name = model.__class__.__name__
if cache_forward_pass and model_name in _cached_forward_pass:
return _cached_forward_pass[model_name]
summary_list, global_layer_info, hooks = apply_hooks(
model_name, model, x, batch_dim
)
if x is None:
set_children_layers(summary_list)
return summary_list
kwargs = set_device(kwargs, device)
saved_model_mode = model.training
try:
if mode == Mode.TRAIN:
model.train()
elif mode == Mode.EVAL:
model.eval()
else:
raise RuntimeError(
f"Specified model mode ({list(Mode)}) not recognized: {mode}"
)
with torch.no_grad(): # type: ignore[no-untyped-call]
if isinstance(x, (list, tuple)):
> _ = model.to(device)(*x, **kwargs)
torchinfo/torchinfo.py:290:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Linear(in_features=2, out_features=5, bias=True)
input = (tensor([[0.6099, 0.2002],
[0.7334, 0.5176],
[0.0652, 0.5923],
[0.8931, 0.7656],
[0.12...[0.9878, 0.7974],
[0.8638, 0.2712],
[0.3899, 0.2676],
[0.9009, 0.7832]], dtype=torch.float16),)
kwargs = {}, forward_call = <bound method Linear.forward of Linear(in_features=2, out_features=5, bias=True)>, full_backward_hooks = [], non_full_backward_hooks = []
hook = <function construct_pre_hook.<locals>.pre_hook at 0x7fb0a424ff70>, result = None, bw_hook = None
def _call_impl(self, *input, **kwargs):
forward_call = (self._slow_forward if torch._C._get_tracing_state() else self.forward)
# If we don't have any hooks, we want to skip the rest of the logic in
# this function, and just call forward.
if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
or _global_forward_hooks or _global_forward_pre_hooks):
return forward_call(*input, **kwargs)
# Do not call functions when jit is used
full_backward_hooks, non_full_backward_hooks = [], []
if self._backward_hooks or _global_backward_hooks:
full_backward_hooks, non_full_backward_hooks = self._get_backward_hooks()
if _global_forward_pre_hooks or self._forward_pre_hooks:
for hook in (*_global_forward_pre_hooks.values(), *self._forward_pre_hooks.values()):
result = hook(self, input)
if result is not None:
if not isinstance(result, tuple):
result = (result,)
input = result
bw_hook = None
if full_backward_hooks:
bw_hook = hooks.BackwardHook(self, full_backward_hooks)
input = bw_hook.setup_input_hook(input)
> result = forward_call(*input, **kwargs)
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/module.py:1148:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Linear(in_features=2, out_features=5, bias=True)
input = tensor([[0.6099, 0.2002],
[0.7334, 0.5176],
[0.0652, 0.5923],
[0.8931, 0.7656],
[0.123... [0.9878, 0.7974],
[0.8638, 0.2712],
[0.3899, 0.2676],
[0.9009, 0.7832]], dtype=torch.float16)
def forward(self, input: Tensor) -> Tensor:
> return F.linear(input, self.weight, self.bias)
E RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/linear.py:114: RuntimeError
The above exception was the direct cause of the following exception:
def test_input_size_half_precision() -> None:
test = torch.nn.Linear(2, 5).half()
with pytest.warns(
UserWarning,
match=(
"Half precision is not supported with input_size parameter, and "
"may output incorrect results. Try passing input_data directly."
),
):
> summary(test, dtypes=[torch.float16], input_size=(10, 2), device="cpu")
tests/exceptions_test.py:59:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
torchinfo/torchinfo.py:218: in summary
summary_list = forward_pass(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
model = Linear(in_features=2, out_features=5, bias=True)
x = [tensor([[0.6099, 0.2002],
[0.7334, 0.5176],
[0.0652, 0.5923],
[0.8931, 0.7656],
[0.12... [0.9878, 0.7974],
[0.8638, 0.2712],
[0.3899, 0.2676],
[0.9009, 0.7832]], dtype=torch.float16)]
batch_dim = None, cache_forward_pass = False, device = 'cpu', mode = <Mode.EVAL: 'eval'>, kwargs = {}, model_name = 'Linear', summary_list = [Linear: 0], global_layer_info = {140396508354400: Linear: 0}
hooks = {140396508354400: (<torch.utils.hooks.RemovableHandle object at 0x7fb09c021310>, <torch.utils.hooks.RemovableHandle object at 0x7fb09c0331f0>)}, saved_model_mode = True
def forward_pass(
model: nn.Module,
x: CORRECTED_INPUT_DATA_TYPE,
batch_dim: int | None,
cache_forward_pass: bool,
device: torch.device | str,
mode: Mode,
**kwargs: Any,
) -> list[LayerInfo]:
"""Perform a forward pass on the model using forward hooks."""
global _cached_forward_pass # pylint: disable=global-variable-not-assigned
model_name = model.__class__.__name__
if cache_forward_pass and model_name in _cached_forward_pass:
return _cached_forward_pass[model_name]
summary_list, global_layer_info, hooks = apply_hooks(
model_name, model, x, batch_dim
)
if x is None:
set_children_layers(summary_list)
return summary_list
kwargs = set_device(kwargs, device)
saved_model_mode = model.training
try:
if mode == Mode.TRAIN:
model.train()
elif mode == Mode.EVAL:
model.eval()
else:
raise RuntimeError(
f"Specified model mode ({list(Mode)}) not recognized: {mode}"
)
with torch.no_grad(): # type: ignore[no-untyped-call]
if isinstance(x, (list, tuple)):
_ = model.to(device)(*x, **kwargs)
elif isinstance(x, dict):
_ = model.to(device)(**x, **kwargs)
else:
# Should not reach this point, since process_input_data ensures
# x is either a list, tuple, or dict
raise ValueError("Unknown input type")
except Exception as e:
executed_layers = [layer for layer in summary_list if layer.executed]
> raise RuntimeError(
"Failed to run torchinfo. See above stack traces for more details. "
f"Executed layers up to: {executed_layers}"
) from e
E RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: []
torchinfo/torchinfo.py:299: RuntimeError
_____________________________________________________________________________________________ test_pack_padded ______________________________________________________________________________________________
model = PackPaddedLSTM(
(embedding): Embedding(60, 128)
(lstm): LSTM(128, 32)
(hidden2out): Linear(in_features=32, out_features=18, bias=True)
(dropout_layer): Dropout(p=0.2, inplace=False)
)
x = [tensor([[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
...,
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1]], device='cuda:0')]
batch_dim = None, cache_forward_pass = False, device = device(type='cuda'), mode = <Mode.EVAL: 'eval'>
kwargs = {'lengths': tensor([13, 12, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10... 5,
5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4], device='cuda:0')}
model_name = 'PackPaddedLSTM', summary_list = [PackPaddedLSTM: 0, Embedding: 1]
global_layer_info = {140396508397776: Linear: 1, 140396508398688: Embedding: 1, 140396508399792: PackPaddedLSTM: 0, 140396508399840: LSTM: 1, ...}
hooks = {140396508397776: (<torch.utils.hooks.RemovableHandle object at 0x7fb09c015c70>, <torch.utils.hooks.RemovableHandle ob...ls.hooks.RemovableHandle object at 0x7fb0969c8250>, <torch.utils.hooks.RemovableHandle object at 0x7fb0969c89a0>), ...}
saved_model_mode = True
def forward_pass(
model: nn.Module,
x: CORRECTED_INPUT_DATA_TYPE,
batch_dim: int | None,
cache_forward_pass: bool,
device: torch.device | str,
mode: Mode,
**kwargs: Any,
) -> list[LayerInfo]:
"""Perform a forward pass on the model using forward hooks."""
global _cached_forward_pass # pylint: disable=global-variable-not-assigned
model_name = model.__class__.__name__
if cache_forward_pass and model_name in _cached_forward_pass:
return _cached_forward_pass[model_name]
summary_list, global_layer_info, hooks = apply_hooks(
model_name, model, x, batch_dim
)
if x is None:
set_children_layers(summary_list)
return summary_list
kwargs = set_device(kwargs, device)
saved_model_mode = model.training
try:
if mode == Mode.TRAIN:
model.train()
elif mode == Mode.EVAL:
model.eval()
else:
raise RuntimeError(
f"Specified model mode ({list(Mode)}) not recognized: {mode}"
)
with torch.no_grad(): # type: ignore[no-untyped-call]
if isinstance(x, (list, tuple)):
> _ = model.to(device)(*x, **kwargs)
torchinfo/torchinfo.py:290:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = PackPaddedLSTM(
(embedding): Embedding(60, 128)
(lstm): LSTM(128, 32)
(hidden2out): Linear(in_features=32, out_features=18, bias=True)
(dropout_layer): Dropout(p=0.2, inplace=False)
)
input = (tensor([[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
...,
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1]], device='cuda:0'),)
kwargs = {'lengths': tensor([13, 12, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10... 5,
5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4], device='cuda:0')}
forward_call = <bound method PackPaddedLSTM.forward of PackPaddedLSTM(
(embedding): Embedding(60, 128)
(lstm): LSTM(128, 32)
(hidden2out): Linear(in_features=32, out_features=18, bias=True)
(dropout_layer): Dropout(p=0.2, inplace=False)
)>
full_backward_hooks = [], non_full_backward_hooks = [], hook = <function construct_pre_hook.<locals>.pre_hook at 0x7fb096abd790>, result = None, bw_hook = None
def _call_impl(self, *input, **kwargs):
forward_call = (self._slow_forward if torch._C._get_tracing_state() else self.forward)
# If we don't have any hooks, we want to skip the rest of the logic in
# this function, and just call forward.
if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
or _global_forward_hooks or _global_forward_pre_hooks):
return forward_call(*input, **kwargs)
# Do not call functions when jit is used
full_backward_hooks, non_full_backward_hooks = [], []
if self._backward_hooks or _global_backward_hooks:
full_backward_hooks, non_full_backward_hooks = self._get_backward_hooks()
if _global_forward_pre_hooks or self._forward_pre_hooks:
for hook in (*_global_forward_pre_hooks.values(), *self._forward_pre_hooks.values()):
result = hook(self, input)
if result is not None:
if not isinstance(result, tuple):
result = (result,)
input = result
bw_hook = None
if full_backward_hooks:
bw_hook = hooks.BackwardHook(self, full_backward_hooks)
input = bw_hook.setup_input_hook(input)
> result = forward_call(*input, **kwargs)
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/module.py:1148:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = PackPaddedLSTM(
(embedding): Embedding(60, 128)
(lstm): LSTM(128, 32)
(hidden2out): Linear(in_features=32, out_features=18, bias=True)
(dropout_layer): Dropout(p=0.2, inplace=False)
)
batch = tensor([[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
...,
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1]], device='cuda:0')
lengths = tensor([13, 12, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 9, 9..., 5,
5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4], device='cuda:0')
def forward(self, batch: torch.Tensor, lengths: torch.Tensor) -> torch.Tensor:
hidden1 = torch.ones(1, batch.size(-1), self.hidden_size, device=batch.device)
hidden2 = torch.ones(1, batch.size(-1), self.hidden_size, device=batch.device)
embeds = self.embedding(batch)
> packed_input = pack_padded_sequence(embeds, lengths)
tests/fixtures/models.py:393:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
input = tensor([[[-1.0181, 0.6983, 1.2245, ..., -0.4112, 0.9594, -0.8608],
[-1.0181, 0.6983, 1.2245, ..., -0.4...12, 0.9594, -0.8608],
[-1.0181, 0.6983, 1.2245, ..., -0.4112, 0.9594, -0.8608]]],
device='cuda:0')
lengths = tensor([13, 12, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 9, 9..., 5,
5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4], device='cuda:0')
batch_first = False, enforce_sorted = True
def pack_padded_sequence(
input: Tensor,
lengths: Tensor,
batch_first: bool = False,
enforce_sorted: bool = True,
) -> PackedSequence:
r"""Packs a Tensor containing padded sequences of variable length.
:attr:`input` can be of size ``T x B x *`` where `T` is the length of the
longest sequence (equal to ``lengths[0]``), ``B`` is the batch size, and
``*`` is any number of dimensions (including 0). If ``batch_first`` is
``True``, ``B x T x *`` :attr:`input` is expected.
For unsorted sequences, use `enforce_sorted = False`. If :attr:`enforce_sorted` is
``True``, the sequences should be sorted by length in a decreasing order, i.e.
``input[:,0]`` should be the longest sequence, and ``input[:,B-1]`` the shortest
one. `enforce_sorted = True` is only necessary for ONNX export.
Note:
This function accepts any input that has at least two dimensions. You
can apply it to pack the labels, and use the output of the RNN with
them to compute the loss directly. A Tensor can be retrieved from
a :class:`PackedSequence` object by accessing its ``.data`` attribute.
Args:
input (Tensor): padded batch of variable length sequences.
lengths (Tensor or list(int)): list of sequence lengths of each batch
element (must be on the CPU if provided as a tensor).
batch_first (bool, optional): if ``True``, the input is expected in ``B x T x *``
format.
enforce_sorted (bool, optional): if ``True``, the input is expected to
contain sequences sorted by length in a decreasing order. If
``False``, the input will get sorted unconditionally. Default: ``True``.
Returns:
a :class:`PackedSequence` object
"""
if torch._C._get_tracing_state() and not isinstance(lengths, torch.Tensor):
warnings.warn('pack_padded_sequence has been called with a Python list of '
'sequence lengths. The tracer cannot track the data flow of Python '
'values, and it will treat them as constants, likely rendering '
'the trace incorrect for any other combination of lengths.',
stacklevel=2)
lengths = torch.as_tensor(lengths, dtype=torch.int64)
if enforce_sorted:
sorted_indices = None
else:
lengths, sorted_indices = torch.sort(lengths, descending=True)
sorted_indices = sorted_indices.to(input.device)
batch_dim = 0 if batch_first else 1
input = input.index_select(batch_dim, sorted_indices)
data, batch_sizes = \
> _VF._pack_padded_sequence(input, lengths, batch_first)
E RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/utils/rnn.py:260: RuntimeError
The above exception was the direct cause of the following exception:
def test_pack_padded() -> None:
x = torch.ones([20, 128]).long()
# fmt: off
y = torch.Tensor([
13, 12, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4
]).long()
# fmt: on
> summary(PackPaddedLSTM(), input_data=x, lengths=y)
tests/torchinfo_test.py:307:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
torchinfo/torchinfo.py:218: in summary
summary_list = forward_pass(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
model = PackPaddedLSTM(
(embedding): Embedding(60, 128)
(lstm): LSTM(128, 32)
(hidden2out): Linear(in_features=32, out_features=18, bias=True)
(dropout_layer): Dropout(p=0.2, inplace=False)
)
x = [tensor([[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
...,
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1],
[1, 1, 1, ..., 1, 1, 1]], device='cuda:0')]
batch_dim = None, cache_forward_pass = False, device = device(type='cuda'), mode = <Mode.EVAL: 'eval'>
kwargs = {'lengths': tensor([13, 12, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10... 5,
5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4], device='cuda:0')}
model_name = 'PackPaddedLSTM', summary_list = [PackPaddedLSTM: 0, Embedding: 1]
global_layer_info = {140396508397776: Linear: 1, 140396508398688: Embedding: 1, 140396508399792: PackPaddedLSTM: 0, 140396508399840: LSTM: 1, ...}
hooks = {140396508397776: (<torch.utils.hooks.RemovableHandle object at 0x7fb09c015c70>, <torch.utils.hooks.RemovableHandle ob...ls.hooks.RemovableHandle object at 0x7fb0969c8250>, <torch.utils.hooks.RemovableHandle object at 0x7fb0969c89a0>), ...}
saved_model_mode = True
def forward_pass(
model: nn.Module,
x: CORRECTED_INPUT_DATA_TYPE,
batch_dim: int | None,
cache_forward_pass: bool,
device: torch.device | str,
mode: Mode,
**kwargs: Any,
) -> list[LayerInfo]:
"""Perform a forward pass on the model using forward hooks."""
global _cached_forward_pass # pylint: disable=global-variable-not-assigned
model_name = model.__class__.__name__
if cache_forward_pass and model_name in _cached_forward_pass:
return _cached_forward_pass[model_name]
summary_list, global_layer_info, hooks = apply_hooks(
model_name, model, x, batch_dim
)
if x is None:
set_children_layers(summary_list)
return summary_list
kwargs = set_device(kwargs, device)
saved_model_mode = model.training
try:
if mode == Mode.TRAIN:
model.train()
elif mode == Mode.EVAL:
model.eval()
else:
raise RuntimeError(
f"Specified model mode ({list(Mode)}) not recognized: {mode}"
)
with torch.no_grad(): # type: ignore[no-untyped-call]
if isinstance(x, (list, tuple)):
_ = model.to(device)(*x, **kwargs)
elif isinstance(x, dict):
_ = model.to(device)(**x, **kwargs)
else:
# Should not reach this point, since process_input_data ensures
# x is either a list, tuple, or dict
raise ValueError("Unknown input type")
except Exception as e:
executed_layers = [layer for layer in summary_list if layer.executed]
> raise RuntimeError(
"Failed to run torchinfo. See above stack traces for more details. "
f"Executed layers up to: {executed_layers}"
) from e
E RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: [Embedding: 1]
torchinfo/torchinfo.py:299: RuntimeError
______________________________________________________________________________________________ test_namedtuple ______________________________________________________________________________________________
model = NamedTuple()
x = [tensor([[[[0.0426, 0.8419, 0.7895, ..., 0.2941, 0.8007, 0.9451],
[0.6184, 0.2044, 0.7810, ..., 0.3096, 0.... [5.4683e-01, 3.8055e-01, 9.6086e-01, ..., 1.3338e-01,
6.0339e-01, 7.6829e-01]]]], device='cuda:0')]
batch_dim = None, cache_forward_pass = False, device = device(type='cuda'), mode = <Mode.EVAL: 'eval'>, kwargs = {'z': Point(x=(2, 1, 28, 28), y=(2, 1, 28, 28))}, model_name = 'NamedTuple'
summary_list = [NamedTuple: 0], global_layer_info = {140396508353824: NamedTuple: 0}
hooks = {140396508353824: (<torch.utils.hooks.RemovableHandle object at 0x7fb09688e970>, <torch.utils.hooks.RemovableHandle object at 0x7fb09688e5e0>)}, saved_model_mode = True
def forward_pass(
model: nn.Module,
x: CORRECTED_INPUT_DATA_TYPE,
batch_dim: int | None,
cache_forward_pass: bool,
device: torch.device | str,
mode: Mode,
**kwargs: Any,
) -> list[LayerInfo]:
"""Perform a forward pass on the model using forward hooks."""
global _cached_forward_pass # pylint: disable=global-variable-not-assigned
model_name = model.__class__.__name__
if cache_forward_pass and model_name in _cached_forward_pass:
return _cached_forward_pass[model_name]
summary_list, global_layer_info, hooks = apply_hooks(
model_name, model, x, batch_dim
)
if x is None:
set_children_layers(summary_list)
return summary_list
kwargs = set_device(kwargs, device)
saved_model_mode = model.training
try:
if mode == Mode.TRAIN:
model.train()
elif mode == Mode.EVAL:
model.eval()
else:
raise RuntimeError(
f"Specified model mode ({list(Mode)}) not recognized: {mode}"
)
with torch.no_grad(): # type: ignore[no-untyped-call]
if isinstance(x, (list, tuple)):
> _ = model.to(device)(*x, **kwargs)
torchinfo/torchinfo.py:290:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = NamedTuple()
input = (tensor([[[[0.0426, 0.8419, 0.7895, ..., 0.2941, 0.8007, 0.9451],
[0.6184, 0.2044, 0.7810, ..., 0.3096, 0.... [5.4683e-01, 3.8055e-01, 9.6086e-01, ..., 1.3338e-01,
6.0339e-01, 7.6829e-01]]]], device='cuda:0'))
kwargs = {'z': Point(x=(2, 1, 28, 28), y=(2, 1, 28, 28))}, forward_call = <bound method NamedTuple.forward of NamedTuple()>, full_backward_hooks = [], non_full_backward_hooks = []
hook = <function construct_pre_hook.<locals>.pre_hook at 0x7fb096abd940>, result = None, bw_hook = None
def _call_impl(self, *input, **kwargs):
forward_call = (self._slow_forward if torch._C._get_tracing_state() else self.forward)
# If we don't have any hooks, we want to skip the rest of the logic in
# this function, and just call forward.
if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
or _global_forward_hooks or _global_forward_pre_hooks):
return forward_call(*input, **kwargs)
# Do not call functions when jit is used
full_backward_hooks, non_full_backward_hooks = [], []
if self._backward_hooks or _global_backward_hooks:
full_backward_hooks, non_full_backward_hooks = self._get_backward_hooks()
if _global_forward_pre_hooks or self._forward_pre_hooks:
for hook in (*_global_forward_pre_hooks.values(), *self._forward_pre_hooks.values()):
result = hook(self, input)
if result is not None:
if not isinstance(result, tuple):
result = (result,)
input = result
bw_hook = None
if full_backward_hooks:
bw_hook = hooks.BackwardHook(self, full_backward_hooks)
input = bw_hook.setup_input_hook(input)
> result = forward_call(*input, **kwargs)
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/module.py:1148:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = NamedTuple()
x = tensor([[[[0.0426, 0.8419, 0.7895, ..., 0.2941, 0.8007, 0.9451],
[0.6184, 0.2044, 0.7810, ..., 0.3096, 0.0..., 0.3716, 0.1156, 0.9332],
[0.6786, 0.6320, 0.4887, ..., 0.8578, 0.7893, 0.7768]]]],
device='cuda:0')
y = tensor([[[[8.8717e-02, 3.5092e-01, 3.9958e-01, ..., 9.2321e-01,
1.7570e-01, 9.7187e-01],
[9.9338... [5.4683e-01, 3.8055e-01, 9.6086e-01, ..., 1.3338e-01,
6.0339e-01, 7.6829e-01]]]], device='cuda:0')
z = Point(x=(2, 1, 28, 28), y=(2, 1, 28, 28))
def forward(self, x: Any, y: Any, z: Any) -> Any:
> return self.Point(x, y).x + torch.ones(z.x)
E RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
tests/fixtures/models.py:332: RuntimeError
The above exception was the direct cause of the following exception:
def test_namedtuple() -> None:
model = NamedTuple()
input_size = [(2, 1, 28, 28), (2, 1, 28, 28)]
named_tuple = model.Point(*input_size)
> summary(model, input_size=input_size, z=named_tuple)
tests/torchinfo_test.py:373:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
torchinfo/torchinfo.py:218: in summary
summary_list = forward_pass(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
model = NamedTuple()
x = [tensor([[[[0.0426, 0.8419, 0.7895, ..., 0.2941, 0.8007, 0.9451],
[0.6184, 0.2044, 0.7810, ..., 0.3096, 0.... [5.4683e-01, 3.8055e-01, 9.6086e-01, ..., 1.3338e-01,
6.0339e-01, 7.6829e-01]]]], device='cuda:0')]
batch_dim = None, cache_forward_pass = False, device = device(type='cuda'), mode = <Mode.EVAL: 'eval'>, kwargs = {'z': Point(x=(2, 1, 28, 28), y=(2, 1, 28, 28))}, model_name = 'NamedTuple'
summary_list = [NamedTuple: 0], global_layer_info = {140396508353824: NamedTuple: 0}
hooks = {140396508353824: (<torch.utils.hooks.RemovableHandle object at 0x7fb09688e970>, <torch.utils.hooks.RemovableHandle object at 0x7fb09688e5e0>)}, saved_model_mode = True
def forward_pass(
model: nn.Module,
x: CORRECTED_INPUT_DATA_TYPE,
batch_dim: int | None,
cache_forward_pass: bool,
device: torch.device | str,
mode: Mode,
**kwargs: Any,
) -> list[LayerInfo]:
"""Perform a forward pass on the model using forward hooks."""
global _cached_forward_pass # pylint: disable=global-variable-not-assigned
model_name = model.__class__.__name__
if cache_forward_pass and model_name in _cached_forward_pass:
return _cached_forward_pass[model_name]
summary_list, global_layer_info, hooks = apply_hooks(
model_name, model, x, batch_dim
)
if x is None:
set_children_layers(summary_list)
return summary_list
kwargs = set_device(kwargs, device)
saved_model_mode = model.training
try:
if mode == Mode.TRAIN:
model.train()
elif mode == Mode.EVAL:
model.eval()
else:
raise RuntimeError(
f"Specified model mode ({list(Mode)}) not recognized: {mode}"
)
with torch.no_grad(): # type: ignore[no-untyped-call]
if isinstance(x, (list, tuple)):
_ = model.to(device)(*x, **kwargs)
elif isinstance(x, dict):
_ = model.to(device)(**x, **kwargs)
else:
# Should not reach this point, since process_input_data ensures
# x is either a list, tuple, or dict
raise ValueError("Unknown input type")
except Exception as e:
executed_layers = [layer for layer in summary_list if layer.executed]
> raise RuntimeError(
"Failed to run torchinfo. See above stack traces for more details. "
f"Executed layers up to: {executed_layers}"
) from e
E RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: []
torchinfo/torchinfo.py:299: RuntimeError
_______________________________________________________________________________________ test_eval_order_doesnt_matter _______________________________________________________________________________________
def test_eval_order_doesnt_matter() -> None:
input_size = (1, 3, 224, 224)
input_tensor = torch.ones(input_size)
model1 = torchvision.models.resnet18(pretrained=True)
model1.eval()
summary(model1, input_size=input_size)
with torch.inference_mode(): # type: ignore[no-untyped-call]
> output1 = model1(input_tensor)
tests/torchinfo_xl_test.py:43:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/module.py:1130: in _call_impl
return forward_call(*input, **kwargs)
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torchvision/models/resnet.py:285: in forward
return self._forward_impl(x)
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torchvision/models/resnet.py:268: in _forward_impl
x = self.conv1(x)
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/module.py:1130: in _call_impl
return forward_call(*input, **kwargs)
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/conv.py:457: in forward
return self._conv_forward(input, self.weight, self.bias)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
input = tensor([[[[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1... [1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.]]]])
weight = Parameter containing:
tensor([[[[-1.0419e-02, -6.1356e-03, -1.8098e-03, ..., 5.6615e-02,
1.7083e-02, -1....0065e-03, 3.6341e-02, ..., -2.4361e-02,
-7.1195e-02, -6.6788e-02]]]], device='cuda:0', requires_grad=True)
bias = None
def _conv_forward(self, input: Tensor, weight: Tensor, bias: Optional[Tensor]):
if self.padding_mode != 'zeros':
return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
weight, bias, self.stride,
_pair(0), self.dilation, self.groups)
> return F.conv2d(input, weight, bias, self.stride,
self.padding, self.dilation, self.groups)
E RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/conv.py:453: RuntimeError
------------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------------
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
ResNet [1, 1000] --
├─Conv2d: 1-1 [1, 64, 112, 112] 9,408
├─BatchNorm2d: 1-2 [1, 64, 112, 112] 128
├─ReLU: 1-3 [1, 64, 112, 112] --
├─MaxPool2d: 1-4 [1, 64, 56, 56] --
├─Sequential: 1-5 [1, 64, 56, 56] --
│ └─BasicBlock: 2-1 [1, 64, 56, 56] --
│ │ └─Conv2d: 3-1 [1, 64, 56, 56] 36,864
│ │ └─BatchNorm2d: 3-2 [1, 64, 56, 56] 128
│ │ └─ReLU: 3-3 [1, 64, 56, 56] --
│ │ └─Conv2d: 3-4 [1, 64, 56, 56] 36,864
│ │ └─BatchNorm2d: 3-5 [1, 64, 56, 56] 128
│ │ └─ReLU: 3-6 [1, 64, 56, 56] --
│ └─BasicBlock: 2-2 [1, 64, 56, 56] --
│ │ └─Conv2d: 3-7 [1, 64, 56, 56] 36,864
│ │ └─BatchNorm2d: 3-8 [1, 64, 56, 56] 128
│ │ └─ReLU: 3-9 [1, 64, 56, 56] --
│ │ └─Conv2d: 3-10 [1, 64, 56, 56] 36,864
│ │ └─BatchNorm2d: 3-11 [1, 64, 56, 56] 128
│ │ └─ReLU: 3-12 [1, 64, 56, 56] --
├─Sequential: 1-6 [1, 128, 28, 28] --
│ └─BasicBlock: 2-3 [1, 128, 28, 28] --
│ │ └─Conv2d: 3-13 [1, 128, 28, 28] 73,728
│ │ └─BatchNorm2d: 3-14 [1, 128, 28, 28] 256
│ │ └─ReLU: 3-15 [1, 128, 28, 28] --
│ │ └─Conv2d: 3-16 [1, 128, 28, 28] 147,456
│ │ └─BatchNorm2d: 3-17 [1, 128, 28, 28] 256
│ │ └─Sequential: 3-18 [1, 128, 28, 28] 8,448
│ │ └─ReLU: 3-19 [1, 128, 28, 28] --
│ └─BasicBlock: 2-4 [1, 128, 28, 28] --
│ │ └─Conv2d: 3-20 [1, 128, 28, 28] 147,456
│ │ └─BatchNorm2d: 3-21 [1, 128, 28, 28] 256
│ │ └─ReLU: 3-22 [1, 128, 28, 28] --
│ │ └─Conv2d: 3-23 [1, 128, 28, 28] 147,456
│ │ └─BatchNorm2d: 3-24 [1, 128, 28, 28] 256
│ │ └─ReLU: 3-25 [1, 128, 28, 28] --
├─Sequential: 1-7 [1, 256, 14, 14] --
│ └─BasicBlock: 2-5 [1, 256, 14, 14] --
│ │ └─Conv2d: 3-26 [1, 256, 14, 14] 294,912
│ │ └─BatchNorm2d: 3-27 [1, 256, 14, 14] 512
│ │ └─ReLU: 3-28 [1, 256, 14, 14] --
│ │ └─Conv2d: 3-29 [1, 256, 14, 14] 589,824
│ │ └─BatchNorm2d: 3-30 [1, 256, 14, 14] 512
│ │ └─Sequential: 3-31 [1, 256, 14, 14] 33,280
│ │ └─ReLU: 3-32 [1, 256, 14, 14] --
│ └─BasicBlock: 2-6 [1, 256, 14, 14] --
│ │ └─Conv2d: 3-33 [1, 256, 14, 14] 589,824
│ │ └─BatchNorm2d: 3-34 [1, 256, 14, 14] 512
│ │ └─ReLU: 3-35 [1, 256, 14, 14] --
│ │ └─Conv2d: 3-36 [1, 256, 14, 14] 589,824
│ │ └─BatchNorm2d: 3-37 [1, 256, 14, 14] 512
│ │ └─ReLU: 3-38 [1, 256, 14, 14] --
├─Sequential: 1-8 [1, 512, 7, 7] --
│ └─BasicBlock: 2-7 [1, 512, 7, 7] --
│ │ └─Conv2d: 3-39 [1, 512, 7, 7] 1,179,648
│ │ └─BatchNorm2d: 3-40 [1, 512, 7, 7] 1,024
│ │ └─ReLU: 3-41 [1, 512, 7, 7] --
│ │ └─Conv2d: 3-42 [1, 512, 7, 7] 2,359,296
│ │ └─BatchNorm2d: 3-43 [1, 512, 7, 7] 1,024
│ │ └─Sequential: 3-44 [1, 512, 7, 7] 132,096
│ │ └─ReLU: 3-45 [1, 512, 7, 7] --
│ └─BasicBlock: 2-8 [1, 512, 7, 7] --
│ │ └─Conv2d: 3-46 [1, 512, 7, 7] 2,359,296
│ │ └─BatchNorm2d: 3-47 [1, 512, 7, 7] 1,024
│ │ └─ReLU: 3-48 [1, 512, 7, 7] --
│ │ └─Conv2d: 3-49 [1, 512, 7, 7] 2,359,296
│ │ └─BatchNorm2d: 3-50 [1, 512, 7, 7] 1,024
│ │ └─ReLU: 3-51 [1, 512, 7, 7] --
├─AdaptiveAvgPool2d: 1-9 [1, 512, 1, 1] --
├─Linear: 1-10 [1, 1000] 513,000
==========================================================================================
Total params: 11,689,512
Trainable params: 11,689,512
Non-trainable params: 0
Total mult-adds (G): 1.81
==========================================================================================
Input size (MB): 0.60
Forward/backward pass size (MB): 39.75
Params size (MB): 46.76
Estimated Total Size (MB): 87.11
==========================================================================================
============================================================================================= warnings summary ==============================================================================================
tests/torchinfo_xl_test.py::test_eval_order_doesnt_matter
/home/mertkurttutan/miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
warnings.warn(
tests/torchinfo_xl_test.py::test_eval_order_doesnt_matter
/home/mertkurttutan/miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
tests/torchinfo_xl_test.py::test_google
/home/mertkurttutan/miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torchvision/models/googlenet.py:47: FutureWarning: The default weight initialization of GoogleNet will be changed in future releases of torchvision. If you wish to keep the old behavior (which leads to long initialization times due to scipy/scipy#11299), please set init_weights=True.
warnings.warn(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================================== short test summary info ==========================================================================================
FAILED tests/exceptions_test.py::test_input_size_half_precision - RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: []
FAILED tests/torchinfo_test.py::test_pack_padded - RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: [Embedding: 1]
FAILED tests/torchinfo_test.py::test_namedtuple - RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: []
FAILED tests/torchinfo_xl_test.py::test_eval_order_doesnt_matter - RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN te...
============================================================================ 4 failed, 65 passed, 1 skipped, 3 warnings in 8.47s ============================================================================
The 3 warnings are not that important, since they are just deprecation warnings from torchvision; they were resolved once I switched to the new argument format suggested in the warnings.
Regarding the failed cases, the first one stems from the following runtime error:
self = Linear(in_features=2, out_features=5, bias=True)
input = tensor([[0.6099, 0.2002],
[0.7334, 0.5176],
[0.0652, 0.5923],
[0.8931, 0.7656],
[0.123... [0.9878, 0.7974],
[0.8638, 0.2712],
[0.3899, 0.2676],
[0.9009, 0.7832]], dtype=torch.float16)
def forward(self, input: Tensor) -> Tensor:
> return F.linear(input, self.weight, self.bias)
E RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
../../../../../miniconda3/envs/cs224n_a3/lib/python3.9/site-packages/torch/nn/modules/linear.py:114: RuntimeError
It seems that in PyTorch v1.12, half precision is not supported on the CPU? (Also see this remark here.)
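For reference, a minimal sketch of the failure (my reconstruction, not code from the test suite; it assumes PyTorch 1.12 on a CPU-only setup):

import torch

model = torch.nn.Linear(2, 5).half()        # fp16 weights and bias
x = torch.rand(10, 2, dtype=torch.float16)  # fp16 input, created on CPU

# On PyTorch 1.12 the fp16 matmul kernel is missing on CPU, so this raises:
#   RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
try:
    model(x)
except RuntimeError as e:
    print(e)

# The same forward pass works on CUDA, where fp16 is supported:
# model.cuda()(x.cuda())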
The other 3 failed cases occur because the input tensors are initialized outside summary with no explicit device, so (in my case) they were created on the CPU. But the model created inside summary is moved to the GPU automatically (unless I make CUDA unavailable) because of the following lines in torchinfo.py (in the summary function):
if device is None:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Indeed, these 3 cases were resolved once I passed device="cpu" when running summary.
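For example, the device mismatch can be reproduced and fixed like this (a sketch with a plain Linear model; summary, input_data, and device are torchinfo's actual parameters, the rest is illustrative):

import torch
from torchinfo import summary

model = torch.nn.Linear(2, 5)
x = torch.rand(10, 2)  # created outside summary, implicitly on the CPU

# Without device=..., summary() moves the model to CUDA when available
# while x stays on the CPU, producing the FloatTensor vs.
# cuda.FloatTensor mismatch seen above. Pinning the device keeps the
# model and input aligned:
summary(model, input_data=x, device="cpu")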
Then, I am moving only the first warning case to gpu_test and deleting the second warning case, since it already raises a runtime error, right? By the second warning case, I mean the following in test_input_size_half_precision():

Yep, we can remove that test case. We can leave the warning in the code, though, since it will warn users on earlier versions of PyTorch that do not raise a runtime error.
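A sketch of how the relocated test might be guarded in gpu_test (the skipif guard and the test name are my assumptions, not necessarily the repo's convention):

import pytest
import torch
from torchinfo import summary

@pytest.mark.skipif(not torch.cuda.is_available(), reason="requires CUDA")
def test_linear_half_precision_gpu() -> None:
    # fp16 forward passes are supported on CUDA, so this should not raise
    model = torch.nn.Linear(2, 5).half()
    summary(model, dtypes=[torch.float16], input_size=(10, 2), device="cuda")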