size mismatch for classifier.4.weight
  • 10-May-2023
Lightrun Team

size mismatch for classifier.4.weight: copying a param with shape torch.Size([751, 256]) from checkpoint, the shape in current model is torch.Size([1, 256]).


Explanation of the problem

When “train.py” is run with the “--resume” option, the script tries to load a saved model checkpoint from disk, and loading the state_dict for the Net class fails with a size mismatch. Two parameters in the model's “classifier” block are affected: “classifier.4.weight” and “classifier.4.bias”. The checkpoint stores a weight of shape [751, 256] for that layer, which means it was trained with 751 output classes, while the model built by the current run expects [1, 256], which means it was constructed with a single output class.
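To see which parameters disagree before loading, the shapes in the checkpoint can be compared with the shapes in the current model. The sketch below is a minimal illustration and not part of the repository: `find_shape_mismatches` is a hypothetical helper that accepts any mapping whose values expose a `.shape` attribute, and the stand-in objects only mimic the shapes implied by the error message.

```python
from types import SimpleNamespace


def find_shape_mismatches(checkpoint_state, model_state):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ."""
    mismatches = {}
    for name, saved in checkpoint_state.items():
        current = model_state.get(name)
        if current is not None and tuple(saved.shape) != tuple(current.shape):
            mismatches[name] = (tuple(saved.shape), tuple(current.shape))
    return mismatches


# Stand-ins that mimic the shapes from the error message (no real tensors needed).
checkpoint_state = {
    "classifier.4.weight": SimpleNamespace(shape=(751, 256)),
    "classifier.4.bias": SimpleNamespace(shape=(751,)),
}
model_state = {
    "classifier.4.weight": SimpleNamespace(shape=(1, 256)),
    "classifier.4.bias": SimpleNamespace(shape=(1,)),
}

mismatches = find_shape_mismatches(checkpoint_state, model_state)
for name, (saved_shape, current_shape) in mismatches.items():
    print(f"{name}: checkpoint {saved_shape} vs model {current_shape}")
```

With real PyTorch objects, the same function could be given the state dictionary stored in the checkpoint and the result of `model.state_dict()`.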

Problem solution for: size mismatch for classifier.4.weight: copying a param with shape torch.Size([751, 256]) from checkpoint, the shape in current model is torch.Size([1, 256]).

To resolve this issue, one potential solution is to make the Net class match the shapes stored in the checkpoint. The layer at index 4 of the “classifier” block must have 751 output units, which gives a “weight” of shape [751, 256] and a “bias” with 751 terms, rather than the single output unit it currently has. The following code snippet shows one possible definition. Only the layer at index 4 is determined by the error message; the layers at indices 0 to 3 and the input width of 512 are assumptions, so keep whatever your version of the model defines there:

import torch.nn as nn

class Net(nn.Module):
    def __init__(self, num_classes=751):
        super().__init__()
        self.features = nn.Sequential(
            # ... omitted for brevity ...
        )
        self.classifier = nn.Sequential(
            nn.Linear(512, 256),          # index 0 (assumed layout)
            nn.BatchNorm1d(256),          # index 1 (assumed layout)
            nn.ReLU(inplace=True),        # index 2 (assumed layout)
            nn.Dropout(),                 # index 3 (assumed layout)
            nn.Linear(256, num_classes),  # index 4: weight [751, 256], bias [751]
        )

After modifying the Net class as shown above, run the script again with the “--resume” option. The checkpoint should then load without the “size mismatch” error.
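If the goal is instead to fine-tune on a dataset with a different number of classes, a common alternative is to load only the parameters whose shapes match and let the final layer start from fresh weights. The sketch below assumes that workflow. `keep_matching_params` is a hypothetical helper, and the parameter names and shapes are stand-ins chosen for illustration.

```python
from types import SimpleNamespace


def keep_matching_params(checkpoint_state, model_state):
    """Split checkpoint entries into those that fit the model and those that do not."""
    kept, skipped = {}, []
    for name, saved in checkpoint_state.items():
        current = model_state.get(name)
        if current is not None and tuple(saved.shape) == tuple(current.shape):
            kept[name] = saved
        else:
            skipped.append(name)
    return kept, skipped


# Stand-ins: one layer that fits, one classifier layer that does not.
checkpoint_state = {
    "classifier.0.weight": SimpleNamespace(shape=(256, 512)),
    "classifier.4.weight": SimpleNamespace(shape=(751, 256)),
}
model_state = {
    "classifier.0.weight": SimpleNamespace(shape=(256, 512)),
    "classifier.4.weight": SimpleNamespace(shape=(1, 256)),
}

kept, skipped = keep_matching_params(checkpoint_state, model_state)
print("loaded:", sorted(kept))
print("skipped:", skipped)

# With real PyTorch objects, the filtered dictionary would be passed as
# model.load_state_dict(kept, strict=False) so that the skipped keys are tolerated.
```

This trades the learned classifier weights for flexibility, so it only makes sense when the old class labels are no longer needed.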

Other popular problems with deep_sort_pytorch

Problem: When trying to run the deep_sort demo on a webcam, the video feed is displayed but there is no bounding box detection or tracking happening. The terminal displays the following message repeatedly: “Unexpected error in main loop”.

Solution: This error occurs when the program is not able to read the video stream from the webcam or when the webcam is not detected. In such cases, you can try changing the video stream source to a file or another camera device. To do this, modify the code in demo.py to specify the correct input source. For example, to read from a video file, replace this line:

video_capture = cv2.VideoCapture(0)

with:

video_capture = cv2.VideoCapture("path/to/video.mp4")  # placeholder path to your video file

If you want to use a different camera device, replace the argument 0 with the index of the desired device. For example, to use the second camera connected to the system, use:

video_capture = cv2.VideoCapture(1)

Once you have made the necessary changes, run the demo.py script again and check if the problem is resolved.
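Rather than editing the source each time, the input can be taken from a command-line value and converted to the form a capture API expects. The sketch below is an assumption about how one might do that; `parse_video_source` is a hypothetical helper, not a function from demo.py.

```python
def parse_video_source(value):
    """Map a command-line value to something a capture API can open.

    Digit-only strings become integer camera indices; anything else is
    returned unchanged as a file path or stream URL.
    """
    text = str(value).strip()
    return int(text) if text.isdigit() else text


print(parse_video_source("0"))
print(parse_video_source("1"))
print(parse_video_source("videos/demo.mp4"))  # hypothetical path

# With OpenCV installed, the result would be used like this:
#   capture = cv2.VideoCapture(parse_video_source(arg))
#   if not capture.isOpened():
#       raise RuntimeError(f"could not open video source: {arg!r}")
```

Checking `isOpened()` right after creating the capture turns a silent failure in the main loop into an explicit error at startup.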

Problem 2:

Another common issue reported by users of the deep_sort_pytorch package is related to the speed of the object tracking algorithm. Some users have found that the algorithm is slow when processing large amounts of data, which can be problematic in real-world scenarios where tracking must be performed in real-time.

One possible solution is to find out where the time goes before changing anything. The PyTorch profiler can show which operations dominate, so that optimization effort goes to the actual bottleneck. Hardware and input settings also matter: for example, running inference on a GPU instead of a CPU, or lowering the input resolution, usually shortens the time per frame.
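Before reaching for a full profiler, a coarse per-stage timer is often enough to tell whether detection or tracking dominates. The sketch below uses only the standard library; `fake_detect` and `fake_track` are placeholders standing in for the real detector and tracker calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

stage_seconds = defaultdict(float)


@contextmanager
def timed(stage):
    """Accumulate wall-clock time spent inside the with-block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[stage] += time.perf_counter() - start


def fake_detect(frame):
    """Placeholder for the detector call."""
    time.sleep(0.002)
    return [frame]


def fake_track(detections):
    """Placeholder for the tracker update."""
    time.sleep(0.001)
    return detections


for frame in range(5):
    with timed("detect"):
        detections = fake_detect(frame)
    with timed("track"):
        fake_track(detections)

# Report stages from slowest to fastest.
for stage, seconds in sorted(stage_seconds.items(), key=lambda item: -item[1]):
    print(f"{stage}: {seconds * 1000:.1f} ms total")
```

When the work runs on a GPU, wall-clock timings like these are only meaningful if the device is synchronized before each measurement, for example with `torch.cuda.synchronize()`.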

Problem 3:

A third issue that users have encountered when working with the deep_sort_pytorch package is related to the accuracy of the object tracking algorithm. In some cases, the algorithm may fail to correctly identify and track objects, which can result in incorrect or incomplete data.

One potential solution to this problem is to fine-tune the algorithm by training it on a larger and more diverse dataset. Users can also experiment with different hyperparameters and model architectures to improve the accuracy of the algorithm. For example, users can adjust the threshold values used to determine whether a given object is a match, or they can modify the feature extractor to better capture the distinguishing characteristics of the objects being tracked.
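To make the effect of a matching threshold concrete, the sketch below compares appearance features with a cosine distance and accepts a pairing only when the distance is within a limit. It is a toy illustration under stated assumptions: the two-dimensional feature vectors and the threshold values are invented for the example, and `is_match` is a hypothetical helper rather than a function from the package.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def is_match(track_feature, detection_feature, max_distance):
    """Accept the pairing only if the appearance distance is within the threshold."""
    return cosine_distance(track_feature, detection_feature) <= max_distance


track = [1.0, 0.0]
similar = [0.9, 0.3]     # toy feature close to the track
different = [0.0, 1.0]   # toy feature orthogonal to the track

for max_distance in (0.2, 0.01):  # illustrative thresholds
    print(
        max_distance,
        is_match(track, similar, max_distance),
        is_match(track, different, max_distance),
    )
```

A looser threshold keeps tracks alive through appearance changes but risks identity switches, while a stricter one does the opposite, so the value is usually tuned on representative footage.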

A brief introduction to deep_sort_pytorch

Deep SORT is an object tracking algorithm that combines deep appearance features with simple geometric information such as position and size. The reference implementation published by the algorithm's authors is written in Python. The deep_sort_pytorch project is a PyTorch-based implementation of the Deep SORT algorithm for object tracking. The project provides a simple-to-use API for integrating Deep SORT into your own application.

The deep_sort_pytorch project has several advantages over other Deep SORT implementations. The use of PyTorch allows for more efficient training and inference on GPU, and the PyTorch API makes it easier for developers to modify the implementation for their own needs. Additionally, the project is well-documented, making it easier for users to understand and work with the code. Overall, deep_sort_pytorch is a powerful tool for object tracking that can be easily integrated into a wide range of applications.

Most popular use cases for deep_sort_pytorch

  1. Object tracking in videos: deep_sort_pytorch can be used for tracking objects in videos, particularly when multiple objects need to be tracked simultaneously. A detector such as YOLOv3 supplies bounding boxes for each frame, and Deep SORT associates them across frames. The following code block sketches the tracking step. The signatures of build_tracker and update differ between versions of the repository, so check them against your checkout:

from deep_sort import build_tracker
from utils.parser import get_config

# Load the tracker configuration (adjust the path to your checkout)
cfg = get_config()
cfg.merge_from_file("configs/deep_sort.yaml")

# Create a tracker object
tracker = build_tracker(cfg, use_cuda=True)

# For each frame, pass the detector output to the tracker:
#   bbox_xywh   - boxes as (center x, center y, width, height)
#   confidences - detector scores for those boxes
#   frame       - the original image array
outputs = tracker.update(bbox_xywh, confidences, frame)

# Each row of outputs describes one tracked object, including its bounding box and track ID

  2. Person re-identification: deep_sort_pytorch can also be used for person re-identification, which is the task of matching people across different camera views in a multi-camera surveillance system. This relies on deep-learning-based feature extraction followed by feature matching. The following code block sketches the workflow. The module and function names in it are illustrative placeholders rather than confirmed parts of the deep_sort_pytorch API:

# Illustrative interface only: person_reid and its functions are placeholder names
from person_reid import build_person_reid

# Create a PersonReID object
person_reid = build_person_reid(model_type='PersonReID')

# Extract features from the images of two people
person_1_features = person_reid.extract_features(person_1_images)
person_2_features = person_reid.extract_features(person_2_images)

# Compute the similarity score between the two people
similarity_score = person_reid.compute_similarity(person_1_features, person_2_features)

  3. Custom object tracking and re-identification models: deep_sort_pytorch can be used to train and deploy custom object detection, tracking, and re-identification models using PyTorch. This gives researchers and developers a flexible framework for building tracking and re-identification applications tailored to their needs. The following code block sketches what a training workflow could look like. The Trainer class and its methods are illustrative placeholders rather than confirmed parts of the deep_sort_pytorch API:

# Illustrative interface only: Trainer and its methods are placeholder names
from deep_sort.train import Trainer

# Create a trainer object
trainer = Trainer(model_type='YOLOv3', data_path='data/train', model_path='models')

# Train the model
trainer.train(num_epochs=100)

# Evaluate the model on a test set
evaluation_results = trainer.evaluate(data_path='data/test')

# Save the trained model
trainer.save_model(model_path='models/trained_model.pth')