
fine-tune RuntimeError: expected dtype Float but got dtype Long

See original GitHub issue

When fine-tuning on my own data, the following error occurs when running model.fit:

Traceback (most recent call last):
  File "fine_tune_sbert.py", line 109, in <module>
    output_path=model_save_path)
  File "/opt/conda/lib/python3.7/site-packages/sentence_transformers/SentenceTransformer.py", line 402, in fit
    loss_value.backward()
  File "/opt/conda/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: expected dtype Float but got dtype Long (validate_dtype at /pytorch/aten/src/ATen/native/TensorIterator.cpp:143)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7f4a9deda536 in /opt/conda/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: at::TensorIterator::compute_types() + 0xce3 (0x7f4adb981a23 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #2: at::TensorIterator::build() + 0x44 (0x7f4adb984404 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #3: at::native::mse_loss_backward_out(at::Tensor&, at::Tensor const&, at::Tensor const&, at::Tensor const&, long) + 0x193 (0x7f4adb7d1953 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #4: <unknown function> + 0xf903d7 (0x7f4a9f2c63d7 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: at::native::mse_loss_backward(at::Tensor const&, at::Tensor const&, at::Tensor const&, long) + 0x172 (0x7f4adb7da092 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0xf9068f (0x7f4a9f2c668f in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #7: <unknown function> + 0x10c2536 (0x7f4adbc0a536 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x2a9ecdb (0x7f4add5e6cdb in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #9: <unknown function> + 0x10c2536 (0x7f4adbc0a536 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #10: torch::autograd::generated::MseLossBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x1f7 (0x7f4add3ee777 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #11: <unknown function> + 0x2d89705 (0x7f4add8d1705 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #12: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&) + 0x16f3 (0x7f4add8cea03 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #13: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&, bool) + 0x3d2 (0x7f4add8cf7e2 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #14: torch::autograd::Engine::thread_init(int) + 0x39 (0x7f4add8c7e59 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #15: torch::autograd::python::PythonEngine::thread_init(int) + 0x38 (0x7f4aea20fac8 in /opt/conda/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #16: <unknown function> + 0xbd66f (0x7f4aeb0bf66f in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #17: <unknown function> + 0x76db (0x7f4aece476db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #18: clone + 0x3f (0x7f4aecb7088f in /lib/x86_64-linux-gnu/libc.so.6)

My torch version is 1.5.
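
For context, the failure is reproducible outside sentence-transformers. In this PyTorch version, MSE loss accepts a Long target in the forward pass, but its backward pass requires Float; a minimal sketch (not code from the issue):

import torch
import torch.nn.functional as F

pred = torch.randn(4, requires_grad=True)  # model output: dtype Float
target = torch.tensor([1, 0, 1, 0])        # integer labels: dtype Long

# loss = F.mse_loss(pred, target)          # forward runs, but loss.backward()
#                                          # raises the dtype error above

loss = F.mse_loss(pred, target.float())    # casting the target fixes it
loss.backward()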

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

1 reaction
nreimers commented, Apr 19, 2021

@naty88 It appears your labels are ints, but you need to pass floats.
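
In sentence-transformers terms, that means casting each label to float when building the training examples. A minimal sketch, where the sentence pairs and scores are hypothetical placeholders for your own data:

from sentence_transformers import InputExample

# Hypothetical training data: sentence pairs with integer similarity scores
train_pairs = [("A man is eating food.", "A man is eating a meal.")]
train_scores = [1]  # ints like these become dtype Long labels downstream

train_examples = [
    InputExample(texts=[s1, s2], label=float(score))  # cast each label to float
    for (s1, s2), score in zip(train_pairs, train_scores)
]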

0 reactions
chintanckg commented, Oct 21, 2021

You are doing supervised training without labels; this will not work. Please try unsupervised training (well documented on the sentence-transformers site).

The issue can be marked closed.
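
For reference, one common unsupervised recipe with this library is SimCSE-style training: pair each sentence with itself and let dropout provide the contrast between the two encodings. A minimal sketch using the v2-era fit API; the base model name and corpus below are placeholders:

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

sentences = ["A man is eating food.", "A plane is taking off."]  # unlabeled corpus

model = SentenceTransformer("distilbert-base-uncased")

# Pair each sentence with itself; dropout makes the two encodings differ
train_examples = [InputExample(texts=[s, s]) for s in sentences]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)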

Read more comments on GitHub >

Top Results From Across the Web

Pytorch: RuntimeError: expected dtype Float but got dtype Long
So I changed its type to float (as the expected dtype is Float) while passing it to the criterion. You can check...
Read more >
expected dtype Float but got dtype Long for my loss function ...
RuntimeError: expected dtype Float but got dtype Long for my loss function despite converting all tensors to float.
Read more >
return f.linear(input, self.weight, self.bias) runtimeerror - You.com
This means when you create a tensor, its default dtype is torch.float32. Try: ... PyTorch error: input is expected to be scalar type...
Read more >
Problem when using Autograd with nn.Embedding in Pytorch
Indices are required to be long; embeddings are float. And you don't need gradients for the indices because you use them only to...
Read more >
PyTorch Basic Tutorial - Harshit Kumar
However, you won't be able to use a GPU, and will have to write the ... uninitialized tensor: print(torch.empty(2, 2, dtype=torch.bool)) ...
Read more >
