[RFC] Make the TorchData Library Standalone from PyTorch Core Library

🚀 The feature

Note that this is a request for comment; currently, there is no plan to make TorchData a standalone library. We would like to solicit feedback from the community.

Proposal: Make the TorchData library standalone with little to no dependency on the PyTorch Core library (i.e. torch).

Motivation, pitch

An argument for a standalone library is that it would let users access all of the data loading functionality in this library without installing or using PyTorch. Datasets implemented with TorchData could then become usable by other frameworks.
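To make the "without installing PyTorch" idea concrete, here is a minimal sketch of an optional-dependency pattern, not TorchData's actual implementation. `MiniPipe`, `collate`, and `HAS_TORCH` are hypothetical names invented for this illustration; the only assumption about the real ecosystem is that `torch`, when installed, provides `torch.tensor`.

```python
import importlib.util
from typing import Any, Callable, Iterable, Iterator, List

# Check for torch without importing it, so this module loads even when torch is absent.
HAS_TORCH = importlib.util.find_spec("torch") is not None


class MiniPipe:
    """Hypothetical stand-in for an iterable DataPipe (pure Python, no torch)."""

    def __init__(self, make_iter: Callable[[], Iterator]):
        # Zero-argument callable that returns a fresh iterator on every pass.
        self._make_iter = make_iter

    @classmethod
    def from_iterable(cls, items: Iterable) -> "MiniPipe":
        return cls(lambda: iter(items))

    def __iter__(self) -> Iterator:
        return self._make_iter()

    def map(self, fn: Callable[[Any], Any]) -> "MiniPipe":
        return MiniPipe(lambda: (fn(x) for x in self))

    def batch(self, size: int) -> "MiniPipe":
        def gen() -> Iterator[List[Any]]:
            buf: List[Any] = []
            for x in self:
                buf.append(x)
                if len(buf) == size:
                    yield buf
                    buf = []
            if buf:  # emit the final partial batch
                yield buf

        return MiniPipe(gen)


def collate(batch: List[Any]) -> Any:
    """Return a torch.Tensor if torch is installed, otherwise a plain list."""
    if HAS_TORCH:
        import torch  # deferred import: only reached when torch is present

        return torch.tensor(batch)
    return list(batch)


pipe = MiniPipe.from_iterable(range(5)).map(lambda x: x * 2).batch(2)
batches = [collate(b) for b in pipe]
```

The pipeline itself never touches `torch`; only the final `collate` step does, and only when the package is available. A standalone library could follow a similar shape, keeping framework-specific conversion at the edges.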

An argument against this change is that, in order to keep certain DataLoader functionality backward compatible with DataPipes, the torch library may need to become dependent on TorchData instead.

The list of arguments here is not comprehensive. Feel free to leave a comment about potential use cases and how they would be impacted.

Alternatives

Leave the library as it is with dependency on torch.

Additional context

Please feel free to leave any comment/reaction to this proposal whether you are for or against this change. We’d like to hear from you!

cc: @VitalyFedyunin @ejguan @NivekT

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 7
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

7 reactions
abhi-mosaic commented, Mar 16, 2022

I’m working on a downstream library (Composer) that has the exact same issue – we would love to allow users to build datasets with torchdata without hard requirements on the torch version.

3 reactions
ejguan commented, May 25, 2022

Just for the record: once decoupled, expecttest can be removed from our test dependencies.
