[RFC] Make the TorchData Library Standalone from PyTorch Core Library
🚀 The feature
Note that this is a request for comment; currently, there is no plan to make TorchData a standalone library. We would like to solicit feedback from the community.
Proposal: Make the TorchData library standalone with little to no dependency on the PyTorch Core library (i.e. torch).
Motivation, pitch
An argument for a standalone library is that it would allow users to use all of the data loading functionality in this library without installing or using PyTorch. Datasets implemented using TorchData may then become usable by other frameworks.
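As a rough illustration of what "usable without PyTorch" could look like, here is a minimal sketch of a data pipeline written against plain Python iterables, with torch treated as an optional dependency. The names MapPipe, batch, and to_framework are hypothetical and are not part of TorchData's API; the sketch only assumes the standard library, plus torch.tensor if torch happens to be installed.

```python
import importlib.util
from typing import Callable, Iterable, Iterator, List


class MapPipe:
    """Hypothetical framework-agnostic pipe: lazily applies fn to each item."""

    def __init__(self, source: Iterable, fn: Callable):
        self.source = source
        self.fn = fn

    def __iter__(self) -> Iterator:
        for item in self.source:
            yield self.fn(item)


def batch(source: Iterable, batch_size: int) -> Iterator[List]:
    """Group items into lists of at most batch_size elements."""
    current: List = []
    for item in source:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:  # emit the final, possibly shorter, batch
        yield current


def to_framework(batch_items: List):
    """Return a torch.Tensor if torch is installed, else the plain list."""
    if importlib.util.find_spec("torch") is not None:
        import torch  # imported lazily so that torch stays optional

        return torch.tensor(batch_items)
    return batch_items


# The pipeline itself never touches torch.
pipe = MapPipe(range(5), lambda x: x * 2)
batches = list(batch(pipe, batch_size=2))
```

Only to_framework touches torch, so a user of another framework could swap in their own conversion step and reuse the rest unchanged.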
An argument against this change is that, in order to keep certain DataLoader functionalities backward compatible with DataPipes, the torch library may need to become dependent on TorchData instead.
The list of arguments here is not comprehensive; feel free to leave a comment about potential use cases and how they would be impacted.
Alternatives
Leave the library as it is, with its dependency on torch.
Additional context
Please feel free to leave any comment/reaction to this proposal whether you are for or against this change. We’d like to hear from you!
Issue Analytics
- State:
- Created 2 years ago
- Reactions: 7
- Comments: 10 (5 by maintainers)
I’m working on a downstream library (Composer) that has the exact same issue – we would love to allow users to build datasets with torchdata without hard requirements on the torch version.

Just for the record, when decoupled, expecttest can be removed from our test dependencies.