[testing] when to @slow and when not to? (huge models download)
Looking at the CI logs, we do have huge models being downloaded in tests that are not marked @slow:
Downloading: 100% 1.16G/1.16G [00:52<00:00, 22.3MB/s]
Downloading: 100% 433M/433M [00:08<00:00, 48.4MB/s]
Downloading: 43% 369M/863M [00:08<00:10, 45.4MB/s]
so it’s very inconsistent. If we are downloading huge files anyway, why not un-mark a whole bunch more tests as @slow? A lot of those tests are very fast apart from the download overhead. Or, conversely, should the tests currently doing huge downloads have been @slow in the first place?
I’m asking because I was told not to run any fsmt tests with the full model (size ~1.1GB) unless they are @slow. So it’s unclear when it’s OK to include huge models in the non-slow test suite and when it isn’t.
Also, here is an alternative approach to think about: why not download the large weights while other tests that don’t need them are running? That is, fork a process early on in CI, right after the pip installs are done, and let it cache the models, so they are ready by the time the tests that need them get to run. This is an unpolished idea, since one would need to figure out how to re-order the tests so that the large-model tests aren’t run first.
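The prefetch idea could be sketched roughly like this. The model list and the use of `huggingface_hub.snapshot_download` are assumptions for illustration; the only real requirement is that the background process populates the same cache directory the tests read from:

```python
import multiprocessing

def prefetch_models(model_ids, download=None):
    """Warm the local model cache so later tests find the weights ready.

    `download` is injectable for testing; by default it would be something
    like huggingface_hub.snapshot_download (an assumption, not the issue's
    actual proposal).
    """
    if download is None:
        from huggingface_hub import snapshot_download
        download = snapshot_download
    for model_id in model_ids:
        download(model_id)

if __name__ == "__main__":
    # Kick off caching in the background right after pip installs on CI.
    proc = multiprocessing.Process(
        target=prefetch_models,
        args=(["facebook/wmt19-en-de"],),  # hypothetical model list
    )
    proc.start()
    # ... run the fast tests here while the download proceeds ...
    proc.join()  # block before the large-model tests, so the cache is warm
```

The re-ordering caveat from above still applies: this only helps if the test runner schedules the large-model tests after the fast ones, otherwise they would block on the download anyway.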
Issue Analytics
- Created: 3 years ago
- Comments: 9 (8 by maintainers)
Hi, sorry for getting back to you so late. I believe this was due to the pipeline tests, but that should not be the case anymore since the refactor of the pipeline tests by Thom.
If some tests still download large files, then that’s an error which we should resolve.
Thank you for reading my ideas and following up, @LysandreJik.
I made a tentative 50MB suggestion in https://github.com/huggingface/transformers/pull/8824
We can tweak it down the road if it’s not right.