Trying to use metric.compute but get OSError
I want to use metric.compute from load_metric('accuracy') to get training accuracy, but I receive an OSError. What is the mechanism behind the metric calculation, and why would it report an OSError?
195 for epoch in range(num_train_epochs):
196     model.train()
197     for step, batch in enumerate(train_loader):
198         # print(batch['input_ids'].shape)
199         outputs = model(**batch)
200
201         loss = outputs.loss
202         loss /= gradient_accumulation_steps
203         accelerator.backward(loss)
204
205         predictions = outputs.logits.argmax(dim=-1)
206         metric.add_batch(
207             predictions=accelerator.gather(predictions),
208             references=accelerator.gather(batch['labels'])
209         )
210         progress_bar.set_postfix({'loss': loss.item(), 'train batch acc.': train_metrics})
211
212         if (step + 1) % 50 == 0 or step == len(train_loader) - 1:
213             train_metrics = metric.compute()
The error message is as below:
Traceback (most recent call last):
  File "run_multi.py", line 273, in <module>
    main()
  File "/home/yshuang/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/yshuang/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/yshuang/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/yshuang/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "run_multi.py", line 213, in main
    train_metrics = metric.compute()
  File "/home/yshuang/.local/lib/python3.8/site-packages/datasets/metric.py", line 391, in compute
    self._finalize()
  File "/home/yshuang/.local/lib/python3.8/site-packages/datasets/metric.py", line 342, in _finalize
    self.writer.finalize()
  File "/home/yshuang/.local/lib/python3.8/site-packages/datasets/arrow_writer.py", line 370, in finalize
    self.stream.close()
  File "pyarrow/io.pxi", line 132, in pyarrow.lib.NativeFile.close
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
OSError: error closing file
Environment info
- datasets version: 1.6.1
- Platform: Linux NAME="Ubuntu" VERSION="20.04.1 LTS (Focal Fossa)"
- Python version: python3.8.5
- PyArrow version: 4.0.0
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (2 by maintainers)
Top GitHub Comments
Hi! By default it caches the predictions and references used to compute the metric in ~/.cache/huggingface/datasets/metrics (not ~/.datasets/). Let me update the documentation @bhavitvyamalik.

The cache is used to store all the predictions and references passed to add_batch, for example, in order to compute the metric later when compute is called.

I think the issue might come from the cache directory that is used by default. Can you check that you have the right permissions? Otherwise feel free to set cache_dir to another location.

Closing this for now. Will re-open it should the issue still persist.
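
For anyone who lands here with the same error, here is a minimal sketch of the permission check and cache_dir fallback suggested in the comment above. It uses only the standard library. The helper names is_writable_dir and pick_metrics_cache_dir are hypothetical and are not part of datasets; the default path is the one quoted in the maintainer's comment. Note that the check creates the directory if it does not exist yet.

```python
import tempfile
from pathlib import Path

# Default metrics cache location quoted in the maintainer's comment.
DEFAULT_METRICS_CACHE = Path.home() / ".cache" / "huggingface" / "datasets" / "metrics"


def is_writable_dir(path):
    """Return True if a file can be created, written, and closed inside `path`.

    Hypothetical helper. It creates `path` if it is missing.
    """
    path = Path(path)
    try:
        path.mkdir(parents=True, exist_ok=True)
        # Writing a real file is a stronger check than looking at mode bits.
        with tempfile.NamedTemporaryFile(dir=path) as handle:
            handle.write(b"probe")
            handle.flush()
        return True
    except OSError:
        return False


def pick_metrics_cache_dir(preferred=DEFAULT_METRICS_CACHE):
    """Hypothetical helper: return `preferred` if writable, else a new temp directory."""
    if is_writable_dir(preferred):
        return str(preferred)
    return tempfile.mkdtemp(prefix="metrics_cache_")


if __name__ == "__main__":
    cache_dir = pick_metrics_cache_dir()
    print("metrics cache directory:", cache_dir)
    # Assumed usage with datasets 1.x, not executed here:
    # from datasets import load_metric
    # metric = load_metric("accuracy", cache_dir=cache_dir)
```

If the default location turns out not to be writable, passing the returned directory as cache_dir to load_metric should route the cached predictions and references elsewhere. This only covers the permissions explanation given above; it does not rule out other causes of the OSError.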