test: TestMetrics.test_formatted_output may fail on Windows
It runs successfully on AppVeyor, but fails on a Windows 10 free image with:
E AssertionError: assert ‘\tmetrics.csv:\n\t\tvalor_mse desviación_mse data_set \n\t\t0.421601 0.173461 entrenamiento \n\t\t0.67528 0.289545 pruebas \n\t\t0.671502 0.297848 validación’ in ‘unable to read metric in 'metrics.csv' in branch ''\n\tmetrics.tsv:\n\t\tvalue_mse deviation_mse data_set …\t\t “0.671502”\n\t\t ]\n\t\t}\n\tmetrics.txt:\n\t\tROC_AUC: 0.64\n\t\tKS: 78.9999999996\n\t\tF_SCORE: 77’
Note the “unable to read metric in ‘metrics.csv’ in branch …” message. The full error is:
unable to read metric in 'metrics.csv' in branch ''
Traceback (most recent call last):
  File "E:\dvc\repo\metrics\show.py", line 138, in _read_metric
    return _format_output(fd.read().strip(), typ)
  File "c:\users\user\envs\dvc\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf3 in position 18: invalid continuation byte
So it’s something about encodings. Most probably some open() call somewhere doesn’t specify an encoding and falls back to the system default, say cp1252. That works well enough for writing the text in the test, but fails later when the file is read back as UTF-8.
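The suspected mechanism can be reproduced in isolation. The following is a minimal sketch, not the actual DVC test: the comma-separated header line and the roundtrip() helper are assumptions made for illustration, and cp1252 stands in for whatever the system default happens to be. It writes the text with one encoding and reads it back with another.

```python
import os
import tempfile

# Assumed header line, modelled on the column names in the failing assertion.
header = "valor_mse,desviación_mse,data_set\n"


def roundtrip(write_encoding, read_encoding):
    """Write `header` with one encoding and read it back with another."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", encoding=write_encoding) as f:
            f.write(header)
        with open(path, "r", encoding=read_encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# Writing as cp1252 stores "ó" as the single byte 0xf3, which is not
# valid UTF-8 when followed by an ASCII byte, so the read fails.
try:
    roundtrip("cp1252", "utf-8")
    mismatch_error = None
except UnicodeDecodeError as exc:
    mismatch_error = exc

# Passing the same explicit encoding on both sides round-trips cleanly.
matched = roundtrip("utf-8", "utf-8")
```

With this assumed header, the offending byte lands at the same offset as in the traceback above, which is consistent with a write that used cp1252 and a read that used UTF-8.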
NOTE: to prevent future errors of this type, we may call setlocale() globally to use UTF-8.
P.S. We will also need the same wrapper for pathlib.Path.write_text() and pathlib.Path.read_text().
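One possible shape for such wrappers is sketched below. The names write_text_utf8() and read_text_utf8() are hypothetical and do not come from the DVC code base; the only thing they rely on is the `encoding` parameter that pathlib.Path.write_text() and pathlib.Path.read_text() already accept.

```python
import tempfile
from pathlib import Path


def write_text_utf8(path, text):
    """Hypothetical helper: write text as UTF-8, ignoring the locale default."""
    return Path(path).write_text(text, encoding="utf-8")


def read_text_utf8(path):
    """Hypothetical helper: read text as UTF-8, ignoring the locale default."""
    return Path(path).read_text(encoding="utf-8")


# Usage: the non-ASCII text survives the round trip on any platform.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "metrics.txt"
    write_text_utf8(target, "desviación validación")
    raw_bytes = target.read_bytes()
    restored = read_text_utf8(target)
```

Routing all test file I/O through helpers like these keeps the encoding decision in one place, so a missing `encoding=` argument in an individual test cannot reintroduce the platform-dependent behaviour.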