Converting notebook to HTML throws encoding error on windows

====================================================== DAG build failed ======================================================
----- NotebookRunner: fit -> MetaProduct({'model': File('products\\model.pickle'), 'nb': File('products\\report.html')}) -----
----------------------------------------- C:\Users\edubl\Desktop\proj\scripts\fit.py -----------------------------------------
Traceback (most recent call last):
  File "c:\users\edubl\desktop\proj\venv-proj\lib\site-packages\ploomber\tasks\abc.py", line 562, in _build
    res = self._run()
  File "c:\users\edubl\desktop\proj\venv-proj\lib\site-packages\ploomber\tasks\abc.py", line 669, in _run
    self.run()
  File "c:\users\edubl\desktop\proj\venv-proj\lib\site-packages\ploomber\tasks\notebook.py", line 525, in run
    self._converter.convert()
  File "c:\users\edubl\desktop\proj\venv-proj\lib\site-packages\ploomber\tasks\notebook.py", line 94, in convert
    self._from_ipynb(self.path_to_output, self.exporter,
  File "c:\users\edubl\desktop\proj\venv-proj\lib\site-packages\ploomber\tasks\notebook.py", line 160, in _from_ipynb
    path.write_text(content)
  File "C:\Users\edubl\miniconda3\envs\scaffold\lib\pathlib.py", line 1256, in write_text
    return f.write(data)
  File "C:\Users\edubl\miniconda3\envs\scaffold\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ue6c6' in position 233857: character maps to <undefined>
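
The failure above can be reproduced without Ploomber. The sketch below is only an illustration, not code from the project: it encodes a string containing the same character from the traceback ('\ue6c6') with cp1252 and with UTF-8. The helper name try_encode is hypothetical.

```python
# Minimal reproduction of the encoding failure, independent of Ploomber.
# try_encode is a hypothetical helper introduced for illustration only.

text = "report \ue6c6"  # contains the character reported in the traceback


def try_encode(value, encoding):
    """Return 'ok' if value can be encoded, else a short error summary."""
    try:
        value.encode(encoding)
        return "ok"
    except UnicodeEncodeError as exc:
        return f"{type(exc).__name__}: {exc.reason}"


cp1252_result = try_encode(text, "cp1252")  # what Windows picked by default here
utf8_result = try_encode(text, "utf-8")

print(cp1252_result)
print(utf8_result)
```

cp1252 has no mapping for this character, while UTF-8 can encode any Unicode code point, which is why the write only fails when the platform default is used.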

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 12 (7 by maintainers)

Top GitHub Comments

1 reaction
kubow commented, Mar 8, 2022

I managed to reproduce this error on two independent Windows machines (Windows 10 and Windows 11).

  1. Create the example ml-basic folder: ploomber examples -n templates/ml-basic -o ml-basic
  2. Change the pipeline.yaml file to convert only one file:
     tasks:
       - source: Readme.ipynb
         product: Readme.md
  3. Call ploomber build. It outputs a very similar message (the file Readme.ipynb is UTF-8 encoded, but it looks like cp1252 is attempted).

ploomber build

Loading pipeline…
Traceback (most recent call last):
  File "C:_Run\App\python\lib\site-packages\ploomber\cli\io.py", line 37, in wrapper
    fn(**kwargs)
  File "C:_Run\App\python\lib\site-packages\ploomber\telemetry\telemetry.py", line 551, in wrapper
    raise e
  File "C:_Run\App\python\lib\site-packages\ploomber\telemetry\telemetry.py", line 537, in wrapper
    result = func(_payload, *args, **kwargs)
  File "C:_Run\App\python\lib\site-packages\ploomber\cli\build.py", line 55, in main
    dag, args = parser.load_from_entry_point_arg()
  File "C:_Run\App\python\lib\site-packages\ploomber\cli\parsers.py", line 215, in load_from_entry_point_arg
    dag, args = load_dag_from_entry_point_and_parser(
  File "C:_Run\App\python\lib\site-packages\ploomber\cli\parsers.py", line 488, in load_dag_from_entry_point_and_parser
    dag, args = _process_file_dir_or_glob(parser)
  File "C:_Run\App\python\lib\site-packages\ploomber\cli\parsers.py", line 442, in _process_file_dir_or_glob
    dag = DAGSpec(dagspec_arg).to_dag()
  File "C:_Run\App\python\lib\site-packages\ploomber\spec\dagspec.py", line 444, in to_dag
    dag = self._to_dag()
  File "C:_Run\App\python\lib\site-packages\ploomber\spec\dagspec.py", line 498, in _to_dag
    process_tasks(dag, self, root_path=self._parent_path)
  File "C:_Run\App\python\lib\site-packages\ploomber\spec\dagspec.py", line 766, in process_tasks
    source = call_with_dictionary(fn, kwargs=kwargs)
  File "C:_Run\App\python\lib\site-packages\ploomber\util\util.py", line 260, in call_with_dictionary
    return fn(**sub_kwargs)
  File "C:_Run\App\python\lib\site-packages\ploomber\tasks\notebook.py", line 369, in _init_source
    return NotebookSource(
  File "C:_Run\App\python\lib\site-packages\ploomber\util\util.py", line 62, in wrapper
    return f(*args, **kwargs)
  File "C:_Run\App\python\lib\site-packages\ploomber\sources\notebooksource.py", line 188, in __init__
    self._primitive = primitive.read_text()
  File "C:_Run\App\python\lib\pathlib.py", line 1256, in read_text
    return f.read()
  File "C:_Run\App\python\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 64681: character maps to <undefined>
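
The decode side of the problem can be shown in isolation as well. The sketch below is an illustration under assumptions, not the notebook's actual content: it uses the character U+00CF, whose UTF-8 encoding happens to contain the byte 0x8f, and tries to decode those bytes as cp1252, where 0x8f has no mapping in Python's codec.

```python
# UTF-8 bytes that contain 0x8f, the byte reported in the traceback.
# The choice of U+00CF is an assumption made for illustration only.
data = "\u00cf".encode("utf-8")  # two bytes: 0xc3 0x8f

try:
    data.decode("cp1252")  # what read_text() does when cp1252 is the default
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Decoding with the encoding the bytes were written in works as expected.
roundtrip = data.decode("utf-8")

print(repr(decode_error))
print(roundtrip == "\u00cf")
```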

Maybe the output of ploomber nb -i also helps:

Loading pipeline…
Error: Could not run nb command: the DAG failed to load 'charmap' codec can't decode byte 0x8f in position 64681: character maps to <undefined>

Since I am using Python 3, I strongly believe this is caused by relying on the system default encoding (cp1252 on these machines) rather than UTF-8, as described here. It can be easily solved by passing encoding='utf-8' to the call that opens the file. As I do not know the tool's structure, I am a bit lost on where the actual read happens.
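
A minimal sketch of that suggestion, assuming the files are read and written through pathlib as the tracebacks show. This is not Ploomber's actual fix; the file name and content are made up for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical report content containing the character from the first traceback.
content = "<html>\ue6c6</html>"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "report.html"

    # Passing encoding explicitly avoids the platform default (cp1252 here).
    path.write_text(content, encoding="utf-8")
    restored = path.read_text(encoding="utf-8")

print(restored == content)
```

The same encoding="utf-8" argument is accepted by the built-in open(). Enabling Python's UTF-8 mode (for example by setting the PYTHONUTF8=1 environment variable on Python 3.7 or later) is another possible workaround, since it changes the default without touching the code.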

0 reactions
edublancas commented, Mar 11, 2022

Hey @kubow, I’m working on a final solution to the encoding problem. Which Windows and Python version are you running? I’d like to test the new implementation.

Also, I found out that the Graphviz team is simplifying installation on Windows. You may want to check it out.
