UnicodeEncodeError: 'charmap' codec can't encode character '\ufeff' in position 3: character maps to <undefined>

Hi all 👋

I’m getting this error while running the script and updating the sources. I’m on Windows 10 with Python 3.8.3.

Traceback (most recent call last):
  File "updateHostsFile.py", line 1750, in <module>
    main()
  File "updateHostsFile.py", line 282, in main
    final_file = remove_dups_and_excl(merge_file, exclusion_regexes)
  File "updateHostsFile.py", line 937, in remove_dups_and_excl
    hostname, normalized_rule = normalize_rule(
  File "updateHostsFile.py", line 1025, in normalize_rule
    print("==>%s<==" % rule)
  File "C:\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufeff' in position 3: character maps to <undefined>
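
One possible stopgap on a console that uses cp1252, sketched here as an assumption rather than as the project's actual fix, is to escape any character the console cannot encode before printing it. `safe_print` is a hypothetical helper, not a function from updateHostsFile.py, and the rule text is made up for illustration.

```python
import sys


def safe_print(text, stream=None):
    """Print text, escaping characters the stream's encoding cannot represent."""
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Round-trip through the target encoding: unencodable characters become
    # backslash escapes (for example \ufeff) instead of raising UnicodeEncodeError.
    printable = text.encode(encoding, errors="backslashreplace").decode(encoding)
    print(printable, file=stream)


# Hypothetical rule that starts with U+FEFF, similar to the failing print() call.
safe_print("==>%s<==" % "\ufeff0.0.0.0 example.com")
```

On Python 3.7 and later, calling `sys.stdout.reconfigure(errors="backslashreplace")` once at startup should have a similar effect without a wrapper function.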

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

2 reactions
funilrys commented, Jul 6, 2020

Assuming that we are talking about the cp1252 encoding as mentioned in:

File "C:\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]

I can’t (literally) reproduce.

$ # Change the Python encoding to CP1252 through the `PYTHONIOENCODING` environment variable.
$ export PYTHONIOENCODING="cp1252"
$ # Start the generation.
$ python updateHostsFile.py -a
[truncated]
==>fe00::0 ip6-localnet<==
==>ff00::0 ip6-mcastprefix<==
==>ff02::2 ip6-allrouters<==
==>ff02::3 ip6-allhosts<==
Success! The hosts file has been saved in folder 
It contains 57,286 unique entries.

Therefore, I don’t know where the problem is here. Unless OP can give us more information, I’m not going to look for a problem which may not exist.


Other info

Python version

$ python -VV
Python 3.8.3 (default, May 17 2020, 18:15:42) 
[GCC 10.1.0]

Why use the PYTHONIOENCODING environment variable?

Since the error is raised by print(), I should be able to reproduce it by changing the default stdout encoding.

File "updateHostsFile.py", line 1025, in normalize_rule
    print("==>%s<==" % rule)

Here is an example showing that PYTHONIOENCODING really does change the stdout encoding.

$ export PYTHONIOENCODING="utf-8"
$ python
Python 3.8.3 (default, May 17 2020, 18:15:42) 
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'utf-8'
>>> print(u'\xe9')
é
$ export PYTHONIOENCODING="cp1252"
$ python
Python 3.8.3 (default, May 17 2020, 18:15:42) 
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'cp1252'
>>> print('\xe9')
�
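
The same comparison can be sketched without touching the environment, by wrapping an in-memory buffer in a text stream with a chosen encoding. `print_with_encoding` is a hypothetical helper for illustration only. It also suggests why the cp1252 session shows � instead of é: under cp1252, é is the single byte 0xE9, which is not valid UTF-8 on its own, so a UTF-8 terminal presumably renders it as a replacement character.

```python
import io


def print_with_encoding(text, encoding):
    """print() text to an in-memory stream and return the bytes that were written."""
    buffer = io.BytesIO()
    # newline="\n" disables newline translation so the result is the same on every OS.
    stream = io.TextIOWrapper(buffer, encoding=encoding, newline="\n")
    print(text, file=stream)
    stream.flush()
    return buffer.getvalue()


print(print_with_encoding("\xe9", "utf-8"))   # b'\xc3\xa9\n'
print(print_with_encoding("\xe9", "cp1252"))  # b'\xe9\n'
```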

Now what about \ufeff?

I have never worked with it, but it is well explained here.

So I tried it, again with PYTHONIOENCODING.

With CP1252

>>> print('\ufeff')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/encodings/cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufeff' in position 0: character maps to <undefined>

With UTF-8

>>> print('\ufeff')

>>> 
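
The two results can also be compared by encoding the string directly, which takes the terminal out of the picture. This is a minimal sketch: the rule text is hypothetical, since the offending rule is not shown in the issue. It does hint at one thing, though. The "position 3" in the original traceback is consistent with U+FEFF being the first character of `rule`, because `==>` occupies positions 0 to 2.

```python
# Hypothetical rule with a leading U+FEFF; the real offending rule is unknown.
rule = "\ufeff0.0.0.0 example.com"
line = "==>%s<==" % rule


def try_encode(text, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


utf8_result = try_encode(line, "utf-8")
cp1252_result = try_encode(line, "cp1252")

print(utf8_result[:6])      # b'==>\xef\xbb\xbf' (U+FEFF is three bytes in UTF-8)
print(cp1252_result.start)  # 3 (index of the character cp1252 cannot encode)
```

UTF-8 can encode U+FEFF, and the character has zero width, which would explain the seemingly empty line printed in the UTF-8 session.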

Now, as for this project itself, I really don’t know where \ufeff comes from, as the line:

    print("==>%s<==" % rule)

is generated at the end… And I really can’t find anything about this.
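
One plausible origin, which is an assumption and not something confirmed in this thread, is a source file saved as UTF-8 with a byte order mark and then decoded as plain UTF-8. In that case the BOM survives as a leading U+FEFF in the first rule. The sketch below uses made-up data, and `strip_bom` is a hypothetical helper.

```python
import codecs

# Hypothetical source data: a UTF-8 payload that starts with a byte order mark.
raw = codecs.BOM_UTF8 + b"0.0.0.0 example.com\n"

with_bom = raw.decode("utf-8")         # the BOM survives as a leading '\ufeff'
without_bom = raw.decode("utf-8-sig")  # the utf-8-sig codec consumes the BOM


def strip_bom(text):
    """Drop U+FEFF characters from the start of an already-decoded string."""
    return text.lstrip("\ufeff")


print(repr(with_bom))             # '\ufeff0.0.0.0 example.com\n'
print(repr(without_bom))          # '0.0.0.0 example.com\n'
print(repr(strip_bom(with_bom)))  # '0.0.0.0 example.com\n'
```

If this is what happens, decoding the downloaded sources with `utf-8-sig`, or stripping the character before the rule is normalized, would keep it away from print() entirely.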


@StevenBlack @XhmikosR I’ll leave the rest to you!

1 reaction
XhmikosR commented, Jun 13, 2020

What’s your system config and the exact command you are using to run the script? Also, I assume you are on the latest master?

C:\Users\xmr\Desktop>@systeminfo | @findstr /B /C:"OS Name" /B /C:"OS Version" /B /C:"System Locale" /B /C:"Input Locale"
OS Name:                   Microsoft Windows 10 Pro
OS Version:                10.0.19041 N/A Build 19041
System Locale:             en-us;English (United States)
Input Locale:              en-us;English (United States)