Support for unicode utf-8 charmaps

Using mailmerge to send mass emails is better than using any Gmail extension, since I can use advanced Markdown and HTML. However, while working with a large (5k+ rows) CSV file of job applicants, I ran into an issue: some people had entered their names in their local language (e.g. Mandarin), and those names contain characters that fall outside the ASCII charmap.

It would be useful to open the CSV file with encoding=utf8 and errors=ignore, rather than just returning true if strings are ASCII, to avoid errors like this: 'charmap' codec can't decode byte 0x9d in position 1710: character maps to <undefined>
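As a rough illustration of the request above, the sketch below reads a CSV file with an explicit encoding instead of relying on the platform default. The helper name read_rows and the sample file are hypothetical and are not part of mailmerge; only the standard-library open() and csv.DictReader calls are real. Note that errors="ignore" silently drops undecodable bytes, so this sketch defaults to errors="strict" and leaves the choice to the caller.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8", errors="strict"):
    """Return the CSV rows as a list of dicts, decoding with an explicit encoding.

    Hypothetical helper for illustration; not mailmerge's own function.
    """
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding=encoding, errors=errors) as handle:
        return list(csv.DictReader(handle))


# Write a small UTF-8 sample file (made-up data), then read it back.
sample = 'name,email\r\n"大松 李",person@example.com\r\n"Paveł",pavel@example.com\r\n'
fd, sample_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as handle:
    handle.write(sample.encode("utf-8"))

rows = read_rows(sample_path)
print([row["email"] for row in rows])
os.remove(sample_path)
```

Passing errors="ignore" or errors="replace" to the same helper would trade a hard failure for lost or substituted characters, which may or may not be acceptable for names in an email greeting.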

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 16 (16 by maintainers)

Top GitHub Comments

2 reactions
thecodelearner commented, Jul 14, 2020

@seshrs awesome! I will try this.

1 reaction
thecodelearner commented, Jul 14, 2020

GitHub doesn’t support CSV uploads, so here’s a txt file of the CSV: mailmerge_database_txt.txt

Edit: I also uploaded mailmerge_database.csv here, in case the txt copy of the CSV causes any issues: https://github.com/thecodelearner/tmp-files/blob/master/mailmerge_database.csv

Also, I think that my Python installation on Windows is broken, since the same file runs perfectly on Linux but gives the UnicodeDecodeError on Windows.

--> Running mailmerge on Windows:

PS7 mailmerge-client> mailmerge
Traceback (most recent call last):
  File "c:\users\pc\appdata\local\programs\python\python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\pc\appdata\local\programs\python\python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\PC\AppData\Local\Programs\Python\Python38\Scripts\mailmerge.exe\__main__.py", line 7, in <module>
  File "c:\users\pc\appdata\local\programs\python\python38\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\pc\appdata\local\programs\python\python38\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "c:\users\pc\appdata\local\programs\python\python38\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\pc\appdata\local\programs\python\python38\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "c:\users\pc\appdata\local\programs\python\python38\lib\site-packages\mailmerge\__main__.py", line 114, in main
    for _, row in enumerate_range(csv_database, start, stop):
  File "c:\users\pc\appdata\local\programs\python\python38\lib\site-packages\mailmerge\__main__.py", line 288, in enumerate_range
    for i, value in enumerate(iterable):
  File "c:\users\pc\appdata\local\programs\python\python38\lib\site-packages\mailmerge\__main__.py", line 273, in read_csv_database
    for row in reader:
  File "c:\users\pc\appdata\local\programs\python\python38\lib\csv.py", line 110, in __next__
    self.fieldnames
  File "c:\users\pc\appdata\local\programs\python\python38\lib\csv.py", line 97, in fieldnames
    self._fieldnames = next(self.reader)
  File "c:\users\pc\appdata\local\programs\python\python38\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 17: character maps to <undefined>
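The traceback above ends in encodings\cp1252.py, which suggests the file was decoded as cp1252 rather than UTF-8. The sketch below tries to reproduce that failure in isolation. It assumes the CSV is UTF-8 with Windows (CRLF) line endings, which is a guess and not something the thread confirms; the byte string is rebuilt from the first data row shown in this thread.

```python
import locale

# Assumption: UTF-8 bytes with CRLF line endings, rebuilt from the sample row.
data = 'name,email\r\n"大松 李",person@example.com\r\n'.encode("utf-8")

# The encoding open() falls back to when none is passed; it is often
# cp1252 on Western-locale Windows and UTF-8 on most Linux systems.
print(locale.getpreferredencoding(False))

failure = None
try:
    data.decode("cp1252")
except UnicodeDecodeError as err:
    # Keep a reference, since "err" is cleared when the except block ends.
    failure = err
    print(err.start, hex(data[err.start]))

# Decoding the same bytes as UTF-8 succeeds.
decoded = data.decode("utf-8")
```

Under these assumptions the failing byte is the middle byte of the UTF-8 sequence for 松, which cp1252 leaves undefined. That would point to the platform's default encoding, rather than a broken Python installation, as the difference between the Windows and Linux runs.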

--> Running mailmerge on Linux with the exact same database.csv:

root@Proton:/mnt/c/Users/PC/merge-test# cat mailmerge_database.csv
name,email
"大松 李",person@example.com
"Juán",juan@example.com
"Jöse Felix",jisefelix@example.com
"Paveł",pavel@example.com
root@Proton:/mnt/c/Users/PC/merge-test# file mailmerge_database.csv
mailmerge_database.csv: CSV text
root@Proton:/mnt/c/Users/PC/merge-test# mailmerge --version
mailmerge, version 2.1.0
root@Proton:/mnt/c/Users/PC/merge-test# mailmerge
>>> message 1
TO: person@example.com
SUBJECT: Testing mailmerge
FROM: My Self <myself@mydomain.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Date: Tue, 14 Jul 2020 14:13:53 -0000

Hi, 大松 李

>>> message 1 sent
>>> Limit was 1 message.  To remove the limit, use the --no-limit option.
>>> This was a dry run.  To send messages, use the --no-dry-run option.