UTF Characters in Movie title.

$ python2.7 autorippr.py --all --debug
2016-10-26 21:20:09 - Rip - DEBUG - Ripping initialised
2016-10-26 21:20:09 - Rip - DEBUG - Checking for DVDs
2016-10-26 21:20:16 - Rip - DEBUG - 1 DVD(s) found
2016-10-26 21:20:16 - Makemkv - DEBUG - Detected movie Les Miserables Dom
2016-10-26 21:20:41 - Makemkv - DEBUG - MakeMKV found 1 titles
2016-10-26 21:20:41 - Makemkv - DEBUG - MakeMKV title info: Disc Title: ['Les Mis\xc3\xa9rables'], Title No.: 0, Title: ['Les_Mis\xc3\xa9rables_t00.mkv'],
2016-10-26 21:20:41 - Rip - DEBUG - Attempting to rip Les_Misérables_t00.mkv from Les Miserables Dom
2016-10-26 21:50:48 - Rip - INFO - It took 30 minute(s) to complete the ripping of Les_Misérables_t00.mkv from Les Miserables Dom
2016-10-26 21:50:48 - Eject - DEBUG - Ejecting drive: "/dev/sr0"
2016-10-26 21:50:48 - Eject - DEBUG - Attempting OS detection
2016-10-26 21:50:48 - Eject - DEBUG - OS detected as Unix
2016-10-26 21:50:52 - Eject - DEBUG - eject: device name is `/dev/sr0'
2016-10-26 21:50:52 - Eject - DEBUG - eject: /dev/sr0: not mounted
2016-10-26 21:50:52 - Eject - DEBUG - eject: /dev/sr0: is whole-disk device
2016-10-26 21:50:52 - Eject - DEBUG - eject: /dev/sr0: trying to eject using CD-ROM eject command
2016-10-26 21:50:52 - Eject - DEBUG - eject: CD-ROM eject command succeeded
2016-10-26 21:50:52 - Compress - DEBUG - Compressing initialised
2016-10-26 21:50:52 - Compress - DEBUG - Looking for videos to compress
Traceback (most recent call last):
  File "autorippr.py", line 419, in <module>
    compress(config)
  File "autorippr.py", line 272, in compress
    dbvideo.filename, dbvideo.vidname))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)
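The final line of the traceback can be reproduced in isolation. The sketch below is written for Python 3 (the issue itself ran under Python 2.7, where the same failure happens implicitly when a unicode string is mixed into a byte string). It is not taken from autorippr.py; the variable names are only illustrative. It shows that the é in the ripped filename sits at index 7, matching the position reported in the error, and that the same string encodes cleanly as UTF-8.

```python
# The filename from the log, with the accented character written as an escape.
title = u"Les_Mis\u00e9rables_t00.mkv"

# Encoding with the ascii codec fails on the first non-ASCII character.
try:
    title.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# Encoding as UTF-8 succeeds and yields the bytes seen in the MakeMKV debug line.
utf8_bytes = title.encode("utf-8")

if error is not None:
    print("ascii failed at position", error.start)
print(utf8_bytes)
```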

Issue Analytics

  • State: open
  • Created 7 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
srounet commented, Oct 31, 2016

I think my pull request addresses this problem; I had the exact same issue with: "Master_and_Commander_De_l’autre_côté_du_monde_t00.mkv"

This commit addresses the issue by mapping accented characters to their unaccented equivalents and removing 'some' special characters, such as single or double quotes.
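For readers who want a feel for that kind of cleanup, here is a minimal sketch in Python 3. It is not the code from the pull request, which is not shown in this thread; the function name `strip_accents_and_quotes` and the exact set of quote characters are assumptions made for illustration. It uses Unicode decomposition from the standard `unicodedata` module to drop accents, then deletes quote characters.

```python
import unicodedata

# Hypothetical set of quote characters to drop: straight and curly variants.
QUOTES = ("'", '"', "\u2018", "\u2019", "\u201c", "\u201d")

def strip_accents_and_quotes(name):
    """Return name with combining accents and quote characters removed."""
    # NFKD splits characters such as 'é' into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Keep everything except the combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    for quote in QUOTES:
        stripped = stripped.replace(quote, "")
    return stripped

print(strip_accents_and_quotes("Les_Mis\u00e9rables_t00.mkv"))
```

Note that letters without a Unicode decomposition (for example the Nordic ø) pass through this approach unchanged, so an explicit mapping table would still be needed for those.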

0 reactions
knoer commented, Nov 4, 2016

I checked out the repo just last night, and looking through autorippr.py, I am sure I recall the comment about the Master and Commander string conversion, so I should be running the latest code…?

If I recall, the Nordic special characters are part of iso-8859-1 - my system may very well be running this charset as default for the same reason - I will investigate if this is true during the weekend.

I find it hard to estimate the value of spending time on a single case of a character conversion gone wrong if no one else is having the same issue, so I’ll let you be the judge of that.

In any case, I guess I will try experimenting a bit with charset encoding before attempting to do a string cleanup using the functions in util.py… A couple of years ago I tried getting into Python; this might be a good time to pick it up again… 😃

As a general solution, could it be possible to detect the system charset and decode strings from this format to UTF-8 as a part of the string cleanup process - maybe selectable by a parameter in settings.cfg?
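One way this suggestion could look, as a rough Python 3 sketch rather than anything that exists in Autorippr: the function name `to_utf8` and its `source_encoding` parameter are hypothetical, with the parameter standing in for a value that might be read from settings.cfg. When no encoding is given, it falls back to the charset reported by the standard `locale.getpreferredencoding` call.

```python
import locale

def to_utf8(raw, source_encoding=None):
    """Decode raw bytes from the given (or system) charset and return UTF-8 bytes."""
    # Hypothetical override, e.g. from a config file; otherwise ask the system locale.
    encoding = source_encoding or locale.getpreferredencoding(False)
    if isinstance(raw, bytes):
        # Undecodable bytes become U+FFFD instead of raising an exception.
        text = raw.decode(encoding, errors="replace")
    else:
        # Already a text string, nothing to decode.
        text = raw
    return text.encode("utf-8")

# A title stored as ISO-8859-1 bytes is converted to its UTF-8 form.
print(to_utf8(b"Les Mis\xe9rables", "iso-8859-1"))
```

Relying on the system locale is only a guess about how the disc title was encoded, which is why an explicit override is kept in the sketch.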

This minor issue aside, I still find Autorippr an awesome tool for backing up media (and avoiding the kids (man)handling disks) at home!
