UTF Characters in Movie title.
$ python2.7 autorippr.py --all --debug
2016-10-26 21:20:09 - Rip - DEBUG - Ripping initialised
2016-10-26 21:20:09 - Rip - DEBUG - Checking for DVDs
2016-10-26 21:20:16 - Rip - DEBUG - 1 DVD(s) found
2016-10-26 21:20:16 - Makemkv - DEBUG - Detected movie Les Miserables Dom
2016-10-26 21:20:41 - Makemkv - DEBUG - MakeMKV found 1 titles
2016-10-26 21:20:41 - Makemkv - DEBUG - MakeMKV title info: Disc Title: ['Les Mis\xc3\xa9rables'], Title No.: 0, Title: ['Les_Mis\xc3\xa9rables_t00.mkv'],
2016-10-26 21:20:41 - Rip - DEBUG - Attempting to rip Les_Misérables_t00.mkv from Les Miserables Dom
2016-10-26 21:50:48 - Rip - INFO - It took 30 minute(s) to complete the ripping of Les_Misérables_t00.mkv from Les Miserables Dom
2016-10-26 21:50:48 - Eject - DEBUG - Ejecting drive: "/dev/sr0"
2016-10-26 21:50:48 - Eject - DEBUG - Attempting OS detection
2016-10-26 21:50:48 - Eject - DEBUG - OS detected as Unix
2016-10-26 21:50:52 - Eject - DEBUG - eject: device name is `/dev/sr0'
2016-10-26 21:50:52 - Eject - DEBUG - eject: /dev/sr0: not mounted
2016-10-26 21:50:52 - Eject - DEBUG - eject: /dev/sr0: is whole-disk device
2016-10-26 21:50:52 - Eject - DEBUG - eject: /dev/sr0: trying to eject using CD-ROM eject command
2016-10-26 21:50:52 - Eject - DEBUG - eject: CD-ROM eject command succeeded
2016-10-26 21:50:52 - Compress - DEBUG - Compressing initialised
2016-10-26 21:50:52 - Compress - DEBUG - Looking for videos to compress
Traceback (most recent call last):
  File "autorippr.py", line 419, in <module>
    compress(config)
  File "autorippr.py", line 272, in compress
    dbvideo.filename, dbvideo.vidname))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)
Issue Analytics
- Created 7 years ago
- Comments: 6 (3 by maintainers)
I think my pull request addresses this problem. I had the exact same issue with: “Master_and_Commander_De_l’autre_côté_du_monde_t00.mkv”
This commit addresses the issue by mapping accented characters and removing some special characters, such as single and double quotes.
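For illustration only, here is a minimal sketch of that kind of cleanup. It is not the code from the pull request or from util.py: the function name `strip_accents_and_quotes` and the `QUOTES` set are hypothetical, and it uses Unicode decomposition from the standard library rather than a hand-written mapping table. It expects text that has already been decoded to unicode.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Quote characters to drop (hypothetical choice): ASCII and typographic quotes.
QUOTES = u"'\"\u2018\u2019\u201c\u201d"


def strip_accents_and_quotes(name):
    """Hypothetical helper: return `name` without accents or quote characters.

    `name` must be unicode text, not raw bytes.
    """
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, which leaves only the base letters.
    no_marks = u"".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Drop single and double quotes.
    return u"".join(ch for ch in no_marks if ch not in QUOTES)
```

Letters that have no decomposition, such as ø, pass through unchanged, so a fully ASCII-safe filename would still need an explicit mapping for those.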
I checked out the repo just last night, and looking through autorippr.py, I am sure I recall the comment about the Master and Commander string conversion, so I should be running the latest code…?
If I recall correctly, the Nordic special characters are part of ISO-8859-1 - my system may very well be running this charset as default for the same reason - I will investigate if this is true during the weekend.
I find it hard to estimate the value of spending time on a single case of a character conversion gone wrong if no one else is having the same issue, so I’ll let you be the judge of that.
In any case, I guess I will try experimenting a bit with charset encoding before attempting to do a string cleanup using the functions in util.py… A couple of years ago I tried getting into Python, this might be a good time to pick it up again… 😃
As a general solution, would it be possible to detect the system charset and decode strings from that charset to UTF-8 as part of the string cleanup process - maybe selectable by a parameter in settings.cfg?
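A rough sketch of what I have in mind, assuming the title arrives as a raw byte string. The function name `decode_title` and the `configured_encoding` argument (standing in for a possible settings.cfg option) are made up for illustration and do not exist in Autorippr.

```python
import locale


def decode_title(raw, configured_encoding=None):
    """Hypothetical helper: decode a raw byte string to unicode text."""
    if not isinstance(raw, bytes):
        return raw  # already decoded text, nothing to do

    candidates = []
    if configured_encoding:
        # An explicit setting (assumed option) takes priority.
        candidates.append(configured_encoding)
    # UTF-8 rejects most non-UTF-8 input, so it is a safe early guess.
    candidates.append("utf-8")
    # The charset the system locale reports, e.g. ISO-8859-1.
    candidates.append(locale.getpreferredencoding(False))

    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue
    # latin-1 maps every byte to a character, so this never fails.
    return raw.decode("latin-1")
```

The decoded title can then be written out with `.encode("utf-8")` wherever bytes are needed. Note that a single-byte charset such as ISO-8859-1 accepts any input, so a wrong setting would produce garbled titles rather than an error.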
This minor issue aside, I still find Autorippr an awesome tool for backing up media (and avoiding the kids (man)handling disks) at home!