UnicodeEncodeError: Unencoded UTF-8 unicode in ARSC._analyze
The bug is here: https://github.com/androguard/androguard/blob/a18256203d7af751c8862b04b15b15c23225b54f/androguard/core/bytecodes/axml.py#L1017
language and region are of <unicode> type but are being used as strings. This raised the following error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa4' in position 0: ordinal not in range(128)
u'\xa4' being some unicode region name.
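A minimal sketch of the failure mode, not taken from the androguard source: on Python 2, using a unicode value where a str is expected triggers an implicit ASCII encode, which is made explicit here so the snippet behaves the same on Python 2 and 3. The helper name as_ascii and the variable names are hypothetical.

```python
# -*- coding: utf-8 -*-
# Hypothetical reproduction: the region value from the report, as text.
region = u"\xa4"


def as_ascii(value):
    """Mimic the implicit conversion Python 2 performs when a unicode
    object is used where a byte string (str) is expected."""
    return value.encode("ascii")


try:
    as_ascii(region)
    error = None
except UnicodeEncodeError as exc:
    # U+00A4 is outside the ASCII range, so the encode step fails.
    error = exc

# Choosing an explicit encoding avoids the implicit ASCII step.
safe = region.encode("utf-8")
```

Whether encoding to UTF-8 or keeping the value as text is the right choice depends on how the surrounding parser consumes it; this only illustrates why the implicit conversion fails.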
- Androguard Version: master
- Python Version: 2.7.10
- Operating System: Mac OS
Issue Analytics
- State:
- Created 5 years ago
- Comments:6 (6 by maintainers)
That would be much appreciated, thanks!
Yes and no. I read through the whole ARSC parser once, and I think much of it needs to be rewritten anyway. The universal fix would then be to decide which parts are actually strings and which are bytes. Unfortunately, Python 2 was very sloppy with bytes/string conversions, so there are many problems now that androguard runs on Python 3. So please do not chase the rabbit too long. If the fix works in both Python 3 and Python 2, it's fine for now!
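One way to read "decide which parts are strings and which are bytes" is to normalise values to text at a single boundary and format only text afterwards. The sketch below is an illustration under that assumption, not the fix that was applied in androguard; to_text and locale_label are hypothetical names, and the "-r" separator is only an example format.

```python
# -*- coding: utf-8 -*-


def to_text(value, encoding="utf-8"):
    """Return value as text on both Python 2 and Python 3.

    On Python 2, bytes is an alias for str, so native byte strings are
    decoded and unicode objects pass through unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value


def locale_label(language, region):
    """Build a label from a language and an optional region (hypothetical)."""
    language = to_text(language)
    region = to_text(region)
    if region:
        # Both operands are text here, so no implicit ASCII encode occurs.
        return u"{}-r{}".format(language, region)
    return language


label = locale_label(b"en", u"\xa4")
```

With every input converted up front, the formatting step never mixes bytes and text, which is the situation that raised the reported error on Python 2.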
It looks like the fix did not work and made the resource parser unusable…