UnicodeEncodeError: Unencoded UTF-8 unicode in ARSC._analyze

The bug is here: https://github.com/androguard/androguard/blob/a18256203d7af751c8862b04b15b15c23225b54f/androguard/core/bytecodes/axml.py#L1017

language and region are of type unicode but are being used as str (byte strings). This raises the following error:


UnicodeEncodeError: 'ascii' codec can't encode character u'\xa4' in position 0: ordinal not in range(128)

Here u'\xa4' is (part of) some non-ASCII region name.
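
The failure mode can be reproduced outside androguard. The snippet below is only a minimal sketch: it does not reproduce the code at the linked line, and the variable name region is chosen just to mirror the description above. In Python 2, calling str() on a unicode object implicitly encodes it with the ASCII codec; doing that encode explicitly fails in the same way on both Python 2 and Python 3, while an explicit UTF-8 encode succeeds.

```python
# -*- coding: utf-8 -*-
# Hypothetical value standing in for the region name from the traceback.
region = u"\xa4"

# Python 2's str(region) would implicitly do region.encode("ascii").
# Performing that encode explicitly fails on Python 2 and Python 3 alike.
try:
    region.encode("ascii")
    failed = False
    message = ""
except UnicodeEncodeError as exc:
    failed = True
    message = str(exc)

# Encoding explicitly with UTF-8 works for any text value.
encoded = region.encode("utf-8")

print(failed)
print(message)
```

Whether the right fix is to encode the values or to keep them as text throughout depends on what the surrounding parser expects, which is not visible from the traceback alone.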

  • Androguard Version: master
  • Python Version: 2.7.10
  • Operating System: Mac OS

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
reox commented, Dec 3, 2018

That would be very much appreciated, thanks!

Are we looking for a more “universal” fix? Then we are better off looking into ARSCResTableConfig.

Yes and no. I read through the whole ARSC parser once, and I think much of it needs to be rewritten anyway. The universal fix would then be to decide which parts are actually strings and which are bytes. Unfortunately, Python 2 was very sloppy with bytes/string conversions, so there are many problems now that androguard runs on Python 3. So please do not chase the rabbit too long. If the fix works in both Python 3 and Python 2, it's fine for now!
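
One way to make the "strings versus bytes" decision explicit is to normalize values to text at a single boundary. The helper below is a hypothetical sketch, not part of androguard: the name to_text, the UTF-8 default, and the "replace" error policy are all assumptions made for illustration. It is written to behave the same on Python 2 and Python 3.

```python
import sys

# The text type is `unicode` on Python 2 and `str` on Python 3.
# The conditional is lazy, so `unicode` is never looked up on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return value as text (hypothetical helper, not androguard API).

    Byte strings are decoded with the given encoding, text is passed
    through unchanged, and anything else is converted with the text type.
    """
    if isinstance(value, bytes):  # `bytes` is an alias of `str` on Python 2
        return value.decode(encoding, "replace")
    if isinstance(value, text_type):
        return value
    return text_type(value)


# Both byte and text inputs end up as the same text value.
print(to_text(b"\xc2\xa4") == to_text(u"\xa4"))
```

With values normalized this way, later formatting or comparison code no longer triggers Python 2's implicit ASCII conversion, which is the source of the error reported above.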

0 reactions
reox commented, Jan 3, 2019

It looks like the fix did not work and made the resource parser unusable…
