UUencoded attachment parsing

When dealing with attachments encoded via uuencoding (Content-transfer-encoding is uuencode or x-uuencode), mail-parser treats them as text, as can be seen in parse() (mailparser.py:378):

if transfer_encoding == "base64" or (
        transfer_encoding == "quoted-printable"
        and "application" in mail_content_type):
    ...
else:
    payload = ported_string(p.get_payload(decode=True), encoding=charset)
    log.debug("Filename {!r} part {!r} is not binary".format(filename, i))

Within the else block, the payload is correctly decoded with p.get_payload(decode=True), but the resulting bytes are then passed to ported_string(), which attempts to decode them as text (using the part's charset, falling back to UTF-8) in utils.py:85:

def ported_string(raw_data, encoding='utf-8', errors='ignore'):
    ...
    try:
        return six.text_type(raw_data, encoding).strip()
    except (LookupError, UnicodeDecodeError):
        return six.text_type(raw_data, "utf-8", errors).strip()

Since errors are ignored, decoding doesn’t fail, but it returns an attachment stripped of all bytes that can’t be decoded as UTF-8 (this can easily be verified by writing that binary to disk with write_attachments).

I encountered this issue while porting SpamScope to Python 3. SpamScope has a test, test_store_samples_unicode_error, that parses and saves a uuencoded attachment. According to the test, the resulting file should have an MD5 checksum of 2ea90c996ca28f751d4841e6c67892b8. That test passes with Python 2 because the incorrectly parsed payload does indeed have that hash. With Python 3, the hash changes due to differences in Unicode handling. The correct checksum, however, is actually 4f2cf891e7cfb349fca812091f184ecc.
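
One possible direction, sketched under assumptions rather than taken from mail-parser itself: treat uuencoded parts like other binary transfer encodings and keep the decoded bytes intact instead of routing them through a text decoder. The names BINARY_TRANSFER_ENCODINGS, build_uuencoded_mail and extract_payload are hypothetical, and re-encoding the bytes as base64 text is just one assumed way of carrying binary data safely.

```python
import base64
import binascii
from email import message_from_string

# Assumption: transfer encodings whose decoded payload must stay raw bytes.
BINARY_TRANSFER_ENCODINGS = {"base64", "uuencode", "x-uuencode", "uue", "x-uue"}


def build_uuencoded_mail(data: bytes, filename: str = "sample.bin") -> str:
    """Build a minimal single-part message with a uuencoded body (fixture only)."""
    body = ["begin 644 " + filename]
    for start in range(0, len(data), 45):  # uuencode packs at most 45 bytes per line
        chunk = data[start:start + 45]
        body.append(binascii.b2a_uu(chunk).decode("ascii").rstrip("\n"))
    body += ["`", "end"]  # zero-length line, then the end marker
    headers = [
        "Content-Type: application/octet-stream",
        "Content-Transfer-Encoding: x-uuencode",
        "",
    ]
    return "\n".join(headers + body) + "\n"


def extract_payload(part):
    """Hypothetical helper: return (payload, is_binary) without lossy decoding."""
    cte = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
    raw = part.get_payload(decode=True)  # the stdlib undoes the transfer encoding
    if cte in BINARY_TRANSFER_ENCODINGS:
        # Keep every byte; base64 makes the result safe to store as text.
        return base64.b64encode(raw).decode("ascii"), True
    charset = part.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="ignore"), False


original = bytes(range(256))  # includes many bytes that are invalid UTF-8
part = message_from_string(build_uuencoded_mail(original))
payload, is_binary = extract_payload(part)
print(is_binary, base64.b64decode(payload) == original)
```

With a check like this in place, a test comparing checksums would see the hash of the real attachment rather than the hash of a stripped-down text rendering.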

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

sylencecc commented, Dec 1, 2020

Please have a look at my SpamScope fork and the Storm Dockerfile which the new SpamScope image depends on. So far, the included tests run fine, as does the default debug topology. The project we’re using SpamScope in also seems to run on py3 without further issues. However, due to the mediocre test coverage we can’t be all too confident that this update doesn’t break anything. Moreover, I didn’t update the Ansible playbooks due to a lack of time.

  • Do you want to merge that into master/develop or create a separate branch just for py3 support?
  • Should I squash my commits into a single “py3 update” commit?
  • Did you encounter any issues with my changes, anything I forgot etc.?

sylencecc commented, Nov 25, 2020

Yeah, we have some py3-only dependencies, which is why I’m currently in the process of porting it over. If you’re interested in the results, I’ll happily send a pull request as soon as I’m done. However, I’m not keeping backwards compatibility: it won’t run with py2 anymore. In addition to that, I’m only using the Docker-based version, so I won’t touch the Ansible stuff for now. Docker-wise, since the SpamScope image depends on fmantuano/spamscope-deps (which I didn’t find a repository for) and this again depends on fmantuano/apache-storm (which I DID find a repository for), I did the following:

  • Forked the repository for fmantuano/apache-storm and updated Apache Storm to version 2.2.0 (see here)
  • Merged the Dockerfile for spamscope-deps into the Dockerfile within the SpamScope repository and updated all dependencies. The drawback of that merge is that building the v8 engine for thug takes forever. On the other hand, users now have the chance to manage their dependencies themselves, which I find better than depending on the outdated spamscope-deps image from Docker Hub.
  • Conversion process itself is still underway (it’s just a side project), but SpamScope tests are passing now with py3.

Let me know if I should send you pull requests for all that stuff. A separate branch might be appropriate.
