[Suggestion] Be more liberal when parsing email addresses


Is your feature request related to a problem? Please describe.

When an email address in the EMAIL parameter contains a dot at the end, the whole event can’t be parsed because

  1. the EMAIL parameter is parsed,
  2. which calls InternetAddress(<email address>),
  3. which throws an AddressException when the email address ends with a dot.

While rejecting such an address seems to be syntactically correct, it causes problems when users accidentally add a trailing dot: https://forums.bitfire.at/topic/2648/domain-with-dot-at-the-end/

Describe the solution you’d like

I don’t know if there’s a good solution to that. However, I think that parsing events should be as liberal as possible, and a single more-or-less invalid parameter should not stop ical4j from parsing the whole event. Ideas about possible solutions:

  1. Invalid parameters should be ignored in relaxed mode. (If mode is relaxed → catch parsing exceptions and turn them into warnings.) I think this is a big change, but it would solve most problems.
  2. The EMAIL parameter could somehow be empty when the value is illegal.
  3. In the specific case of email addresses, there could be some kind of EmailHelper.checkEmail() function that removes common problems like a trailing dot. This function could always be called before passing the email address to InternetAddress() (at least in relaxed mode).
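
As a rough illustration of idea 1, the sketch below skips a parameter whose value fails validation and records a warning instead of aborting, but only in relaxed mode. Everything in it is hypothetical: `RelaxedParameterParsing`, `parseAll`, `Result`, and the stand-in validator `rejectTrailingDot` are not ical4j API, and `IllegalArgumentException` merely stands in for the `AddressException` mentioned above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of "catch and warn" parsing; none of these names are ical4j API. */
final class RelaxedParameterParsing {

    /** Parameters that parsed successfully, plus warnings for those that did not. */
    static final class Result {
        final Map<String, String> parameters = new LinkedHashMap<>();
        final List<String> warnings = new ArrayList<>();
    }

    /**
     * Runs every raw value through the validator. In relaxed mode a failing
     * parameter is skipped and recorded as a warning; in strict mode the
     * exception propagates and parsing stops.
     */
    static Result parseAll(Map<String, String> raw,
                           Function<String, String> validator,
                           boolean relaxed) {
        Result result = new Result();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            try {
                result.parameters.put(entry.getKey(), validator.apply(entry.getValue()));
            } catch (IllegalArgumentException ex) {
                if (!relaxed) {
                    throw ex;
                }
                result.warnings.add("Ignoring " + entry.getKey() + ": " + ex.getMessage());
            }
        }
        return result;
    }

    /** Stand-in validator (an assumption) that rejects a trailing dot, like the failure described above. */
    static String rejectTrailingDot(String value) {
        if (value.endsWith(".")) {
            throw new IllegalArgumentException("trailing dot in '" + value + "'");
        }
        return value;
    }
}
```

With this shape, an event whose EMAIL value ends in a dot would still yield its other parameters in relaxed mode, which also comes close to idea 2 (the EMAIL parameter simply ends up absent).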

Additional context

See forum link above. The problem occurs when a mobile client (DAVx5) synchronizes events from the server that contain email addresses with a trailing dot.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
benfortuna commented, Jan 6, 2022

I noticed there is a flag for strict mode on InternetAddress.parse(), so I’ve set strict=false if RELAXED_PARSING is enabled. This still doesn’t fix the issue with the trailing dot, so we’ll also remove the trailing dot when relaxed parsing is enabled.
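
A minimal sketch of the behaviour described in this comment, not the actual ical4j change: when a relaxed flag is set, trailing dots are stripped before the address reaches the parser. The names `RelaxedEmailSketch`, `parseEmail`, and `strictParse` are hypothetical, and `strictParse` is only a stand-in for the real `InternetAddress` parsing so that the example runs without JavaMail on the classpath.

```java
/** Hypothetical sketch; a stand-in parser replaces InternetAddress so the example is self-contained. */
final class RelaxedEmailSketch {

    /** Stand-in for the real parser (an assumption): rejects a trailing dot and a missing local part. */
    static String strictParse(String address) {
        if (address.isEmpty() || address.endsWith(".") || address.indexOf('@') <= 0) {
            throw new IllegalArgumentException("Invalid address: " + address);
        }
        return address;
    }

    /**
     * Parses an EMAIL value. In relaxed mode, surrounding whitespace and any
     * trailing dots are removed first; in strict mode the value is passed
     * through unchanged, so a trailing dot still fails.
     */
    static String parseEmail(String raw, boolean relaxed) {
        String candidate = raw;
        if (relaxed) {
            candidate = candidate.trim();
            while (candidate.endsWith(".")) {
                candidate = candidate.substring(0, candidate.length() - 1);
            }
        }
        return strictParse(candidate);
    }
}
```

In real code the relaxed branch would presumably hand the cleaned value to `InternetAddress` with its strict flag set to false, as the comment describes; that call is left out here to keep the sketch dependency-free.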

There may be other issues, but we can add to it as required, and review later if a different solution is needed.

New release just published (3.1.2) includes this change.

1 reaction
benfortuna commented, Jan 4, 2022

Thanks, I had another look at the EMAIL parameter, which is a bit different from other address params, where we use URI to store the value. However, the spec is quite explicit that this parameter should be an address (not a URI):

https://www.rfc-editor.org/rfc/rfc7986.html#section-6.2

So as you suggest we could clean up invalid values in relaxed mode, which I think we have done before. We do try to fix problems like this so we are lenient on parsing but strict on outputting.

I think a replacement for InternetAddress that supports lenient parsing is probably the best approach, as we may also be able to remove the JavaMail dependency (I don’t think it is used elsewhere).

I’ll have a look for some other libraries or look at how to implement it internally.
