Author does not support having an emoji in it

If I have a message that looks like this:

DD-MM-YYYY hh-mm - Alfonso 🤓: Lorem ipsum

The function get_message_author cannot interpret the emoji because its regex patterns do not take emojis into account. I’ve found an expression that supposedly matches any emoji (https://www.regextester.com/106421). As I see it, there are two options:

  1. Create more regex rules to incorporate the use of emojis in the author.
  2. Delete all emojis from the author except if it is made up only of emojis.

I think the better solution is to delete all emojis from the author. It is easier to do and yields better data, since there is no need to take emojis into account afterwards. What do you think, @joweich?
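
A minimal sketch of the second option, for illustration only. Python's built-in re module has no emoji character class, so the code point ranges below are a hand-picked approximation and will miss some emojis. The names EMOJI_RE and strip_author_emojis are hypothetical and not part of chat-miner.

```python
import re

# Assumption: these ranges approximate "emoji"; they are not exhaustive.
EMOJI_RE = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F\u200D"           # variation selector-16 and zero-width joiner
    "]+"
)


def strip_author_emojis(author: str) -> str:
    """Remove emojis from an author name, unless nothing else would remain."""
    # Drop emoji runs, then collapse the whitespace they leave behind.
    stripped = " ".join(EMOJI_RE.sub("", author).split())
    # An author made up only of emojis is kept unchanged.
    return stripped if stripped else author.strip()


print(strip_author_emojis("Alfonso 🤓"))
print(strip_author_emojis("🤓🔥"))
```

With this approach, an author who changes or removes the emoji in their display name still maps to the same name in the parsed data.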

Issue Analytics

  • State: closed
  • Created 10 months ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
joweich commented, Nov 13, 2022

@alfonso46674 thank you for picking this up! With #35, I attempted to simplify the message parsing while increasing the robustness of the author detection. In a nutshell, we now use a regex only to detect whether a line in the log starts with a date and is therefore a new message. As you mentioned, the split between timestamp and author is done via either ' - ' or '] ' (note the spaces). This approach seems to be stable for all message formats we have seen so far. Author and message body are always divided by ':', so there’s no need to overcomplicate things. Please note that this comes with a slight decrease in performance.
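
The approach described above can be sketched roughly as follows. This is not the code from #35: the date pattern, the function name parse_line, and the returned tuple are assumptions made for illustration, and real chat exports vary by locale.

```python
import re

# Assumption: a new message starts with an optional '[' and a numeric date
# such as 13-11-2022, 13/11/2022 or 13.11.22.
STARTS_WITH_DATE = re.compile(r"^\[?\d{1,4}[-/.]\d{1,2}[-/.]\d{1,4}")

# The two timestamp/author separators mentioned in the comment (note the spaces).
SEPARATORS = (" - ", "] ")


def parse_line(line: str):
    """Return (timestamp, author, body), or None if the line is not a new message."""
    if not STARTS_WITH_DATE.match(line):
        return None  # continuation of the previous message
    # Use whichever separator occurs first in the line.
    candidates = [(line.find(sep), sep) for sep in SEPARATORS if sep in line]
    if not candidates:
        return None
    index, sep = min(candidates)
    timestamp, rest = line[:index], line[index + len(sep):]
    # Author and body are divided by the first ':' after the timestamp.
    author, found, body = rest.partition(":")
    if not found:
        return None  # e.g. a system notice without an author
    return timestamp.lstrip("["), author.strip(), body.strip()


print(parse_line("13-11-2022 10-30 - Alfonso 🤓: Lorem ipsum"))
```

Because the author is taken as plain text between the separator and the colon, no pattern ever has to match the characters inside the name, which is why emojis pass through untouched.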

0 reactions
alfonso46674 commented, Nov 12, 2022

FYI, it seems that https://github.com/joweich/chat-miner/pull/35 fixed the emoji not being recognized in the author. The author is now taken as whatever lies between timestamp_author_sep (either ' - ' or '] ') and the ':' character, instead of being extracted with regular expressions, so emojis are processed correctly. I’ll add a couple of test lines to cover the problem. Should this issue be closed and a new one opened as an enhancement for converting the emojis to Unicode values? Or should we stop here?
