Detect source language with langdetect package

The langdetect package has worked well for me in the past for language detection problems. How would you feel about allowing users to pass 'auto' as an option for source? I can see some pros and cons:

Pros

  • Users don’t need to be able to recognize a language to translate
  • Eliminates pre-classification of languages if your dataset contains multiple languages

Cons

I’m a little new to open source but I would love to contribute 🙂 Of course, if you feel this doesn’t fit this package’s mission that’s totally understandable.
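
For illustration, the 'auto' option proposed above could be sketched roughly as follows. This is not code from the package: `resolve_source` and its `detector` parameter are hypothetical names, and the only real library call assumed is `langdetect.detect`, which returns a language code such as "fr". Mapping that code to whatever identifiers the translation package expects is left out.

```python
from typing import Callable, Optional


def _langdetect_detector(text: str) -> str:
    # Imported lazily so this module still loads when langdetect is absent.
    from langdetect import detect
    return detect(text)  # a language code such as "fr"


def resolve_source(
    text: str,
    source: str = "auto",
    detector: Optional[Callable[[str], str]] = None,
) -> str:
    """Return `source` unchanged unless it is 'auto'; then detect it from `text`.

    `detector` is a hypothetical hook so another backend can be swapped in.
    """
    if source != "auto":
        return source
    if not text.strip():
        raise ValueError("cannot detect the language of empty text")
    detect_fn = detector or _langdetect_detector
    return detect_fn(text)
```

A caller would run `resolve_source(text, source)` before translating and pass the result on as the source language. Note that langdetect's results can vary between runs on short or ambiguous text unless `DetectorFactory.seed` is set.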

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
awalker88 commented, Apr 26, 2021

Those are some good points. I agree it would be confusing to have the library detect a language but not translate it. I’ll look into writing something that could potentially be put into the user guide.

0 reactions
xhluca commented, Oct 16, 2021

@banyous Feel free to contribute a section in the user guide about using language detection. From there, if we feel a wrapper around fasttext would make life easier, I’m happy to welcome a PR to add language detection to dlt.utils or dlt.lang.

I think this is a decent starting point: https://fasttext.cc/docs/en/language-identification.html
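
As a rough idea of what such a user-guide section might contain, here is a minimal sketch built on the fastText language identification model from the page linked above. The helper names (`load_lid_model`, `parse_label`, `detect_languages`) are hypothetical and not part of dlt, and the sketch assumes the pretrained `lid.176.bin` file has already been downloaded locally.

```python
from typing import List, Tuple


def load_lid_model(path: str = "lid.176.bin"):
    # Imported lazily; assumes the pretrained model file exists at `path`.
    import fasttext
    return fasttext.load_model(path)


def parse_label(label: str) -> str:
    """Strip fastText's '__label__' prefix, e.g. '__label__en' -> 'en'."""
    prefix = "__label__"
    return label[len(prefix):] if label.startswith(prefix) else label


def detect_languages(model, texts: List[str]) -> List[Tuple[str, float]]:
    """Return the top (language code, probability) pair for each text."""
    results = []
    for text in texts:
        # predict() processes one line at a time, so newlines are flattened.
        labels, probs = model.predict(text.replace("\n", " "), k=1)
        results.append((parse_label(labels[0]), float(probs[0])))
    return results
```

With a model loaded via `load_lid_model()`, calling `detect_languages(model, texts)` yields one code per input, which could then be mapped to the language names the translation model expects.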

Top Results From Across the Web

langdetect - PyPI
Language detection library ported from Google's language-detection. ... langdetect supports 55 languages out of the box (ISO 639-1 codes):
Detect source language with langdetect package #37 - GitHub
Hey langdetect is cool! However it seems there's many options for language detection, including fasttext and langid.py. Each option will have a ...
python - How to determine the language of a piece of text?
1. TextBlob. Requires NLTK package, uses Google. from textblob import TextBlob b = TextBlob("bonjour") b.detect_language().
Detect an Unknown Language using Python - GeeksforGeeks
The idea behind language detection is based on the detection of the character among the expression and words in the text.
4 Python libraries to detect English and Non-English language
We will discuss spacy-langdetect, Pycld2, TextBlob, and Googletrans for language detection. This solve natural language processing (NLP) ...
