Replace polyglot with langdetect


https://github.com/aboSamoor/polyglot depends on pycld2, which depends on PyICU, which needs a C compiler and a lot of other tooling to build, because it doesn't ship wheels.

The only polyglot feature that is used is language detection: https://polyglot.readthedocs.io/en/latest/Detection.html

This can be done with simpler dependencies, such as https://github.com/Mimino666/langdetect
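As a rough illustration of what the swap could look like, here is a minimal sketch of a detection helper built on langdetect. The function names `detect_language` and `_load_langdetect`, the `backend` parameter, and the `"en"` fallback are all hypothetical and not taken from the project. The only langdetect calls used are `detect` and `DetectorFactory.seed`.

```python
from typing import Callable, Optional


def _load_langdetect() -> Callable[[str], str]:
    """Import langdetect lazily and return its detect() function.

    Raises ImportError if the third-party package is not installed.
    """
    from langdetect import DetectorFactory, detect

    # langdetect is non-deterministic by default; a fixed seed makes
    # repeated calls on the same text return the same answer.
    DetectorFactory.seed = 0
    return detect


def detect_language(
    text: str,
    backend: Optional[Callable[[str], str]] = None,
    default: str = "en",  # hypothetical fallback, not from the project
) -> str:
    """Return a language code for `text`, or `default` if detection fails."""
    if not text.strip():
        # Nothing to analyse, so skip the detector entirely.
        return default
    detector = backend if backend is not None else _load_langdetect()
    try:
        return detector(text)
    except Exception:
        # langdetect raises LangDetectException when the text has no
        # usable features (for example, digits only); fall back instead.
        return default
```

Keeping the detector behind a small callable means the rest of the code does not import either library directly, so the backend can be changed again later without touching the call sites.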

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
chubin commented, Nov 23, 2020

No, for any language; the feature is currently broken though

0 reactions
abitrolly commented, Nov 23, 2020

@chubin is that supposed to search only for Russian results?


Top Results From Across the Web

How to apply Polyglot Detector function to dataframe
First, if you only need polyglot for language detection, you'd better use pycld2 directly, that is what used behind the scenes.
Language Detection — polyglot 16.07.04 documentation
Sometimes, there is no enough text to make a decision, like detecting a language from one word. This forces the detector to switch...
Are there any language detection tools for assigning language ...
It is known that polyglot is probably the best language detection package available. So why didn't I go for that one directly.
Natural Language Processing using Polyglot - Introduction
Language detection (196 Languages) · Tokenization (165 Languages) · Named Entity Recognition (40 Languages) · Part of Speech Tagging (16 Languages) ...
Automatic Language Identification in Texts - Polyglot - Notepub
Tokenization (165 Languages); Language detection (196 Languages); Named Entity Recognition (40 Languages); Part of Speech Tagging (16 Languages); Sentiment ...
