question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Add Bengali corpus and alphabet

See original GitHub issue

For @RatulGhosh

Take what you’ve started here and add it to the CLTK core.

  1. The PR you made the other day ( #354 ): I need to know where this repo is! Give me the link and I will review it. When we talk about a “corpus”, we mean a collection of text documents. Do you have this?
  2. The Bengali alphabet looks OK (but use double quotes “”“…”“” for the docstring at the top). To add it, follow these directions and put it at cltk/corpus/bengali/alphabet.py (and also an empty cltk/corpus/bengali/__init__.py). See other languages for more examples.

Issue Analytics

  • State:closed
  • Created 7 years ago
  • Comments:19 (19 by maintainers)

github_iconTop GitHub Comments

1reaction
RatulGhoshcommented, Sep 26, 2016

@kylepjohnson I am working on it. As my exam has been going on, I was not able to spend time on it. I will definitely send a pull request with the docs and some new functionalities next week.

1reaction
RatulGhoshcommented, Aug 26, 2016

Actually, I had the docs ready. I will open a PR.

Read more comments on GitHub >

github_iconTop Results From Across the Web

Bengali alphabet, pronunciation and language - Omniglot
Bengali (বাংলা). Bengali is an eastern Indo-Aryan language with around 265 million speakers, mainly in Bangladesh and northern Indian.
Read more >
Bengali alphabet - Wikipedia
The Bengali script can be divided into vowels and vowel diacritics, consonants and consonant conjuncts, diacritical and other symbols, digits and punctuation ...
Read more >
Identification of Reduplication in Bengali Corpus and their ...
Most common type in Bengali is one where the first letter or the associated matra or both is changed, e.g. thakur-thukur (God), boka-soka...
Read more >
bangla-nlp · GitHub Topics
This module helps to analyze Bengali sentences. It can analyze various entities. Can do non contextual PoS tagging. Is capable of returning the...
Read more >
BanglaLM: Data Mining based Bangla Corpus for Language ...
Therefore, NLP researchers who mainly work with the Bengali language will find an extensive, robust dataset incredibly useful for their NLP ...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found