How to fix the Arabic display problem in matplotlib


Here is how I fixed the Arabic display problem in matplotlib. I thought I would mention it here in case it helps someone. Please let me know if there is a better way to do this.

The problem: Arabic is not displayed properly in matplotlib, by default. The letters are not joined and are displayed from left to right instead of right to left.
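
To make the two symptoms concrete, here is a small sketch that uses only Python's standard `unicodedata` module. It is not part of the original fix, and the helper name `describe` is made up for illustration. It shows that each Arabic letter is stored as one abstract code point with a right-to-left bidi class, while the joined shapes are separate "presentation form" code points. A renderer that performs neither shaping nor bidi reordering, which appears to be what happens on matplotlib's default text path, draws the unjoined letters in storage order.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name, bidi class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
    ]


# The word for "king" from the example, stored in logical (reading) order.
word = "\u0645\u0644\u0643"

for row in describe(word):
    print(row)  # each letter reports bidi class "AL" (Arabic letter, right to left)

# The joined shapes live in a separate block of presentation-form code points.
initial_meem = "\uFEE3"
print(unicodedata.name(initial_meem))

# Compatibility normalization maps a presentation form back to the abstract letter.
assert unicodedata.normalize("NFKC", initial_meem) == "\u0645"
```

Fixing the display therefore takes two steps: choosing the right presentation form for each letter, and reordering the result for a left-to-right renderer.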

Here is sample code to reproduce the issue.


from whatlies.language import CountVectorLanguage

# English gloss -> Arabic word; only the Arabic values are plotted
words = {
    "man": "رجل",
    "woman": "امرأة",
    "king": "ملك",
    "queen": "ملكة",
    "brother": "أخ",
    "sister": "أخت",
    "cat": "قطة",
    "dog": "كلب",
    "lion": "أسد",
    "puppy": "جرو",
    "male student": "طالب",
    "female student": "طالبة",
    "university": "جامعة",
    "school": "مدرسة",
    "kitten": " قطة صغيرة",
    "apple": "تفاحة",
    "orange": "برتقال",
    "cabbage": "كرنب",
    "carrot": "جزرة",
}

lang_cv = CountVectorLanguage(10)

# The Arabic labels in this plot come out unjoined and left to right
lang_cv[list(set(words.values()))].plot_similarity()


Here is the output.

[Screenshot: similarity plot with the Arabic labels unjoined and ordered left to right]

The Solution:

The solution is to use two Python packages to preprocess the Arabic strings before passing them to matplotlib. The packages are:

  1. arabic_reshaper (installed from PyPI as arabic-reshaper)
  2. bidi.algorithm (part of the python-bidi package)

Here is the code:


from whatlies.language import CountVectorLanguage
import arabic_reshaper
from bidi.algorithm import get_display

# English gloss -> Arabic word; only the Arabic values are plotted
words = {
    "man": "رجل",
    "woman": "امرأة",
    "king": "ملك",
    "queen": "ملكة",
    "brother": "أخ",
    "sister": "أخت",
    "cat": "قطة",
    "dog": "كلب",
    "lion": "أسد",
    "puppy": "جرو",
    "male student": "طالب",
    "female student": "طالبة",
    "university": "جامعة",
    "school": "مدرسة",
    "kitten": " قطة صغيرة",
    "apple": "تفاحة",
    "orange": "برتقال",
    "cabbage": "كرنب",
    "carrot": "جزرة",
}

lang_cv = CountVectorLanguage(10)

def handle_arabic(input_string):
    # Join the letters into their contextual forms first,
    # then reorder the result for right-to-left display
    reshaped_text = arabic_reshaper.reshape(input_string)
    return get_display(reshaped_text)

# Preprocess every Arabic word before it reaches the plotting code
words = [handle_arabic(word) for word in words.values()]

lang_cv[words].plot_similarity()


Here is the output now.

[Screenshot: the same plot with the Arabic labels joined and ordered right to left]

As you can see, the Arabic is now fixed. The letters appear joined and are displayed from right to left.

Hope that helps.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
hashirabdulbasheer commented, Jan 3, 2021

You are right, sorting was not required. It must have slipped in when I copy-pasted the code.

This should work:

lang_cv[words].plot_similarity()

Ideally, the charts should be flipped for RTL too. The labels should sit on the right side, because we start reading from the right, with the color legend on the left. But matplotlib has no RTL option that flips the layout automatically.
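
In the absence of a single RTL switch, the layout can be mirrored by hand. The following is a rough sketch of that idea on a plain matplotlib heatmap, not something taken from whatlies: the function name `mirrored_heatmap`, the placeholder matrix and the dummy labels are all made up for illustration, and whether `plot_similarity` exposes its axes for this kind of adjustment is not something this sketch assumes.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def mirrored_heatmap(matrix, labels):
    """Draw a square heatmap laid out for right-to-left reading."""
    fig, ax = plt.subplots()
    image = ax.imshow(matrix, cmap="viridis")
    positions = range(len(labels))
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)
    ax.set_yticks(positions)
    ax.set_yticklabels(labels)
    ax.invert_xaxis()      # the first column now starts at the right edge
    ax.yaxis.tick_right()  # row labels move to the right side
    fig.colorbar(image, ax=ax, location="left")  # color legend on the left
    return fig, ax


# Placeholder data: a tiny symmetric similarity matrix with dummy labels.
matrix = [[1.0, 0.2, 0.5], [0.2, 1.0, 0.3], [0.5, 0.3, 1.0]]
fig, ax = mirrored_heatmap(matrix, ["a", "b", "c"])
```

With real data, the labels would be the preprocessed Arabic strings and `fig.savefig(...)` would write the mirrored chart to disk.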

0 reactions
koaning commented, Jan 4, 2021

The example is now live in the docs. I’ve decided to start with a simple string-reversal helper. That way this project doesn’t gain dependencies.
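
For readers who want to see what a dependency-free reversal amounts to, here is a minimal sketch. The function name `reverse_labels` is hypothetical and this is not the helper from the linked docs, only an illustration of the idea.

```python
def reverse_labels(labels):
    """Reverse the characters of each label so that a left-to-right
    renderer draws them in right-to-left visual order."""
    return [label[::-1] for label in labels]


# The words for "king" and "dog" from the example above, in logical order.
labels = ["\u0645\u0644\u0643", "\u0643\u0644\u0628"]
print(reverse_labels(labels))
```

The trade-off is that reversal only addresses the reading direction. It does not select the joined letter forms, and it would also reverse any Latin text or digits mixed into a label, which is what the reshaper and bidi packages handle.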

https://rasahq.github.io/whatlies/api/helpers/


