fontbakery.profiles.shared_conditions: Improve is_cjk_font
I don’t think we should be doing bit manipulation on the OS/2 ulCodePageRange attributes.
If I run the function on our whole collection, it picks up a few fonts which do not contain CJK, and it also crashes on one font:
/Users/marcfoley/Type/fonts/ofl/seoulhangang
/Users/marcfoley/Type/fonts/ofl/mplus1p
/Users/marcfoley/Type/fonts/ofl/frijole <-- not cjk
/Users/marcfoley/Type/fonts/ofl/jejuhallasan
/Users/marcfoley/Type/fonts/ofl/seoulnamsanvertical
/Users/marcfoley/Type/fonts/ofl/gochihand
/Users/marcfoley/Type/fonts/ofl/eastseadokdo
/Users/marcfoley/Type/fonts/ofl/geostar
/Users/marcfoley/Type/fonts/ofl/stylish
/Users/marcfoley/Type/fonts/ofl/himelody
/Users/marcfoley/Type/fonts/ofl/allerta <-- not cjk
/Users/marcfoley/Type/fonts/ofl/codystar
/Users/marcfoley/Type/fonts/ofl/jejumyeongjo
/Users/marcfoley/Type/fonts/ofl/roundedmplus1c
/Users/marcfoley/Type/fonts/ofl/nikukyu
/Users/marcfoley/Type/fonts/ofl/blackhansans
/Users/marcfoley/Type/fonts/ofl/nanummyeongjo
/Users/marcfoley/Type/fonts/ofl/creepster <-- not cjk
/Users/marcfoley/Type/fonts/ofl/seoulhangangcondensed
/Users/marcfoley/Type/fonts/ofl/mashanzheng
/Users/marcfoley/Type/fonts/ofl/antic <-- not cjk
/Users/marcfoley/Type/fonts/ofl/amstelvaralpha <-- not cjk
/Users/marcfoley/Type/fonts/ofl/geostarfill <-- not cjk
/Users/marcfoley/Type/fonts/ofl/longcang
Traceback (most recent call last):
  File "is_cjk.py", line 14, in <module>
    if is_cjk_font(font):
  File "/Users/marcfoley/Type/fontbakery/Lib/fontbakery/callable.py", line 99, in __call__
    return self.__wrapped__(*args, **kwds)
  File "/Users/marcfoley/Type/fontbakery/Lib/fontbakery/profiles/shared_conditions.py", line 298, in is_cjk_font
    if os2.ulCodePageRange1 & (1 << bit):
AttributeError: 'table_O_S_2f_2' object has no attribute 'ulCodePageRange1'
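The crash itself is most likely down to the OS/2 table version: the ulCodePageRange fields only exist from version 1 onwards, so a version 0 table has no such attribute. A minimal sketch of a guarded bit check follows. The function name, the bit list (taken from the OpenType OS/2 spec, not from the current Font Bakery code) and the SimpleNamespace stand-ins are assumptions for illustration. It would only avoid the AttributeError; it does nothing about the false positives listed above.

```python
from types import SimpleNamespace

# Bits 17-21 of ulCodePageRange1 per the OpenType OS/2 spec (assumed list):
# JIS/Japan, Chinese Simplified, Korean Wansung, Chinese Traditional, Korean Johab.
CJK_CODEPAGE_BITS = (17, 18, 19, 20, 21)


def has_cjk_codepage_bit(os2):
    """Return True if any CJK code page bit is set; tolerate version 0 tables."""
    # Version 0 OS/2 tables carry no ulCodePageRange fields, so fall back to 0.
    code_page_range1 = getattr(os2, "ulCodePageRange1", 0)
    return any(code_page_range1 & (1 << bit) for bit in CJK_CODEPAGE_BITS)


# Hypothetical stand-ins for OS/2 table objects, so the sketch runs without a font.
legacy_os2 = SimpleNamespace(version=0)                         # no code page fields
korean_os2 = SimpleNamespace(version=4, ulCodePageRange1=1 << 19)
latin_os2 = SimpleNamespace(version=4, ulCodePageRange1=1)      # bit 0: Latin 1

for name, table in [("legacy", legacy_os2), ("korean", korean_os2), ("latin", latin_os2)]:
    print(name, has_cjk_codepage_bit(table))
```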
A better approach is to get the best cmap, then count the code points whose East Asian Width is Wide ("W"). If their share of the cmap reaches a certain threshold, return True, otherwise False. I’ve sketched the following quick function:
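from collections import Counter
import unicodedata

def is_cjk_font(ttFont):
    cmap = ttFont.getBestCmap()
    # Tally the East Asian Width class of every encoded code point.
    width_counter = Counter(unicodedata.east_asian_width(chr(i)) for i in cmap)
    cjk_glyph_ratio = width_counter["W"] / len(cmap)
    return cjk_glyph_ratio >= 0.25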
It’s probably missing many edge cases, but the results are much cleaner. It returns every CJK family in our collection and doesn’t pick up any non-CJK families:
/Users/marcfoley/Type/fonts/ofl/seoulhangang
/Users/marcfoley/Type/fonts/ofl/mplus1p
/Users/marcfoley/Type/fonts/ofl/jejuhallasan
/Users/marcfoley/Type/fonts/ofl/seoulnamsanvertical
/Users/marcfoley/Type/fonts/ofl/eastseadokdo
/Users/marcfoley/Type/fonts/ofl/stylish
/Users/marcfoley/Type/fonts/ofl/himelody
/Users/marcfoley/Type/fonts/ofl/jejumyeongjo
/Users/marcfoley/Type/fonts/ofl/roundedmplus1c
/Users/marcfoley/Type/fonts/ofl/nikukyu
/Users/marcfoley/Type/fonts/ofl/blackhansans
/Users/marcfoley/Type/fonts/ofl/nanummyeongjo
/Users/marcfoley/Type/fonts/ofl/seoulhangangcondensed
/Users/marcfoley/Type/fonts/ofl/mashanzheng
/Users/marcfoley/Type/fonts/ofl/longcang
/Users/marcfoley/Type/fonts/ofl/jejugothic
/Users/marcfoley/Type/fonts/ofl/nanumgothiccoding
/Users/marcfoley/Type/fonts/ofl/gugi
/Users/marcfoley/Type/fonts/ofl/dohyeon
/Users/marcfoley/Type/fonts/ofl/cutefont
/Users/marcfoley/Type/fonts/ofl/blackandwhitepicture
/Users/marcfoley/Type/fonts/ofl/zcoolkuaile
/Users/marcfoley/Type/fonts/ofl/zhimangxing
/Users/marcfoley/Type/fonts/ofl/zcoolqingkehuangyou
/Users/marcfoley/Type/fonts/ofl/cherrybomb
/Users/marcfoley/Type/fonts/ofl/gaegu
/Users/marcfoley/Type/fonts/ofl/nicomoji
/Users/marcfoley/Type/fonts/ofl/liujianmaocao
/Users/marcfoley/Type/fonts/ofl/nanumpenscript
/Users/marcfoley/Type/fonts/ofl/hannari
/Users/marcfoley/Type/fonts/ofl/sawarabimincho
/Users/marcfoley/Type/fonts/ofl/zcoolxiaowei
/Users/marcfoley/Type/fonts/ofl/gamjaflower
/Users/marcfoley/Type/fonts/ofl/hanna
/Users/marcfoley/Type/fonts/ofl/sawarabigothic
/Users/marcfoley/Type/fonts/ofl/yeonsung
/Users/marcfoley/Type/fonts/ofl/gothica1
/Users/marcfoley/Type/fonts/ofl/seoulnamsancondensed
/Users/marcfoley/Type/fonts/ofl/dokdo
/Users/marcfoley/Type/fonts/ofl/kopubbatang
/Users/marcfoley/Type/fonts/ofl/songmyung
/Users/marcfoley/Type/fonts/ofl/nanumbrushscript
/Users/marcfoley/Type/fonts/ofl/poorstory
/Users/marcfoley/Type/fonts/ofl/kiranghaerang
/Users/marcfoley/Type/fonts/ofl/nanumgothic
/Users/marcfoley/Type/fonts/ofl/singleday
/Users/marcfoley/Type/fonts/ofl/kokoro
/Users/marcfoley/Type/fonts/ofl/jua
/Users/marcfoley/Type/fonts/ofl/seoulnamsan
/Users/marcfoley/Type/fonts/ofl/sunflower
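One edge case in the ratio approach is a font with no usable Unicode cmap, where dividing by the number of code points would fail. The sketch below is one possible way to guard against that, working on a plain iterable of code points so it can be tried without a font file. The names wide_ratio and looks_cjk and the threshold parameter are hypothetical, not part of Font Bakery.

```python
import unicodedata


def wide_ratio(codepoints):
    """Fraction of code points whose East Asian Width is Wide ("W")."""
    codepoints = list(codepoints)
    if not codepoints:  # empty mapping: avoid dividing by zero
        return 0.0
    wide = sum(
        1 for cp in codepoints if unicodedata.east_asian_width(chr(cp)) == "W"
    )
    return wide / len(codepoints)


def looks_cjk(codepoints, threshold=0.25):
    """Apply the 0.25 cut-off from the issue (assumed default) to the ratio."""
    return wide_ratio(codepoints) >= threshold


hangul = list(range(0xAC00, 0xAC00 + 100))  # 100 Hangul syllables
latin = list(range(0x41, 0x5B))             # A-Z

print(looks_cjk(hangul + latin))  # mostly wide characters
print(looks_cjk(latin))           # no wide characters
print(looks_cjk([]))              # empty cmap
```

With fontTools, the input could be `ttFont.getBestCmap() or {}`, since getBestCmap() returns None when no Unicode subtable is found and iterating over the mapping yields its code points.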
Issue Analytics
- State:
- Created 4 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
With my current algo, 1 would fail if a font contained only 181 Kana characters but also had 800+ Latin glyphs.
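To make that failure case concrete, here is the arithmetic as a small sketch. It assumes the 0.25 cut-off from the quick function and takes 800 as a lower bound for the Latin count, so the real ratio would be at or below the value computed here.

```python
THRESHOLD = 0.25   # cut-off used in the quick function from the issue

kana_count = 181   # characters with East Asian Width "W"
latin_count = 800  # lower bound; the comment says "800+"

# Share of wide characters among all encoded characters.
ratio = kana_count / (kana_count + latin_count)
is_cjk = ratio >= THRESHOLD

print(f"wide ratio: {ratio:.3f}, classified as CJK: {is_cjk}")
```

The ratio lands below the threshold, so such a font would be reported as non-CJK even though it covers Kana.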
I agree with your points. It seems this issue, as you’re suggesting, is much larger than a simple function to determine whether a font is “CJK” or not.
I’m happy for us to use a custom profile if we’re not able to automate this. However, we should really try. It seems possible.
A check to ensure that unicode ranges are set correctly also sounds like a good idea.
Thanks Chris!