[Feature Request] Support for Chinese (and possibly other languages such as Japanese)
I recently discovered this keyboard alternative and love it because it’s libre/open source 😃
I was wondering if ASK could be used to input languages like Chinese or Japanese.
For Chinese, there are different ways of entering text. A popular way is to type the Hanyu Pinyin (a standardized transliteration system) of the characters/words that you want, and then select the words from a list.
For example: I type “wo” and I can select characters such as 我, 沃, 卧, 窩, … (all pronounced “wo”).
I had a look at the source code to see how new language packages are created, but it looks like it’s based on matching letters (e.g. if I start typing “fr”, I’ll have access to words in the dictionary starting with/containing “fr”).
That means for languages such as Chinese, we would need an intermediate state in the dictionary or somewhere else to make the connection between:
- what the user types (e.g. “fagu”),
- what pinyin it can match (e.g. “faguo”),
- and therefore what Chinese character(s) it could match (e.g. “法國”, “faguo”).
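To make the three steps above concrete, here is a minimal sketch in Python. The dictionary contents and the names `PINYIN_TO_WORDS` and `candidates` are hypothetical and only illustrate the idea; this is not how ASK stores or queries its dictionaries.

```python
# Hypothetical mini-dictionary: toneless pinyin -> candidate words.
# A real language pack would hold many thousands of entries.
PINYIN_TO_WORDS = {
    "faguo": ["法國"],
    "wo": ["我", "沃", "卧", "窩"],
}


def candidates(typed):
    """Return (pinyin, word) pairs whose pinyin starts with the typed letters."""
    typed = typed.lower()
    results = []
    for pinyin in sorted(PINYIN_TO_WORDS):
        # Step 1 -> 2: match the typed letters against known pinyin.
        if pinyin.startswith(typed):
            # Step 2 -> 3: map the matched pinyin to its characters.
            for word in PINYIN_TO_WORDS[pinyin]:
                results.append((pinyin, word))
    return results


print(candidates("fagu"))  # [('faguo', '法國')]
```

A linear scan is enough for a sketch; a real implementation would more likely use a trie or a sorted index so that prefix lookups stay fast on a large dictionary.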
There are a lot of other subtleties (see below), but what I described above is the core of the issue.
Could a language pack be created for this kind of language? Would it require changes in ASK itself?
Cheers and thanks again!
Other subtleties to take into account
Written and spoken Chinese
(I will simplify for the sake of brevity; if you are a Chinese speaker, please bear with me!)
There are two forms of written Chinese:
- Simplified Chinese (used in China); for instance, 法国 is the simplified Chinese for “France”
- Traditional Chinese (used in Taiwan and Hong Kong); 法國 is the Traditional Chinese for “France”
That should not be a problem if ASK can have two different dictionaries.
On top of that, there are two main spoken Chinese variants:
- Mandarin (used in China and Taiwan)
- Cantonese (used in Hong Kong and Macau)
Romanization and other transliteration methods
In order to simplify the learning process, China started using a romanization method called hanyu pinyin. Each character can be romanized with a syllable and a tone.
For instance, the word “France” in Mandarin can be romanized fǎ guó (that is: fa with the 3rd tone and guo with the 2nd tone).
In Taiwan, hanyu pinyin is not used in schools; instead, children learn Chinese using the bopomofo (zhuyin) method, which uses a set of phonetic symbols. Using this system, “France” in Mandarin can be written ㄈㄚˇ ㄍㄨㄛˊ.
The good news is that there are automated methods to turn a hanyu pinyin romanization into bopomofo and vice versa.
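As a rough illustration of such a conversion, the sketch below maps a pinyin syllable with a trailing tone digit (for example fa3 or guo2) to bopomofo. The tables are deliberately partial and the function name `pinyin_to_bopomofo` is made up for this example; a complete converter also has to handle cases such as zh/ch/sh, y/w spellings, ü and the neutral tone.

```python
# Partial tables only: enough to show the idea, not a full converter.
INITIALS = {
    "b": "ㄅ", "p": "ㄆ", "m": "ㄇ", "f": "ㄈ",
    "d": "ㄉ", "t": "ㄊ", "n": "ㄋ", "l": "ㄌ",
    "g": "ㄍ", "k": "ㄎ", "h": "ㄏ",
}
FINALS = {
    "a": "ㄚ", "o": "ㄛ", "e": "ㄜ", "ai": "ㄞ", "ei": "ㄟ",
    "ao": "ㄠ", "ou": "ㄡ", "an": "ㄢ", "en": "ㄣ",
    "ang": "ㄤ", "eng": "ㄥ", "u": "ㄨ", "uo": "ㄨㄛ",
}
# In bopomofo the 1st tone is usually left unmarked.
TONES = {"1": "", "2": "ˊ", "3": "ˇ", "4": "ˋ"}


def pinyin_to_bopomofo(syllable):
    """Convert one syllable such as 'fa3' using the partial tables above."""
    body, tone = syllable[:-1].lower(), syllable[-1]
    if tone not in TONES or len(body) < 2:
        raise ValueError(f"unsupported syllable: {syllable!r}")
    initial, final = body[0], body[1:]
    if initial not in INITIALS or final not in FINALS:
        raise ValueError(f"unsupported syllable: {syllable!r}")
    return INITIALS[initial] + FINALS[final] + TONES[tone]


print(" ".join(pinyin_to_bopomofo(s) for s in ["fa3", "guo2"]))  # ㄈㄚˇ ㄍㄨㄛˊ
```

Because both systems describe the same sounds, the reverse direction can be built from the same tables by inverting them.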
I am not sure how Cantonese is romanized.
Input methods
Typing Chinese on an electronic device has proven complicated for a long time. There are many, many ways of typing Chinese.
For hanyu pinyin, since tone marks like ˇ or ˊ are not readily available on a qwerty keyboard, they are replaced with their associated number: fǎ guó then becomes fa3guo2. But because typing the tones all the time can be quite laborious, many input methods accept the form faguo and will present you with a list of Chinese characters or words containing these syllables (not necessarily with the right tones). It’s then up to the user to select the character/word they want. Some methods guess the word based on the context or on the previous/following character.
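The tone-optional lookup described above could be sketched as follows. The `ENTRIES` table is a tiny hand-written sample, and `strip_tones` and `lookup` are hypothetical helper names, not part of any existing input method.

```python
import re
from collections import defaultdict

# Hand-written sample: pinyin with tone digit -> characters.
ENTRIES = {
    "wo3": ["我"],
    "wo4": ["沃", "卧"],
    "wo1": ["窩"],
}


def strip_tones(pinyin):
    """Remove tone digits, e.g. 'fa3guo2' -> 'faguo'."""
    return re.sub(r"[1-5]", "", pinyin)


# Second index keyed by toneless pinyin, built once from ENTRIES.
TONELESS_INDEX = defaultdict(list)
for key, chars in ENTRIES.items():
    TONELESS_INDEX[strip_tones(key)].extend(chars)


def lookup(typed):
    """Exact match if the user typed a tone digit, toneless match otherwise."""
    if re.search(r"[1-5]", typed):
        return list(ENTRIES.get(typed, []))
    return list(TONELESS_INDEX.get(typed, []))


print(strip_tones("fa3guo2"))  # faguo
print(lookup("wo4"))           # ['沃', '卧']
print(lookup("wo"))            # ['我', '沃', '卧', '窩']
```

Typing the tone narrows the list, while leaving it out returns every character that shares the syllable; ranking those candidates by frequency or context is a separate problem not covered here.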
The process is very similar in bopomofo.
Dictionaries
Fortunately for us, there are some very good dictionaries such as CC-CEDICT, which is available under a CC BY-SA license and includes both traditional and simplified Chinese characters as well as the Mandarin pronunciation in hanyu pinyin (see their syntax wiki page for more info).
That means we could create a dictionary for our needs, based on this one, that would include everything needed to type a word (using either pinyin or bopomofo method) and get the associated character(s) (either in its simplified or traditional form).
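As a starting point, here is a minimal sketch of reading one CC-CEDICT entry, assuming each line follows the layout "Traditional Simplified [pin1 yin1] /gloss/gloss/" and that comment lines start with "#". The sample line is written by hand for illustration, and `parse_cedict_line` is a hypothetical helper name.

```python
import re

# Assumed entry layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
LINE_RE = re.compile(r"^(\S+) (\S+) \[([^\]]+)\] /(.+)/$")


def parse_cedict_line(line):
    """Parse one dictionary line; return None for comments or unparsable lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = LINE_RE.match(line)
    if match is None:
        return None
    traditional, simplified, pinyin, glosses = match.groups()
    return {
        "traditional": traditional,
        "simplified": simplified,
        # Lower-case the syllables so 'Fa3' and 'fa3' index the same way.
        "pinyin": pinyin.lower().split(),
        "glosses": glosses.split("/"),
    }


entry = parse_cedict_line("法國 法国 [Fa3 guo2] /France/French/")
print(entry["traditional"], entry["simplified"], entry["pinyin"])
```

From records like this, a language pack could build one index per script (simplified or traditional) and per input scheme (pinyin or bopomofo).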
Top GitHub Comments
I don’t have a lot of spare time to work on this myself at the moment; however, I can help provide a lot of language support for anyone willing to work on it. I travel frequently between different regions of China, Hong Kong, and Taiwan, and use 3-4 dialects on a day-to-day basis, so I can provide a lot of insight into the practical user experience and the expectations we have of a keyboard.
Note: It might not be necessary to reinvent the wheel, since much of this has already been developed by RIME: https://github.com/rime