Plugin vocabulary / Multi-Language Support
How about multi-language support? The language could be made configurable in profile.yml or by using the locale module. But how do we translate the plugin vocabulary? I suppose that something like gettext can be applied to module.WORDS, but unfortunately the grammar is hardcoded in the modules, too.
A possible solution
Step 1: Using phrases instead of words
We could use a list of possible phrases instead of a list of words in each module. With this approach, whole phrases will be translated and thus the grammar will still be correct:
PHRASES = ['SWITCH LIGHTS OFF',
           'SWITCH LIGHTS ON']
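As a rough sketch of how whole phrases could go through gettext: the domain name 'lights_plugin' and the 'locale' directory below are made up for illustration, and fallback=True means the phrases simply pass through untranslated when no catalog is found.

```python
import gettext

PHRASES = ['SWITCH LIGHTS OFF',
           'SWITCH LIGHTS ON']

# Hypothetical domain and locale directory (assumptions, not part of the
# project). With fallback=True, gettext.translation() returns a
# NullTranslations object if no matching catalog exists, so the original
# phrases are returned unchanged.
translation = gettext.translation('lights_plugin', localedir='locale',
                                  languages=['de'], fallback=True)
_ = translation.gettext

# Each phrase is looked up as one unit, so the translator controls the grammar
TRANSLATED_PHRASES = [_(phrase) for phrase in PHRASES]
print(TRANSLATED_PHRASES)
```

Because every phrase is a single message ID, word order in the target language is entirely up to the translator.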
Step 2: Use variables in phrases
But what if I want to do something like:
'CHANGE MY BEDROOM LIGHTS COLOR TO BLUE'
The current (word-based) approach
With the current system, I would do something like this:
WORDS_LOCATION = ['BEDROOM', 'LIVINGROOM']
WORDS_COLOR = ['BLUE', 'YELLOW']
WORDS = ['CHANGE', 'MY', 'LIGHTS', 'COLOR', 'TO'] + WORDS_LOCATION + WORDS_COLOR
But unfortunately, this is not translatable and a pain to parse.
The phrase-based approach
But how to do that with phrases? Probably with str.format() placeholders:
import itertools
import string


def get_possible_phrases(base_phrases, **placeholder_values):
    # Sample implementation, there might be a better one
    phrases = []
    for base_phrase in base_phrases:
        # parse() yields (literal_text, field_name, format_spec, conversion);
        # field_name is None for trailing literal text, so skip those entries
        placeholders = [x[1] for x in string.Formatter().parse(base_phrase)
                        if x[1] is not None]
        factors = [placeholder_values[placeholder] for placeholder in placeholders]
        # Every combination of the possible values for this phrase's placeholders
        combinations = itertools.product(*factors)
        for combination in combinations:
            replacement_values = dict(zip(placeholders, combination))
            phrases.append(base_phrase.format(**replacement_values))
    return phrases


WORDS = {'location': ['BEDROOM', 'LIVINGROOM', 'BATHROOM'],
         'color': ['BLUE', 'YELLOW', 'RED', 'GREEN'],
         'state': ['ON', 'OFF']
         }
BASE_PHRASES = ['CHANGE MY {location} LIGHTS COLOR TO {color}',
                'SWITCH LIGHTS {state}']
PHRASES = get_possible_phrases(BASE_PHRASES, **WORDS)
for phrase in PHRASES:
    print(phrase)
Sample output
CHANGE MY BEDROOM LIGHTS COLOR TO BLUE
CHANGE MY BEDROOM LIGHTS COLOR TO YELLOW
CHANGE MY BEDROOM LIGHTS COLOR TO RED
CHANGE MY BEDROOM LIGHTS COLOR TO GREEN
CHANGE MY LIVINGROOM LIGHTS COLOR TO BLUE
CHANGE MY LIVINGROOM LIGHTS COLOR TO YELLOW
CHANGE MY LIVINGROOM LIGHTS COLOR TO RED
CHANGE MY LIVINGROOM LIGHTS COLOR TO GREEN
CHANGE MY BATHROOM LIGHTS COLOR TO BLUE
CHANGE MY BATHROOM LIGHTS COLOR TO YELLOW
CHANGE MY BATHROOM LIGHTS COLOR TO RED
CHANGE MY BATHROOM LIGHTS COLOR TO GREEN
SWITCH LIGHTS ON
SWITCH LIGHTS OFF
Step 3: How to parse?
First we need to transform the base phrases into something that can be matched against another string. Unfortunately, format strings are not matchable out of the box (at least I think so), but we can achieve that by using regexes.
Converting base phrases to regexes
import re
import string


def base_phrase_to_regex_pattern(base_phrase):
    # Sample implementation, I think that this can be improved, too
    # Skip entries without a field name (trailing literal text)
    placeholders = [x[1] for x in string.Formatter().parse(base_phrase)
                    if x[1] is not None]
    placeholder_values = {}
    for placeholder in placeholders:
        # Each placeholder becomes a named group that captures its value
        placeholder_values[placeholder] = '(?P<{}>.+)'.format(placeholder)
    regex_phrase = "^{}$".format(base_phrase.format(**placeholder_values))
    # re.LOCALE is omitted: Python 3 rejects it for str patterns
    pattern = re.compile(regex_phrase, re.UNICODE)
    return pattern
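One thing to keep in mind is that '.+' captures any text, including words that are not in the plugin vocabulary. As one possible refinement (my own sketch, not part of the proposal above), the placeholder groups could be limited to the known words. The helper name base_phrase_to_strict_pattern and its words argument are hypothetical:

```python
import re
import string

WORDS = {'location': ['BEDROOM', 'LIVINGROOM', 'BATHROOM'],
         'color': ['BLUE', 'YELLOW', 'RED', 'GREEN']}


def base_phrase_to_strict_pattern(base_phrase, words):
    # Hypothetical helper: builds the regex piece by piece so that literal
    # text is escaped and each placeholder only accepts its known words.
    parts = []
    for literal, field, _spec, _conv in string.Formatter().parse(base_phrase):
        parts.append(re.escape(literal))
        if field is not None:
            alternatives = '|'.join(re.escape(word) for word in words[field])
            parts.append('(?P<{}>{})'.format(field, alternatives))
    return re.compile('^{}$'.format(''.join(parts)))


strict = base_phrase_to_strict_pattern(
    'CHANGE MY {location} LIGHTS COLOR TO {color}', WORDS)

known = strict.match('CHANGE MY BEDROOM LIGHTS COLOR TO BLUE')
unknown = strict.match('CHANGE MY GARAGE LIGHTS COLOR TO PINK')
print(known.groupdict())
print(unknown)
```

Whether rejecting unknown words at this stage is desirable probably depends on how noisy the speech-to-text output is, so treat this as an option rather than a recommendation.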
Matching input phrases against regex phrases
Now we can match our phrase against the regex phrases and even extract the interesting values from them:
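# Assumed setup: compile every base phrase once, up front
REGEX_PHRASES = [base_phrase_to_regex_pattern(base_phrase)
                 for base_phrase in BASE_PHRASES]


def match_phrase(phrase):
    # Return the first match object, or None if no pattern matches
    for pattern in REGEX_PHRASES:
        matchobj = pattern.match(phrase)
        if matchobj:
            return matchobj
    return None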
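To illustrate how the extracted values come out, here is a small self-contained sketch that repeats the two functions in condensed form and reads the named groups via groupdict(). The input phrases are just examples.

```python
import re
import string

BASE_PHRASES = ['CHANGE MY {location} LIGHTS COLOR TO {color}',
                'SWITCH LIGHTS {state}']


def base_phrase_to_regex_pattern(base_phrase):
    # Turn every {placeholder} into a named capturing group
    fields = [field for _, field, _, _ in string.Formatter().parse(base_phrase)
              if field is not None]
    groups = {field: '(?P<{}>.+)'.format(field) for field in fields}
    return re.compile('^{}$'.format(base_phrase.format(**groups)))


REGEX_PHRASES = [base_phrase_to_regex_pattern(p) for p in BASE_PHRASES]


def match_phrase(phrase):
    for pattern in REGEX_PHRASES:
        matchobj = pattern.match(phrase)
        if matchobj:
            return matchobj
    return None


matched = match_phrase('CHANGE MY BEDROOM LIGHTS COLOR TO BLUE')
print(matched.groupdict())   # named groups as a dict
print(match_phrase('WHAT TIME IS IT'))   # no pattern matches
```

The first call yields the placeholder values keyed by name, which is what a plugin would use to decide which lights to change and to which color.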
Step 4: Getting back from regex to base phrase
This is fairly easy: just match the regex on the base phrases.
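A minimal sketch of that idea: the compiled pattern is available on the match object as matchobj.re, and since '.+' also matches the literal text '{state}', the pattern matches its own base phrase. The function name find_base_phrase is hypothetical.

```python
import re
import string

BASE_PHRASES = ['CHANGE MY {location} LIGHTS COLOR TO {color}',
                'SWITCH LIGHTS {state}']


def base_phrase_to_regex_pattern(base_phrase):
    fields = [field for _, field, _, _ in string.Formatter().parse(base_phrase)
              if field is not None]
    groups = {field: '(?P<{}>.+)'.format(field) for field in fields}
    return re.compile('^{}$'.format(base_phrase.format(**groups)))


def find_base_phrase(pattern, base_phrases):
    # Hypothetical helper: the base phrase is the one the pattern matches
    for base_phrase in base_phrases:
        if pattern.match(base_phrase):
            return base_phrase
    return None


pattern = base_phrase_to_regex_pattern('SWITCH LIGHTS {state}')
matchobj = pattern.match('SWITCH LIGHTS ON')

# matchobj.re is the compiled pattern that produced the match
base_phrase = find_base_phrase(matchobj.re, BASE_PHRASES)
print(base_phrase)
```

Keeping a dict from compiled pattern to base phrase would avoid the second round of matching, but the lookup above stays closest to the step as described.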
Step 5: Connecting actions to matched phrases
We just replace the list BASE_PHRASES with a list ACTIONS that contains tuples (base_phrase, action), where action is a callable object (function, etc.). Of course, the functions above need to be changed accordingly.
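Roughly, the dispatch could look like the sketch below. The action functions, the COMPILED_ACTIONS list and the handle() function are made-up names for illustration, and passing the named groups as keyword arguments is just one way to wire it up.

```python
import re
import string


def base_phrase_to_regex_pattern(base_phrase):
    fields = [field for _, field, _, _ in string.Formatter().parse(base_phrase)
              if field is not None]
    groups = {field: '(?P<{}>.+)'.format(field) for field in fields}
    return re.compile('^{}$'.format(base_phrase.format(**groups)))


# Hypothetical actions; a real plugin would talk to the lights here
def change_color(location, color):
    return 'Setting {} lights to {}'.format(location, color)


def switch_lights(state):
    return 'Turning lights {}'.format(state)


ACTIONS = [('CHANGE MY {location} LIGHTS COLOR TO {color}', change_color),
           ('SWITCH LIGHTS {state}', switch_lights)]

# Compile each base phrase once and keep it next to its action
COMPILED_ACTIONS = [(base_phrase_to_regex_pattern(base_phrase), action)
                    for base_phrase, action in ACTIONS]


def handle(phrase):
    # Call the first action whose pattern matches, passing the
    # placeholder values as keyword arguments
    for pattern, action in COMPILED_ACTIONS:
        matchobj = pattern.match(phrase)
        if matchobj:
            return action(**matchobj.groupdict())
    return None


print(handle('SWITCH LIGHTS OFF'))
print(handle('CHANGE MY BATHROOM LIGHTS COLOR TO RED'))
```

With this layout the placeholder names in a base phrase have to match the parameter names of its action, which keeps translated phrases and code in sync as long as translators leave the placeholders untouched.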
Step 6: A working example
I provided a proof-of-concept implementation here.
Conclusion
In my opinion, this would not only give plugin developers an easy way to parse input, but also offer the chance to translate phrases and implement support for different languages. It also makes it possible to parse the base phrases in a way that lets us generate a grammar-based language model (I’m not an expert, but I think so). The big con is the performance penalty caused by the regex matching, but I think it’s worth it.
What do you think?
And to be able to run the compile_translations.sh script, you need to install gettext:
sudo apt-get install gettext
You’ll probably also get a 403 error from Google Translate. To fix that, install the latest version of gTTS:
sudo pip install --upgrade gTTS
After that everything should work fine 😃
Hi, I realized I forgot something. When you change or add language .po files, you need to run the compile_translations.sh script. After that it works fine.