Plugin vocabulary / Multi-Language Support

How about multi-language support? Language could be made configurable in profile.yml or by using the locale module. But how to translate the plugin vocabulary?

I suppose that something like gettext can be applied to module.WORDS, but unfortunately, the grammar is hardcoded in modules, too.

A possible solution

Step 1: Using phrases instead of words

We could use a list of possible phrases instead of a list of words in each module. With this approach, whole phrases will be translated and thus the grammar will still be correct:

PHRASES = ['SWITCH LIGHTS OFF',
           'SWITCH LIGHTS ON']
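
A minimal sketch of how whole phrases could be passed through gettext, assuming a hypothetical translation domain named `lights` and a hypothetical `locale` directory holding the compiled catalogs (neither name comes from the project). With `fallback=True`, the untranslated phrases are returned when no catalog is found:

```python
import gettext

# Hypothetical domain and directory names, chosen only for illustration.
translation = gettext.translation('lights', localedir='locale',
                                  languages=['de'], fallback=True)
_ = translation.gettext

# Each phrase is translated as a whole, so the translator controls the grammar.
PHRASES = [_('SWITCH LIGHTS OFF'),
           _('SWITCH LIGHTS ON')]

for phrase in PHRASES:
    print(phrase)
```

Because the lookup key is the complete phrase, word order and inflection stay in the hands of whoever writes the catalog entry.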

Step 2: Use variables in phrases

But what if I want to do something like:

'CHANGE MY BEDROOM LIGHTS COLOR TO BLUE'

The current (word-based) approach

With the current system, I would do something like this:

WORDS_LOCATION = ['BEDROOM', 'LIVINGROOM']
WORDS_COLOR = ['BLUE','YELLOW']
WORDS = ['CHANGE', 'MY', 'LIGHTS', 'COLOR' , 'TO'] + WORDS_LOCATION + WORDS_COLOR

But unfortunately, this is not translatable and is a pain to parse.

The phrase-based approach

But how to do that with phrases? Probably with str.format() placeholders:

import itertools
import string

def get_possible_phrases(base_phrases, **placeholder_values):
    # Sample implementation, there might be a better one
    phrases = []
    for base_phrase in base_phrases:
        # Skip literal-only segments, for which the field name is None
        placeholders = [x[1] for x in string.Formatter().parse(base_phrase)
                        if x[1] is not None]
        factors = [placeholder_values[placeholder] for placeholder in placeholders]
        combinations = itertools.product(*factors)
        for combination in combinations:
            replacement_values = dict(zip(placeholders,combination))
            phrases.append(base_phrase.format(**replacement_values))
    return phrases

WORDS = {'location': ['BEDROOM', 'LIVINGROOM','BATHROOM'],
         'color': ['BLUE','YELLOW','RED', 'GREEN'],
         'state': ['ON','OFF']
        }
BASE_PHRASES = ['CHANGE MY {location} LIGHTS COLOR TO {color}',
                'SWITCH LIGHTS {state}']
PHRASES = get_possible_phrases(BASE_PHRASES, **WORDS)

for phrase in PHRASES:
    print(phrase)

Sample output

CHANGE MY BEDROOM LIGHTS COLOR TO BLUE
CHANGE MY BEDROOM LIGHTS COLOR TO YELLOW
CHANGE MY BEDROOM LIGHTS COLOR TO RED
CHANGE MY BEDROOM LIGHTS COLOR TO GREEN
CHANGE MY LIVINGROOM LIGHTS COLOR TO BLUE
CHANGE MY LIVINGROOM LIGHTS COLOR TO YELLOW
CHANGE MY LIVINGROOM LIGHTS COLOR TO RED
CHANGE MY LIVINGROOM LIGHTS COLOR TO GREEN
CHANGE MY BATHROOM LIGHTS COLOR TO BLUE
CHANGE MY BATHROOM LIGHTS COLOR TO YELLOW
CHANGE MY BATHROOM LIGHTS COLOR TO RED
CHANGE MY BATHROOM LIGHTS COLOR TO GREEN
SWITCH LIGHTS ON
SWITCH LIGHTS OFF

Step 3: How to parse?

First, we need to transform the base phrases into something that can be matched against another string. Unfortunately, format strings are not matchable out of the box (at least I think so), but we can achieve that by using regexes.

Converting base phrases to regexes

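import re
import string

def base_phrase_to_regex_pattern(base_phrase):
    # Sample implementation, I think that this can be improved, too
    # Skip literal-only segments, for which the field name is None
    placeholders = [x[1] for x in string.Formatter().parse(base_phrase)
                    if x[1] is not None]
    placeholder_values = {}
    for placeholder in placeholders:
        # Each placeholder becomes a named group that captures its value
        placeholder_values[placeholder] = '(?P<{}>.+)'.format(placeholder)
    regex_phrase = "^{}$".format(base_phrase.format(**placeholder_values))
    # re.LOCALE is only valid for bytes patterns on Python 3.6+, so it is omitted
    pattern = re.compile(regex_phrase, re.UNICODE)
    return pattern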
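
One possible improvement, sketched here under my own assumptions: build the regex segment by segment, escape the literal text so that characters like `?` are not read as regex syntax, and optionally restrict each placeholder to its known values. The function name `base_phrase_to_safe_pattern` and the `words` parameter are hypothetical, and the sketch assumes each placeholder appears only once per phrase:

```python
import re
import string

def base_phrase_to_safe_pattern(base_phrase, words=None):
    # words: optional dict mapping placeholder name -> list of allowed values
    parts = []
    for literal, field, _spec, _conv in string.Formatter().parse(base_phrase):
        # Literal text is escaped so it can only match itself
        parts.append(re.escape(literal))
        if field is not None:
            choices = (words or {}).get(field)
            if choices:
                group = '|'.join(re.escape(choice) for choice in choices)
            else:
                group = '.+'  # unknown placeholder: accept anything
            parts.append('(?P<{}>{})'.format(field, group))
    return re.compile('^' + ''.join(parts) + '$')

pattern = base_phrase_to_safe_pattern('SWITCH LIGHTS {state}?',
                                      {'state': ['ON', 'OFF']})
match = pattern.match('SWITCH LIGHTS ON?')
print(match.groupdict())
```

Restricting a placeholder to its word list means an input with an unknown value is rejected at match time instead of being passed on to the plugin.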

Matching input phrases against regex phrases

Now we can match our phrase against the regex phrases and even extract the interesting values from them:

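# Assumed setup: compile one pattern per base phrase, once, up front
REGEX_PHRASES = [base_phrase_to_regex_pattern(base_phrase)
                 for base_phrase in BASE_PHRASES]

def match_phrase(phrase):
    # Return the first match object, or None if no pattern matches
    for pattern in REGEX_PHRASES:
        matchobj = pattern.match(phrase)
        if matchobj:
            return matchobj
    return None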

Step 4: Getting back from regex to base phrase

This is fairly easy: just match the regex on the base phrases.
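
A rough sketch of that lookup, assuming placeholders are simple `{name}` fields. The helper names `to_pattern` and `find_base_phrase` are made up for this example, and `to_pattern` is a compact stand-in for the conversion function:

```python
import re

BASE_PHRASES = ['CHANGE MY {location} LIGHTS COLOR TO {color}',
                'SWITCH LIGHTS {state}']

def to_pattern(base_phrase):
    # Turn every {name} field into a named group that accepts any text
    regex = re.sub(r'\{(\w+)\}', r'(?P<\1>.+)', base_phrase)
    return re.compile('^{}$'.format(regex))

def find_base_phrase(pattern, base_phrases):
    # The pattern matches its own base phrase, because '.+' also
    # accepts the literal text '{location}', '{color}', and so on
    for base_phrase in base_phrases:
        if pattern.match(base_phrase):
            return base_phrase
    return None

pattern = to_pattern(BASE_PHRASES[1])
print(find_base_phrase(pattern, BASE_PHRASES))
```

This can become ambiguous when one base phrase is a special case of another, so storing the base phrase next to its compiled pattern may be the simpler option.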

Step 5: Connecting actions to matched phrases

We just replace the list BASE_PHRASES with a list ACTIONS that contains tuples (base_phrase, action), where action is actually a callable object (function, etc.). Of course, the above methods need to be changed accordingly.
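
Something like the following sketch could work. The action functions, `to_pattern`, `COMPILED_ACTIONS` and `handle` are hypothetical names for illustration only, and the actions just return a string instead of controlling any lights:

```python
import re

def change_color(location, color):
    return 'Setting {} lights to {}'.format(location, color)

def switch_lights(state):
    return 'Switching lights {}'.format(state)

# Tuples of (base_phrase, action), as described above
ACTIONS = [('CHANGE MY {location} LIGHTS COLOR TO {color}', change_color),
           ('SWITCH LIGHTS {state}', switch_lights)]

def to_pattern(base_phrase):
    # Compact stand-in for the conversion step; assumes simple {name} fields
    regex = re.sub(r'\{(\w+)\}', r'(?P<\1>.+)', base_phrase)
    return re.compile('^{}$'.format(regex))

# Compile once, keeping each action next to its pattern
COMPILED_ACTIONS = [(to_pattern(base_phrase), action)
                    for base_phrase, action in ACTIONS]

def handle(phrase):
    for pattern, action in COMPILED_ACTIONS:
        matchobj = pattern.match(phrase)
        if matchobj:
            # Named groups become keyword arguments of the action
            return action(**matchobj.groupdict())
    return None

print(handle('CHANGE MY BATHROOM LIGHTS COLOR TO RED'))
```

Passing the named groups as keyword arguments means the placeholder names in the base phrase have to match the parameter names of the action.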

Step 6: A working example

I provided a proof-of-concept implementation here.

Conclusion

In my opinion, this would not only let plugin developers parse input easily, but also offer the chance to translate phrases and implement support for different languages. It also makes it possible to parse the base phrases so that we can generate a grammar-based language model (I’m not an expert, but I think so). The big con is the performance penalty from the regex matching, but I think it’s worth it.

What do you think?

Issue Analytics

  • State: open
  • Created 9 years ago
  • Reactions: 1
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
rbravenboer commented, Jan 30, 2016

And to be able to run the compile_translations.sh script, you need to install gettext: sudo apt-get install gettext

You’ll probably also get a 403 error from Google Translate. To fix that, install the latest version of gTTS: sudo pip install --upgrade gTTS

After that everything should work fine 😃

1 reaction
rbravenboer commented, Jan 30, 2016

Hi, I realized I forgot something. When you change or add language .po files, you need to run the compile_translations.sh script. After that it works fine.
