Unassigned/non-standard (compound) language and dialect codes

Wiktionary has entries for several languages and dialects with unofficial codes we can’t scrape. Some examples of these include

possibly among others. The first part of the code denotes a valid ISO 639-3 language group, while the second part looks like a temporary assignment.

This issue is not a bug. It is intended simply for bookkeeping purposes. I suppose it is not related to #329.

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

agutkin commented, Jun 28, 2021

Yes, precisely.

agutkin commented, Jun 25, 2021

Looking at unmatched_languages.json it turns out that the Wiktionary language codes are rather systematically constructed.

The ones that are probably most problematic (in terms of the work involved to support them) are the *-proto languages, but the remaining five or six are probably reasonably easy to support. I guess what we have here is an edge case where the Wiktionary code maps to a non-existent compound ISO code: the first part has to be a valid ISO language group name and should be verifiable, while the second part can come from the configuration file.
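
To make that idea concrete, here is a minimal sketch of how such a compound code could be checked. It is only an illustration under assumptions: the function name resolve_compound_code, the valid_groups set, and the suffix_config mapping are all hypothetical and are not part of the project's code or configuration format. The first part is verified against a caller-supplied set of known group codes, and the second part is looked up in a mapping that stands in for the configuration file.

```python
from typing import Mapping, Optional, Set, Tuple


def resolve_compound_code(
    code: str,
    valid_groups: Set[str],
    suffix_config: Mapping[str, str],
) -> Optional[Tuple[str, str]]:
    """Split a compound code of the form "<group>-<suffix>".

    Hypothetical helper: returns (group, label) when the first part is a
    known group code and the second part has an entry in the configuration
    mapping; returns None otherwise.
    """
    # Split on the first hyphen only; sep is "" when there is no hyphen.
    group, sep, suffix = code.partition("-")
    if not sep or not group or not suffix:
        return None  # not a compound code at all
    if group not in valid_groups:
        return None  # first part cannot be verified
    label = suffix_config.get(suffix)
    if label is None:
        return None  # second part is not configured
    return group, label


if __name__ == "__main__":
    # Illustrative values only; they are assumptions, not taken from the issue.
    groups = {"gem"}
    config = {"pro": "proto"}
    print(resolve_compound_code("gem-pro", groups, config))
    print(resolve_compound_code("zzz-pro", groups, config))
```

Keeping the verifiable part (the group) separate from the configurable part (the suffix) means a new suffix only needs a new configuration entry, not a code change.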
