i18n brainstorming

We’ve somewhat glossed over the problem of internationalisation up till now. Frankly this is something SvelteKit isn’t currently very good at. I’m starting to think about how to internationalise/localise https://svelte.dev, to see which parts can be solved in userland and which can’t.

(For anyone unfamiliar: ‘Internationalisation’ or i18n refers to the process of making an app language agnostic; ‘localisation’ or l10n refers to the process of creating individual translations.)

This isn’t an area I have a lot of experience in, so if anyone wants to chime in — particularly non-native English speakers and people who have dealt with these problems! — please do.

Where we’re currently at: the best we can really do is put everything inside src/routes/[lang] and use the lang param in preload to load localisations (an exercise left to the reader, albeit a fairly straightforward one). This works, but leaves a few problems unsolved.

I think we can do a lot better. I’m prepared to suggest that SvelteKit should be a little opinionated here rather than abdicating responsibility to things like i18next, since we can make guarantees that a general-purpose framework can’t, and can potentially do interesting compile-time things that are out of reach for other projects. But I’m under no illusions about how complex i18n can be, which is why I’m anxious for community input. (I recently discovered that a file modified two days ago will be labeled ‘avant-hier’ on macOS if your language is set to French; most languages don’t even have a comparable phrase. How on earth do you do that sort of thing programmatically?!)


Language detection/URL structure

Some websites make the current language explicit in the pathname, e.g. https://example.com/es/foo or https://example.com/zh/foo. Sometimes the default is explicit (https://example.com/en/foo), sometimes it’s implicit (https://example.com/foo). Others (e.g. Wikipedia) use a subdomain, like https://cy.example.com. Still others (Amazon) don’t make the language visible, but store it in a cookie.

Having the language expressed in the URL seems like the best way to make the user’s preference unambiguous. I prefer /en/foo to /foo since it’s explicit, easier to implement, and doesn’t make other languages second-class citizens. If you’re using subdomains then you’re probably running separate instances of an app, which means it’s not SvelteKit’s problem.

There still needs to be a way to detect language if someone lands on /. I believe the most reliable way to detect a user’s language preference on the server is the Accept-Language header (please correct me if necessary). Maybe this could automatically redirect to a supported localisation (see next section).
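To make the Accept-Language idea concrete, here is a minimal sketch of the negotiation step. The `pickLocale` name, the `supported` list and the `'en'` fallback are all assumptions for illustration, not anything SvelteKit provides. It ranks the header's entries by their `q` weight and falls back from a regional tag such as `fr-CA` to its base language.

```javascript
// Hypothetical helper: choose the best supported locale for an Accept-Language header.
// `supported` would be derived from the locales the app ships; `fallback` is an assumption.
function pickLocale(header, supported, fallback = 'en') {
  const ranked = (header || '')
    .split(',')
    .map((part) => {
      const [tag, ...params] = part.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const weight = q ? parseFloat(q.slice(2)) : 1; // no q parameter means q=1
      return { tag: tag.toLowerCase(), weight: Number.isNaN(weight) ? 0 : weight };
    })
    .filter((entry) => entry.tag && entry.weight > 0) // q=0 means "not acceptable"
    .sort((a, b) => b.weight - a.weight);

  for (const { tag } of ranked) {
    if (supported.includes(tag)) return tag;
    const base = tag.split('-')[0]; // 'fr-ca' falls back to 'fr'
    if (supported.includes(base)) return base;
  }
  return fallback;
}
```

A request for / could then be redirected to `'/' + pickLocale(header, supported)`, where `header` is whatever the server exposes for Accept-Language.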

Supported localisations

It’s useful for SvelteKit to know at build time which localisations are supported. This could perhaps be achieved by having a locales folder (configurable, obviously) in the project root:

locales
|- de.json
|- en.json
|- fr.json
|- ru.json
src
|- routes
|- ...

Single-language apps could simply omit this folder, and behave as they currently do.

lang attribute

The <html> element should ideally have a lang attribute. If SvelteKit has i18n built in, we could achieve this the same way we inject other variables into src/template.html:

<html lang="%svelte.lang%">
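As a rough sketch of what that substitution could look like at render time (the `injectLang` function is hypothetical, and only the `%svelte.lang%` placeholder comes from the proposal above):

```javascript
// Hypothetical render step: fill the %svelte.lang% placeholder in the HTML template.
function injectLang(template, lang) {
  // Accept only letters, digits and hyphens, the characters a language tag can
  // contain, so the value is always safe inside an attribute.
  if (!/^[A-Za-z0-9-]+$/.test(lang)) {
    throw new Error(`Invalid language tag: ${lang}`);
  }
  return template.replace(/%svelte\.lang%/g, lang);
}
```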

Localised URLs

If we have localisations available at build time, we can localise URLs themselves. For example, you could have /en/meet-the-team and /de/triff-das-team without having to use a [parameter] in the route filename. One way we could do this is by encasing localisation keys in curlies:

src
|- routes
   |- index.svelte
   |- {meet_the_team}.svelte

In theory, we could generate a different route manifest for each supported language, so that English-speaking users would get a manifest with this…

{
  // index.svelte
  pattern: /^\/en\/?$/,
  parts: [...]
},

{
  // {meet_the_team}.svelte
  pattern: /^\/en\/meet-the-team\/?$/,
  parts: [...]
}

…while German-speaking users download this instead:

{
  // index.svelte
  pattern: /^\/de\/?$/,
  parts: [...]
},

{
  // {meet_the_team}.svelte
  pattern: /^\/de\/triff-das-team\/?$/,
  parts: [...]
}
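Here is one way the per-language patterns could be generated, as a sketch only. The `slugs` table stands in for whatever the locale files would contain, and `localisedPattern` is a made-up name; the real manifest generation would have to handle [parameter] segments and nested routes too.

```javascript
// Hypothetical build step: turn a route filename containing {key} segments
// into one URL pattern per locale. `slugs` stands in for the locale files.
const slugs = {
  en: { meet_the_team: 'meet-the-team' },
  de: { meet_the_team: 'triff-das-team' }
};

// Escape characters that have a special meaning inside a regular expression.
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

function localisedPattern(file, locale) {
  const name = file.replace(/\.svelte$/, '');
  const segments = name === 'index' ? [] : name.split('/');
  const translated = segments.map((segment) => {
    const match = /^\{(\w+)\}$/.exec(segment);
    if (!match) return escapeRegex(segment); // ordinary segment, kept as-is
    const slug = slugs[locale][match[1]];
    if (slug === undefined) {
      throw new Error(`Missing slug "${match[1]}" for locale "${locale}"`);
    }
    return escapeRegex(slug);
  });
  // Every pattern starts with the locale prefix and allows a trailing slash.
  return new RegExp('^\\/' + [locale, ...translated].join('\\/') + '\\/?$');
}
```

Throwing on a missing slug means an incomplete localisation fails at build time rather than producing a dead route.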

Localisation in components

I think the best way to make the translations themselves available inside components is to use a store:

<script>
  import { t } from '$app/stores';
</script>

<h1>{$t.hello_world}</h1>

Then, if you’ve got files like these…

// locales/en.json
{ "hello_world": "Hello world" }
// locales/fr.json
{ "hello_world": "Bonjour le monde" }

…SvelteKit can load them as necessary and coordinate everything. There’s probably a commonly-used format for things like this as well — something like "Willkommen zurück, $1":

<p>{$t.welcome_back(name)}</p>
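A sketch of how a `$1`-style string could become something callable. The `compileMessages` function is an assumption, and the `$1` syntax is just the placeholder format floated above rather than an established standard. Strings with placeholders turn into functions; plain strings are left alone.

```javascript
// Hypothetical compile step: messages containing $1, $2, ... become functions,
// plain messages stay strings.
function compileMessages(dictionary) {
  const compiled = {};
  for (const [key, text] of Object.entries(dictionary)) {
    compiled[key] = /\$\d+/.test(text)
      ? (...args) =>
          text.replace(/\$(\d+)/g, (whole, n) => {
            const value = args[Number(n) - 1]; // $1 is the first argument
            return value === undefined ? whole : String(value);
          })
      : text;
  }
  return compiled;
}
```

With `{ "welcome_back": "Willkommen zurück, $1" }` in the German file, `compileMessages(de).welcome_back(name)` would fill in the name, and a placeholder with no matching argument is left visible instead of silently disappearing.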

(In development, we could potentially do all sorts of fun stuff like making $t be a proxy that warns us if a particular translation is missing, or tracks which translations are unused.)
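The proxy idea might look roughly like this. It is a sketch with invented names (`trackTranslations`, `unused`), wrapping a plain dictionary rather than a real store:

```javascript
// Hypothetical dev-mode wrapper: warn on missing keys and remember which keys were read.
function trackTranslations(dictionary, warn = console.warn) {
  const used = new Set();
  const t = new Proxy(dictionary, {
    get(target, key) {
      if (typeof key !== 'string') return target[key]; // ignore symbol lookups
      if (!Object.prototype.hasOwnProperty.call(target, key)) {
        warn(`Missing translation: ${key}`);
        return key; // render the key itself so the gap is visible on the page
      }
      used.add(key);
      return target[key];
    }
  });
  // Keys that exist in the dictionary but have never been read.
  const unused = () => Object.keys(dictionary).filter((key) => !used.has(key));
  return { t, unused };
}
```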

Route-scoped localisations

We probably wouldn’t want to put all the localisations in locales/xx.json — just the stuff that’s needed globally. Perhaps we could have something like this:

locales
|- de.json
|- en.json
|- fr.json
|- ru.json
src
|- routes
   |- settings
      |- _locales
         |- de.json
         |- en.json
         |- fr.json
         |- ru.json
      |- index.svelte

Again, we’re in the fortunate position that SvelteKit can easily coordinate all the loading for us, including any necessary build-time preparation. Here, any keys in src/routes/settings/_locales/en.json would take precedence over the global keys in locales/en.json.
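The precedence rule itself is simple to sketch. `resolveMessages` is a hypothetical name, and how the files actually get loaded is left out:

```javascript
// Hypothetical merge step: later (more specific) dictionaries override earlier ones.
function resolveMessages(globalMessages, ...scopedMessages) {
  return Object.assign({}, globalMessages, ...scopedMessages);
}
```

For the settings route, the call would be something like `resolveMessages(globalEn, settingsEn)`, so a key defined in both files resolves to the route-scoped value while everything else falls through to the global file.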

Translating content

It’s probably best if SvelteKit doesn’t have too many opinions about how content (like blog posts) should be translated, since this is an area where you’re far more likely to need to e.g. talk to a database, or otherwise do something that doesn’t fit neatly into the structure we’ve outlined. Here again, there’s an advantage to having the current language preference expressed in the URL, since userland middleware can easily extract that from req.path and use that to fetch appropriate content. (I guess we could also set a req.lang property or something if we wanted?)
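For illustration, userland middleware along those lines could be as small as this. It assumes an Express-style `(req, res, next)` signature and a `supported` list, and `req.lang` is the speculative property mentioned above, not an existing API.

```javascript
// Hypothetical helper: read the language from the first path segment.
function langFromPath(path, supported) {
  const first = path.split('/')[1] || ''; // '/de/blog' -> 'de'
  return supported.includes(first) ? first : null;
}

// Hypothetical Express-style middleware that exposes the result as req.lang.
function langMiddleware(supported) {
  return (req, res, next) => {
    req.lang = langFromPath(req.path, supported);
    next();
  };
}
```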

Base URLs

Sapper (ab)used the <base> element to make it easy to mount apps on a path other than /. <base> could also include the language prefix so that we don’t need to worry about it when creating links:

<!-- with <base href="de">, this would link to `/de/triff-das-team` -->
<a href={$t.meet_the_team}>{$t.text.meet_the_team}</a>

Base URLs haven’t been entirely pain-free though, so this might warrant further thought.


Having gone through this thought process I’m more convinced than ever that SvelteKit should have i18n built in. We can make it so much easier to do i18n than is currently possible with libraries, with zero boilerplate. But this could just be arrogance and naivety from someone who hasn’t really done this stuff before, so please do help fill in the missing pieces.

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 324
  • Comments: 178 (42 by maintainers)

Top GitHub Comments

ocombe commented, Mar 1, 2019 (82 reactions)

Hello there! I’m a member of the Angular team, and I work on i18n there. I thought that I could share some of my knowledge to help you get started:

  • if you can avoid handling dates/currencies/numbers yourself and use Intl instead, it’s better. Dealing with those is a major pain, and you’ll discover new traps every day: people who don’t use the Gregorian calendar, right-to-left languages, different number systems (Arabic or Hindu for example), … For Angular we decided to drop Intl because of browser inconsistencies. Most modern browsers have good Intl support, but if you need to support older browsers then you’ll have bugs and differences. In retrospect, sticking with Intl might have been a better choice…
  • all major vendors (IBM, oracle, google, apple, …) use CLDR data as the source of truth: http://cldr.unicode.org/. They export their data in xml or json (https://github.com/unicode-cldr). We use the npm modules “cldrjs” and “cldr-data-downloader” (https://github.com/rxaviers/cldrjs) developed initially for jquery globalize to access the CLDR json data. We also use “cldr” (https://github.com/papandreou/node-cldr) to extract the plural rules. You can find our extraction scripts here: https://github.com/angular/angular/tree/master/tools/gulp-tasks/cldr if you want to take a look at it.
  • if you can, use a recognized format for your translations so that your users can use existing translation software. One of the main formats is XLIFF but it uses XML, which is very complicated to read/write in JS. Stick to JSON if you can. There are a few existing JSON formats that are supported by tools; you should research the existing ones and choose one of them. It’ll make the life of your users so much easier, and you will be able to reuse some external libraries. Some examples are i18next JSON https://www.i18next.com/misc/json-format or Google ARB https://github.com/googlei18n/app-resource-bundle/wiki/ApplicationResourceBundleSpecification. Don’t try to reinvent the wheel here.
  • For plural rules, use CLDR data http://cldr.unicode.org/index/cldr-spec/plural-rules
  • ICU expressions are a nice way to deal with plurals, ordinals, selects (gender), … but there is no documentation for js… you can read a bit here: http://userguide.icu-project.org/formatparse/messages and on the angular docs https://angular.io/guide/i18n#regular-expressions-for-plurals-and-selections
  • you need to follow a rule for locale identifiers. I recommend BCP47 which is what CLDR uses with a few optimizations (http://cldr.unicode.org/core-spec#Unicode_Language_and_Locale_Identifiers), some doc to help you pick the right identifier: http://cldr.unicode.org/index/cldr-spec/picking-the-right-language-code
  • id or non-id based keys: use either auto generated ids (with a hashing/digest algorithm) or manual id (keys that the user specifies). Never use the sentences as keys because you’ll run into problems with your json and some special characters, you’ll get very long keys which will increase the size of the json files and make them hard to read, and you’ll get duplicates (the same text with different meanings depending on the context), which brings me to my next point…
  • you need to support optional descriptions and meanings, those are very important for translators. Descriptions are just some text that explains what this text is, while meaning is what the translators should use to understand how to translate this text depending on the context of the page and what this text represents. The meaning should be used to generate the ids (keys) so that you don’t have duplicates with different meanings.
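To illustrate the first and fourth points together, here is a small sketch that leans on the built-in Intl API, whose plural rules are derived from CLDR. The `messages` table, the `$1` placeholder and the `plural` helper are made up for this example; a real setup would more likely use ICU message syntax.

```javascript
// Hypothetical message table keyed by CLDR plural category.
const messages = {
  en: { one: '$1 file', other: '$1 files' },
  ru: { one: '$1 файл', few: '$1 файла', many: '$1 файлов', other: '$1 файла' }
};

function plural(locale, count) {
  // Intl.PluralRules returns a category such as 'one', 'few', 'many' or 'other'.
  const category = new Intl.PluralRules(locale).select(count);
  const forms = messages[locale];
  const template = forms[category] ?? forms.other;
  // Intl.NumberFormat takes care of locale-specific digit grouping.
  return template.replace('$1', new Intl.NumberFormat(locale).format(count));
}
```

English only needs two forms here, while Russian needs separate forms for 1, 2 and 5 files, which is exactly the kind of rule that is better taken from CLDR than written by hand.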

kaisermann commented, Nov 22, 2019 (34 reactions)

A little late to the party, but since you guys mentioned svelte-i18n, I think I should give some updates about it. I first created that lib as a POC for my previous job and kinda abandoned the project for a while after that. I’m currently working on a v2.0.0 which adds some new features and behaviours:

  • Async preloading of locale dictionaries (no partial dictionary support for now, trying to think of a non-verbose way of doing this 🤔);
  • Works with Sapper’s SSR (work in progress here);
  • Provides a CLI to extract all message ids to a JSON in the stdout or a specified output file;
  • Better number/date/time formatting (exposes the Intl formatters in a better way than the current version);
  • Custom formats for number/date/time (formats are aliases for a specific set of Intl formatter options);
  • Exports a list of all locales for easy {#each}ing.

This is currently a WIP and I’m definitely taking into consideration a lot of what’s said here. In no way do I think I can handle every use case with just svelte-i18n. I’ve also thought about a preprocessor to remove the verbosity of some cases, but I’m reluctant about that for now.

About creating a format specific for sapper/svelte: I’m not completely against it, but I think not using an established format is kind of reinventing the wheel. We already have great formats like ICU or Fluent, which already contemplate a bunch of quirks that a language can have.

Edit:

Ended up deciding to have a queue of loader methods for each locale:

  • register(locale, loader): adds a loader method to the locale queue;
  • waitLocale(): executes all loaders and merges the result with the current locale dictionary.

While not extremely ideal, the “verbosity” of this approach can be also reduced in the user-land by a preprocessor that adds those register and waitLocale calls, maybe even the format/_ method import.
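For readers who want a feel for the loader-queue idea, here is a minimal sketch. It is not svelte-i18n’s actual implementation: the function names are borrowed from the description above, while the `locale` argument on `waitLocale` and the two `Map`s are assumptions.

```javascript
// Sketch only: a per-locale queue of loader functions, not the library's real code.
const queues = new Map();       // locale -> loaders that have not run yet
const dictionaries = new Map(); // locale -> messages merged so far

function register(locale, loader) {
  if (!queues.has(locale)) queues.set(locale, []);
  queues.get(locale).push(loader);
}

async function waitLocale(locale) {
  const pending = queues.get(locale) || [];
  queues.set(locale, []); // each loader runs at most once
  const parts = await Promise.all(pending.map((load) => load()));
  // Merge the newly loaded parts over whatever was already loaded for this locale.
  const merged = Object.assign({}, dictionaries.get(locale), ...parts);
  dictionaries.set(locale, merged);
  return merged;
}
```

A route could call `register('en', () => import('./_locales/en.json'))` up front and only pay for the download when `waitLocale('en')` is awaited; the import path is illustrative and depends on the bundler.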

Edit 2:

Just released v2.0.0 🎉 Here’s a very crude Sapper example: https://svelte-i18n.netlify.com/. You can check the network tab of your devtools to see how and when a locale’s messages are loaded. Hope it helps 😁
