Internationalization / localization of web apps is a win for all stakeholders. Users get more languages, businesses can reach more people, and even developers benefit with DRY code. Before beginning such an effort, we must understand why i18n matters and what it entails. Armed with knowledge and fewer assumptions, web practitioners can be champions of i18n and make the web a more inclusive place.

English is the world’s lingua franca, with more than a billion speakers. In the United States it is the most widely spoken language, followed by Spanish and Chinese. But by global native-speaker population, English takes only third place. Chinese has the most native speakers, followed by Spanish, English, Hindi, Arabic, and Portuguese.

Though ubiquitous, English is not always the preferred language. A Gallup study for the European Commission found that a majority of European internet users consume content in multiple languages. More than a third create content in multiple languages. Unsurprisingly, the study also found that users prefer their native languages. Only about half accept English if their preferred language is unavailable. It’s easy to see why internationalization is such a big deal and why it has the power to improve UX.

What about our tech peers and the products we love? It’s tempting to imagine that multiple languages are unnecessary for a purely technical audience. These users are at least functional in English, right? Not so fast. Silicon Valley is majority multilingual. And most of its college-educated technology workforce is foreign-born. In 2017, more than half of its households spoke a language other than English, a greater share than California and the United States. Even within New York City more educated tech workers were born outside of the U.S. than within.

These data hint at another important consideration of i18n. Culture affects the ways we write and communicate information independently of language. Concerns like date & time formatting, numbers, and systems of measure vary by region or individual preference. Being more inclusive means being aware of and accommodating culture too. Let’s explore these cultural differences.

Numbers

A fully modern number system first appeared definitively at a temple in Gwalior, India around 876 AD. It is the same system we use today: a positional base-10 system with a built-in concept of zero (0). Individually, the innovations were nothing new, but their combination proved explosive. Properly known as the Hindu-Arabic system, this indelible piece of cultural heritage is remarkable not only for surviving millennia, but for enabling modern civilization as we know it. It is now universally used—everywhere—across all cultures.

While the Hindu-Arabic system is universal, two conspicuous details vary by locale: numerals and formatting. The numerals used with Latin scripts first appeared in Europe, tracking the rise of modernity (incidentally, “Latin numerals” refers to something different). Several traditional styles of numerals are still in use, often side-by-side with Western numerals. It’s worth emphasizing that they are only superficially different: the styles are equivalent in usage and meaning and all are used with the Hindu-Arabic number system.

Western
1234567890
Eastern Arabic
١٢٣٤٥٦٧٨٩٠
Devanagari
१२३४५६७८९०
Thai
๑๒๓๔๕๖๗๘๙๐

Examples of numerals used around the world.

Number formatting differs by locale. Decimal markers and group delimiters are variously full stops (periods), commas, apostrophes, or spaces. In most locales the decimal marker and group delimiter are different characters. In places that use Eastern Arabic numerals the separators are nearly indistinguishable (٬ and ٫ are technically separate characters from a comma in Unicode but identical in appearance to one). The rules of grouping may differ too. While most locales express large numbers by groups of 3 digits, the Indian numbering system groups by 2 digits (but only after the first 3).

Western
1,500,000.48
Eastern Arabic
١٬٥٠٠٬٠٠٠٫٤٨
Indian
१५,००,०००.४८
Thai
๑,๕๐๐,๐๐๐.๔๘

Examples of number formatting differences.
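
The differences above can be reproduced with the built-in Intl.NumberFormat. The following is a minimal sketch, not a definitive recipe: the helper name formatNumber and the choice of locale tags are assumptions made for illustration, and the results depend on the locale data shipped with the runtime.

```javascript
// Hypothetical helper: format one value for a given locale tag.
function formatNumber(value, locale) {
  return new Intl.NumberFormat(locale).format(value);
}

const amount = 1500000.48;

console.log(formatNumber(amount, "en-US")); // 1,500,000.48
console.log(formatNumber(amount, "de-DE")); // 1.500.000,48
console.log(formatNumber(amount, "en-IN")); // 15,00,000.48 (Indian grouping)

// The "-u-nu-" Unicode extension requests a numbering system explicitly:
// "arab" for Eastern Arabic, "deva" for Devanagari, "thai" for Thai digits.
console.log(formatNumber(amount, "ar-EG-u-nu-arab"));
console.log(formatNumber(amount, "hi-IN-u-nu-deva"));
console.log(formatNumber(amount, "th-TH-u-nu-thai"));
```

Note that the numerals and the separators are chosen by the locale tag alone; the number passed in is the same in every call.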

Dates & Times

Times are written in 24 hours almost everywhere, with regional differences in spoken time. In many places time is spoken in 12 hours, but written in 24 hours. The United States retains the use of 12-hour time, even in writing. Only a handful of other countries still use 12-hour time in writing, including India.

Most extant cultures write dates in order of significance, with regional differences in endianness. Essentially, all-numeric dates and times are unambiguous everywhere (except, of course, the United States). The U.S. uses an arbitrary all-numeric date ordering that so defies logic, it is mostly alone in the world. For example, in most of the world 11-01-2019 refers to 11 January 2019. In the U.S. it refers to 1 November 2019. Long-form dates, however, can vary greatly by locale.

French (Canada)
1 janvier 2019 13 h 30 min 00 s
French (France)
1 janvier 2019 à 13:30:00
Portuguese (Brazil)
1 de janeiro de 2019 13:30:00
Portuguese (Portugal)
1 de janeiro de 2019, 13:30:00
English (United States)
January 1, 2019, 1:30:00 PM
English (Great Britain)
1 January 2019, 13:30:00

Long-form date formatting can differ among locales, even within the same language.
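
As a rough illustration of the ambiguity described above, the sketch below formats one date for several locales with Intl.DateTimeFormat. The helper names numericDate and longDate are hypothetical, and the date is pinned to UTC so the result does not depend on the machine's time zone.

```javascript
// 11 January 2019, 13:30:00 UTC (months are zero-based in Date.UTC).
const date = new Date(Date.UTC(2019, 0, 11, 13, 30, 0));

// Hypothetical helper: the locale's default all-numeric date.
function numericDate(d, locale) {
  return new Intl.DateTimeFormat(locale, { timeZone: "UTC" }).format(d);
}

// Hypothetical helper: the locale's long-form date.
function longDate(d, locale) {
  return new Intl.DateTimeFormat(locale, { dateStyle: "long", timeZone: "UTC" }).format(d);
}

console.log(numericDate(date, "en-US")); // 1/11/2019
console.log(numericDate(date, "en-GB")); // 11/01/2019

console.log(longDate(date, "en-US")); // January 11, 2019
console.log(longDate(date, "en-GB")); // 11 January 2019
console.log(longDate(date, "fr-FR")); // 11 janvier 2019
console.log(longDate(date, "pt-BR")); // 11 de janeiro de 2019

// Adding timeStyle includes the time. Separators and 12- vs 24-hour output
// come from the runtime's locale data, so no expected output is shown here.
const withTime = { dateStyle: "long", timeStyle: "medium", timeZone: "UTC" };
console.log(new Intl.DateTimeFormat("fr-CA", withTime).format(date));
```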

Systems of Measure

The world is standardized on SI. It is used officially in all countries except three: Myanmar, Liberia, and the United States. Myanmar and Liberia have made progress toward metrication in recent years. Metric is safe to use everywhere, including with technical audiences in the U.S., though general audiences there may not understand it. While economics and history are often cited as reasons for not adopting SI in the United States, I don’t buy it. These reasons haven’t stopped the entire rest of the world, with a combined GDP larger than the U.S. But I digress.
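
One way to accommodate both audiences is sketched below. It assumes a runtime that supports the "unit" style of Intl.NumberFormat and Intl.Locale. The helper formatDistance, the conversion constant, and the rule that only the US region gets miles are all assumptions made for this example; Intl formats units but does not convert between them.

```javascript
// Approximate conversion factor, applied by hand because Intl does not convert.
const KM_TO_MILES = 0.621371;

// Hypothetical helper. Assumption: only the "US" region is shown miles here.
// A real app would ideally let the user choose a system of measure.
function formatDistance(kilometers, locale) {
  const region = new Intl.Locale(locale).region;
  const useMiles = region === "US";
  return new Intl.NumberFormat(locale, {
    style: "unit",
    unit: useMiles ? "mile" : "kilometer",
    maximumFractionDigits: 1,
  }).format(useMiles ? kilometers * KM_TO_MILES : kilometers);
}

console.log(formatDistance(10, "en-US")); // 6.2 mi
console.log(formatDistance(10, "en-GB")); // 10 km
```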

The United States is culturally isolated from the world.

Map of the Measurement Systems of Earth

Composition by Randall Morey. Earth doodle by Creative Stall, Noun Project. Moon doodle by Prettycons, Noun Project. Arrow doodle by Alex Muravev, RU, Noun Project.

Personal Names

The presentation of information is just the beginning. Internationalization affects the collection of information as well. Personal names come in every conceivable shape, size, and character set. Conventional web forms fall short and simply don’t accommodate the diversity of names and cultural conventions the world has to offer. This can be a poor experience for users and can lead to incorrect data entry. Fortunately, designers and engineers have the power to do better.

The problem begins with the implicit assumptions Westerners often make about naming conventions: people have at most 3 names, only 2 of which matter, and order is conventional (a name that comes first is a given name and a name that comes last is a surname). Let’s look at some examples that invalidate these assumptions.

张英

Zhang Ying. In this Chinese name, Zhang is the surname and comes first. Ying is the given name and comes last. A woman with this name may prefer to be referred to as 张英女士, “Zhang Ying Nǚshì”. Among friends, she may go by 英. Or like many Chinese, she may choose to adopt a name or names that are easier for Westerners to use.

João Francisco Peçanha Dias da Silva

Brazilian names often include multiple surnames. In this case, “Peçanha Dias da Silva” consists of three separate surnames (likely one from each parent and one an inherited ancestral name). There is no concept of “last name” in Portuguese. This Brazilian has two given names, “João” and “Francisco”. Brazilians often use given names, even among strangers. This man could go by either of his given names, or both. And while Portuguese has formal titles, they are not commonly used with names. Formally, this man is simply addressed by his full name.

Make No Assumptions

How do we accommodate all possible variation in personal names? Easy. Provide just one name field: “Full Name”. And don’t validate length, only that it is filled, because one-character names are possible. An additional “What should we call you” field is also recommended, since full names aren’t always appropriate or practical. Let your users tell you about their name(s) with as few assumptions as possible. The W3C provides more in-depth information and recommendations for designing web forms.
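
A minimal sketch of that validation rule might look like the following. The function name isValidFullName is hypothetical; the only check is that the field is not empty.

```javascript
// Hypothetical validator for a single "Full Name" field.
// It checks only that something was entered. It makes no assumptions about
// length, number of parts, ordering, or character set.
function isValidFullName(input) {
  return typeof input === "string" && input.trim().length > 0;
}

console.log(isValidFullName("张英")); // true
console.log(isValidFullName("João Francisco Peçanha Dias da Silva")); // true
console.log(isValidFullName("O")); // true (one-character names are possible)
console.log(isValidFullName("   ")); // false (whitespace only)
```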

Start With Engineering

Improved user experience and greater inclusivity are excellent reasons to internationalize web apps. But broad team buy-in isn’t necessary just to get started. Engineers are perfectly positioned to be prime-movers because i18n solves a novel problem: separation of static content from code. Even if only one language is supported, it’s worth the extra thought for development teams.

Content consistency can be hard to maintain over time. The text in all of those form labels, action buttons, page titles, and navigations can and do fall out of sync. Simple content changes are often an exciting game of search/replace (just pray that you caught every variation). We can count on arbitrary change-requests from customers and managers to exacerbate the issue. So we ought to make it as easy on ourselves as possible.

Proactive internationalization is a great platform from which to advocate to a broader team and organization. Once the groundwork is in place, additional localization is straightforward. Engineers can help themselves and users by implementing i18n today. But who’s in charge of translations?

It’s a Team Effort

The choice to internationalize is an engineering one. The choice to localize is often a business decision. That requires buy-in from multiple stakeholders, sometimes including customers. So what does it take to get an app localized?

The considerations on the business end are numerous. Is the product used in a given locale? Is the product even appropriate to a target culture? Does the organization need to support customers in multiple languages? How does the product fit with the organization’s larger internationalization or global content strategy? Does the company even have a global strategy? Engineering alone cannot answer these questions. Localization is, at its best, a non-engineering effort involving product managers, localization managers, translators, and cultural experts.

Even if your team doesn’t have access to localization resources, you can still get started at a smaller scale. Merely translating an experience is a step in the right direction even if there isn’t a team of strategic thinkers behind it.

You might be fortunate enough to have eager translators on your own team. Ask around. Chances are a number of your coworkers are proficient in another language (native speakers are best, because of the importance of culture). They may even be willing to translate some strings. On smaller projects, looking inward for translations is a perfectly viable approach. Machine translation can fill in the gaps, if necessary.

Technology

The choice of frontend i18n tools depends on the project and preferences of the development team. Since so many JavaScript tools exist, there’s room to experiment and to learn. In an effort to be non-prescriptive, we’ll briefly cover only common patterns, at a high level.

Typical tools will separate static content into language files, e.g.:

{
  "titles": {
    "auth": {
      "login": "Login"
    }
  },
  "subtitles": {
    "auth": {
      "login": "Login to access your account."
    }
  },
  "actions": {
    "auth": {
      "login-submit": "Login now"
    }
  }
}

/en-us/translations.json

Instead of embedding content directly into code, content is referenced by name. This makes content updates across the entire application fast and consistent. Adding languages is as easy as copying the translations file and changing its values.

<h1>{{t "titles.auth.login"}}</h1>
<p>{{t "subtitles.auth.login"}}</p>

<form>
  <button type="submit">{{t "actions.auth.login-submit"}}</button>
</form>

template.html
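
To show roughly what a template helper like t does behind the scenes, here is a minimal sketch. It is not the implementation of any particular library: the function t, the inlined translations object, and the Portuguese string are assumptions made for illustration.

```javascript
// Contents of a hypothetical /en-us/translations.json, inlined for the example.
const translations = {
  titles: { auth: { login: "Login" } },
  subtitles: { auth: { login: "Login to access your account." } },
  actions: { auth: { "login-submit": "Login now" } },
};

// Resolve a dotted key such as "titles.auth.login" against a translations object.
// Falls back to the key itself so missing strings are easy to spot.
function t(key, strings = translations) {
  const value = key.split(".").reduce(
    (node, part) => (node != null && typeof node === "object" ? node[part] : undefined),
    strings
  );
  return typeof value === "string" ? value : key;
}

console.log(t("titles.auth.login")); // Login
console.log(t("actions.auth.login-submit")); // Login now
console.log(t("titles.auth.logout")); // titles.auth.logout (missing key)

// Another language is the same structure with different values (illustrative).
const ptBR = { titles: { auth: { login: "Entrar" } } };
console.log(t("titles.auth.login", ptBR)); // Entrar
```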

Intl API

Back in December 2012, ECMAScript quietly introduced the internationalization API. Today, all modern browsers support Intl. Even Node.js supports it. The Intl API is fit for use in all new projects.

The internationalization API provides cultural formatting services for numbers (including currencies), dates & times, and language-aware sorting. It handles all of this natively. Previously, such tasks required byte-heavy third-party libraries and their locale data files. Some features, such as relative times (e.g. “30s ago”), are not widely supported yet. Watch for additional features to land in capable browsers soon. Check the docs for a complete API reference.
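
As a brief sketch of two of these services, the example below formats a currency amount and sorts the same words under two locales. The sample values are arbitrary, and the relative-time call is guarded by feature detection since support varies by runtime.

```javascript
// Currency formatting: a currency code is required when style is "currency".
const usd = new Intl.NumberFormat("en-US", { style: "currency", currency: "USD" });
console.log(usd.format(1234.5)); // $1,234.50

// Language-aware sorting: the same words sort differently by locale.
const words = ["z", "ä", "a"];
const german = [...words].sort(new Intl.Collator("de").compare);
const swedish = [...words].sort(new Intl.Collator("sv").compare);
console.log(german.join(", "));  // a, ä, z
console.log(swedish.join(", ")); // a, z, ä

// Relative times, only where the runtime provides them.
// The exact wording depends on the runtime's locale data.
if (typeof Intl.RelativeTimeFormat === "function") {
  const relative = new Intl.RelativeTimeFormat("en", { style: "narrow" });
  console.log(relative.format(-30, "second"));
}
```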

Intlnow.com

To raise awareness about the Intl API, I created a simple tool. Visit intlnow.com to visually explore examples of the API and get code that works in all modern browsers. Start experimenting, have fun, and make the web more inclusive!