What to Localize in Software
by Markus Kreisel and Renate Reinartz
Last time, we talked about what you should localize in order to sell in other countries. Besides translating the text into the language of that country, there are quite a few other items that need to be localized. Not every item applies to everyone, but if you go through this shopping list, you'll probably discover a few things you forgot to localize in your application or website.
A number is a number is a number, you might suppose. Wrong. When displayed, numbers are formatted. Two things differ between countries: the decimal separator and the thousands separator. Some countries, like the USA, use a comma to separate thousands and a point as the decimal separator. Therefore, one thousand and two cents is written 1,000.02. Other languages, such as German, use the comma and the point the opposite way, so 1.000,02 represents the same value as the previous example.
Solution: Do not store numbers internally, in databases, or in files as formatted strings. Always use a numeric variable type like Long or Float. When you display numbers, format them with the system settings for the thousands and decimal separators. The Windows API provides functions to get the appropriate values. In .NET, check the System.Globalization namespace. When you allow user input, make sure that the user knows which format is required.
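As a minimal sketch of this advice, the Python example below keeps the value numeric and applies the separators only when displaying or reading it. The SEPARATORS table and the two function names are hypothetical and exist only for illustration; a real application would read the separators from the system settings rather than hard-coding them.

```python
from decimal import Decimal

# Hypothetical separator table for illustration; a real application
# would query the operating system or a locale library instead.
SEPARATORS = {
    "en-US": {"thousands": ",", "decimal": "."},
    "de-DE": {"thousands": ".", "decimal": ","},
}


def format_number(value: Decimal, locale_name: str) -> str:
    """Format a numeric value for display using locale separators."""
    seps = SEPARATORS[locale_name]
    # Produce a neutral form first: "," for thousands, "." for decimals.
    neutral = f"{value:,.2f}"
    # Swap through a placeholder so the two separators cannot collide.
    return (
        neutral.replace(",", "\0")
        .replace(".", seps["decimal"])
        .replace("\0", seps["thousands"])
    )


def parse_number(text: str, locale_name: str) -> Decimal:
    """Turn user input in the given locale format back into a number."""
    seps = SEPARATORS[locale_name]
    cleaned = text.replace(seps["thousands"], "").replace(seps["decimal"], ".")
    return Decimal(cleaned)


amount = Decimal("1000.02")
print(format_number(amount, "en-US"))  # 1,000.02
print(format_number(amount, "de-DE"))  # 1.000,02
```

The stored value is the same in both cases; only the displayed string changes with the locale.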
The currency used in a country affects your application, too. Most currencies have their own currency symbol. Examples are € for Euro in Europe, £ for the British pound, ¥ for the Japanese yen or Chinese Yuan, and $ for the dollar used in Australia, Canada, Jamaica, New Zealand, USA, and many others. The currency symbol is defined in the character set used in the country. The symbol is also defined in the regional settings of the Windows control panel.
Because the symbol does not fully specify the currency, as the previous examples show, you should use the international three-character currency codes defined in ISO 4217 (en.wikipedia.org/wiki/ISO_4217), like USD for US dollar, EUR for Euro and so on. If your application handles more than one currency, you should save the currency code along with each amount. You should also be careful when you define a currency field and exchange data with a spreadsheet or database application, like Excel or Access, because these applications apply the currency from the system settings.
In addition, you should be aware that the currency code might be placed in front of, or behind, the currency value.
Solution: Be sure to check the system settings for the default currency and symbol placement. The Windows API provides functions to get the appropriate values. In .NET, check the System.Globalization namespace. Be prepared and use the international currency codes. When you allow user input, make sure that the user knows which format is required.
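The following Python sketch shows one way to keep the ISO 4217 code together with the amount and to decide the placement of the code at display time. The Money class, the DISPLAY_RULES table, and format_money are hypothetical names, and the example assumes two decimal places, which does not hold for every currency.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency_code: str  # ISO 4217 code, for example "USD" or "EUR"


# Hypothetical display rules for illustration; real values would come
# from the system settings of the user.
DISPLAY_RULES = {
    "en-US": {"decimal": ".", "code_first": True},
    "de-DE": {"decimal": ",", "code_first": False},
}


def format_money(money: Money, locale_name: str) -> str:
    """Show the amount with the currency code before or after the value."""
    rules = DISPLAY_RULES[locale_name]
    # Assumption: two decimal places, which is wrong for some currencies.
    digits = f"{money.amount:.2f}".replace(".", rules["decimal"])
    if rules["code_first"]:
        return f"{money.currency_code} {digits}"
    return f"{digits} {money.currency_code}"


price = Money(Decimal("12.50"), "EUR")
print(format_money(price, "en-US"))  # EUR 12.50
print(format_money(price, "de-DE"))  # 12,50 EUR
```

Because the code travels with the amount, a price entered in euros is never silently shown as dollars on a system with different regional settings.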
You should never hard-code date values. The date order is different between countries. In the short date format, the USA uses mm/dd/yyyy where m is the month, d is the day, and y is the year. Germany uses dd.mm.yyyy. If you do not take care of this, for example, in Visual Basic, a date string like 12/9/2006 can be interpreted as 9th December or 12th September. If you use medium or long date formats, the day and month names must also be translated. If you use format routines, you should ensure that your development system supports date format in the way that you require. If you need to calculate with dates, store them in a format that is system independent like the ISO 8601 format yyyy-mm-ddThh:mm:ss; you can also convert the dates to a system-independent date number format, such as date serial. This makes date sorting easy.
Solution: Be sure to store dates internally and in files without using a display format. Use the native date data type of your programming language. If you allow user input, collect the day, month and year in separate fields, and internally build a date data type from these fields. When you display dates, format them with the right system settings. The Windows API provides functions to get the appropriate values. In .NET, check the System.Globalization namespace. When you allow user input, make sure that the user knows which format is required.
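A short Python sketch of these steps follows: build a date from separate fields, store it in ISO 8601 form, and apply a country-specific pattern only for display. The SHORT_DATE_PATTERNS table and the function names are assumptions made for this example; a real application would take the pattern from the system settings.

```python
from datetime import date

# Hypothetical short date patterns for illustration only.
SHORT_DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",
    "de-DE": "%d.%m.%Y",
}


def date_from_fields(year: int, month: int, day: int) -> date:
    """Build a date from three separate input fields.

    date() raises ValueError for impossible combinations such as
    February 30, so ambiguous strings never enter the program.
    """
    return date(year, month, day)


def format_short_date(value: date, locale_name: str) -> str:
    """Format a date for display with the pattern of the given locale."""
    return value.strftime(SHORT_DATE_PATTERNS[locale_name])


d = date_from_fields(2006, 12, 9)
stored = d.isoformat()              # system-independent storage form
print(stored)                       # 2006-12-09
print(format_short_date(d, "en-US"))  # 12/09/2006
print(format_short_date(d, "de-DE"))  # 09.12.2006
print(date.fromisoformat(stored) == d)  # True
```

ISO 8601 strings of this form also sort chronologically when sorted as plain text.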
Who needs list separators? You do, trust me. You should consider list separators whenever you handle a string array in multi-column list boxes, memory, or comma-separated values (.csv) files. CSV files are only comma separated for languages that use a comma as the list separator. Many languages use a comma as the decimal separator in numbers, which is why they use a semicolon (;) to separate the items of a list.
Solution: Get the list separator setting of the user's system. The Windows API provides functions to get the appropriate values. In .NET, check the System.Globalization namespace.
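The Python sketch below passes the list separator as a parameter instead of assuming a comma. The function names are hypothetical, and how the separator is obtained from the user's system is left out on purpose.

```python
import csv
import io


def write_rows(rows, list_separator: str) -> str:
    """Write rows as delimited text using the given list separator."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=list_separator, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def read_rows(text: str, list_separator: str):
    """Read delimited text back into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=list_separator))


rows = [["Widget", "1000,02"]]
# With a semicolon, the decimal comma needs no special treatment.
print(write_rows(rows, ";"), end="")  # Widget;1000,02
# With a comma, the writer has to quote the value to keep it intact.
print(write_rows(rows, ","), end="")  # Widget,"1000,02"
```

Reading the file back with the same separator returns the original rows, whichever separator was chosen.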
You should never hard-code local measurements, like inches and miles. Whenever possible, you should use metric units, such as centimeters and kilometers. You can take the same approach for weight: instead of pounds, use kilograms. In addition, the liter is more widely used than pints or gallons. The measurement system differs between countries. Moreover, ISO favors the metric system now. Even the kilobyte is no longer 1024 bytes in the international standards; a kilobyte is 1000 bytes, and 1024 bytes is a kibibyte (KiB), because most people are accustomed to counting in decimal metric units. I am personally humbled and embarrassed that I missed that for seven years. On a more serious note, in some European Union (EU) countries, placing ads with non-metric measurements was against the law.
Solution: Don't hard-code measurements and be prepared to select and convert them.
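One possible approach, sketched in Python, is to store every value in metric units and convert only for display. The function names are hypothetical; the conversion factors are the exact international definitions of the mile, the pound, and the inch.

```python
# Exact international definitions of the non-metric units.
KM_PER_MILE = 1.609344
KG_PER_POUND = 0.45359237
CM_PER_INCH = 2.54


def km_to_miles(km: float) -> float:
    return km / KM_PER_MILE


def kg_to_pounds(kg: float) -> float:
    return kg / KG_PER_POUND


def cm_to_inches(cm: float) -> float:
    return cm / CM_PER_INCH


def format_distance(km: float, use_metric: bool) -> str:
    """Display a distance stored in kilometers in the unit the user chose."""
    if use_metric:
        return f"{km:.1f} km"
    return f"{km_to_miles(km):.1f} mi"


print(format_distance(100.0, use_metric=True))   # 100.0 km
print(format_distance(100.0, use_metric=False))  # 62.1 mi
```

The stored value never changes when the user switches units, so no rounding errors accumulate from repeated conversions.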
If you print many documents, you might have wondered how many odd formats your printer driver can handle. Not surprisingly, paper sheets come in many more sizes than just the standard letter and A4 paper.
Solution: If you must format the printout, check the paper format. You can get this information from the Windows API or directly from the printer object or class in your development language. Do not expect only one of these sizes because user-defined types might be used. For example, professional output devices might have sizes like A4+ for border-less A4 output.
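As a rough Python sketch, a layout routine can work from the reported page dimensions instead of from a fixed list of paper names, so that user-defined sizes are handled too. The function name, the margin, and the line height are assumptions for this example; the A4 and letter dimensions are the standard ones in millimeters.

```python
def lines_per_page(page_height_mm: float, margin_mm: float,
                   line_height_mm: float) -> int:
    """Compute how many text lines fit between the top and bottom margins."""
    printable_height = page_height_mm - 2 * margin_mm
    if printable_height <= 0:
        raise ValueError("margins leave no printable area")
    return int(printable_height // line_height_mm)


# Standard dimensions in millimeters; any other size reported by the
# printer works the same way.
A4 = (210.0, 297.0)
LETTER = (215.9, 279.4)

print(lines_per_page(A4[1], margin_mm=20, line_height_mm=5))      # 51
print(lines_per_page(LETTER[1], margin_mm=20, line_height_mm=5))  # 47
```

A report laid out for 51 lines on A4 would lose its last lines on letter paper, which is why the page break has to be computed from the actual size.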
Usually, an international phone number has three parts after the leading plus sign: country calling code, area code, and local phone number. A country calling code consists of one to three digits; for example, 1 for the USA and Canada, 32 for Belgium, 420 for the Czech Republic, and 86 for China. However, many countries, such as Denmark, do not have an area code. The number of area code digits also differs. Sometimes it is a fixed number, like three in the USA; in Germany, however, the area code can have two to five digits after the leading zero. Callers from abroad drop this leading zero when they dial a German number. This contrasts with Italy, where you must dial the leading zero in international calls. The number of digits in the local number also differs. In Germany, local numbers can contain three to seven digits, sometimes even eight for numbers to a PBX. In some countries like the USA, phone numbers can also contain an extension at the end, separated by a hash sign (#), which is used only by the switchboard of the phone holder. The only consistent aspect of a telephone format is that an international phone number can't be more than 15 digits, not counting the plus sign.
Solution: To be safe, store phone numbers internally in international format. Don't accept input that is only in your local phone format. You should always accept international numbers. For example, don't limit the area code to three digits or require seven digits for local numbers.
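The Python sketch below accepts a number in international notation, removes common visual separators, and checks only the rules that hold everywhere: a leading plus sign, a country code that does not start with zero, and at most 15 digits. The function name and the set of separators removed are assumptions for this example, and extensions after a hash sign are not handled.

```python
import re

# A plus sign, a non-zero first digit, and at most 15 digits in total.
_INTERNATIONAL = re.compile(r"\+[1-9][0-9]{1,14}")


def normalize_phone(raw: str) -> str:
    """Return the number as a plus sign followed by digits only.

    Raises ValueError if the input is not in international notation.
    No assumption is made about the length of area code or local number.
    """
    compact = re.sub(r"[\s().\-/]", "", raw)
    if not _INTERNATIONAL.fullmatch(compact):
        raise ValueError(f"not an international phone number: {raw!r}")
    return compact


print(normalize_phone("+49 30 1234567"))     # +49301234567
print(normalize_phone("+1 (212) 555-0100"))  # +12125550100
```

A number typed in local format, such as 030 1234567, is rejected because the country cannot be determined from it.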
As described in the previous information about character sets, different countries or languages have different lists of characters, or, in other words, different alphabets. Thus, the languages can have additional characters like umlauts or accents, such as in German, French, Danish, Swedish, Norwegian, Finnish, Turkish, and so on. Some languages don't support some of our favorite characters, like h in Russian, x in Greek, and many more. All these languages sort their characters differently. If you do alphabetical sorts in your application, you should at least think about supporting the sort order of the localized language. Amazingly, even large or popular applications have not always supported the local sort order in their localized versions. How important it is to support different sort orders depends on how much your users alphabetize their data. For example, if you have an application that handles addresses, contact information, or other large amounts of data, your users will definitely miss this feature.
Solution: In .NET, you can check the System.Globalization namespace for the sort order. In other development systems, you must check your string sorting routines; and, for your own implementation, you may have to collect the data on the web first.
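To illustrate why one sort routine is not enough, the Python sketch below builds two hand-written sort keys: one that places å, ä, and ö after z, as Swedish does, and one that sorts German umlauts together with their base letters, as German dictionaries do. Both keys are simplified assumptions written for this example and are not complete collation rules; a production application should rely on the collation support of its platform.

```python
# Simplified rules for illustration only, not complete collation tables.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
GERMAN_FOLDING = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})


def swedish_key(word: str):
    """Sort key that places å, ä, ö after z, in that order."""
    keys = []
    for ch in word.lower():
        position = SWEDISH_ALPHABET.find(ch)
        # Unknown characters sort after the alphabet, by code point.
        keys.append(position if position >= 0
                    else len(SWEDISH_ALPHABET) + ord(ch))
    return keys


def german_key(word: str) -> str:
    """Sort key that treats umlauts like their base letters."""
    return word.lower().translate(GERMAN_FOLDING)


swedish_names = ["Öberg", "Ärlig", "Zetter", "Åberg"]
print(sorted(swedish_names, key=swedish_key))
# ['Zetter', 'Åberg', 'Ärlig', 'Öberg']

german_words = ["Zimmer", "Ärger", "Apfel", "Öl", "Ofen"]
print(sorted(german_words, key=german_key))
# ['Apfel', 'Ärger', 'Ofen', 'Öl', 'Zimmer']
```

A plain sort by character code gets both lists wrong: it places Ärlig before Åberg in the Swedish list and moves Ärger and Öl behind Zimmer in the German one.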
If you design an online contact form, you should never force your user to enter a state, or, even worse, select one of the 50 US states. Not all users live in the USA or are used to providing the state they live in. Some users might live in countries that use other systems, like departments in France or counties in Great Britain.
If you produce an accounting application, keep in mind that tax systems differ between countries. In the EU, for example, there is a value-added tax (VAT) but no local sales tax.
If you use time, you must consider the twelve-hour and twenty-four-hour models used around the world. Twelve-hour systems, as used in the United Kingdom or USA, use AM and PM to define whether the time is before or after noon. You must ensure that time zones are reflected in your application. Which time coding do you need? Local times include Eastern Standard Time (EST) in New York, Mountain Standard Time (MST) in Colorado, and Pacific Standard Time (PST) in California. Greenwich Mean Time (GMT) is international time and is the basis for the world time clock. GMT is the preferred time if you exchange data with other countries. This time system is based on the local time in Greenwich, England (GMT+0). All time differences are given as GMT+x or GMT-x. For example, France, Germany, and the Netherlands are at GMT+1, PST is GMT-8, EST is GMT-5, and Japan is GMT+9. The differences may also change between summer and winter. Many countries observe daylight saving time (summer time), although Japan does not. The same time system is also known as Zulu time or UTC (Coordinated Universal Time), the modern successor to GMT. If you need to code time, such as in e-mail or Internet formats, you can check the related RFCs for how to store time.
Solution: Make sure that you store time internally and in files always in the same time zone. When you display a time, use the correct system settings. The Windows API provides functions to get the appropriate values. In .NET, check the System.Globalization namespace. Use GMT time coding instead of local formats, such as MST and PST. These time formats are not common outside the USA, and a US user probably has no clue as to what CET (Central European Time) means.
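A minimal Python sketch of this advice stores every time in UTC and converts it to a local offset and clock style only for display. The function names are hypothetical, and the fixed hour offsets are a simplification that ignores summer time; a real application would use the time zone data of its platform.

```python
from datetime import datetime, timedelta, timezone


def to_storage(moment: datetime) -> str:
    """Convert an offset-aware time to a UTC string for storage."""
    if moment.tzinfo is None:
        raise ValueError("time must carry a UTC offset")
    return moment.astimezone(timezone.utc).isoformat()


def for_display(stored: str, utc_offset_hours: int, twelve_hour: bool) -> str:
    """Show a stored UTC time at the given offset in 12 or 24 hour style.

    Assumption: a fixed whole-hour offset, with no summer time handling.
    """
    local = datetime.fromisoformat(stored).astimezone(
        timezone(timedelta(hours=utc_offset_hours)))
    if twelve_hour:
        hour = local.hour % 12 or 12
        suffix = "AM" if local.hour < 12 else "PM"
        return f"{hour}:{local.minute:02d} {suffix}"
    return f"{local.hour:02d}:{local.minute:02d}"


# 15:30 entered in Germany in winter (GMT+1).
entered = datetime(2006, 12, 9, 15, 30, tzinfo=timezone(timedelta(hours=1)))
stored = to_storage(entered)
print(stored)                          # 2006-12-09T14:30:00+00:00
print(for_display(stored, -8, True))   # 6:30 AM   (PST)
print(for_display(stored, 9, False))   # 23:30     (Japan)
```

Because the stored value carries no local meaning, the same record displays correctly for users in California and in Japan.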
Most people would find only a subset of this list applicable to their software. Having a fully localized program will not only make your users feel more at home. It will make the program usable for them. Issues which seem minor might become show stoppers for others. Think of yourself, using a program that's been translated from another language. Would you really appreciate a cost shown in Argentinian pesos or printing pages clipped to A4 size?
Having taken the time to translate all the texts in an application and website, write press releases, and submit to download sites, you'd probably expect to get some new sales, right? If so, try to make sure that your program acts like it's been written especially for people in that country by fully localizing it in every aspect.
About the Authors: Markus Kreisel has dedicated over 18 years to the software industry. In 1988, Markus released one of his first applications for the Atari ST. For 15 years, he has developed Windows software and released various computer user and software development tools. Markus is co-owner of Sisulizer Germany, and the sole creator of Kaboom.
Renate Reinartz has dedicated 15 years to the software industry. Renate has managed projects for software development, technical writing, e-learning authoring, and software localization from English to German. Web content localization and search engine optimization (SEO) are her other specialties. She is co-owner of Sisulizer Germany.
Article edited by Amir Helzer.
Copyright 2008 Amir Helzer & OnTheGoSoft