How to Internationalize Your PHP Site

We’ve just soft-launched admob.com in Japanese, so I figured I’d jot down some thoughts that might be useful for people wanting to undertake a similar effort. I’ll cover how to internationalize (i18n) just the language strings on your PHP site using Smarty and gettext. True i18n efforts include number/date formatting and handling foreign currency, but that’s outside the scope of this article.

Preparation

  • Get gettext up and running – gettext is a GNU project that has all the tools needed to make i18n easier. PHP has built-in gettext support. It’s fairly easy to set up on a Linux box once you figure out which commands you need to run to install locales.
  • Separation of code and display – we use the Smarty templating system, so there is a clean break between our code and display. There are many benefits to doing this that I won’t cover here.
  • Make a cutoff for features – Unless you’re using in-house translators, decide which features/pages won’t make it into your initial release. You’ll have constant churn on your code base, features constantly making it to production, and PMs constantly wordsmithing your pages during this project. I suggest making an i18n branch of your code and merging non-UI changes over during the project. Then, close to the date of your i18n launch, plan for another round of translation. If your company is not in a rapid release cycle, I’m not sure why you’re taking advice from me ;).
  • Find a GOOD translation firm – Trust me. A good translation firm can go a long way in i18n efforts. At a minimum, they must understand basic HTML tags. I would recommend looking for a firm whose translators are experienced in whatever industry you’re in.
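On the first bullet: before touching any PHP, it can help to confirm that the target locale is actually installed, since gettext typically falls back to the original strings when it is not. The following is only a minimal sketch, assuming a glibc-based Linux box where `locale -a` lists the installed locales. The helper name `has_locale` and the locale `ja_JP.UTF-8` are illustrative choices, not part of our setup.

```shell
#!/bin/sh
# has_locale NAME: exit 0 if NAME is listed by `locale -a`, non-zero otherwise.
# glibc prints "ja_JP.utf8" where people usually type "ja_JP.UTF-8", so both
# sides are lower-cased and stripped of dashes before comparing.
has_locale() {
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-')
    locale -a 2>/dev/null | tr '[:upper:]' '[:lower:]' | tr -d '-' | grep -qxF -- "$want"
}
```

A call such as `has_locale ja_JP.UTF-8 || echo "ja_JP.UTF-8 is not installed"` tells you whether a locale still needs to be generated (on Debian/Ubuntu that is usually done with `locale-gen`; other distributions differ). Running `php -m | grep -i gettext` is a quick way to check that the gettext extension is loaded.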

Extract and Translate

  • Wrap all strings – The first task is to wrap all your display strings in tags so that the gettext parser will be able to make a .po file. This is the bulk of the work, as you have to run through every code/template file. Strings in PHP are wrapped in _() and strings in Smarty are wrapped in {t}{/t} tags. It’s best practice to keep punctuation inside the wrapping while leaving as many HTML tags as possible outside. However, do not break up whole sentences; this will confuse your translators. If you have any screenshots, pictures, or logos with embedded strings, prepare to break those up or replicate them localized. Also be prepared to do something about your strings in JavaScript. We made a simple framework for handling translations in PHP and outputting localized versions on the fly. The alternative is to generate locale-specific .js files in your build step.
  • String extraction script – Spend some time writing a shell script that you can run on your code branch to pull out all the strings. The Smarty gettext package includes a script that will run through your template files and build a simple .c file. After that you will need to use xgettext -j to make one large .po file. The following script will do the trick:

        # xgettext -j joins with an existing messages.po, so make sure one exists
        touch messages.po
        ./tsmarty2c.php ../code > smarty.c
        xgettext --no-wrap -j smarty.c
        find ../code/. -iname "*.inc" -exec xgettext --no-wrap -j -L PHP {} \;
        find ../code/. -iname "*.php" -exec xgettext --no-wrap -j -L PHP {} \;

  • Test and test again – After you have a .po file, go ahead and fill in the translations with random garbage, like some random Japanese characters. Make your .mo file, then QA the hell out of your site. I guarantee that you will be missing some strings. You want to avoid sending your strings off to be translated, only to have to send off another batch.
  • Translate! – Send the .po file to the firm and wait. Also bake in some time for linguistic QA with the same firm. If you have any native speakers in house, plan some time to sit down with them to make sure the translations you are getting back make sense.
  • Launch – If you’re using Capistrano to deploy your site, prepare a script to deploy your branch. Keep in mind that whenever the .mo file changes, it is highly recommended to restart Apache; otherwise the running processes may keep serving the old translations.
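The "fill in the translations with random garbage" step from the testing bullet can be scripted rather than done by hand. Here is a minimal sketch, assuming a freshly extracted catalog produced with --no-wrap so that each msgid sits on one line. The function name `pseudo_po` and the `[日本語 ...]` marker are made up for illustration. The header entry and any multi-line entries are deliberately left untouched.

```shell
#!/bin/sh
# pseudo_po FILE: print FILE with every empty single-line msgstr replaced by a
# marked copy of its msgid, so untranslated strings stand out during QA.
# Assumes a freshly extracted catalog made with --no-wrap; the header entry
# and multi-line entries (both start with msgid "") are left untouched.
pseudo_po() {
    awk '
        /^msgid "/ { id = substr($0, 7) }          # keep the quoted text, e.g. "Hello"
        /^msgstr ""$/ && id != "\"\"" {
            text = substr(id, 2, length(id) - 2)   # drop the surrounding quotes
            print "msgstr \"[日本語 " text "]\""
            next
        }
        { print }
    ' "$1"
}
```

Something like `pseudo_po messages.po > pseudo.po` followed by `msgfmt -o messages.mo pseudo.po` gives you a catalog to QA against: any string on the site that shows up without the marker was never wrapped or never extracted.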

Maintenance

  • Make i18n low impact – Once the initial launch has happened, you want maintaining it to be as unobtrusive as possible. Everybody developing front-end features should by now be aware of _() and {t} tags; however, try to avoid letting i18n impede development progress for your core site.
  • Make a translation schedule – Set up a schedule with your translation firm so they can plan for when they’re going to get a new batch of strings from you. This will lessen turnaround times and enable you to launch features in other languages as quickly as possible.
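For the recurring batches, the standard gettext tools can work out which strings are new since the last round. What follows is a sketch under assumptions, not a description of our actual process: it assumes the existing translations live in one .po file and that a fresh extraction is available as a template. The function name `new_batch` and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# new_batch OLD.po NEW.pot: print only the entries from NEW.pot that have no
# translation yet in OLD.po, i.e. the next batch to send to the translators.
new_batch() {
    merged=$(mktemp) || return 1
    # Carry existing translations over to the new template; -N turns off
    # fuzzy guessing and -q silences the progress output.
    msgmerge -q -N --no-wrap -o "$merged" "$1" "$2" || { rm -f "$merged"; return 1; }
    # Keep only entries whose msgstr is still empty, dropping obsolete ones.
    msgattrib --untranslated --no-obsolete --no-wrap "$merged"
    status=$?
    rm -f "$merged"
    return $status
}
```

For example, `new_batch ja.po messages.pot > batch.po` would produce a small file containing just the untranslated strings. Turning off fuzzy matching keeps the batch predictable; if you would rather give translators near-matches as a starting point, drop the -N flag and review the entries marked fuzzy.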

I hope this provided some insight into internationalizing your website. I know when we started on the project, there wasn’t much info to be found online. Web 2.0 isn’t English only and I hope everybody designs their site keeping in mind the world is flat. As always, feel free to leave any questions in the comments.

3 responses to “How to Internationalize Your PHP Site”

  1. Nicolas says:

    Hello Wayne,

    I just found your blog and this post while looking for best practices and advice on using gettext to manage string translations.

    I’m working on a website that’s supposed to be available in several languages, and I’m thinking about using gettext. But so far the .po files management seems to be annoying. Especially after the first xgettext use, I mean when you already have a .po file full of translations, and just want to detect the new ones (that appeared since the last xgettext use). Well perhaps it’s just that I don’t get it ><

    I was wondering if you have any advice on this topic?

    Thanks in advance 🙂

    Nicolas

  2. Wayne says:

    @Nicolas, you can use -x option for xgettext to exclude strings in existing po files. Also I find the msguniq command very useful for removing duplicates from an existing po file.

    The following command will find and run xgettext on all PHP files in <dir>, excluding strings in <file>:

    find <dir>/. -iname "*.php" -exec echo {} \; -exec xgettext --no-wrap -j -x <file> -L PHP {} \;

  3. summ3r says:

    Wayne, if you need a reliable tool to manage website localization, have a look at the translation platform https://poeditor.com/ – it supports gettext and has a super simple and collaborative translation interface.