PHP strtr alternative for a UTF-8 environment

The PHP strtr function replaces each character of one list with the character at the same position in a second list. The expected behaviour of that function in any environment is the following:

$from = '12345';
$to = 'ABCDE';
$str = 'I have 2 monkeys and just 1 dog';
echo strtr($str, $from, $to); // Prints 'I have B monkeys and just A dog';

But in a UTF-8 encoded script, some unexpected character replacements can happen, because the strtr function always expects strings in which each character takes 1 byte. In UTF-8, only ASCII characters take 1 byte; every other character takes 2 to 4 bytes. So strtr splits the lists in the middle of a character and replaces single bytes instead of whole characters.
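To make the truncation concrete, here is a small sketch. It writes 'é' as the byte escape "\xC3\xA9" so the result does not depend on how the file is saved; the variable names are only illustrative.

```php
<?php
// 'é' is two bytes in UTF-8: 0xC3 0xA9
$eAcute = "\xC3\xA9";
$str = 'caf' . $eAcute;

echo strlen($eAcute), "\n"; // Prints 2: strlen counts bytes, not characters

// $from is 2 bytes long and $to is 1 byte long, so strtr ignores the extra
// byte of $from and only maps the byte 0xC3 to 'e'
$broken = strtr($str, $eAcute, 'e');

echo bin2hex($broken), "\n"; // Prints 63616665a9: 'cafe' followed by a stray 0xA9 byte
```

The stray 0xA9 byte is not a valid UTF-8 character on its own, which is why the output looks corrupted in a browser or terminal.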

I wrote an alternative function that does the same as the strtr function, but is UTF-8 safe.

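function strtr_utf8safe($str, $from, $to) {
  // utf8_decode converts to ISO-8859-1 (1 byte per character), so the lists
  // can only contain characters that exist in that charset
  $from = str_split(utf8_decode($from));
  $to = str_split(utf8_decode($to));
  // Stop at the end of the shorter list, like strtr does
  for ($i = 0, $n = min(count($from), count($to)); $i < $n; $i++) {
    // Both characters must be converted back to UTF-8 before replacing
    $str = str_replace(utf8_encode($from[$i]), utf8_encode($to[$i]), $str);
  }
  return $str;
}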
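As an alternative sketch (not the function above, and with a hypothetical name), the same idea can be written without going through ISO-8859-1, which also avoids utf8_decode and utf8_encode, both deprecated as of PHP 8.2. It assumes that $from and $to are valid UTF-8 and that PCRE has UTF-8 support, as standard PHP builds do. It uses the array form of strtr, which replaces everything in one pass, so a character that was already replaced is not replaced again.

```php
<?php
// Hypothetical helper, not part of PHP
function strtr_utf8_chars($str, $from, $to) {
  // The /u modifier makes PCRE split on UTF-8 characters instead of bytes
  $from = preg_split('//u', $from, -1, PREG_SPLIT_NO_EMPTY);
  $to = preg_split('//u', $to, -1, PREG_SPLIT_NO_EMPTY);
  // Like strtr, ignore the extra characters of the longer list
  $n = min(count($from), count($to));
  if ($n === 0) {
    return $str;
  }
  $map = array_combine(array_slice($from, 0, $n), array_slice($to, 0, $n));
  // The array form of strtr replaces whole substrings in a single pass
  return strtr($str, $map);
}

// "\xC3\xA9" is 'é' and "\xC3\xB1" is 'ñ' in UTF-8
echo strtr_utf8_chars("caf\xC3\xA9 ni\xC3\xB1o", "\xC3\xA9\xC3\xB1", 'en'), "\n"; // Prints 'cafe nino'
```

Because of the single pass, swapping two characters also works: strtr_utf8_chars('ab', 'ab', 'ba') returns 'ba', while a loop of str_replace calls would turn every 'a' into 'b' and then every 'b' back into 'a'.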

This function is also useful with other charsets where a character can take more than 1 byte. To do that, simply replace the utf8_decode and utf8_encode calls with, e.g., iconv calls.
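One way this could look with iconv is sketched below. The function name and the $charset parameter are hypothetical. Instead of a single-byte charset, the sketch uses UTF-32BE as the intermediate encoding, because there every character takes exactly 4 bytes. It assumes the iconv extension is available and that the inputs are valid in the given charset.

```php
<?php
// Hypothetical generalisation of strtr_utf8safe, not part of PHP
function strtr_charsetsafe($str, $from, $to, $charset = 'UTF-8') {
  // In UTF-32BE every character takes exactly 4 bytes, so str_split cannot cut one in half
  $from = str_split(iconv($charset, 'UTF-32BE', $from), 4);
  $to = str_split(iconv($charset, 'UTF-32BE', $to), 4);
  for ($i = 0, $n = min(count($from), count($to)); $i < $n; $i++) {
    // Convert each character back to the original charset before replacing
    $search = iconv('UTF-32BE', $charset, $from[$i]);
    $replace = iconv('UTF-32BE', $charset, $to[$i]);
    $str = str_replace($search, $replace, $str);
  }
  return $str;
}

echo strtr_charsetsafe("caf\xC3\xA9", "\xC3\xA9", 'e'), "\n"; // Prints 'cafe'
```

Note that str_replace still compares bytes. That is safe in UTF-8, where the bytes of one character never look like another whole character, but it may not hold for every multi-byte charset, so treat this as a starting point rather than a general solution.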
