Fat and Sausages in String Replacement
How not to get burned when replacing occurrences of one string with another. Search & Replace tricks.
The basic function for replacing strings in PHP is str_replace:
$s = "Lorem ipsum";
echo str_replace('ore', 'ide', $s); // returns "Lidem ipsum"
Thanks to the clever design of UTF-8, it can also be used reliably on strings
in that encoding: in valid UTF-8, a search string can only match at character
boundaries. Additionally, the first two arguments can be arrays, and the
function will then perform multiple replacements. Here we encounter the first
trick to be aware of. Each replacement makes its own pass over the string,
working on the result of the previous one, so if we wanted to swap dá ⇔ pá in
the phrase pánské dárky to get dánské párky (a Swedish delicacy!), no order of
arguments will achieve this:
// returns "dánské dárky"
echo str_replace(array('dá', 'pá'), array('pá', 'dá'), "pánské dárky");
// returns "pánské párky"
echo str_replace(array('pá', 'dá'), array('dá', 'pá'), "pánské dárky");
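To see why both orders fail, the passes can be replayed one at a time. The following is only a sketch: traceReplacements() is a hypothetical helper (not part of PHP) that assumes both arrays are plain lists of equal length and records the string after each pass.

```php
<?php
// Hypothetical helper: mimics str_replace() with array arguments,
// one full pass per search/replace pair, and keeps every intermediate result.
function traceReplacements(array $search, array $replace, string $subject): array
{
    $steps = array($subject);
    foreach ($search as $i => $needle) {
        $subject = str_replace($needle, $replace[$i], $subject);
        $steps[] = $subject;
    }
    return $steps;
}

// prints "pánské dárky -> pánské párky -> dánské dárky"
echo implode(' -> ', traceReplacements(array('dá', 'pá'), array('pá', 'dá'), "pánské dárky")), "\n";

// prints "pánské dárky -> dánské dárky -> pánské párky"
echo implode(' -> ', traceReplacements(array('pá', 'dá'), array('dá', 'pá'), "pánské dárky")), "\n";
```

In both cases the second pass overwrites what the first pass has just produced, so one of the two words always ends up wrong.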
The sought-after function that goes through the string just once and prevents collisions is strtr:
// returns "dánské párky", hooray
echo strtr("pánské dárky", array('pá' => 'dá', 'dá' => 'pá'));
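The same idea can be wrapped in a small function. This is a minimal sketch: swapSubstrings() is a hypothetical name, and it assumes both substrings are non-empty and different from each other. The last line illustrates one more property of strtr that is easy to overlook: when several keys match at the same position, the longest one is used.

```php
<?php
// Hypothetical helper, not a built-in PHP function.
// Assumes $a and $b are non-empty and different.
function swapSubstrings(string $s, string $a, string $b): string
{
    // strtr() scans $s once and never re-reads text it has already replaced
    return strtr($s, array($a => $b, $b => $a));
}

echo swapSubstrings("pánské dárky", 'pá', 'dá'), "\n"; // prints "dánské párky"

// Both 'a' and 'ab' match at position 0; the longer key 'ab' is used
echo strtr("abc", array('a' => '1', 'ab' => '2')), "\n"; // prints "2c"
```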
If we need to find occurrences according to more complex rules, we use
regular expressions and the function preg_replace. It also allows for
multiple replacements and behaves similarly to str_replace. Now, however,
I am heading elsewhere. I need to replace all numbers in the string with the
word hafo, which is easy:
$s = "Radek says he has an IQ of 151. Quite the collector's item!";
echo preg_replace('#\d+#', 'hafo', $s);
Let's generalize the code so it can replace numbers with anything we pass in
the variable $replacement. Many programmers will use:
return preg_replace('#\d+#', $replacement, $s); // wrong!
Unfortunately, that’s not right. Certain characters have a special meaning in the replacement string (specifically the backslash and the dollar sign), so we must “escape” them; see the definitive guide to escaping. The correct general solution is:
return preg_replace('#\d+#', addcslashes($replacement, '$\\'), $s); // ok
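The difference shows up as soon as the replacement contains a dollar sign followed by a digit. The sketch below compares the two variants; the function names are hypothetical and exist only for this illustration. The third variant, using preg_replace_callback, is an alternative not mentioned above: a callback's return value is inserted as is, so no escaping is needed.

```php
<?php
// Hypothetical helper names, introduced only for this comparison.
function replaceNumbersNaive(string $s, string $replacement): string
{
    // "$0" inside $replacement is interpreted as "the whole match"
    return preg_replace('#\d+#', $replacement, $s);
}

function replaceNumbersEscaped(string $s, string $replacement): string
{
    // backslash-escape "$" and "\" so they are taken literally
    return preg_replace('#\d+#', addcslashes($replacement, '$\\'), $s);
}

function replaceNumbersCallback(string $s, string $replacement): string
{
    // the callback's return value is not scanned for "$n" or "\n" references
    return preg_replace_callback('#\d+#', function (array $m) use ($replacement) {
        return $replacement;
    }, $s);
}

$s = 'Ticket price: 20';
$replacement = '$0.99';

echo replaceNumbersNaive($s, $replacement), "\n";    // prints "Ticket price: 20.99"
echo replaceNumbersEscaped($s, $replacement), "\n";  // prints "Ticket price: $0.99"
echo replaceNumbersCallback($s, $replacement), "\n"; // prints "Ticket price: $0.99"
```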
Do any other replacement tricks come to mind?