Are You Just Following a Cargo Cult?
Many years ago, I realized that when I used a variable containing a predefined data table in a PHP function, the array had to be “recreated” each time the function was called, which was surprisingly slow. For example:
function isSpecialName(string $name): bool
{
    $specialNames = ['foo' => 1, 'bar' => 1, 'baz' => 1, ...];
    return isset($specialNames[$name]);
}
Then I discovered a simple trick that prevented the array from being recreated. It was enough to define the variable as static:
function isSpecialName(string $name): bool
{
    static $specialNames = ['foo' => 1, 'bar' => 1, 'baz' => 1, ...];
    return isset($specialNames[$name]);
}
For a somewhat larger array, the speed-up was enormous: between two and three orders of magnitude (roughly 500×).
Since then, I have always used static for constant arrays. It's possible that others followed this habit without knowing the real reason behind it, but I can't be sure.
A few weeks ago, I wrote a class that held large tables of predefined data in several properties. I realized that this would slow down the creation of instances, meaning the new operator would “recreate” the arrays each time, which is slow as we know. Therefore, I had to change the properties to static, or perhaps even better, use constants.
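To make the three options concrete, here is a minimal sketch of the same table declared as an instance property, a static property, and a class constant. The class name TablesDemo and the table contents are hypothetical, not taken from the real class, and the typed properties assume PHP 7.4 or newer.

```php
<?php
declare(strict_types=1);

// Hypothetical class and data; only the three declaration styles matter here.
final class TablesDemo
{
    // 1) Instance property: each object carries its own property slot.
    public array $actionLength = [1, 3, 2, 0];

    // 2) Static property: one slot shared by the whole class.
    public static array $sharedActionLength = [1, 3, 2, 0];

    // 3) Class constant: shared and read-only.
    public const ACTION_LENGTH = [1, 3, 2, 0];

    /** Returns the same entry read through each of the three styles. */
    public function lengths(int $rule): array
    {
        return [
            $this->actionLength[$rule],
            self::$sharedActionLength[$rule],
            self::ACTION_LENGTH[$rule],
        ];
    }
}

$demo = new TablesDemo();
var_dump($demo->lengths(1) === [3, 3, 3]); // the three lookups agree
```

The lookups behave identically; the variants differ only in where the table lives and whether it can be modified.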
Then I asked myself: Hey, are you just following a cargo cult? Is it still true that without static it is slow?
It's hard to say. PHP has undergone revolutionary development, and old truths may no longer be valid. I prepared a test sample and did a few measurements. As expected, I confirmed that in PHP 5, using static inside a function or with properties sped things up by several orders of magnitude. In PHP 7.0, however, the gain was only one order of magnitude. Excellent, a sign of optimizations in the new core, but the difference is still substantial. With further PHP versions, the difference continued to shrink and eventually nearly disappeared.
I even found that using static inside a function in PHP 7.1 and 7.2 actually slowed down the execution by about 1.5–2×, which in terms of the orders of magnitude we are discussing, is negligible, but it was an interesting paradox. From PHP 7.3, the difference disappeared completely.
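A measurement of this kind could look roughly like the sketch below. It is not the original test sample: the function names, the tiny five-entry table, and the iteration count are all assumptions for illustration, and the table would need to be much larger to resemble the case described. It uses hrtime(), which exists since PHP 7.3, so older versions would need microtime(true) instead.

```php
<?php
declare(strict_types=1);

// Hypothetical benchmark; table contents and iteration count are arbitrary.

function lookupPlain(int $key): bool
{
    // Ordinary local variable holding a literal array.
    $table = [10 => 1, 20 => 1, 30 => 1, 40 => 1, 50 => 1];
    return isset($table[$key]);
}

function lookupStatic(int $key): bool
{
    // Same table, declared static.
    static $table = [10 => 1, 20 => 1, 30 => 1, 40 => 1, 50 => 1];
    return isset($table[$key]);
}

/** Calls $fn once per iteration and returns the elapsed time in milliseconds. */
function measure(callable $fn, int $iterations): float
{
    $start = hrtime(true); // monotonic clock, nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn($i);
    }
    return (hrtime(true) - $start) / 1e6;
}

$iterations = 1000000;
printf("plain:  %.1f ms\n", measure('lookupPlain', $iterations));
printf("static: %.1f ms\n", measure('lookupStatic', $iterations));
```

The absolute numbers depend on the PHP version, the opcache settings, and the machine, so only the ratio between the two lines is worth comparing.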
Habits are a good thing, but it is necessary to validate their meaning continuously.
I will no longer use unnecessary static within function bodies. However, for that class holding large tables of predefined data in properties, I thought it was programmatically correct to use constants. Soon, I had the refactoring done, but even while writing it, I lamented how ugly the code was becoming. Instead of $this->ruleToNonTerminal or $this->actionLength, the code now contained the screaming $this::RULE_TO_NON_TERMINAL and $this::ACTION_LENGTH, which looked really ugly. A stale whiff from the seventies.
I hesitated, wondering whether I even wanted to look at such ugly code, or whether I would rather stick with variables or static variables.
And then it hit me: Hey, are you just following a cargo cult?
Of course, I am. Why should a constant shout? Why should it draw attention to itself in the code, be a protruding element in the flow of the program? The fact that the structure is read-only is not a reason FOR STUCK CAPSLOCK, AGGRESSIVE TONE, AND WORSE READABILITY.
THE TRADITION OF UPPERCASE LETTERS COMES FROM THE C LANGUAGE, WHERE MACRO CONSTANTS FOR THE PREPROCESSOR WERE MARKED IN THIS WAY. IT WAS USEFUL TO UNMISTAKABLY DISTINGUISH CODE FOR THE PARSER FROM CODE FOR THE PREPROCESSOR. IN PHP, NO PREPROCESSORS WERE EVER USED, SO THERE IS NO REASON to write constants in uppercase letters.
That very evening, I removed them everywhere. And I still couldn't understand why it hadn't occurred to me twenty years ago. The bigger the nonsense, the tougher its roots.
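For illustration, constants named like ordinary properties might look like the sketch below. The class name ParserTables, the table contents, and the describeRule() method are hypothetical; only the two constant names come from the text above. PHP places no case requirement on class constant names, and the private visibility assumes PHP 7.1 or newer.

```php
<?php
declare(strict_types=1);

// Hypothetical parser-table class; the table contents are made up.
final class ParserTables
{
    // Constant names keep the camelCase of the former properties.
    private const ruleToNonTerminal = [0 => 0, 1 => 1, 2 => 1];
    private const actionLength = [0 => 1, 1 => 3, 2 => 2];

    public function describeRule(int $rule): string
    {
        return sprintf(
            'rule %d: length %d, non-terminal %d',
            $rule,
            self::actionLength[$rule],
            self::ruleToNonTerminal[$rule]
        );
    }
}

echo (new ParserTables())->describeRule(1), "\n"; // rule 1: length 3, non-terminal 1
```

The access reads almost like the property version, with self:: (or $this::) in place of $this->.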