Well-maintained software should have quality API documentation.
Certainly. However, just as the absence of documentation is a mistake, so too is
its redundancy. Writing documentation comments, much like designing an API or
user interface, requires thoughtful consideration.
By thoughtful consideration, I do not mean the process that occurred in the
developer's mind when they complemented the constructor with this comment:
class ChildrenIterator
{
/**
* Constructor.
*
* @param array $data
* @return \Zend\Ldap\Node\ChildrenIterator
*/
public function __construct(array $data)
{
$this->data = $data;
}
Six lines that add not a single piece of new information. Instead, they
contribute to:
- visual noise
- duplication of information
- increased code volume
- potential for errors
The absurdity of the mentioned comment may seem obvious, and I'm glad if it
does. Occasionally, I receive pull requests that try to sneak similar rubbish
into the code. Some programmers even use editors that automatically clutter the
code this way. Ouch.
Or consider another example. Think about whether the comment tells you
anything that wasn't already clear:
class Zend_Mail_Transport_Smtp extends Zend_Mail_Transport_Abstract
{
/**
* EOL character string used by transport
* @var string
* @access public
*/
public $EOL = "\n";
Except for the @return annotation, the usefulness of this comment can also be questioned:
class Form
{
/**
* Adds group to the form.
* @param string $caption optional caption
* @param bool $setAsCurrent set this group as current
* @return ControlGroup
*/
public function addGroup($caption = null, $setAsCurrent = true)
If you use expressive method and parameter names (which you should), and they
also have default values or type hints, this comment gives you almost nothing.
It should either be reduced to remove information duplication or expanded to
include more useful information.
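For instance, a reduced variant (my own sketch, not taken from the original library) might keep only what the signature itself cannot express:
class Form
{
	/** @return ControlGroup */
	public function addGroup($caption = null, $setAsCurrent = true)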
But beware of the opposite extreme, such as novels in phpDoc:
/**
* Performs operations on ACL rules
*
* The $operation parameter may be either OP_ADD or OP_REMOVE, depending on whether the
* user wants to add or remove a rule, respectively:
*
* OP_ADD specifics:
*
* A rule is added that would allow one or more Roles access to [certain $privileges
* upon] the specified Resource(s).
*
* OP_REMOVE specifics:
*
* The rule is removed only in the context of the given Roles, Resources, and privileges.
* Existing rules to which the remove operation does not apply would remain in the
* ACL.
*
* The $type parameter may be either TYPE_ALLOW or TYPE_DENY, depending on whether the
* rule is intended to allow or deny permission, respectively.
*
* The $roles and $resources parameters may be references to, or the string identifiers for,
* existing Resources/Roles, or they may be passed as arrays of these - mixing string identifiers
* and objects is ok - to indicate the Resources and Roles to which the rule applies. If either
* $roles or $resources is null, then the rule applies to all Roles or all Resources, respectively.
* Both may be null in order to work with the default rule of the ACL.
*
* The $privileges parameter may be used to further specify that the rule applies only
* to certain privileges upon the Resource(s) in question. This may be specified to be a single
* privilege with a string, and multiple privileges may be specified as an array of strings.
*
* If $assert is provided, then its assert() method must return true in order for
* the rule to apply. If $assert is provided with $roles, $resources, and $privileges all
* equal to null, then a rule having a type of:
*
* TYPE_ALLOW will imply a type of TYPE_DENY, and
*
* TYPE_DENY will imply a type of TYPE_ALLOW
*
* when the rule's assertion fails. This is because the ACL needs to provide expected
* behavior when an assertion upon the default ACL rule fails.
*
* @param string $operation
* @param string $type
* @param Zend_Acl_Role_Interface|string|array $roles
* @param Zend_Acl_Resource_Interface|string|array $resources
* @param string|array $privileges
* @param Zend_Acl_Assert_Interface $assert
* @throws Zend_Acl_Exception
* @uses Zend_Acl_Role_Registry::get()
* @uses Zend_Acl::get()
* @return Zend_Acl Provides a fluent interface
*/
public function setRule($operation, $type, $roles = null, $resources = null, $privileges = null,
Zend_Acl_Assert_Interface $assert = null)
Generated API documentation is merely a reference guide, not a book to read
before sleep. Lengthy descriptions truly do not belong here.
The most popular place for expansive documentation is file headers:
<?php
/**
* Zend Framework
*
* LICENSE
*
* This source file is subject to the new BSD license that is bundled
* with this package in the file LICENSE.txt.
* It is also available through the world-wide-web at this URL:
* http://framework.zend.com/license/new-bsd
* If you did not receive a copy of the license and are unable to
* obtain it through the world-wide-web, please send an email
* to license@zend.com so we can send you a copy immediately.
*
* @category Zend
* @package Zend_Db
* @subpackage Adapter
* @copyright Copyright (c) 2005-2012 Zend Technologies USA Inc. (http://www.zend.com)
* @license http://framework.zend.com/license/new-bsd New BSD License
* @version $Id: Abstract.php 25229 2013-01-18 08:17:21Z frosch $
*/
Sometimes it seems the intention is to stretch the header so far that the code itself
isn't even visible when you open the file. What use is ten lines about the New BSD
license, whose key announcements are that it is bundled in the LICENSE.txt file, that
it is also available on the world-wide-web, and that if you happen to lack such modern
innovations as a so-called web browser, you can send an email to license@zend.com and
they will send you a copy immediately? On top of that, it is redundantly repeated
4,400 times. I tried sending such a request, but no reply ever came 🙂
Also, stating the year in the copyright notice leads to a fondness for commits like
update copyright year to 2014, which touch every single file and complicate
comparing versions.
Is it really necessary to include copyright in every file? From a legal
perspective, it is not required, but if open source licenses allow users to use
parts of the code while retaining copyrights, it is appropriate to include them.
It's also useful to state in each file which product it originates from,
helping people navigate when they encounter it individually. A good
example is:
/**
* Zend Framework (http://framework.zend.com/)
*
* @link http://github.com/zendframework/zf2 for the canonical source repository
* @copyright Copyright (c) 2005-2014 Zend Technologies USA Inc. (http://www.zend.com)
* @license http://framework.zend.com/license/new-bsd New BSD License
*/
Please think carefully about each line and whether it truly benefits the
user. If not, it's rubbish that doesn't belong in the code.
(Please, commentators, do not perceive this article as a battle of
frameworks; it definitely is not.)
Let's create simple OOP wrapper for encoding and decoding JSON
in PHP:
class Json
{
public static function encode($value)
{
$json = json_encode($value);
if (json_last_error()) {
throw new JsonException;
}
return $json;
}
public static function decode($json)
{
$value = json_decode($json);
if (json_last_error()) {
throw new JsonException;
}
return $value;
}
}
class JsonException extends Exception
{
}
// usage:
$json = Json::encode($arg);
Simple.
But also very naive. PHP has a ton of bugs (sometimes dismissed as “not a bug”) that
need workarounds:
- json_encode() is (nearly) the only function in all of PHP whose behavior is affected by the display_errors directive. Yes, JSON encoding is affected by an error-displaying directive. If you want to detect the Invalid UTF-8 sequence error, you must disable this directive. (#52397, #54109, #63004, not fixed)
- json_last_error() returns the last error (if any) that occurred during the last JSON encoding/decoding. Sometimes! In the case of the Recursion detected error it returns 0. You must install your own error handler to catch it. (Fixed after years in PHP 5.5.0)
- json_last_error() sometimes doesn't return the last error but the last-but-one error: json_decode('') with an empty string doesn't clear the last-error flag, so you cannot rely on the error code. (Fixed in PHP 5.3.7)
- json_decode() returns null if the JSON cannot be decoded or if the encoded data is deeper than the recursion limit. Fine, but json_decode('null') returns null too, so we have the same return value for success and for failure. Great!
- json_decode() is unable to detect an Invalid UTF-8 sequence in PHP < 5.3.3 or when the PECL implementation is used. You must check it yourself.
- json_last_error() has existed since PHP 5.3.0, so the minimal required version for our wrapper is PHP 5.3.
- json_last_error() returns only a numeric code. If you want to throw an exception with a message, you must build your own table of messages (json_last_error_msg() was added only in PHP 5.5.0).
So the simple class wrapper for encoding and decoding JSON now looks
like this:
class Json
{
private static $messages = array(
JSON_ERROR_DEPTH => 'The maximum stack depth has been exceeded',
JSON_ERROR_STATE_MISMATCH => 'Syntax error, malformed JSON',
JSON_ERROR_CTRL_CHAR => 'Unexpected control character found',
JSON_ERROR_SYNTAX => 'Syntax error, malformed JSON',
5 /*JSON_ERROR_UTF8*/ => 'Invalid UTF-8 sequence',
6 /*JSON_ERROR_RECURSION*/ => 'Recursion detected',
7 /*JSON_ERROR_INF_OR_NAN*/ => 'Inf and NaN cannot be JSON encoded',
8 /*JSON_ERROR_UNSUPPORTED_TYPE*/ => 'Type is not supported',
);
public static function encode($value)
{
// needed to receive 'Invalid UTF-8 sequence' error; PHP bugs #52397, #54109, #63004
if (function_exists('ini_set')) { // ini_set is disabled on some hosts :-(
$old = ini_set('display_errors', 0);
}
// needed to receive 'recursion detected' error
set_error_handler(function($severity, $message) {
restore_error_handler();
throw new JsonException($message);
});
$json = json_encode($value);
restore_error_handler();
if (isset($old)) {
ini_set('display_errors', $old);
}
if ($error = json_last_error()) {
$message = isset(static::$messages[$error]) ? static::$messages[$error] : 'Unknown error';
throw new JsonException($message, $error);
}
return $json;
}
public static function decode($json)
{
if (!preg_match('##u', $json)) { // workaround for PHP < 5.3.3 & PECL JSON-C
throw new JsonException('Invalid UTF-8 sequence', 5);
}
$value = json_decode($json);
if ($value === null
&& $json !== '' // it doesn't clean json_last_error flag until 5.3.7
&& $json !== 'null' // in this case null is not failure
) {
$error = json_last_error();
$message = isset(static::$messages[$error]) ? static::$messages[$error] : 'Unknown error';
throw new JsonException($message, $error);
}
return $value;
}
}
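For completeness, a usage sketch (mine, not from the original article):
try {
	$json = Json::encode($arg);
} catch (JsonException $e) {
	echo 'JSON failed: ' . $e->getMessage();
}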
This implementation is used in Nette Framework, where there is also a workaround for
another bug, this time a bug of JSON itself. JSON is in fact not a subset of JavaScript
because of the characters \u2028 and \u2029: they must not appear unescaped in
JavaScript, so they have to be encoded as well.
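A minimal sketch of such a workaround (my own wording, not the exact Nette code) could escape the two characters right after encoding:
// U+2028 and U+2029 are valid inside JSON strings but not in JavaScript source,
// so replace their raw UTF-8 byte sequences with the escaped form
$json = str_replace(
	array("\xE2\x80\xA8", "\xE2\x80\xA9"),
	array('\u2028', '\u2029'),
	json_encode($value)
);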
(In PHP, detection of errors in JSON encoding/decoding is hell, but it is
nothing compared to detection of errors in PCRE
functions.)
The journey into the heart of the three best-known CSS preprocessors continues, though
not in the way I originally planned.
A CSS preprocessor is a tool that takes code written in its own syntax and generates
CSS for the browser. The most popular preprocessors are SASS, LESS and Stylus. We have
already talked about installation and about syntax + mixins. Each of the three
preprocessors approaches the concept of mixins in a fundamentally different way.
Each of them has a gallery of ready-made mixins: for SASS there is the comprehensive
Compass, LESS has the Twitter Bootstrap framework or the small Elements, and Stylus
has NIB.
… these were the opening sentences of an article I started writing a year and a
quarter ago and never finished. I came to the conclusion that all three preprocessors
are useless. They require so many compromises that the potential benefits seemed
insignificant. Today I will explain why.
There is nothing worse than uploading files manually over FTP, for example with
Total Commander. (Although editing files directly on the server and then desperately
trying to synchronize them is even worse.) If you fail to automate the process, it
consumes far more of your time and increases the risk of errors, such as forgetting
to upload a file.
Today, sophisticated application deployment techniques are used, such as via
Git, but many people still stick to uploading individual files via FTP. For
them, the FTP Deployment tool is designed to automate and simplify the uploading
of applications over FTP.
FTP Deployment is a PHP script that automates the entire process. You simply specify
which local directory (local) should be uploaded to which remote one (remote). These
details are written into a deployment.ini file, and clicking that file launches the
script, making deployment a one-click affair:
deployment deployment.ini
What does the deployment.ini file look like? The remote item is actually the only
required field; all others are optional:
; remote FTP server
remote = ftp://user:secretpassword@ftp.example.com/directory
; you can use ftps:// or sftp:// protocols (sftp requires SSH2 extension)
; do not like to specify user & password in 'remote'? Use these options:
;user = ...
;password = ...
; FTP passive mode
passiveMode = yes
; local path (optional)
local = .
; run in test-mode? (can be enabled by option -t or --test too)
test = no
; files and directories to ignore
ignore = "
.git*
project.pp[jx]
/deployment.*
/log
temp/*
!temp/.htaccess
"
; is allowed to delete remote files? (defaults to yes)
allowDelete = yes
; jobs to run before uploading
before[] = local: lessc assets/combined.less assets/combined.css
before[] = http://example.com/deployment.php?before
; jobs to run after uploading and before uploaded files are renamed
afterUpload[] = http://example.com/deployment.php?afterUpload
; directories to purge after uploading
purge[] = temp/cache
; jobs to run after everything (upload, rename, delete, purge) is done
after[] = remote: unzip api.zip
after[] = remote: chmod 0777 temp/cache ; change permissions
after[] = http://example.com/deployment.php?after
; files to preprocess (defaults to *.js *.css)
preprocess = no
; file which contains hashes of all uploaded files (defaults to .htdeployment)
deploymentFile = .deployment
; default permissions for new files
;filePermissions = 0644
; default permissions for new directories
;dirPermissions = 0755
In test mode (when started with the -t parameter), no file uploads or deletions occur
on the FTP server, so you can use it to check whether all values are set correctly.
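For instance, a dry run over the same configuration file could look like this:
deployment deployment.ini --test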
The ignore item uses the same format as .gitignore:
- log – ignores all log files or directories, even within all subfolders
- /log – ignores the log file or directory in the root directory
- app/log – ignores the log file or directory in the app subfolder of the root directory
- data/* – ignores everything inside the data folder but still creates the folder on FTP
- !data/session – excludes the session file or folder from the previous rule
- project.pp[jx] – ignores project.ppj and project.ppx files or directories
Before the upload starts and after it finishes, you can have scripts called on your
server (see before and after), which can, for instance, switch the server into
maintenance mode and send a 503 header.
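Such a script is not part of the tool itself; a minimal sketch of a hypothetical deployment.php handling those URLs could look like this:
<?php
// hypothetical deployment.php: toggles a maintenance flag for the before/after jobs
$lock = __DIR__ . '/maintenance.lock';

switch ($_GET ? key($_GET) : null) {
	case 'before': // called before the upload starts
		touch($lock);
		break;
	case 'after': // called when deployment is finished
		if (is_file($lock)) {
			unlink($lock);
		}
		break;
}
The application's front controller can then check for maintenance.lock and send a 503 Service Unavailable header while the file exists.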
To ensure that synchronizing a large number of files happens (as far as possible)
transactionally, all files are first uploaded with the .deploytmp extension and then
quickly renamed. In addition, a .htdeployment file containing MD5 hashes of all the
files is stored on the server and used for subsequent synchronizations.
On subsequent runs, only changed files are uploaded and removed files are deleted
(unless prevented by the allowDelete directive).
Files can be preprocessed before uploading. By default, all .css files are compressed
using Clean-CSS and .js files using the Google Closure Compiler. Before compression,
basic Apache mod_include directives are expanded. For instance, you can create a
combined.js file:
<!--#include file="jquery.js" -->
<!--#include file="jquery.fancybox.js" -->
<!--#include file="main.js" -->
You can request Apache on your local server to assemble this by combining the
three mentioned files as follows:
<FilesMatch "combined\.(js|css)$">
Options +Includes
SetOutputFilter INCLUDES
</FilesMatch>
The server will then upload the files in their combined and compressed form.
Your HTML page will save resources by loading just one JavaScript file.
In the deployment.ini configuration file you can create multiple sections, or even use
one configuration file for the data and another for the application, so that
synchronization runs as fast as possible and the fingerprints of a huge number of
files don't have to be calculated every time.
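For illustration, such a split into sections might look roughly like this (the section and directory names are my own; see the tool's documentation for the exact syntax):
[application]
remote = ftp://user:secretpassword@ftp.example.com/directory
ignore = "
	/files
"

[data]
remote = ftp://user:secretpassword@ftp.example.com/directory/files
local = files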
I created the FTP Deployment tool many years ago and it fully covers my
needs for a deployment tool. However, it's important to emphasize that the FTP
protocol, by transmitting the password in plain text, poses a security risk and
you definitely should not use it, for example, on public Wi-Fi.
Few are as keen to emphasize their perceived superiority as
Rails developers. Don't get me wrong, it's a solid marketing strategy.
What's problematic is when you succumb to it to the extent that you see the
rest of the world as mere copycats without a chance to ever catch up. But the
world isn't like that.
Take Dependency Injection, for example. While people in the PHP and
JavaScript communities discovered DI later, Ruby on Rails remains untouched by
it. I was puzzled why a framework with such a progressive image was lagging
behind, and after some digging, I found an answer from various sources on
Google and karmiq, which
states:
Ruby is such a good language that it doesn't need Dependency Injection.
This fascinating argument, moreover, is self-affirming in an elitist
environment. But is it really true? Or is it just blindness caused by pride, the
same blindness that recently led to much-discussed security vulnerabilities
in Rails?
I wondered if perhaps I knew so little about Ruby that I missed some key
aspect, and that it truly is a language that doesn’t need DI. However, the
primary purpose of Dependency
Injection is to clearly pass dependencies so that the code is
understandable and predictable (and thus better testable). But when I look
at the Rails documentation on the “blog in a few minutes” tutorial, I see
something like:
def index
@posts = Post.all
end
Here, to obtain the blog posts, the static method Post.all is used, which retrieves a
list of articles from somewhere (!). From a database? From a file? Conjured up?
I don't know, because DI isn't used here. Instead, it's some kind of static hell.
Ruby is undoubtedly a clever language, but it doesn't replace DI.
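For comparison, here is roughly what the same action looks like with Dependency Injection, sketched in PHP with hypothetical class names:
class PostController
{
	/** @var PostRepository */
	private $posts;

	// the dependency is passed in explicitly, so it is obvious where
	// the articles come from and what to substitute in tests
	public function __construct(PostRepository $posts)
	{
		$this->posts = $posts;
	}

	public function index()
	{
		return $this->posts->findAll();
	}
}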
In Ruby, you can override methods at runtime (monkey patching, much like in
JavaScript), which is a form of Inversion of Control (IoC) that allows a different
implementation of the static method Post.all to be substituted for testing purposes.
However, this does not replace DI, and it certainly doesn't make the code clearer,
rather the opposite.
Incidentally, I was also struck by the Post class: it represents a single blog post
and at the same time acts as a repository (the all method), which is a textbook
violation of the Single Responsibility Principle.
The justification often cited for why Ruby doesn't need DI refers to the
article LEGOs,
Play-Doh, and Programming. I read it thoroughly, noting how the author
occasionally confuses “DI” with a “DI framework” (akin to confusing
“Ruby” with “Ruby on Rails”) and ultimately found that it doesn’t
conclude that Ruby doesn’t need Dependency Injection. It says that it
doesn’t need DI frameworks like those known from Java.
One misinterpreted conclusion, if flattering, can completely bewilder a huge
group of intelligent people. After all, the myth that spinach contains an
extraordinary amount of iron has been persistent since 1870.
Ruby is a very interesting language, and like in any other, it pays to use
DI. There are even DI frameworks available for it. Rails is an intriguing
framework that has yet to discover DI. When it does, it will be a major topic
for some of its future versions.
(After attempting to discuss DI with Karmiq, whom I consider the most
intelligent Railist, I am keeping the comments closed, apologies.)
First, a question: match or no match?
$str = "123\n";
echo preg_match('~^\d+$~', $str);
If you think the function returns false
because the regular
expression operates in single-line mode and does not allow any characters other
than digits in the string, you are mistaken.
I'll digress slightly. Regular expressions in Ruby have a flaw (an inconsistency with
the de facto Perl standard): the ^ and $ characters do not denote the start and end of
the string, but only the start and end of a line within it. Not knowing this can cause
security vulnerabilities, as noted in the Rails documentation. PHP behaves according
to the standard, but few know what exactly that standard behavior means. The
documentation for the meta-character $ was imprecise (now corrected).
Correctly: the $ character means the end of the string or a terminating newline; in
multiline mode (modifier m) it means the end of a line. The actual end of the string
is matched by the sequence \z. Alternatively, you can use the dollar sign together
with the modifier D.
$str = "123\n";
echo preg_match('~^[0-9]+$~', $str); // true
echo preg_match('~^[0-9]+$~D', $str); // false
echo preg_match('~^[0-9]+\z~', $str); // false
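And for completeness, a sketch (my own example, not from the original article) of how the m modifier changes the meaning of $:
$str = "123\nabc";
echo preg_match('~^[0-9]+$~', $str);  // false – the newline is not at the very end
echo preg_match('~^[0-9]+$~m', $str); // true – $ matches the end of the first line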
Large frameworks aren't always and universally suitable for
everyone and everything!
I borrowed the title from the Manifesto of
Miniature PHP, which I would happily sign electronically, if I had a
digital signature. Although the argument about counting lines is unfair and
debatable, I understand what the author was trying to say. On Zdroják,
I wrote a comment that I eventually decided to immortalize here on
the blog:
I often make simple websites, which I write entirely in “notepad”, and
I want the code to have no more lines than is absolutely necessary. Uploading a
several-megabyte framework for a 20kB website, including styles, to a hosting
service is out of the question.
Yet, even in these simple websites, I want to use solutions that are
available in Nette, and I don't want to give up the comfort I'm used to. I am
a lazy programmer. For this reason, the Nette
Framework can be used as a micro-framework.
An example would be appropriate. Just yesterday, I redesigned https://davidgrudl.com and made the source
code available (check the top left corner), purely for inspiration to others on
how I handle such a microsite. The entire PHP code of the website is contained
in a single file, index.php, which is, I believe, understandable, although
perhaps less so for the uninitiated. The rest are templates. And the framework
is uploaded in the minified form of a single file, which, along with the fact
that it's about twice the size of jQuery, overcomes the psychological block of
“not wanting to upload a whole framework.”
Or take the example of a blog found directly in the distribution. Its source
code is also just index.php, with even fewer lines than the previous example.
Everything else is templates, see https://github.com/…ta/templates.
Perhaps I should explain why I actually use a framework on tiny websites.
Mainly, today I cannot imagine programming without Tracy, which then logs errors on the
production server (although they are unlikely with a static website). But
I primarily use the Latte templating
system because, starting from just 2 pages, I want to separate layout and
content, I like Latte’s concise syntax, and I rely on its automatic
escaping. I also use routing, because simply
wanting URLs without .php
extensions can only be set up correctly
by God himself.
In the first mentioned website, caching is also used for Twitter
feeds, and on the blog, a database
layer is utilized. And there's also a Nette SEO trick that automatically prevents the
well-known problem of paging back and forth and landing on the same page twice, once
with page=1 in the URL and once without.
Nette also ensures that if there is an error, no PHP programming error
messages are displayed, but rather a user-understandable page. And also
autoloading – I've come
to take it for granted so much that I would have completely forgotten to
mention it.
Of course, I sometimes add a contact form and have it send emails. Now I realize that
I actually use 90% of the framework.
That's how I create quick'n'dirty websites and that's how I enjoy
it 😉
See also: How to write
micro-websites
How not to get burned when replacing occurrences of one string
with another. Search & Replace tricks.
The basic function for replacing strings in PHP is str_replace:
$s = "Lorem ipsum";
echo str_replace('ore', 'ide', $s); // returns "Lidem ipsum"
Thanks to the cleverly designed UTF-8 encoding, it can be reliably used even on
strings encoded that way. Additionally, the first two arguments can be arrays, and the
function will then perform multiple replacements. Here we run into the first trick to
be aware of: each replacement goes through the string again, so if we wanted to swap
dá ⇔ pá in the phrase pánské dárky to get dánské párky (a Danish delicacy!), no order
of arguments will achieve this:
// returns "dánské dárky"
echo str_replace(array('dá', 'pá'), array('pá', 'dá'), "pánské dárky");
// returns "pánské párky"
echo str_replace(array('pá', 'dá'), array('dá', 'pá'), "pánské dárky");
The sought-after function that goes through the string just once and prevents
collisions is strtr:
// returns "dánské párky", hooray
echo strtr("pánské dárky", array('pá' => 'dá', 'dá' => 'pá'));
If we need to find occurrences according to more complex rules, we use regular
expressions and the function preg_replace. It also allows multiple replacements and
behaves similarly to str_replace. Now, however, I am heading elsewhere: I need to
replace all the numbers in a string with the word hafo, which is easy:
$s = "Radek says he has an IQ of 151. Quite the collector's item!";
echo preg_replace('#\d+#', 'hafo', $s);
Let's generalize the code so that it can replace the numbers with anything passed in
the variable $replacement. Many programmers will use:
return preg_replace('#\d+#', $replacement, $s); // wrong!
Unfortunately, that's not right. It's important to realize that certain characters
have a special meaning in the replacement string (namely the backslash and the dollar
sign), so we must escape them (see the definitive guide to escaping). The correct
general solution is:
return preg_replace('#\d+#', addcslashes($replacement, '$\\'), $s); // ok
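Another option, not mentioned in the original text, is to sidestep escaping altogether with preg_replace_callback, whose return value is inserted verbatim:
return preg_replace_callback('#\d+#', function () use ($replacement) {
	return $replacement; // used as-is, no $ or \ processing
}, $s);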
Do any other replacement tricks come to mind?
Dependency
Injection is a technique that solves certain problems but also introduces
new challenges. These challenges are then addressed by a DI (Dependency
Injection) container, which requires you to adopt a new perspective on
object-oriented design.
If the problems that DI solves do not bother you, then you might perceive its
implementation as an unnecessary hassle, particularly because it necessitates
learning a new approach to object-oriented design.
However, it seems that if you are not bothered by the issues DI addresses,
you have a serious problem. Which you will realize once you discover it.
Do you know the complaints developers have about their clients
not having a clear vision and constantly changing the project requirements?
That's them crying over their own inability. Whenever I hear this, I wish the
poor client had a better provider.
The client doesn't have a clear brief because they are not experts in web
design. I wonder how many web designers understand their client's business
well enough that they could create a precise brief if the roles were
reversed.
If the client continuously changes the requirements, it means they are
interested and engaged in the project, constantly thinking about it. There's a
higher chance that something truly useful will emerge. And most importantly:
they will keep asking for more and more work.
If the developer realizes this, they will understand that it is they who must
adapt their working style. Perhaps simplify the addition of a ZIP code column on
the website, even though it wasn't in the original brief.