PHP Category

How to Write an Error Handler in PHP?

When writing your own error handler for PHP, it is absolutely necessary to follow several rules. Otherwise, it can disrupt the behavior of other libraries and applications that do not expect treachery in the error handler.

Parameters

The signature of the handler looks like this:

function errorHandler(
    int $severity,
    string $message,
    string $file,
    int $line,
    array $context = null // only in PHP < 8
): ?bool {
    ...
}

The $severity parameter contains the error level (E_NOTICE, E_WARNING, …). Fatal errors such as E_ERROR cannot be caught by the handler, so this parameter will never have these values. Fortunately, fatal errors have essentially disappeared from PHP and have been replaced by exceptions.

The $message parameter is the error message. If the html_errors directive is enabled, special characters like < are written as HTML entities, so you need to decode them back to plain text. However, beware, some characters are not written as entities, which is a bug. Displaying errors in pure PHP is thus prone to XSS.

The $file and $line parameters represent the name of the file and the line where the error occurred. If the error occurred inside eval(), $file will be supplemented with this information.

Finally, the $context parameter contains an array of local variables, which is useful for debugging, but this has been removed in PHP 8. If the handler is to work in PHP 8, omit this parameter or give it a default value.

Return Value

The return value of the handler can be null or false. If the handler returns null, nothing happens. If it returns false, the standard PHP handler is also called. Depending on the PHP configuration, this can print or log the error. Importantly, it also fills in internal information about the last error, which is accessible by the error_get_last() function.

Suppressed Errors

In PHP, error display can be suppressed either using the shut-up operator @ or by error_reporting():

// suppress E_USER_DEPRECATED level errors
error_reporting(E_ALL & ~E_USER_DEPRECATED);

// suppress all errors when calling fopen()
$file = @fopen($name, 'r');

Even when errors are suppressed, the handler is still called. Therefore, it is first necessary to verify whether the error is suppressed, and if so, we must end our own handler:

if (!($severity & error_reporting())) {
    return false;
}

However, in this case, we must end it with return false, so that the standard error handler is still executed. It will not print or log anything (because the error is suppressed), but ensures that the error can be detected using error_get_last().
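Here is a minimal sketch of why return false matters for suppressed errors. It assumes PHP 7.0+ (for error_clear_last()); the handler body and the message text are only illustrative:

```php
<?php
// Illustrative handler: processes visible errors itself, defers suppressed ones to PHP.
set_error_handler(function (int $severity, string $message, string $file, int $line): bool {
    if (!($severity & error_reporting())) {
        return false; // suppressed: let the standard handler record it
    }
    // ... custom processing of visible errors would go here ...
    return true;
});

error_clear_last();                                     // start from a clean state
@trigger_error('disk is almost full', E_USER_WARNING);  // suppressed by @

// Nothing was printed or logged, yet the warning can still be detected:
$last = error_get_last();

restore_error_handler();
```

If the handler returned null or true in the suppressed branch instead, the standard handler would not run and error_get_last() would have nothing to report.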

Other Errors

If our handler processes the error (for example, displays its own message, etc.), there is no need to call the standard handler. Although then it will not be possible to detect the error using error_get_last(), this does not matter in practice, as this function is mainly used in combination with the shut-up operator.

If, on the other hand, the handler does not process the error for any reason, it should return false so as not to conceal it.

Example

Here's what the code for a custom error handler that transforms errors into ErrorException exceptions might look like:

set_error_handler(function (int $severity, string $message, string $file, int $line) {
    if (!(error_reporting() & $severity)) {
        return false;
    }

    throw new \ErrorException($message, 0, $severity, $file, $line);
});
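A short usage sketch of the handler above. The error_reporting(E_ALL) call and the warning text are assumptions made only so that the example is self-contained:

```php
<?php
error_reporting(E_ALL); // assumption: make sure the warning is not filtered out

set_error_handler(function (int $severity, string $message, string $file, int $line) {
    if (!(error_reporting() & $severity)) {
        return false;
    }
    throw new \ErrorException($message, 0, $severity, $file, $line);
});

$caught = null;
try {
    trigger_error('Configuration value is missing', E_USER_WARNING);
} catch (\ErrorException $e) {
    $caught = $e; // the warning arrives as an exception
}

restore_error_handler();
```

The original severity stays available through $caught->getSeverity(), so the calling code can still distinguish a notice from a warning.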

How to Mock Final Classes?

How do you mock classes that are declared final, or whose methods are final?

Mocking means replacing the original object with a test imitation that performs no real functionality and merely looks like the original object, while simulating the behavior we need for the test.

For example, instead of a PDO with methods like query() etc., we create a mock that pretends to work with the database and instead verifies that the correct SQL statements are called, etc. See, for example, the Mockery documentation.

To be able to pass the mock to methods that use a PDO type hint, the mock class must inherit from PDO. And that can be a stumbling block: if PDO or its query() method were final, it would not be possible.

Is there any solution? The first option is not to use the final keyword at all. This, of course, does not help with third-party code that uses it, and, more importantly, it gives up an important element of object design. For example, there is a dogma that every class should be either final or abstract.

The second and very handy option is to use BypassFinals, which removes finals from source code on-the-fly and allows mocking of final methods and classes.

Install it using Composer:

composer require dg/bypass-finals --dev

And just call at the beginning of the test:

require __DIR__ . '/vendor/autoload.php';

DG\BypassFinals::enable();

That's all. Incredibly black magic 🙂

BypassFinals requires PHP version 5.6 and supports PHP up to 7.2. It can be used together with any test tool such as PHPUnit or Mockery.

This functionality is implemented directly in Nette Tester (https://tester.nette.org) since version 2.0 and can be enabled this way:

require __DIR__ . '/vendor/autoload.php';

Tester\Environment::bypassFinals();

Everything About Output Buffering in PHP

And what you won't read in the documentation, including a security patch and advice on speeding up server response without slowing it down.

Output buffering allows the output of a PHP script (primarily from the echo function) to be stored in memory (i.e., a buffer) instead of being sent immediately to the browser or terminal. This is useful for various purposes.

Preventing Output to the Screen:

ob_start();  // enables output buffering
$foo->bar();  // all output goes only to the buffer
ob_end_clean();  // clears the buffer and ends buffering

Capturing Output into a Variable:

ob_start();  // enables output buffering
$foo->render();  // output goes only to the buffer
$output = ob_get_contents();  // saves the buffer content into a variable
ob_end_clean();  // clears the buffer and ends buffering

The pair ob_get_contents() and ob_end_clean() can be replaced by a single function, ob_get_clean(), which lacks end in its name but does indeed turn off output buffering:

$output = ob_get_clean();  // saves the buffer content into variable and disables buffering

In the given examples, the buffer content did not reach the output at all. If you want to send it to the output instead, you should use ob_end_flush() instead of ob_end_clean(). To simultaneously get the buffer content, send it to the output, and end buffering, there is also a shortcut: ob_get_flush().

You can empty the buffer at any time without ending it using ob_clean() (clears it) or ob_flush() (sends it to the output):

ob_start();  // enables output buffering
$foo->bar();  // all output goes only to the buffer
ob_clean();  // clears the buffer content, but buffering remains active
$foo->render(); // output still goes to the buffer
ob_flush(); // sends the buffer to the output
$none = ob_get_contents();  // the buffer content is now an empty string
ob_end_clean();  // disables output buffering

Output written to php://output is also sent to the buffer, while buffers can be bypassed by writing to php://stdout (or STDOUT), which is available only under CLI, i.e., when running scripts from the command line.

Nesting

Buffers can be nested, so while one buffer is active, calling ob_start() activates a new buffer. Thus, ob_end_flush() and ob_flush() send the buffer content not to the output but to the parent buffer. Only when there is no parent buffer does the content get sent to the actual output, i.e., the browser or terminal.

Therefore, it is important to end buffering, even if an exception occurs during the process:

ob_start();
try {
    $foo->render();
} finally {  // finally available from PHP 5.5
    ob_end_clean(); // or ob_end_flush()
}
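The nesting and the try/finally pattern can be combined into a small helper. This is only a sketch; captureOutput() is a hypothetical name, not a PHP function:

```php
<?php
// Hypothetical helper: runs $render and returns whatever it printed.
function captureOutput(callable $render): string
{
    ob_start();
    try {
        $render();
        return (string) ob_get_contents();
    } finally {
        ob_end_clean(); // always close our buffer, even when $render throws
    }
}

// Nested use: the inner buffer is captured separately and never reaches the outer one.
$inner = '';
$outer = captureOutput(function () use (&$inner) {
    echo 'outer ';
    $inner = captureOutput(function () {
        echo 'inner';
    });
    echo 'again';
});

// An exception inside the callback does not leave a buffer open.
$level = ob_get_level();
try {
    captureOutput(function () {
        echo 'partial output';
        throw new RuntimeException('render failed');
    });
} catch (RuntimeException $e) {
    // ob_get_level() is back at $level here
}
```

Because the helper always closes exactly the buffer it opened, the nesting level after the call is the same as before it, whether the callback finished or failed.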

Buffer Size

Buffering can also speed up page generation (I haven't measured this, but it sounds logical) by sending data to the browser in larger chunks (e.g., 4 kB) instead of after every single echo. Just call at the beginning of the script:

ob_start(null, 4096);

When the buffer size exceeds 4096 bytes (the so-called chunk size), a flush is performed automatically, i.e., the buffer is emptied and sent out. The same can be achieved by setting the output_buffering directive, which is ignored in CLI mode.

But beware, starting buffering without specifying the size, i.e., simply with ob_start(), will cause the page not to be sent gradually but only after it is fully rendered, making the server appear very slow!

HTTP Headers

Output buffering has no effect on sending HTTP headers, which are processed by a different path. However, thanks to buffering, headers can be sent even after some output has been printed, as it is still held in the buffer. This is a side effect you shouldn't rely on, as there is no certainty when the output will exceed the buffer size and be sent.

Security Hole

When the script ends, all unclosed buffers are outputted. This can be considered an unpleasant security hole if, for example, you prepare sensitive data in the buffer not intended for output and an error occurs. The solution is to use a custom handler:

ob_start(function () { return ''; });

Handlers

You can attach a custom handler to output buffering, i.e., a function that processes the buffer content before sending it out:

ob_start(
    function ($buffer, $phase) { return strtoupper($buffer); }
);
echo 'Hello';
ob_end_flush(); // 'HELLO' is sent to the output

Functions ob_clean() or ob_end_clean() will call the handler but discard the output without sending it out. The handler can detect which function is called and respond accordingly. The second parameter $phase is a bitmask (from PHP 5.4):

PHP_OUTPUT_HANDLER_START when the buffer is opened
PHP_OUTPUT_HANDLER_FINAL when the buffer is closed
PHP_OUTPUT_HANDLER_FLUSH when ob_flush() is called (but not ob_end_flush() or ob_get_flush())
PHP_OUTPUT_HANDLER_CLEAN when ob_clean(), ob_end_clean(), and ob_get_clean() are called
PHP_OUTPUT_HANDLER_WRITE when an automatic flush occurs

The start, final, and flush (or clean) phases can occur simultaneously, distinguished by the binary operator &:

if ($phase & PHP_OUTPUT_HANDLER_START) { ... }
if ($phase & PHP_OUTPUT_HANDLER_FLUSH) { ... }
elseif ($phase & PHP_OUTPUT_HANDLER_CLEAN) { ... }
if ($phase & PHP_OUTPUT_HANDLER_FINAL) { ... }

The PHP_OUTPUT_HANDLER_WRITE phase occurs only if the buffer has a size (chunk size) and that size was exceeded. This is the mentioned automatic flush. Note, the constant PHP_OUTPUT_HANDLER_WRITE has a value of 0, so you can't use a bit test, but:

if ($phase === PHP_OUTPUT_HANDLER_WRITE) { ... }
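To see the phases in action, here is a small sketch of a handler that records the bitmask of every call. The outer ob_start() is there only so the example can inspect the result instead of printing it; the variable names are made up:

```php
<?php
$phases = [];

ob_start(); // outer buffer, stands in for the real output
ob_start(function (string $buffer, int $phase) use (&$phases): string {
    $phases[] = $phase; // remember the bitmask of every handler call
    return strtoupper($buffer);
});

echo 'Hello';
ob_flush();      // first handler call: start + flush
echo ' world';
ob_end_flush();  // second handler call: final
$result = ob_get_clean();

$first = $phases[0];
$last = $phases[count($phases) - 1];

$startOnFirst = (bool) ($first & PHP_OUTPUT_HANDLER_START);
$flushOnFirst = (bool) ($first & PHP_OUTPUT_HANDLER_FLUSH);
$finalOnLast = (bool) ($last & PHP_OUTPUT_HANDLER_FINAL);
```

The first call carries the start and flush bits together, which is why the phases have to be tested with & rather than compared with ===.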

A handler doesn't have to support all operations. When activating with ob_start(), you can specify the bitmask of supported operations as the third parameter:

PHP_OUTPUT_HANDLER_CLEANABLE – allows calling ob_clean() and related functions
PHP_OUTPUT_HANDLER_FLUSHABLE – allows calling ob_flush()
PHP_OUTPUT_HANDLER_REMOVABLE – buffer can be ended
PHP_OUTPUT_HANDLER_STDFLAGS – combines all three flags, the default behavior

This applies even to buffering without a custom handler. For example, if I want to capture the output into a variable, I don't set the PHP_OUTPUT_HANDLER_FLUSHABLE flag, preventing the buffer from being (accidentally) sent to the output with ob_flush(). However, it can still be done with ob_end_flush() or ob_get_flush(), which somewhat defeats the purpose.

Similarly, not setting the PHP_OUTPUT_HANDLER_CLEANABLE flag should prevent the buffer from being cleared, but again it doesn't work.
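A sketch of the part that does work: a buffer opened without PHP_OUTPUT_HANDLER_FLUSHABLE refuses ob_flush(). The outer buffer again stands in for the real output, and the @ only hides the notice that PHP raises for the refused flush:

```php
<?php
ob_start(); // outer buffer, stands in for the real output

// Buffer that may be cleaned and removed, but not flushed.
ob_start(null, 0, PHP_OUTPUT_HANDLER_CLEANABLE | PHP_OUTPUT_HANDLER_REMOVABLE);
echo 'secret';

$flushed = @ob_flush();     // refused, returns false
$captured = ob_get_clean(); // the content is still in the buffer

$leaked = ob_get_clean();   // what reached the outer buffer
```

Here $captured still holds the whole content and nothing reached the outer buffer.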

Finally, not setting PHP_OUTPUT_HANDLER_REMOVABLE makes the buffer user-undeletable; it turns off only when the script ends. An example of a handler that should be set this way is ob_gzhandler, which compresses output, thus reducing volume and increasing data transfer speed. Once this buffer is opened, it sends the HTTP header Content-Encoding: gzip, and all subsequent output must be compressed. Removing the buffer would break the page.

The correct usage is:

ob_start(
    'ob_gzhandler',
    16000, // without chunk size, the server would not send data gradually
    PHP_OUTPUT_HANDLER_FLUSHABLE // but not removable or cleanable
);

You can also enable output compression by setting the zlib.output_compression directive, which turns on buffering with a different handler (not sure how it differs specifically), but it lacks the flag to be non-removable. Since it's good to compress the transfer of all text files, not just PHP-generated pages, it's better to activate compression directly on the HTTP server side.

Are you looking for php_ssh2.dll?

PHP ssh2 thread safe binaries for Microsoft Windows:

PHP 5.4 Short Arrays Converter

Command-line script to convert between array() and PHP 5.4's short syntax []. It uses native PHP tokenizer, so conversion is safe. The script was successfully tested against thousands of PHP files.

Download from GitHub

To convert all *.php and *.phpt files in whole directory recursively or to convert a single file use:

convert.php <directory | file>

To convert source code from STDIN and print the output to STDOUT use:

convert.php < input.php > output.php

To convert short syntax [] to older long syntax array() use option --reverse:

convert.php --reverse [<directory | file>]

Composer: How to Install in Different Ways

Composer, the most important tool for PHP developers, offers three methods to install packages:

local: composer require vendor/name
global: composer global require vendor/name
as a project: composer create-project vendor/name

Local Installation

Local installation is the most common. If I have a project where I want to use Tracy, I enter in the project's root directory:

composer require tracy/tracy

Composer will update (or create) the composer.json file and download Tracy into the vendor subfolder. It also generates an autoloader, so in the code, I just need to include it and can use Tracy right away:

require __DIR__ . '/vendor/autoload.php';
Tracy\Debugger::enable();

As a Project

A completely different situation arises when, instead of a library whose classes I use in my project, I install a tool that I only run from the command line.

An example might be ApiGen for generating clear API documentation. In such cases, the third method is used:

composer create-project apigen/apigen

Composer will create a new folder (and thus a new project) apigen and download the entire tool and install its dependencies.

It will have its own composer.json and its own vendor subfolder.

This method is also used for installations like Nette Sandbox or CodeChecker. However, testing tools such as Nette Tester or PHPUnit are not installed this way because we use their classes in tests, calling Tester\Assert::same() or inheriting from PHPUnit_Framework_TestCase.

Unfortunately, Composer allows tools like ApiGen to be installed using composer require without even issuing a warning.

This is equivalent to forcing two developers, who don't even know each other and who work on completely different projects, to share the same vendor folder. To this one might say:

For heaven's sake, why would they do that?
It just can't work!

Indeed, there is no reasonable reason to do it, it brings no benefit, and it will stop working the moment there is a conflict of libraries used. It's just a matter of time, like building a house of cards that will sooner or later collapse. One project will require library XY in version 1.0, another in version 2.0, and at that point, it will stop working.

Global Installation

The difference between the first and second methods, i.e., between composer require and composer global require, is that the latter involves not two, but ten different developers and ten unrelated projects. Thus, it is nonsense squared.

composer global is a bad solution every time; there is no use case where it would be appropriate. The only advantage is that if you add the global vendor/bin directory to your PATH, you can easily run the tools installed this way.

Summary

Use composer require vendor/name if you want to use library classes.
Never use composer global require vendor/name!
Use composer create-project vendor/name for tools called only from the command line.

Note: npm uses a different philosophy due to JavaScript's capabilities, installing each library as a “separate project” with its own vendor (or node_modules) directory. This prevents version conflicts. In the case of npm, global installations of tools, like LESS CSS, are very useful and convenient.

How to Detect Errors in PHP? Well, that's tricky…

The inability to determine whether a call to a native function succeeded or resulted in an error certainly belongs among the top 5 monstrous quirks of PHP. Yes, you read that right. You call a function and you don't know whether an error occurred or what kind it was.

Now you might be smacking your forehead, thinking: surely I can tell by the return value, right? Hmm…

Return Value

Native (or internal) functions usually return false in case of failure. There are exceptions, such as json_decode (http://php.net/manual/en/function.json-decode.php), which returns null if the input is invalid or exceeds the nesting limit, as mentioned in the documentation. So far so good.

This function is used for decoding JSON and its values, thus calling json_decode('null') also returns null, but as a correct result this time. We must therefore distinguish null as a correct result and null as an error:

$res = json_decode($s);
if ($res === null && $s !== 'null') {
	// an error occurred
}

It's silly, but thank goodness it's even possible. There are functions, however, where you can't tell from the return value that an error has occurred. For example, preg_grep or preg_split return a partial result, namely an array, and you can't tell anything at all (more in Treacherous Regular Expressions).

json_last_error & Co.

Functions that report the last error in a particular PHP extension. Unfortunately, they are often unreliable and it is difficult to determine what that last error actually was.

For example, json_decode('') does not reset the last error flag, so json_last_error returns a result not for the last but for some previous call to json_decode (see How to encode and decode JSON in PHP?). Similarly, preg_match('invalidexpression', $s) does not reset preg_last_error. Some errors do not have a code, so they are not returned at all, etc.

error_get_last

A general function that returns the last error. Unfortunately, it is extremely complicated to determine whether the error was related to the function you called. That last error might have been generated by a completely different function.

One option is to consider error_get_last() only when the return value indicates an error. Unfortunately, for example, the mail() function can generate an error even though it returns true. Or preg_replace may not generate an error at all in case of failure.

The second option is to reset the “last error” before calling our function:

@trigger_error('', E_USER_NOTICE); // reset

$file = fopen($path, 'r');

if (error_get_last()['message']) {
	// an error occurred
}

The code is seemingly clear: an error can only occur during the call to fopen(). But that's not the case. If $path is an object, it will be converted to a string by its __toString method. If this was the last reference to the object, its destructor will also be called. Functions of URL wrappers may be called. And so on.

Thus, even a seemingly innocent line can execute a lot of PHP code, which may generate other errors, the last of which will then be returned by error_get_last().

We must therefore make sure that the error actually occurred during the call to fopen:

@trigger_error('', E_USER_NOTICE); // reset

$file = fopen($path, 'r');

$error = error_get_last();
if ($error['message'] && $error['file'] === __FILE__ && $error['line'] === __LINE__ - 3) {
	// an error occurred
}

The magic constant 3 is the number of lines between __LINE__ and the call to fopen. Please no comments.

In this way, we can detect an error (if the function emits one, which the aforementioned functions for working with regular expressions usually do not), but we are unable to suppress it, i.e., prevent it from being logged, etc. Using the shut-up operator @ is problematic because it conceals everything, including any further PHP code that is called in connection with our function (see the mentioned destructors, wrappers, etc.).
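Assuming PHP 7.0 or newer, the reset step no longer needs the @trigger_error() trick, because error_clear_last() exists for exactly this purpose. A minimal sketch follows; the path is hypothetical and assumed not to exist, and the shut-up operator is used only to keep the example quiet, with all the drawbacks just described:

```php
<?php
error_clear_last(); // reset the "last error"; available since PHP 7.0

// Hypothetical path that is assumed not to exist.
$path = __DIR__ . '/no-such-directory/no-such-file.txt';
$file = @fopen($path, 'r');

$error = error_get_last();
if ($file === false && $error !== null) {
    $reason = $error['message']; // text of the warning raised during the call
}
```

This only makes the reset cleaner. It does not solve the problem that the recorded error may come from other code executed during the call.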

Custom Error Handler

The crazy but seemingly only possible way to detect whether a certain function raised an error, with the possibility of suppressing it, is to install a custom error handler using set_error_handler. But it's no joke to do it right:

we must also remove the custom handler
we must remove it even if an exception is thrown
we must capture only errors that occurred in the incriminated function
and pass all others to the original handler

The result looks like this:

$prev = set_error_handler(function($severity, $message, $file, $line) use (& $prev) {
	if ($file === __FILE__ && $line === __LINE__ + 9) { // magic constant
		throw new Exception($message);
	} elseif ($prev) { // call the previous user handler
		return $prev(...func_get_args());
	}
	return false; // call the system handler
});

try {
	$file = fopen($path, 'r');  // this is the function we care about
} finally {
	restore_error_handler();
}

You already know what the magic constant 9 is.
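The same rules can be wrapped into a reusable helper that avoids the magic line numbers. This is only a sketch under my own assumptions: invokeSafely() is a hypothetical name, and it compares just the file, which is coarser than the line check above. For a native function, the error is reported at the place in this file from which it was called.

```php
<?php
// Hypothetical helper; the name and the use of ErrorException are assumptions.
function invokeSafely(callable $function, array $args = [])
{
    $prev = set_error_handler(function (int $severity, string $message, string $file) use (&$prev) {
        if ($file === __FILE__) { // raised by the call made from this file
            throw new ErrorException($message, 0, $severity);
        }
        // not ours: pass to the previous user handler, or to the system handler via false
        return $prev ? $prev(...func_get_args()) : false;
    });

    try {
        return $function(...$args);
    } finally {
        restore_error_handler(); // removed even if an exception is thrown
    }
}

// Usage: the path is hypothetical and assumed not to exist.
try {
    $file = invokeSafely('fopen', [__DIR__ . '/no-such-directory/file.txt', 'r']);
} catch (ErrorException $e) {
    $file = null;
    $reason = $e->getMessage();
}
```

For the check to stay meaningful, the helper should live in a file of its own, so that __FILE__ identifies little more than the call itself.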

So this is how we live in PHP.

Documentation Quirks

Well-maintained software should have quality API documentation. Certainly. However, just as the absence of documentation is a mistake, so too is its redundancy. Writing documentation comments, much like designing an API or user interface, requires thoughtful consideration.

By thoughtful consideration, I do not mean the process that occurred in the developer's mind when they complemented the constructor with this comment:

class ChildrenIterator
{
	/**
	 * Constructor.
	 *
	 * @param array $data
	 * @return \Zend\Ldap\Node\ChildrenIterator
	 */
	public function __construct(array $data)
	{
		$this->data = $data;
	}

Six lines that add not a single piece of new information. Instead, they contribute to:

visual noise
duplication of information
increased code volume
potential for errors

The absurdity of the mentioned comment may seem obvious, and I'm glad if it does. Occasionally, I receive pull requests that try to sneak similar rubbish into the code. Some programmers even use editors that automatically clutter the code this way. Ouch.

Or consider another example. Think about whether the comment told you anything that wasn't already clear:

class Zend_Mail_Transport_Smtp extends Zend_Mail_Transport_Abstract
{
	/**
	 * EOL character string used by transport
	 * @var string
	 * @access public
	 */
	public $EOL = "\n";

Except for the @return annotation, the usefulness of this comment can also be questioned:

class Form
{
	/**
	 * Adds group to the form.
	 * @param  string $caption	   optional caption
	 * @param  bool   $setAsCurrent  set this group as current
	 * @return ControlGroup
	 */
	public function addGroup($caption = null, $setAsCurrent = true)

If you use expressive method and parameter names (which you should), and they also have default values or type hints, this comment gives you almost nothing. It should either be reduced to remove information duplication or expanded to include more useful information.

But beware of the opposite extreme, such as novels in phpDoc:

	/**
	 * Performs operations on ACL rules
	 *
	 * The $operation parameter may be either OP_ADD or OP_REMOVE, depending on whether the
	 * user wants to add or remove a rule, respectively:
	 *
	 * OP_ADD specifics:
	 *
	 *	  A rule is added that would allow one or more Roles access to [certain $privileges
	 *	  upon] the specified Resource(s).
	 *
	 * OP_REMOVE specifics:
	 *
	 *	  The rule is removed only in the context of the given Roles, Resources, and privileges.
	 *	  Existing rules to which the remove operation does not apply would remain in the
	 *	  ACL.
	 *
	 * The $type parameter may be either TYPE_ALLOW or TYPE_DENY, depending on whether the
	 * rule is intended to allow or deny permission, respectively.
	 *
	 * The $roles and $resources parameters may be references to, or the string identifiers for,
	 * existing Resources/Roles, or they may be passed as arrays of these - mixing string identifiers
	 * and objects is ok - to indicate the Resources and Roles to which the rule applies. If either
	 * $roles or $resources is null, then the rule applies to all Roles or all Resources, respectively.
	 * Both may be null in order to work with the default rule of the ACL.
	 *
	 * The $privileges parameter may be used to further specify that the rule applies only
	 * to certain privileges upon the Resource(s) in question. This may be specified to be a single
	 * privilege with a string, and multiple privileges may be specified as an array of strings.
	 *
	 * If $assert is provided, then its assert() method must return true in order for
	 * the rule to apply. If $assert is provided with $roles, $resources, and $privileges all
	 * equal to null, then a rule having a type of:
	 *
	 *	  TYPE_ALLOW will imply a type of TYPE_DENY, and
	 *
	 *	  TYPE_DENY will imply a type of TYPE_ALLOW
	 *
	 * when the rule's assertion fails. This is because the ACL needs to provide expected
	 * behavior when an assertion upon the default ACL rule fails.
	 *
	 * @param  string								   $operation
	 * @param  string								   $type
	 * @param  Zend_Acl_Role_Interface|string|array	 $roles
	 * @param  Zend_Acl_Resource_Interface|string|array $resources
	 * @param  string|array							 $privileges
	 * @param  Zend_Acl_Assert_Interface				$assert
	 * @throws Zend_Acl_Exception
	 * @uses   Zend_Acl_Role_Registry::get()
	 * @uses   Zend_Acl::get()
	 * @return Zend_Acl Provides a fluent interface
	 */
	public function setRule($operation, $type, $roles = null, $resources = null, $privileges = null,
							Zend_Acl_Assert_Interface $assert = null)

Generated API documentation is merely a reference guide, not a book to read before sleep. Lengthy descriptions truly do not belong here.

The most popular place for expansive documentation is file headers:

<?php
/**
 * Zend Framework
 *
 * LICENSE
 *
 * This source file is subject to the new BSD license that is bundled
 * with this package in the file LICENSE.txt.
 * It is also available through the world-wide-web at this URL:
 * http://framework.zend.com/license/new-bsd
 * If you did not receive a copy of the license and are unable to
 * obtain it through the world-wide-web, please send an email
 * to license@zend.com so we can send you a copy immediately.
 *
 * @category   Zend
 * @package	Zend_Db
 * @subpackage Adapter
 * @copyright  Copyright (c) 2005-2012 Zend Technologies USA Inc. (http://www.zend.com)
 * @license	http://framework.zend.com/license/new-bsd	 New BSD License
 * @version	$Id: Abstract.php 25229 2013-01-18 08:17:21Z frosch $
 */

Sometimes it seems the intention is to stretch the header so long that upon opening the file, the code itself is not visible. What's the use of a 10-line information about the New BSD license, which contains key announcements like its availability in the LICENSE.txt file, accessible via the world-wide-web, and if you happen to lack modern innovations like a so-called web browser, you should send an email to license@zend.com, and they will send it to you immediately? Furthermore, it's redundantly repeated 4,400 times. I tried sending a request, but the response did not come 🙂

Also, including the copyright year in copyrights leads to a passion for making commits like update copyright year to 2014, which changes all files, complicating version comparison.

Is it really necessary to include copyright in every file? From a legal perspective, it is not required, but if open source licenses allow users to use parts of the code while retaining copyrights, it is appropriate to include them. It's also useful to state in each file which product it originates from, helping people navigate when they encounter it individually. A good example is:

/**
 * Zend Framework (http://framework.zend.com/)
 *
 * @link	  http://github.com/zendframework/zf2 for the canonical source repository
 * @copyright Copyright (c) 2005-2014 Zend Technologies USA Inc. (http://www.zend.com)
 * @license   http://framework.zend.com/license/new-bsd New BSD License
 */

Please think carefully about each line and whether it truly benefits the user. If not, it's rubbish that doesn't belong in the code.

(Please, commentators, do not perceive this article as a battle of frameworks; it definitely is not.)

How to encode and decode JSON in PHP?

Let's create a simple OOP wrapper for encoding and decoding JSON in PHP:

class Json
{
	public static function encode($value)
	{
		$json = json_encode($value);
		if (json_last_error()) {
			throw new JsonException;
		}
		return $json;
	}

	public static function decode($json)
	{
		$value = json_decode($json);
		if (json_last_error()) {
			throw new JsonException;
		}
		return $value;
	}
}

class JsonException extends Exception
{
}

// usage:
$json = Json::encode($arg);

Simple.

But it is very naive. PHP has a ton of bugs (sometimes labeled “not-a-bug”) that need workarounds.

json_encode() is (nearly) the only function in all of PHP whose behavior is affected by the display_errors directive. Yes, JSON encoding is affected by a display directive. If you want to detect the Invalid UTF-8 sequence error, you must disable this directive. (#52397, #54109, #63004, not fixed).
json_last_error() returns the last error (if any) that occurred during the last JSON encoding/decoding. Sometimes! In the case of the Recursion detected error, it returns 0. You must install your own error handler to catch this error. (Fixed after years in PHP 5.5.0)
json_last_error() sometimes returns not the last error, but the last-but-one. For example, json_decode('') with an empty string doesn't clear the last error flag, so you cannot rely on the error code. (Fixed in PHP 5.3.7)
json_decode() returns null if the JSON cannot be decoded or if the encoded data is deeper than the recursion limit. OK, but json_decode('null') returns null too. So we have the same return value for success and failure. Great!
json_decode() is unable to detect an Invalid UTF-8 sequence in PHP < 5.3.3 or when the PECL implementation is used. You must check it your own way.
json_last_error() exists since PHP 5.3.0, so the minimum required version for our wrapper is PHP 5.3.
json_last_error() returns only a numeric code. If you'd like to throw an exception, you must create your own table of messages (json_last_error_msg() was added in PHP 5.5.0).

So the simple class wrapper for encoding and decoding JSON now looks like this:

class Json
{
	private static $messages = array(
		JSON_ERROR_DEPTH => 'The maximum stack depth has been exceeded',
		JSON_ERROR_STATE_MISMATCH => 'Syntax error, malformed JSON',
		JSON_ERROR_CTRL_CHAR => 'Unexpected control character found',
		JSON_ERROR_SYNTAX => 'Syntax error, malformed JSON',
		5 /*JSON_ERROR_UTF8*/ => 'Invalid UTF-8 sequence',
		6 /*JSON_ERROR_RECURSION*/ => 'Recursion detected',
		7 /*JSON_ERROR_INF_OR_NAN*/ => 'Inf and NaN cannot be JSON encoded',
		8 /*JSON_ERROR_UNSUPPORTED_TYPE*/ => 'Type is not supported',
	);


	public static function encode($value)
	{
		// needed to receive 'Invalid UTF-8 sequence' error; PHP bugs #52397, #54109, #63004
		if (function_exists('ini_set')) { // ini_set is disabled on some hosts :-(
			$old = ini_set('display_errors', 0);
		}

		// needed to receive 'recursion detected' error
		set_error_handler(function($severity, $message) {
			restore_error_handler();
			throw new JsonException($message);
		});

		$json = json_encode($value);

		restore_error_handler();
		if (isset($old)) {
			ini_set('display_errors', $old);
		}
		if ($error = json_last_error()) {
			$message = isset(static::$messages[$error]) ? static::$messages[$error] : 'Unknown error';
			throw new JsonException($message, $error);
		}
		return $json;
	}


	public static function decode($json)
	{
		if (!preg_match('##u', $json)) { // workaround for PHP < 5.3.3 & PECL JSON-C
			throw new JsonException('Invalid UTF-8 sequence', 5);
		}

		$value = json_decode($json);

		if ($value === null
			&& $json !== ''  // it doesn't clean json_last_error flag until 5.3.7
			&& $json !== 'null' // in this case null is not failure
		) {
			$error = json_last_error();
			$message = isset(static::$messages[$error]) ? static::$messages[$error] : 'Unknown error';
			throw new JsonException($message, $error);
		}
		return $value;
	}
}

This implementation is used in Nette Framework. There is also a workaround for another bug, this time in JSON itself. In fact, JSON is not a subset of JavaScript because of the characters \u2028 and \u2029. They must not be used in JavaScript and must be encoded too.

(In PHP, detection of errors in JSON encoding/decoding is hell, but it is nothing compared to detection of errors in PCRE functions.)
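For comparison, here is a sketch of how much of this shrinks on newer PHP. It assumes PHP 7.3+, where the JSON_THROW_ON_ERROR flag and a built-in \JsonException class exist; decodeJson() is a hypothetical name. Note that on those versions the JsonException class declared above would clash with the built-in one if it lived in the global namespace.

```php
<?php
// Assumes PHP 7.3+; decodeJson() is a hypothetical helper name.
function decodeJson(string $json)
{
    // 512 is the default depth; the flag turns every failure into \JsonException
    return json_decode($json, false, 512, JSON_THROW_ON_ERROR);
}

$ok = decodeJson('null'); // null, and this time unambiguously a valid result

try {
    decodeJson('{"broken": ');
    $failed = false;
} catch (\JsonException $e) {
    $failed = true;       // failure is reported by exception, not by return value
    $code = $e->getCode(); // one of the JSON_ERROR_* constants
}
```

With the flag set, a null return value can only mean that the input really was null.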

Fat and Sausages in String Replacement

How not to get burned when replacing occurrences of one string with another. Search & Replace tricks.

The basic function for replacing strings in PHP is str_replace:

$s = "Lorem ipsum";
echo str_replace('ore', 'ide', $s); // returns "Lidem ipsum"

Thanks to cleverly designed UTF-8 encoding, it can be reliably used even for strings encoded this way. Additionally, the first two arguments can be arrays, and the function will then perform multiple replacements. Here we encounter the first trick to be aware of. Each replacement goes through the string again, so if we wanted to swap dá ⇔ pá in the phrase pánské dárky to get dánské párky (a Swedish delicacy!), no order of arguments will achieve this:

// returns "dánské dárky"
echo str_replace(array('dá', 'pá'), array('pá', 'dá'), "pánské dárky");

// returns "pánské párky"
echo str_replace(array('pá', 'dá'), array('dá', 'pá'), "pánské dárky");

The sought-after function that goes through the string just once and prevents collisions is strtr:

// returns "dánské párky", hooray
echo strtr("pánské dárky", array('pá' => 'dá', 'dá' => 'pá'));

If we need to find occurrences according to more complex rules, we use regular expressions and the function preg_replace. It also allows for multiple replacements and behaves similarly to str_replace. Now, however, I am heading elsewhere. I need to replace all numbers in the string with the word hafo, which is easy:

$s = "Radek says he has an IQ of 151. Quite the collector's item!";
echo preg_replace('#\d+#', 'hafo', $s);

Let's generalize the code so it can replace numbers with anything we pass in the variable $replacement. Many programmers will use:

return preg_replace('#\d+#', $replacement, $s); // wrong!

Unfortunately, that's not right. It's important to realize that certain characters have special meanings in the replacement string (specifically the backslash and the dollar sign), so we must “escape” them (see the definitive guide to escaping). The correct general solution is:

return preg_replace('#\d+#', addcslashes($replacement, '$\\'), $s); // ok
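A small sketch of what goes wrong without the escaping. The replaceNumbers() helper and the sample strings are made up for illustration:

```php
<?php
// Hypothetical helper wrapping the escaped call.
function replaceNumbers(string $s, string $replacement): string
{
    return preg_replace('#\d+#', addcslashes($replacement, '$\\'), $s);
}

$s = 'Price: 151';

// "$10" is read as a reference to capture group 10, which does not exist
$naive = preg_replace('#\d+#', '$10', $s);

// "$10" is inserted literally
$safe = replaceNumbers($s, '$10');
```

In the naive call the reference to the missing group expands to an empty string, so the replacement silently disappears from the result.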

Do any other replacement tricks come to mind?
