Matthew Daly's Blog

I'm a web developer in Norfolk. This is my blog...

13th January 2019 6:50 pm

Writing a Custom Sniff for PHP CodeSniffer

I’ve recently come around to the idea that in PHP all classes should be final by default, and have started declaring them that way as a matter of course. However, when you adopt a rule like this it’s easy to miss a few files that haven’t been updated, or to forget to apply it, so I wanted a way to detect PHP classes that are not declared as either abstract or final, and, if possible, to set them as final automatically. I’ve mentioned before that I use PHP CodeSniffer extensively, and it has the capability to both find and resolve deviations from a coding style, so last night I started looking into creating a coding standard for this. It took a little work to understand how to do this, so I thought I’d use this sniff as a simple example.

The first part is to set out the directory structure. There’s a very specific layout you have to follow for PHP CodeSniffer:

  • The folder for the standard must have the name of the standard, and be in the source folder set by Composer (in this case, src/AbstractOrFinalClassesOnly).
  • This folder must contain a ruleset.xml file defining the name and description of the standard, and any other required content.
  • Any defined sniffs must be in a Sniffs folder.

The ruleset.xml file was fairly simple in this case, as this is a very simple standard:

<?xml version="1.0"?>
<ruleset name="AbstractOrFinalClassesOnly">
    <description>Checks all classes are marked as either abstract or final.</description>
</ruleset>

The sniff is intended to do the following:

  • Check all classes have either the final keyword or the abstract keyword set
  • When running the fixer, make all classes without the abstract keyword final

First of all, our class must implement the interface PHP_CodeSniffer\Sniffs\Sniff, which requires the following methods:

public function register(): array;
public function process(File $file, $position): void;

Note that File here is an instance of PHP_CodeSniffer\Files\File. The first method registers the code the sniff should operate on. Here we’re only interested in classes, so we return an array containing T_CLASS. This is defined in the list of parser tokens used by PHP, and represents classes and objects:

public function register(): array
{
    return [T_CLASS];
}

For the process() method, we receive two arguments, the file itself, and the position. We need to keep a record of the tokens we check for, so we do so in a private property:

private $tokens = [
    T_ABSTRACT,
    T_FINAL,
];

Then, we need to find the error:

if (!$file->findPrevious($this->tokens, $position)) {
    $file->addFixableError(
        'All classes should be declared using either the "abstract" or "final" keyword',
        $position - 1,
        self::class
    );
}

We use $file->findPrevious() to look at the token before the class keyword, passing the $tokens property as the list of acceptable values. If the preceding token is neither abstract nor final, we add a fixable error. The first argument is the string error message, the second is the location, and the third is the class of the sniff that has failed.

That will catch the issue, but won’t actually fix it. To do that, we need to get the fixer from the file object, and call its addContent() method to add the final keyword. We amend process() to extract the fixer, add it as a property, and then call the fix() method when we come across a fixable error:

public function process(File $file, $position): void
{
    $this->fixer = $file->fixer;
    $this->position = $position;
    if (!$file->findPrevious($this->tokens, $position)) {
        $file->addFixableError(
            'All classes should be declared using either the "abstract" or "final" keyword',
            $position - 1,
            self::class
        );
        $this->fix();
    }
}

Then we define the fix() method:

private function fix(): void
{
    $this->fixer->addContent($this->position - 1, 'final ');
}

Here’s the finished class:

<?php declare(strict_types=1);

namespace Matthewbdaly\AbstractOrFinalClassesOnly\Sniffs;

use PHP_CodeSniffer\Sniffs\Sniff;
use PHP_CodeSniffer\Files\File;

/**
 * Sniff for catching classes not marked as abstract or final
 */
final class AbstractOrFinalSniff implements Sniff
{
    private $tokens = [
        T_ABSTRACT,
        T_FINAL,
    ];

    private $fixer;

    private $position;

    public function register(): array
    {
        return [T_CLASS];
    }

    public function process(File $file, $position): void
    {
        $this->fixer = $file->fixer;
        $this->position = $position;
        if (!$file->findPrevious($this->tokens, $position)) {
            $file->addFixableError(
                'All classes should be declared using either the "abstract" or "final" keyword',
                $position - 1,
                self::class
            );
            $this->fix();
        }
    }

    private function fix(): void
    {
        $this->fixer->addContent($this->position - 1, 'final ');
    }
}

I’ve made the resulting standard available via Github.
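To try the standard out, you can point phpcs at the folder containing the ruleset. The exact paths will depend on your setup, but assuming the standard lives under src/AbstractOrFinalClassesOnly and PHP CodeSniffer is installed via Composer, the commands would look something like this:

```shell
# Report classes that are neither abstract nor final
vendor/bin/phpcs --standard=src/AbstractOrFinalClassesOnly app/

# Automatically mark the offending classes as final
vendor/bin/phpcbf --standard=src/AbstractOrFinalClassesOnly app/
```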

This is a bit rough and ready, and I’ll probably refactor it a bit when I have time. It also doesn’t quite display the behaviour I want, since ideally it should only be looking for the abstract and final keywords on classes that implement an interface. However, it’s proven fairly easy to create this sniff, apart from having to go rooting around various tutorials that weren’t all that clear. Hopefully this example is a bit simpler and easier to follow.

3rd January 2019 11:55 pm

You Don't Need That Module Package

Lately I’ve seen a number of Laravel packages posted on places like Reddit that offer ways to make your project more modular by letting you move your classes out of the usual structure and into a separate folder called something like packages/ or modules/. However, these packages are completely redundant: it requires very little work to achieve the same thing with Composer alone. In addition, much of this is not specific to Laravel, and can be applied to any other framework that uses Composer.

There are two main approaches I’m aware of - keeping it in a single project, and moving the modules to separate Composer packages.

Single project

Suppose we have a brand new Laravel project with the namespace left as the default App. This is what the autoload section of the composer.json file will look like:

"autoload": {
    "psr-4": {
        "App\\": "app/"
    },
    "classmap": [
        "database/seeds",
        "database/factories"
    ]
},

Composer allows for numerous ways to autoload classes and you can add additional namespaces as you wish. Probably the best approach is to use PSR-4 autoloading, as in this example:

"autoload": {
    "psr-4": {
        "App\\": "app/",
        "Packages\\": "packages/"
    },
    "classmap": [
        "database/seeds",
        "database/factories"
    ]
},

Now, if you put the model Post.php in the folder, packages/Blog/Models/, then this will map to the namespace Packages\Blog\Models\Post, and if you set the namespace to this in the file, and run composer dump-autoload, you should be able to import it from that namespace without trouble. As with the App\ namespace, because it’s using PSR-4 you’re only specifying the top-level namespace and the folders and files underneath have to mirror the namespace, so for instance, Packages\Foo\Bar maps to packages/Foo/Bar.php. If for some reason PSR-4 autoloading doesn’t map well to what you want to do, then there are other methods you can use - refer to the relevant section of the Composer documentation for the other methods available.
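To make the mapping concrete, here’s what a hypothetical Post model under packages/ might look like. Note that the namespace declaration has to mirror the file path:

```php
<?php

// packages/Blog/Models/Post.php
namespace Packages\Blog\Models;

class Post
{
    // Model code goes here
}
```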

The controllers are the toughest part, because by default Laravel’s routing assumes that the controllers are all under the App\Http\Controllers namespace, allowing you to shorten the namespaces used when referencing them. There are two ways around this I’m aware of. One is to specify the full namespace when referencing each controller:

Route::get('/', '\App\Modules\Http\Controllers\FooController@index');

The other option is to update the $namespace property in RouteServiceProvider.php. It defaults to this:

protected $namespace = 'App\Http\Controllers';

If there’s a more convenient namespace you want to place all your controllers under, then you can replace this, and it will become the default namespace applied in your route files.
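For instance, if your controllers lived under a Packages namespace, you might change it to something like the following (the exact namespace here is just an assumption for illustration):

```php
protected $namespace = 'Packages\Blog\Http\Controllers';
```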

Other application components such as migrations, routes and views can be loaded from a service provider very easily. Just create a service provider for your module, register it in config/app.php, and set up the boot() method to load whichever components you want from the appropriate place, as in this example:

$this->loadMigrationsFrom(__DIR__.'/../database/migrations');
$this->loadRoutesFrom(__DIR__.'/../routes.php');
$this->loadViewsFrom(__DIR__.'/../views', 'comments');
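Putting this together, a minimal service provider for a hypothetical blog module might look something like this (the class name, namespace and paths are all assumptions for illustration):

```php
<?php

namespace Packages\Blog\Providers;

use Illuminate\Support\ServiceProvider;

class BlogServiceProvider extends ServiceProvider
{
    public function boot()
    {
        // Load the module's migrations, routes and views from the module folder
        $this->loadMigrationsFrom(__DIR__.'/../database/migrations');
        $this->loadRoutesFrom(__DIR__.'/../routes.php');
        $this->loadViewsFrom(__DIR__.'/../views', 'blog');
    }
}
```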

Separate packages

The above approach works particularly well in the initial stages of a project, when you may need to jump around a lot to edit different parts of the project. However, later on, once many parts of the project have stabilised, it may make more sense to pull the modules out into separate repositories and use Composer to pull them in as dependencies, using its support for private repositories. I’ve also often taken this approach right from the start without issue.

This approach has a number of advantages. It makes it easier to reuse parts of the project in other projects if need be. Also, if you put your tests in the packages containing the components they test, it means that rather than running one monolithic test suite for the whole project, you can instead run each module’s tests each time you change it, and limit the test suite of the main project to those integration and acceptance tests that verify the whole thing, along with any unit tests for code that remains in the main repository, resulting in quicker test runs.

Don’t get me wrong, making your code more modular is definitely a good thing and I’m wholly in favour of it. However, it only takes a little knowledge of Composer to be able to achieve this without any third party package at all, which is good because you’re no longer dependent on a package that may at any time fall behind the curve or be abandoned.

2nd January 2019 11:00 pm

Why Bad Code Is Bad

This may sound a little trite, but why is it bad to write bad code?

Suppose you’re a client, or a line manager for a team of developers. You work with developers regularly, but when they say that a code base is bad, what are the consequences of that, and how can you justify spending time and money to fix it? I’ve often heard the refrain “If it works, it doesn’t matter”, which may have a grain of truth, but is somewhat disingenuous. In this post, I’ll explain some of the consequences when your code base is bad. It can be hard to put a definitive price tag on the costs associated with delivering bad code, but this should give some idea of the sort of issues you should take into account.

Bad code kills developer productivity

Bad code is harder to understand, navigate and reason about than good code. Developers are not superhuman, and we can only hold so much in our heads at one time, which is why many of the principles behind a clean and maintainable code base can essentially be boiled down to “break it into bite-sized chunks so developers can understand each one in isolation before seeing how they fit together”.

If one particular class or function gets too big and starts doing too much, it quickly becomes very, very hard to get your head around what that code does. Developers typically have to build a mental model of how a class or function works before they can use it effectively, and the smaller and simpler you can keep each unit of code, the less time and effort it takes to do so. The mark of a skilled developer is not the complexity of their code bases, but their simplicity - they’ve learned to make their code as small, simple, and readable as possible. A clean and well laid-out code base makes it easy for developers to get into the mental state called “flow” that is significantly more productive.

In addition, if an application doesn’t conform to accepted conventions in some way, such as using inappropriate HTTP verbs (e.g. GET to change the state of something), then quite apart from the fact that it won’t play well with proxy servers, it imposes an additional mental load on developers by forcing them to drop a reasonable set of assumptions about how the application works. If the application used the correct HTTP verbs, experienced developers would know without being told that to create a new report, you’d send a POST request to the reports API endpoint.

During the initial stages of a project, functionality can be delivered quite quickly, but if the code quality is poor, then over time developer velocity can decrease. Ensuring a higher quality code base helps to maintain velocity at a consistent level as it gets bigger. This also means estimates will be more accurate, so if you quote a given number of hours for a feature, you’re more likely to deliver inside that number of hours.

Bad code is bad for developer welfare

A code base that’s repetitive, badly organised, overly complex and hard to read is a recipe for stressed developers, making burnout more likely. If a developer suffers burnout, their productivity will drop substantially.

In the longer term, if developer burnout isn’t managed correctly, it could easily increase developer turnover as stressed developers quit. It’s also harder to recruit new developers if they’re faced with the prospect of dealing with a messy, stressful code base.

Bad code hampers your ability to pivot

If the quality of your code base is poor, it can mean that if functionality needs to be changed or added, then more work is involved. Repetitive code can mean something has to be updated in more than one place, and if it becomes too onerous, it can make it too time-consuming or expensive to justify the changes.

Bad code may threaten the long-term viability of your project

One thing that is certain in our industry is that things change. Libraries, languages and frameworks are constantly being updated, and sometimes there will be potentially breaking changes to some of these. On occasion, a library or framework will be discontinued, making it necessary to migrate to a replacement.

Bad code is often tightly coupled to a particular framework or library, and sometimes even to a particular version, making it harder to migrate if it becomes necessary. If a project was written with a language or framework version that had a serious issue, and was too tightly coupled to migrate to a newer version, it might be too risky to keep it running, or it might be necessary to run an insecure application in spite of the risks it posed.

Bad code is more brittle

A poor code base will break, a lot, and often in ways that are clearly visible to end users. Duplicate code makes it easy to miss cases where something needs to be updated in more than one place, and if the code base lacks tests, a serious error may not be noticed for a long time, especially if it’s something comparatively subtle.

Bad code is hard, if not impossible, to write automated tests for

If a particular class or function does too much, it becomes much harder to write automated tests for it because there are more variables going in and more expected outcomes. A sufficiently messy code base may only really be testable by automating the browser, which tends to be very slow and brittle, making test-driven development impractical. Manual testing is no substitute for a proper suite of automated tests, since it’s slower, less consistent and not repeatable in the same way, and it’s only sufficient by itself for the most trivial of web apps.

Bad code is often insecure

A bad code base may inadvertently expose users’ data, or be at risk from all kinds of attacks such as cross-site scripting and SQL injection attacks that can also potentially expose too much data.

For any business with EU-based users, the risks of exposing users’ data are very serious. Under the GDPR, there’s a potential fine of up to €20 million, or 4% of turnover. That’s potentially an existential risk for many companies.

In addition, a bad code base is often more vulnerable to denial-of-service attacks. If it has poor or no caching, excessive queries, or inefficient queries, then every time a page loads it will carry out more queries than a more optimised site would. Given the same server specs, the inefficient site will be overwhelmed quicker than the efficient one.

Summary

It’s all too easy to focus solely on delivering a working product and not worry about the quality of the code base when time spent cleaning it up doesn’t pay the bills, and it can be hard to justify the cost of cleaning it up later to clients.

There are tools you can use to help keep up code quality, such as linters and static analysers, and it’s never a bad idea to investigate the ones available for the language(s) you work in. For best results they should form part of your continuous integration pipeline, so you can monitor changes over time and prompt developers who check in problematic code to fix the issues. Code reviews are another good way to avoid bad code, since they allow developers to find problematic code and offer more elegant solutions.

I’m not suggesting that a code base that has a few warts has no value, or that you should sink huge amounts of developer time into refactoring messy code when money is tight, as commercial concerns do have to come first. But a bad code base does cause serious issues that have financial implications, and it’s prudent to recognise the problems it could cause, and take action to resolve them, or better yet, prevent them occurring in the first place.

27th December 2018 6:37 pm

Improving Search in Vim and Neovim With FZF and Ripgrep

A while back I was asked to make some changes to a legacy project that was still using Subversion. This was troublesome because my usual method of searching in files is to use Tim Pope’s Fugitive Vim plugin as a frontend for git grep, and so it would be harder than usual to navigate the project. I therefore started looking around for alternative search systems, and one combination that kept on coming up was FZF and Ripgrep, so I decided to give them a try. FZF is a fuzzy file finder, written in Go, while Ripgrep is an extremely fast grep, written in Rust, that respects gitignore rules by default. Both have proven so useful they’re now a permanent part of my setup.

On Mac OS X, both are available via Homebrew, so they’re easy to install. On Ubuntu, Ripgrep is in the repositories, but FZF isn’t, so it was necessary to install it in my home directory. There’s a Vim plugin for FZF and Ripgrep integration which, since I use vim-plug, I could install by adding the following to my init.vim, then running :PlugUpdate from Neovim:

" Search
Plug '~/.fzf'
Plug 'junegunn/fzf.vim'

The plugin exposes a number of commands that are very useful, and I’ll go through the ones I use most often:

  • :Files is for finding files by name. I used to use Ctrl-P for this, but FZF is so much better and quicker that I ditched Ctrl-P almost immediately (though you can map :Files to it if you want to use the same key).
  • :Rg uses Ripgrep to search for content in files, so you can search for a specific string. This makes it an excellent replacement for the Ggrep command from Fugitive.
  • :Snippets works with UltiSnips to provide a filterable list of available snippets you can insert, making snippets much easier to find and use.
  • :Tags allows you to filter and search tags in the project as a whole.
  • :BTags does the same, but solely in the current buffer.
  • :Lines allows you to find lines in the project and navigate to them.

In addition to being useful in Neovim, FZF can also be helpful in Bash. You can use Ctrl-T to find file paths, and it enhances the standard Ctrl-R history search, making it faster and more easily navigable. The performance of both is also excellent - they work very fast, even on the very large legacy project I maintain, or on slower machines, and I never find myself waiting for them to finish. Both tools have quickly become an indispensable part of my workflow.

6th December 2018 6:34 pm

Decorating Service Classes

I’ve written before about using decorators to extend the functionality of existing classes, in the context of the repository pattern when working with Eloquent. However, the same practice is applicable in many other contexts.

Recently, I was asked to add RSS feeds to the home page of the legacy project that is my main focus these days. The resulting service class looked something like this:

<?php

namespace App\Services;

use Rss\Feed\Reader;
use App\Contracts\Services\FeedFetcher;

class RssFetcher implements FeedFetcher
{
    public function fetch($url)
    {
        return Reader::import($url);
    }
}

In accordance with the principle of loose coupling, I also created an interface for it:

<?php

namespace App\Contracts\Services;

interface FeedFetcher
{
    public function fetch($url);
}

I was recently able to add dependency injection to the project using PHP-DI, so now I can inject an instance of the feed fetcher into the controller by typehinting the interface and having it resolve to the RssFetcher class.

However, there was an issue. I didn’t want the application to make multiple HTTP requests to fetch those feeds every time the page loads. At the same time, it was also a bit much to have a scheduled task running to fetch those feeds and store them in the database, since many times that would be unnecessary. The obvious solution was to cache the feed content for a specified length of time, in this case five minutes.

I could have integrated the caching into the service class itself, but that wasn’t the best practice, because it would be tied to that implementation. If in future we needed to switch to a different feed handler, we’d have to re-implement the caching functionality. So I decided it made sense to decorate the service class.

The decorator class implemented the same interface as the feed fetcher, and accepted another instance of that interface in the constructor, along with a PSR-6-compliant caching library. It looked something like this:

<?php

namespace App\Services;

use App\Contracts\Services\FeedFetcher;
use Psr\Cache\CacheItemPoolInterface;

class FetcherCachingDecorator implements FeedFetcher
{
    protected $fetcher;

    protected $cache;

    public function __construct(FeedFetcher $fetcher, CacheItemPoolInterface $cache)
    {
        $this->fetcher = $fetcher;
        $this->cache = $cache;
    }

    public function fetch($url)
    {
        $item = $this->cache->getItem('feed_'.$url);
        if (!$item->isHit()) {
            $item->set($this->fetcher->fetch($url));
            $this->cache->save($item);
        }
        return $item->get();
    }
}

Now, when you instantiate the feed fetcher, you wrap it in the decorator as follows:

<?php

$fetcher = new FetcherCachingDecorator(
    new App\Services\RssFetcher,
    $cache
);

As you can see, this solves our problem quite nicely. By wrapping our feed fetcher in this decorator, we keep the caching layer completely separate from any one implementation of the fetcher, so in the event we need to swap the current one out for another implementation, we don’t have to touch the caching layer at all. As long as we’re using dependency injection to resolve this interface, we’re only looking at a little more code to instantiate it.
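If you’re using PHP-DI, you can move this wiring into the container definitions, so consumers of the interface never need to know the decorator exists. This is a sketch based on my understanding of PHP-DI’s definition format, so check the documentation for your version:

```php
<?php

use App\Contracts\Services\FeedFetcher;
use App\Services\FetcherCachingDecorator;
use App\Services\RssFetcher;
use Psr\Cache\CacheItemPoolInterface;
use Psr\Container\ContainerInterface;

// PHP-DI definitions file: resolve the interface to the decorated fetcher
return [
    FeedFetcher::class => function (ContainerInterface $c) {
        return new FetcherCachingDecorator(
            new RssFetcher,
            $c->get(CacheItemPoolInterface::class)
        );
    },
];
```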

In addition, this same approach can be applied for other purposes, and you can wrap the service class as many times as necessary. For instance, if we wanted to log all the responses we got, we could write a logging decorator something like this:

<?php

namespace App\Services;

use App\Contracts\Services\FeedFetcher;
use Psr\Log\LoggerInterface;

class FeedLoggingDecorator implements FeedFetcher
{
    protected $fetcher;

    protected $logger;

    public function __construct(FeedFetcher $fetcher, LoggerInterface $logger)
    {
        $this->fetcher = $fetcher;
        $this->logger = $logger;
    }

    public function fetch($url)
    {
        $response = $this->fetcher->fetch($url);
        $this->logger->info($response);
        return $response;
    }
}

The same idea can be applied to an API client. For instance, say we have the following interface for an API client:

<?php

namespace Foo\Bar\Contracts;

use Foo\Bar\Objects\Item;
use Foo\Bar\Objects\ItemCollection;

interface Client
{
    public function getAll(): ItemCollection;

    public function find(int $id): Item;

    public function create(array $data): Item;

    public function update(int $id, array $data): Item;

    public function delete(int $id);
}

Now, of course any good API client should respect HTTP headers and use those to do some caching itself, but depending on the use case, you may also want to cache these requests yourself. For instance, if the only changes to the entities stored by the third party API will be ones you’ve made, or they don’t need to be 100% up to date, you may be better off caching those responses before they reach the actual API client. Under those circumstances, you might write a decorator like this to do the caching:

<?php

namespace Foo\Bar\Services;

use Foo\Bar\Contracts\Client;
use Foo\Bar\Objects\Item;
use Foo\Bar\Objects\ItemCollection;
use Psr\Cache\CacheItemPoolInterface;

class CachingDecorator implements Client
{
    protected $client;

    protected $cache;

    public function __construct(Client $client, CacheItemPoolInterface $cache)
    {
        $this->client = $client;
        $this->cache = $cache;
    }

    public function getAll(): ItemCollection
    {
        $item = $this->cache->getItem('item_all');
        if (!$item->isHit()) {
            $item->set($this->client->getAll());
            $this->cache->save($item);
        }
        return $item->get();
    }

    public function find(int $id): Item
    {
        $item = $this->cache->getItem('item_'.$id);
        if (!$item->isHit()) {
            $item->set($this->client->find($id));
            $this->cache->save($item);
        }
        return $item->get();
    }

    public function create(array $data): Item
    {
        $this->cache->clear();
        return $this->client->create($data);
    }

    public function update(int $id, array $data): Item
    {
        $this->cache->clear();
        return $this->client->update($id, $data);
    }

    public function delete(int $id)
    {
        $this->cache->clear();
        return $this->client->delete($id);
    }
}

Any methods that change the state of the data on the remote API will clear the cache, while any that fetch data will first check the cache, only explicitly fetching data from the API when the cache is empty, and caching it again. I won’t go into how you might write a logging decorator for this, but it should be straightforward to figure out for yourself.
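To see the caching behaviour in isolation, here’s a minimal, self-contained sketch of the same idea. To keep it dependency-free it uses a plain array of my own devising rather than a PSR-6 pool, and the client interface is cut down to a single method, so treat it as an illustration of the pattern rather than production code:

```php
<?php

interface ItemClient
{
    public function find(int $id): string;
}

// Stand-in for the real API client; counts how many "requests" it makes
class FakeClient implements ItemClient
{
    public $calls = 0;

    public function find(int $id): string
    {
        $this->calls++;
        return "item {$id}";
    }
}

// Caching decorator backed by a plain array instead of a PSR-6 pool
class ArrayCachingDecorator implements ItemClient
{
    private $client;
    private $cache = [];

    public function __construct(ItemClient $client)
    {
        $this->client = $client;
    }

    public function find(int $id): string
    {
        // Only hit the underlying client on a cache miss
        if (!isset($this->cache[$id])) {
            $this->cache[$id] = $this->client->find($id);
        }
        return $this->cache[$id];
    }
}

$client = new FakeClient;
$decorated = new ArrayCachingDecorator($client);

$decorated->find(1);
$decorated->find(1);
echo $client->calls; // The second call hits the cache, so this prints 1
```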

The decorator pattern is a very powerful way of adding functionality to a class without tying it to a specific implementation. If you’re familiar with how middleware works, decorators work in a very similar fashion in that you can wrap your service in as many layers as you wish in order to accomplish specific tasks, and they adhere to the single responsibility principle by allowing you to use different decorators for different tasks.


About me

I'm a web and mobile app developer based in Norfolk. My skillset includes Python, PHP and JavaScript, and I have extensive experience working with CodeIgniter, Laravel, Zend Framework, Django, PhoneGap and React.js.