Matthew Daly's Blog

I'm a web developer in Norfolk. This is my blog...

5th October 2018 7:36 pm

Understanding the Pipeline Pattern

In a previous post, I used the pipeline pattern to demonstrate processing letters using optical recognition and machine learning. The pipeline pattern is something I’ve found very useful in recent months. For a sequential series of tasks, this approach can make your code easier to understand by allowing you to break it up into simple, logical steps which are easy to test and understand individually. If you’re familiar with pipes and redirection in Unix, you’ll be aware of how you can chain together multiple, relatively simple commands to carry out some very complex transformations on data.

A few months back, I was asked to build a webhook for a Facebook lead form at work. One of my colleagues was having to manually export CSV data from Facebook for the data, and then import it into a MySQL database and a Campaign Monitor mailing list, which was an onerous task, so they asked me to look at more automated solutions. I wound up building a webhook with Lumen that would go through the following steps:

  • Get the lead IDs from the webhook
  • Pull the leads from the Facebook API using those IDs
  • Process the raw data into a more suitable format
  • Save the data to the database
  • Push the data to Campaign Monitor

Since this involved a number of discrete steps, I chose to implement each step as a separate stage. That way, each step was easy to test in isolation, and easily reusable. As it turned out, this approach saved us: the app needed Facebook’s approval, and they ended up rejecting it (their documentation at the time wasn’t clear on implementing server-to-server apps, making it hard to meet their guidelines), so we needed an interim solution. I instead wrote an Artisan task for importing the data from a CSV file, which involved the following steps:

  • Read the rows from the CSV file
  • Format the CSV data into the desired format
  • Save the data to the database
  • Push the data to Campaign Monitor

This meant that two of the existing steps could be reused as-is, without touching the code or tests. All I had to add were two new classes, to read and format the data, plus the Artisan command itself, which simply called the various pipeline stages. In this post, I’ll demonstrate how I implemented this.

While there is more than one implementation of this available, and it wouldn’t be hard to roll your own, I generally use the PHP League’s Pipeline package, since it’s simple, solid and well-tested. Let’s say our application has three steps:

  • Format the request data
  • Save the data
  • Push it to a third party service.

We therefore need to write a stage for each step in the process. Each one must be a callable, such as a closure, a callback, or a class that implements the __invoke() magic method. I usually go for the latter as it allows you to more easily inject dependencies into the stage via its constructor, making it easier to use and test. Here’s what our first stage might look like:

<?php

namespace App\Stages;

use Illuminate\Support\Collection;

class FormatData
{
    public function __invoke(Collection $data): Collection
    {
        return $data->map(function ($item) {
            return [
                'name' => $item->fullname,
                'email' => $item->email
            ];
        });
    }
}

This class does nothing more than receive a collection and format the data as expected. We could have it accept a request object instead, but I opted not to because I felt it made more sense to pass the data in as a collection so it’s not tied to an HTTP request. That way, it can also handle data passed through from a CSV file via an Artisan task, and the details of how the data is obtained are deferred to the class that calls the pipeline. Note that this stage also returns a collection, for handling by the next step:

<?php

namespace App\Stages;

use App\Lead;
use Illuminate\Support\Collection;

class SaveData
{
    public function __invoke(Collection $data): Collection
    {
        return $data->map(function ($item) {
            $lead = new Lead;
            $lead->name = $item->name;
            $lead->email = $item->email;
            $lead->save();
            return $lead;
        });
    }
}

This step saves each lead as an Eloquent model, and returns a collection of the saved models, which are passed to the final step:

<?php

namespace App\Stages;

use App\Contracts\Services\MailingList;
use Illuminate\Support\Collection;

class AddDataToList
{
    protected $list;

    public function __construct(MailingList $list)
    {
        $this->list = $list;
    }

    public function __invoke(Collection $data)
    {
        return $data->each(function ($item) {
            $this->list->add([
                'name' => $item->name,
                'email' => $item->email
            ]);
        });
    }
}

This step uses a wrapper class for a mailing service, which is passed through as a dependency in the constructor. The __invoke() method then loops through each Eloquent model and uses it to fetch the data, which is then added to the list. With our stages complete, we can now put them together in our controller:

<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;
use App\Stages\FormatData;
use App\Stages\SaveData;
use App\Stages\AddDataToList;
use League\Pipeline\Pipeline;
use Illuminate\Support\Collection;

class WebhookController extends Controller
{
    public function store(Request $request, Pipeline $pipeline, FormatData $formatData, SaveData $saveData, AddDataToList $addData)
    {
        try {
            $data = Collection::make($request->get('data'));
            $pipe = $pipeline->pipe($formatData)
                ->pipe($saveData)
                ->pipe($addData);
            $pipe->process($data);
        } catch (\Exception $e) {
            // Handle exception
        }
    }
}

As mentioned above, we extract the request data (assumed to be an array of data for a webhook), and convert it into a collection. Then, we put together our pipeline. Note that we use dependency injection to fetch the steps - feel free to use method or constructor injection as appropriate. We instantiate our pipeline, and call the pipe() method multiple times to add new stages.

Finally we pass the data through to our pipe for processing by calling the process() method, passing in the initial data. Note that we can wrap the whole thing in a try...catch statement to handle exceptions, so if something happens that would mean we would want to cease processing at that point, we can throw an exception in the stage and handle it outside the pipeline.

This means that our controller is kept very simple. It just gets the data as a collection, then puts the pipeline together and passes the data through. If we subsequently had to write an Artisan task to do something similar from the command line, we could fetch the data via a CSV reader class, and then pass it to the same pipeline. If we needed to change the format of the initial data, we could swap the FormatData class for a different one with very little trouble.
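To illustrate that reuse, the Artisan command for the CSV import might be sketched roughly as follows. The command class, the CSV handling, and the FormatCsvData stage are hypothetical names for illustration, not the actual implementation:

```php
<?php

namespace App\Console\Commands;

use App\Stages\AddDataToList;
use App\Stages\FormatCsvData; // hypothetical CSV-specific formatting stage
use App\Stages\SaveData;
use Illuminate\Console\Command;
use Illuminate\Support\Collection;
use League\Pipeline\Pipeline;

class ImportLeads extends Command
{
    protected $signature = 'leads:import {file}';
    protected $description = 'Import leads from a CSV file';

    public function handle(Pipeline $pipeline, FormatCsvData $formatData, SaveData $saveData, AddDataToList $addData)
    {
        // Read the CSV rows into a collection...
        $rows = Collection::make(array_map('str_getcsv', file($this->argument('file'))));

        // ...then reuse the same SaveData and AddDataToList stages as the webhook
        $pipeline->pipe($formatData)
            ->pipe($saveData)
            ->pipe($addData)
            ->process($rows);
    }
}
```

The key point is that SaveData and AddDataToList are exactly the same classes used by the webhook controller; only the input and formatting stages differ.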

Another thing you can do with the League pipeline package, though I haven’t yet had occasion to try it, is use League\Pipeline\PipelineBuilder to build pipelines in a more dynamic fashion. You can make steps conditional, as in this example:

<?php

use League\Pipeline\PipelineBuilder;

$builder = (new PipelineBuilder)
    ->add(new FormatData);
if ($data['type'] === 'foo') {
    $builder->add(new HandleFooType);
}
$builder->add(new SaveData);
$pipeline = $builder->build();
$pipeline->process($data);

The pipeline pattern isn’t appropriate for every situation, but for anything that involves a set of operations on the same data, it makes a lot of sense, and can make it easy to break larger operations into smaller steps that are easier to understand, test, and re-use.

3rd October 2018 11:07 pm

Replacing Switch Statements With Polymorphism in PHP

For the last few months, I’ve been making a point of picking up on certain antipatterns, and ways to avoid or remove them. One I’ve seen a lot recently is unnecessary large switch-case or if-else statements. For instance, here is a simplified example of one of these, which renders links to different objects:

<?php

switch ($item->getType()) {
    case 'audio':
        $media = new stdClass;
        $media->type = 'audio';
        $media->duration = $item->getLength();
        $media->name = $item->getName();
        $media->url = $item->getUrl();
        break;
    case 'video':
        $media = new stdClass;
        $media->type = 'video';
        $media->duration = $item->getVideoLength();
        $media->name = $item->getTitle();
        $media->url = $item->getUrl();
        break;
}

return '<a href="'.$media->url.'" class="'.$media->type.'" data-duration="'.$media->duration.'">'.$media->name.'</a>';

There are a number of problems with this, most notably the fact that it’s doing a lot of work to try and create a new set of objects that behave consistently. Instead, your objects should be polymorphic - in other words, you should be able to treat the original objects the same.

While strictly speaking you don’t need one, it’s a good idea to create an interface that defines the required methods. That way, you can have those objects implement that interface, and be certain that they have all the required methods:

<?php

namespace App\Contracts;

interface MediaItem
{
    public function getLength(): int;
    public function getName(): string;
    public function getType(): string;
    public function getUrl(): string;
}

Then, you need to implement that interface in your objects. It doesn’t matter if the implementations are different, as long as the methods exist. That way, objects can define how they return a particular value, which is simpler and more logical than defining it in a large switch-case statement elsewhere. It also helps to prevent duplication. Here’s what the audio object might look like:

<?php

namespace App\Models;

use App\Contracts\MediaItem;

class Audio implements MediaItem
{
    public function getLength(): int
    {
        return $this->length;
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function getType(): string
    {
        return $this->type;
    }

    public function getUrl(): string
    {
        return $this->url;
    }
}

And here’s a similar example of the video object:

<?php

namespace App\Models;

use App\Contracts\MediaItem;

class Video implements MediaItem
{
    public function getLength(): int
    {
        return $this->getVideoLength();
    }

    public function getName(): string
    {
        return $this->getTitle();
    }

    public function getType(): string
    {
        return $this->type;
    }

    public function getUrl(): string
    {
        return $this->url;
    }
}

With that done, the code to render the links can be greatly simplified:

<?php

return '<a href="'.$item->getUrl().'" class="'.$item->getType().'" data-duration="'.$item->getLength().'">'.$item->getName().'</a>';

Because we can use the exact same methods and get consistent responses, yet also allow for the different implementations within the objects, this approach allows for much more elegant and readable code. Different objects can be treated in the same way without the need for writing extensive if or switch statements.

I haven’t had the occasion to do so, but in theory this approach is applicable in other languages, such as JavaScript or Python (although these languages don’t have the same concept of interfaces). Since discovering the switch statement antipattern and how to replace it with polymorphism, I’ve been able to remove a lot of overly complex code.

25th September 2018 10:03 pm

Career Direction After Seven Years

Earlier this month, I passed the seven year anniversary of starting my first web dev job. That job never really worked out, for various reasons, but since then I’ve had an interesting time of it. I’ve diversified into app development via Phonegap, and I’ve worked with frameworks that didn’t exist when I first started. So it seems a good opportunity to take stock and think about where I want to head next.

Sometimes these posts are where someone announces they’re leaving their current role, but that’s not the case here - I’m pretty happy where I am right now. I am maintaining a legacy project, but I do feel like I’m making a difference and it’s slowly becoming more pleasant to work with, and I’m learning a lot about applying design patterns, so I think where I am right now is a good place for me. However, it’s a useful exercise to think about what I want to do, where I want to concentrate my efforts, and what I want to learn about.

So, here are my thoughts about where I want to go in future:

  • I really enjoy working with React, and I want to do so much more than I have in the past, possibly including React Native. Ditto with Redux.
  • Much as I love Django, it’s unlikely I’ll be using it again in the future, as it’s simply not in much demand where I live. In 2015, I was working at a small agency with a dev team of three, including me, and it became apparent that we needed to standardise on a single framework. I’d been using CodeIgniter on and off for several years, but it was tired and dated, yet I couldn’t justify using Django because no-one else was familiar with Python, so we settled on Laravel. Ever since, Laravel has been my go-to framework - Django does some things better (Django REST Framework remains the best way I’ve ever found to create a REST API), but Laravel does enough stuff well enough that I can use it for most things I need, so it’s a good default option.
  • I really don’t want to work with Wordpress often, and if I do, I’d feel a lot better about it if I used Bedrock. Just churning out boilerplate sites is anathema to me - I’d much rather do something more interesting, even if it were paid worse.
  • PHP is actually pretty nice these days (as long as you’re not dealing with a legacy application), and I generally don’t mind working with it, as long as it’s fairly modern.
  • I enjoy mentoring and coaching others, and I’d like to do that a lot more often than I have been doing. Mentoring and coaching is a big part of being a senior developer, since a good mentor can quickly bring inexperienced developers up to a much higher standard, and hugely reduces the amount of horrible legacy code that needs to be maintained. I was without an experienced mentor for much of my career, and in retrospect it held me back - having someone around to teach me about TDD and design patterns earlier would have helped no end. Also, I find it the single most rewarding part of my job.
  • I have absolutely no desire whatsoever to go into management, or leave coding behind in any way, shape or form. I’ve heard it said before that Microsoft have two separate career tracks for developers, one through people management, the other into a software architect role, and were I there, I would definitely opt for the latter.
  • I’m now less interested in learning new frameworks or languages than I am in picking up and applying new design patterns, and avoiding antipatterns - they’re the best way to improve your code quality. I’ve learned the hard way that the hallmark of a skilled developer’s code is not the complexity, but the simplicity - I can now recognise the convoluted code I wrote earlier in my career as painful to maintain, and can identify it in legacy projects.
  • I’ve always used linters and other code quality tools, and I’m eager to evangelise their usage.
  • I’ve been a proponent of TDD for several years now, and that’s going to continue - I’ve not only seen how many things it catches when you have tests, but also how painful it is when you have a large legacy project with no tests at all, and I’m absolutely staggered that anyone ever continues to write non-trivial production code without any sort of tests.
  • I want to understand the frameworks I use at a deeper level - it’s all too easy to just treat them as magic, when there are huge benefits to understanding how your framework works under the bonnet, and how to swap out the framework’s functionality for alternative implementations.
  • I’d like to get involved in more IoT-related projects - I guess the three Raspberry Pis and the Arduino I have gathering dust at home need to get some more use…
  • Chat interfaces are interesting - I built an Alexa skill recently, which was fun and useful, and I’d like to do stuff like that more often.

So, after seven years, that’s where I see myself going in future. I think I’m in a good place to do that right now, and I’ll probably stay where I am for a good long while yet. The first seven years of my web dev career have been interesting, and I’m eager to see what the next seven bring.

24th September 2018 10:30 pm

How I'm Refactoring a Zend 1 Legacy Project

In my current job I’ve been maintaining and developing a Zend 1 legacy project for the best part of a year. It has to be said, it’s the worst code base I have ever seen, with textbook examples of many antipatterns, spaghetti jQuery, copy-pasted code and overly complex methods. It’s a fairly typical example of a project built on an older MVC framework by inexperienced developers (I’ve been responsible for building similar things in my CodeIgniter days).

In this article I’ll go through some of the steps I’ve taken to help bring this legacy project under control. Not all of them are complete as at time of writing, but they’ve all helped to make this decidedly crappy project somewhat better. In working with this legacy project, I’ve found Paul Jones’ book Modernizing Legacy Applications in PHP to be very useful, and if you’re working on a similar legacy project, I highly recommend investing in a copy. I’ve also found Sourcemaking to be a useful resource in identifying antipatterns in use, refactoring strategies, and applicable design patterns.

Moving to Git

When I first started working on the project, the repository was in Subversion, and was absolutely colossal - checking it out took two hours! Needless to say, my first action was to migrate it to Git. I used this post as a guide, and it was pretty straightforward, but took all of my first day.

Adding migrations

The next job involved making some changes to the database. Unfortunately, Zend 1 doesn’t include migrations, and no-one had added a third party solution. I therefore did some research and wound up stumbling across Phinx, which is a standalone migration package with a command-line runner. Using that, it was straightforward to start adding migrations to make any necessary changes to the database structure and fixtures.
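For reference, a Phinx migration is a plain PHP class. This hypothetical example (the table and column names are mine, not from the actual project) shows roughly what one looks like:

```php
<?php

use Phinx\Migration\AbstractMigration;

class CreateLeadsTable extends AbstractMigration
{
    /**
     * A reversible migration - Phinx can derive the "down"
     * direction from change() automatically
     */
    public function change()
    {
        // Phinx adds an auto-incrementing "id" primary key by default
        $this->table('leads')
            ->addColumn('name', 'string')
            ->addColumn('email', 'string')
            ->addTimestamps()
            ->create();
    }
}
```

Migrations are generated with vendor/bin/phinx create and applied with vendor/bin/phinx migrate.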

Moving dependencies to Composer

The project was using Composer, but only to a limited degree - the framework itself was in the library/ folder, and several other dependencies were also stored here. The vendor/ directory was also checked into version control. I therefore took the vendor folder out of Git, and added zendframework/zendframework1 as a dependency. This drastically reduced the size of the repository.

Cleaning up commented code

There was an awful lot of commented code. Some of it was even commented out incorrectly (PHP code commented out with HTML comments). I’m of the school of thought that commented code is best deleted without a second thought, since it can be retrieved from version control, and it can be confusing, so I’ve been removing any commented code I come across.

Refactoring duplicate code

One of the biggest problems with the code base was the high level of duplication - a lot of code, particularly in the view layer, had been copied and pasted around. Running PHPCPD on the repository showed that, not including the views, around 12% of the code base was copied-and-pasted, which is a horrific amount. I therefore started aggressively refactoring duplicate code out into helpers and traits. As at today, the amount of duplication excluding the views is around 2.6%, which is obviously a big improvement.
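If you want to measure duplication in your own project, PHPCPD can be run from the command line along these lines (the paths shown are assumptions for a typical Zend 1 layout, not the actual project structure):

```shell
composer require --dev sebastian/phpcpd

# Measure duplication, skipping the view scripts
vendor/bin/phpcpd --exclude application/views application/
```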

Refactoring object creation code into persisters

There was some extremely complex code for creating and updating various objects that was jammed into the controllers, and involved a lot of duplicate code. I’ve used dedicated persister classes in the past with great effect, so I pulled that code out into persisters to centralise the logic about the creation of different objects. It’s still a lot more convoluted than I’d like, but at least now it’s out of the controllers and can be tested to some extent.

Creating repositories

One of the most problematic parts of the code base is the models. Whoever was responsible for them couldn’t seem to decide whether they represented a single domain object or a container for methods for getting those objects, so both responsibilities were mixed up in the same class. This meant you had to instantiate an object, then use it to call one of the methods to get another instance of that object, as in this example:

$media = new Application_Model_Media;
$media = $media->find(1);

I’ve therefore resolved to pull those methods out into separate repository classes, leaving the models as pure domain objects. Unfortunately, the lack of dependency injection makes it problematic to instantiate the repositories. For that reason, right now the repositories only implement static methods - it’s not ideal, but it’s better than what we have now.

I started out by creating interfaces for the methods I wanted to migrate, and had the models implement them. Then, I moved those methods from the model to the repository classes and amended all references to them, before removing the interfaces from the models. Now, a typical find request looks like this:

$media = App\Repository\Media::find(1);

It’s not done yet, but over half of them have been migrated.

Once that’s done, I’ll then be in a position to look at refactoring the logic in the models to make them easier to work with - right now each model has dedicated setters and getters (as well as some horrific logic to populate them), and I’m considering amending them to allow access to the properties via the __get() and __set() magic methods. Another option is to consider migrating the database layer to Doctrine, since that way we can reuse the getters and setters, but I haven’t yet made a firm decision about that.
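The magic-method approach under consideration would look something like this sketch. The $data property and the null fallback are illustrative assumptions; note the isset() check rather than the ?? operator, since the project was still on PHP 5.6:

```php
<?php

class Application_Model_Media
{
    protected $data = array();

    // Allow reading any populated property without a dedicated getter
    public function __get($name)
    {
        return isset($this->data[$name]) ? $this->data[$name] : null;
    }

    // Allow writing any property without a dedicated setter
    public function __set($name, $value)
    {
        $this->data[$name] = $value;
    }
}
```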

Adding tests

The poor design of this application makes it difficult to test, so right now the coverage is poor. I’ve been using Behat to produce a basic set of acceptance tests for some of the most fundamental functionality, but they’re brittle and can be broken by database changes. I’ve also added some (even more brittle) golden master tests using a technique I’ll mention in a later blog post. I have got unit tests for three of the persister classes and some utility classes I’ve added, but nowhere near the level I want.

Refactoring code out of the fat controllers

Fat controllers are an antipattern I’ve often seen in the past (and have been responsible for myself), and this project has them in spades - running PHP Mess Detector on them is pretty sobering. The overwhelming majority of the code base is concentrated in the controllers, and it’s going to take a long time to refactor it into other classes.

Zend 1 does have the concept of controller helpers, and that’s been useful for removing some duplicate code, while more shared code has been refactored out into traits. In addition, the utilities I’ve added include a Laravel-style collection class, and using that I’ve been able to refactor a lot of quite complex array handling into much simpler chained collection handling. However, this is still going to take a lot of effort.
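As a rough illustration of that kind of refactor (the field names are invented, and I’m assuming a collection class that mirrors Laravel’s filter/map API), a block of nested loops and temporary arrays can often collapse into a single readable chain:

```php
<?php

// Before: a loop building up a temporary array
$titles = array();
foreach ($items as $item) {
    if ($item->type === 'video') {
        $titles[] = strtoupper($item->title);
    }
}

// After: the same logic as a chain of collection calls
$titles = Collection::make($items)
    ->filter(function ($item) {
        return $item->type === 'video';
    })
    ->map(function ($item) {
        return strtoupper($item->title);
    })
    ->values()
    ->toArray();
```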

Adding events

The lack of a decent event system caused particular problems when I was asked to add tracking of when a user views certain resources, so I used the PHP League’s Event package for this. I’ve started moving some other functionality to event listeners too, but this is another thing that will take a long time.

Refactoring the front end

Like many legacy projects, the front end is a horrible mess of jQuery spaghetti code, with some Handlebars templates thrown in here and there for good measure. It’s easily complex enough that it would benefit from a proper front-end framework, but a full rewrite is out of the question.

I was recently asked to add two new modals in the admin interface, and decided that it was worth taking a new approach rather than adding yet more jQuery spaghetti. Angular 1 is on its way out, so that wasn’t an option, and Angular 2+ would necessitate using Typescript, which would likely be problematic in the context of a legacy app, as well as the complexity being an issue. Vue was a possibility, but I always feel like Vue tries to do too much. Instead, I decided to go for React, because:

  • I’ve always enjoyed working with React, even though I haven’t had much chance to do so in the past.
  • We’re using Laravel Mix for processing the CSS and JS files (it can be used on non-Laravel projects), and it has a preset for React
  • React is well-suited to being added incrementally to existing projects without the need for a full rewrite (after all, it works for Facebook…), so it was straightforward to do a single modal with it
  • It’s easy to test - you can use snapshot tests to check it remains consistent, and using Enzyme it’s straightforward to navigate the rendered component for other tests

Both modals turned out very well, and went live recently. The first one took a fair while to write, and then when I wrote the second one, I had to spend some time making the sub-components more generic and pulling some functionality out into a higher order component, but now that that’s done it should be straightforward to write more.

In the longer term I plan to migrate more and more of the admin to React over time. The front end also has a new home page on the cards, and the plan is to use React for that too. Once the whole UI is using React, that will have eliminated most, if not all, of the problems with duplicate code in the view layer, as well as allowing for eventually turning the application into a single-page web app.

Upgrading the PHP version and migrating to a new server

When I started work on the project, it was running on an old server running PHP 5.4, but there were plans to migrate to a new server running PHP 5.6. The lack of tests made it difficult to verify it wouldn’t break in 5.6, but using PHP Compatibility and CodeSniffer I was able to find most of the problems. I ran it on PHP 5.6 locally during development so that any new development would be done on a more modern version. In the end, the migration to the new server was fairly seamless.
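The check itself is a normal CodeSniffer run with the PHPCompatibility standard and a target version, along these lines (assuming the standard has been registered with phpcs, for example via the dealerdirect composer installer plugin):

```shell
composer require --dev phpcompatibility/php-compatibility

# Sniff the code base for constructs that break on PHP 5.6
vendor/bin/phpcs -p . --standard=PHPCompatibility --runtime-set testVersion 5.6
```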

We will have to consider migrating to a newer PHP version again, since 5.6 reaches the end of its security support at the end of this year, but it may be too risky for now.

Namespacing the code

As Zend 1 predates PHP namespaces, the code wasn’t namespaced. This is something I do plan to remedy - the form and model classes should be straightforward to namespace, but the controllers are a bit more problematic. I’m waiting on completing the repositories before I look at this.

Adding PSR-3 logging

The existing logging solution was not all that great. It had drivers for several different logging solutions, but nothing terribly modern - one was for the now-discontinued Firebug extension for Firefox. However, it was fairly similar to PSR-3, so it wasn’t too much work to replace it. I installed Monolog, and amended the bootstrap file to store that as the logger in the Zend registry - that way, we could set up many different handlers. I now have it logging to a dedicated Slack channel when an error occurs in staging or production, which makes it much easier to detect problems. This would also make it easy to set up many other logging handlers, such as the ELK stack.
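The bootstrap wiring for that might be sketched as follows. The channel name, log path, and environment variable are placeholders, and note that Monolog’s SlackWebhookHandler only fires at or above the level you pass it:

```php
<?php

use Monolog\Handler\SlackWebhookHandler;
use Monolog\Handler\StreamHandler;
use Monolog\Logger;

$logger = new Logger('app');

// Always log everything to a file...
$logger->pushHandler(new StreamHandler(APPLICATION_PATH . '/../logs/app.log', Logger::DEBUG));

// ...and post errors and above to a dedicated Slack channel
$logger->pushHandler(new SlackWebhookHandler(
    getenv('SLACK_WEBHOOK_URL'),
    '#errors',
    null,   // username
    true,   // use attachments
    null,   // icon emoji
    false,  // short attachments
    false,  // include context and extra
    Logger::ERROR
));

// Store it in the Zend registry so legacy code can retrieve it anywhere
Zend_Registry::set('logger', $logger);
```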

Debugging

Clockwork is my usual PHP debugging solution, and the absence of support for it in Zend 1 made it difficult to work with. Fortunately, it’s quite straightforward to implement your own data sources for Clockwork. I set it up to use the aforementioned logger as a data source, as well as the Zend 1 profiler. I also added a data source for the events implementation, and added a global clock() helper function, as well as one for the Symfony VarDumper component. Together these give me a reasonably good debugging experience.

Adding console commands

I’ve mentioned before that I’ve been using Symfony’s console component a lot lately, and this project is why. Zend 1 does not come with any sort of console task runner, and we needed an easy way to carry out certain tasks, such as:

  • Setting up a stored procedure
  • Anonymizing user data with Faker
  • Regenerating durations for audio and video files

In addition, I wanted a Laravel Tinker-style interactive shell. I was able to accomplish this with PsySh and the console components. For legacy projects that lack a console task runner, it’s worth considering adding one.
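Wiring up a standalone runner with the Symfony console component is fairly simple. A minimal, hypothetical example (the command name and body are illustrative, not the actual tasks) might look like this:

```php
#!/usr/bin/env php
<?php

require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\Console\Application;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

class AnonymizeUsersCommand extends Command
{
    protected function configure()
    {
        $this->setName('users:anonymize')
            ->setDescription('Anonymize user data with Faker');
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        // The real command would loop over users here, overwriting
        // personal data with values generated by Faker
        $output->writeln('User data anonymized');
    }
}

$application = new Application('Legacy console runner');
$application->add(new AnonymizeUsersCommand());
$application->run();
```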

Configuration

The configuration system in Zend 1 is downright painful - it requires that you define multiple environments in there. I have integrated DotEnv, but only part of the configuration has been migrated over, so there’s still plenty of work there.

What’s left to do

The code base is in a much better state than it was, but there’s still an awful lot to do. Zend 1 does apparently still work with PHP 7.1, but not with 7.2, so at some point we’ll likely need to leave Zend 1 behind entirely. This process has already started with us ditching Zend_Log for Monolog, and over time I plan to replace the various components of Zend 1 with other packages, either ones from newer versions of Zend Framework, or elsewhere. While there are many articles about migrating Zend 1 to later versions, very few of them actually seem to go into much detail - certainly nothing as useful as a step-by-step guide.

The database layer is particularly bad, and refactoring some of the methods into repository classes is only the first step in bringing that under control. Once that’s finished, I’m going to start going through the models and seeing if any more methods would make more sense as static methods on the repository, and possibly rename some of them. Then, we can think about the possibility of either incrementally migrating to another database interface (either a newer version of Zend DB, or Doctrine), or refactoring the existing models to have less boilerplate by using magic methods instead of getters and setters.

Dependency injection is a must at some point, but isn’t practical right now - Zend 1 controllers implement an interface that defines the constructor arguments, so you can’t pass in any additional parameters, meaning that will need to wait until the controllers no longer use Zend 1. I have been using the Zend Registry as a poor man’s DI container, since it allows sharing of a single object throughout the application, but it’s not a good solution in the long term.

The routing is also painful - Zend 1’s routes are all stored in the bootstrap file. I’d prefer to use something like league/route, which would allow for handling different HTTP methods to the same route using different controller methods, making it easier to separate out handling of GET and POST requests.

I also want at some point to set up a queue system for processing video and audio content - at present it’s handled by running a shell command from PHP, which means you can’t easily get feedback if something goes wrong. Migrating that to a queue system, backed with something like Redis, would help a great deal.

Share your stories

I’d love to hear any similar stories about refactoring legacy applications - how you’ve solved various problems with those legacy apps (or how you’d solve the ones I’ve had), tools you’ve used, and so on. Feel free to provide details in the comments.

A legacy project like this can be very frustrating to work with, but it can also feel quite rewarding to bring it under control over a period of time. My experience has been that you get the best results by working in small, regular steps, and over time your experience working with the code base will improve.

13th September 2018 8:10 pm

Mutation Testing With Infection

Writing automated tests is an excellent way of catching bugs during development and maintenance of your application, not to mention the other benefits. However, it’s hard to gauge the quality of your tests, particularly when you first start out. Coverage will give you a good idea of what code was actually run during the test, but it won’t tell you if the test itself actually tests anything worthwhile.

Infection is a mutation testing framework. The documentation defines mutation testing as follows:

Mutation testing involves modifying a program in small ways. Each mutated version is called a Mutant. To assess the quality of a given test set, these mutants are executed against the input test set to see if the seeded faults can be detected. If mutated program produces failing tests, this is called a killed mutant. If tests are green with mutated code, then we have an escaped mutant.

Infection works by running the test suite, carrying out a series of mutations on the source code in order to try to break the tests, and then collecting the results. The actual mutations carried out are not random - there is a set of mutations that get carried out every time, so results should be consistent. Ideally, all mutants should be killed by your tests - escaped mutants can indicate that either the line of mutated code is not tested, or the tests for that line are not very useful.

I decided to add mutation testing to my Laravel shopping cart package. In order to use Infection, you need to be able to generate code coverage, which means having either Xdebug or phpdbg installed. Once Infection is installed (refer to the documentation for this), you can run this command in the project directory to configure it:

$ infection

Infection defaults to using PHPUnit for the tests, but it also supports PHPSpec. If you’re using PHPSpec, you will need to specify the testing framework like this:

$ infection --test-framework=phpspec

Since PHPSpec doesn’t support code coverage out of the box, you’ll need to install a package for that - I used leanphp/phpspec-code-coverage.

On first run, you’ll be prompted to create a configuration file. Your source directory should be straightforward to set up, but at the next step, if your project uses interfaces in the source directory, you should exclude them. The rest of the defaults should be fine.
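For reference, the generated configuration file (infection.json.dist) ends up looking something like this - the Contracts exclusion here is an example of excluding an interfaces directory, and the exact paths will depend on your project:

```json
{
    "source": {
        "directories": [
            "src"
        ],
        "excludes": [
            "Contracts"
        ]
    },
    "logs": {
        "text": "infection-log.txt"
    },
    "timeout": 10
}
```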

I found that the first run gave a large number of uncovered results, but the second and later ones were more consistent - not sure if it’s an issue with my setup or not. Running it gave me this:

$ infection
You are running Infection with xdebug enabled.
    ____      ____          __  _
   /  _/___  / __/__  _____/ /_(_)___  ____
   / // __ \/ /_/ _ \/ ___/ __/ / __ \/ __ \
 _/ // / / / __/  __/ /__/ /_/ / /_/ / / / /
/___/_/ /_/_/  \___/\___/\__/_/\____/_/ /_/
    0 [>---------------------------] < 1 sec
Running initial test suite...
PHPUnit version: 6.5.13
27 [============================] 3 secs
Generate mutants...
Processing source code files: 5/5
Creating mutated files and processes: 43/43
.: killed, M: escaped, S: uncovered, E: fatal error, T: timed out
...................MMM...M.......M......... (43 / 43)
43 mutations were generated:
38 mutants were killed
0 mutants were not covered by tests
5 covered mutants were not detected
0 errors were encountered
0 time outs were encountered
Metrics:
Mutation Score Indicator (MSI): 88%
Mutation Code Coverage: 100%
Covered Code MSI: 88%
Please note that some mutants will inevitably be harmless (i.e. false positives).
Time: 21s. Memory: 12.00MB

Our test run shows 5 escaped mutants, and the remaining 38 were killed. We can view the results by looking at the generated infection-log.txt:

Escaped mutants:
================
1) /home/matthew/Projects/laravel-cart/src/Services/Cart.php:132 [M] DecrementInteger
--- Original
+++ New
@@ @@
{
$content = Collection::make($this->all())->map(function ($item) use($rowId) {
if ($item['row_id'] == $rowId) {
- if ($item['qty'] > 0) {
+ if ($item['qty'] > -1) {
$item['qty'] -= 1;
}
}
2) /home/matthew/Projects/laravel-cart/src/Services/Cart.php:132 [M] OneZeroInteger
--- Original
+++ New
@@ @@
{
$content = Collection::make($this->all())->map(function ($item) use($rowId) {
if ($item['row_id'] == $rowId) {
- if ($item['qty'] > 0) {
+ if ($item['qty'] > 1) {
$item['qty'] -= 1;
}
}
3) /home/matthew/Projects/laravel-cart/src/Services/Cart.php:132 [M] GreaterThan
--- Original
+++ New
@@ @@
{
$content = Collection::make($this->all())->map(function ($item) use($rowId) {
if ($item['row_id'] == $rowId) {
- if ($item['qty'] > 0) {
+ if ($item['qty'] >= 0) {
$item['qty'] -= 1;
}
}
4) /home/matthew/Projects/laravel-cart/src/Services/Cart.php:133 [M] Assignment
--- Original
+++ New
@@ @@
$content = Collection::make($this->all())->map(function ($item) use($rowId) {
if ($item['row_id'] == $rowId) {
if ($item['qty'] > 0) {
- $item['qty'] -= 1;
+ $item['qty'] = 1;
}
}
return $item;
5) /home/matthew/Projects/laravel-cart/src/Services/Cart.php:197 [M] OneZeroInteger
--- Original
+++ New
@@ @@
*/
private function hasStringKeys(array $items)
{
- return count(array_filter(array_keys($items), 'is_string')) > 0;
+ return count(array_filter(array_keys($items), 'is_string')) > 1;
}
/**
* Validate input
Timed Out mutants:
==================
Not Covered mutants:
====================

This displays the mutants that escaped, and includes a diff of the changed code for each one, so we can see that they all involve small changes to comparison or assignment operators.

The last one can be resolved easily because the comparison is superfluous - the result of count() can be evaluated as true or false by itself, so removing the > 0 at the end solves the problem quite neatly.
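One way to express that fix is to cast the count to a boolean, so the method still returns true or false but there is no comparison left to mutate. This sketch shows the method as a standalone function rather than the private method it is in the actual Cart class:

```php
<?php

// With the superfluous > 0 comparison removed, there is no comparison
// operator left for the OneZeroInteger mutator to change - count() is
// simply truthy for any non-empty filtered result.
function hasStringKeys(array $items)
{
    return (bool) count(array_filter(array_keys($items), 'is_string'));
}
```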

The other four mutations are somewhat harder. They all amend the decrement method’s conditions, showing that a single assertion doesn’t really fully check the behaviour. Here’s the current test for that method:

<?php

namespace Tests\Unit\Services;

use Tests\TestCase;
use Matthewbdaly\LaravelCart\Services\Cart;
use Mockery as m;

class CartTest extends TestCase
{
    /**
     * @dataProvider arrayProvider
     */
    public function testCanDecrementQuantity($data)
    {
        $data[0]['row_id'] = 'my_row_id_1';
        $data[1]['row_id'] = 'my_row_id_2';
        $newdata = $data;
        $newdata[1]['qty'] = 1;
        $session = m::mock('Illuminate\Contracts\Session\Session');
        $session->shouldReceive('get')->with('Matthewbdaly\LaravelCart\Services\Cart')->once()->andReturn($data);
        $session->shouldReceive('put')->with('Matthewbdaly\LaravelCart\Services\Cart', $newdata)->once();
        $uniqid = m::mock('Matthewbdaly\LaravelCart\Contracts\Services\UniqueId');
        $cart = new Cart($session, $uniqid);
        $this->assertEquals(null, $cart->decrement('my_row_id_2'));
    }
}

It should be possible to decrement the quantity if it’s more than zero, but not to go any lower. However, our current test only exercises decrementing it from 2 to 1, which doesn’t fully demonstrate this. We therefore need to add a few more assertions to cover taking it down to zero, and then trying to decrement it again. Here’s how we might do that.

<?php

namespace Tests\Unit\Services;

use Tests\TestCase;
use Matthewbdaly\LaravelCart\Services\Cart;
use Mockery as m;

class CartTest extends TestCase
{
    /**
     * @dataProvider arrayProvider
     */
    public function testCanDecrementQuantity($data)
    {
        $data[0]['row_id'] = 'my_row_id_1';
        $data[1]['row_id'] = 'my_row_id_2';
        $newdata = $data;
        $newdata[1]['qty'] = 1;
        $session = m::mock('Illuminate\Contracts\Session\Session');
        $session->shouldReceive('get')->with('Matthewbdaly\LaravelCart\Services\Cart')->once()->andReturn($data);
        $session->shouldReceive('put')->with('Matthewbdaly\LaravelCart\Services\Cart', $newdata)->once();
        $uniqid = m::mock('Matthewbdaly\LaravelCart\Contracts\Services\UniqueId');
        $cart = new Cart($session, $uniqid);
        $this->assertEquals(null, $cart->decrement('my_row_id_2'));
        $newerdata = $newdata;
        $newerdata[1]['qty'] = 0;
        $session->shouldReceive('get')->with('Matthewbdaly\LaravelCart\Services\Cart')->once()->andReturn($newdata);
        $session->shouldReceive('put')->with('Matthewbdaly\LaravelCart\Services\Cart', $newerdata)->once();
        $this->assertEquals(null, $cart->decrement('my_row_id_2'));
        $session->shouldReceive('get')->with('Matthewbdaly\LaravelCart\Services\Cart')->once()->andReturn($newerdata);
        $session->shouldReceive('put')->with('Matthewbdaly\LaravelCart\Services\Cart', $newerdata)->once();
        $this->assertEquals(null, $cart->decrement('my_row_id_2'));
    }
}

If we re-run Infection, we now get a much better result:

$ infection
You are running Infection with xdebug enabled.
    ____      ____          __  _
   /  _/___  / __/__  _____/ /_(_)___  ____
   / // __ \/ /_/ _ \/ ___/ __/ / __ \/ __ \
 _/ // / / / __/  __/ /__/ /_/ / /_/ / / / /
/___/_/ /_/_/  \___/\___/\__/_/\____/_/ /_/
Running initial test suite...
PHPUnit version: 6.5.13
22 [============================] 3 secs
Generate mutants...
Processing source code files: 5/5
Creating mutated files and processes: 41/41
.: killed, M: escaped, S: uncovered, E: fatal error, T: timed out
......................................... (41 / 41)
41 mutations were generated:
41 mutants were killed
0 mutants were not covered by tests
0 covered mutants were not detected
0 errors were encountered
0 time outs were encountered
Metrics:
Mutation Score Indicator (MSI): 100%
Mutation Code Coverage: 100%
Covered Code MSI: 100%
Please note that some mutants will inevitably be harmless (i.e. false positives).
Time: 19s. Memory: 12.00MB

Code coverage only tells you which lines of code are actually executed - it doesn’t tell you much about how effectively those lines are tested. Infection gives you a different insight into the quality of your tests, helping you write better ones. I’ve so far found it very useful for getting feedback on the quality of my tests. It’s interesting that PHPSpec tests seem to have a consistently lower proportion of escaped mutants than PHPUnit ones - perhaps the more natural workflow when writing specs with PHPSpec makes it easier to write good tests.
