Matthew Daly's Blog

I'm a web developer in Norfolk. This is my blog...

28th September 2020 3:50 pm

What I Want in a PHP CMS

I maintain a custom PHP legacy CMS for a client, and have also been building a micro-CMS as a learning project, so I’ve spent quite a lot of time in the last few years thinking about how content should be managed, and how applications to manage it should work.

I’ve also at least tinkered with a few different content management systems down the years, and I’ve found it depressing how many times Wordpress has been the default choice, despite it being probably the worst CMS I’ve ever had the gross misfortune to use. The argument that “it’s easy to install and use” doesn’t really hold water given that, in my experience, most users setting up a new Wordpress site don’t go through the five-minute install, but use their shared hosting provider’s setup wizard, which typically also supports several other content management systems. Also, it just doesn’t make sense to optimise for a short five-minute install that will never be repeated for that site over the rest of the workflow for maintaining the site, possibly for years - I’d rather have something that takes a bit more time to set up initially, but is easier to maintain.

So, what do I want in a PHP CMS? Here are my thoughts on my ideal CMS solution.

Managed entirely with Composer

Creating a new site using a CMS should be as simple as running something like the following command:

$ composer create-project --prefer-dist my/cms newsite

And updating it should be as simple as running the following:

$ composer update

Installing a plugin should be a case of running this:

$ composer require my/plugin-foo

It should then be possible to activate the plugin simply by listing it in a config file.

As far as possible, all of the functionality of the CMS should be contained in a single “core” package, and plugins should be their own Composer packages that can be installed, and then switched on and off in a simple config file. The initial creation step should be a case of checking out a boilerplate that contains only the absolute minimum - a front controller, a starting configuration, front end tooling, and some views - and gets the rest of the functionality from the core package.
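
As an illustration, the config file for activating plugins might look something like this (the file path, package and class names here are all hypothetical):

```php
<?php

// config/plugins.php - returns the list of active plugins.
// Installing a plugin with Composer makes it available;
// listing it here is what switches it on.
return [
    \My\PluginFoo\Plugin::class,
    \My\PluginBar\Plugin::class,
];
```

Because this file lives in the project repository rather than the database, enabling or disabling a plugin is a reviewable, revertable change like any other.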

Allow creating custom site boilerplates

It should be possible to create and publish alternative boilerplates.

For instance, if a CMS provides a default starting boilerplate that ships with Bootstrap, VueJS and Laravel Mix, I should be able to fork it, replace Bootstrap with Tailwind and Vue with React, and then use my version for future projects without having to spend a lot of time maintaining the fork.

Similarly, if there are certain plugins I use all the time, it should be possible to include those plugins as dependencies in my composer.json so that when I create a new project from my boilerplate, they’re present right from the start and I don’t have to faff around downloading and configuring them manually.

Plugin API should work like a framework

The best practices we’ve all spent years learning shouldn’t go out the window when working with a CMS. A good CMS should feel familiar if you’ve got some experience working in MVC frameworks, and it should embrace PSR standards. Adding a route should largely be a matter of writing a controller, mapping it to a route, and adding a view file, just as it would be in a framework.

There are always going to be some things that need to be CMS-specific, because registering things like routes is more complex in a general-purpose CMS than in a custom web app, as they can be defined in multiple arbitrary places. These can be handled by triggering events at various points in the CMS application’s lifecycle, so that plugin authors can set up listeners to do things such as register routes, add new view helpers and so on.
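
To sketch the principle (the class, event and route names here are invented for illustration, not taken from any real CMS), a plugin might register a listener that the core fires at the right point in the lifecycle:

```php
<?php

// A minimal, self-contained sketch of lifecycle events for plugins.
class EventDispatcher
{
    private array $listeners = [];

    public function listen(string $event, callable $listener): void
    {
        $this->listeners[$event][] = $listener;
    }

    public function dispatch(string $event, $payload): void
    {
        foreach ($this->listeners[$event] ?? [] as $listener) {
            $listener($payload);
        }
    }
}

class Router
{
    public array $routes = [];

    public function get(string $path, callable $handler): void
    {
        $this->routes[$path] = $handler;
    }
}

// A plugin hooks into a hypothetical "routes.registering" event...
$events = new EventDispatcher();
$events->listen('routes.registering', function (Router $router) {
    $router->get('/foo', fn () => 'Foo plugin response');
});

// ...and the CMS core fires it when building the route table.
$router = new Router();
$events->dispatch('routes.registering', $router);
```

The plugin never has to know when or how routing is set up - it just responds to the event, which keeps the plugin API decoupled from the core’s internals.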

Focused exclusively on content, not presentation

I’m increasingly convinced that the ability to amend presentation in a CMS is a misfeature. The purpose of a CMS is to manage content, not presentation, and making it able to amend presentation potentially gives unskilled site owners enough rope to hang themselves with, while making it actively harder for us devs.

I’ve certainly seen enough sites that a client has completely messed up after being given access to change the presentation in Wordpress, and because it’s stored in the database it’s not possible to roll back the changes easily the way it would be if the styling was stored in version control. And it’s definitely quicker for an experienced front end developer to edit a CSS file than to use Wordpress’s own tools for amending styling.

Use a proper templating system

As a templating language, PHP sucks:

  • It’s too easy to overlook escaping variables properly
  • Handling partials is difficult
  • There’s always the temptation to put in more logic than is advisable in the view layer, especially when deadlines are tight

Using a dedicated templating language, rather than a full programming language, in the view layer, means that entire classes of issues can be completely eradicated from the layer of the application that the developers who work with the CMS have the most dealings with. Developers are forced to move more complex logic into dedicated helpers, and can’t just leave it in the template “until we have time to clear it up”, which is often never.

Twig is solid, reliable, fast, easy to extend, and similar enough to other templating languages such as Handlebars and Django’s templates that if you’ve used any of those you can adapt easily, and it should probably be your first choice. Blade is also a solid choice, and if you want something whose syntax is not dissimilar to PHP you should probably consider Plates.
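
For instance, a Twig template escapes output by default and makes partials a one-liner (the template and variable names here are just illustrative):

```twig
{# post.html.twig #}
{% extends "layout.html.twig" %}

{% block content %}
    <h1>{{ post.title }}</h1>   {# escaped automatically #}
    {{ post.body|raw }}         {# unescaped output has to be explicit #}
    {% include "partials/tags.html.twig" with { tags: post.tags } %}
{% endblock %}
```

Note that the dangerous case - unescaped output - is the one that requires a deliberate opt-in, which is the opposite of plain PHP.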

Configuration with version control in mind

Wordpress does this particularly badly because it actively encourages storing sensitive data, such as database credentials, in a PHP file (which is then kept in the web root…). A good, solid way to store configuration details in PHP is to store generic details (for instance, a list of the active plugins, which will be the same for production and the local copy developers run) for that project in either a YAML or PHP file, and store install-specific details in either a .env file, or as environment variables.
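
As a sketch of that split (file names hypothetical), the versioned config file holds only generic settings and reads anything sensitive from the environment:

```php
<?php

// config/app.php - safe to commit, identical across environments
return [
    'plugins' => [
        'blog',
        'contact-form',
    ],
    // Install-specific values come from a .env file or real
    // environment variables, and never live in the repository
    'db' => [
        'host'     => getenv('DB_HOST'),
        'password' => getenv('DB_PASSWORD'),
    ],
];
```

With this arrangement the repository can be cloned or open-sourced without leaking credentials, and each environment supplies its own values.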

Custom content types

It should be easy to create a new content type, and define specific fields for that content type. For instance, if I’m building a recipe site, I should be able to define a Recipe type that has the following attributes:

  • Ingredients
  • Cover image
  • Title
  • Method

Then all Recipe instances should have those attributes, and it shouldn’t be necessary to bastardise a different content type to make it work properly. It should also be possible to lock down the ability to create custom content types so it’s either limited to admins, or they’re defined in code, so end users can’t create arbitrary content types.
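
Defining such a type in code might look something like this hypothetical configuration (the field types and file layout are invented for illustration):

```php
<?php

// content_types/recipe.php - a code-defined content type, so only
// developers (not end users) can add or change types
return [
    'name' => 'Recipe',
    'fields' => [
        'title'       => ['type' => 'string', 'required' => true],
        'cover_image' => ['type' => 'image'],
        'ingredients' => ['type' => 'list'],
        'method'      => ['type' => 'richtext'],
    ],
];
```

Keeping the definition in code also means content types are version-controlled alongside the rest of the project.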

Custom taxonomies

It should be possible to define your own custom taxonomies for content. Continuing the Recipe example above, we should be able to define three sorts of taxonomy:

  • Dietary requirements (eg vegetarian, vegan, gluten-free etc)
  • Meal (eg breakfast, lunch, dinner, snacks)
  • Region (eg Indian, Chinese, Italian)

A taxonomy should be appropriately named, and again it shouldn’t be necessary to abuse generic categories and tags to categorise content. As with the content types, it should also be possible to lock them down.

A better solution than rich text for managing content

Rich text is not a great solution for more complex page layouts, and tends to be abused horribly to do all sorts of things. There’s a tendency to dump things like snippets for Google Maps, tables, galleries, Javascript widgets and many more into rich text. This means that it also loses the semantic value of the content - rather than being a paragraph, then a map of the local area, then a photo carousel, then another paragraph, it’s just a single blob of text. That blob can’t easily be migrated to another solution - if, say, you decide to swap Google Maps for OpenStreetMap, or change one carousel for another, you have to go through and manually replace every map and carousel, which is a chore.

Wagtail isn’t a PHP CMS, but it has an interesting approach to rich text handling for complex content, inspired by Sir Trevor, based around blocks of different types. The Gutenberg editor in Wordpress 5.0 and up isn’t a million miles away from this, either. For simpler sites, it’s probably better to limit users to a Markdown editor and add helpers for adding more complex functionality directly in the template, such as a gallery helper.

A decent command-line runner

There are always going to be certain tasks that are best done from the command line. A decent CMS should have a command line tool that:

  • Allows appropriate admin tasks, such as going into maintenance mode and flushing caches, to be done from the command line
  • Can be easily extended by plugin authors to add their own commands
  • Assists developers when working locally, such as by generating boilerplate when necessary (so, for instance, you can run a command to generate the skeleton for a new plugin)

There’s no excuse not to do this when building a CMS. Symfony’s console component is solid, easy to work with, and a good base for whatever commands you need to write.
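
A minimal command built on the console component looks something like this sketch - the command name is my own, and the actual cache-flushing logic is elided:

```php
<?php

use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

class FlushCacheCommand extends Command
{
    protected static $defaultName = 'cache:flush';

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        // ...flush the cache here...
        $output->writeln('Cache flushed.');

        return Command::SUCCESS;
    }
}
```

A plugin author can ship commands like this in their own package and register them with the CMS’s console application, which covers the "easily extended" requirement above.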

Headless as an option

The rise of headless CMSs, both as a service and as software packages, hasn’t surprised me. Nowadays it’s quite common to have to publish the same content to multiple channels, which might be one or more websites as well as mobile apps, and it makes sense to be able to centralise that content in one place rather than have to copy it in some fashion.

It’s therefore very useful to have an API that can retrieve that content for republishing. The same API can also be used with Javascript libraries like React and Vue to build sophisticated frontends that consume that data.

Which solutions do this best?

You’ll probably have got the idea at this point that Wordpress isn’t my first choice. It was created in a different era, and hasn’t kept up well compared to many of its contemporaries, and there are many technical issues with it that are at this point effectively impossible to ever fix. For instance, you could potentially store the post meta in the same table as the rest of the post data by using a JSON field in current versions of MySQL, which would make it more performant, but it seems unlikely it could ever be migrated across to use that solution.

Frustratingly, its mindshare means it’s erroneously seen as some kind of “gold standard” by inexperienced developers and non-technical clients, and there seems to be a common misconception that it’s the only solution that lets users update the content themselves (when in fact that’s the whole point of ANY CMS). Using Bedrock and a theme system like Sage that supports a proper templating system helps solve some of the problems with Wordpress, but not all.

I have tried a few solutions that come very close to what I want:

  • Bolt seems, from what I’ve seen so far, to be effectively a “better Wordpress”, in that the interface and functionality are broadly familiar to anyone already used to Wordpress, but it uses Twig, is built on Symfony, and has a proper command-line runner. I haven’t tried it since version 4 was released a few days back, so I will probably give it a spin before long.
  • Grav looks like a great solution for brochure sites. I’ve long thought that these sites, which often run on shared hosting, don’t really need a database-backed solution, and a flat-file solution is probably a better bet in most cases. Grav is simple to set up and configure, has a decent admin interface, and uses Twig for the views, making it easy to theme.
  • Statamic is my current favourite and ticks almost all of the boxes mentioned above. It’s built on Laravel, and can be added to an existing Laravel site if required. It also allows you access to the full power of the underlying framework if you need it, and ships with a decent front-end boilerplate that includes Tailwind. The only downside compared to Wordpress is that it’s a paid-for solution, but the price is entirely reasonable, and if it’s for a client build you’ll not only save on all the premium plugins you don’t need, but you’ll probably save time on the site build.

Payment shouldn’t be an issue if you’re doing client work, unless the cost is huge. You’re getting paid for building something, and if buying an off-the-shelf product saves you time, it’s well worth it. Back when Laravel Nova was first released, a lot of people were complaining that it wasn’t free, but that was neither here nor there - the cost is only equivalent to a few hours of an experienced developer’s time, and it would take a lot longer to build out the same functionality, and the same is true of any half-decent CMS. In the early days of the web, one company I used to work for sold a CMS that was considered cheap by the standards of the time at £495, plus £96 a year, for the entry level version - Statamic is significantly cheaper than that.

It’s always a good idea to be aware of the various CMS options around. Wordpress isn’t a great solution and there are plenty of options that are technically better, easier to use, more secure, and work out cheaper when you consider the total cost of ownership. I’ll probably be favouring Statamic for the foreseeable future when building content-based websites, but that doesn’t mean I won’t look elsewhere from time to time.

13th June 2020 1:50 pm

Flow Typed AJAX Responses With React Hooks

I’m a big fan of type systems in general. Using Psalm to find missing type declarations and incorrect calls in PHP has helped me out tremendously. However, I’m not a big fan of Typescript. The idea of creating a whole new language, primarily just to add types to Javascript, strikes me as a fundamentally bad idea given how many languages that compile to Javascript have fallen by the wayside. Flow seems like a much better approach since it adds types to the language rather than creating a new language, and I’ve been using it on my React components for a good while now. However, there are a few edge cases that can be difficult to figure out, and one of those is any generic AJAX component that may be reused for different requests.

A while back I wrote the following custom hook, loosely inspired by axios-hooks (but using the Fetch API) to make a query to a GraphQL endpoint:

import { useCallback, useState, useEffect } from "react";

function useFetch(url, query) {
  const [data, setData] = useState(null);
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState(false);

  const fetchData = useCallback(() => {
    setLoading(true);
    fetch(url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Accept': 'application/json',
      },
      body: JSON.stringify({query: query})
    }).then(r => r.json())
      .then((data) => {
        setData(data.data);
        setLoading(false);
        setError(false);
      })
      .catch(() => {
        // Flag the failure so consumers can render an error state
        setError(true);
        setLoading(false);
      });
  }, [url, query]);

  useEffect(() => {
    fetchData();
  }, [url, query, fetchData]);

  return [{
    data: data,
    loading: loading,
    error: error
  }, fetchData];
}

export default useFetch;

When called, the hook receives two parameters, the URL to hit and the query to make, and returns an array containing an object with the following values, plus a function for re-running the request:

  • loading - a boolean that specifies if the hook is loading right now
  • error - a boolean that specifies if an error has occurred
  • data - the response data from the endpoint, or null

Using this hook, it was then possible to make an AJAX request when a component was loaded to populate the data, as in this example:

import React from 'react';
import useFetch from './Hooks/useFetch';
import marked from 'marked';
import './App.css';

function App() {
  const url = `/graphql`;
  const query = `query {
    posts {
      title
      slug
      content
      tags {
        name
      }
    }
  }`;

  const [{data, loading, error}] = useFetch(url, query);

  if (loading) {
    return (<h1>Loading...</h1>);
  }

  if (error) {
    return (<h1>Error!</h1>);
  }

  const posts = data ? data.posts.map((item) => (
    <div key={item.slug}>
      <h2>{item.title}</h2>
      <div dangerouslySetInnerHTML={{__html: marked(item.content)}} />
    </div>
  )) : [];

  return (
    <div className="App">
      {posts}
    </div>
  );
}

export default App;

This hook is simple, and easy to reuse. However, it’s difficult to type the value of data correctly, since it will be different for different endpoints, and given that it may be reused for almost any endpoint, you can’t cover all the acceptable response types. We need to be able to specify the response that is acceptable in that particular context.

Generics to the rescue

Flow provides a solution for this in the shape of generic types. By passing in a polymorphic type using <T> in the function declaration, we can then refer to that type when specifying what data should look like:

//@flow
import { useCallback, useState, useEffect } from "react";

function useFetch<T>(url: string, query: string): [{
  data: ?T,
  loading: boolean,
  error: boolean
}, () => void] {
  const [data, setData]: [?T, ((?T => ?T) | ?T) => void] = useState(null);
  const [loading, setLoading]: [boolean, ((boolean => boolean) | boolean) => void] = useState(false);
  const [error, setError]: [boolean, ((boolean => boolean) | boolean) => void] = useState(false);

  const fetchData = useCallback(() => {
    setLoading(true);
    fetch(url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Accept': 'application/json',
      },
      body: JSON.stringify({query: query})
    }).then(r => r.json())
      .then((data) => {
        setData(data.data);
        setLoading(false);
        setError(false);
      })
      .catch(() => {
        setError(true);
        setLoading(false);
      });
  }, [url, query]);

  useEffect(() => {
    fetchData();
  }, [url, query, fetchData]);

  return [{
    data: data,
    loading: loading,
    error: error
  }, fetchData];
}

export default useFetch;

Then, when calling the hook, we can define a type that represents the expected shape of the data (here called Data), and specify that type when calling the hook, as in this example:

//@flow
import React from 'react';
import useFetch from './Hooks/useFetch';
import marked from 'marked';
import './App.css';

type Data = {
  posts: Array<{
    title: string,
    slug: string,
    content: string,
    tags: Array<{
      name: string
    }>
  }>
};

function App() {
  const url = `/graphql`;
  const query = `query {
    posts {
      title
      slug
      content
      tags {
        name
      }
    }
  }`;

  const [{data, loading, error}] = useFetch<Data>(url, query);

  if (loading) {
    return (<h1>Loading...</h1>);
  }

  if (error) {
    return (<h1>Error!</h1>);
  }

  const posts = data ? data.posts.map((item) => (
    <div key={item.slug}>
      <h2>{item.title}</h2>
      <div dangerouslySetInnerHTML={{__html: marked(item.content)}} />
    </div>
  )) : [];

  return (
    <div className="App">
      {posts}
    </div>
  );
}

export default App;

That way, we can specify a completely different shape for our response data every time we call a different endpoint, without creating a different hook for every different endpoint, and still enjoy properly typed responses from our hook.

Generics can be useful for many other purposes, such as specifying the contents of collections. For instance, if you had a Collection object, you could use a generic type to specify that any one instance must consist of instances of a given type. Flow would then flag it as an error if you assigned an item of the wrong type to that collection, thus making some unit tests redundant.
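
A typed collection along those lines might look like this sketch in Flow (the class and type names are my own):

```javascript
//@flow
// A generic collection - Flow flags any attempt to add an
// item whose type doesn't match the collection's parameter.
class Collection<T> {
  items: Array<T> = [];

  add(item: T): void {
    this.items.push(item);
  }
}

type Post = { title: string, slug: string };

const posts: Collection<Post> = new Collection();
posts.add({ title: 'My post', slug: 'my-post' });
// posts.add('not a post'); // Flow would report a type error here
```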

11th March 2020 9:20 pm

Caching the Laravel User Provider With a Decorator

A couple of years ago I posted this article about constructing a caching user provider for Laravel. It worked, but with the benefit of hindsight I can now see that there were a number of issues with this solution:

  • Because it extended the existing Eloquent user provider, it was dependent on the internals of that remaining largely the same - any change in how that worked could potentially break it
  • For the same reason, if you wanted to switch to a different user provider, you’d need to add the same functionality to that provider, either by writing a new provider from scratch or extending an existing one

I’ve used the decorator pattern a few times in the past, and it’s a good fit for situations like this where you want to add functionality to something that implements an interface. It allows you to separate out one part of the functionality (in this case, caching) into its own layer, so it’s not dependent on any one implementation and can wrap any other implementation of that same interface you wish. Also, as long as the interface remains the same, there likely won’t be any need to change it when the implementation that is wrapped changes. Here I’ll demonstrate how to create a decorator to wrap the existing user providers.

If we only want to cache the retrieveById() method, like the previous implementation, the decorator class might look something like this:

<?php

namespace App\Auth;

use Illuminate\Contracts\Auth\Authenticatable;
use Illuminate\Contracts\Auth\UserProvider;
use Illuminate\Contracts\Cache\Repository;

final class UserProviderDecorator implements UserProvider
{
    /**
     * @var UserProvider
     */
    private $provider;

    /**
     * @var Repository
     */
    private $cache;

    public function __construct(UserProvider $provider, Repository $cache)
    {
        $this->provider = $provider;
        $this->cache = $cache;
    }

    /**
     * {@inheritDoc}
     */
    public function retrieveById($identifier)
    {
        return $this->cache->remember('id-' . $identifier, 60, function () use ($identifier) {
            return $this->provider->retrieveById($identifier);
        });
    }

    /**
     * {@inheritDoc}
     */
    public function retrieveByToken($identifier, $token)
    {
        return $this->provider->retrieveByToken($identifier, $token);
    }

    /**
     * {@inheritDoc}
     */
    public function updateRememberToken(Authenticatable $user, $token)
    {
        return $this->provider->updateRememberToken($user, $token);
    }

    /**
     * {@inheritDoc}
     */
    public function retrieveByCredentials(array $credentials)
    {
        return $this->provider->retrieveByCredentials($credentials);
    }

    /**
     * {@inheritDoc}
     */
    public function validateCredentials(Authenticatable $user, array $credentials)
    {
        return $this->provider->validateCredentials($user, $credentials);
    }
}

It implements the same interface as the user providers, but accepts two arguments in the constructor, which are injected and stored as properties:

  • Another instance of Illuminate\Contracts\Auth\UserProvider
  • An instance of the cache repository Illuminate\Contracts\Cache\Repository

Most of the methods just defer to their counterparts on the wrapped instance - in this example I have cached the response to retrieveById() only, but you can add caching to the other methods easily enough if need be. You do of course still need to flush the cache at appropriate times, which is out of scope for this example, but can be handled by model events as appropriate, as described in the prior article.
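
As a sketch of that invalidation (the model class is the one from the config below, and the cache key must match the one used in retrieveById()), the model events could be registered in a service provider’s boot() method like this:

```php
<?php

// In a service provider's boot() method - forget the cached
// entry whenever the user is saved or deleted
use App\Eloquent\Models\User;
use Illuminate\Support\Facades\Cache;

User::saved(function (User $user) {
    Cache::forget('id-' . $user->getAuthIdentifier());
});

User::deleted(function (User $user) {
    Cache::forget('id-' . $user->getAuthIdentifier());
});
```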

Then you add the new decorator as a custom user provider, but crucially, you need to first resolve the provider you’re going to use, then wrap it in the decorator:

<?php

namespace App\Providers;

use App\Auth\UserProviderDecorator;
use Auth;
use Illuminate\Auth\EloquentUserProvider;
use Illuminate\Contracts\Cache\Repository;
use Illuminate\Foundation\Support\Providers\AuthServiceProvider as ServiceProvider;

class AuthServiceProvider extends ServiceProvider
{
    /**
     * The policy mappings for the application.
     *
     * @var array
     */
    protected $policies = [
        'App\Model' => 'App\Policies\ModelPolicy',
    ];

    /**
     * Register any authentication / authorization services.
     *
     * @return void
     */
    public function boot()
    {
        $this->registerPolicies();

        Auth::provider('cached', function ($app, array $config) {
            $provider = new EloquentUserProvider($app['hash'], $config['model']);
            $cache = $app->make(Repository::class);

            return new UserProviderDecorator($provider, $cache);
        });
    }
}

Finally, set up the config to use the caching provider:

'providers' => [
    'users' => [
        'driver' => 'cached',
        'model' => App\Eloquent\Models\User::class,
    ],
],

This is pretty rough and ready, and could possibly be improved upon by allowing you to specify a particular provider to wrap in the config, as well as caching more of the methods, but it demonstrates the principle effectively.

By wrapping the existing providers, you can change the behaviour of the user provider without touching the existing implementation, which is in line with the idea of composition over inheritance. Arguably it’s more complex, but it’s also more flexible - if need be you can swap out the wrapped user provider easily, and still retain the same caching functionality.

12th February 2020 10:40 pm

The Trouble With Integrated Static Analysis

I’ve always been a big fan in general of tools that provide feedback about the quality of my code. The development role in which I spent the most time was one in which I had no peer feedback or mentoring at all, and while I could definitely have done with more peer review than I had, automated tools helped fill the gap a little bit. When I started building my first Phonegap app, about a year after I started programming professionally, it used far more Javascript than I’d ever used before, and JSLint was very helpful in instilling good practices at that early stage in my career.

In addition, I often find that using an automated tool largely eliminates the issue of ego - if your colleague Bob tells you something is a bad practice, you can potentially dismiss it as “That’s just Bob’s preferences”, whereas an automated tool is potentially much more objective. Nowadays, my typical set of static analysis tools on a project includes:

  • ESLint
  • Flow
  • PHP CodeSniffer
  • Psalm

However, I’m always dubious of using any static analysis tool that’s tightly integrated with a particular editor or IDE. In this post, I will explain my reasoning.

In-editor feedback

Having instant feedback on the quality of your code is tremendously useful. Sure, you can run something like CodeSniffer from the command line and see what the problems are, but that’s nowhere near as useful as having it actually in your code. If you work on a legacy code base, there’s no way in hell you can wade through a long list of output in the terminal and fix them without losing the will to live. By comparison, actually seeing something flagged as an error where it actually occurs makes the mental cost of fixing it much smaller - you can see it in context, and can usually therefore resolve it more easily.

However, that doesn’t explicitly require that any one tool form an integral part of the editor. Most editors can hand off linting and static analysis to other, standalone tools, and doing so offers the following advantages:

  • Less dependence on a given development environment - it’s always a struggle if you wind up stuck using a development environment you dislike (I grew to utterly despise Netbeans in my first role), but if you can use generic feedback tools that can be integrated with just about any editor, your team can use the development environment that suits them most, while still all benefiting from the feedback these tools provide
  • These tools tend to be open source, meaning you have the security of knowing that if the creator ceases maintaining it, either someone else may pick up the baton, or you can choose to fork it yourself. If a commercial IDE provider ceases trading, it’s likely you won’t be able to use their offering at all at some point in the future.

Nowadays I use vim-ale in Neovim, and that provides real-time feedback for a number of linters and static analysis tools, including all those I mentioned above. I have comprehensive information on any issues in my code, and because any configuration is in simple text files that form part of the repository, it’s easy to update those settings for all developers working on the project to ensure consistency.

It’s possible that an integrated solution might offer a few advantages in terms of tighter integration with autocompletion and other functionality allowing for it, but whether they outweigh the tradeoffs mentioned here is dependent entirely on the implementation and how useful it is for any one team.

Continuous integration to the rescue

There’s another issue I have with this sort of tightly integrated static analysis, which is probably the biggest, and that is that the feedback is available only at the level of an individual developer, not the team.

It’s great providing all this feedback to developers, but what if they just ignore it? Not all developers have had the sort of experience that leads one to really appreciate the value of coding standards and type hints, particularly if they’ve worked primarily on small or greenfield projects, or in environments where the emphasis was on churning out large quantities of work, and getting developers to tidy up the sort of issues these tools identify can sometimes be a tough sell when faced with code which, at least superficially, works.

Suppose you take on a new developer and ask them to work alone on a particular project for several months. Due to your own workload you can’t easily schedule code reviews with them, so you don’t see what they’re writing until they’re done. Then you take a look at what they’ve written and it’s full of issues that the IDE caught, but the developer either didn’t bother to fix, or didn’t know how to. What they’ve done may well work, but they’ve introduced a huge morass of technical debt that will slow down future development for the foreseeable future.

If your static analysis tools work only in the context of a given editor or IDE, and the new dev introduces issues in the code base and doesn’t resolve them because they don’t know how, or don’t see the value, then the first you know about it is when you clone the repo yourself and open it up. With a solution that runs in a CI environment, you can catch any reduction in code quality when it’s pushed. Sure, code reviews can do that too, but they require manual input in a way that not every team is willing to spare, whereas a CI server, once set up, is largely self-sustaining. And you could run one tool locally and another in a CI environment, but you can’t be sure they’ll catch all the same issues.

Now consider the same scenario if you’re using a separate code quality tool that’s integrated both into the editor, and your continuous integration workflow. Obviously, it will depend on your personal CI setup, but once code quality either begins to drop, or drops below a given level, the CI server will mark the build as failed, and you’ll be alerted. You can therefore then raise the issue with the new dev before it gets out of hand, and provide whatever support they need to resolve the problem there and then.

I personally maintain a legacy project in which, at one point prior to my arrival, a junior dev introduced an enormous amount of technical debt after working on it alone for six months. An integrated linter or static analysis tool probably wouldn’t have stopped that from happening, for the reasons stated above, but if a similar tool were part of the CI workflow, it could have been flagged as an issue much earlier and dealt with. Yes, leaving a junior dev unsupported and unsupervised for that length of time isn’t a great idea, but it happens, particularly in busy environments such as agencies. A good CI setup lets you see if someone is adding these kinds of issues, and act to nip it in the bud before it becomes too much of a problem, which is ultimately good for that developer’s career.

Peer pressure can also be a strong motivating factor under these circumstances. By simply displaying a metric, you encourage people’s natural competitiveness, so displaying code quality stats in your CI dashboard will encourage your team to do better in this regard, because no-one wants to be seen to be letting the team down by producing substandard code.

For these reasons, when it comes to feedback on code quality, I would always prefer to rely on a standalone tool that can be integrated with an editor or run as part of a continuous integration workflow, rather than on any IDE-specific functionality.
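To make this concrete, here’s a minimal sketch of what such a CI step might look like, using GitHub Actions and PHPStan purely as examples - the workflow name, PHP version, analysis level, and paths are all assumptions, and any static analysis tool with a command-line runner would slot in the same way:

```yaml
# .github/workflows/static-analysis.yml - hypothetical example workflow
name: Static analysis

on: [push, pull_request]

jobs:
  phpstan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: shivammathur/setup-php@v2
        with:
          php-version: '7.4'
      - run: composer install --no-progress --prefer-dist
      # Fail the build if the code doesn't pass at the chosen level
      - run: vendor/bin/phpstan analyse src --level=5
```

Because the tool is just a command-line binary, the same command can be run locally or in an editor plugin, so everyone sees the same results.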

9th February 2020 10:10 am

Don't Use Stdclass

The digital agency I work for specialises in marketing, so some of my work tends to relate to mailing lists. This week I was asked to look at an issue in a Laravel-based export system built by a co-worker who left a few months ago. This particular application pulled data from the Campaign Monitor API about campaigns, and it returned the data as instances of stdClass, something that I’m not terribly happy about.

Now, this was an implementation detail of the Campaign Monitor PHP SDK, which is old and creaky (to say the least…) and doesn’t exactly abide by modern best practices. However, the legacy application I maintain also does this (there’s a lot of stuff that needs sorting out on it, and sadly replacing the stdClass instances is a long way down the list), so I thought it was an interesting subject for a blog post. I consider using stdClass, even just as a dumb container for data, to be an antipattern, and I thought it would be useful to explain my thoughts on this.

Why shouldn’t I use stdClass?

Readability

One of the first things I learned about throwing exceptions is that they should be meaningful. It’s trivial to define a named exception and use that to specify the type of exception, and you can then capture exceptions to handle them differently elsewhere in the application. For instance, a validation error is entirely due to a user submitting the wrong details, and should therefore be handled in an entirely different manner to the database going down.
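As a quick PHP sketch of that idea (the exception names, `$repository`, and the helper functions here are illustrative, not from any particular framework):

```php
<?php

// Named exceptions let the caller treat different failure modes differently
class ValidationException extends Exception {}
class DatabaseUnavailableException extends RuntimeException {}

try {
    $repository->save($user);
} catch (ValidationException $e) {
    // The user submitted bad data - show the form again with an error
    return redirectBackWithError($e->getMessage());
} catch (DatabaseUnavailableException $e) {
    // Something is wrong on our end - log it and show a generic error page
    return serverErrorResponse($e);
}
```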

The same is applicable to an object. If an API client returns an instance of stdClass, that doesn’t tell me anything about what that object is. If I need to pass it elsewhere in a large application, it may become very difficult to understand what it represents without tracking it back to where it came from, which will slow me down. If instead I use a named class, the name can convey what it represents. It may seem like overkill to create a class that adds no new functionality, but the mere fact that it has a name makes your code more meaningful and easier to understand. I can also add DocBlock comments to describe it further.

Of course, just giving something a generic name like Container isn’t much of an improvement, and coming up with meaningful names for classes and methods is notoriously hard. As always, give some serious thought to what your objects represent, and try to give them names that will make it easy to understand what they are when you look at the code base again six months down the line.
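As a sketch of the Campaign Monitor scenario mentioned earlier (the class and property names are purely illustrative), even a class with no behaviour at all is an improvement:

```php
<?php

/**
 * A single campaign returned from the Campaign Monitor API.
 */
class Campaign
{
    /** @var string */
    public $id;

    /** @var string */
    public $name;

    /** @var string */
    public $sentDate;
}

// At the point of use, the difference is immediately obvious:
$campaign = new stdClass();  // could be anything, from anywhere
$campaign = new Campaign();  // an API campaign - no detective work needed
```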

Type hinting

A related argument is that it makes type hinting more useful. You can type hint stdClass, but as noted above, it tells the developer receiving that object very little about where it’s come from or what it represents, and the hint itself offers little value, since a stdClass could mean anything, and could have been created anywhere in the application.

By contrast, a named class provides much more information about what the type hinted parameter represents. For instance, naming your class something like App\Api\Response\Item makes it much clearer that the object represents an individual item returned from an API, and other developers working on the same code base (including your future self, who may not necessarily remember all of the details of how you’re implementing it now) will have less trouble understanding what is going on. There’s also a much-reduced likelihood of the same class being used to represent completely different types of data.
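A quick sketch of how that reads in practice (the namespace and the formatter class are hypothetical):

```php
<?php

namespace App\Api\Response;

class Item
{
    /** @var int */
    public $id;

    /** @var string */
    public $title;
}

class ItemFormatter
{
    // The signature alone tells the reader an API response item is expected,
    // which a stdClass type hint never could
    public function format(Item $item): string
    {
        return sprintf('%d: %s', $item->id, $item->title);
    }
}
```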

New functionality

Finally, are you sure you don’t want to add any functionality? PHP includes a number of interfaces that can be extremely useful for working with this sort of data.

First off, the ArrayAccess interface allows you to access an object’s values using array syntax, which can be useful. Also, implementing either Iterator or IteratorAggregate will allow you to iterate over the object using a foreach loop. The Countable interface is less useful, since all it does is let you get the number of items, but it’s sometimes handy. Finally, the Serializable interface lets you serialise an object so it can be stored in a suitable medium, which can sometimes be useful.
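Here’s a sketch of a simple container implementing three of those interfaces (the class name and behaviour are illustrative, written against PHP 7-era signatures):

```php
<?php

class ItemCollection implements ArrayAccess, IteratorAggregate, Countable
{
    private $items;

    public function __construct(array $items = [])
    {
        $this->items = $items;
    }

    public function offsetExists($offset): bool
    {
        return isset($this->items[$offset]);
    }

    public function offsetGet($offset)
    {
        return $this->items[$offset] ?? null;
    }

    public function offsetSet($offset, $value): void
    {
        if ($offset === null) {
            $this->items[] = $value;
        } else {
            $this->items[$offset] = $value;
        }
    }

    public function offsetUnset($offset): void
    {
        unset($this->items[$offset]);
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->items);
    }

    public function count(): int
    {
        return count($this->items);
    }
}

$collection = new ItemCollection(['foo', 'bar']);
echo $collection[0];       // array syntax, via ArrayAccess
echo count($collection);   // via Countable
foreach ($collection as $item) {
    // iteration via IteratorAggregate
}
```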

The same also applies to some of the magic methods. The __toString() method, in particular, can be useful for returning a suitable string-based representation of an object - for instance, if it represents an item in a database, it might be appropriate to use this to return the ID of the item, or a text representation of it (e.g. the title for a blog post, or the product name for a product in an e-commerce site). The __get() and __set() magic methods may be a bit more dubious, but they can be useful if your object is intended to be just a dumb container, as they allow you to make the properties on the object private, but keep them accessible without writing explicit getters and setters. I’d also suggest adding __debugInfo() to objects unless you have a good reason not to - when you’re debugging, it can be hard to see the wood for the trees, and returning only the most pertinent data can make your job a lot easier.
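A sketch of a dumb container using those magic methods (the Post class and its attributes are illustrative):

```php
<?php

class Post
{
    private $attributes;

    public function __construct(array $attributes = [])
    {
        $this->attributes = $attributes;
    }

    // Keep properties private, but still readable and writable
    public function __get($name)
    {
        return $this->attributes[$name] ?? null;
    }

    public function __set($name, $value)
    {
        $this->attributes[$name] = $value;
    }

    // A blog post is most usefully represented by its title
    public function __toString(): string
    {
        return (string) ($this->attributes['title'] ?? '');
    }

    // Show only the pertinent data when the object is var_dump()ed
    public function __debugInfo(): array
    {
        return $this->attributes;
    }
}
```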

Of course, you don’t have to implement all this functionality from scratch for every class. It often makes sense to create an abstract class that implements this sort of basic container functionality, and then base all your container classes on that, overriding it when necessary.
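That base class might look something like this sketch (the names are hypothetical, and you’d pick whichever interfaces and magic methods suit your use case):

```php
<?php

// Shared container behaviour lives in one abstract class...
abstract class Container implements IteratorAggregate, Countable
{
    protected $attributes;

    public function __construct(array $attributes = [])
    {
        $this->attributes = $attributes;
    }

    public function __get($name)
    {
        return $this->attributes[$name] ?? null;
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->attributes);
    }

    public function count(): int
    {
        return count($this->attributes);
    }
}

// ...so each concrete container is trivial to define, but still has a
// meaningful name that can be used in type hints
class CampaignResult extends Container {}
class Subscriber extends Container {}
```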

Summary

Hopefully, this has made the case for using named classes instead of stdClass clear, and shown how much benefit you can get from not just using named classes, but creating your own base container class for them. I’m of the opinion that PHP should probably make stdClass abstract to prevent it from being instantiated like this, and indeed I’m seriously considering creating a CodeSniffer sniff to detect instances of stdClass being instantiated and raise them as an error.

