Matthew Daly's Blog

I'm a web developer in Norfolk. This is my blog...

22nd August 2015 7:32 pm

When You Should Not Use WordPress

I must admit, I’ve had a rather bad experience with WordPress recently. The site in question was an e-commerce site, built with WordPress and WooCommerce. In development, we originally put the site on shared hosting, but after a while the hosting company told us off because it was using too much database space, so we moved to a VPS earlier than we normally would. With the benefit of hindsight, we probably should have seen that as the first warning sign.

Then, once the site was up and running on the VPS, it got slower and slower, and eventually the server was killing MySQL off because it was using too many resources. I decided to install a benchmarking plugin and investigate why it was so slow. On loading the home page, it became obvious why the site was so slow - there were in excess of 300 queries on the home page. Looking elsewhere, some other pages were even worse, with one making over 1,000 queries!

At this point, I was practically hyperventilating. If I had written a web app that made that many queries on one page from scratch, I’d be seriously considering whether I was cut out for this industry. With an off-the-shelf CMS, you do have to accept some degree of bloat as a trade-off for quicker development time, but these numbers beggar belief.

I was able to mitigate this to some extent. First, I cut down the number of products shown on individual pages and audited the installed plugins, removing ones we could do without. This still left a lot more queries than I liked.

The next step was to enable caching. I installed Memcached and Varnish (incidentally, if you haven’t used Varnish before, you should check it out - it can make a huge difference for slow sites). I then installed and configured W3 Total Cache to work with them. This didn’t solve the fundamental problem of the initial page loads being too database-intensive, but it did mean that the result was cached for some time afterwards, making things easier on subsequent users.
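To illustrate what this kind of caching does and doesn’t buy you, here’s a minimal sketch of a time-limited cache in Python. It isn’t how W3 Total Cache, Memcached or Varnish are implemented, and TTLCache, get_or_compute and the clock parameter are hypothetical names made up for this example. The point is that the expensive work still runs on the first request, and again once the entry expires.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._store = {}    # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no expensive work needed
        value = compute()    # miss or expired: do the expensive work again
        self._store[key] = (now + self.ttl, value)
        return value
```

With a 60 second TTL, a page rendered at second 0 is served from the cache until second 60, after which the next visitor pays the full cost again.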

This still wasn’t enough, however. The admin was still very slow, and often crashed. I actually wound up having to write a shell script that would check to see if MySQL was running and restart it if it wasn’t, and set up a cron job to run it every minute, just to ensure I wasn’t having to restart it myself. The issue was only really dealt with once we upped the specs on the VPS from 1GB RAM and 1 core to 3GB RAM and 2 cores, which should really have been overkill for something like WordPress.

As it turned out, the issue wasn’t exactly helped by the fact that someone had been making an unusually persistent attempt to brute-force wp-login.php. I was able to mitigate this by password-protecting it in the .htaccess file and adding some custom rules to fail2ban. But the fundamental problem remained that the resources used by WordPress to load a single page were grossly excessive.

Since then, we’ve continued to have some difficulties with it. There are some rather arcane criteria for calculating the shipping costs, and implementing them has been a real uphill struggle. We’ve also had to deal with breakages in the theme when updating WooCommerce, and other painful issues. It feels at times like the site will never be “done done”.

Now, I’ve had some issues with WordPress before, but this was by far the nastiest I’d ever seen, and it made me think very hard about when we should and should not consider WordPress as a solution. In hindsight, it would have been much easier to use Laravel to build the site from scratch - it would have made for a much leaner, more efficient site, updating the templates would have been a breeze, and implementing additional functionality would have been straightforward.

NB: I’m trying hard to make sure this is NOT one of those “WordPress sucks” blog posts. I’ll admit that I agree with many of the points from a lot of those, and I abandoned WordPress for my own site a long time ago in favour of a static site generator, but there are times when it is appropriate to use it. What I’m trying to do here is to help others avoid making the mistakes we did recently by giving some advice on when you should and should not use WordPress. Of course, your mileage may vary.

Why was WordPress inappropriate here?

With the benefit of hindsight, I can say that WordPress was definitely not the right solution in this case, and I will be advising against using it in similar circumstances. But why was it inappropriate?

  • Less flexible than rolling a custom solution - While the ecosystem of plugins and themes makes it possible to use WordPress for a lot of use cases outside the core functionality of the platform, those plugins and themes aren’t infinitely flexible. If you want to do something one way and the plugin you’re using doesn’t support that, you’re out of luck unless you can fork the plugin or write a new one.
  • Dependence on third-party plugins - While we were working on the site, WooCommerce made some changes that broke the theme we were using. We were using a child theme, but updating the parent theme alone didn’t fix it - we had to then apply some of the changes to the child theme as well, which was extremely fiddly. As a result, we’re now very wary about updating plugins and themes. Yet we don’t dare put it off too long, because in my experience attempts to break into WordPress are common, and if you fail to install an upgrade that fixes a vulnerability in good time, you can easily find yourself getting a phone call about a site having been hacked (as I did in December last year).
  • Poor performance - This is a big one, and I have therefore broken it down further:
    • Loading styling from the database - Many of the high end, customisable themes have large numbers of configuration options that can be used to style the site. The downside of these is that it creates additional queries to the database to fetch that data. Unless you have some form of caching in place, that data is loaded for every single request to the front end, generating a significant number of additional queries. You can mitigate this by rolling your own custom WordPress theme for the site, however.
    • Too many queries - My experience has been that as a general rule of thumb, it’s much quicker to make a smaller number of more complex queries to a database than to make a larger number of simple queries. If you build a custom web app, you will always know exactly what data you want to retrieve on a particular page and through careful use of joins, can retrieve exactly the data you need with as few queries as possible. Being a generic solution, WordPress doesn’t know exactly what data you need on any one page, and so may fetch the data using an excessive number of queries. It may also fetch data you don’t actually need.
    • Suboptimal database layout - The database schema for WordPress was originally created with a blog in mind, and may not always be optimal for your particular use case.
    • Caching is not a silver bullet - You can do a lot to improve performance by installing Memcached and Varnish, and configuring a caching plugin to work with them. However, this doesn’t solve the problem of the excessive number of queries; it only mitigates the effects somewhat. Not everything can be cached, and the expensive queries will still have to be run at some point. Caching only increases the time between the queries. Also, configuring Varnish in particular can be something of a black art, and it’s easy to miss something and find out later that some functionality or other hasn’t been working.
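The “too many queries” point can be made concrete with a small sketch. This uses Python’s built-in sqlite3 module and an invented two-table schema (products and prices), so it says nothing about the actual WordPress or WooCommerce schema; run, prices_one_by_one and prices_with_join are hypothetical helpers. It simply counts how many queries each approach issues to fetch the same data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE prices (product_id INTEGER, amount REAL);
    INSERT INTO products VALUES (1, 'Box'), (2, 'Mop'), (3, 'Hamper');
    INSERT INTO prices VALUES (1, 5.99), (2, 12.99), (3, 19.99);
""")

counter = {"queries": 0}


def run(sql, params=()):
    """Execute a query and count it, so the two approaches can be compared."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def prices_one_by_one():
    """One query for the product list, then one more per product (N + 1)."""
    result = {}
    for product_id, name in run("SELECT id, name FROM products ORDER BY id"):
        (amount,) = run(
            "SELECT amount FROM prices WHERE product_id = ?", (product_id,)
        )[0]
        result[name] = amount
    return result


def prices_with_join():
    """The same data fetched with a single JOIN."""
    rows = run(
        "SELECT p.name, pr.amount FROM products p "
        "JOIN prices pr ON pr.product_id = p.id ORDER BY p.id"
    )
    return dict(rows)
```

With three products, the first function issues four queries and the second issues one. The gap grows with every product you add to the page.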

WordPress has a lot of technical limitations and deficiencies from a programmer’s point of view. For all that, it works, it’s easy to set up, and there’s a wide variety of plugins and themes available, so it’s often an appropriate choice. While the performance is poorer than I would like, the harsh truth is that often it doesn’t matter - if your site isn’t serving a huge number of page requests, a few extra queries don’t actually make all that much difference (within reason, of course). My concern is that use of WordPress when it’s entirely inappropriate is widespread.

Is WordPress being overused?

Archer - WordPress? The Dane Cook of content management systems?

I suspect I’m running the risk of being branded a hipster for saying this (“Now it’s popular, you hate WordPress…”), but the fact that WordPress is widespread and popular does not mean that it’s the best solution for your project. Nor does the fact that it’s technically possible to use it for your project.

A few years ago, I built a now-defunct site and mobile app for a client that monitored web pages, or product prices on web pages, for changes, and notified the user when a change occurred. It was built using CodeIgniter 2, and had an integrated blog. At one point, the client was unhappy because it wasn’t built with WordPress, believing that this was the reason why few people were signing up. To use WordPress for this project would have involved building the additional functionality, including the API for the mobile app, as a plugin, which would have slowed down development considerably - in my experience it’s generally much harder to build something as a WordPress plugin than using an MVC framework due to the lack of separation of concerns, which makes the code base more confusing.

This is a good example of the alarming trend I’ve noticed in the last few years whereby a large number of people seem to be under the mistaken impression that WordPress is some kind of all-singing, all-dancing general purpose solution for building websites. I suspect that the reason for this may be that WordPress is commonplace enough that people outside of the web industry have often heard of it, and therefore they often ask for it since it’s what they’ve heard of, not knowing whether or not it’s actually appropriate for their needs. What isn’t always apparent to non-developers is that it’s often considerably easier for a developer to implement the core functionality of WordPress using a modern MVC framework than it is for them to implement the other functionality using WordPress, and as the functionality is being built with your exact use case in mind, the user interface is often more straightforward than the WordPress admin. Also, the WordPress privilege system can make it difficult for you to limit the user to just the functionality you want them to have, resulting in a situation where either you give the users a potentially dangerous level of access, or force them to contact you to make certain changes, making more work for you.

I’ve heard plenty of people say things like “WordPress is a framework” and “A competent developer can build anything with WordPress”. These claims are utter hogwash. A competent developer is smart enough to recognise that WordPress is not a one-size-fits-all solution and it’s not always appropriate to use it - you can easily spend more time trying to get it to do something off the beaten track than it would take to build that functionality from scratch. I think the way that Automattic are trying to promote WordPress as an application framework is a really bad idea - trying to use it for this is much more cumbersome than using a modern PHP framework like Laravel.

Even if you ignore the technical deficiencies of WordPress, it is too opinionated to be a good solution for use as a framework, and as such you’ll spend a lot of time trying to work around the existing implementations of existing functionality when they don’t quite meet your requirements.

Conclusion

For all its flaws, WordPress is very useful. It’s generally a good choice for blogs, brochure-style sites, and small e-commerce solutions where the client is not too fussy about the details of how it works. For virtually every other situation, I plan on looking elsewhere in future.

2nd August 2015 5:58 pm

Testing Django Views in Isolation

One thing you may hear said often about test-driven development is that as far as possible, you should test everything in isolation. However, it’s not always immediately clear how you actually go about doing this. In Django, it’s fairly easy to get your head around testing models in isolation because they’re single objects that you can just create, save, and then check their attributes. Forms are also quite easy to test, because you can just set the parameters with the appropriate values and check that the validation works as expected. With views, it’s much harder to imagine how you’d go about testing them in isolation, and often people just settle for writing higher-level functional tests instead. While functional tests are important, they’re also slower than unit tests, which makes it less likely they’ll be run often. So I thought I’d show you a quick and simple example of testing a Django view in isolation.

One of the little projects I’ve written in the past to help get my head around certain aspects of Django is a code-snippet sharing Django application which I named Snippetr. The index route of this application is a form for submitting a brand-new code snippet and I’ll show you how we would write a test for that.

Testing a GET request

Before now, you may well have used the Django test client to test views. That is fine for higher-level tests, but if you want to test a view in isolation, it’s no use because it emulates a real web server and all of the middleware and authentication, which we want to keep out of the way. Instead, we need to use RequestFactory:

from django.test import RequestFactory

RequestFactory actually implements a subset of the functionality of the Django test client, so while it will feel somewhat familiar, it won’t have all the same functionality. For instance, it doesn’t support middleware, so rather than logging in using the test client’s login() method, you instead attach a user directly to the request, as in this example:

factory = RequestFactory()
request = factory.get('/')
request.user = user

You have to specify the URL in the request, but you also have to explicitly pass the request through to the view you want to test, which can be a bit confusing. Let’s see it in context. First of all, we want to write a test for making a GET request:

class SnippetCreateViewTest(TestCase):
    """
    Test the snippet create view
    """
    def setUp(self):
        self.user = UserFactory()
        self.factory = RequestFactory()

    def test_get(self):
        """
        Test GET requests
        """
        request = self.factory.get(reverse('snippet_create'))
        request.user = self.user
        response = SnippetCreateView.as_view()(request)
        self.assertEqual(response.status_code, 200)
        self.assertEqual(response.context_data['user'], self.user)
        self.assertEqual(response.context_data['request'], request)

First of all, we define a setUp() method that creates a user and an instance of RequestFactory() for use in the test. Note that I’m using Factory Boy to define UserFactory in order to make it easier to work with. Also, if you have more than one view to test, you should create a base class containing the setUp() method that your view tests inherit from.

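Next, we have our test for making a GET request. Note that we’re using the reverse() function to get the route for the view named snippet_create. You’ll need to import this as follows if you’re not yet using it (from Django 1.10 onwards it can be imported from django.urls instead, and django.core.urlresolvers was removed in Django 2.0):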

from django.core.urlresolvers import reverse

We then attach our user object to the request manually, and fetch the response by passing the request to the view as follows:

    response = SnippetCreateView.as_view()(request)

Note that this is the syntax used for class-based views - we call the view’s as_view() method. For a function-based view, the syntax is a bit simpler:

    response = my_view(request)

We then test our response as usual. In this case, the view adds some additional context data, and we check that we can access that, as well as checking the status code.

Testing a POST request

Testing a POST request is a little more challenging in this case because submitting the form will create a new Snippet object and we don’t want to interact with the model layer at all if we can help it. We want to test the view in isolation, partly because it will be faster, and partly because a failing test then points at the view rather than at the model or the database. We can do this by mocking the Snippet model’s save() method.

To do so, we need to import two things from the mock library. If you’re using Python 3.3 or later, then mock is part of unittest as unittest.mock. Otherwise, it’s a separate library you need to install with pip. Here’s the import statement for those on Python 3.3 or later:

from unittest.mock import patch, MagicMock

And for those on earlier versions:

from mock import patch, MagicMock

Now, our test for the POST requests should look like this:

    @patch('snippets.models.Snippet.save', MagicMock(name="save"))
    def test_post(self):
        """
        Test post requests
        """
        # Create the request
        data = {
            'title': 'My snippet',
            'content': 'This is my snippet'
        }
        request = self.factory.post(reverse('snippet_create'), data)
        request.user = self.user

        # Get the response
        response = SnippetCreateView.as_view()(request)
        self.assertEqual(response.status_code, 302)

        # Check save was called
        self.assertTrue(Snippet.save.called)
        self.assertEqual(Snippet.save.call_count, 1)

Note first of all the following line:

    @patch('snippets.models.Snippet.save', MagicMock(name="save"))

Here we’re saying that in this test, when the save() method of the Snippet model is called, it should instead call a mocked version, which lacks the functionality and only registers that it has been called and a few details about it.

Next, we put together the data to be passed through and create a POST request for it. As before, we attach the user to the request. We then pass the request through in the same way as for the GET request. We also check that the response code was 302, meaning that the user would be redirected elsewhere after the form was submitted correctly.

Finally, we assert that Snippet.save.called is true. called is a Boolean value, representing whether the method was called or not. We also check the value of Snippet.save.call_count, which is a count of the number of times the method was called - here we check that it’s set to 1.
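If you’d like to see called and call_count in action without a Django project to hand, here’s a minimal sketch using only the standard library. The Snippet class and create_snippet() function below are plain Python stand-ins invented for this example, not the real Snippetr model or view. I’m also using patch.object rather than the string form of patch, purely because the class is defined in the same file.

```python
from unittest.mock import MagicMock, patch


class Snippet:
    """Stand-in for a model class; save() would normally hit the database."""

    def __init__(self, title):
        self.title = title

    def save(self):
        raise RuntimeError("would write to the database")


def create_snippet(title):
    """Stand-in for the view logic: build an object and save it."""
    snippet = Snippet(title)
    snippet.save()
    return snippet


def run_with_mocked_save():
    """Call create_snippet() with save() swapped for a mock, then inspect it."""
    with patch.object(Snippet, "save", MagicMock(name="save")):
        snippet = create_snippet("My snippet")
        return snippet.title, Snippet.save.called, Snippet.save.call_count
```

Inside the with block, save() never runs, so nothing is written anywhere, yet the mock still records that it was called once. As soon as the block exits, the original save() is restored.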

As you can see, while the request factory is a little harder than the Django test client to figure out, it’s not too difficult once you get the hang of it. By combining it with judicious use of mock, you can easily test your views in isolation, and without having to interact with the database or set up any middleware, these tests will be much faster than those using the Django test client.

1st August 2015 6:26 pm

Exploring the HStoreField in Django 1.8

One of the most interesting additions in Django 1.8 is the new Postgres-specific fields. I started using PostgreSQL in preference to MySQL for Django apps last year, and so I was interested in the additional functionality they offer.

By far the biggest deal out of all of these was the new HStoreField type. PostgreSQL’s hstore extension provides a key-value data type, and HStoreField allows you to use that type from Django. This is a really big deal because it allows you to store arbitrary key-value pairs in a single column (both keys and values are stored as strings) and query them. Previously, you could of course just store data as JSON in a text field, but that lacked the same ability to query it. This gives you some of the advantages of a NoSQL document database such as MongoDB in a relational database. For instance, you can store different products with different data about them, and crucially, query them by that data. Previously, the only way to add arbitrary product data and be able to query it was to have it in a separate table, and it was often cumbersome to join them when fetching multiple products.
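For comparison, here’s a rough sketch of the older separate-table approach, using Python’s built-in sqlite3 module so it runs anywhere. The product and product_attribute tables and the find_products() helper are made up for this example. Note how every extra attribute you filter on needs another join against the attribute table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_attribute (
        product_id INTEGER, attr_name TEXT, attr_value TEXT
    );
    INSERT INTO product VALUES (1, 'Box'), (2, 'Mop');
    INSERT INTO product_attribute VALUES
        (1, 'capacity', '1L'), (1, 'colour', 'blue'), (2, 'colour', 'red');
""")


def find_products(**wanted):
    """Return the names of products matching every attribute given.

    Each attribute adds one more join against product_attribute.
    """
    sql = "SELECT p.name FROM product p"
    params = []
    for i, (name, value) in enumerate(wanted.items()):
        sql += (
            f" JOIN product_attribute a{i}"
            f" ON a{i}.product_id = p.id"
            f" AND a{i}.attr_name = ? AND a{i}.attr_value = ?"
        )
        params += [name, value]
    sql += " ORDER BY p.id"
    return [row[0] for row in conn.execute(sql, params)]
```

Calling find_products(colour='blue', capacity='1L') builds a query with two joins just to find the box, and you would still need a further query to get all of its attributes back.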

Let’s see a working example. We might be building an online store where products can have all kinds of arbitrary data stored about them. One product might be a plastic box, and you’d need to list the capacity as an additional attribute. Another product might be a pair of shoes, which have no capacity, but do have a size. It might be difficult to model this otherwise, but HStoreField is perfect for this kind of data.

First, let’s set up our database. I’ll assume you already have PostgreSQL up and running via your package manager. First, we need to create our database:

$ createdb djangostore

Next, we need to create a new user for this database with superuser access:

$ createuser store -s -P

You’ll be prompted for a password - I’m assuming this will just be password here. Next, we need to connect to PostgreSQL using the psql utility:

$ psql djangostore -U store -W

You’ll be prompted for your new password. Next, run the following command:

# CREATE EXTENSION hstore;
# GRANT ALL PRIVILEGES ON DATABASE djangostore TO store;
# \q

The first command installs the HStore extension, and the second makes sure our new user has the privileges required on the new database. The final \q exits psql.

We’ve now created our database and a user to interact with it. Next, we’ll set up our Django install:

$ cd Projects
$ mkdir djangostore
$ cd djangostore
$ pyvenv venv
$ source venv/bin/activate
$ pip install Django psycopg2 ipdb
$ django-admin.py startproject djangostore .
$ python manage.py startapp store

I’m assuming here that you’re using Python 3.4. On Ubuntu, getting it working is a bit more involved.

Next, open up djangostore/settings.py and amend INSTALLED_APPS to include the new app and the PostgreSQL-specific functionality:

INSTALLED_APPS = (
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'django.contrib.postgres',
    'store',
)

You’ll also need to configure the database settings:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'djangostore',
        'USER': 'store',
        'PASSWORD': 'password',
        'HOST': 'localhost',
        'PORT': '',
    }
}

We need to create an empty migration to use HStoreField:

$ python manage.py makemigrations --empty store

This command should create the file store/migrations/0001_initial.py. Open this up and edit it to look like this:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations
from django.contrib.postgres.operations import HStoreExtension


class Migration(migrations.Migration):

    dependencies = [
    ]

    operations = [
        HStoreExtension(),
    ]

This will make sure the HStore extension is installed. Next, let’s run these migrations:

$ python manage.py migrate
Operations to perform:
Synchronize unmigrated apps: messages, staticfiles, postgres
Apply all migrations: sessions, store, admin, auth, contenttypes
Synchronizing apps without migrations:
Creating tables...
Running deferred SQL...
Installing custom SQL...
Running migrations:
Rendering model states... DONE
Applying contenttypes.0001_initial... OK
Applying auth.0001_initial... OK
Applying admin.0001_initial... OK
Applying contenttypes.0002_remove_content_type_name... OK
Applying auth.0002_alter_permission_name_max_length... OK
Applying auth.0003_alter_user_email_max_length... OK
Applying auth.0004_alter_user_username_opts... OK
Applying auth.0005_alter_user_last_login_null... OK
Applying auth.0006_require_contenttypes_0002... OK
Applying sessions.0001_initial... OK
Applying store.0001_initial... OK

Now, we’re ready to start creating our Product model. Open up store/models.py and amend it as follows:

from django.contrib.postgres.fields import HStoreField
from django.db import models


# Create your models here.
class Product(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
    name = models.CharField(max_length=200)
    description = models.TextField()
    price = models.FloatField()
    attributes = HStoreField()

    def __str__(self):
        return self.name

Note that HStoreField is not part of the standard group of model fields, and needs to be imported from the Postgres-specific fields module. Next, let’s create and run our migrations:

$ python manage.py makemigrations
$ python manage.py migrate

We should now have a Product model where the attributes field can be any arbitrary data we want. Note that we installed ipdb earlier - if you’re not familiar with it, this is an improved Python debugger, and also pulls in ipython, an improved Python shell, which Django will use if available.

Open up the Django shell:

$ python manage.py shell

Then, import the Product model:

from store.models import Product

Let’s create our first product - a plastic storage box:

box = Product()
box.name = 'Box'
box.description = 'A big box'
box.price = 5.99
box.attributes = { 'capacity': '1L', "colour": "blue"}
box.save()

If we take a look, we can see that the attributes can be returned as a Python dictionary:

In [12]: Product.objects.all()[0].attributes
Out[12]: {'capacity': '1L', 'colour': 'blue'}

We can easily retrieve single values:

In [15]: Product.objects.all()[0].attributes['capacity']
Out[15]: '1L'

Let’s add a second product - a mop:

mop = Product()
mop.name = 'Mop'
mop.description = 'A mop'
mop.price = 12.99
mop.attributes = { 'colour': "red" }
mop.save()

Now, we can filter out only the red items easily:

In [2]: Product.objects.filter(attributes__contains={'colour': 'red'})
Out[2]: [<Product: Mop>]

Here we search for items where the colour attribute is set to red, and we only get back the mop. Let’s do the same for blue items:

In [3]: Product.objects.filter(attributes__contains={'colour': 'blue'})
Out[3]: [<Product: Box>]

Here it returns the box. Let’s now search for an item with a capacity of 1L:

In [4]: Product.objects.filter(attributes__contains={'capacity': '1L'})
Out[4]: [<Product: Box>]

Only the box has the capacity attribute at all, and it’s the only one returned. Let’s see what happens when we search for an item with a capacity of 2L, which we know is not present:

In [5]: Product.objects.filter(attributes__contains={'capacity': '2L'})
Out[5]: []

No items returned, as expected. Let’s look for any item with the capacity attribute:

In [6]: Product.objects.filter(attributes__has_key='capacity')
Out[6]: [<Product: Box>]

Again, it only returns the box, as that’s the only one where that key exists. Note that all of this is tightly integrated with the existing API for the Django ORM. Let’s add a third product, a food hamper:

In [3]: hamper = Product()
In [4]: hamper.name = 'Hamper'
In [5]: hamper.description = 'A food hamper'
In [6]: hamper.price = 19.99
In [7]: hamper.attributes = {
...: 'contents': 'ham, cheese, coffee',
...: 'size': '90cmx60cm'
...: }
In [8]: hamper.save()

Next, let’s return only those items that have a contents attribute that contains cheese:

In [9]: Product.objects.filter(attributes__contents__contains='cheese')
Out[9]: [<Product: Hamper>]

As you can see, the HStoreField type allows for quite complex queries, while allowing you to set arbitrary values for an individual item. This overcomes one of the biggest issues with relational databases - the difficulty of storing arbitrary, per-item data. Previously, you had to work around it in some fashion, such as creating a table containing attributes for individual items which had to be joined on the product table. This was very cumbersome and difficult to use, especially when you wanted to work with more than one product. With this approach, it’s easy to filter products by multiple values in the HStore field, and you get back all of the attributes at once, as in this example:

In [13]: Product.objects.filter(attributes__capacity='1L', attributes__colour='blue')
Out[13]: [<Product: Box>]
In [14]: Product.objects.filter(attributes__capacity='1L', attributes__colour='blue')[0].attributes
Out[14]: {'capacity': '1L', 'colour': 'blue'}
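If it helps to have a mental model of the lookups used above, here’s a plain-Python sketch of what they mean, using ordinary dictionaries in place of the database. This is only a model of the semantics, not how Django or PostgreSQL implement them, and contains(), has_key(), value_contains() and matching() are hypothetical helpers written for this example.

```python
products = {
    "Box": {"capacity": "1L", "colour": "blue"},
    "Mop": {"colour": "red"},
    "Hamper": {"contents": "ham, cheese, coffee", "size": "90cmx60cm"},
}


def contains(attributes, subset):
    """Model of attributes__contains: every given pair is present."""
    return subset.items() <= attributes.items()


def has_key(attributes, key):
    """Model of attributes__has_key: the key exists, whatever its value."""
    return key in attributes


def value_contains(attributes, key, fragment):
    """Model of attributes__<key>__contains: the value includes the fragment."""
    return key in attributes and fragment in attributes[key]


def matching(predicate):
    """Return the names of the products whose attributes satisfy predicate."""
    return [name for name, attrs in products.items() if predicate(attrs)]
```

For instance, matching(lambda a: has_key(a, "capacity")) picks out only the box, mirroring the has_key query earlier.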

Similar functionality is coming in a future version of MySQL, so it wouldn’t be entirely surprising to see HStoreField become more generally available in Django in the near future. For now, this functionality is extremely useful and makes for a good reason to ditch MySQL in favour of PostgreSQL for your future Django apps.

21st July 2015 9:15 pm

New Laptop

For a while now it’s been obvious that I needed a new laptop. My main workhorse for a while has been a 2008 MacBook, but I’m not really a fan of Mac OS X and it was stuck on Snow Leopard, so it was somewhat behind the times. It was also painfully slow by modern standards - regenerating this site took a couple of minutes. I had two other reasonably modern laptops, but one was too big and cumbersome, while the other was a Dell Mini, which isn’t really fast enough for a developer. When I last bought a laptop, I wasn’t even a developer, so it was long past time I got a more suitable machine.

I therefore took the plunge and ordered a new Dell XPS 13 Developer Edition, which arrived today. It’s an absolutely beautiful machine, and it’s extremely light. It’s also a lot faster than any other machine I own. The screen is exceptionally sharp, and setting it up was nice and easy.

After an hour or so with this machine, I’m already really happy with it. We’ll have to see whether I still think so after a few months using it.

4th July 2015 1:01 pm

Handling Images As Base64 Strings With Django REST Framework

I’m currently working on a Phonegap app that involves taking pictures and uploading them via a REST API. I’ve done this before, and I found at that time that the best way to do so was to fetch the image as a base-64 encoded string and push that up, rather than the image file itself. However, the last time I did so, I was using Tastypie to build the API, and I’ve since switched over to Django REST Framework as my API toolkit of choice.

It didn’t take long to find this gist giving details of how to do so, but it didn’t work as is, partly because I was using Python 3, and partly because the from_native method was removed in Django REST Framework 3.0 in favour of to_internal_value. It was, however, straightforward to adapt it to work. Here’s my solution:

import base64, uuid
from django.core.files.base import ContentFile
from rest_framework import serializers


# Custom image field - handles base 64 encoded images
class Base64ImageField(serializers.ImageField):
    def to_internal_value(self, data):
        if isinstance(data, str) and data.startswith('data:image'):
            # base64 encoded image - decode
            format, imgstr = data.split(';base64,')  # format ~= data:image/X,
            ext = format.split('/')[-1]  # guess file extension
            id = uuid.uuid4()
            data = ContentFile(base64.b64decode(imgstr), name = id.urn[9:] + '.' + ext)
        return super(Base64ImageField, self).to_internal_value(data)

This solution will handle both base 64 encoded strings and image files. Then, just use this field as normal.
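The decoding step can be tried on its own, without Django or Django REST Framework installed. The following is a minimal sketch of just that part, where split_data_uri() is a hypothetical helper rather than anything from the gist or the library, and the eight-byte PNG signature stands in for a real image.

```python
import base64


def split_data_uri(data):
    """Split 'data:image/<ext>;base64,<payload>' into (extension, raw bytes)."""
    if not (isinstance(data, str) and data.startswith("data:image")):
        raise ValueError("not a base64 image data URI")
    header, payload = data.split(";base64,", 1)
    extension = header.split("/")[-1]  # e.g. 'png' from 'data:image/png'
    return extension, base64.b64decode(payload)


# Build a sample data URI of the kind the app would send up
raw = b"\x89PNG\r\n\x1a\n"  # the PNG file signature, standing in for an image
uri = "data:image/png;base64," + base64.b64encode(raw).decode("ascii")
```

Passing uri to split_data_uri() gives back the extension and the original bytes, which is what the field above wraps in a ContentFile.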

About me

I'm a web and mobile app developer based in Norfolk. My skillset includes Python, PHP and Javascript, and I have extensive experience working with CodeIgniter, Laravel, Django, Phonegap and Angular.js.