Matthew Daly's Blog

I'm a web developer in Norfolk. This is my blog...

18th March 2016 7:42 pm

My Experience Using PHP 7 in Production

In the last couple of weeks I’ve been working on a PHP web app. Nothing unusual there, except that this was the first time we’d used PHP 7 in production. We discussed the possibility a while back, and eventually decided that for certain projects we’d use PHP 7 rather than wait another year or so (or maybe longer) for a version of Debian stable to include it by default. I want to share what our experience of using it in production has been like.

Background

Until recently, we never really had a fixed stack at work - what we used was largely down to personal preference and experience. For many jobs, especially content-based sites, we generally used WordPress - it has its issues, but it does fine for a lot of work. For more complex websites, I tended to use CodeIgniter because I’d learned it during my previous job and knew it fairly well, but I was never terribly happy with it - it’s a bit too basic and simplistic, as well as somewhat behind the times, and I only really kept using it through inertia. For mobile app backends, I tended to use Django, partly for the admin interface, and partly because Django REST Framework makes it quick and easy to build a REST API in a way that wasn’t viable with CodeIgniter.

This state of affairs couldn’t really continue. I love Python and Django, but I was the only one at work who had ever used Python, so in the event I got hit by a bus there would have been no-one who could have taken over from me. As for CodeIgniter, it was clearly falling further and further behind the curve, and I was sick of it and looking to replace it. Ideally we needed a PHP framework, since both my colleague and I knew the language.

I’d also been playing around with Laravel on a few little projects, but I didn’t get the chance to use it for a new web app until autumn last year. Around the same time, we hired a third developer, who also had some experience using Laravel. In addition, the presence of Lumen meant that we could use that for smaller apps or services that were too small to use Laravel. We therefore decided to adopt Laravel as our default framework - in future we’d only use something else if there was a particular justification for it. I was rather sad to have to abandon Django for work, but pleased to have something more modern than CodeIgniter for PHP projects.

This also enabled us to standardize our new server builds. Over the last year or so I’ve been pushing to automate what we can of our server setup using Ansible. We now have two standard stacks that we plan to use for future projects. One is for WordPress sites and consists of:

  • Debian stable
  • Apache
  • MySQL
  • PHP 5.6
  • Memcached
  • Varnish

The other is for Laravel or Lumen web apps or APIs and consists of:

  • Debian stable
  • Nginx
  • PHP 7
  • PostgreSQL
  • Redis

It took some time to decide what we wanted to settle on. Indeed, we had a mobile app backend that went up around Christmas that we wrote with Laravel, but deployed to Apache with PHP 5.6, because PHP 7 wasn’t out yet when we first pushed it up. However, given that Laravel 5 already had good support for PHP 7, we decided we’d consider it for the next app. I tend to use PostgreSQL rather than MySQL these days because it has a lot of nifty features, such as JSON fields and full-text search, and using an ORM minimises the learning curve in switching. Redis is also much more versatile than Memcached. Both were therefore vital parts of our stack.
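Since these stacks are provisioned with Ansible, adding PHP 7 to a server is just another set of tasks in a playbook. Our actual playbooks are more involved, but a minimal sketch of the idea might look something like this - note that the repository and package names here are illustrative rather than lifted from our config, since at the time of writing PHP 7 isn’t in the Debian stable repositories and has to come from a third-party source such as Dotdeb:

- name: Add the Dotdeb repository for PHP 7 packages
  apt_repository: repo='deb http://packages.dotdeb.org jessie all' state=present

- name: Install PHP 7
  apt: name={{ item }} state=present update_cache=yes
  with_items:
    - php7.0-fpm
    - php7.0-cli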

Our first PHP 7 app

As it happened, we had a Laravel app in the pipeline that was ideal. In the summer of last year, we were hired to make an existing site responsive. It turned out not to be viable - the site was built with Zend Framework, which none of us had ever touched before, and the front end used a lot of custom widgets and fields tied together with RequireJS. The whole thing was rather unwieldy and extremely difficult to maintain and develop. In the end, we decided to tell the client it wasn’t worth developing further, and offered to rewrite the whole thing from scratch using Laravel and AngularJS, with Browserify to handle JavaScript modules. The basic idea was quite simple - it was just the implementation that was overly complex - and AngularJS made it possible to do the same kind of thing with a fraction of the code, so a rewrite in only a few weeks was perfectly viable.

I’d already built a simple prototype to demonstrate the viability of a from-scratch rewrite using Laravel and Angular, and once the client had agreed to the rewrite, we were able to work on this further. As the web app was going to be particularly useful on mobile devices, I wanted to ensure that the performance was as good as I could possibly make it. By the time we were looking at deploying it to a server, three months had passed since PHP 7 had been first released, and I figured that was long enough for the most serious issues to be resolved, and we could definitely do with the very significant speed boost we’d get from using PHP 7 for this app.

I use Jenkins to run my unit tests, and so I decided to try installing PHP 7 on the Jenkins server and using that to run the tests. The results were encouraging - nothing broke as a result of the switch. We therefore decided that when we deployed it, we’d try it with PHP 7, and if that failed, we’d switch back to PHP 5.6.

I opted to use FPM with Nginx rather than Apache and mod_php, since the web app was purely custom, we didn’t really need things like .htaccess, and while the amount of static content was limited, Nginx might well perform better for this use case. The results are fairly encouraging - the document for the home page is typically returned in under 40ms, with the uncached home page taking around 1.5s in total to load, despite having to pull in several external fonts. In its current state, the web app scores a solid 93% on YSlow, which I’m very happy with. I don’t know how much of that is down to using PHP 7, but choosing it was definitely a good call - I’ve had absolutely zero issues with it since we deployed.
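For anyone curious, the Nginx side of this is nothing exotic. Here’s a stripped-down sketch of the kind of server block involved - the domain, paths and socket name are illustrative rather than lifted from our config, and it assumes a Laravel-style public directory:

server {
    listen 80;
    server_name example.com;

    root /var/www/example/public;
    index index.php;

    location / {
        # Send anything that isn't a real file to the front controller
        try_files $uri $uri/ /index.php?$query_string;
    }

    location ~ \.php$ {
        # Hand PHP requests over to PHP 7 FPM via a Unix socket
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/run/php/php7.0-fpm.sock;
    }
}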

Summary

As always, you should bear in mind that your needs may not be the same as mine, and it could well be that you need something that PHP 7 doesn’t yet provide. However, I have had a very good experience with PHP 7 in production. I may have had to jump through a few more hoops to get it up and running, and there may be some level of risk associated with using PHP 7 when it’s only been available for three months, but it’s more than justified by the speed we get from our web app. Using a configuration management system like Ansible means that even if you do have to jump through some extra hoops, it’s relatively easy to automate that process so it’s not as much of an issue as you might think. For me, using PHP 7 with a Laravel app has worked as well as I could have possibly hoped.

26th January 2016 11:40 pm

Mocking External APIs in Python

It’s quite common to have to integrate an external API into your web app for some of your functionality. However, it’s a really bad idea to have requests be sent to the remote API when running your tests. At best, it means your tests may fail due to unexpected circumstances, such as a network outage. At worst, you could wind up making requests to paid services that will cost you money, or sending push notifications to clients. It’s therefore a good idea to mock these requests in some way, but it can be fiddly.

In this post I’ll show you several ways you can mock an external API so as to prevent requests being sent when running your test suite. I’m sure there are many others, but these have worked for me recently.

Mocking the client library

Nowadays many third-party services realise that providing developers with client libraries in a variety of languages is a good idea, so it’s quite common to find a library for interfacing with a third-party service. Under these circumstances, the library itself is usually already thoroughly tested, so there’s no point in you writing additional tests for that functionality. Instead, you can just mock the client library so that the request is never sent, and if you need a response, then you can specify one that will remain constant.

I recently had to integrate Stripe with a mobile app backend, and I used their client library. I needed to ensure that I got the right result back. In this case I only needed to use the Token object’s create() method. I therefore created a new MockToken class that inherited from Token, and overrode its create() method so that it only accepted one card number and returned a hard-coded response for it:

from stripe.resource import Token, convert_to_stripe_object
from stripe.error import CardError


class MockToken(Token):

    @classmethod
    def create(cls, api_key=None, idempotency_key=None,
               stripe_account=None, **params):
        if params['card']['number'] != '4242424242424242':
            raise CardError('Invalid card number', None, 402)

        response = {
            "card": {
                "address_city": None,
                "address_country": None,
                "address_line1": None,
                "address_line1_check": None,
                "address_line2": None,
                "address_state": None,
                "address_zip": None,
                "address_zip_check": None,
                "brand": "Visa",
                "country": "US",
                "cvc_check": "unchecked",
                "dynamic_last4": None,
                "exp_month": 12,
                "exp_year": 2017,
                "fingerprint": "49gS1c4YhLaGEQbj",
                "funding": "credit",
                "id": "card_17XXdZGzvyST06Z022EiG1zt",
                "last4": "4242",
                "metadata": {},
                "name": None,
                "object": "card",
                "tokenization_method": None
            },
            "client_ip": "192.168.1.1",
            "created": 1453817861,
            "id": "tok_42XXdZGzvyST06Z0LA6h5gJp",
            "livemode": False,
            "object": "token",
            "type": "card",
            "used": False
        }
        return convert_to_stripe_object(response, api_key, stripe_account)

Much of this was lifted straight from the source code for the library. I then wrote a test for the payment endpoint and patched the Token class:

class PaymentTest(TestCase):
    @mock.patch('stripe.Token', MockToken)
    def test_payments(self):
        data = {
            "number": '1111111111111111',
            "exp_month": 12,
            "exp_year": 2017,
            "cvc": '123'
        }
        response = self.client.post(reverse('payments'), data=data, format='json')
        self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)

This replaced stripe.Token with MockToken so that in this test, the response from the client library was always going to be the expected one.
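Incidentally, mock.patch can also be used as a context manager rather than a decorator, which is handy if you only want the patch to apply to part of a test. Here’s a sketch of the same test written that way:

class PaymentTest(TestCase):
    def test_payments(self):
        data = {
            "number": '1111111111111111',
            "exp_month": 12,
            "exp_year": 2017,
            "cvc": '123'
        }
        # The patch only applies inside the with block
        with mock.patch('stripe.Token', MockToken):
            response = self.client.post(reverse('payments'), data=data, format='json')
        self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)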

If the response doesn’t matter and all you need to do is be sure that the right method would have been called, this is easier. You can just mock the method in question using MagicMock and assert that it has been called afterwards, as in this example:

class ReminderTest(TestCase):
    def test_send_reminder(self):
        # Mock PushService.create_message()
        PushService.create_message = mock.MagicMock(name="create_message")
        # Call reminder task
        send_reminder()
        # Check user would have received a push notification
        PushService.create_message.assert_called_with([{'text': 'My push', 'conditions': ['UserID', 'EQ', 1]}])

Mocking lower-level requests

Sometimes, no client library is available, or it’s not worth using one as you only have to make one or two requests. Under these circumstances, there are ways to mock the actual request to the external API. If you’re using the requests module, then there’s a responses module that’s ideal for mocking the API request.

Suppose we have the following code:

import json

import requests


def send_request_to_api(data):
    # Put together the request
    params = {
        'auth': settings.AUTH_KEY,
        'data': data
    }
    response = requests.post(settings.API_URL, data={'params': json.dumps(params)})
    return response

Using responses we can mock the response from the server in our test:

class APITest(TestCase):
    @responses.activate
    def test_send_request(self):
        # Mock the API
        responses.add(responses.POST,
                      settings.API_URL,
                      status=200,
                      content_type="application/json",
                      body='{"item_id": "12345678"}')
        # Call function
        data = {
            "surname": "Smith",
            "location": "London"
        }
        send_request_to_api(data)
        # Check request went to correct URL
        assert responses.calls[0].request.url == settings.API_URL

Note the use of the @responses.activate decorator. We use responses.add() to set up each URL we want to be able to mock, and pass through details of the response we want to return. We then make the request, and check that it was made as expected.

You can find more details of the responses module in its documentation.

Summary

I’m pretty certain that there are other ways you can mock an external API in Python, but these ones have worked for me recently. If you use another method, please feel free to share it in the comments.

18th November 2015 7:52 pm

Learning More About React.js and Flux

Udemy have very kindly provided some vouchers for free access to their course, “Build Web Apps with ReactJS and Flux” for me to give away to subscribers. To redeem them, follow the link above and use the voucher code MatthewDalysBlog.

There are only 50 in total, and they’re on a first-come, first-served basis, so I suggest you redeem them sooner rather than later.

28th September 2015 8:00 pm

Building a Real-time Twitter Stream With Node.js, React.js and Redis

In the last year or so, React.js has taken the world of web development by storm. A major reason for this is that it makes it possible to build isomorphic web applications - web apps where the same code can run on the client and the server. Using React.js, you can create a template that will be executed on the server when the page first loads, and then the same template can be used to re-render the content when it’s updated, whether that’s via AJAX, WebSockets or another method entirely.

In this tutorial, I’ll show you how to build a simple Twitter streaming app using Node.js. I’m actually not the only person to have built this to demonstrate React.js, but this is my own particular take on this idea, since it’s such an obvious use case for React.

What is React.js?

A lot of people get rather confused over this issue. It’s not correct to compare React.js with frameworks like Angular.js or Backbone.js. It’s often described as being just the V in MVC - it represents only the view layer. If you’re familiar with Backbone.js, I think it’s reasonable to compare it to Backbone’s views, albeit with its own templating syntax. Unlike Angular and Backbone, it does not provide the following functionality:

  • Support for models
  • Any kind of helpers for AJAX requests
  • Routing

If you want any of this functionality, you need to look elsewhere. There are other libraries around that offer this kind of functionality, so if you want to use React as part of some kind of MVC structure, you can do so - they’re just not a part of the library itself.

React.js uses a so-called “virtual DOM” - rather than re-rendering the view from scratch when the state changes, it instead retains a virtual representation of the DOM in memory, updates that, then figures out what changes are required to update the existing DOM and applies them. This means it only needs to change what actually changes, making it faster than other client-side templating systems. Combined with the ability to render on the server side, React allows you to build high-performance apps that combine the initial speed and SEO advantages of conventional web apps with the responsiveness of single-page web apps.

To create components with React, it’s common to use an XML-like syntax called JSX. It’s not mandatory, but I highly recommend you do so as it’s much more intuitive than creating elements with Javascript.
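To give you an idea of what that looks like, here’s a trivial component defined with JSX - it isn’t part of the app we’ll be building, just an illustration:

var React = require('react');

var Hello = React.createClass({
    render: function () {
        // The JSX below compiles down to plain Javascript
        return <h1>Hello, {this.props.name}</h1>;
        // Without JSX, you'd have to write the equivalent by hand:
        // return React.createElement('h1', null, 'Hello, ', this.props.name);
    }
});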

Getting started

You’ll need a Twitter account, and you’ll need to create a new Twitter app and obtain the security credentials to let you access the Twitter Streaming API. You’ll also need to have Node.js installed (ideally using nvm) - at this time, however, you can’t use Node 4.0 because of issues with Redis. You will also need to install Redis and hiredis - if you’ve worked through my previous Redis tutorials you’ll have these already.

We’ll be using Gulp.js as our build system, and Bower to install some client-side packages, so they need to be installed globally:

$ npm install -g gulp bower

We’ll also be using Compass to help with our stylesheets:

$ sudo gem install compass

With that all done, let’s start work on our app. First, run the following command to create your package.json:

$ npm init

I’m assuming you’re well-acquainted enough with Node.js to know what this does, and can answer the questions without difficulty. I won’t cover writing tests in this tutorial, but set your test command to gulp test and you should be fine.

Next, we need to install our dependencies:

$ npm install --save babel compression express hbs hiredis lodash morgan react redis socket.io socket.io-client twitter
$ npm install --save-dev browserify chai gulp gulp-compass gulp-coveralls gulp-istanbul gulp-jshint gulp-mocha gulp-uglify jshint-stylish reactify request vinyl-buffer vinyl-source-stream

Planning our app

Now, it’s worth taking a few minutes to plan the architecture of our app. We want the app to listen to the Twitter Streaming API and filter for messages containing any arbitrary string - in this case we’ll be searching for “javascript”, but you can set it to anything you like. That means that part needs to be listening all the time, not just when someone is using the app. It also doesn’t fit neatly into the usual request-response cycle - if several people visit the site at once, we could end up with multiple connections fetching the same data, which is inefficient and could cause problems with duplicate tweets showing up.

Instead, we’ll have a separate worker.js file which runs constantly. This will listen for any matching messages on Twitter. When one appears, rather than returning it itself, it will publish it to a Redis channel, as well as persisting it. Then, the web app, which will be the index.js file, will be subscribed to the same channel, and will receive the tweet and push it to all current users using Socket.io.

This is a good example of a message queue, and it’s a common pattern. It allows you to create dedicated sections of your app for different tasks, and means that they will generally be more robust. In this case, if the worker goes down, users will still be able to see some tweets, and if the web server goes down, the tweets will still be persisted to Redis. In theory, this would also allow you to scale your app more easily by moving different tasks onto different servers, with several app servers interfacing with a single worker process. The only downside I can think of is that on a platform like Heroku you’d need a separate dyno for the worker process - though given Heroku’s recent pricing changes, something that needs to be listening all the time wouldn’t have been suitable for the free tier anyway.
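Boiled right down, the pattern is just this - an illustrative sketch rather than part of the app, whose real implementation follows below:

// publisher.js - pushes events onto a channel, regardless of who's listening
var redis = require('redis').createClient();
redis.publish('tweets', JSON.stringify({ text: 'Hello' }));

// subscriber.js - receives them, no matter which process sent them
var subscriber = require('redis').createClient();
subscriber.subscribe('tweets');
subscriber.on('message', function (channel, message) {
    console.log(channel, message);
});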

First let’s create our gulpfile.js:

var gulp = require('gulp');
var jshint = require('gulp-jshint');
var source = require('vinyl-source-stream');
var buffer = require('vinyl-buffer');
var browserify = require('browserify');
var reactify = require('reactify');
var mocha = require('gulp-mocha');
var istanbul = require('gulp-istanbul');
var coveralls = require('gulp-coveralls');
var compass = require('gulp-compass');
var uglify = require('gulp-uglify');

var paths = {
    scripts: ['components/*.jsx'],
    styles: ['src/sass/*.scss']
};

gulp.task('lint', function () {
    return gulp.src([
        'index.js',
        'components/*.js'
    ])
        .pipe(jshint())
        .pipe(jshint.reporter('jshint-stylish'));
});

gulp.task('compass', function () {
    gulp.src('src/sass/*.scss')
        .pipe(compass({
            css: 'static/css',
            sass: 'src/sass'
        }))
        .pipe(gulp.dest('static/css'));
});

gulp.task('test', function () {
    gulp.src('index.js')
        .pipe(istanbul())
        .pipe(istanbul.hookRequire())
        .on('finish', function () {
            gulp.src('test/test.js', {read: false})
                .pipe(mocha({ reporter: 'spec' }))
                .pipe(istanbul.writeReports({
                    reporters: [
                        'lcovonly',
                        'cobertura',
                        'html'
                    ]
                }))
                .pipe(istanbul.enforceThresholds({ thresholds: { global: 90 } }))
                .once('error', function () {
                    process.exit(0);
                })
                .once('end', function () {
                    process.exit(0);
                });
        });
});

gulp.task('coveralls', function () {
    gulp.src('coverage/lcov.info')
        .pipe(coveralls());
});

gulp.task('react', function () {
    return browserify({ entries: ['components/index.jsx'], debug: true })
        .transform(reactify)
        .bundle()
        .pipe(source('bundle.js'))
        .pipe(buffer())
        .pipe(uglify())
        .pipe(gulp.dest('static/jsx/'));
});

gulp.task('default', function () {
    gulp.watch(paths.scripts, ['react']);
    gulp.watch(paths.styles, ['compass']);
});

I’ve added tasks for the tests and JSHint if you choose to implement them, but the only ones I’ve actually used are the compass and react tasks. The compass task compiles our Sass files into CSS, while the react task uses Browserify to take our React components and the various modules installed with NPM and build them for use in the browser, as well as minifying them. Note that we installed React and lodash with NPM - thanks to Browserify, we’ll be able to use them both in the browser and on the server.

Next, let’s create our worker.js file:

/*jslint node: true */
'use strict';

// Get dependencies
var Twitter = require('twitter');

// Set up Twitter client
var client = new Twitter({
    consumer_key: process.env.TWITTER_CONSUMER_KEY,
    consumer_secret: process.env.TWITTER_CONSUMER_SECRET,
    access_token_key: process.env.TWITTER_ACCESS_TOKEN_KEY,
    access_token_secret: process.env.TWITTER_ACCESS_TOKEN_SECRET
});

// Set up connection to Redis
var redis;
if (process.env.REDIS_URL) {
    redis = require('redis').createClient(process.env.REDIS_URL);
} else {
    redis = require('redis').createClient();
}

client.stream('statuses/filter', {track: 'javascript', lang: 'en'}, function (stream) {
    stream.on('data', function (tweet) {
        // Log it to console
        console.log(tweet);
        // Publish it
        redis.publish('tweets', JSON.stringify(tweet));
        // Persist it to a Redis list
        redis.rpush('stream:tweets', JSON.stringify(tweet));
    });

    // Handle errors
    stream.on('error', function (error) {
        console.log(error);
    });
});

Most of this file should be fairly straightforward. We set up our connection to Twitter (you’ll need to set the various environment variables listed here using the appropriate method for your operating system), and a connection to Redis.

We then stream the Twitter statuses that match our filter. When we receive a tweet, we log it to the console (feel free to comment this out in production if desired), publish it to a Redis channel called tweets, and push it to the end of a Redis list called stream:tweets. When an error occurs, we output it to the console.

Let’s use Bootstrap to style the app. Create the following .bowerrc file:

{
    "directory": "static/bower_components"
}

Then run bower init to create your bower.json file, and install Bootstrap with bower install --save sass-bootstrap.

With that done, create the file src/sass/style.scss and enter the following:

@import "compass/css3/user-interface";
@import "compass/css3";
@import "../../static/bower_components/sass-bootstrap/lib/bootstrap.scss";

This includes some dependencies from Compass, as well as Bootstrap. We won’t be using any of the Javascript features of Bootstrap, so we don’t need to worry too much about that.

Next, we need to create our view files. As React will be used to render the main part of the page, these will be very basic, with just the header, footer, and a section where the content can be rendered. First, create views/index.hbs:

{{> header }}
<div class="container">
    <div class="row">
        <div class="col-md-12">
            <div id='view'>{{{ markup }}}</div>
        </div>
    </div>
</div>
<script id="initial-state" type="application/json">{{{state}}}</script>
{{> footer }}

As promised, this is a very basic layout. Note the markup variable, which is where the markup generated by React will be inserted when rendered on the server, and the state variable, which will contain the JSON representation of the data used to generate that markup. By passing that data through, you ensure that the instance of React on the client has access to the same raw data that was passed to the view on the server side, so that when the data needs to be re-rendered, it can be done correctly.

We’ll also define partials for the header and footer. The header should be in views/partials/header.hbs:

<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
<!--[if IE 7]> <html class="no-js lt-ie9 lt-ie8"> <![endif]-->
<!--[if IE 8]> <html class="no-js lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js"> <!--<![endif]-->
<head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <title>Tweet Stream</title>
    <meta name="description" content="">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <!-- Place favicon.ico and apple-touch-icon.png in the root directory -->
    <link rel="stylesheet" type="text/css" href="/css/style.css">
</head>
<body>
    <!--[if lt IE 7]>
        <p class="browsehappy">You are using an <strong>outdated</strong> browser. Please <a href="http://browsehappy.com/">upgrade your browser</a> to improve your experience.</p>
    <![endif]-->
    <nav class="navbar navbar-inverse navbar-static-top" role="navigation">
        <div class="container-fluid">
            <div class="navbar-header">
                <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#header-nav">
                    <span class="icon-bar"></span>
                    <span class="icon-bar"></span>
                    <span class="icon-bar"></span>
                </button>
                <a class="navbar-brand" href="/">Tweet Stream</a>
                <div class="collapse navbar-collapse navbar-right" id="header-nav">
                </div>
            </div>
        </div>
    </nav>
The footer should be in views/partials/footer.hbs:

<script src="/jsx/bundle.js"></script>
</body>
</html>

Note that we load the Javascript file /jsx/bundle.js - this is the output from the command gulp react.

Creating the back end

The next step is to implement the back end of the website. Add the following code as index.js:

/*jslint node: true */
'use strict';

require('babel/register');

// Get dependencies
var express = require('express');
var app = express();
var compression = require('compression');
var port = process.env.PORT || 5000;
var base_url = process.env.BASE_URL || 'http://localhost:5000';
var hbs = require('hbs');
var morgan = require('morgan');
var React = require('react');
var Tweets = React.createFactory(require('./components/tweets.jsx'));

// Set up connection to Redis
var redis, subscribe;
if (process.env.REDIS_URL) {
    redis = require('redis').createClient(process.env.REDIS_URL);
    subscribe = require('redis').createClient(process.env.REDIS_URL);
} else {
    redis = require('redis').createClient();
    subscribe = require('redis').createClient();
}

// Set up templating
app.set('views', __dirname + '/views');
app.set('view engine', 'hbs');
app.engine('hbs', require('hbs').__express);

// Register partials
hbs.registerPartials(__dirname + '/views/partials');

// Set up logging
app.use(morgan('combined'));

// Compress responses
app.use(compression());

// Set URL
app.set('base_url', base_url);

// Serve static files
app.use(express.static(__dirname + '/static'));

// Render main view
app.get('/', function (req, res) {
    // Get tweets
    redis.lrange('stream:tweets', 0, -1, function (err, tweets) {
        if (err) {
            console.log(err);
        } else {
            // Get tweets
            var tweet_list = [];
            tweets.forEach(function (tweet, i) {
                tweet_list.push(JSON.parse(tweet));
            });

            // Render page
            var markup = React.renderToString(Tweets({ data: tweet_list.reverse() }));
            res.render('index', {
                markup: markup,
                state: JSON.stringify(tweet_list)
            });
        }
    });
});

// Listen
var io = require('socket.io')().listen(app.listen(port));
console.log("Listening on port " + port);

// Handle connections
io.sockets.on('connection', function (socket) {
    // Subscribe to the Redis channel
    subscribe.subscribe('tweets');

    // Handle receiving messages
    var callback = function (channel, data) {
        socket.emit('message', data);
    };
    subscribe.on('message', callback);

    // Handle disconnect
    socket.on('disconnect', function () {
        subscribe.removeListener('message', callback);
    });
});

Let’s go through this bit by bit:

/*jslint node: true */
'use strict';
require('babel/register');

Here we’re using Babel, a library that allows you to use new Javascript features even where the interpreter doesn’t support them. It also includes support for JSX, allowing us to require JSX files in the same way we would require Javascript files.

// Get dependencies
var express = require('express');
var app = express();
var compression = require('compression');
var port = process.env.PORT || 5000;
var base_url = process.env.BASE_URL || 'http://localhost:5000';
var hbs = require('hbs');
var morgan = require('morgan');
var React = require('react');
var Tweets = React.createFactory(require('./components/tweets.jsx'));

Here we include our dependencies. Most of this will be familiar if you’ve used Express before, but we also use React to create a factory for a React component called Tweets.

// Set up connection to Redis
var redis, subscribe;
if (process.env.REDIS_URL) {
    redis = require('redis').createClient(process.env.REDIS_URL);
    subscribe = require('redis').createClient(process.env.REDIS_URL);
} else {
    redis = require('redis').createClient();
    subscribe = require('redis').createClient();
}

// Set up templating
app.set('views', __dirname + '/views');
app.set('view engine', 'hbs');
app.engine('hbs', require('hbs').__express);

// Register partials
hbs.registerPartials(__dirname + '/views/partials');

// Set up logging
app.use(morgan('combined'));

// Compress responses
app.use(compression());

// Set URL
app.set('base_url', base_url);

// Serve static files
app.use(express.static(__dirname + '/static'));

This section sets up the various dependencies of our app. We set up two connections to Redis - one for handling subscriptions, the other for reading from Redis in order to populate the view.

We also set up our views, logging, compression of the HTTP response, a base URL, and serving static files.

// Render main view
app.get('/', function (req, res) {
    // Get tweets
    redis.lrange('stream:tweets', 0, -1, function (err, tweets) {
        if (err) {
            console.log(err);
        } else {
            // Get tweets
            var tweet_list = [];
            tweets.forEach(function (tweet, i) {
                tweet_list.push(JSON.parse(tweet));
            });

            // Render page
            var markup = React.renderToString(Tweets({ data: tweet_list.reverse() }));
            res.render('index', {
                markup: markup,
                state: JSON.stringify(tweet_list)
            });
        }
    });
});

Our app only has a single view. When the root is loaded, we first of all fetch all of the tweets stored in the stream:tweets list. We then convert them into an array of objects.

Next, we render the Tweets component to a string, passing through our list of tweets, and store the resulting markup. We then pass through this markup and the string representation of the list of tweets to the template.

// Listen
var io = require('socket.io')().listen(app.listen(port));
console.log("Listening on port " + port);

// Handle connections
io.sockets.on('connection', function (socket) {
    // Subscribe to the Redis channel
    subscribe.subscribe('tweets');

    // Handle receiving messages
    var callback = function (channel, data) {
        socket.emit('message', data);
    };
    subscribe.on('message', callback);

    // Handle disconnect
    socket.on('disconnect', function () {
        subscribe.removeListener('message', callback);
    });
});

Finally, we set up Socket.io. On a connection, we subscribe to the Redis channel tweets. When we receive a tweet from Redis, we emit that tweet so that it can be rendered on the client side. We also handle disconnections by removing our Redis subscription.

Creating our React components

Now it’s time to create our first React component. We’ll create a folder called components to hold all of our component files. Our first file is components/index.jsx:

var React = require('react');
var Tweets = require('./tweets.jsx');

var initialState = JSON.parse(document.getElementById('initial-state').innerHTML);

React.render(
    <Tweets data={initialState} />,
    document.getElementById('view')
);

First of all, we include React and the same Tweets component we require on the server side (note that we need to specify the .jsx extension). Then we fetch the initial state from the script tag we created earlier. Finally we render the Tweets components, passing through the initial state, and specify that it should be inserted into the element with an id of view. Note that we store the initial state in data - inside the component, this can be accessed as this.props.data.

This particular component is only ever used on the client side - when we render on the server side, we don’t need any of this functionality since we insert the markup into the view element anyway, and we don’t need to specify the initial data in the same way.

Next, we define the Tweets component in components/tweets.jsx:

var React = require('react');
var io = require('socket.io-client');
var TweetList = require('./tweetlist.jsx');
var _ = require('lodash');

var Tweets = React.createClass({
    componentDidMount: function () {
        // Get reference to this item
        var that = this;

        // Set up the connection
        var socket = io.connect(window.location.href);

        // Handle incoming messages
        socket.on('message', function (data) {
            // Insert the message
            var tweets = that.props.data;
            tweets.push(JSON.parse(data));
            tweets = _.sortBy(tweets, function (item) {
                return item.created_at;
            }).reverse();
            that.setProps({data: tweets});
        });
    },

    getInitialState: function () {
        return {data: this.props.data};
    },

    render: function () {
        return (
            <div>
                <h1>Tweets</h1>
                <TweetList data={this.props.data} />
            </div>
        );
    }
});

module.exports = Tweets;

Let’s work our way through each section in turn:

var React = require('react');
var io = require('socket.io-client');
var TweetList = require('./tweetlist.jsx');
var _ = require('lodash');

Here we include React and the Socket.io client, as well as Lodash and our TweetList component. With React.js, it’s recommended that you break your interface up into small, single-purpose components - here Tweets is a wrapper for the tweets that includes a heading, TweetList will be a list of tweets, and TweetItem will be an individual tweet.

var Tweets = React.createClass({
    componentDidMount: function () {
        // Get reference to this item
        var that = this;

        // Set up the connection
        var socket = io.connect(window.location.href);

        // Handle incoming messages
        socket.on('message', function (data) {
            // Insert the message
            var tweets = that.props.data;
            tweets.push(JSON.parse(data));
            tweets = _.sortBy(tweets, function (item) {
                return item.created_at;
            }).reverse();
            that.setProps({data: tweets});
        });
    },

Note the use of the componentDidMount method - this fires when a component has been rendered on the client side for the first time. You can therefore use it to set up events. Here, we’re setting up a callback so that when a new tweet is received, we get the existing tweets (stored in this.props.data, although we copy this to that so it works inside the callback), push the tweet to this list, sort it by the time created, and set this.props.data to the new value. This will result in the tweets being re-rendered.

    getInitialState: function () {
        return {data: this.props.data};
    },

Here we set the initial state of the component - it sets the value of this.state to the object passed through. In this case, we pass through an object with the attribute data defined as the value of this.props.data, meaning that this.state.data is the same as this.props.data.

    render: function () {
        return (
            <div>
                <h1>Tweets</h1>
                <TweetList data={this.props.data} />
            </div>
        );
    }
});

module.exports = Tweets;

Here we define our render function. This can be thought of as our template. Note that we include TweetList inside our template and pass through the data. Afterwards, we export Tweets so it can be used elsewhere.

Next, let’s create components/tweetlist.jsx:

var React = require('react');
var TweetItem = require('./tweetitem.jsx');

var TweetList = React.createClass({
    render: function () {
        var that = this;
        var tweetNodes = this.props.data.map(function (item, index) {
            return (
                <TweetItem key={index} text={item.text}></TweetItem>
            );
        });
        return (
            <ul className="tweets list-group">
                {tweetNodes}
            </ul>
        );
    }
});

module.exports = TweetList;

This component is much simpler - it only has a render method. First, we get our individual tweets and for each one define a TweetItem component. Then we create an unordered list and insert the tweet items into it. We then export it as TweetList.

Our final component is the TweetItem component. Create the following file at components/tweetitem.jsx:

var React = require('react');

var TweetItem = React.createClass({
    render: function () {
        return (
            <li className="list-group-item">{this.props.text}</li>
        );
    }
});

module.exports = TweetItem;

This component is quite simple. It’s just a single list item with the text set to the value of the tweet’s text attribute.

That should be all of our components done. Time to compile our Sass and run Browserify:

$ gulp compass
$ gulp react

Now, if you make sure you have set the appropriate environment variables, and then run node worker.js in one terminal, and node index.js in another, and visit http://localhost:5000/, you should see your Twitter stream in all its glory! You can also try it with Javascript disabled, or in a text-mode browser such as Lynx, to demonstrate that it still renders the page without having to do anything on the client side - you’re only missing the constant updates.

Wrapping up

I hope this gives you some idea of how you can easily use React.js on both the client and server side to make web apps that are fast and search-engine friendly while also being easy to update dynamically. You can find the source code on GitHub.

Hopefully I’ll be able to publish some later tutorials that build on this to show you how to build more substantial web apps with React.

19th September 2015 7:42 pm

A Quick and Easy Varnish Primer

As I mentioned in an earlier post, I recently had the occasion to use Varnish to improve the performance of a website that otherwise would have been unreliable and unusably slow due to WordPress making an excessive number of queries. The difference it made was nothing short of staggering, and I’m not exaggerating when I say it saved the day. I now use Ansible for provisioning new WordPress sites, and Varnish is now a standard part of my WordPress site setup playbook.

However, Varnish can be quite fiddly to configure, and it was something of a baptism of fire for me to learn how to configure it appropriately for this use case. I did make a few mistakes that caused problems down the line, so I thought I’d share the details of how I got it working for that particular site.

What is Varnish?

From the website:

Varnish Cache is a web application accelerator also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents. Varnish Cache is really, really fast. It typically speeds up delivery with a factor of 300 - 1000x, depending on your architecture.

In other words, you run it on the usual HTTP or HTTPS port, move your usual web server to a different port, and configure it, and it will cache web pages so they can be served more quickly to subsequent visitors.

Be warned - Varnish is not something where you can generally stick with the default settings. The default behaviour does make a lot of sense, but in practice almost no-one will be able to get away with leaving the configuration unchanged.

Installing Varnish

If you’re using Debian or a derivative such as Ubuntu, Varnish is available via apt-get:

$ sudo apt-get install varnish

You may also want to install the documentation:

$ sudo apt-get install varnish-doc

If you’re using Apache I’d also recommend installing libapache2-mod-rpaf and enabling it with sudo a2enmod rpaf - without this, Apache will log all incoming requests as coming from the same server.

I’m assuming you already have a normal web server installed. I’ll assume you’re using Apache, but it shouldn’t be hard to adapt these instructions to work with Nginx. I’m also assuming that the site you want to use Varnish for is a WordPress site with WooCommerce and W3 Total Cache installed. However, this is only for example purposes. If you want to use Varnish for a different web app, you’ll need to plan your caching strategy around that web app yourself.

Please also note that this is using Varnish 4.0, which is the version available with Debian Jessie. If you’re using an older operating system, you may have Varnish 3.0 in the repositories - be warned, the configuration language changed in Varnish 4.0, so the examples here will not work with older versions of Varnish.

By default, Varnish runs on port 6081, which is fine for testing it out, but once you want to go live it’s not what you want. When it’s time to go live, you’ll need to open up /etc/default/varnish and edit the value of DAEMON_OPTS to something like this:

DAEMON_OPTS="-a :80 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-S /etc/varnish/secret \
-s malloc,256m"

Note that the -a flag represents the port Varnish is running on.

If you’re using an operating system that uses systemd, such as Debian Jessie, this alone won’t be sufficient. Create a new file at /etc/systemd/system/varnish.service and enter the following:

[Unit]
Description=Varnish HTTP accelerator
[Service]
Type=forking
LimitNOFILE=131072
LimitMEMLOCK=82000
ExecStartPre=/usr/sbin/varnishd -C -f /etc/varnish/default.vcl
ExecStart=/usr/sbin/varnishd -a :80 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m
ExecReload=/usr/share/varnish/reload-vcl
[Install]
WantedBy=multi-user.target

Next, we need to move our web server to a different port. We’ll use port 8080. Replace the contents of /etc/apache2/ports.conf with this:

# If you just change the port or add more ports here, you will likely also
# have to change the VirtualHost statement in
# /etc/apache2/sites-enabled/000-default
# This is also true if you have upgraded from before 2.2.9-3 (i.e. from
# Debian etch). See /usr/share/doc/apache2.2-common/NEWS.Debian.gz and
# README.Debian.gz

NameVirtualHost *:8080
Listen 8080

<IfModule mod_ssl.c>
    # If you add NameVirtualHost *:443 here, you will also have to change
    # the VirtualHost statement in /etc/apache2/sites-available/default-ssl
    # to <VirtualHost *:443>
    # Server Name Indication for SSL named virtual hosts is currently not
    # supported by MSIE on Windows XP.
    Listen 443
</IfModule>

<IfModule mod_gnutls.c>
    Listen 443
</IfModule>

You’ll also need to change the ports for the individual site files under /etc/apache2/sites-available, as in this example:

<VirtualHost *:8080>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www

    <Directory />
        Options FollowSymLinks
        AllowOverride All
    </Directory>
    <Directory /var/www/>
        Options FollowSymLinks MultiViews
        AllowOverride All
        Order allow,deny
        allow from all
    </Directory>

    ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    <Directory "/usr/lib/cgi-bin">
        AllowOverride None
        Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
        Order allow,deny
        Allow from all
    </Directory>

    ErrorLog ${APACHE_LOG_DIR}/error.log

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>

Writing our VCL file

Next, we come to our Varnish configuration proper, which resides at /etc/varnish/default.vcl. VCL stands for Varnish Configuration Language, and it has a syntax somewhat reminiscent of C.

The default behaviour for Varnish is as follows:

  • It does not cache requests that contain cookie or authorization headers
  • It does not cache requests which the backend HTTP server indicates should not be cached
  • It will only cache GET and HEAD requests

This behaviour is unlikely to meet your needs. We’ll therefore work through the Varnish config file I wrote for this WordPress site in the hope that it will teach you enough to adapt it to your own needs.

vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl purge {
    "127.0.0.1";
    "localhost";
}

sub vcl_recv {
    # Never cache PUT, PATCH, DELETE or POST requests
    if (req.method == "PUT" || req.method == "PATCH" || req.method == "DELETE" || req.method == "POST") {
        return (pass);
    }

    # Never cache cart, account, checkout or addons
    if (req.url ~ "^/(cart|my-account|checkout|addons)") {
        return (pass);
    }

    # Never cache adding to cart
    if (req.url ~ "\?add-to-cart=") {
        return (pass);
    }

    # Never cache admin or login
    if (req.url ~ "^/wp-(admin|login|cron)") {
        return (pass);
    }

    # Never cache WooCommerce API
    if (req.url ~ "wc-api") {
        return (pass);
    }

    # Remove has_js and CloudFlare/Google Analytics __* cookies and statcounter is_unique
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_[_a-z]+|has_js|is_unique)=[^;]*", "");

    # Remove a ";" prefix, if present.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");

    # Remove the wp-settings-1 cookie
    set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-1=[^;]+(; )?", "");

    # Remove the wp-settings-time-1 cookie
    set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-time-1=[^;]+(; )?", "");

    # Remove the wp test cookie
    set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=[^;]+(; )?", "");

    # Static content unique to the theme can be cached (so no user uploaded images)
    # The reason I don't take the wp-content/uploads is because of cache size on bigger blogs
    # that would fill up with all those files getting pushed into cache
    if (req.url ~ "wp-content/themes/" && req.url ~ "\.(css|js|png|gif|jp(e)?g)") {
        unset req.http.cookie;
    }

    # Even if no cookies are present, I don't want my "uploads" to be cached due to their potential size
    if (req.url ~ "/wp-content/uploads/") {
        return (pass);
    }

    # any pages with captchas need to be excluded
    if (req.url ~ "^/contact/") {
        return (pass);
    }

    # Check the cookies for wordpress-specific items
    if (req.http.Cookie ~ "wordpress_" || req.http.Cookie ~ "comment_") {
        # A wordpress specific cookie has been set
        return (pass);
    }

    # allow PURGE from localhost
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed."));
        }
        return (purge);
    }

    # Force lookup if the request is a no-cache request from the client
    if (req.http.Cache-Control ~ "no-cache") {
        return (pass);
    }

    # Try a cache-lookup
    return (hash);
}

sub vcl_backend_response {
    set beresp.grace = 5m;
}

Let’s take a closer look at the first part of the config:

vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

Here we define that we’re using version 4.0 of VCL, and that the host to use as a back end is port 8080 on the same server. If your normal HTTP server is running on a different port, you will need to set it here. Also, note that you can use a different host as the backend.

acl purge {
    "127.0.0.1";
    "localhost";
}

We also set which hosts can trigger a purge of the cache, namely localhost and 127.0.0.1. The web app hosted on the server can then make an HTTP PURGE request to a given path, which will clear that path from the cache. In our case, W3 Total Cache supports this - if it’s a custom web app, you’ll need to implement this functionality yourself to clear the cache when new content is added.
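A purge is just an HTTP request using the PURGE method, so once this is in place you can test it from the server itself with something like the following - the path is illustrative:

$ curl -X PURGE http://localhost/some-cached-page/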

Next, we start the vcl_recv subroutine. This is where we define our rules for deciding whether or not to serve content from the cache. Let’s look at our first rule:

sub vcl_recv {
    # Never cache PUT, PATCH, DELETE or POST requests
    if (req.method == "PUT" || req.method == "PATCH" || req.method == "DELETE" || req.method == "POST") {
        return (pass);
    }

Here, we declare that we should never cache any PUT, PATCH, DELETE or POST requests, on the basis that these change the state of the application. This ensures that things like contact forms will work as expected.

Note that we’re getting the value of req.method to determine the HTTP verb used. The req object has many other properties we’ll see being used.

# Never cache cart, account, checkout or addons
if (req.url ~ "^/(cart|my-account|checkout|addons)") {
    return (pass);
}

# Never cache adding to cart
if (req.url ~ "\?add-to-cart=") {
    return (pass);
}

# Never cache admin or login
if (req.url ~ "^/wp-(admin|login|cron)") {
    return (pass);
}

# Never cache WooCommerce API
if (req.url ~ "wc-api") {
    return (pass);
}

Next, we define a series of regular expressions, and if the URL (represented by req.url) matches that regex, then the request is passed straight through to Apache without Varnish getting involved. In this case, we never want to cache the following sections:

  • The shopping cart, checkout, addons page or account page
  • The Add to cart button
  • The WordPress admin and login screen, and cron requests
  • The WooCommerce API

You’ll need to consider which parts of your site must always serve the latest content and which don’t need to be fully up to date. Typically, admin areas and anything interactive must not be cached, while the front page is usually fine.

# Remove has_js and CloudFlare/Google Analytics __* cookies and statcounter is_unique
set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_[_a-z]+|has_js|is_unique)=[^;]*", "");

# Remove a ";" prefix, if present.
set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");

# Remove the wp-settings-1 cookie
set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-1=[^;]+(; )?", "");

# Remove the wp-settings-time-1 cookie
set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-time-1=[^;]+(; )?", "");

# Remove the wp test cookie
set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=[^;]+(; )?", "");

Cookies, even ones set on the client side such as those for Google Analytics, can prevent content from being cached. To prevent this, you need to configure Varnish to discard these cookies before passing them on to Apache. In this case, we want to exclude Google Analytics and various WordPress cookies.

# Static content unique to the theme can be cached (so no user uploaded images)
if (req.url ~ "wp-content/themes/" && req.url ~ "\.(css|js|png|gif|jp(e)?g)") {
    unset req.http.cookie;
}

Here we allow static content that’s part of the site theme to be cached since that doesn’t change often, so we unset the cookies for that request.

# Even if no cookies are present, I don't want my "uploads" to be cached due to their potential size
if (req.url ~ "/wp-content/uploads/") {
    return (pass);
}

Here we prevent any user-uploaded content from being cached, since on a larger site it could easily fill up the cache.

# any pages with captchas need to be excluded
if (req.url ~ "^/contact/") {
    return (pass);
}

Captchas must obviously never be cached since that will break them. In this case, we assume that the contact form has a captcha, so it gets excluded from the cache.

# Check the cookies for wordpress-specific items
if (req.http.Cookie ~ "wordpress_" || req.http.Cookie ~ "comment_") {
    # A wordpress specific cookie has been set
    return (pass);
}

Here we check for remaining WordPress-specific cookies. These would indicate that a user is signed in, in which case we may want to serve them all the latest content rather than displaying content from the cache.

# allow PURGE from localhost
if (req.method == "PURGE") {
    if (!client.ip ~ purge) {
        return (synth(405, "Not allowed."));
    }
    return (purge);
}

Remember where we allowed the local server to clear the cache? This section actually carries out the purge when it receives a request from an authorised client.

# Force lookup if the request is a no-cache request from the client
if (req.http.Cache-Control ~ "no-cache") {
    return (pass);
}

Here we check to see if the Cache-Control HTTP header is set to no-cache. If so, we pass it straight through to Apache.

    # Try a cache-lookup
    return (hash);
}

This is the last rule under vcl_recv, because it only reaches this point if the request has got past all the other rules. It tries to fetch the page from the cache. If the page is not in the cache, it passes it on to Apache and will cache the response.

sub vcl_backend_response {
    set beresp.grace = 5m;
}

This is where we set the grace period - roughly speaking, how long Varnish is allowed to carry on serving a stale copy of a page after it expires, while it fetches a fresh one from the backend. Here we’ve set it to 5 minutes.
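Note that the grace period is not the same thing as the TTL, which determines how long a response stays fresh in the cache in the first place - by default Varnish works that out from the backend’s caching headers. If you wanted to override the TTL explicitly, you could do so in the same subroutine. This is an illustrative example rather than part of the config above:

sub vcl_backend_response {
    # Consider cached objects fresh for an hour...
    set beresp.ttl = 1h;
    # ...and allow a stale copy to be served for up to 5 minutes after that
    set beresp.grace = 5m;
}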

With that done, we should be ready to restart Varnish and Apache. If you are using an operating system with systemd, then the following commands should restart Apache and Varnish:

$ sudo systemctl reload apache2.service
$ sudo systemctl reload varnish.service

For those not yet using systemd, try this instead:

$ sudo service apache2 restart
$ sudo service varnish restart

If you then visit your site and inspect the HTTP headers using your browser’s dev tools, you’ll notice the new HTTP header X-Varnish in the response. This tells you that Varnish is up and running. If you make sure you’re logged out, you should hopefully see that if you load a page, and then load it again, the second response is noticeably quicker.
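You can do the same check from the command line with curl - the domain here is illustrative:

$ curl -I http://example.com/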

Installing and configuring Varnish is a relatively quick and easy way of helping your website scale to be able to serve many more users, and if the site becomes popular all of a sudden, it can make a huge difference as to whether the site can stand up to the load or not. If you need more information on how to configure Varnish for your own needs, I recommend consulting the excellent documentation.
