Searching content with Fuse.js
Published by Matthew Daly at 20th February 2019 5:25 pm
Search is a problem I'm currently taking a big interest in. The legacy project I maintain has an utterly abominable search facility, one that I'm eager to replace with something like Elasticsearch. But sites that are too small for Elasticsearch to be worth the bother can still benefit from having a decent search implementation. Despite some recent improvements, relational databases aren't generally that good a fit for search because they don't really understand the concept of relevance - you can't easily order results by how good a match they are, and your database may not deal with fuzzy matching well.
I'm currently working on a small flat-file CMS as a personal project. It's built with PHP, but it's intended to be as simple as possible, with no database, no caching service, and certainly no search service, so it needs something small and simple, but still effective for search.
In the past I've used Lunr.js on my own site, and it works very well there. However, it's problematic for this project, as the index needs to be generated in JavaScript on the server side, and adding Node.js to the stack for a flat-file PHP CMS is not really an option. What I needed was something where I could generate the index in any language I chose, load it via AJAX, and search it on the client side. I recently happened to stumble across Fuse.js, which was pretty much exactly what I was after.
Suppose we have the following index:
    [
      {
        "title": "About me",
        "path": "about/"
      },
      {
        "title": "Meet the team",
        "path": "about/meet-the-team/"
      },
      {
        "title": "Alice",
        "path": "about/meet-the-team/alice/"
      },
      {
        "title": "Bob",
        "path": "about/meet-the-team/bob/"
      },
      {
        "title": "Chris",
        "path": "about/meet-the-team/chris/"
      },
      {
        "title": "Home",
        "path": "index/"
      }
    ]
This index can be generated in any way you see fit. In this case, the page content is stored in Markdown files with YAML front matter, so I wrote a Symfony console command which gets all the Markdown files in the content folder, parses them to get the titles, and retrieves the path. You could also retrieve other items in front matter such as categories or tags, and the page content, and include that in the index. The data then gets converted to JSON and saved to the index file. As you can see, there's nothing special about this JSON - these two fields happen to be the ones I've chosen.
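The console command described above is written in PHP and isn't shown here. Purely to illustrate the shape of the transformation, here is a minimal sketch in JavaScript. The function names (`extractTitle`, `toPath`, `buildIndex`) are hypothetical, the front matter handling only understands a plain `title:` line rather than full YAML, and the mapping from file name to path is an assumed convention, not necessarily the one the CMS uses.

```javascript
// Hypothetical helper: pull the title out of a YAML front matter block.
// Only handles a simple `title: ...` line; a real YAML parser is more robust.
function extractTitle(markdown) {
  const frontMatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!frontMatter) return null;
  const title = /^title:\s*(.+)$/m.exec(frontMatter[1]);
  if (!title) return null;
  // Trim whitespace and strip one pair of surrounding quotes, if present.
  return title[1].trim().replace(/^["']|["']$/g, '');
}

// Assumed convention: about/meet-the-team/bob.md is served at about/meet-the-team/bob/
function toPath(relativeFile) {
  return relativeFile.replace(/\\/g, '/').replace(/\.md$/, '') + '/';
}

// files: object mapping relative file names to their Markdown source.
function buildIndex(files) {
  return Object.entries(files)
    .map(([file, source]) => ({ title: extractTitle(source), path: toPath(file) }))
    .filter((entry) => entry.title !== null); // skip files without a title
}

const index = buildIndex({
  'about.md': '---\ntitle: About me\n---\nSome content',
  'about/meet-the-team/alice.md': '---\ntitle: "Alice"\n---\nHi, I am Alice',
});

// The string that would be written to the index file.
const json = JSON.stringify(index, null, 2);
```

In a real command the `files` object would be filled by walking the content folder, and `json` would be saved to wherever the front end loads the index from.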
Now we can load the JSON file via AJAX, and pass it to a new Fuse instance. You can search the index using the .search() method, as shown below:
    import Fuse from 'fuse.js';
    window.$ = window.jQuery = require('jquery');

    $(document).ready(function () {
      // Fetch the pre-generated index, then build the Fuse instance once
      window.$.getJSON('/storage/index.json', function (response) {
        const fuse = new Fuse(response, {
          keys: ['title'],
          shouldSort: true
        });
        $('#search').on('keyup', function () {
          // Written against the Fuse.js release current in early 2019, where
          // search() returns the matching items directly. Newer releases such
          // as 6.x wrap each match as { item, refIndex }.
          let result = fuse.search($(this).val());

          // Output it
          let resultdiv = $('ul.searchresults');
          if (result.length === 0) {
            // Hide results
            resultdiv.hide();
          } else {
            // Show the top four results
            resultdiv.empty();
            for (let item of result.slice(0, 4)) {
              let searchitem = '<li><a href="/' + item.path + '">' + item.title + '</a></li>';
              resultdiv.append(searchitem);
            }
            resultdiv.show();
          }
        });
      });
    });
The really great thing about Fuse.js is that it can search just about any JSON content, making it extremely flexible. For a site with a MySQL database, you could generate the JSON from one or more tables in the database, cache it in Redis or Memcached indefinitely until such time as the content changes again, and only regenerate it then, making for an extremely efficient client-side search that doesn't need to hit the database during normal operation. Or you could generate it from static files, as in this example. It also means the backend language is not an issue, since you can easily generate the JSON file in PHP, JavaScript, Python or any other language.
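The cache-until-the-content-changes idea can be sketched as follows. This is only an illustration under assumptions: `createIndexCache` is a hypothetical name, a closure variable stands in for Redis or Memcached, and the `pages` array stands in for a database query.

```javascript
// Hypothetical stand-in for a cache such as Redis or Memcached:
// the cached JSON simply lives in a closure variable.
function createIndexCache(generateIndex) {
  let cachedJson = null;

  return {
    // Return the cached JSON, generating it only if nothing is cached yet.
    get() {
      if (cachedJson === null) {
        cachedJson = JSON.stringify(generateIndex());
      }
      return cachedJson;
    },
    // Call this whenever content is created, updated or deleted.
    invalidate() {
      cachedJson = null;
    },
  };
}

// Count how often the (pretend) database query actually runs.
let queryCount = 0;
const pages = [{ title: 'Home', path: 'index/' }];
const indexCache = createIndexCache(() => {
  queryCount += 1;
  return pages.map(({ title, path }) => ({ title, path }));
});

indexCache.get(); // first request: runs the query
indexCache.get(); // served from the cache, no second query

pages.push({ title: 'Bob', path: 'about/meet-the-team/bob/' });
indexCache.invalidate(); // content changed, so drop the cached copy

const fresh = JSON.parse(indexCache.get()); // regenerated once
```

The point of the design is that the query cost is paid once per content change rather than once per search, since every search runs in the browser against the already generated JSON.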
As you can see, it's pretty straightforward to use Fuse.js to create a working search field out of the box, but the website lists a number of options allowing you to customise the search for your particular use case, and I'd recommend looking through these if you're planning on using it on a project.
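As a rough idea of what that customisation can look like, here is a sketch of an options object that could be passed as the second argument to `new Fuse(...)`. The option names are ones Fuse.js documents, but the values are arbitrary starting points, and the `tags` field is hypothetical: it assumes tags from the front matter were added to the index, which the example index above does not do.

```javascript
// Sketch of a more tailored options object for `new Fuse(index, options)`.
const options = {
  // Search more than one field, weighting title matches more heavily.
  keys: [
    { name: 'title', weight: 0.7 },
    { name: 'tags', weight: 0.3 }, // hypothetical field, not in the example index
  ],
  // 0.0 demands an exact match, 1.0 matches anything; lower is stricter.
  threshold: 0.3,
  // Ignore matches shorter than two characters.
  minMatchCharLength: 2,
  // Order results by score, best match first.
  shouldSort: true,
  // Attach a score to each result (0 is a perfect match).
  includeScore: true,
};
```

Note that with `includeScore` enabled, each result is an object holding the matched item and its score rather than the bare item, so any code that renders results has to read the fields from `result.item`.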