I have recently converted my blog to run on Jekyll, partly because I was fed up with the weight of WordPress just to run a simple text delivery system, and partly because I didn’t need the security hassle of a painted target for script kiddies on my domain.
The one thing I really missed was search, so I looked around for a way to provide some sort of client-side search over a client-side index. I found this repo on hooking up lunr.js with Jekyll, and a simpler version on another blog, which together took me most of the way. However, one of my other goals in moving to Jekyll was to keep my file sizes to an absolute minimum. I’ve also put in a compression script to pack down my HTML, and used the jekyll-assets gem to integrate Sprockets. So why would I want a giant index file to download as well?
The out-of-the-box implementation was pretty basic as search indexers go, and a little wasteful of bytes. First off, I removed the pretty formatting on the JSON, which took the file size down a bit. Then I added a stopword processor, which strips a list of common stopwords from the content string. This took the input file for the indexer down from about 30k to 22k, roughly a 25% saving. It’s also pretty common for search indexers to have a minimum length for tokens, and in fact the autocomplete plugin in the original repo only kicks in after a minimum number of entered characters, so my patched version enforces the same minimum on indexed tokens to further reduce the size of the index.
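To give a feel for what those three changes look like, here is a rough sketch in Ruby rather than the actual code from my patched repo. The stopword list, the `MIN_TOKEN_LENGTH` value of 3 and the `compact_content` helper are all made up for illustration; a real stopword list would be much longer.

```ruby
require 'json'
require 'set'

# Illustrative stopword list only (an assumption, not the list from the repo).
STOPWORDS = Set.new(%w[a an and are as at be but by for if in is it of on or the to was with]).freeze

# Assumed threshold: tokens shorter than this are dropped from the index input.
MIN_TOKEN_LENGTH = 3

# Hypothetical helper: lowercase the text, split it into word tokens,
# then drop stopwords and tokens below the minimum length.
def compact_content(text, stopwords: STOPWORDS, min_length: MIN_TOKEN_LENGTH)
  text.downcase
      .scan(/[a-z0-9']+/)
      .reject { |token| token.length < min_length || stopwords.include?(token) }
      .join(' ')
end

entry = {
  'title'   => 'Search on Jekyll',
  'content' => compact_content('It is a simple way to add search to the blog.')
}

# JSON.generate emits compact JSON with no indentation or newlines,
# unlike JSON.pretty_generate.
compact = JSON.generate(entry)
pretty  = JSON.pretty_generate(entry)

puts compact
puts "bytes saved by dropping pretty formatting: #{pretty.bytesize - compact.bytesize}"
```

The filtering runs once at site build time, so the cost is paid by the generator and not by the visitor’s browser. The trade-off is that anything filtered out can no longer be searched for.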
My version of the repo can be found on GitHub. Hope it helps someone.