WordPress to Hugo

Woah, is that this page on this page with this page?

The blog has moved from WordPress to Hugo. Here are some helpful hints if you intend to do the same.

In its fourth iteration (after a healthy hiatus of almost 10 years), I recently moved this blog from WordPress to Hugo. Static Site Generators (or SSGs) like Hugo and Jekyll are a breath of fresh air for blog-centric websites. No constantly worrying about security updates as you do with WordPress, content written in Markdown, a nice continuous deployment pipeline, no databases to worry about and the added bonus of easy hosting: how can you not?

If you intend to go down this path, here are some of my learnings and notes from my journey. This isn’t a step-by-step walkthrough, merely notes from my travels.

Preamble

Firstly, this site was built with Hugo, using my fork of a fork of the hyde-hyde theme by Steve Francia. Having started my Go journey in 2016, I found there was little in the Go ecosystem you’d be using right now that Steve hadn’t touched - including Hugo itself.

I did give Jekyll a go a while back, but wasn’t a fan of Ruby (though I do prefer its Liquid template library to Hugo’s templates for things like conditionals). For instance, this is what template conditionals look like in Jekyll:

{% if page.is_post and page.isArchived %}
  <!-- Do things -->
{% endif %}

And the same in Hugo:

{{ if (and (eq .Type "post") (isset .Params "isarchived")) }}
  <!-- Do things -->
{{ end }}

Themes/look and feel aside (that’s a very personal customisation), the migration off WordPress was a little finicky, largely because of my content. The blog carried content from LiveJournal (2003-2005) and from developerfusion’s old Community Server blog engine (2005-2008), all of which had been migrated to WordPress, and that history proved quite troublesome. You should have far better luck!

My focus was only on the content, the initial metadata/taxonomies (tags/categories) and the URL patterns; I didn’t care about replicating the theme, plugins, comments or anything else.

The blog is hosted on Netlify and the source is on GitHub. GitHub Pages was also under consideration early on.

Comments are powered by GraphComment, the syntax highlighting is PrismJS.

Migrating Content out of WordPress

For my musings, I took a WordPress backup via UpdraftPlus to try a couple of approaches within a containerised local instance.

WordPress Docker Stack

First and foremost, you’ll need a Docker stack that includes WordPress and MySQL, specific to your installation (so ensure that both the WordPress tag and the MySQL tag match the versions you run).

version: '3.1'

services:

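  wordpress:
    # The -fpm image variants only expose PHP-FPM on port 9000 and need a
    # separate web server, so an -apache tag is used here to serve port 80.
    image: wordpress:5.1.0-php7.3-apache
    ports:
      - 8080:80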
    environment:
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: super-secret-password
      WORDPRESS_DB_NAME: wordpress_prod
    volumes:
      - wordpress:/var/www/html

  db:
    image: mysql:5.7
    environment:
      MYSQL_DATABASE: wordpress_prod
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: super-secret-password
      MYSQL_RANDOM_ROOT_PASSWORD: '1'
    volumes:
      - db:/var/lib/mysql

volumes:
  wordpress:
  db:

Start the stack with either docker-compose or docker stack; I used docker-compose:


docker-compose -f stack.yml up
Creating network "_docker_default" with the default driver
Creating volume "docker_wordpress" with default driver
Creating volume "docker_db" with default driver
...
4310a0bf42d: Pull complete
d398726627fd: Pull complete
Digest: sha256:da58f943b94721d46e87d5de208dc07302a8b13e638cd1d24285d222376d6d84
Status: Downloaded newer image for mysql:5.7
Creating docker_db_1        ... done
Creating docker_wordpress_1 ... done
Attaching to docker_db_1, docker_wordpress_1
docker_db_1  | 2020-08-21 02:55:37+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.31-1debian10 started.
docker_db_1  | 2020-08-21 02:55:37+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
docker_wordpress_1 | WordPress not found in /var/www/html - copying now...
docker_db_1  | 2020-08-21 02:55:37+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.31-1debian10 started.
docker_db_1  | 2020-08-21 02:55:37+00:00 [Note] [Entrypoint]: Initializing database files
docker_db_1  | 2020-08-21T02:55:37.595167Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_t
...

Once the above pulls the images and sets up the stack, you’ll be able to set up the initial configuration for WordPress by visiting http://localhost:8080. This is also where I restored the UpdraftPlus backup.

The important paths from the above stack are:

  • Themes - /var/www/html/wp-content/themes/
  • Plugins - /var/www/html/wp-content/plugins/

I didn’t really look at the themes and mostly focused on copying over the UpdraftPlus plugin and the migration setup.

Migration Approaches

This is where you’ll spend most of your fun times determining what works for your particular setup. It’s vital that you not only worry about the content but also the URL patterns so that you don’t break existing links.

For my setup, all posts were at /index.php/[year]/[month]/[day]/[title] and I wanted that maintained.

Export to Hugo

Export to Hugo is the recommended approach from the Hugo Migrations page. It sounded too good to be true and had all the right features:

  • Converts all posts, pages, and settings from WordPress for use in Hugo
  • Export what your users see, not what the database stores (runs post content through the_content filter prior to export, allowing third-party plugins to modify the output)
  • Converts all post_content to Markdown Extra (using Markdownify)
  • Converts all post_meta and fields within the wp_posts table to YAML front matter for parsing by Hugo.
  • Optionally exports comments as part of their posts. This feature needs to be enabled manually by editing the PHP source code. See file hugo-export.php at line ~40.
  • Export private posts and drafts. They are marked as drafts as well and won’t get published with Hugo.
  • Generates a config.yaml with all settings in the wp_options table
  • Outputs a single zip file with config.yaml, pages, and post folder containing .md files for each post in the proper Hugo naming convention.
  • No settings. Just a single click.

Activate the plugin and single click! However, if you find that the script fails due to execution time or permission issues, you can (as I did) use the CLI, which is why it was containerised:


cd wp-content/plugins/wordpress-to-hugo-exporter/
php hugo-export-cli.php
This is your file!
/tmp/wp-hugo.zip

Unfortunately, this plugin didn’t quite work for me due to my varying content and structure - mind you, this was before the 2.0 release that seems to be current now. The zip file was created, but the content inside the markdown files was inconsistent and not all of the posts from WordPress were included.
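Since the symptom here was missing posts, a quick sanity check after unpacking any export is to count the markdown files and compare that number against the published post count WordPress reports. A minimal sketch, assuming the zip has already been extracted to a directory; the count_markdown helper name and the wp-hugo directory are my own and not part of the plugin:

```shell
# Count the markdown files under a directory (hypothetical helper).
# Assumes the export was extracted first, e.g.:
#   unzip /tmp/wp-hugo.zip -d wp-hugo
count_markdown() {
  # tr strips the padding some wc implementations add around the number
  find "$1" -name '*.md' -type f | wc -l | tr -d ' '
}
```

Running count_markdown wp-hugo prints a single number to hold up against the WordPress admin post count.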

Export to Jekyll

Export to Jekyll is in a similar vein to the Hugo exporter (same author?) but it’s in (!) Jekyll format:

  • Converts all posts, pages, and settings from WordPress for use in Jekyll
  • Export what your users see, not what the database stores (runs post content through the_content filter prior to export, allowing third-party plugins to modify the output)
  • Converts all post_content to Markdown Extra (using Markdownify)
  • Converts all post_meta and fields within the wp_posts table to YAML front matter for parsing by Jekyll
  • Generates a _config.yml with all settings in the wp_options table
  • Outputs a single zip file with _config.yml, pages, and _posts folder containing .md files for each post in the proper Jekyll naming convention
  • No settings. Just a single click.

This adds an additional step to your migration, but (for me) this approach worked and it exported content, metadata and taxonomies with ease. What’s also important is that it automatically determined my URL patterns and matched them, e.g.:

---
title: The anatomy of the Ext4 File-System
type: posts
date: 2009-02-23T12:21:23+00:00
url: /index.php/2009/02/23/the-anatomy-of-the-ext4-file-system/
...
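Because keeping the old URL patterns hinges on that url key, it may be worth checking that every exported post actually has one. A rough sketch, assuming the export has been unpacked into a directory; posts_missing_url is a hypothetical helper name, and the check is naive in that a line starting with url: anywhere in a file counts as a match:

```shell
# List exported posts that have no line starting with "url:" (hypothetical helper).
# grep -L prints the names of files in which the pattern was NOT found.
posts_missing_url() {
  find "$1" -name '*.md' -type f -exec grep -L '^url:' {} +
}
```

Any file names printed are posts that would fall back to Hugo’s default permalink rather than the old WordPress one.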

Importing into Hugo

This was probably the least painful part, even though I’d added this extra step by using the Jekyll exporter. Hugo conveniently has a Jekyll importer.


mkdir blog
hugo import jekyll jekyll-files blog
Importing...
Congratulations! 486 post(s) imported!
Now, start Hugo by yourself:

That’s it, it’s now ready to serve but before you do, it’s best you read through the Hugo Site Variables documentation to separate out your local development and production configurations.

Add a basic theme to get you started:


git clone https://github.com/spf13/herring-cove.git blog/themes/herring-cove

Then fire it up:


hugo server --buildDrafts --port 1337 --theme=herring-cove

Lightning fast build and serve.

Manual Curation

Once you have something resembling a Hugo site rendering, it’s evident that things need to be curated, culled and have their formatting fixed up. There are some tools further down that helped convert inline HTML into Markdown (the conversion process failed to convert it). This is especially important on Hugo v0.60+, which uses Goldmark and by default does not render inline HTML, whereas the previous Blackfriday handler did.
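To get a feel for how much inline HTML is left to clean up, something along these lines can list the affected files. This is only a sketch: files_with_inline_html is a hypothetical helper, and the tag list (span, div, table, a) is an illustrative guess that you would extend to match your own content:

```shell
# List markdown files that still contain a few common inline HTML tags.
# Matches opening or closing tags such as <span ...>, </span>, <a href=...>.
files_with_inline_html() {
  find "$1" -name '*.md' -type f -exec grep -l -E '</?(span|div|table|a)[ >]' {} +
}
```

Pointing it at the content directory gives a worklist of posts to run through the conversion tools.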

During this phase, it’ll be helpful to go back to Blackfriday in your config.toml:

[markup]
  defaultMarkdownHandler = "blackfriday"

Then when you’re ready to boogie, switch back to Goldmark and keep unsafe content (raw inline HTML) from rendering:

[markup]
  defaultMarkdownHandler = "goldmark"
  [markup.goldmark]
    [markup.goldmark.renderer]
      unsafe = false

This is also the best time to decide on a Hugo theme and get your basic layout sorted. I spent a fair chunk of my time customising the hyde-hyde fork and adding relevant shortcodes etc.

Curated Things

  • Images and links that were deemed Mixed-Content (so http based images being used on https)
  • Span tags littered throughout the MD files because of the previous syntax highlighter used on WordPress
  • Rewriting Codeblocks to use Hugo shortcodes
  • Rewriting bash/powershell command lines to use Hugo Shortcodes
  • Removing redundant or overly complex tag taxonomies
  • Reducing Categories and relinking uncategorised areas (not sure why that happened)
  • Rewriting internal URLs that looked for WordPress specific taxonomies (e.g. /index.php/tag vs /tags). Some nifty sed scripts were used (doco’d below)
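Two of the items above lend themselves to scripting. The following is a sketch rather than the exact scripts I ran: it assumes GNU sed (for -i), assumes Hugo’s default /tags and /categories paths, and both function names are made up for illustration. Post URLs under /index.php/[year]/... are deliberately left untouched.

```shell
# Rewrite WordPress taxonomy links to Hugo's default paths, in place.
# Assumes GNU sed; only /index.php/tag/ and /index.php/category/ are touched.
rewrite_taxonomy_links() {
  find "$1" -name '*.md' -type f -exec sed -i \
    -e 's|/index\.php/tag/|/tags/|g' \
    -e 's|/index\.php/category/|/categories/|g' {} +
}

# List markdown files that embed images over plain http (mixed content),
# either as markdown images or as leftover src="http://..." attributes.
list_mixed_content() {
  find "$1" -name '*.md' -type f -exec grep -l -E '(!\[[^]]*\]\(|src=")http://' {} +
}
```

It is worth committing to git before running the rewrite so the in-place changes can be reviewed as a diff.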

Hosting on Netlify

One of the primary reasons I picked Hugo was because of the static HTML files and content that’s hugely CDN’able! My initial thought was to just use my AWS account and run this off an S3 bucket.

But Netlify brings so much to the table that it was hard to consider anything else - quite apart from it having a free tier (not too relevant for me).

  • Easy Continuous Deployment: Trivial setup of CD via a netlify.toml file (mine’s below, but you can use it too!)
  • Super fast deployment & updates: Honestly, it’s deployed and ready in minutes with full build history.
  • Branch Deployment: This is another trick up Netlify’s sleeve and it was so useful when configuring the theme and backend (front-end) bits.
  • Asset Optimisation: They’ll auto-optimise your assets before they hit their CDN.
  • SSL Included: Configured with Let’s Encrypt, it’s easily set up as per their ever helpful documentation.

The entire setup & deployment from local Hugo to *name*.netlify.app was about 30 minutes (and 10mins of that was the baseURL issue mentioned below)!

SSL Goodness

Once you move your DNS hosting to Netlify though, you’ll be able to seamlessly manage your SSL certificates too…

Figure 1. Netlify provisioning SSL Certs via Lets Encrypt

…once the DNS propagates:

Figure 2. DNS Propagated, SSL Certs provisioned!

Couldn’t be easier! Make sure you leave a donation with the Let’s Encrypt folks for their amazing work.

Some config.toml Things…

Ensure that your configuration has a relative baseURL so relative pathing works correctly - this took me a bit of fiddling initially.

baseurl = "/"
languageCode = "en-us"
...

Reusable netlify.toml

Finally, here’s the Netlify hosting configuration, pinned to what was (at the time) the latest release of Hugo.

[build]
publish = "public"
command = "hugo --gc --minify"

[context.production.environment]
HUGO_VERSION = "0.74.3"
HUGO_ENV = "production"
HUGO_ENABLEGITINFO = "true"

[context.split1]
command = "hugo --gc --minify --enableGitInfo"

[context.split1.environment]
HUGO_VERSION = "0.74.3"
HUGO_ENV = "production"

[context.deploy-preview]
command = "hugo --gc --minify --buildFuture -b $DEPLOY_PRIME_URL"

[context.deploy-preview.environment]
HUGO_VERSION = "0.74.3"

[context.branch-deploy]
command = "hugo --gc --minify -b $DEPLOY_PRIME_URL"

[context.branch-deploy.environment]
HUGO_VERSION = "0.74.3"

[context.next.environment]
HUGO_ENABLEGITINFO = "true"

[[headers]]
  for = "atom.*"
  [headers.values]
    Content-Type = "application/atom+xml; charset=UTF-8"
[[headers]]
  for = "*.atom"
  [headers.values]
    Content-Type = "application/atom+xml; charset=UTF-8"

The Hugo site has amazing instructions on this.

Setup Redirects with _redirects

Migrating also requires redirecting some old URLs to the new Hugo style - especially when going from IIS-hosted WordPress to Hugo. Netlify’s documentation has an example that explains this, but here’s my _redirects file.

Remember to place this in the /static folder so it correctly puts it in the root once deployed.

/index.php         /
/index.php/feed   /feed.xml
/index.php/about   /about
/index.php/contact-me /about
/index.php/category /categories
/index.php/tag     /tags

Finally, you’ll also want to set disableAliases in Hugo, as you’re letting Netlify manage redirects now - in config.toml:

disableAliases = true
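A typo in _redirects is easy to miss until a link breaks, so a small pre-deploy check can help. This sketch assumes every rule is path-based like mine above (a source path starting with / followed by a target); lint_redirects is a hypothetical helper, not a Netlify tool:

```shell
# Print any _redirects rule that lacks a target or whose source path does not
# start with "/". Blank lines and "#" comments are skipped.
# Exits non-zero when at least one problem is found.
lint_redirects() {
  awk '
    /^[[:space:]]*(#|$)/ { next }
    NF < 2 || $1 !~ /^\// { printf "line %d: %s\n", NR, $0; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}
```

For example, lint_redirects static/_redirects stays silent and returns success when every rule has both parts.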

DNS Configuration

You may want to use the (now live) Netlify DNS service instead of having to update the DNS on your domain registrar’s side. It makes managing your domain easier if it’s solely dedicated to the Hugo site.

I opted to use Netlify’s DNS to handle my DNS needs, then followed the setup on the Netlify site.

Tools & Scripts

Some insanely useful tools I made use of during the migration:

  • markdownTables for converting HTML Tables into nice markdown representations.
  • turndown HTML to Markdown conversion for adjusting markup missed by the conversion process that left Anchor Tags/floating formatting artefacts.

Remove the Author from MD files

This may be a remnant of the Jekyll export, but all my archived posts had an author tag, which confused the RSS feeds a bit. I removed those with some sed:

find . -name "*.md" -type f -print0 | xargs -0 sed -i -e '/author: Thushan Fernando/d'

Mark old posts as archived

I wanted to ensure old content was tagged appropriately, so adding an isArchived flag was easier sed than done:

find . -name "*.md" -type f -print0 | xargs -0 sed -i -e '/type: posts/a isarchived: true'
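One thing to watch with the append command above is that it isn’t idempotent: run it twice and every post gets two isarchived lines. A guarded variant might look like the sketch below, which assumes GNU sed and file names without newlines; mark_archived is a hypothetical helper name:

```shell
# Append "isarchived: true" after the "type: posts" line, but only in files
# that do not already carry the flag, so repeated runs are safe.
mark_archived() {
  find "$1" -name '*.md' -type f | while IFS= read -r file; do
    if ! grep -q '^isarchived:' "$file"; then
      sed -i -e '/^type: posts/a isarchived: true' "$file"
    fi
  done
}
```

Calling mark_archived on the content directory a second time leaves already-flagged posts unchanged.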
