Migrating From Octopress 2 to Octopress 3
I recently migrated my blog from Octopress 2 to Octopress 3 and thought I’d provide a write-up on the migration process. The process covers changing over with the least amount of fuss. I provide code to help automate things as well as code to help sanitize your data. This process can be completed in 6 steps.
Background
Jekyll is a static-file publishing platform. It’s designed to help you create a “Plain Old Web Site” with a consistent look and feel. You write a plain-text document of content, run a program, and Jekyll wraps the page with template styling, footers, headers, etc. It helps turn “a heap of text documents” into a coherent site. You define the templates, which serve as the rules for transforming your plain-old text files into “the content of web pages.”
From this, it should be evident that a “blog” is really just a site constructed from uniformly styled / formatted plain-text files that underwent a templating process. A side perk of this design is that since your content lives in plain text files you can remove your dependency on a database for holding the content. You won’t need to worry about repeated reloads slowing down the site, etc. Also, because your site doesn’t have to do a query to the database, the content will load very quickly.
Obviously, if you take the basic non-blogging-platform Jekyll and use it as a blogging platform, it might fall short in terms of features. You might yearn for a command to start a new file with this look-and-feel or for a shorthand for referencing images. A set of extensions was written by Brandon Mathis and shared as a “fork” off of Jekyll. As Mr. Mathis discovered, authored, and integrated contributions to his “fork” of Jekyll, it became more useful. Eventually the Mathis-originated version of Jekyll deserved its own name: Octopress.
Problems
I was a fairly early convert to Octopress. However, staying up to date was implemented in a somewhat complex way. One of the present maintainers (Brandon Mathis) summed the matter up as:
What’s wrong?
If I’m being harsh, I’ll tell you that as it is now, Octopress is basically some guy’s Jekyll blog you can fork and modify. The first, and most obvious flaw, is that Octopress is distributed through Git. I want to punch through a wall when I think about users wrestling with merge conflicts from updating their sites. It’s absurd.
Tough talk from the maintainer.
As I started on my work hiatus I went to octopress.org, found the post containing the citation above, and saw that the last update date was: JAN 15TH, 2015. I was quite scared that there was going to be no way to move forward, that Octopress was basically abandoned.
To my surprise, Octopress 3.0 did emerge, did see the light of day, and is usable. The team adopted the approach of using Jekyll as the base software and wrapping the Octopress extensions up as a library (or Ruby “gem”) that can be rolled in. This guide will take you through migrating your blog from Octopress 2 to Octopress 3.
Step 1: Install Jekyll and Make an Initial Commit
Follow the quick install guide at Jekyll’s landing page or its more extensive installation guide. Then create a new site in a new directory. We’ll migrate the old content momentarily.
Initialize this directory with git. The following should do the work for you:
git init; git add . ; git commit -m "Jekyll install"
Step 2: Preliminary Customization
Files ending in .md in the “top-level” directory are “static pages.” By default, you’re given an about page as well as an index page. I’d recommend customizing your about page and I’d recommend editing (and then removing) the welcome-to-jekyll page in _posts just to prove that you can make changes.
Your _config.yml also probably should be changed from defaults: name, contact information, tagline, etc.
To verify these changes, use the built-in web server.
bundle; bundle exec jekyll s
This will start a local web server on port 4000 that you can browse to.
If your changes look correct, create a commit.
Customization (Advanced, Skippable)
I do my work on a VPS. I keep my development workspaces on a server run by DigitalOcean, which I connect to over the network. This means that any computer is basically a keyboard and screen for my “thinking computer” which lives in a data center with backups “in the cloud.”
Consequently, when I start up a local server I need it to run on a different hostname than localhost. To make this change I provide --host vps-host-domain-name. Similarly, the --port can be changed as well.
Step 3: Import!
This is surprisingly easy. Take the old posts in your old Octopress site’s _posts directory and dump them in the new site’s _posts directory. Add them all and make a commit.
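If you’d rather script the copy than drag files around, here is a minimal Python sketch of the step above. The function name copy_posts and the list of post extensions are my own assumptions, not anything Jekyll or Octopress provides, and the directory paths are whatever your two sites actually use. It skips files that already exist in the new site so nothing gets clobbered.

```python
import shutil
from pathlib import Path

# Assumption: these are the extensions your old posts use; adjust as needed.
POST_SUFFIXES = {".md", ".markdown", ".html"}


def copy_posts(old_posts_dir, new_posts_dir):
    """Copy post files from the old _posts directory into the new one.

    Returns the names of the files that were copied. Files that already
    exist in the destination are left untouched.
    """
    old_dir = Path(old_posts_dir)
    new_dir = Path(new_posts_dir)
    new_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for source in sorted(old_dir.iterdir()):
        if not source.is_file() or source.suffix not in POST_SUFFIXES:
            continue
        target = new_dir / source.name
        if target.exists():
            continue  # never overwrite something already in the new site
        shutil.copy2(source, target)
        copied.append(source.name)
    return copied
```

Call it with the two directories, for example copy_posts("old-site/source/_posts", "new-site/_posts") (both paths hypothetical), then git add and commit as described above.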
You’ll see your web server recognize the new content and rebuild (or attempt to rebuild) the site. Congratulations, your data has been migrated. It’s possible something will go wrong here; in fact, it’s all but guaranteed. Don’t worry, the errors are easy to fix.
Step 4: Data Sanitization
It’s possible that the content you used in Octopress 2 is not supported in Octopress 3. Or, perhaps you used a character in a way that Jekyll’s templating engine thinks has significance and it’s confused. This section will guide you through reasoning about these.
You’re going to want to fix these, but I’ll guide you on a path to dealing with the most likely culprits.
The {% img %} tag of Octopress
Even writing {% above, just now, triggered a build error. It’s OK, we’ll fix this!
In the Jekyll templating language, Liquid, curly braces and percent signs trigger a call to plugins, Ruby programming, or template capabilities. If you have calls that aren’t supported by Jekyll, which you probably do since you were using Octopress, you have to clean those up.
My most popular newly-invalidated tag was the img tag that helped style and center images. I had the choice of removing all of those, or coding in support for the img “plugin.” I opted for the latter. Here’s the code I used:
module Jekyll
  # Re-creates the Octopress 2 img tag so that old posts keep building.
  class OldOctopressImgTag < Liquid::Tag
    def initialize(tag_name, text, tokens)
      super
      # Expected arguments: class path [height width] alt text...
      args = text.split
      @class_name = args.shift
      @path = args.shift
      # Height and width are only read when alt text follows them
      if args.length >= 3
        @height = args.shift.to_i
        @width = args.shift.to_i
      end
      # Drop the surrounding quotes so they don't end up in the attribute
      @alt_text = args.join(" ").gsub(/\A["']|["']\z/, "")
    end

    def render(context)
      output = "<img class='#{@class_name}' src='#{@path}' "
      output = height_and_width(output)
      # The HTML attribute is "alt", not "alt_text"
      output + "alt='#{@alt_text}' />"
    end

    def height_and_width(s = "")
      return s if @height.nil? || @width.nil?
      s + "height='#{@height}' width='#{@width}' "
    end
  end
end

Liquid::Template.register_tag('img', Jekyll::OldOctopressImgTag)
You can use this img command with:
{% img style-class path-to-image height width "alt-text" %}
After adding a plugin you’ll need to restart any running web servers. Plugins are only read in at start time. As the server starts up you’ll see whether you’ve cleared these issues. Alternatively, run jekyll build and see if you’re clean of errors.
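For completeness, here is a rough sketch of the road I didn’t take: rewriting the img tags into plain HTML so that no plugin is needed. This is a hypothetical helper, not part of Jekyll or Octopress; the names IMG_TAG, img_tag_to_html and replace_img_tags are mine. It assumes the same argument order as the plugin above (class, path, optional height and width, then alt text) and leaves any tag it can’t parse untouched.

```python
import html
import re

# Matches {% img ... %} and captures everything between "img" and "%}"
IMG_TAG = re.compile(r"\{%\s*img\s+(.*?)\s*%\}")


def img_tag_to_html(match):
    """Turn one matched img tag into an <img> element."""
    args = match.group(1).split()
    if len(args) < 2:
        return match.group(0)  # not enough arguments, leave the tag alone

    class_name, path, rest = args[0], args[1], args[2:]

    # Height and width are optional and must both be plain integers
    size = ""
    if len(rest) >= 2 and rest[0].isdigit() and rest[1].isdigit():
        size = f' height="{rest[0]}" width="{rest[1]}"'
        rest = rest[2:]

    # Whatever is left is the alt text; drop the surrounding quotes
    alt = " ".join(rest).strip("\"'")
    return (
        f'<img class="{html.escape(class_name)}" src="{html.escape(path)}"'
        f'{size} alt="{html.escape(alt)}" />'
    )


def replace_img_tags(text):
    """Rewrite every img tag found in text."""
    return IMG_TAG.sub(img_tag_to_html, text)
```

Run it over the text of each post (with everything committed first) and review the diff before keeping the result.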
Remove other Meta-Tags
In several posts I used {{ or }} as a parenthetical. Those characters are used by Liquid and thus a no-no. Using grep to find them, fixing them, and committing the changes is wise. I replaced them with HTML entity codes instead. For the record, you can tell Liquid to stop processing a block by using its “raw” directive.
Also, in some places I posted LaTeX-formatted code, which uses those same characters.
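If grep feels too blunt, something like the following Python sketch can list the lines worth reviewing while ignoring anything already wrapped in a raw block. The function names are hypothetical and the approach is deliberately simple: it flags every Liquid delimiter it sees, including tags you may want to keep, so treat the output as a review list rather than a list of errors.

```python
import re

# A raw ... endraw block, which Liquid already leaves alone
RAW_BLOCK = re.compile(
    r"\{%-?\s*raw\s*-?%\}.*?\{%-?\s*endraw\s*-?%\}", re.DOTALL
)
# The character pairs that Liquid treats as significant
DELIMITER = re.compile(r"\{\{|\}\}|\{%|%\}")


def find_liquid_delimiters(text):
    """Return (line_number, line) pairs that contain Liquid delimiters
    outside of raw blocks."""
    # Blank out raw blocks but keep newlines so line numbers stay accurate
    masked = RAW_BLOCK.sub(lambda m: re.sub(r"[^\n]", " ", m.group(0)), text)
    original_lines = text.split("\n")
    hits = []
    for number, line in enumerate(masked.split("\n"), start=1):
        if DELIMITER.search(line):
            hits.append((number, original_lines[number - 1]))
    return hits


def escape_braces(line):
    """Replace doubled curly braces with HTML entity codes."""
    return line.replace("{{", "&#123;&#123;").replace("}}", "&#125;&#125;")
```

Feed find_liquid_delimiters the contents of a post, fix what it reports (by hand or with escape_braces), and rebuild.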
Optional: Normalize your Categories
Over the years I’d been inconsistent with my categories tags. As a result, in my custom categories page (code at bottom) I had both e.g. ajax and Ajax. I wrote a quick Python script to clean those files up. My advice is to make sure your working files are all committed and then apply this program. As a caveat, the effectiveness of this program will be impacted by how clean your data are. Nevertheless, it might give you a leg up if you have to do some clean-up. This script ensures all categories are “- Capitalized”. It is written in Python 3. Lastly, this is not robust, clean code. It was write-once ;)
#!/usr/bin/env python3
# Reads file names from standard input, e.g. `ls _posts/2017* | ./review-file.py`
import os.path
import shutil
import sys

DIR = "./_posts"


def build_fullpath(file_name):
    # Accept a bare file name or a path; the post is always looked up in DIR
    return os.path.join(DIR, os.path.basename(file_name))


def capitalize_categories_in(f):
    # Build path names
    current_file_path = build_fullpath(f)
    new_file_path = os.path.join("/tmp", os.path.basename(f) + "_new")

    # Find categories block's offsets
    with open(current_file_path) as current_file:
        lines = current_file.readlines()
    if 'categories:\n' not in lines:
        return
    start = lines.index('categories:\n', 1) + 1
    # Assumes categories is the last key before the closing '---'
    end = lines.index('---\n', 1)

    # Replace categories to be "Single capitalized and rest lower-cased"
    replacement_categories = ["- " + line[2:].capitalize() for line in lines[start:end]]

    # Write the new file
    the_new = lines[0:start] + replacement_categories + lines[end:]
    with open(new_file_path, "w") as out:
        out.writelines(the_new)

    # Replace the old file (shutil.move also works across filesystems)
    shutil.move(new_file_path, current_file_path)


# input() would only read the first line of the pipe, so read all of stdin
for fi in sys.stdin.read().split():
    capitalize_categories_in(fi)
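Before rewriting anything, it can help to see which categories actually collide. The sketch below is a hypothetical, read-only companion to the script above: given post contents, it reports categories that differ only by letter case. It assumes the same front-matter layout (a categories: key followed by "- name" lines) and does no real YAML parsing.

```python
from collections import defaultdict


def categories_in(text):
    """Return the categories listed in a post's front matter."""
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        return []  # no front matter

    categories = []
    in_block = False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of front matter
        if stripped == "categories:":
            in_block = True
        elif in_block and stripped.startswith("- "):
            categories.append(stripped[2:].strip())
        elif stripped:
            in_block = False  # some other front-matter key
    return categories


def case_variants(posts):
    """posts maps a file name to its text. Returns a dict of
    lowercase name -> sorted spellings, only where spellings differ."""
    spellings = defaultdict(set)
    for text in posts.values():
        for category in categories_in(text):
            spellings[category.lower()].add(category)
    return {
        name: sorted(found)
        for name, found in spellings.items()
        if len(found) > 1
    }
```

An empty result means there is nothing for the clean-up script to do.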
These are the major techniques at your disposal. The end goal is to get your Jekyll build to run cleanly.
Step 5: Migrate the Images
I recursively copied my images directory from my old Octopress directory to the new. git add the images directory and commit. Done!
Step 6: Preserve Your Permalinks
To tell Jekyll to use the permalink style that your Octopress 2 style site was indexed with, add the following to _config.yml:
# Add to preserve permalinks from Octopress site
permalink: /blog/:year/:month/:day/:title/
This will require a restart of your server.
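To spot-check that old links survive, you can compute the URL you expect for a given post and compare it with what the rebuilt site serves. This is a minimal sketch under a big assumption: that the date and title come straight from the post’s file name. Front-matter keys such as date or permalink can change the real URL, and expected_permalink is a name I made up.

```python
import re

# Jekyll post file names look like YYYY-MM-DD-title.ext
POST_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)\.[^.]+$")


def expected_permalink(file_name):
    """Build the /blog/:year/:month/:day/:title/ URL for a post file name."""
    match = POST_NAME.match(file_name)
    if match is None:
        raise ValueError(f"not a post file name: {file_name!r}")
    year, month, day, title = match.groups()
    return f"/blog/{year}/{month}/{day}/{title}/"
```

For a post file named 2017-05-01-hello-world.markdown (a made-up name) this gives /blog/2017/05/01/hello-world/, which you can then request from the local server.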
Optional: Advanced Customization
Custom Templates
To customize the default theme, you’re expected to copy its template OUT of the Gem directory and put it in your site directory. Your copy takes precedence over the default theme’s version, and thus you can customize the look and feel of the site.
For example, consider the case that I wanted to modify the footer.
$ grep theme _config.yml
theme: minima
As you can see, I’m running the default theme, “minima.” So I ask bundler where my “minima” is installed:
$ bundle show minima
/home/user/.gem/ruby/2.4.1/gems/minima-2.1.1
I copy footer.html out of the _includes directory contained in the directory reported by bundle show minima and place it in my own _includes/footer.html file. Thus my file will override the theme-provided file by the same name. The Jekyll customization page covers this well.
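The copy-then-customize step can be scripted too. Here is a small sketch, assuming you pass in the theme directory that bundle show minima printed; copy_theme_file is a hypothetical helper of mine, not a Jekyll command. It refuses to overwrite a file you have already customized.

```python
import shutil
from pathlib import Path


def copy_theme_file(theme_dir, site_dir, relative_path):
    """Copy one file out of the theme into the same relative place in the site.

    Returns True if the file was copied, False if the site already has
    its own version (which is left untouched).
    """
    source = Path(theme_dir) / relative_path
    target = Path(site_dir) / relative_path
    if target.exists():
        return False  # keep existing customizations
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return True
```

For the footer example that would be copy_theme_file(theme_dir, ".", "_includes/footer.html"), with theme_dir set to the path from bundle show minima.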
Custom CSS
Add custom CSS in assets/main.scss
Custom Pages
Customizing your templates or using their helper “includes” follows the general pattern described above in “Custom Templates.” You take something out of a gem, put it locally which overrides the default behavior, and then you customize.
I wound up adding the following:
categories.html
_includes/
_includes/footer.html
_includes/google_search.html
_includes/post_body.html
_includes/header.html
_layouts/
_layouts/home.html
_templates/
_templates/page
_templates/draft
_templates/post
I’ve included their bodies below (for searchability) but they’ll upset the flow of this post. Consequently let me wind things up and paste them at the absolute end of this post.
Deployment & Conclusion
From there on out, the work is learning the rules of Liquid templates so that you can customize your site. I’ve found Liquid’s documentation to be approachable and easy. It operates like most template frameworks (e.g. ERB, Handlebars) but has a rich set of transformations (“filters”) which make template-layer transformation easy.
I wound up creating another web host that serves my content. I do a “git push” to a git repository on another server, and that server is my web server. Deployment happens by means of a git post-receive hook. After I push to that repository, the hook code runs, which effectively runs jekyll build to put the content into a new directory on the remote site. I’ll include that code here as well.
$ cat post-receive
GIT_REPO=/full/path/to/repo
TMP_GIT_CLONE=/full/path/to/build/dir/for/jekyll
PUBLIC_WWW=/full/path/to/directory/served/by/nginx/or/other/webserver
export PATH=/home/user/.rbenv/bin:$PATH
git clone "$GIT_REPO" "$TMP_GIT_CLONE"
cd "$TMP_GIT_CLONE"
# Pick the Ruby version and install gems inside the clone, where the Gemfile lives
/home/user/.rbenv/bin/rbenv local 2.4.1
/home/user/.rbenv/shims/bundle
/home/user/.rbenv/shims/bundle exec jekyll build -s "$TMP_GIT_CLONE" -d "$PUBLIC_WWW"
rm -Rf "$TMP_GIT_CLONE"
exit
This gets Jekyll + Octopress up and running. To use Octopress’ handy scripts like octopress new post or octopress isolate <filename>, consult octopress --help. Jekyll + Octopress are a great combination.
And that’s it! Enjoy your screamin’ fast site!
Customized Files
categories.html
{% raw %}
---
layout: page
permalink: /categories/
title: Categories
---
<div id="archives">
{% assign sorted_categories = site.categories |sort %}
{% for category in sorted_categories %}
<div class="archive-group">
{% capture category_name %}{{ category | first }}{% endcapture %}
<h3 class="category-head">
{{ category_name | capitalize }} ({{ site.categories[category_name] | size }})
<a class="expand" href="#"> Expand</a>
</h3>
<a name="{{ category_name | slugify }}"></a>
{% assign sorted_posts = site.categories[category_name] | sort %}
{% for post in sorted_posts %}
<article class="archive-item hidden">
<h4><a href="{{ site.baseurl }}{{ post.url }}">{{post.title}}</a></h4>
</article>
{% endfor %}
</div>
{% endfor %}
</div>
<script type="application/javascript">
Array.from(document.querySelectorAll("a.expand")).forEach(elem => {
elem.addEventListener('click', e => {
e.preventDefault();
let expandLink = e.target;
let articles = expandLink.parentElement.parentElement.querySelectorAll("article");
articles.forEach(article => article.classList.toggle("hidden"));
expandLink.classList.toggle("expanded-category-link");
});
});
</script>
{% endraw %}
_includes/footer.html
{% raw %}
<footer class="site-footer">
<div class="wrapper">
<h2 class="footer-heading">{{ site.title | escape }}</h2>
<div class="footer-col-wrapper">
<div class="footer-col footer-col-1">
<ul class="contact-list">
<li>
{% if site.author %}
{{ site.author | escape }}
{% else %}
{{ site.title | escape }}
{% endif %}
</li>
{% if site.email %}
<li><a href="mailto:{{ site.email }}">{{ site.email }}</a></li>
{% endif %}
</ul>
</div>
<div class="footer-col footer-col-2">
<ul class="social-media-list">
{% if site.github_username %}
<li>
{% include icon-github.html username=site.github_username %}
</li>
{% endif %}
{% if site.twitter_username %}
<li>
{% include icon-twitter.html username=site.twitter_username %}
</li>
{% endif %}
</ul>
</div>
<div class="footer-col footer-col-3">
<p>{{ site.description | escape }}</p>
</div>
</div>
</div>
</footer>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'YOUR GOOGLE ANALYTICS ID HERE', 'auto');
ga('send', 'pageview');
</script>
{% endraw %}
_includes/google_search.html
{% raw %}
<form action="http://google.com/search" method="get" id="search-form">
<fieldset role="search">
<input type="hidden" name="q" value="site:stevengharms.com/" />
<input class="search" type="text" name="q" results="0" placeholder="Search"/>
</fieldset>
</form>
{% endraw %}
_includes/post_body.html
{% raw %}
<article class="post" itemscope itemtype="http://schema.org/BlogPosting">
<header class="post-header">
<h1 class="post-title" itemprop="name headline">{{ include.content.title | escape }}</h1>
<p class="post-meta">
<a href="{{ include.content.url | relative_url }}">
<time datetime="{{ include.content.date | date_to_xmlschema }}" itemprop="datePublished">
{% assign date_format = site.minima.date_format | default: "%b %-d, %Y" %}
{{ include.content.date | date: date_format }}
</time>
</a>
{% if include.content.author %}
• <span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name">{{ include.content.author }}</span></span>
{% endif %}</p>
</header>
<div class="post-content" itemprop="articleBody">
{% assign word_limit = 100 %}
{% capture word_count %}{{include.content.content | number_of_words | minus: word_limit}}{% endcapture %}
{% if include.content.content contains "<!-- more -->" %}
{% assign words = include.content.content | split: "<!-- more -->" %}
{{ words[0] }}
<a class="preview post-link" href="{{ include.content.url | relative_url }}"><em>Read more</em></a>
{% elsif word_count contains "-" %}
{{ include.content.content }}
{% elsif include.content.noabbrev %}
{{ include.content.content }}
{% else %}
{{ include.content.content | truncatewords: word_limit -}}...
<a class="preview post-link" href="{{ include.content.url | relative_url }}"><em>Continue</em></a>
{% endif %}
</div>
{% if site.disqus.shortname %}
{% include disqus_comments.html %}
{% endif %}
</article>
{% endraw %}
_includes/header.html
{% raw %}
<header class="site-header" role="banner">
<div class="wrapper">
{% assign default_paths = site.pages | map: "path" %}
{% assign page_paths = site.header_pages | default: default_paths %}
<a class="site-title" href="{{ "/" | relative_url }}">{{ site.title | escape }}</a>
{% if page_paths %}
<nav class="site-nav">
<input type="checkbox" id="nav-trigger" class="nav-trigger" />
<label for="nav-trigger">
<span class="menu-icon">
<svg viewBox="0 0 18 15" width="18px" height="15px">
<path fill="#424242" d="M18,1.484c0,0.82-0.665,1.484-1.484,1.484H1.484C0.665,2.969,0,2.304,0,1.484l0,0C0,0.665,0.665,0,1.484,0 h15.031C17.335,0,18,0.665,18,1.484L18,1.484z"/>
<path fill="#424242" d="M18,7.516C18,8.335,17.335,9,16.516,9H1.484C0.665,9,0,8.335,0,7.516l0,0c0-0.82,0.665-1.484,1.484-1.484 h15.031C17.335,6.031,18,6.696,18,7.516L18,7.516z"/>
<path fill="#424242" d="M18,13.516C18,14.335,17.335,15,16.516,15H1.484C0.665,15,0,14.335,0,13.516l0,0 c0-0.82,0.665-1.484,1.484-1.484h15.031C17.335,12.031,18,12.696,18,13.516L18,13.516z"/>
</svg>
</span>
</label>
<div class="trigger">
{% include google_search.html %}
{% for path in page_paths %}
{% assign my_page = site.pages | where: "path", path | first %}
{% if my_page.title %}
<a class="page-link" href="{{ my_page.url | relative_url }}">{{ my_page.title | escape }}</a>
{% endif %}
{% endfor %}
</div>
</nav>
{% endif %}
</div>
</header>
{% endraw %}
_layouts/home.html
{% raw %}
---
layout: default
---
<div class="home">
<h1 class="page-heading">Posts</h1>
{{ content }}
<ul class="post-list">
{% assign i = 0 %}
{% for post in site.posts %}
{% if i < 3 %}
{% include post_body.html content=post %}
<hr/>
{% else %}
<li>
{% assign date_format = site.minima.date_format | default: "%b %-d, %Y" %}
<span class="post-meta">{{ post.date | date: date_format }}</span>
<h2>
<a class="post-link" href="{{ post.url | relative_url }}">{{ post.title | escape }}</a>
</h2>
</li>
{% endif %}
{% assign i = i | plus: 1 %}
{% endfor %}
</ul>
<p class="rss-subscribe">subscribe <a href="{{ "/feed.xml" | relative_url }}">via RSS</a></p>
</div>
{% endraw %}
_templates/page
{% raw %}
---
layout: {{ layout }}
title: {{ title }}
---
{% endraw %}
_templates/draft
{% raw %}
---
layout: {{ layout }}
title: {{ title }}
---
{% endraw %}
_templates/post
{% raw %}
---
layout: {{ layout }}
title: {{ title }}
date: {{ date }}
---
{% endraw %}