Special Characters (+Extensions!)

October 5, 2014 at 11:33 pm

I’m working on a very primitive extension to the excellent Python Markdown library that deals with typographic issues. There are several libraries that handle this, and most of the time they work, but sometimes I’d much rather just type an em-dash in-line — I mean without fussing with the -- or --- syntax. I feel this keeps source files more portable, and I’m completely comfortable hitting Option+Shift+- as needed.

The conversions I’m initially concerned about are listed in the table below.

| Sign | HTML Entity | Relevance |
| --- | --- | --- |
| — | `&mdash;` | An em-dash is used to separate out departures in thought — like this. |
| – | `&ndash;` | An en-dash is used to indicate ranges: 10–20, 9th–13th, etc. |
| × | `&times;` | Times sign is used to indicate multiplication or height/width: 1080px × 1920px, for example. |
| “ ” ‘ ’ | `&ldquo;` `&rdquo;` `&lsquo;` `&rsquo;` | “Smart” quotes are the curly ones. |

In addition to the items above, I considered extras like adding a hair space around em-dashes.

I implemented a rough Preprocessor for Markdown that does simple regular-expression-based replacements of these characters into HTML entities. Obviously the entire thing falls over when you’re handling non-UTF-8 files, but that wasn’t a restriction for me.
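As a rough illustration of that approach (this is not the typoplus source, just a sketch under a few assumptions), the replacement step could look something like the following. Python Markdown preprocessors receive the document as a list of lines and return a list of lines from their `run` method, so the hypothetical `replace_typography` function below mirrors that shape using only the standard library. The names `ENTITIES`, `HAIR_SPACE` and `replace_typography`, the choice of named entities, and the use of `&#8202;` for the hair space are all assumptions made for the example.

```python
import re

# Assumed for this sketch: numeric entity for U+200A HAIR SPACE.
HAIR_SPACE = "&#8202;"

# Literal character -> named HTML entity (assumed mapping, not typoplus's).
ENTITIES = {
    "\u2014": "&mdash;",  # em-dash
    "\u2013": "&ndash;",  # en-dash
    "\u00d7": "&times;",  # multiplication sign
    "\u201c": "&ldquo;",  # left double quote
    "\u201d": "&rdquo;",  # right double quote
    "\u2018": "&lsquo;",  # left single quote
    "\u2019": "&rsquo;",  # right single quote
}

# One character class matching any of the characters above.
CHAR_RE = re.compile("[" + "".join(ENTITIES) + "]")

# An em-dash together with any spaces or tabs directly around it.
EM_DASH_RE = re.compile(r"[ \t]*\u2014[ \t]*")


def replace_typography(lines, hair_space=True):
    """Return a new list of lines with typographic characters as entities.

    When hair_space is true, whitespace around each em-dash is replaced
    by a hair space on both sides before the entity conversion runs.
    """
    out = []
    for line in lines:
        if hair_space:
            line = EM_DASH_RE.sub(HAIR_SPACE + "\u2014" + HAIR_SPACE, line)
        line = CHAR_RE.sub(lambda m: ENTITIES[m.group(0)], line)
        out.append(line)
    return out


if __name__ == "__main__":
    for converted in replace_typography(["A thought \u2014 like this.", "10\u201320"]):
        print(converted)
```

Note that this sketch converts every line it is given, so it would also rewrite characters inside code blocks; a real preprocessor would probably want to skip those.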

Here’s the biggest change, which is the em-dash hair space item. An em-dash—without hair space — with hair space. Not bad, eh?

Source code is on GitHub at joshkehn/typoplus. It plays nicely with Typogrify and render_math. When I get around to banging Pelican into shape I’ll set up a projects page.

