Three Steps to Heaven

We argue that to make semantic publishing possible we need to:

  • make life better for the machine to

  • make life better for the author to

  • make life better for the reader

Making life better for the machine is a reasonable objective; to some extent, it is the point of semantic publishing. But it is not an end in itself, because ultimately we need the machine to do something for us.

We have to make life better for the author. It is all very well for a bioinformatician to say to a biologist, "Please mark up your text, so I can mine it", but the biologist will say, "No, I don’t have time."

We need to make life better for the reader because readers who benefit may nag the authors to add more, and reader demand is something the authors do care about.

To begin at the beginning


Or to begin at the end. We are interested in the long tail. Academics produce a lot of content, much of it falling into the category of grey literature: short tutorial information, lab books and so on. We believe that an academic should be able to publish this information straightforwardly, while maintaining the important features of current academic publishing.

The problem with publishing

  • Often time-consuming

  • Over a long time

There are three big problems with current academic publishing, however, which are going to prevent this. First, it is far too time-consuming; I refer here to the publishing process itself and not to the effort of authoring, which is hard because of the intellectual creativity that it requires. Second, it’s too expensive. To some extent, I don’t care about this: the money comes out of my grant, and it’s still relatively small compared to the overall cost of science. But it does mean that I have to think about what is worth publishing. Third, the publishing industry is one that, to misquote Douglas Adams, thinks that dumping a PDF on the web is a pretty neat idea.

The process


  • Again!

The process


  • Again, Again!

The process


  • Web First

  • RSS

  • Trackbacks

  • Stats

  • Gravatars

Here is one journal publishing process. It’s very heavy-weight. Pushing semantics through this is going to be well-nigh impossible.

So, we wanted a web-first, easily accessible process. We wanted to integrate with scientists’ existing workflows and with their existing tools. We wanted to build on top of commodity software wherever possible: we need tools like RSS, trackbacks, gravatars and commenting, and we didn’t want to write them.

So, we used WordPress.

Existing Workflow

  • Fitting in with people’s workflows

  • Fitting in with people’s tools

    • Word/Email

    • Word/Dropbox

    • LaTeX/Versioning

    • Asciidoc/Dropbox

  • The WordPress editor is lacking

There are many existing workflows. Most of them use Word, email, Dropbox and so on. One of the nice consequences of going commodity is that most of this is sorted out for us up front. Most blogging engines provide an XML-RPC interface, and many text production systems can speak it, or can be induced to.
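To give a flavour of what "talking XML-RPC" involves, here is a minimal sketch in Python. It only builds the request that an authoring tool would send and does not contact any server. It assumes the blog accepts the widely implemented metaWeblog.newPost call; the function name, credentials and post content are made up for illustration and are not part of our tooling.

```python
import xmlrpc.client


def build_new_post_request(username, password, title, body, blog_id=1):
    """Serialise a metaWeblog.newPost call to XML without sending it.

    Hypothetical helper: the argument order follows the MetaWeblog API
    (blog id, username, password, post struct, publish flag).
    """
    post = {"title": title, "description": body}
    publish = False  # False asks the server to keep the post as a draft
    params = (blog_id, username, password, post, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


# Placeholder credentials and content, for illustration only.
payload = build_new_post_request("author", "secret", "Lab notes", "<p>Hello</p>")
print(payload)
```

To actually publish, the same arguments could be passed through xmlrpc.client.ServerProxy pointed at the blog's XML-RPC endpoint (on a WordPress install this is normally xmlrpc.php at the site root, but check your own setup). The point is that the author's tool does the talking, so the author never has to open the web editor.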

Adding Semantics

  • Adding semantics is hard

  • The three steps to heaven limit us

  • Going to describe two exemplars

The three steps place constraints on us. We cannot ask too much of the authors. We want them to see immediate advantage at each point.

In this paper, we describe three use cases. Because of time constraints, I am going to describe just two of them here, and I also want to add some more recent work.

Maths (author)

  • The author adds [latex]e=mc^2[/latex] to their document

  • Which is rendered as:

    e = mc²

  • This works in any authoring environment

  • The author adds "here is maths" semantics
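As a rough illustration of how little the author has to do, the sketch below finds each [latex]...[/latex] shortcode in plain text and wraps its contents in a span that says "here is maths". This is not the code of the plugin we use; the function name and the math-tex class name are assumptions made up for this example, and real rendering of the equation would happen in a later step.

```python
import html
import re

# Matches [latex]...[/latex], capturing the TeX source non-greedily.
LATEX_SHORTCODE = re.compile(r"\[latex\](.+?)\[/latex\]", re.DOTALL)


def mark_up_maths(text):
    """Wrap each [latex] shortcode in a span that labels it as maths.

    Hypothetical helper: the "math-tex" class is an illustrative choice.
    """
    def replace(match):
        # Escape the TeX source so characters like < are safe in HTML.
        source = html.escape(match.group(1).strip())
        return f'<span class="math-tex">{source}</span>'

    return LATEX_SHORTCODE.sub(replace, text)


example = "Energy is [latex]e=mc^2[/latex]."
print(mark_up_maths(example))
```

Because the shortcode is just text, it survives Word, email, LaTeX or any other route into the blog, and the machine still gets an unambiguous marker for where the maths is.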

Maths (reader)

  • The reader gets a nicely rendered equation