Highly Literate Ontology

  • What is an ontology and what is it for

  • The need for description

  • Ontologies and Programming

  • Literate Ontologies

  • Lenticular Text

  • The Future

Note

Introduction. Today is going to be a bit of a story, talking about the need for description in ontologies. I want to talk about a new paradigm for ontology development which enables this, and two pieces of technology that implement it.

There is a fair bit of technology involved, so breathe deeply, as parts of this will be a whistle-stop tour.

What is an ontology

  • A specification of a conceptualisation

  • A conceptualisation of something

  • Lots of disagreement on what "something" is

Note

An ontology is a specification of a conceptualisation. Clearly it’s a conceptualisation of something. Now there has been a lot of discussion and disagreement on exactly what that "something" should be. And a lot of documents written about it.

What is the problem?

  • Clear that "something" is complicated

  • We all agree that it’s worth writing about.

Note

But there are some sources of agreement at least. First, we all agree that the something is complicated, and we all agree that it is worth talking and writing about.

Story

  • From OBI

A supernatant role is a role which inheres in a material entity…
— OBI
A pellet is a material entity which results from the aggregation of cells…
— OBI
Note

As Christmas is approaching, we should start off with a story, a Christmas fable. A long, long time ago, I was working on an ontology called OBI at a meeting not far from here. We found these two definitions in OBI talking about the pellet and supernatant in a centrifuge tube.

Story

  • Pellet is a material entity?

  • Supernatant is a role?

  • Why the asymmetry?

  • Difficult to remember

Note

Now, why do we have this asymmetry in our definitions? It seemed strange and non-obvious, and after five minutes of discussion we all agreed that it was wrong. Till I remembered having exactly the same discussion six months before. And it’s because the supernatant is a liquid; once you pour it off, it’s not really a supernatant any more.

This was difficult to remember. And, while the documentation defined the terms, there was no justification.

Conclusions

  • Ontologies need a lot of description.

  • OWL doesn’t provide rich support for documentation

  • Annotations are un-ordered

  • Protege treats documentation as an add-on.

Note

My conclusion is this. Ontologies need a lot of description. But there is a problem: OWL does not really support this. Annotations are un-ordered, for instance, and Protege treats documentation as an add-on. We can’t add sections, bibliographic references and so forth.

Literate Programming

  • Some programs require lots of documentation

  • "Literate Programming"

  • Single source file, generate out programmatic and documentation source

  • Best Examples: LaTeX and Sweave

images/tangle.png
Note

Some programs also require a lot of documentation, so this is not a new issue. In fact, this problem gave rise to the idea of literate programming; the overall workflow for a literate program looks like this. We edit a single source file which contains both the code for execution and the text that is turned into documentation. Examples of this include LaTeX (which is documented in LaTeX, which is as confusing as it sounds) and Sweave, which combines LaTeX and R to generate figures for a paper.
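The workflow above can be sketched in a few lines of Python. This is only a toy, not how LaTeX or Sweave actually do it: the `<<code>>` and `<<end>>` chunk markers and the function names `split_chunks`, `tangle` and `weave` are all made up for illustration.

```python
# Hypothetical literate source: prose lines plus code chunks fenced by
# "<<code>>" and "<<end>>" markers (an invented convention for this sketch).
SOURCE = """\
A pellet is the material left at the bottom of the tube.
<<code>>
pellet = "material entity"
<<end>>
The supernatant is only a supernatant while it sits above the pellet.
<<code>>
supernatant = "role"
<<end>>
"""


def split_chunks(source):
    """Return a list of (kind, text) pairs, where kind is 'doc' or 'code'."""
    chunks = []
    kind = "doc"
    buffer = []
    for line in source.splitlines():
        if line == "<<code>>":
            # Close the prose chunk collected so far and start a code chunk.
            if buffer:
                chunks.append((kind, "\n".join(buffer)))
            kind, buffer = "code", []
        elif line == "<<end>>":
            # Close the code chunk and go back to prose.
            chunks.append((kind, "\n".join(buffer)))
            kind, buffer = "doc", []
        else:
            buffer.append(line)
    if buffer:
        chunks.append((kind, "\n".join(buffer)))
    return chunks


def tangle(source):
    """Extract only the code chunks: the programmatic output."""
    return "\n".join(text for kind, text in split_chunks(source) if kind == "code")


def weave(source):
    """Keep prose as-is and indent code chunks: the documentation output."""
    out = []
    for kind, text in split_chunks(source):
        if kind == "code":
            out.append("\n".join("    " + line for line in text.splitlines()))
        else:
            out.append(text)
    return "\n".join(out)


if __name__ == "__main__":
    print(tangle(SOURCE))
    print("-----")
    print(weave(SOURCE))
```

The point of the design is that there is exactly one file to edit; both outputs are derived from it, so the justification and the code cannot drift apart.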

Can we do Literate Ontology?

Note

This seemed like a good idea, so I thought that I would try this with ontologies. I tried a variety of solutions over the years, which you can read about on my blog. All of them were based around OWL Manchester syntax, which is a clean syntax for editing.

My conclusions are these:

Conclusion

  • No one is going to write ontologies this way

  • Writing ontologies is hard

  • Only a crazy person would write an ontology in a flat-file

Note

Basically, no one is going to write an ontology in a flat file. It’s hard enough as it is, without having to get everything right in some obscure syntax. Only a crazy person would write an ontology in a flat file.

Conclusion

  • No one is going to write ontologies this way

  • Writing ontologies is hard

  • Only a crazy person would write an ontology in a flat-file

images/midori_harris.jpg images/mike_ashburner.jpg

  • Even GO has mostly stopped doing this now

  • But what about karyotypes?

Note

Here are two crazy people — in fact, many early versions of GO were written in a flat file, and there are some advantages to it. But, largely, even they have stopped now.

And that is what I thought, until we got to karyotypes.

A diversion: Karyotypes and Tawny-OWL

  • Humans (normally) have 46 chromosomes

  • Lost chromosomes or bits of chromosome are bad

  • Describing this ontologically seems useful

images/karyogram.png
Note

So, what is a karyotype? It’s a description of all the chromosomes, normally in a human. It’s clinically important because losing part or all of a chromosome is generally bad. We want to describe this ontologically.

A diversion: Karyotypes and Tawny-OWL

  • Each of the 23 pairs has bands (visible structures)

  • That’s 1000 classes all very similar

  • Protege is not going to work

images/ChromXISCN09.jpg
Note

There are lots of chromosomes and lots of bands. It’s complicated, and repetitive. In the end, we realised we had to program it.
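To give a feel for what "program it" means, here is a rough Python sketch that generates one class frame per band in a Manchester-style syntax. It is not Tawny-OWL and not the real karyotype ontology: the class names, the helper functions and the band labels are all hypothetical placeholders, and the band list is deliberately not real ISCN data.

```python
# The 24 distinct human chromosomes: 22 autosomes plus X and Y.
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y"]

# Placeholder band labels per arm. These are NOT real ISCN band names;
# a real ontology would read the band list for each chromosome from data.
PLACEHOLDER_BANDS = {"p": ["A", "B"], "q": ["A", "B", "C"]}


def band_class(chromosome, arm, band):
    """Return one class frame as text (hypothetical naming scheme)."""
    parent = f"HumanChromosome{chromosome}Band"
    name = f"{parent}{arm}{band}"
    return f"Class: {name}\n    SubClassOf: {parent}"


def all_band_classes():
    """Generate a frame for every chromosome, arm and placeholder band."""
    frames = []
    for chromosome in CHROMOSOMES:
        for arm, bands in PLACEHOLDER_BANDS.items():
            for band in bands:
                frames.append(band_class(chromosome, arm, band))
    return frames


if __name__ == "__main__":
    frames = all_band_classes()
    print(frames[0])
    print(f"{len(frames)} classes generated")
```

Three nested loops replace what would otherwise be many repetitive clicks in an editor, and a change to the pattern is made once, in `band_class`, rather than in every class.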

Tawny-OWL

  • Interactive environment, create new ontologies, classes

  • Built on Clojure, the JVM, OWL API and some reasoners!