Highly Literate Ontology
-
What is an ontology and what is it for
-
The need for description
-
Ontologies and Programming
-
Literate Ontologies
-
Lenticular Text
-
The Future
Note
|
Introduction. Today is going to be a bit of a story, talking about the need for description in ontologies. I want to talk about a new paradigm for ontology development which enables this, and two pieces of technology that implement it. There is a fair bit of technology involved, so breathe deeply, as parts of this will be a whistle-stop tour. |
What is an ontology
-
A specification of a conceptualisation
-
A conceptualisation of something
-
Lots of disagreement on what "something" is
Note
|
An ontology is a specification of a conceptualisation. Clearly it’s a conceptualisation of something. Now there has been a lot of discussion and disagreement on exactly what that "something" should be. And a lot of documents written about it. |
What is the problem?
-
Clear that "something" is complicated
-
We all agree that it’s worth writing about.
Note
|
But there are some sources of agreement at least. First, we all agree that the something is complicated, and we all agree that it is worth talking and writing about. |
Story
-
From OBI
Note
|
As Christmas is approaching, we should start off with a story, a Christmas fable. A long, long time ago, I was working on an ontology called OBI at a meeting not far from here. We found these two definitions in OBI talking about the pellet and supernatant in a centrifuge tube. |
Story
-
Pellet is a material entity?
-
Supernatant is a role?
-
Why the asymmetry?
-
Difficult to remember
Note
|
Now, why do we have this asymmetry in our definitions? It seemed strange and unobvious, and after five minutes of discussion we all agreed that it was wrong. Till I remembered having exactly the same discussion six months before. And it’s because the supernatant is a liquid; once you pour it off, it’s not really a supernatant any more. This was difficult to remember. And, while the documentation defined the terms, it gave no justification for them. |
Conclusions
-
Ontologies need a lot of description.
-
OWL doesn’t provide rich support for documentation
-
Annotations are un-ordered
-
Protege treats documentation as an add-on.
Note
|
My conclusion is this. Ontologies need a lot of description. But, there is a problem. OWL does not really support this. Annotations are un-ordered, for instance, and Protege treats documentation as an add-on. We can’t add sections, bibliographic references and so forth. |
Literate Programming
-
Some programs require lots of documentation
-
"Literate Programming"
-
Single source file, generating both program and documentation source
-
Best Examples: LaTeX and Sweave
Note
|
Some programs also require a lot of documentation, so this is not a new issue. In fact, this problem gave rise to the idea of literate programming; the overall workflow for a literate program looks like this. We edit a single source file which contains both code for execution and text to be turned into documentation. Examples of this include LaTeX (which documents LaTeX, which is as confusing as it sounds) and Sweave, which combines LaTeX and R to generate figures for a paper. |
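The single-source workflow can be illustrated with a minimal sketch. It assumes a hypothetical convention in which code chunks sit between `<<code>>` and `<<end>>` marker lines; this is not the syntax used by LaTeX or Sweave, and the function names `tangle` and `weave` are borrowed from the general literate programming vocabulary rather than from any tool mentioned here.

```python
# Minimal sketch of a literate source: prose and code live in one file.
# Assumption: the "<<code>>" / "<<end>>" markers are a made-up convention
# for illustration only, not the syntax of LaTeX or Sweave.

SOURCE = """\
A pellet is the solid material at the bottom of the tube.
<<code>>
pellet = "material entity"
<<end>>
A supernatant is the liquid above it.
<<code>>
supernatant = "role"
<<end>>
"""


def split_chunks(source):
    """Return a list of (kind, line) pairs, where kind is 'doc' or 'code'."""
    chunks = []
    in_code = False
    for line in source.splitlines():
        if line == "<<code>>":
            in_code = True
        elif line == "<<end>>":
            in_code = False
        else:
            chunks.append(("code" if in_code else "doc", line))
    return chunks


def tangle(source):
    """Extract only the code lines, ready for execution."""
    return "\n".join(line for kind, line in split_chunks(source) if kind == "code")


def weave(source):
    """Produce documentation: prose as-is, code indented as a literal block."""
    return "\n".join(
        ("    " + line) if kind == "code" else line
        for kind, line in split_chunks(source)
    )
```

Both outputs are derived from the same `SOURCE`, so the prose and the code cannot drift apart in separate files.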
Can we do Literate Ontology?
-
Ontologies need more documentation than most programs
-
Literate Ontology sounds like a good idea.
-
I’ve tried a variety of solutions.
-
All based around OWL Manchester Syntax
Note
|
This seemed like a good idea, so I thought that I would try this with ontologies. I tried a variety of solutions over the years which you can read about on my blog here. All of them were based around OWL Manchester syntax which is a clean syntax for editing. My conclusion is this: |
Conclusion
-
No one is going to write ontologies this way
-
Writing ontologies is hard
-
Only a crazy person would write an ontology in a flat-file
Note
|
Basically, no one is going to write an ontology in a flat file. It’s hard enough as it is, without having to get everything right in some obscure syntax. Only a crazy person would write an ontology in a flat file. |
Conclusion
-
No one is going to write ontologies this way
-
Writing ontologies is hard
-
Only a crazy person would write an ontology in a flat-file
-
Even GO has mostly stopped doing this now
-
But what about karyotypes?
Note
|
Here are two crazy people — in fact, many early versions of GO were written in a flat file, and there are some advantages to it. But, largely, even they have stopped now. And that is what I thought, until we got to karyotypes. |
A diversion: Karyotypes and Tawny-OWL
-
Humans (normally) have 46 chromosomes
-
Lost chromosomes or bits of chromosome are bad
-
Describing this ontologically seems useful
Note
|
So, what is a karyotype? It’s a description of all the chromosomes, normally in a human. It’s clinically important because losing bits or all of a chromosome is generally bad. We want to describe this ontologically. |
A diversion: Karyotypes and Tawny-OWL
-
Each of the 23 pairs has bands (visible structures)
-
That’s 1000 classes all very similar
-
Protege is not going to work
Note
|
There are lots of chromosomes and lots of bands. It’s complicated, and repetitive. In the end, we realised we had to program it. |
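To give a feel for why programming helps here, the following is a rough sketch of generating many near-identical class declarations in a loop. It emits OWL Manchester syntax as plain strings and is not the Tawny-OWL API. The class naming scheme, the `isBandOf` property and the number of bands per arm are all invented for illustration; real band names are more irregular and vary by chromosome.

```python
# Sketch: generate many near-identical class frames programmatically.
# Assumptions: class names, the "isBandOf" property and the band count
# per arm are hypothetical; real chromosome bands are not this regular.

CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y"]


def band_classes(chromosome, bands_per_arm=2):
    """Yield Manchester syntax frames for one chromosome and its bands."""
    parent = f"HumanChromosome{chromosome}"
    yield f"Class: {parent}\n    SubClassOf: HumanChromosome"
    for arm in ("p", "q"):  # short arm and long arm
        for index in range(1, bands_per_arm + 1):
            yield (
                f"Class: {parent}Band{arm}{index}\n"
                f"    SubClassOf: HumanChromosomeBand, isBandOf some {parent}"
            )


def karyotype_frames(bands_per_arm=2):
    """Collect the frames for every chromosome into one list."""
    frames = []
    for chromosome in CHROMOSOMES:
        frames.extend(band_classes(chromosome, bands_per_arm))
    return frames
```

With `bands_per_arm=2` this yields 120 frames from a dozen lines of logic; raising the band count scales the output without any extra hand editing, which is the point of programming the ontology rather than clicking it together.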
Tawny-OWL
-
Interactive environment, create new ontologies, classes
-
Built on Clojure, the JVM, OWL API and some reasoners!
Note
|
And, so I built Tawny-OWL — and some of you will have been at the tutorial yesterday. |
Tawny-OWL
Allows us to repurpose software engineering tools.
-
version control
-
commit discussions
-
pull management
-
unit testing
-
continuous integration
-
collaborative editing
-
Integrated Development Environment
-
Power Editor
Note
|
We have all of these capabilities. WebProtege adds many of these, but they have been explicitly written and coded for WebProtege. We use off-the-shelf tooling, maintained by other people. And generally it’s very good. |
Literate Ontology
-
We can write ontologies in text, using a rich programming environment
-
We can write documentation in text, also with a rich markup
Note
|
So, this gives us a powerful environment, and one that is competitive with, although very different from, Protege. But it also means that we can define our ontologies in a flat-file, using a rich programming environment. And, of course, we can also write documentation in text, either LaTeX or any of the more recent markup languages (markdown!). |
Problem
-
Do we use a rich environment for programming?
-
Eclipse, Emacs CIDER, Intellij
-
-
Do we use a rich environment for documentation?
-
Scientific Word, LyX, Emacs auctex
-
Note
|
But, there is still a problem. We can use a rich environment for programming. We can use Eclipse, or Emacs or Intellij. Or we can use a rich environment for documentation, tools like Scientific Word, LyX or auctex. But we have to choose. And, if we choose one, the other will suffer. |
Solution: Lenticular Text
-
Invented the notion of "Lenticular Text"
-
After "Lenticular Printing"
-
Text which changes depending on the way you look at it
-
Model-View-Controller for Text
Note
|
I worried about this for a while, so I invented a new notion of lenticular text. This is a little bit hard to describe, so I will show some pictures in a minute. The name comes from "lenticular printing", those images that change depending on the angle you look at them from. Same idea: I wanted text which changes depending on your point of view. Or, for those of you from a software engineering background, you can think of this as model-view-controller but for text. |
Solution: Lenticular Text
-
View either representation
-
Edit either in a rich environment
Note
|
We should be able to have two representations, linked together. Then we can edit either and still use our downstream tools, like tawny-owl and asciidoc, to operate on it. |
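The idea of two linked representations can be sketched as a pair of line-by-line transformations that are inverses of each other. This is only an illustration under stated assumptions: the `[source]` and `[end]` delimiters and the `;; ` comment prefix are a made-up convention, the embedded `(defclass Supernatant)` line is just sample content, and none of this is the actual transformation performed by any editor package.

```python
# Sketch of two linked views of one text.
# Assumptions: the delimiters and the ";; " prefix are an illustrative
# convention, not the algorithm of any real tool.

BEGIN, END = "[source]", "[end]"  # hypothetical code-block delimiters
PREFIX = ";; "                    # Lisp-style line comment prefix


def to_code_view(doc_text):
    """Documentation view -> code view: comment out everything except code."""
    out, in_code = [], False
    for line in doc_text.splitlines():
        if line == BEGIN:
            out.append(PREFIX + line)
            in_code = True
        elif line == END:
            out.append(PREFIX + line)
            in_code = False
        elif in_code:
            out.append(line)           # code stays untouched
        else:
            out.append(PREFIX + line)  # prose becomes a comment
    return "\n".join(out)


def to_doc_view(code_text):
    """Code view -> documentation view: uncomment everything outside code."""
    out, in_code = [], False
    for line in code_text.splitlines():
        if line == PREFIX + BEGIN:
            out.append(BEGIN)
            in_code = True
        elif line == PREFIX + END:
            out.append(END)
            in_code = False
        elif in_code:
            out.append(line)
        else:
            out.append(line[len(PREFIX):] if line.startswith(PREFIX) else line)
    return "\n".join(out)


DOC = "\n".join([
    "The supernatant is a role, not a material entity.",
    BEGIN,
    "(defclass Supernatant)",
    END,
    "Once poured off, the liquid is no longer a supernatant.",
])
```

Because `to_doc_view(to_code_view(DOC))` returns the original text, an edit made in either view can be carried across to the other, and each view remains a valid input for its own tooling: the code view is all code and comments, the documentation view is all prose and delimited blocks.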
Implementing Lenticular Text
-
Implemented this with Emacs "lentic" package
-
Why Emacs?
-
Strong support for Clojure
-
Strong support for several markup languages (latex, asciidoc, markdown).
-
Note
|
Now, to achieve this, we have to build the tool into an editor. I chose Emacs for a variety of reasons: it has strong support for Clojure, and also for writing documentation in LaTeX or other markup languages. |
Implementing Lenticular Text
-
Implementation is for Emacs
-
Algorithms are portable
-
Solution is very low-level
-
So, (mostly) neutral to text
-
Currently works for
-
Clojure, Emacs-lisp, Scala, Haskell, Python
-
LaTeX, Org-Mode, Asciidoc
-
-
New languages can be added
Note
|
While the implementation is for Emacs, the algorithms are portable. It plugs in at a fairly low level, so it’s mostly neutral to the text and can provide lenticular views for many different syntaxes. Currently we have this lot, in any combination. We can even do several languages at once. None of the tools in use have to be modified at all. |
What it looks like
-
Harder to explain than show
-
In practice, works straightforwardly