Introduction
-
Tawny-OWL is a fully programmatic environment for ontology development
-
Presents a different model of ontology development
-
Very powerful
-
Allows reuse of commodity tooling
Today
-
Some taught material
-
Some "follow-by-leader"
-
Plenty of time for questions
-
A flexible tail
Note
|
Today, we will have some taught material, particularly at the beginning and the end, where I need to describe parts of the system, and some practical time where we build a small ontology using tawny. Then I have a reasonably flexible tail where I try to address questions that people have. There is also a degree of guesswork about how much material I can fit into the time available, which the flexible tail helps to address. I do not have time to describe all of the features of tawny, nor to give a full tutorial. |
Today
-
Attempted to make decoupled
-
Aware that people will come in and out
-
Probably not entirely successful at this.
-
Gets more programmatic as we go on
-
Will not explain all of the programming
Note
|
Although the tutorial does "build", I have tried to make it as loosely coupled as possible, although, truth be told, I have probably not been entirely successful at this. The latter half requires more programming knowledge, and there is probably little that I can do about this. I will skip over what some of it means, in the hope that those who care can go back later and look it up. I do not want this to become a Clojure tutorial. |
Knowledge Prerequisites
-
Required
-
Basic knowledge of ontologies
-
Basic knowledge of OWL
-
Basic knowledge of biology (amino-acids!)
-
For last 1/3: Basic Programming Knowledge
-
-
Not Required
-
Knowledge of Clojure
-
Knowledge of OWL API
-
Knowledge of Tawny-OWL
-
Outcomes
-
Understand the motivation behind Tawny
-
Understand and use basic Clojure infrastructure
-
Build a basic ontology with Tawny
-
Understanding pattern usage within Tawny
-
Implementing a pattern
-
Extending Tawny locally
Secondary Outcomes
-
Understand how to repurpose software tooling for ontologies
Notes on Notes
-
All materials are available
-
These slides (2015_lisbon.html)
-
As a book with lecture notes (2015_lisbon_book.html)
-
Same thing as PDF (but less well checked!).
Motivation
-
Tawny-OWL was initially an accidental tool
-
We did not start off writing a new environment
-
We discovered we needed it while trying to build an ontology
Use Case (chromosomes)
-
The human karyotype is complex to describe
-
Alterations more so
Note
|
We didn’t start out to develop a new ontology engineering tool. It happened along the way, as we tried to address a specific use case, which was modelling human karyotypes. |
Use Case (karyotypes)
-
The current representation comes from ISCN
-
Written in a book, no computational representation.
-
47,XXY
-
46,Xc,+21
-
46,XY,+21c,-21
-
45,XY,-10,der(10)t(10;17)(q22;p12)
-
46,XY,der(7)t(2;7)(q21;q22)ins(7;?)(q22;?)
-
46,XX,der(8)t(8;17)(p23;q21)inv(8)(p22q13)t(8;22)(q22;q12)
-
46,XX,der(9)del(9)(p12)t(9;22)(q34;q11.2),der(9)t(9;12)(p13;q22)inv(9)(q13q22)
-
An ontological representation seems like a nice idea
The current representation goes back to the days of typewriters. The specification is not machine interpretable, and neither are the ISCN strings themselves. In fact, you cannot even represent them fully in ASCII because they use meaningful underlines.
A partonomy?
-
What you see is what you get?
-
There are 23 chromosomes
-
Around 1000 bands, at different resolution levels.
Protege
-
Could do this in protege
-
Technically, it scales well to an ontology of this size
Note
|
Technically, 1000 terms is not a problem for protege; it can scale to this size (or, indeed, considerably larger) with relative ease. |
Protege
-
But the user interface does not
-
Generating many similar classes is painful
-
Hard to know how an axiomatisation will perform at the start
-
Changing them afterwards even worse
Note
|
But the UI doesn’t scale in this way. It involves an awful lot of clicking: one report I have heard suggests that protege users spend up to 50% of their time expanding and closing the hierarchy. With the karyotype ontology this problem would be profound. Worse, with the karyotype ontology we have a specific computational use in mind, and we don’t know what the performance is going to be like, since reasoners can change performance quite a lot with different axiomatisations. |
Protege
-
But the user interface does not
-
We end up more like this.
Note
|
In practice, we are more likely to end up like this; 1000 classes is an awful lot of clicking, particularly when many of the classes are very similar. |
Can we do this programmatically?
-
Yes, but painfully
-
OWL API — used by many, including Protege 4
-
Java and the OWL API are long-winded
-
Compile-Code-Test cycle!
OWLClass clsA = df.getOWLClass(IRI.create(pizza_iri + "#A"));
OWLClass clsB = df.getOWLClass(IRI.create(pizza_iri + "#B"));
// Now create the axiom
OWLAxiom axiom = df.getOWLSubClassOfAxiom(clsA, clsB);
// add the axiom to the ontology.
AddAxiom addAxiom = new AddAxiom(o, axiom);
// We now use the manager to apply the change
m.applyChange(addAxiom);
Note
|
The main API out there is the OWL API. It’s nice, but it is long-winded and difficult, because of the complexity of the type system needed for OWL (from the javadoc it is hard to work out which methods can be invoked on which type), the change object system (so, you can use AddAxiom to add an annotation to an ontology, but only if you don’t care about it working), and the factory layer. All complex. |
Brain
-
Written by Samuel Croset, EBI
-
EL only
-
Compile/run cycle
-
How does this fit with Java’s OO?
Brain brain = new Brain();
//Add the OWL classes
brain.addClass("Nucleus");
brain.addClass("Cell");
//Add the OWL object property
brain.addObjectProperty("part-of");
//Assert the axiom
brain.subClassOf("Nucleus", "part-of some Cell");
Note
|
Another option, written by Samuel Croset, is Brain. It is much lighter weight than the OWL API. But it is EL only, and it is unclear how to marry what is essentially a script with Java’s OO design. And we still keep the essential characteristics of Java. Any change requires a recompile and restart: it is slow. |
The Paragon
-
R provides an interactive, exploratory environment for stats
-
Command line shell, wrapped by several GUIs
-
Language is convenient to type and use
-
It’s not all good!
-
The syntax can be bizarre
-
The language semantics are strange
Note
|
So, my paragon here is R, the statistical language. It is interactive and convenient to use. It can be used cleanly in batch. In general, very nice. That is not to say that I want to copy all of its features, though. |
Constraints
-
Simple to do (structurally) simple ontologies
-
OWL API — too much code to rewrite
-
Java (JVM) — because of the OWL API
-
Pre-existing development tooling
Note
|
These are the limitations that we had to live within. Most importantly of all, I wanted it to be as simple as possible to build structurally simple ontologies. It should largely be possible to type and write ontologies without feeling that you are programming. It was going to be written using the OWL API because there is too much code there to rewrite, and no one would trust me to do that in a standards compliant way. This required the use of the JVM. And I wanted access to pre-existing development tooling. I did not want to build a complete development environment, I needed something off-the-shelf, so that it was good. |
Karyotype Ontology
-
What have we achieved?
-
Built by Dr Jennifer Warrender
-
Around 1000 classes in the karyotype ontology
-
Similar numbers of tests, structural and reasoner based
-
Models 10 events, with patterns for downstream use
-
Multiple levels of ploidy
-
Performance tested axiomatisation
Note
|
Before I move on to tawny-owl, what did we achieve with the karyotype ontology? Well, I think quite a lot. We now have a large, consistent (in both the formal and informal sense of the word) ontology that describes most levels of the ISCN. |
Tawny-OWL Key Features
-
Now we move onto Tawny-OWL features
-
Ontology building tool
-
Unprogrammatic syntax, minimal baggage
-
Evaluative
-
Broadcasting
-
Patternized
-
Fully Extensible
-
Integrated Reasoning
-
Built on commodity language
-
Access to a full programming tool chain
Note
|
Next, we move onto a formal walk-through of tawny-owl features. In this section I do not intend to describe all of the features in detail, but to give an overview, so that you will know what is coming up. This is a literately programmed document, so we start with a namespace, but I have hidden this from the slides because we do not need so much baggage so early on. It will be introduced in detail later. |
(ns lisbon.features (:use [tawny owl reasoner]))
Ontology Building Tool
-
Tawny-OWL is an Ontology Building Tool
-
"A Textual User Interface"
-
Usable as an API.
-
But not designed as an API
Note
|
For those of you from a functional programming background, tawny is not very functional. It is usable as an API for manipulating ontologies, but was not really designed for that purpose. Still, it is no worse for this than the OWL API. |
Unprogrammatic Syntax
-
Both OWL API and Brain carry Java baggage
-
You can never forget you are programming
-
Tawny-OWL aimed to avoid this
(defontology o)
Note
|
We have aimed, as far as possible, to make it simple to define simple ontologies in tawny. So this statement, for instance, defines a new ontology. Of course, our choice of programming language has implications, and the parentheses are the most obvious one to anyone from a lisp background. |
Unprogrammatic Syntax
-
There are many OWL syntaxes to choose from
-
Functional, Concrete, XML.
-
Or RDF (RDF/XML, Turtle, N3)
-
Manchester (OMN) syntax designed for typing
-
Frame based, rather than axiom based
Class: o:A
    Annotations:
        rdfs:label "A"@en,
        rdfs:comment "A is a kind of thing."@en
Note
|
I did not want to create my own syntax because there are really far too many of these already. One which was built for the specific use case of typing was Manchester syntax (also known as "OWL Manchester Notation" or OMN). It is a relatively clean syntax, and can be used to define new classes easily. In my very early work on the karyotype ontology, I even considered writing an OMN environment, but OMN is not that easy to deal with programmatically, so I dropped the idea. |
Unprogrammatic Syntax
-
Tawny-OWL modelled on OMN
-
Modified to use lisp/clojure syntax
-
Entities need parens
-
No longer need commas
-
Blocks are explicit, so easier to parse
(defclass A :label "A" :comment "A is a kind of thing.")
Note
|
This is the equivalent in tawny-owl. Some things are easier, and some are harder. |
Unprogrammatic Syntax
-
Frame names use :colon and not colon:
-
Just for fit with lisp
-
Some names have changed: :super rather than SubClassOf:
-
Consistent with property (:super not :SubPropertyOf)
-
OMN is wrong anyway
-
-
Some new "convenience" frames
-
And sub, meaning ontologies can be built bottom-up or top-down.
-
Easy creation of new entities
(defclass B :super A :label "B")
Note
|
Because it is a programming language rather than a format, it is relatively easy to add new features with clearly defined semantics. So, for example, I wanted to add a "sub" keyword so that I could build ontologies bottom up. In practice, so far I have not really used this, but I do not feel that the syntax should dictate the ontology development style. |
Unprogrammatic Syntax
-
Other Differences from OMN:
-
Two Comment syntaxes
-
Explicit creation of new entities
-
Define before use optional
-
-
Same syntax for patterns
-
GCI fully supported
-
The parser works
Note
|
There are some other differences. Firstly, tawny has (two) comment syntaxes. OMN is commentable too, but the parser doesn’t always work. The other major one is the use of an "explicit definition" semantics: classes must be defined before they are used by other classes. This is a semantics shared with Brain, and was chosen deliberately. My early experience with OMN showed that it was too easy to make typos without it. It is possible to avoid this if you wish. General Concept Inclusions are fully supported, which OMN doesn’t do. We can also build patterns in the same syntax and files as the ontology. You will see many examples of this through the tutorial. |
Evaluative
-
Tawny-OWL is "evaluative"
-
Add new classes, new properties, new frames on-the-fly
-
Redefine patterns
-
Add new tests, and rerun
-
There is no compile cycle
Note
|
We can type, change and add new entities and reprogram things as we go. For the non-programmers this is likely to be so obvious that you cannot see why I am describing it, but for programmers from a Java background it is hard to overestimate what a massive difference this makes to development styles. There is no compile cycle: you can change things as you go, and you do not need to continually restart your application. This is not entirely true; you do need to restart, but not that often. |
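The evaluative workflow can be sketched in plain Clojure, without any ontology at all. The function name `greeting` is made up for illustration; the point is only that re-evaluating a form replaces the old definition in the running session.

```clojure
;; First version, evaluated at the REPL.
(defn greeting [who]
  (str "Hello, " who))

(greeting "World")
;; => "Hello, World"

;; Change your mind and re-evaluate: the new definition replaces the old
;; one immediately, with no separate compile step and no restart.
(defn greeting [who]
  (str "Good morning, " who))

(greeting "World")
;; => "Good morning, World"
```

The same applies to tawny forms: re-evaluating a definition updates it in place.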
Compiled
-
There is a compile cycle
-
But you won’t notice it
-
Except that Tawny is performant
-
Tawnyized version of GO loads ~1 min.
Note
|
Actually, there is a compile cycle. Those of you from a programming background may be thinking "hmm, an interpreted language running on top of a JVM". Actually, clojure statements are compiled to bytecode on-the-fly, then run directly on the JVM (which in turn will JIT compile them). So, it’s fast enough. |
Broadcasting
-
An idea borrowed from R
-
R is very flexible with numbers and lists
-
Adding a number to a list adds the number to every element of the list
> c(1,2,3) + 4
[1] 5 6 7
Note
|
Broadcasting is a really very handy feature from R. You do not have to explicitly deal with the lists and numbers (ontology entities in the case of tawny). |
Broadcasting
-
Tawny-OWL does something similar
(defoproperty r) (defclass C :super (owl-some r A B))
-
C some r A
-
C some r B
Class: o:C
    SubClassOf:
        o:r some o:B,
        o:r some o:A
Note
|
One statement in tawny unwinds to two in OMN. Two calls to the OWL API also. It is also one of those things that makes tawny less like an API and more like a textual UI. Although this is reasonably efficiently implemented, it does have a performance cost — more than made up for in saved typing for ontology developers. |
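To make the idea concrete, here is a minimal sketch of broadcasting in plain Clojure. It does not use Tawny-OWL: `broadcast-some` is a hypothetical function, and restrictions are modelled as plain vectors rather than OWL API objects, so this is not how `owl-some` is actually implemented.

```clojure
;; Hypothetical model of broadcasting: one call yields one existential
;; restriction per filler. Restrictions are just vectors here.
(defn broadcast-some
  [property & fillers]
  (mapv (fn [filler] [:some property filler]) fillers))

(broadcast-some 'r 'A 'B)
;; => [[:some r A] [:some r B]]
```

A single call with two fillers produces two restrictions, which mirrors the two OMN statements above.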
Patternized
-
Tawny-OWL allows patterns
-
Broadcasting works naturally with patterns
-
some-only the most common
(defclass D :super (some-only r A B))
-
Expands into three (or n+1, for n fillers) axioms
Class: o:D
    SubClassOf:
        o:r some o:A,
        o:r some o:B,
        o:r only (o:A or o:B)
Note
|
This is the first example of a pattern that we see (although broadcasting is a pattern also, in a sense). This is the "some-only" pattern, which is so common that it is often not seen as a pattern. It was also the motivation for broadcasting, as some-only makes little sense without broadcasting, although it might not be immediately obvious why this is the case. |
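The expansion can be sketched in plain Clojure as data. This is only a model: `some-only-sketch` is a hypothetical name, restrictions are plain vectors, and tawny's real `some-only` builds OWL API objects instead.

```clojure
;; Hypothetical model of the some-only pattern: one existential
;; restriction per filler, plus a single universal restriction over
;; the union of all the fillers.
(defn some-only-sketch
  [property & fillers]
  (conj (mapv (fn [filler] [:some property filler]) fillers)
        [:only property (into [:or] fillers)]))

(some-only-sketch 'r 'A 'B)
;; => [[:some r A] [:some r B] [:only r [:or A B]]]
```

It also hints at why broadcasting matters here: the universal restriction needs to see all the fillers at once, so the pattern has to accept them in a single call.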
Single syntax Extensible
-
Tawny-OWL is implemented in Clojure
-
Tawny-OWL patterns are implemented in Clojure
-
Tawny-OWL Ontologies are written in Clojure
-
Therefore, adding new patterns is trivial
-
Here we introduce two new patterns and use them
(defn and-not [a b]
  (owl-and a (owl-not b)))

(defn some-and-not [r a b]
  (owl-some r (and-not a b)))

(defclass E
  :super (some-and-not r A B))
-
Which gives
Class: o:E
    SubClassOf:
        o:r some (o:A and (not (o:B)))
Note
|
In theory, tawny does nothing that it is not possible to do already. But the single syntax and environment is important. I can easily add new syntax even for a specific ontology. Doing this where half the ontology is built in protege and half outside is just intractable. With a single syntax it becomes so easy that it happens all the time. |
Reasoned Over
-
Tawny fully supports reasoning
-
In this case using hermit
-
Based on all the examples given so far, F has three subclasses
(defclass F
  :equivalent (owl-some r (owl-or A B)))

;; #{}
(subclasses F)

(reasoner-factory :hermit)

;; #{C D E}
(isubclasses F)
Commodity Language
-
Built on Commodity Language
-
Full access to all APIs
-
Serialisation
-
Spreadsheet reading
-
Database access
-
Networking
-
Logic Programming
-
Test Library
-
Statistics and Plotting
-
Benchmarking
-
Note
|
Most of this I am only going to show implicitly, but tawny is based on a commodity language, so it has access to many APIs which can do useful things for you. We have used quite a few of these either in the context of programming tawny or in developing OWL ontologies using tawny (in fact all of those given here). The key point to remember here is that programming tawny and developing ontologies are not disjoint. You have the same power in using tawny as we do in developing it. |
Commodity Toolchain
-
Editing
-
IDEs: Eclipse, IntelliJ, Netbeans
-
Power Editors: Emacs, Vim, Sublime
-
Web Editors: Catnip, Gorilla REPL
-
Novel: LightTable
-
-
Version Control: Any
-
Build and Dependency
-
Lein, Maven or Boot
-
-
Testing
-
Travis-CI, or any CI environment
-
-
Linters, Rewriters, Remote Evaluation (nREPL)
Note
|
Finally, we have full access to a rich tool chain, including a wide range of IDEs, power editors or web editors, as well as some very novel environments (take a look at LightTable, implemented in Clojure and supporting it first). We also make extensive use of version control; we’ve been using git, but you can use whatever you want. You can integrate your ontology development process and software development process. For dependencies, we’ll see later how to access ontologies using the maven dependency management system. Someone else can host your ontology without having to use their URIs! Testing, continuous integration. Remote evaluation (actually, you will use this all the time even if it seems you are not). Linters. Rewriters. We have only just got started with these. |
Tawny Tutorial
-
The source for these slides
-
All the sample code
-
And an empty workspace
-
git clone https://github.com/phillord/tawny-tutorial.git
-
Or download https://github.com/phillord/tawny-tutorial/releases
Other Technical Pre-Requisites
-
JVM
-
Leiningen
-
Protege
Installing JVM
-
You will need Java 1.7 or upward
-
1.8 or upward is recommended
-
Or use your OS package manager
-
You need JDK, not just Java
Installing Leiningen
-
This is a Clojure build tool
-
Download the shell script (Linux,Mac) and run it!
-
Or your package manager (leiningen 2 — lein 1 will not work)
-
For Windows, the installer works
-
Or you can download the batch file and setup manually
Testing the Pre-Requisites
-
In the terminal/cmd, in tawny-tutorial directory
lein run
-
There will be a long download period then:
The Tawny Tutorial is Installed and Ready
Protege
-
Not really a pre-requisite
-
One part will use it if available
Or: the alternative
-
Install some Clojure IDE/Environment
-
Emacs with CIDER is my environment of choice
-
Eclipse with Counter Clockwise.
-
VIM Fireplace
-
Any of the many other options
-
If you do this, you are on your own…
-
You will get a much stronger editing environment than otherwise
Paying the Piper
-
Tawny-OWL has many benefits, out-of-the-box
-
Comes with one significant cost
-
Requires a working Clojure environment
-
And an understanding of how to use it
Note
|
While Tawny-OWL does come with many significant benefits as an ontology development environment, it also comes with one significant cost. It requires the use of a Clojure development environment. In this short section, we will get "hello world" up and running. |
Clojure environments
-
IDEs provide a shrink-wrapped environment
-
But come with their own baggage
Note
|
I could have opted to use a shrink-wrapped IDE. But while these provide a very rich environment, they also come with an enormous amount of baggage, and give a very programmatic feel to the experience. They also tend to use a lot of resources, which might not be ideal on laptops. Moreover, the practical reality is that I do not use one myself and can only give partial help on them. |
Catnip
-
I’ve opted for "catnip"
-
A light and simple Clojure editor
-
Useful for tutorials!
Note
|
Instead I have opted for catnip. This is a small, light and simple clojure editor, which was primarily designed as a tutorial environment for Clojure. It is, itself, a Clojure application and works quite nicely. It is not, of course, a fully featured environment. If you want to develop seriously, then you’d need to move to something else. It does, however, reinforce the point that we are using a commodity language. There are many powerful tools around. |
Catnip
-
Assuming you have followed the pre-requisites, launch with:
lein edit
-
It should print
Catnip running on http://localhost:nnnn
-
And with luck a web browser will pop up.
Catnip
-
Open the file
src/lisbon/hello.clj
-
Save the file with Ctrl-S
-
This also compiles the file
-
Ctrl-R or click where it says lisbon.hello
-
Type hello
(ns lisbon.hello)
(def hello "hello world")
What have we achieved
-
We have defined a new namespace
-
Created a variable (hello)
-
Evaluated hello to find its value (prosaically "hello world")
-
This is the basic Clojure workflow
Task 1
Build a Hello World Ontology!
Ontological Hello World
-
We have a working Clojure "hello world"
-
Let’s build an ontological one
Ontological Hello World
-
For everything that follows you have:
-
An empty file for trying out the examples (src/lisbon/onto_hello.clj)
-
A full file with all the answers (src/lisbon/onto_hello_s.clj)
-
-
Better to copy and paste if you can!
Note
|
The slides and tutorial material that I am showing are all developed in files
with names (ns lisbon.onto-hello-s (:use [tawny.owl])) |
The namespace
-
Clojure has a namespace mechanism
-
The namespace is the same as the file name
-
But _ in file name is - in namespace
-
:use makes tawny.owl namespace available
(ns lisbon.onto-hello (:use [tawny.owl]))
Note
|
The Clojure namespace declaration is the only "programmatic" part of tawny that you have to see. Rather like Java, Clojure namespaces are consistent with the filename (although dashes are replaced with underscores for strange reasons). Most development environments will put this in for you. We also add a statement to say that we wish to "use" tawny.owl. This is a file local import, and it will occur a lot! |
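The namespace-to-file mapping can be written down as a small function. This is a sketch in plain Clojure: `ns->path` is a hypothetical helper, and the `src/` prefix is an assumption based on the file paths used in this tutorial.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: dashes in the namespace become underscores in
;; the file name, and dots become directory separators.
(defn ns->path
  [ns-name]
  (str "src/"
       (-> ns-name
           (str/replace "-" "_")
           (str/replace "." "/"))
       ".clj"))

(ns->path "lisbon.onto-hello")
;; => "src/lisbon/onto_hello.clj"
```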
Define an Ontology
-
Using the defontology statement
-
hello is a new variable
-
Can be used to refer to the ontology
-
Becomes "default" for this namespace
-
All frames are optional
-
We see more later
(defontology hello :iri "http://www.w3id.org/ontolink/tutorial/hello")
Note
|
Next we define a new ontology. It has a name that we can use to refer to it, which is a useful property as we shall see, although most of the time we do not have to. The name is also used as a prefix when saving the ontology (although this can be overridden), so I tend not to use hyphens, as they mess with syntax highlighting in my Manchester syntax viewer. |
Introduce a Class
-
Use a defclass statement
-
Statements are delineated with ()
-
Also called "form" or "sexp" (s-expression)
-
Hello is also a variable
-
Hello and hello are different
-
Cannot use the same name twice
-
We use the "default" ontology
(defclass Hello)
Note
|
Now we define a new class. As we can see, by comparing against the
One key point from a programming point-of-view, clojure is a lisp-1. All variables are in the same namespace, and that includes all the ontology entities that we might define. It’s easy to clobber one with another so be careful! |
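The clobbering risk is easy to demonstrate in plain Clojure, with no ontology involved. The definitions below are purely illustrative.

```clojure
;; A var holding a string...
(def hello "hello world")
(string? hello)
;; => true

;; ...is silently replaced when the same name is reused for a function,
;; because values and functions share a single namespace.
(defn hello [] "now I am a function")
(string? hello)
;; => false
(fn? hello)
;; => true
```

The same thing happens if an ontology entity and an ordinary variable share a name in one namespace: the later definition wins.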
Properties
-
Properties use defoproperty
-
o to distinguish from annotation and datatype properties
-
Both defclass and defoproperty take a number of frames
-
All are optional
(defoproperty hasObject) (defclass World)
Note
|
Properties use defoproperty. There are also annotation and datatype properties, and I have opted for somewhat opaque function names to avoid a large amount of typing. It is hard to know whether this was a great decision or not, but the alternative really seemed very unreadable to me. Moreover, there are only a few of these names, and OWL 3 is unlikely to come along any time soon. |
A complex class
-
Now we use a frame
-
:super says "everything that follows is a super class"
-
Or, HelloWorld has a super which is Hello
-
owl-some is the existential operator
-
The default (or only) operator in many ontologies
(defclass HelloWorld :super Hello (owl-some hasObject World))
Note
|
Finally, we bring in a frame. Frames are introduced using "keywords" — that is something beginning with a ":" — this is a fundamental part of lisp syntax (it is one of two ways of defining self-evaluating forms if you are interested!), which by good fortune is also very similar to the way that Manchester syntax does it. Hopefully, most people are familiar with the meaning of "some" in this context. |
On the use of "owl" (a quick diversion)
-
It is owl-some rather than some.
-
But it is only and not owl-only
-
This avoids a name clash with clojure.core
-
Have not gone the OWL API route
(owl-some hasObject World)
(only hasObject World)
Note
|
As we move through the material, you will notice the "owl" prefix popping up.
Although, clojure includes full namespace support clashes with the
|
On the use of "owl"
-
Full list of "owl" prefixed functions is
-
owl-and
-
owl-or
-
owl-not
-
owl-some
-
owl-class
-
owl-import
-
owl-comment
-
-
And the variables
-
owl-nothing
-
owl-thing
-
On the use of "owl"
-
There are also some "short-cuts"
-
&&, || and !
-
-
A "long-cut" (owl-only)
-
And consistent but clashing names in tawny.english.
Note
|
There are also options. We have symbol short-cuts which should be obvious to
programmers. We have an |
Save the Ontology
-
Either add this form to the file and Ctrl-S
-
Or just type into REPL (window with prompt)
-
Open o.omn either in Catnip, a text editor or Protege to check
(save-ontology "o.omn" :omn)
Note
|
To check my axiomatisation I often save my ontology in Manchester syntax. I normally use the same filename (which makes using git and .gitignore easier), and I have my editor auto-reload. We will be looking at how to build an auto-save function later. |
More Frames
-
Can add new frames to an existing definition with refine
-
Could also just change the defclass form and re-evaluate
-
:label and :comment add annotations using OWL built-ins
-
:annotation is a general purpose frame
-
We add a label in Portuguese
-
Re-save the ontology
(refine HelloWorld
  :label "Hello World"
  :comment "Hello World is a kind of greeting directed generally at everything."
  :annotation (label "Olá mundo" "pt"))
Note
|
Finally, let’s extend our definition somewhat. We could, of course, just
change the defclass statement, but I wanted to introduce the Tawny has a number of convenience frames that have no direct equivalent
in OMN. |
Task 1: Conclusions
-
Tawny-OWL uses frames
-
We look like Manchester syntax
-
But in a lispy way
-
Easy to type, including short-cuts
Task 2: Defining a Tree
-
For the next section, we build an amino acid ontology
-
We do this in several ways
-
First, we start with a simple hierarchy
Amino Acids
-
Chemical Molecules
-
Central Carbon, with an amino and acid group
-
And a "side chain" or "R" group which defines the differences
-
Have a number of chemical properties
-
There are 20
A namespace
-
You should create your ontology in the file
lisbon/amino_acid_tree.clj
-
The full code in these slides can be found in
lisbon/amino_acid_tree_s.clj
Note
|
(ns lisbon.amino-acid-tree-s (:use [tawny.owl])) |
(ns lisbon.amino-acid-tree (:use [tawny.owl]))
Tree
-
So, we define our (flat) tree
(defontology aa)
(defclass AminoAcid)
(defclass Alanine :super AminoAcid)
(defclass Arginine :super AminoAcid)
(defclass Asparagine :super AminoAcid)
;; and the rest...
Note
|
Defining the basic tree is not too hard — we just use the |
Disjoint
-
Let’s make everything disjoint
-
Asparagine is not the same as Alanine or Arginine
-
With three amino-acids, this is painful and error-prone
-
With twenty it would be untenable
(defclass Alanine :super AminoAcid :disjoint Arginine Asparagine)
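To get a feel for how quickly this grows, the sketch below enumerates the unordered pairs of classes that need to be disjoint. It is plain Clojure and `disjoint-pairs` is a hypothetical helper, not part of Tawny-OWL.

```clojure
;; Hypothetical helper: every unordered pair drawn from a list of names.
(defn disjoint-pairs
  [classes]
  (for [[a & others] (take-while seq (iterate rest classes))
        b others]
    [a b]))

(disjoint-pairs '[Alanine Arginine Asparagine])
;; => ([Alanine Arginine] [Alanine Asparagine] [Arginine Asparagine])

;; With twenty amino acids there are 20 * 19 / 2 pairs to keep track of.
(count (disjoint-pairs (range 20)))
;; => 190
```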
Disjoint
-
But there is a more serious problem
-
Change Arginine to this
-
This will now crash (probably)
(defclass Arginine :super AminoAcid :disjoint Alanine Asparagine)
Note
|
In theory, this should crash, but it may not if you have previously evaluated
(or saved in catnip) the Most REPL programmers restart defensively every hour or two to guard against this. |
Explicit Definition
-
Tawny uses explicit definition
-
Variables must be defined before use
-
This is deliberate!
-
Can be avoided by using Strings
-
But that is error-prone
Note
|
Explicit definition is a good thing, but does hold out the possibility for being a real pain. We think that it is not so in Tawny, because we have features for working around it. It is possible to avoid entirely anyway, but I personally do not, because it makes spelling errors all too likely. I will show this once later on. |
Simplifying the definition
-
We can use the as-subclasses form
-
Adds AminoAcid as super to all arguments
-
This syntax also protects against future additions.
(defclass AminoAcid)
(as-subclasses AminoAcid
  (defclass Alanine)
  (defclass Arginine)
  (defclass Asparagine)
  ;; and the rest...
  )
Note
|
Before we show the solution to the disjoint problem, we simplify our
definitions. Typing The advantage of lexically grouping all of the subclasses in this way is also that it makes the intent of the developer obvious. If we need to add a new subclass later (unlikely in this case once finished, but during development yes), then adding it next in the list also makes it a subclass as it should be. This ability to make the developer intent, and conformance to a pattern explicit is a good thing! |
And disjoint!
-
This simplifies the disjoint behaviour also
-
We add the :disjoint keyword
(defclass AminoAcid)
(as-subclasses AminoAcid
  :disjoint
  (defclass Alanine)
  (defclass Arginine)
  (defclass Asparagine)
  ;; and the rest...
  )
Note
|
Now that we have achieved this, we can also solve the disjoint problem, by adding a keyword in. All the classes will now be made disjoint. If you want to be sure, evaluate the source, save it and look at the OMN. |
And, finally, covering
-
We might also want to say
-
These are the only amino-acids there are
-
To do this we use a "covering" axiom
-
Interesting ontology! Biologically true, chemically not.
Note
|
We can add a covering axiom in the same way. Unlike |
And, finally, covering
-
This is also easy to implement
-
Here with all the amino-acids in full
-
Again, the source code grouping is useful!
(defclass AminoAcid)
(as-subclasses AminoAcid
  :disjoint :cover
  (defclass Alanine)
  (defclass Arginine)
  (defclass Asparagine)
  (defclass Aspartate)
  (defclass Cysteine)
  (defclass Glutamate)
  (defclass Glutamine)
  (defclass Glycine)
  (defclass Histidine)
  (defclass Isoleucine)
  (defclass Leucine)
  (defclass Lysine)
  (defclass Methionine)
  (defclass Phenylalanine)
  (defclass Proline)
  (defclass Serine)
  (defclass Threonine)
  (defclass Tryptophan)
  (defclass Tyrosine)
  (defclass Valine))
Note
|
Putting in a covering axiom, then adding a new sibling and forgetting to modify the covering axiom is an easy mistake to make, and can be very difficult to pick up, either by eye or by testing. Tawny makes this harder. |
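The mistake described here can be sketched with plain Clojure sets. The data is hypothetical: `siblings` stands for the classes actually declared, and `covered` for a hand-maintained covering list that was not updated when a sibling was added.

```clojure
(require '[clojure.set :as set])

;; Hypothetical data: the declared siblings...
(def siblings '#{Alanine Arginine Asparagine Aspartate})
;; ...and a covering list that missed the newest sibling.
(def covered '#{Alanine Arginine Asparagine})

;; Hypothetical helper: which siblings does the cover fail to mention?
(defn missing-from-cover
  [siblings covered]
  (set/difference siblings covered))

(missing-from-cover siblings covered)
;; => #{Aspartate}
```

When the siblings and the cover come from the same lexical group, as with as-subclasses, the two lists cannot drift apart in this way.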
And, finally, covering
-
And, here is a subset of the equivalent OMN
Class: aa:AminoAcid
    EquivalentTo:
        aa:Alanine or aa:Arginine or aa:Asparagine or aa:Aspartate
        or aa:Cysteine or aa:Glutamate or aa:Glutamine or aa:Glycine
        or aa:Histidine or aa:Isoleucine or aa:Leucine or aa:Lysine
        or aa:Methionine or aa:Phenylalanine or aa:Proline or aa:Serine
        or aa:Threonine or aa:Tryptophan or aa:Tyrosine or aa:Valine

DisjointClasses:
    aa:Alanine, aa:Arginine, aa:Asparagine, aa:Aspartate, aa:Cysteine,
    aa:Glutamate, aa:Glutamine, aa:Glycine, aa:Histidine, aa:Isoleucine,
    aa:Leucine, aa:Lysine, aa:Methionine, aa:Phenylalanine, aa:Proline,
    aa:Serine, aa:Threonine, aa:Tryptophan, aa:Tyrosine, aa:Valine
Note
|
I have not shown all the OMN because it is long and tedious, but these are the keypoints added by the subclasses function. If you evaluated the |
Task 2: Conclusions
-
It is easy to build simple hierarchies
-
We can group parts of the tree
-
There is support for disjoints
-
There is support for covering axioms
Task 3 — Defining some properties
-
A list of amino acids only gets us so far
-
Now we define some properties
-
First, we do this the long-hand way.
Note
|
We start with our usual duplicated namespace. (ns lisbon.amino-acid-props-s (:use [tawny owl pattern])) |
Namespace
-
By now you should be familiar with the namespace definition
-
This one is slightly different.
-
The :use clause means "use both tawny.owl and tawny.pattern".
(ns lisbon.amino-acid-props (:use [tawny owl pattern]))
Starting our ontology
-
As before, we define (yet another!) amino acid ontology
(defontology aa) (defclass AminoAcid)
Amino Acid properties
-
Amino Acids have many properties
-
Most of these are continuous
-
Hard to model ontologically
Note
|
It is, of course, possible to model continuous values as data type properties and just put the numbers in. This works to a certain extent and is an option for modelling. |
The Value Partition
-
We introduce the "value partition"
-
We split a continuous range up into discrete chunks
-
Like the colours of the rainbow
-
First we define the partition itself
-
Amino acids have a size, and only one size!
(defclass Size) (defoproperty hasSize :domain AminoAcid :range Size :characteristic :functional)
Note
|
For full details and a discussion of the advantages and disadvantages of this approach, the value partition is described in this recommendation edited by Alan Rector. Putting "Size" under a ValuePartition superclass would also make sense. |
Value Partition
-
Now we define the values
-
Three values, and only three values
-
All of which are different
(as-subclasses Size :disjoint :cover (defclass Tiny) (defclass Small) (defclass Large))
Note
|
In a "real" ontology it would be good to add annotation properties and comments describing exactly what these partitions mean. |
Using the values
-
We can now create our amino-acids using these three sizes.
-
We only create three amino-acids here
-
More would be needed.
(as-subclasses AminoAcid :disjoint :cover (defclass Alanine :super (owl-some hasSize Tiny)) (defclass Arginine :super (owl-some hasSize Large)) (defclass Asparagine :super (owl-some hasSize Small)))
Using Facets
-
Many classes are associated with properties
-
Call these "facets" after "faceted classification".
-
Value Partition is a good example
-
Define Charge using the same pattern as Size.
(defclass Charge) (defoproperty hasCharge :domain AminoAcid :range Charge :characteristic :functional) (as-subclasses Charge :disjoint :cover (defclass Positive) (defclass Neutral) (defclass Negative))
Note
|
See https://en.wikipedia.org/wiki/Faceted_classification for a description of a faceted classification. Facets are a relatively new feature of tawny-owl and I may come to regret the name, as "facet" also has a meaning with respect to the OWL specification (it’s a feature of a datatype property). The tawny-owl facet is unrelated to the OWL specification. |
Facet
-
Now define the values as facets of hasCharge
-
Facets are extra-logical
-
They do not change the semantics of ontology statements
-
They are visible in the ontology
(as-facet hasCharge Positive Neutral Negative)
Note
|
I had a number of options for the implementation of facets. It would be possible to do this without having them visible in the ontology, but this seemed to be the best way forward. I am happy to solicit opinions on this. |
Using the facet
-
Can now give just the class
-
Again, using refine, although we could just alter the code
-
(facet Neutral) rather than (owl-some hasCharge Neutral)
-
Saves some typing
(refine Alanine :super (facet Neutral)) (refine Arginine :super (facet Positive)) (refine Asparagine :super (facet Neutral))
Note
|
We convert the facet into an existential restriction. |
And others
-
Does not save that much typing
-
Let’s add
Hydrophobicity
(defclass Hydrophobicity) (defoproperty hasHydrophobicity :domain AminoAcid :range Hydrophobicity :characteristic :functional) (as-subclasses Hydrophobicity :disjoint :cover (defclass Hydrophobic) (defclass Hydrophilic)) (as-facet hasHydrophobicity Hydrophilic Hydrophobic)
Note
|
We have to add the |
Using Facets
-
And Polarity.
(defclass Polarity) (defoproperty hasPolarity) (as-subclasses Polarity :disjoint :cover (defclass Polar) (defclass NonPolar)) (as-facet hasPolarity Polar NonPolar)
Using Facets
-
facet broadcasts
-
We can apply two at once
-
A different property can be used for each
-
We could do all four value partitions at once!
(refine Alanine :super (facet Hydrophobic NonPolar)) (refine Arginine :super (facet Hydrophilic Polar)) (refine Asparagine :super (facet Hydrophilic Polar))
Note
|
Now we can see the full saving. Instead of four separate statements, we have a single one. The advantage goes beyond typing, of course: it also helps the consistency of our ontology building. We cannot associate the wrong class with the wrong property. In this case, we have set the logical axioms up such that if we did use the wrong property, the reasoner is likely to pick the problem up anyway. But with facets the protection is immediate and at the point of use. |
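The idea behind facets can be modelled in a few lines of plain Clojure. This is a toy sketch under stated assumptions: the names facet-table, as-facet-sketch and facet-sketch are hypothetical, values and properties are plain strings, and this is not how tawny-owl actually stores facets (which are visible in the ontology itself). It only shows why the wrong value can never be paired with the wrong property.

```clojure
;; Toy model only: this is NOT tawny-owl's implementation.
(def facet-table
  "Maps a value name to the property it is a facet of."
  (atom {}))

(defn as-facet-sketch
  "Register each of `values` as a facet of `property`."
  [property & values]
  (swap! facet-table into (map (fn [v] [v property]) values)))

(defn facet-sketch
  "Broadcast over `values`, pairing each with its registered property."
  [& values]
  (mapv (fn [v]
          (if-let [property (get @facet-table v)]
            [:some property v]
            (throw (ex-info "No facet registered" {:value v}))))
        values))

(as-facet-sketch "hasHydrophobicity" "Hydrophobic" "Hydrophilic")
(as-facet-sketch "hasPolarity" "Polar" "NonPolar")

;; The property is looked up from the value, never typed by hand.
(facet-sketch "Hydrophobic" "NonPolar")
;; => [[:some "hasHydrophobicity" "Hydrophobic"] [:some "hasPolarity" "NonPolar"]]
```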
Using Facets
-
And the output.
Class: aa:Alanine SubClassOf: aa:hasCharge some aa:Neutral, aa:AminoAcid, aa:hasSize some aa:Tiny, aa:hasHydrophobicity some aa:Hydrophobic, aa:hasPolarity some aa:NonPolar
Task 3: Conclusions
-
Can use value partitions to split up numerical ranges
-
Can define facets to ease the use of object properties
-
Can apply several facets at once
Task 4: Patternising
-
Make full use of an existing pattern from Tawny
Patternising
-
We still have a lot of typing
-
The value partition has lots of bits
-
Easy to get wrong (Polarity was wrong)
-
Tawny supports this pattern directly
Note
|
I missed half the axioms of |
Namespace
-
The value partition pattern is found in tawny.pattern
-
We use it here
Note
|
As usual, we start with the real/false namespace (ns lisbon.amino-acid-pattern-s (:use [tawny owl pattern])) |
(ns lisbon.amino-acid-pattern (:use [tawny owl pattern]))
And the preamble
-
This is the same as before
(defontology aa) (defclass AminoAcid)
The size value partition
-
This is a new form
-
Syntactically similar to what we have seen before
-
defpartition defines that we will have a partition
-
[Tiny Small Large] are the values
-
hasSize is implicit; it will be created
(defpartition Size [Tiny Small Large] :domain AminoAcid)
Note
|
This is syntactically similar because it’s just a new function defined in the language. We are adding nothing clever here, just using a language as it is intended to be used. |
The size value partition
-
And (some of) the OMN.
ObjectProperty: aa:hasSize Domain: aa:AminoAcid Range: aa:Size Characteristics: Functional Class: aa:Size EquivalentTo: aa:Large or aa:Small or aa:Tiny
More value partitions
-
Adding partitions for all the properties is easy.
(defpartition Charge [Positive Neutral Negative] :domain AminoAcid) (defpartition Hydrophobicity [Hydrophobic Hydrophilic] :domain AminoAcid) (defpartition Polarity [Polar NonPolar] :domain AminoAcid) (defpartition SideChainStructure [Aromatic Aliphatic] :domain AminoAcid)
Using these partitions
-
defpartition also applies the as-facet function
-
So, we can use facet also
-
Syntactically, the ontology has simplified
-
Same semantics underneath
(as-subclasses AminoAcid
  (defclass Alanine :super (facet Neutral Hydrophobic NonPolar Aliphatic Tiny))
  (defclass Arginine :super (facet Positive Hydrophilic Polar Aliphatic Large))
  (defclass Asparagine :super (facet Neutral Hydrophilic Polar Aliphatic Small))
  (defclass Aspartate :super (facet Negative Hydrophilic Polar Aliphatic Small))
  ;; and the rest
  )
Task 4: Patternising
-
Tawny directly supports the value partition
-
This integrates with facets
-
Together they can simplify this (very common!) form of ontology
Task 5: Understanding Names
-
Understand how tawny uses three namespaces
-
See how to use them independently
-
Understand how they allow OBO ID support
Background
-
Ontology Entities need names
-
Tawny has different requirements for names
-
But we need to support alternative workflows
-
So, Tawny is flexible
-
With "sensible" defaults
OWL
-
Tawny is built on the OWL API
-
Underneath, therefore, it is part of the web
-
OWL uses IRIs (i.e. URIs or URLs)
-
IRIs provide a single, shared global namespace
-
With a (social) mechanism for uniqueness
Note
|
We have to inherit directly from the web, because all the software that we are building on depends on it and/or requires it. IRIs have this unusual characteristic of being global. |
Symbols
-
Tawny uses symbols to identify entities
-
Easy to type (A rather than "A")
-
Provides define-before-use semantics
-
Comes directly from Clojure
-
Supported in the IDEs
Note
|
Symbols are a core feature of Clojure and, indeed, any lisp. They are kind of equivalent to variable names in other programming languages, but not exactly the same since lisp gives you flexibility to handle them directly. Using symbols also provides a set of other advantages — in addition to the define-before-use semantics, they are normally syntax highlighted specially by editors and, rather usefully, they will normally auto-complete ("code-complete" or "intellisense") in an IDE. Very useful for big ontologies. |
Namespace
Note
|
(ns lisbon.whats-in-a-name-s (:use [tawny owl]) (:require [tawny.obo]) (:require [clojure.string :as s])) |
-
We will use tawny.obo later to show numeric IDs
-
clojure.string is for string manipulation.
(ns lisbon.whats-in-a-name (:use [tawny owl obo]) (:require [clojure.string :as s]))
Symbols and IRIs
-
What is the relationship between symbols and IRIs?
-
In tawny, this is a per-ontology setting
-
We use the :ontology frame
-
By default the symbol forms the fragment of the IRI
(defontology o)
;; => #<OWLClassImpl <8d9d3120-d374-4ffb-99d8-ffd93a7d5fdd#o#A>>
(defclass A :ontology o)
Note
|
If we do nothing else, tawny identifies an ontology using a random UUID. This is not entirely best practice and, indeed, is illegal for some serialisation formats. However, the OWL API supports it, and it’s useful where we are playing or giving a demo. If we define a class, we use the ontology IRI to form the entity IRI, just
adding the symbol name as the fragment. Note in this case I have used the
To see the IRI, type "A" into the REPL after evaluating the two forms above. The UUID is random, so it will be different each time! The "[OWLClassImpl]" stuff comes from the OWL API, and is just the "toString" method. It’s not great and at some point I will fix this. |
Symbols and IRIs
-
We should identify our ontologies correctly
-
We use the :iri frame for our second ontology
-
Again, the class uses the symbol name as the fragment
(defontology i :iri "http://www.w3id.org/ontolink/example/i")
;; => #<OWLClassImpl <http://www.w3id.org/ontolink/example/i#B>>
(defclass B :ontology i)
Symbols and IRIs
-
The relationship is programmatically defined
-
We can change it to whatever we want
-
Using the :iri-gen frame to supply a function
-
Here we reverse the symbol name
-
We call the symbol name: "the tawny name"
(defontology r
  :iri "http://www.w3id.org/ontolink/example/r"
  :iri-gen (fn [ont name]
             (iri (str (as-iri ont) "#" (s/reverse name)))))
;; => #<OWLClassImpl <http://www.w3id.org/ontolink/example/r#EDC>>
(defclass CDE :ontology r)
Note
|
This is a pretty pointless transformation! |
The iri-gen function
(fn [ont name]           ;; <1>
  (iri                   ;; <2>
   (str                  ;; <3>
    (as-iri ont)         ;; <4>
    "#"                  ;; <5>
    (s/reverse name))))  ;; <6>
This is the first function we have seen, so we go through it in detail
-
<1> Create an anonymous function, with parameters ont and name
-
<2> Create an IRI object from the string
-
<3> Concatenate all arguments
-
<4> Get the ontology IRI
-
<5> "#"
-
<6> Reverse the name passed in!
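The string-building part of this function can be tried at the REPL without any ontology at all. The sketch below is plain Clojure: reversed-iri-string is a hypothetical name, and it takes the ontology IRI as a string instead of calling as-iri and iri, which need tawny.

```clojure
(require '[clojure.string :as s])

(defn reversed-iri-string
  "Build the IRI string for `tawny-name` by reversing it and using it
  as the fragment of `ontology-iri`."
  [ontology-iri tawny-name]
  (str ontology-iri "#" (s/reverse tawny-name)))

(reversed-iri-string "http://www.w3id.org/ontolink/example/r" "CDE")
;; => "http://www.w3id.org/ontolink/example/r#EDC"
```

Any function of the ontology and the tawny name would do here; reversing is just easy to recognise in the output.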
OBO identifiers
-
OBO identifiers present a challenge
-
Source code is the ultimate in WYSIWYG
(defclass GO:00004324 :super (owl-some RO:0000013 GO:00003143) :annotation (annotation IAO:0504303 "Transporters are..."))
Note
|
With Protege or another GUI, we can use the underlying identifiers and display something different to the user. With source code, we cannot. Clearly something like the example above is just not acceptable (although it is actually legal tawny code or lisp). |
OBO Identifiers
-
Tawny provides an OBO style ID iri-gen function.
-
We set that here
(defontology obo :iri "http://www.w3id.org/ontolink/example/obo" :iri-gen tawny.obo/obo-iri-generate)
-
Will explain this bit later!
(tawny.obo/obo-restore-iri obo "./src/lisbon/whats_in_a_name.edn")
OBO Identifiers
-
Now we can eval these forms
-
Each gets a numeric identifier, OBO style
-
The identifier is stable
;; => #<OWLClassImpl <http://purl.obolibrary.org/obo/EXAM_000003>>
(defclass F :ontology obo)
;; => #<OWLClassImpl <http://purl.obolibrary.org/obo/EXAM_000002>>
(defclass G :ontology obo)
;; => #<OWLObjectPropertyImpl <http://purl.obolibrary.org/obo/EXAM_000001>>
(defoproperty ro :ontology obo)
Note
|
In this case, we have totally dissociated the IRI from the symbol. The IRI is not auto-generated here — it is stable, and will come out the same every time and on every machine. You should get the same IRIs exactly. |
OBO Identifiers
-
This one does not!
;; => #<OWLClassImpl <http://purl.org/ontolink/preiri/#4b463bc1-414b-4730-89a3-7ff72902c744>>
(defclass H :ontology obo)
Note
|
Interestingly, when we get to here, we get a strange ID with a UUID for a fragment and an IRI that is unrelated to anything. |
How OBO Identifiers work
-
The mapping is stored in a file
-
The obo-restore-iri line above reads this file
-
If a symbol has no mapping, we use the "pre-iri" form.
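The lookup-with-fallback behaviour described here can be sketched in plain Clojure. This is an illustration only: iri-string-for and pre-iri-prefix are hypothetical names, the mapping is an ordinary map from tawny name to IRI string, and the real logic in tawny.obo is not shown in this tutorial. The pre-iri prefix is the one visible in the output for class H.

```clojure
(def pre-iri-prefix
  "Prefix seen on unmapped entities in the tutorial output."
  "http://purl.org/ontolink/preiri/#")

(defn iri-string-for
  "Return the stored IRI for `tawny-name`, or a fresh pre-iri with a
  random UUID fragment when there is no mapping yet."
  [mapping tawny-name]
  (or (get mapping tawny-name)
      (str pre-iri-prefix (java.util.UUID/randomUUID))))

(def example-mapping
  {"F" "http://purl.obolibrary.org/obo/EXAM_000003"
   "G" "http://purl.obolibrary.org/obo/EXAM_000002"})

(iri-string-for example-mapping "F") ; stable, comes from the mapping
(iri-string-for example-mapping "H") ; random, differs on every call
```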
How OBO Identifiers work
-
The mapping file is generated
-
Human readable, and line-orientated
-
Deterministically ordered
-
Will version!
-
Uses EDN format.
("ro" "http://purl.obolibrary.org/obo/EXAM_000001" "G" "http://purl.obolibrary.org/obo/EXAM_000002" "F" "http://purl.obolibrary.org/obo/EXAM_000003")
Note
|
The mapping file has been designed to work with version control, because it needs to be shared between all developers. Although it is a generated file, it is source code, since it cannot be recreated from fresh (not in the same order anyway). EDN format is a Clojure thing. It’s basically a Clojure read syntax. |
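Because the mapping file is EDN, it can be read with Clojure's own reader. The sketch below assumes the file holds a flat list of alternating names and IRIs, exactly as in the example above. The name parse-mapping is hypothetical, and this is not necessarily how tawny.obo reads the file.

```clojure
(require '[clojure.edn :as edn])

(def mapping-text
  "Contents of a mapping file, inlined as a string for the example."
  (str "(\"ro\" \"http://purl.obolibrary.org/obo/EXAM_000001\" "
       "\"G\" \"http://purl.obolibrary.org/obo/EXAM_000002\" "
       "\"F\" \"http://purl.obolibrary.org/obo/EXAM_000003\")"))

(defn parse-mapping
  "Read a flat EDN list of name/IRI pairs into a map."
  [text]
  (apply hash-map (edn/read-string text)))

(get (parse-mapping mapping-text) "G")
;; => "http://purl.obolibrary.org/obo/EXAM_000002"
```

clojure.edn/read-string only reads data and does not evaluate code, which is a sensible property for a file that is shared through version control.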
How OBO Identifiers work
-
Stable pre-iri’s
-
No need for a server such as URIgen
;; this stores any new IDs we have created
(comment
  (tawny.obo/obo-store-iri obo "./src/lisbon/whats_in_a_name.edn"))
Note
|
While the pre-iris are automatically created, if we choose they can be made stable by simply saving them into the file with the form above. This can be safely done every time the file is evaluated, because the order is deterministic, so it will cause no false diffs in versioning. This is potentially useful if you are collaborating with others and want to co-ordinate at pre-release time. It’s not essential if others are using tawny; there is no need, since classes can be referred to by symbol. There is a potential disadvantage. This creates an IRI (and entry in the file) for every new entity created. Not a problem with Protege, but tawny is fully programmatic. I can create 10^6 new classes in one line of code. pre-iris all appear at the end of the EDN file! |
How OBO Identifiers work
-
How to create permanent IDs
-
Needs to be co-ordinated, since IDs are incremental
-
Use version-control to co-ordinate
-
One person, or as part of a release process
;; this coins permanent IDs, in a controlled process!
(comment
  (tawny.obo/obo-generate-permanent-iri
   "./src/lisbon/whats_in_a_name.edn"
   "http://purl.obolibrary.org/obo/EXAM_"))
Note
|
At some point, you need to coin new IDs that will become permanent. This has to happen in a co-ordinated fashion. It could be done as part of release. Or by bot during continuous integration. |
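To see why coining permanent IDs needs co-ordination, consider how an incremental ID might be chosen. The sketch below is a guess at the general idea, not the algorithm in tawny.obo: next-permanent-iri is a hypothetical function, and the six-digit padding is an assumption taken from the EXAM_000001 style identifiers shown earlier.

```clojure
(defn next-permanent-iri
  "Return the next free IRI under `prefix`, one above the highest
  number already used in `existing-iris`. IRIs with a different
  prefix (such as pre-iris) are ignored."
  [prefix existing-iris]
  (let [numbers (for [iri existing-iris
                      :when (.startsWith ^String iri prefix)]
                  (Long/parseLong (subs iri (count prefix))))
        highest (reduce max 0 numbers)]
    (format "%s%06d" prefix (inc highest))))

(next-permanent-iri
 "http://purl.obolibrary.org/obo/EXAM_"
 ["http://purl.obolibrary.org/obo/EXAM_000001"
  "http://purl.obolibrary.org/obo/EXAM_000003"
  "http://purl.org/ontolink/preiri/#4b463bc1-414b-4730-89a3-7ff72902c744"])
;; => "http://purl.obolibrary.org/obo/EXAM_000004"
```

Two developers running something like this independently on the same starting file would both coin EXAM_000004, which is why it belongs in a single, controlled step.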
How OBO Identifiers work
-
I think having no server is nice
-
Reusing version control makes sense
-
It’s programmatic! You are free to disagree.
Note
|
How well would this workflow work in practice? Not sure. It would work for a small number of developers. There are many tweaks that could be made for different scales — saving to multiple files, pre-iris in one place, perms in another. No pre-iris at all. Use URIGen. Manually coin permanent IRIs as part of quality control. It’s programmatic and easy to change. |
Tawny Names
-
It is possible not to use symbols
-
The iri-gen function takes a string, not a symbol!
-
This string is the tawny name
-
Consider the following
;; String building!
(defontology s)
(owl-class "J" :ontology s)
(object-property "r" :ontology s)
(owl-class "K" :ontology s :super (owl-some "r" "J"))
Note
|
We can do without symbols and instead use just strings. Note that we also have
to switch functions |
Tawny Name
-
These forms do NOT define symbols
-
This WILL NOT work
-
Neither r nor J has been defined
(comment (owl-class "L" :ontology s :super (owl-some r J)))
Tawny Name
-
Danger!
-
Consider this statement.
-
But we did not define "L"
-
But we have used it.
(owl-class "M" :ontology s :super "L")
-
And so, it becomes defined
Class: s:M SubClassOf: s:L Class: s:L
Note
|
The use of strings means that we can define things without the tawny-name being in "primary position". It just happens. You need to be careful. |
Why use strings?
-
Partly, they are there for implementation
-
But made public as string manipulation is easier
-
Most useful for development
Note
|
The main reason that I have left this in place is for use as an API. Clojure allows full manipulation of symbols like most lisps, but it’s a bit of a pain. It’s not possible, for example, to concatenate two symbols to make a longer one (or rather, they need to be converted to strings, then concat’d then converted back again). And the creation and interning of new symbols as variables requires the use of macros rather than normal functions. Having said that, tawny does offer some facilities to help with this process
We will see a few of these later. |
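The round trip through strings mentioned in this note looks like this in plain Clojure. The helper name symbol-concat is hypothetical; name, str and symbol are core functions.

```clojure
(defn symbol-concat
  "Join several symbols into one by going through their string names."
  [& symbols]
  (symbol (apply str (map name symbols))))

(symbol-concat 'Amino 'Acid)
;; => AminoAcid

;; With strings the same thing needs no conversion at all.
(str "Amino" "Acid")
;; => "AminoAcid"
```

Note that the result is only a symbol value. It is not bound to anything, which is why creating new variables from generated names needs macros or interning support.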
Summary
-
The relationships are summarized as follows
-
We will look at the arrow on the right next!
Task 5: Conclusions
-
There are three namespaces in use
-
tawny-name
-
clojure symbols
-
IRIs
-
-
The relationship between the three is fluid
-
Generally, just use symbols!
Task 6: Importing Other Ontologies
-
Understand how to import and use another tawny ontology
Importing
-
Many Ontologies import other ontologies
-
Allows cross-linking, and reuse of work
-
Tawny supports this also
J’accuse reuse
-
Now I will rant!
-
There is a belief that reuse is automatically good
-
You should always use terms from another ontology
-
I disagree! Consequences are serious!
-
Duplicate, Duplicate, Duplicate
-
After you’ve used five or six terms, then consider refactoring
-
Now I will stop ranting!
Note
|
At this point I would like a rant. There has become a widely held belief that we should always reuse terms from another ontology if they exist, and that this cross-linking is necessarily good. But this is not true. It is at least as wrong in the ontology world as it is in the software world. In software, reusing a library in your own project is called adding a dependency. You become dependent on it. You can be affected by changes in it. Your release process can be affected by its release process. If it rots, your project rots. If it is insecure, yours is insecure. Not only this, you become dependent on its dependencies also. And, in fact, the transitive closure of your dependencies. This project has over 70 dependencies. And these can change over time. While in "SNAPSHOT" mode, the dependency graph of your project can change overnight, without any change to your code. And you have multi-path problems, "OWL Hell", where one ontology can be imported multiple times. And no one knows how to deal sanely with versioning. And this is just the software issue. With ontologies the problem is worse. You are making an ontological commitment. And it may not be one that you want, it may not be one that you agree with, and it may be one that contradicts your actual use case. Do not go continually looking for terms that you can use and reuse. Consider searching for terms when you realise that you are getting too far away from the core requirements for your ontology, from your competency questions. If you really like someone else’s definition, cut-and-paste it, give it your own identifiers, and comment that you have done so. Maybe, once you have five or six terms from the same ontology, then consider importing the other ontology. And don’t blame me if it goes wrong. |
An ontology to import
-
However, reuse is inevitable
-
And sometimes good
-
So, how do we do it?
-
In tawny, there are two steps, use and import
(ns lisbon.abc (:use [tawny.owl])) (defontology abc :iri "http://www.w3id.org/ontolink/example/abc.owl") (defclass A) (defclass B) (defclass C)
Using this ontology
Note
|
(ns lisbon.use-abc-s (:use [tawny.owl]) (:require [lisbon.abc])) |
Using
-
We have seen use many times before
-
A namespace with an ontology can be use'd like any other
-
Here we use require
-
Helps to avoid name collisions
(ns lisbon.use-abc (:use [tawny.owl]) (:require [lisbon.abc]))
Using
-
Normally, using is not enough
-
We also need to explicitly import the ontology
-
Only after import will its axioms become available
-
Warning! Clojure has an import function and it does not do the same thing.
(defontology useabc) (owl-import lisbon.abc/abc)
Note
|
I did strongly consider combing the Without the import statement, we gain access to the identifiers from the
|
Using
-
Of course, we can also use IRIs direct from the imported ontology
-
For which we need to use the iri function.
-
This works with any ontology
(defclass MyA :super (iri "http://www.w3id.org/ontolink/example/abc.owl#A"))
Using
-
The require statement also allows us to use symbols
-
Here, we use an explicit namespace lisbon.abc
-
And a symbol B.
(defclass MyB :super lisbon.abc/B)
Note
|
Here, we are reusing basic Clojure functionality and its namespacing mechanism. That the symbols refer to ontology terms really makes no difference. If you get bored of typing, then you can also alias |
Using
-
The final output is the same in both cases
-
The symbolic approach (as always) protects against spelling mistakes!
Class: useabc:MyA SubClassOf: abc:A Class: useabc:MyB SubClassOf: abc:B
Task 6: Summary
-
require or use is a part of Clojure
-
Gives us access to symbols from another namespace
-
Ontologies still need to use
owl-import
Task 7: Reading an Ontology
-
Understand what reading an ontology achieves
-
Use a simple example
The problem
-
In the previous example, abc.owl was developed in Tawny-OWL
-
We need the Tawny-OWL source code for this to work
-
What if we do not have it?
-
Or, worse, what if it does not exist?
Note
|
It is possible that you want access to an OWL ontology that was not developed using Tawny-OWL, but by some legacy editor like Protege. |
Namespace
-
As usual, we define a namespace for our experiments!
(ns lisbon.read-abc-s (:use [tawny.owl]) (:require [tawny.read]))
Note
|
(ns lisbon.read-abc-s (:use [tawny.owl]) (:require [tawny.read])) |
Solution 1
-
We could owl-import an IRI
-
And refer to all entities by IRI
-
Painful
-
Error prone
-
And with OBO identifiers, untenable
Solution 2
-
Tawny-OWL provides a solution called reading
-
Reading makes all entities available as symbols
-
In this case, a file abcother.owl has been saved locally
-
Can read from any URL.
(tawny.read/defread abc :iri "http://www.w3id.org/ontolink/example/abcother.owl" :location (tawny.owl/iri (clojure.java.io/resource "abcother.owl")))
Note
|
Normally, you read an ontology into a namespace with nothing else in it, if for no other reason than to avoid name collisions. Also, note that we are using the OWL file from a local copy, which gives us a degree of flexibility: you do not want to download GO every time you restart the REPL. Although not covered here, |
Reading
-
Now we define our new ontology and import ABC
(defontology myABC) (owl-import abc)
Reading
-
And access its values by symbol
-
Symbols must be defined.
(defclass MyA :super A) (defclass MyB :super B)
Note
|
Having read our ontology this now gives us the ability to refer directly, with
symbols. So, we can type |
Task 7: Conclusion
-
Tawny-OWL supports a read mechanism
-
Ontologies only available as OWL files can be used transparently
Task 8: Programming an Autosave
-
Extend Tawny, so that it saves the ontology on every change
Extending Tawny
-
Tawny is implemented directly in Clojure
-
We can extend in the same syntax
-
Can extend in general and ontology specific ways
Note
|
I’ve picked an autosave because it is a nice general function that we might want to use and demonstrates some of the possibilities. But ontology specific is perhaps most useful of all, because it allows ontology development groups to tailor Tawny to their own development practices without having to generate bespoke extensions for Protege or equivalent. |
Namespace
-
The namespace definition here is a bit different
-
Require tawny.owl through an alias
-
Also import an OWL API interface
(ns lisbon.autosave (:require [tawny.owl :as o]) (:import [org.semanticweb.owlapi.model OWLOntologyChangeListener]))
Note
|
In general the use of For my own use, I think that with tawny this risk is worth it (and it is easy
to fix if it happens). For ontology namespaces (i.e. those in which I
define an ontology not much else), I tend The name collison between (ns lisbon.autosave-s (:require [tawny.owl :as o]) (:import [org.semanticweb.owlapi.model.OWLOntologyChangeListener])) |
Saving the Listener
-
We need a variable in which to save our listener
-
Clojure variables are immutable
-
Store an atom and change that
(def auto-save-listener "The current listener for handling auto-saves or nil." (atom nil))
Note
|
Clojure was designed for concurrency and generally does not allow changing variables (although it does allow re-evaluation). Instead it has a set of objects which can store any value and which can be changed but which require the developer to make an explicit choice about how to deal with concurrent changes. It’s very nice and very sensible, but largely we just ignore it here. The reality with tawny is that it is built on the OWL API, which is mutable, is not thread-safe and is not built for concurrency. |
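For those who have not met atoms before, here is a small stand-alone sketch of storing and replacing a value. The names current-listener and install! are hypothetical, and keywords stand in for real listener objects. Unlike the demo code in this task, install! hands back the previous value so that the caller has a chance to clean it up.

```clojure
(def current-listener
  "Holds the installed listener, or nil."
  (atom nil))

(defn install!
  "Store `listener` and return whatever was stored before."
  [listener]
  (let [previous @current-listener]
    (reset! current-listener listener)
    previous))

(install! :first-listener)   ;; => nil
(install! :second-listener)  ;; => :first-listener
@current-listener            ;; => :second-listener
```

This read-then-reset is not atomic, which is fine for a single-threaded REPL session but would need swap-vals! or similar under real concurrency.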
The auto-save function
-
OWL API has a listener for ontology changes
-
We address it directly here.
(defn auto-save "Autosave the current ontology everytime any change happens." ([o filename format] (let [listener (proxy [org.semanticweb.owlapi.model.OWLOntologyChangeListener] [] (ontologiesChanged[l] (o/save-ontology o filename format)))] (reset! auto-save-listener listener) (.addOntologyChangeListener (o/owl-ontology-manager) listener) listener)))
auto-save in detail
-
Instantiate an object which implements OWLOntologyChangeListener
-
Implement a single method of this
-
Just save the ontology
-
proxy is a closure
-
o, filename, format are closed over
(proxy [org.semanticweb.owlapi.model.OWLOntologyChangeListener] [] (ontologiesChanged[l] (o/save-ontology o filename format)))
Note
|
This is our actual listener. Proxy objects are available in the Java core, and the Clojure ones do much the same thing — implement an interface on the fly. They are rather more common in Clojure because a) it makes more sense in Clojure and b) implementing a new class is a bit of a pain (although entirely possible). Proxy objects work through reflection and have the performance characteristics that you would expect — but this is meant to be an end user function, so we really don’t care. Saving the ontology is likely to take far more effort than a reflective call. |
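The same proxy-as-closure technique can be tried without the OWL API by implementing a core Java interface. In this sketch java.lang.Runnable stands in for OWLOntologyChangeListener, and make-recorder, log and recorder are hypothetical names. The arguments log and message are closed over in the same way that o, filename and format are above.

```clojure
(defn make-recorder
  "Return an object implementing Runnable whose run method appends
  `message` to the vector held in the atom `log`."
  [log message]
  (proxy [Runnable] []
    (run []
      (swap! log conj message))))

(def log (atom []))
(def recorder (make-recorder log "saved"))

;; Call the Java method with the dot syntax, as with any Java object.
(.run recorder)
(.run recorder)

@log
;; => ["saved" "saved"]
```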
auto-save in detail
-
save the listener
-
discarding any existing one!
(reset! auto-save-listener listener)
Note
|
Clojure follows the Scheme convention of marking mutating functions with a "!".
Dropping the existing value is, of course, wrong and a memory leak, since the existing listener will still be held by the manager, and may even result in strange behaviour. It’s a demo! |
auto-save in detail
-
The o/owl-ontology-manager function returns an OWLOntologyManager
-
Clojure uses the . syntax to call methods
-
So, call addOntologyChangeListener on the manager
-
With listener as an argument
(.addOntologyChangeListener (o/owl-ontology-manager) listener)
Note
|
I’ve used Clojure’s Java interaction extensively during the development of Tawny and I have to say I found it to be very nice. It fits very comfortably with normal Clojure development. In general, I hide the existence of the You might be wondering why It would be possible to deal with this much more formally, and pass an environment, probably integrated with the "default-ontology" functionality of Tawny. But I have not found a strong use case for this yet. |
Remove the auto-save
-
And a function to reverse the process
-
@ dereferences the atom
(defn auto-save-off "Stop autosaving ontologies." [] (when @auto-save-listener (.removeOntologyChangeListener (o/owl-ontology-manager) @auto-save-listener)))
Note
|
And finish off. If you don’t know what |
auto-save
-
There is already an auto-save function in tawny.repl
-
And an on-change function
Task 8: Conclusion
-
Clojure provides easy interop with Java
-
We can use this to extend Tawny-OWL capabilities
Note
|
So, this has been a very rapid run through of how to integrate directly with the OWL API and add new functionality to Tawny that cannot be achieved through tawny itself. This is very commonly done in Clojure (where the mantra is do not wrap if you do not need to) and so it is well supported. |
Task 9: Create New Syntax
-
Create new syntax describing the amino-acids
-
Create some defined classes
-
Create all the defined classes
-
Use the reasoner
The Finale
-
This is rather more advanced
-
But pulls together most of the strands
-
Demonstrates the value of a programmatic environment
-
Possible to build ontologies without this.
-
Sometimes only lambda is enough!
-
The biology is quite cute also
Note
|
This is going to be highly advanced, showing a highly programmatic use of Tawny. In the course of this we will generate some brand new syntax, which demonstrates one of the uses of Tawny: while it is harder to generate this new syntax than just to use the existing one, once you have done it, it is easier to use. There is also some interesting biology and an interesting ontological question that comes out of the end of it. |
The namespace
-
Lots of namespaces involved here
-
Tawny-OWL does have more namespaces
-
But not many
(ns lisbon.amino-acid-build (:use [tawny owl pattern reasoner util]))
Note
|
Probably this is an extreme example of (ns lisbon.amino-acid-build-s (:use [tawny owl pattern reasoner util])) |
The Upper Ontology
-
Nothing new here!
(defontology aabuild) (defclass AminoAcid) (defclass PhysicoChemicalProperty) (defpartition Size [Tiny Small Large] :domain AminoAcid :super PhysicoChemicalProperty) (defpartition Charge [Positive Neutral Negative] :domain AminoAcid :super PhysicoChemicalProperty) (defpartition Hydrophobicity [Hydrophobic Hydrophilic] :domain AminoAcid :super PhysicoChemicalProperty) (defpartition Polarity [Polar NonPolar] :domain AminoAcid :super PhysicoChemicalProperty) (defpartition SideChainStructure [Aromatic Aliphatic] :domain AminoAcid :super PhysicoChemicalProperty)
Defining our AminoAcids
-
Done this before
-
It involves too much typing
-
Want new syntax
-
Also to help ensure consistency
Note
|
Building amino acids by defining a class for each is no good at all, as it’s too long. Also, it’s too risky. So, we are going to expand the syntax, so that it will work well. |
Defining our AminoAcid
-
The function is relatively easy
-
defdontfn gives default ontology handling
-
& properties is variadic or "zero or more args".
-
owl-class function does NOT define a new variable
(defdontfn amino-acid [o entity & properties] (owl-class o entity :super (facet properties)))
Note
|
The function for making a new amino acid is relatively simple, as these
things go. It just passes off most of its work to owl-class. This function does not intern: we can define a new amino-acid like this, but
the We cannot just replace |
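The variadic `& properties` parameter can be tried out in plain Clojure without any Tawny. The sketch below is only an illustration; `describe` is a hypothetical function and not part of Tawny.

```clojure
;; Hypothetical helper, not part of Tawny: shows how `& properties`
;; collects every argument after `entity` into one sequence.
(defn describe [entity & properties]
  {:entity entity
   :properties (vec properties)})

(describe "Alanine" :Neutral :Hydrophobic)
;; => {:entity "Alanine", :properties [:Neutral :Hydrophobic]}

;; With no extra arguments, `properties` is nil, which `vec` turns into [].
(describe "Glycine")
;; => {:entity "Glycine", :properties []}
```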
Making a new variable
-
If we want to create a new symbol, Tawny provides defentity
-
It does a few other things as well
(defentity defaminoacid "Defines a new amino acid." 'amino-acid)
Note
|
In fact,
|
Define Lots of Amino Acids
-
Let’s define all the amino-acids at once
-
Pass definitions as a list (of lists)
-
Call using map
-
The anonymous function destructures
-
->Named packages name and entity together
(defdontfn amino-acids [o & definitions]
  (map
   (fn [[entity & properties]]
     ;; need the "Named" constructor here
     (->Named entity
              (amino-acid o entity properties)))
   definitions))
Note
|
Better than creating one amino-acid, let’s create all of them at once. This
function does two things — it calls the |
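The destructuring in the anonymous function can be seen in isolation with plain Clojure data. This is a minimal sketch with no Tawny involved; the maps it builds stand in for the `->Named` records and are an assumption made for illustration.

```clojure
;; Each definition is a vector: the first element is the name,
;; the remaining elements are the properties.
(def definitions
  '[[Alanine Neutral Hydrophobic]
    [Arginine Positive Hydrophilic]])

;; [[entity & properties]] takes the single argument apart:
;; `entity` binds the first element, `properties` the rest.
(def destructured
  (map (fn [[entity & properties]]
         {:entity entity :properties properties})
       definitions))

(first destructured)
;; => {:entity Alanine, :properties (Neutral Hydrophobic)}
```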
And make variables
-
We want to use symbols and define a new variable.
-
Tawny has some support for this
-
Not going to explain in detail
(defmacro defaminoacids
  [& definitions]
  `(tawny.pattern/intern-owl-entities
    (apply amino-acids
           (tawny.util/name-tree ~definitions))))
Note
|
We want to transform a bunch of symbols into strings because we are going to
use symbols we have not defined yet. We are missing the fact that the symbol
name and the string are equivalent here, so we could do this better, but the
|
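The idea of turning not-yet-defined symbols into strings can be sketched in plain Clojure. This is an illustration of the idea only, not Tawny's implementation; `symbols->strings` and `names-of` are hypothetical names.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical stand-in for the symbol-to-string step:
;; replace every symbol in a nested structure with its name.
(defn symbols->strings [tree]
  (walk/postwalk #(if (symbol? %) (name %) %) tree))

;; A macro receives its arguments unevaluated, so symbols that are not
;; yet defined cause no error; they become strings at expansion time.
(defmacro names-of [& definitions]
  (symbols->strings (vec definitions)))

(names-of [Alanine Neutral] [Glycine Tiny])
;; => [["Alanine" "Neutral"] ["Glycine" "Tiny"]]
```

A plain function could not be called this way, because Clojure would try to resolve `Alanine` before the call.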
Define the amino-acids
-
Now we can define all the amino-acids in one go.
-
The syntactic regularity means we are unlikely to miss something.
-
For me, this makes the effort worthwhile.
-
We also define subclasses, disjoints and covering.
-
Pay attention to the :cover
(as-subclasses
 AminoAcid
 :disjoint :cover
 (defaminoacids
   [Alanine Neutral Hydrophobic NonPolar Aliphatic Tiny]
   [Arginine Positive Hydrophilic Polar Aliphatic Large]
   [Asparagine Neutral Hydrophilic Polar Aliphatic Small]
   [Aspartate Negative Hydrophilic Polar Aliphatic Small]
   [Cysteine Neutral Hydrophobic Polar Aliphatic Small]
   [Glutamate Negative Hydrophilic Polar Aliphatic Small]
   [Glutamine Neutral Hydrophilic Polar Aliphatic Large]
   [Glycine Neutral Hydrophobic NonPolar Aliphatic Tiny]
   [Histidine Positive Hydrophilic Polar Aromatic Large]
   [Isoleucine Neutral Hydrophobic NonPolar Aliphatic Large]
   [Leucine Neutral Hydrophobic NonPolar Aliphatic Large]
   [Lysine Positive Hydrophilic Polar Aliphatic Large]
   [Methionine Neutral Hydrophobic NonPolar Aliphatic Large]
   [Phenylalanine Neutral Hydrophobic NonPolar Aromatic Large]
   [Proline Neutral Hydrophobic NonPolar Aliphatic Small]
   [Serine Neutral Hydrophilic Polar Aliphatic Tiny]
   [Threonine Neutral Hydrophilic Polar Aliphatic Tiny]
   [Tryptophan Neutral Hydrophobic NonPolar Aromatic Large]
   [Tyrosine Neutral Hydrophobic Polar Aromatic Large]
   [Valine Neutral Hydrophobic NonPolar Aliphatic Small]))
Defined Classes
-
Defined Classes can be reasoned over.
-
Anything with a Small facet is a SmallAminoAcid
(defclass SmallAminoAcid :equivalent (facet Small))
Note
|
Defined classes are at the heart of reasoning in OWL. Effectively, they form queries, and the results tell us useful things. |
And some more
-
There are lots of these
-
We can combine them in many ways.
(defclass SmallPolarAminoAcid
  :equivalent (owl-and (facet Small Polar)))

(defclass LargeNonPolarAminoAcid
  :equivalent (owl-and (facet Large NonPolar)))
Note
|
So, having done one, surely we should do more. Here are the next two, each combining two facets. |
Where to stop
-
Where to stop
-
3? 10?
-
Why not do them all?
-
We are using a programmatic tool
-
How would we do this?
A defined class function
-
Similar to before
-
This does not create variables
(defn amino-acid-def [partition-values]
  (owl-class
   (str
    (clojure.string/join
     (map #(.getFragment (.getIRI %))
          partition-values))
    "AminoAcid")
   :equivalent (owl-and (facet partition-values))))
Note
|
We can build a function to replicate this. We use some string manipulation to generate the name. I have not done the interning here; I leave this as an exercise! |
Doing them all
-
"Doing them all" actually means the cartesian product
-
Surprisingly there is not a function for this
-
This is pure Clojure, not going to describe it
(defn cart [colls]
  (if (empty? colls)
    '(())
    (for [x (first colls)
          more (cart (rest colls))]
      (cons x more))))
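A small worked call may help to show what `cart` returns. The definition is repeated from the slide so that the snippet runs on its own; the keyword values are placeholders chosen for illustration.

```clojure
;; Same definition as on the slide, repeated so the snippet is self-contained.
(defn cart [colls]
  (if (empty? colls)
    '(())
    (for [x (first colls)
          more (cart (rest colls))]
      (cons x more))))

;; Every value of the first collection paired with every value of the second.
(cart [[:tiny :small] [:polar :non-polar]])
;; => ((:tiny :polar) (:tiny :non-polar) (:small :polar) (:small :non-polar))
```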
Doing them all
-
Call the amino-acid-def function on the cartesian product
-
This creates 453 defined classes
(doall
 (map
  amino-acid-def
  ;; kill the empty list
  (rest
   (map
    #(filter identity %)
    ;; combination of all of them
    (cart
     ;; list of values for each partitions plus nil
     ;; (so we get shorter versions also!)
     (map
      #(cons nil (seq (direct-subclasses %)))
      ;; all our partitions
      (seq (direct-subclasses PhysicoChemicalProperty))))))))
Note
|
We now actually run the cartesian product. We add "nil" so that we get single, double, and triple as well as full length products, and filter for nil to get rid of them again. |
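The effect of the nil padding on the number of combinations can be checked with plain arithmetic. The sketch below uses keywords in place of OWL classes and assumes exactly the five partitions listed in the upper ontology above; for those it gives 431 non-empty combinations, so the totals quoted on the slides may have been produced with a different version of the ontology.

```clojure
;; Values of each partition as listed in the upper ontology,
;; written as keywords so that no Tawny is needed.
(def partitions
  [[:Tiny :Small :Large]
   [:Positive :Neutral :Negative]
   [:Hydrophobic :Hydrophilic]
   [:Polar :NonPolar]
   [:Aromatic :Aliphatic]])

;; Padding each partition with nil adds one "no value" choice, so the
;; number of combinations is the product of (size + 1). The single
;; all-nil combination is then dropped, hence the `dec`.
(def combination-count
  (dec (reduce * (map #(inc (count %)) partitions))))

combination-count
;; => 431

;; Stripping the nil padding from one combination, as (filter identity ...) does.
(filter identity [nil :Neutral nil :Polar nil])
;; => (:Neutral :Polar)
```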
Reasoning
-
Finally, we reason over this.
-
We choose to use hermit
-
And check consistency
-
This takes a second or two
(reasoner-factory :hermit) (consistent?)
Note
|
Tawny supports a couple of reasoners out of the box, including Hermit and ELK.
Here we are instantiating using a The reasoner is invoked to check consistency automatically. Tawny uses a GUI (a progress bar) by default to show this process, but falls back to text if that is not possible (so you can check consistency in a CI environment without hassles). |
Reasoning
-
We can count numbers
-
We have reasoned many subclasses of AminoAcid
(count (subclasses AminoAcid)) (count (isubclasses AminoAcid))
Note
|
Working out what has happened can be quite hard (this is something that we wish to fix in future versions of tawny), but counting subclasses works as well as anything. We now have a lot more inferred subclasses than asserted. |
Reasoning
-
But we are not coherent!
-
In fact, we have many unsatisfiable classes
-
What is happening?
(coherent?) (count (unsatisfiable))
Visualising
-
There are many ways to visualize our ontology
-
Saving it and opening in Protege is easiest
(save-ontology "o.owl" :owl)
Note
|
Protege is really nice for visualising ontologies. So, we use that here. I always save to the same file name, but better to save as OWL rather than OMN because it parses more correctly. |
Visualising
-
Many defined classes are equivalent
-
Many are unsatisfiable
-
Happens because there are 20 amino-acids
-
But 700 defined classes
-
Many defined classes have necessarily the same extent
-
Many can have no individuals (Negative and Hydrophobic)
-
Only happens with the covering axiom
Note
|
The reason that this happens is obscure, perhaps, but the base reason is that we have many more defined classes than we have primitive ones. So, we must have equivalences or unsatisfiable classes. This only happens because of the magic of the covering axiom. So, the reasoning is telling us something about the biology, and whether we want this form of conclusion depends on whether we are talking about biology or chemistry. After all, if we were chemists, many amino acids could be created that would separate out the equivalent classes, and many would make some unsatisfiable classes satisfiable. Life is complex but, in this case, simpler than chemistry. |
As a query
-
Using defined classes as a query
-
Not that useful
-
Most of the inferred subclasses are defined!
(count (isubclasses SmallAminoAcid))
As a query
-
We have a full programming language
-
So, we filter for only undefined classes
(count (filter #(not (.isDefined % aabuild)) (isubclasses SmallAminoAcid)))
Conclusions
-
We can use the highly programmatic nature of Tawny
-
We can generate many defined classes
-
To do so is useful
-
In this case one axiom can have a large effect
-
Results depend on the choices we make in the modelling
Hiatus
-
Hope that I have shown basic and advanced functionality
-
Key feature of Tawny is extensibility
-
Reuse of existing tooling
-
Would welcome feedback
-
Never used catnip in this sense before, glad to know if it worked for you
Questions and Answers
-
Remainder of tutorial will be audience driven
-
Welcome to take any questions
-
Also have set of potential questions with prepared answers
-
Can get full tutorial, with notes
-
Readable http://homepages.cs.ncl.ac.uk/phillip.lord/download/tawny/icbo_2015/
Questions
-
Can I add annotations on axioms?
-
How does this affect ontology deployment?
-
How do you version your ontology?
-
How do you test your ontology?
-
How do you continuously integrate your ontology?
-
What about advanced documentation for ontologies?
-
How do I collaboratively develop my ontology?
-
Can I internationalise my ontology?
-
Can I scaffold my ontology from existing sources?
-
What happens if the labels of read ontologies change?
-
How do you convert an existing ontology to Tawny?
-
How fast is tawny?
-
Can I integrate more tightly with protege?
-
How does Tawny affect dependency management with ontologies?
-
Can I link ontologies into software?
-
What’s this :super? Why not :subclass?
Can I add annotations on axioms?
-
OWL allows annotation of axioms, for provenance for example
-
Tawny provides a syntax for this
-
Annotates the SubClassOf axiom between Man and Person with a comment.
(defclass Man :super (annotate Person (owl-comment "States that every man is a person")))
Note
|
One of the reasons for the complexity of the OWL API is that it allows annotations to be passed in lots of places, including on the axioms that assert the relationship between, for example, two classes. One simplification I made with Tawny is to hide this complexity. Moreover, Tawny is frame-centric so the axioms are not normally seen explicitly. Unfortunately, it appears that in hiding the complexity, I had also hidden a capability that people actually use: GO uses axiom annotations for provenance for instance. So I have added this capability into tawny with a slightly expanded syntax. The |
How does this affect ontology deployment
-
Potentially none — Tawny generates an OWL file
-
Potentially automatable
-
In project source, or through leiningen plugin
-
-
Or can publish as a maven artefact
-
Ontology can be downloaded as a software artefact
-
Separates out ontology identifier and download location
-
Note
|
We can deploy ontologies to bioportal or to anywhere else exactly as we do now. Save the OWL file, and do stuff to it manually. Of course, in a programmable environment, it is also very easy to add additional deployment technology: generate one or more files, copy them to another location via ssh or http, or check them in somewhere. This can be achieved either in the project source or as a leiningen plugin. Or, finally, we can deploy our ontology like any other piece of Clojure code, as a maven artifact, to maven central, to clojars (like maven central but for Clojure), or to our own private repo. This also separates out the ontology identifier from the download location. It’s possible to argue whether this is a good thing or a bad thing. |
How do you version your ontology?
-
Tawny-OWL uses a line-orientated syntax
-
You edit source code not a visualisation of an XML file
-
Like various OBO flat file syntaxes, it works well in git or any VCS
-
Leiningen supports release versioning, using Semantic Versioning
Note
|
Tawny-OWL uses a line-orientated syntax and what you edit is source code, not some XML that has been generated. So, it works very well with version control. The serialisation order is entirely predictable because there is no serialisation: it’s source code, so it only changes if you edit it. Working with source code also means that diff tools show you changes in the same form of the code that you edit, so it’s very easy to compare two versions, two branches, whatever. The only exception to this is the EDN files used for numeric IDs of OBO identifiers. These have been designed to version well (they are generated but not regeneratable, so should be regarded as source code), but only time will tell. "They do not version well" will be considered to be a bug though, and will be something I would want to fix. We’ve used git for all the various ontologies we have developed, and it works nicely. Actually "nicely" is an understatement. The move to modern version management, rather than anything bespoke built just for ontologies, makes an enormous difference. For my money, it’s a reason to move to tawny all by itself. It is also possible to version in the sense of releasing tagged versions of an ontology and to use a dependency mechanism. |
How do you test your ontology
-
Clojure supports one or more unit test frameworks
-
We use the default (clojure.test) framework
-
Tawny-OWL provides some fixtures
-
Also use spreadsheet generated testing
-
Paper – http://arxiv.org/abs/1505.04112
-
Tawny-Karyotype – http://github.com/jaydchan/tawny-karyotype
-
Plain English Summary - http://www.russet.org.uk/blog/3074
-
Note
|
We test our ontologies explicitly and sometimes very heavily. Tawny-OWL
provides some fixtures to make this easy and we use |
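As a rough idea of what such a test can look like, here is a minimal `clojure.test` sketch. It checks plain data rather than a real ontology, and every name in it (`partitions`, `amino-acids`, `values-in`) is an assumption made for illustration; a real test would query Tawny and use its fixtures instead.

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; Plain data standing in for an ontology (illustrative only).
(def partitions
  {:charge #{:Positive :Neutral :Negative}
   :size   #{:Tiny :Small :Large}})

(def amino-acids
  {:Alanine  #{:Neutral :Tiny}
   :Arginine #{:Positive :Large}})

;; A set works as a predicate, so this keeps the facets in `pvalues`.
(defn values-in [facets pvalues]
  (filter pvalues facets))

;; Every amino acid should carry exactly one value from each partition.
(deftest one-value-per-partition
  (doseq [[aa facets] amino-acids
          [pname pvalues] partitions]
    (is (= 1 (count (values-in facets pvalues)))
        (str aa " should have exactly one " pname " value"))))

(run-tests)
```

Because the check is ordinary test code, it runs from leiningen like any other test.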
How do you continuously integrate your ontology?
-
We can test ontologies with standard frameworks
-
These can run directly from leiningen
-
This workflow allows the use of standard CI environment with no changes
-
We use github/travis-CI
-
Note, iff you import ontologies via URI, you may not get a repeatable build.
Note
|
Once you can test, then you can continuously integrate. We do this with Travis-CI which is nice and it supports Clojure and Leiningen out of the box (er, cloud). It also continuously integrates with the software environment (including tawny). You can reason on there as well (tawny works headless just fine). If you import ontologies via URI you are totally dependent on them being stable or you will not get a repeatable build. It’s probably better to import ontologies via their version IRI anyway to ensure this, although it’s not widely done. |
What about advanced documentation for Ontologies?
-
Tawny-OWL ontologies are readable text
-
It is possible to embed rich readable comments
-
Also can use literate programming tools
-
noweb, or org-mode use traditional approach
-
I have also developed "lentic" which integrates with editor
Note
|
Raw OWL does allow documentation but it’s poor. With annotation properties you have no structure at all unless you use microsyntax. Moreover most annotations are sets, so there is no order. We have been experimenting heavily with literate programming tools. I started this quite a few years back, but they now work well, and we have built a specialised tool called "lentic". |
How do I collaboratively develop my ontology?
-
The same way as all software
-
Version control for asynchronous, fork and merge with git
-
Collaborative chat use gerrit, or skype
-
Synchronous editing, try floobits, a web editor
Note
|
Collaborative development is not a new requirement and is, in fact, the default for some environments. Just use existing tools. Git if you want asynchronous development, or floobits, or even a virtual machine, tmux and Emacs in the console. Whatever. The point is, it’s not a problem for tawny. It’s a problem for many software engineers world-wide and they have provided some very, very slick solutions. |
Can I internationalise my ontology?
-
Can add internationalised labels
(label "Ciao" "it")
-
Can define internationalized function calls
(defn etichetta [l] (label l "it"))
-
Can use tawny.polyglot to use property bundles
Note
|
It’s very easy. Tawny’s programmability means that you can also support a
default language if you choose to, and have your ontologists use their own
native language for all parts of the system. Our example of |
Can I scaffold my ontology from existing source
-
Can "import" ontology terms from spreadsheet, XML or a database
-
Can work over existing source
-
Therefore can generate core of ontology
-
And expand it with manually annotated crosslinks
-
See paper in ICBO 2015!
What happens if the labels of read ontologies change
-
OBO ontologies use numeric IDs
-
These are unreadable, so we syntactically transform labels
-
If the label changes (but the ID remains the same), this is a problem
-
Can use tawny.memorize to remember mappings
-
Which adds aliases to those now missing (with optional "deprecated" warnings)
Note
|
The problem here is that we have to do something to get readable names for OBO style ontologies. But we are now using a part of the OBO style ontology that is open to change with, perhaps, fewer guarantees than for identifiers. Tawny has support for this. It solves the problem by saving the mapping that it creates between a label and a URI. If the URI remains, but the label disappears, then Tawny adds an alias and a deprecation warning. |
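The bookkeeping described here can be sketched in plain Clojure. This is not how `tawny.memorize` is implemented; `aliases-needed` and the sample labels and identifiers are hypothetical, and only show the comparison between a remembered mapping and the current one.

```clojure
;; Hypothetical sketch: `remembered` and `current` both map a readable
;; label to an identifier. An alias is needed for every remembered label
;; whose identifier still exists but whose label has disappeared.
(defn aliases-needed [remembered current]
  (let [current-ids (set (vals current))]
    (into {}
          (filter (fn [[label id]]
                    (and (contains? current-ids id)
                         (not (contains? current label))))
                  remembered))))

;; The label for EX_0001 changed between two reads; EX_0002 is unchanged.
(aliases-needed {"heart" "EX_0001", "lung" "EX_0002"}
                {"cardiac_organ" "EX_0001", "lung" "EX_0002"})
;; => {"heart" "EX_0001"}
```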
How do you convert an existing ontology to Tawny
-
tawny.render can perform a syntactic transformation
-
Given OWL, it provides equivalent Clojure code
-
Used interactively to provide documentation
-
Can be used to port an ontology
-
Currently "patternising" ontology is manual
-
See Jennifer Warrender’s PhD thesis where she did this with SIO
Note
|
We do have a methodology for doing this. We can render most of the ontology
automatically, which provides the basis for this kind of port. But actually
making use of the advanced features of tawny (like patterns or the
|
How Fast is Tawny
-
For a raw, un-patternized ontology, Tawny takes about 2x as long as reading OWL/XML
-
Tested by rendering and loading GO
-
About 56Mb of lisp
-
Loads in about 1min
-
Most of excess time is in parsing, (Clojure also compiles)
-
Patternized ontology would involve less parsing
Note
|
In short, it’s fast enough that you are probably never going to notice it. We cannot currently test how much difference the patternisation would make, but it might be substantial. It is also worth noting that with an ontology the size of GO, iff it were developed in Tawny, it would be unlikely to be a single file. Interactively (i.e. in the editor) you probably would not be loading the whole ontology most of the time anyway. |
Can I integrate more tightly with protege?
-
We have built a GUI shell into Protege
-
Can also use Protege to open a Clojure REPL via a socket
-
Protege then displays directly the state of Tawny
-
Good for demonstration
-
But a little flaky for normal use
-
Having Protege reload an OWL file easier
Note
|
This should work nicely and it does, but the truth is that at the moment a REPL opened inside Protege hangs periodically and I do not know exactly why; I suspect it is that Protege is not entirely happy with having its data structures changed underneath it, but I have not had the leisure to debug this yet. In our hands, the auto-reload function works well. I tend to render first to OMN and look at that. Tawny also has a documentation capability which shows you the "unwound" definition of terms. And then finally I use protege after that. |
How does Tawny affect dependency management with ontologies?
-
Clojure uses maven dependency management
-
We can now publish ontologies as maven artefacts
-
And specify dependencies, with versions, and tooling
-
Can publish on Maven central or Clojars (no infrastructure to maintain!)
-
Separates ID and download location — disobeys LOD principles
-
But fulfils SLOD principles.
Note
|
Clojure uses maven dependency management. As a tawny ontology is just a piece of Clojure, we can use the same mechanism with tawny ontologies, which means that we can specify ontological dependencies too. We can specify version ranges (OWL doesn’t allow this to my knowledge). And we can reuse tooling. We can use leiningen to show us a dependency graph, we can look for version conflicts, we can exclude duplicates from the transitive closure. Interestingly, we can also publish our ontology independently from our IDs. So, we can get someone else to maintain all the infrastructure for deployment (including of multiple versions) without having to adopt their identifiers (like bioportal). This rather breaks the Linked Open Data (LOD) principles, of course, which say that IDs should resolve. Using maven dependencies we don’t need this at all. But it fulfils the SLOD (significant load of dependencies) principle which says if your software has lots of dependencies and lots of different people maintaining the infrastructure for their availability it is going to break all the time. Thanks to Helen Parkinson for inspiring (a slightly different version of) the SLOD acronym. |
Can I link ontologies into software?
-
OWL API objects become first class entities in Clojure
-
Can refer to them directly
-
We integrated Overtone — a music generation system
-
Added in Tawny-OWL and the Music Ontology
-
We now have software that plays a tune
-
And provides OWL metadata about that tune
-
More to investigate here.
Note
|
One of the great unexplored areas of Tawny at the moment is how much value we can get embedding an ontology into software. We did have a very short project integrating a semantic system (tawny) with a music generation system. This works and was fun. I think there is a lot of scope for research in this area yet. |
What’s this :super? Why not :subclass?
-
Manchester syntax uses SubClassOf:
-
Tawny uses :super for the same purpose!
-
Confusing!
-
Manchester syntax is actually backward
-
In Tawny, all frames are A has :frame B
-
In Manchester A is a SubClassOf: B
Note
|
I made this change very carefully and was very reticent about it: not least
because it made my main user of Tawny at the time (Jennifer Warrender) change
all of their existing code. But I had really confusing code inside Tawny where
my I changed from ":subclass" to a more plain ":super" at the same time. This opens up a slight risk because object-property has the same frame but for a different purpose. I do not worry about this too much because other tools will pick up, for example, the use of a class as a super-property. |
Conclusions
-
Hope the tutorial was worthwhile!
-
Tawny can change the way you build ontologies
-
Will actively support use
-
Always interested in future collaborations.
Acknowledgements
-
Jennifer Warrender, Karyotype Ontology, Tawny-OWL
-
Anthony Moorman, Driving use case
-
Ignazio Palmisano, OWL API Support
-
Matt Horridge, Protege, OWL API Support
-
Robert Stevens, Amino Acid Ontology