How, What and Why to Test an Ontology

Structure

The Karyotype Ontology
Tawny-OWL
Classification of ontology tests
How, What and Why
Discussion

Note	Today, I am going to talk about some work performed by Jennifer Warrender on the karyotype ontology. It describes a new way of building ontology tests and the hierarchy of tests which we have developed.

Use Case (chromosomes)

The human karyotype is complex to describe
Alterations more so

Note

Obviously, if we are going to describe testing, we need to describe what we are testing. So, first our driving use case. This is a picture of a chromosome 9, showing its characteristic banding pattern. Together, all the chromosomes make up a karyotype. This is hard to describe, and rearrangements of it are even more so.

Use Case (karyotypes)

Current representation comes from ISCN
Written in a book, no computational representation.

47,XXY
46,Xc,+21
46,XY,+21c,-21
45,XY,-10,der(10)t(10;17)(q22;p12)
46,XY,der(7)t(2;7)(q21;q22)ins(7;?)(q22;?)
46,XX,der(8)t(8;17)(p23;q21)inv(8)(p22q13)t(8;22)(q22;q12)
46,XX,der(9)del(9)(p12)t(9;22)(q34;q11.2),der(9)t(9;12)(p13;q22)inv(9)(q13q22)
An ontological representation seems like a nice idea

Note	The specification for this is written down in a book somewhere. There is no computational representation of it. Building an ontology seemed like it might be a profitable way to do this. We have now built this ontology but, along the way, we have built a novel tool for ontology development.

Tawny-OWL

Novel tool
Library written in Clojure, that wraps around the OWL API
Textual user interface
http://github.com/phillord/tawny-owl
Motivated by karyotype work

Note

The tool is called "tawny-owl". It’s actually an application, written in Clojure, that wraps the OWL API, which also underpins Protégé. I like to call it a textual user interface, rather than a library, because it has been designed for ontology development, not as a library for OWL manipulation.

Tawny-OWL and the Karyotype Ontology

Chromosome Bands are highly repetitive
Allows Pattern-Driven development
Define patterns for Downstream usage
Allowed us to try different axiomatisation
And allows us to test

Note	We built Tawny because chromosome bands are very repetitive. It gives us a number of features, which I just list here, except for the one that is critical to this talk: it allows us to test.

Tawny-OWL

Modelled on Manchester Syntax
Patterns and simple statements in a single syntax

(defclass Pizza
   :label "Pizza"
   :super
    (owl-some hasTopping PizzaTopping)
    (owl-some hasBase PizzaBase))

Note	Tawny’s base syntax is modelled on (but different from) Manchester syntax. We can build patternized sections of our ontology freely.

Why Test?

A single error in code can generate many errors in the ontology
KO is repetitive and large
Tests become an encoding of the (paper) specification
The ontology is reasoned over

Note	One of the few disadvantages of pattern use is that it is easy to generate a lot of mistakes very quickly. So, we need to test this. Moreover, we used tests to encode part of the specification. And finally, the ontology is reasoned over. We need to check that the implications are what we expect.

How Test?

Clojure (and therefore Tawny-OWL) also has a test framework
Allows "unit" testing as well as other forms
We reuse and recast software testing for ontology testing

(deftest plus
  (is (= 4 (+ 2 2))))

Note	Fortunately, Clojure already has a test framework or harness that we can use. It looks like this — we can reuse and recast software testing for ontologies. Which is good, because software testing software is pretty good (and has been heavily tested!).
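
As a slightly fuller illustration of the harness, here is a minimal sketch that uses only clojure.test. The namespace name and the extra rows are made up for illustration; the point is that "are" turns one assertion template into a small table of cases, which is the shape that repetitive ontology checks tend to take.

```clojure
(ns example.arithmetic-test
  (:require [clojure.test :refer [deftest is are run-tests]]))

(deftest plus
  ;; A single assertion, as on the slide.
  (is (= 4 (+ 2 2)))
  ;; `are` expands the template once per row of arguments.
  (are [expected x y] (= expected (+ x y))
    4  2 2
    0  1 -1
    10 7 3))
```

Calling (run-tests 'example.arithmetic-test) at the REPL runs the test and prints a summary.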

Hierarchy of Tests

Note	From this we have generated a hierarchy of tests. This is an informal hierarchy and not an ontology.

Software-Bound Test

What: Tests which do not reference any ontology object, but which test software that affects the ontology generated
Why: If the software is wrong, the ontology will be too!
How: traditional Unit Test
Example:
Predicate on a string
Does it represent a short (p) arm or a long (q) arm

(defn str-pband? [band]
 (re-find #"p" band))

(is (h/str-pband? "HumanChromosome1Bandp10"))
(is (not (h/str-pband? "HumanChromosome1Bandq10")))

Ontology-Bound Test

What: Tests which do reference an ontology object, but do not require reasoning to test.
Why: Test whether the ontology generation is as expected
How: Unit Tests, using transitive closure hierarchy testing
Example:
Predicate on whether an entity is a band or not

(defn band? [x]
 (or
  (= x HumanChromosomeBand)
  (superclass? human x HumanChromosomeBand)))

(is (h/band? h/HumanChromosome1Bandp10))

Note	"Does not require reasoning" generally means that it does not require full DL reasoning. This example is effectively using a very limited form of structural reasoning, with "superclass?".

Ontology-Bound Test

Our ability to express ontology-bound tests conveniently is limited
Working on "tawny.query" to allow richer matching

Reasoner-Bound Test

What: Tests which reference an ontology object, and which require reasoning to test.
Why: Test whether the ontology has the implications it should have
How: Unit Tests, using reasoner predicates
Unit Test syntax is too cumbersome in some cases
Described later.

(is (r/isuperclass? i/k46_XY n/MaleKaryotype))

Note	r/isuperclass? calls a reasoner via the OWL API; in our case this is HermiT.

Probe-Bound Test

What: Tests which change the ontology, and then reason
Why: Tests whether ontology detects incoherency
Why: In practice, mostly that we have appropriate disjoints!
How: Unit Tests, with macros for safely changing the ontology
How: Maintaining test independence is key here!

(is (not
      (with-probe-entities
       [_ (owl-class "_"
             :super HumanAutosome
                 HumanSexChromosome)]
       (r/coherent?))))

Note	We have only a few of these, although probably we should have a few more. In practice, because we use patternised development, having lots is not really needed.
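
To make the "change, check, restore" idea concrete, here is a toy sketch of how such a probe might keep tests independent. It is not Tawny's implementation and uses no reasoner: the atom-held "ontology", the "coherent?" function and the "with-probe" macro are all hypothetical stand-ins written for this example. The only point is that the probe axioms are removed again, even if the body throws.

```clojure
(ns example.probe-test
  (:require [clojure.test :refer [deftest is run-tests]]))

;; Toy stand-in for an ontology: asserted [class superclass] pairs
;; plus pairs of classes declared disjoint.
(def ontology
  (atom {:subclass-of #{}
         :disjoint    #{#{:HumanAutosome :HumanSexChromosome}}}))

(defn coherent?
  "True unless some class sits directly under both members of a disjoint pair.
  A real test would ask a reasoner instead."
  [{:keys [subclass-of disjoint]}]
  (let [supers-by-class (reduce (fn [m [c s]] (update m c (fnil conj #{}) s))
                                {} subclass-of)]
    (not-any? (fn [[_cls supers]]
                (some #(every? supers %) disjoint))
              supers-by-class)))

(defmacro with-probe
  "Adds the probe axioms, evaluates body, then restores the saved ontology."
  [axioms & body]
  `(let [saved# @ontology]
     (try
       (swap! ontology update :subclass-of into ~axioms)
       ~@body
       (finally (reset! ontology saved#)))))

(deftest autosome-and-sex-chromosome-are-disjoint
  (is (not (with-probe [[:_probe :HumanAutosome]
                        [:_probe :HumanSexChromosome]]
             (coherent? @ontology))))
  ;; The probe must not leak into later tests.
  (is (coherent? @ontology)))
```

Restoring in a finally block is one simple design choice here; the important property is that each test starts from the same ontology.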

How we Test: Ontology-Bound

ISCN contains many "examples"
Actually these effectively form part of the specification
Many have been encoded
But testing all these is painful, using unit tests

Note	Testing everything would be entirely possible. Tawny is extensible, so we could create new syntax, but that would still be complex to use.

How we test

Put all the names into a spreadsheet
With facets ("diploid, haploid, male, addition, autosomal gain")
Put 1, 0, or -1 (yes, don’t know, no) in each cell

Note	Instead, we used a different approach. We took all the statements that we should be able to reason true or false and put them into a spreadsheet. This allows a pretty large number of tests to be specified quickly; you can see some of it here!

How we test

Use Clojure to parse the spreadsheet and test the assertions
Spreadsheet is part of source code
We use Clojure to generate and then cache unit tests (for performance)

Note	Java is good at reading Excel spreadsheets. We actually generate source code from the spreadsheet, which Clojure is good at because it is a Lisp, but this is just a performance optimisation.
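
The spreadsheet approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the sheet is inlined as CSV text rather than read from Excel, the rows and facets are invented, and "entails?" is a hypothetical lookup standing in for a reasoner query. It also asserts directly, rather than generating and caching test source as described above.

```clojure
(ns example.sheet-test
  (:require [clojure.string :as str]
            [clojure.test :refer [deftest is run-tests]]))

;; Hypothetical sheet: one row per karyotype, one column per facet.
;; 1 = yes, 0 = don't know, -1 = no.
(def sheet
  (str/join "\n" ["karyotype,male,diploid"
                  "k46_XY,1,1"
                  "k46_XX,-1,1"
                  "k45_X,-1,0"]))

;; Stand-in for the reasoner: the facets each karyotype is inferred to have.
(def inferred
  {"k46_XY" #{"male" "diploid"}
   "k46_XX" #{"diploid"}
   "k45_X"  #{}})

(defn entails? [karyotype facet]
  (contains? (get inferred karyotype #{}) facet))

(defn parse-sheet
  "Returns [karyotype facet expected?] triples, skipping 'don't know' cells."
  [text]
  (let [[header & rows] (map #(str/split % #",") (str/split-lines text))
        facets (rest header)]
    (for [[karyotype & cells] rows
          [facet cell] (map vector facets cells)
          :let [v (Long/parseLong (str/trim cell))]
          :when (not (zero? v))]
      [karyotype facet (pos? v)])))

(deftest spreadsheet-assertions
  (doseq [[karyotype facet expected] (parse-sheet sheet)]
    (is (= expected (entails? karyotype facet))
        (str karyotype " / " facet))))
```

Each non-zero cell becomes one assertion, so adding a test is a matter of filling in a cell.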

Test Numbers

Software-Bound: 53
Ontology-Bound: 759
Reasoner-Bound: 2273
Probe-Bound: 3

Note	Taken together, this gives us a pretty large number of tests.

Conclusions

The karyotype ontology is complex and needs testing
With Tawny-OWL we can reuse unit test paradigm
We can build new interfaces, including spreadsheets
We have a hierarchy of test types
Tawny-OWL can be used to test any OWL ontology.
Can we bring test-driven development to ontology?

Note

Conclusions: different kinds of tests test different bits of the ontology. Most of these are relevant to any ontology, and the rest are relevant to many. We think that this form of testing is very useful, and note that Tawny provides a nice testing environment for any OWL (or OWL-convertible) ontology.

Acknowledgements

Did the ontology: Jennifer D. Warrender – jennifer.warrender@newcastle.ac.uk
Wrote Tawny-OWL: Phillip Lord – phillip.lord@newcastle.ac.uk
Driving Use Case: Anthony Moorman, Leukaemia Research Cytogenetics Group
Paper – http://arxiv.org/abs/1505.04112
Tawny-OWL – http://github.com/phillord/tawny-owl
Tawny-Karyotype – http://github.com/jaydchan/tawny-karyotype
Plain English Summary – http://www.russet.org.uk/blog/3074