ISMB finishing

Finally, ISMB is coming to an end. The database and ontologies track had a couple of interesting talks, including Suzi Lewis' the day before. To finish off, I am in an Open Science meeting. It is rather smaller than I thought it would be; it is not very well attended, but then it's at the end of the conference.

Not a bad conference, but too long as always.


ISMB

Yesterday was the SIG co-ordinators meeting for ISMB. One of the big and recurrent issues (besides the timing of coffee breaks) was the timing of ISMB. At 7 days, ISMB is a long, long conference and is a bit of a killer. Of course, bringing it down to 4 days will mean that more events will run concurrently. Live with it, I say.

Bio-Ontologies was a success, but I want to think about the future (Blair-like, perhaps I am thinking of my legacy, as I will not chair it for that much longer). Perhaps, "Bio-Ontologies: knowledge in biology" would be a way to go — I want to move the workshop away from a technology and more toward a function.


10th Annual Bio-Ontologies Meeting

Today is the day of the Bio-Ontologies SIG meeting, which I have now co-organised for 4 years or so. It's a surprisingly large amount of work to do, not least this year because we had 36 submissions. The organisation of this is a large part of the effort, but it has made for a strong programme; it's gratifying to see that we have an audience of size to match.

09:10

We had a moment of worry when the first speaker didn't register, but Mark Musen is a notable replacement, talking about representing OBO to OWL mappings.

09:30

Following Mark's talk about using more rigorous models of OWL, Simon Jupp is talking about using the more light-weight semantics of SKOS, which turn out to be well suited for document navigation.

09:50

Lina Yip covers a familiar problem, mapping between one resource and another (in this case MeSH and Swiss-Prot), to support the flow of knowledge from bioinformatics research toward medical practice.

10:10

The mapping theme was continued (you'd almost think it was planned!) by Julie Chabalier, who has mapped a number of resources to build a query warehouse.

11:00

Judy Blake has just spoken on GO annotations and exactly what they mean. It's good to see an increased formality to the relationships between a GO term and the entity that it is describing. This talk has generated the most questions so far, mostly asking for more details.

11:29

Mikel Arungen is now talking about design patterns, which are analogous to software design patterns. These should help to bridge the gap between the desire to write rigorous logical definitions and the difficulty of doing so.

11:51

Daniel Schober is now describing efforts to standardise naming conventions, fitting with the theme of methods to help people produce interoperable and standardised ontologies.

12:10

Lunch, and nearly on time. Most of the lag was from the coffee break, so I don't feel that I, as timekeeper, can be held responsible for this! Next is the poster session, followed by the panel.

14:00

Well, the panel session has an element of self-indulgence about it. Robert has been doing this for much longer than I, but even for me it's four years. After such a long span, it's amazing that we have got to ten years. All of the speakers commented on how big the community has got, and that we are all a little surprised about this. The current religious themes running through bio-ontologies are also here, but so far fairly muted. A good panel all in all, and a nice marker for 10 years.

16:00 (ish)

Larisa Soldatova's talk addressed the need for a tool enabling scientists to add additional semantics to their written work.

16:30

Catia Pesquita is talking about semantic similarity, which is a topic close to my heart. An interesting and careful body of work which covered the ground well, I thought.

16:50

Kieran O'Neil is now showing some interesting research, where he has been investigating novel techniques for query building over integrated databases.

17:10

Irena Spasic talked about building term lists for metabolomics from literature mining. Once again she highlighted the need for access to full papers.

17:30

Daniel Faria took the graveyard slot, and discussed measures for protein clustering using sequence and GO information.

Conclusions

Overall a good day. It was great to have so many papers, and such a lively debate. This also marks the retirement of Robert as co-chair. His presence will be greatly missed; he's taught me everything I know about being relaxed and not faffing too much while conference organising.

Onward till next year.


Preservation for the Future

I've been attacking email systems this week. I've been helping to transfer email from the Nottingham Exchange server up to Newcastle. The process has not gone easily. I think that the problem is that university IT departments think mostly about their current users, rather than users coming or going elsewhere. To me this is a real problem: for an academic, their correspondence is an essential ingredient of the historical record, their knowledge of what they have done.

Spurred on by this, I decided to recover all of my mail from the archives where I have kept it, and place it into my current email system. This is made easier for me because I have used Emacs for pretty much my entire time on a computer; I remember a DOS-based application before that. I've moved from RMAIL to Gnus, but that is it. Gnus uses a one-message-per-file, text-based format. It's pretty future-proof; I suspect in 2000 years, when people look back they will assume that everyone used Gnus and similar applications, as all the PST files will be unreadable. There's a big gap in the middle of my email for 6 months after I got to Newcastle, when I used Outlook. A pity.
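To give a feel for why a one-message-per-file text format ages well, here is a minimal sketch in Python that reads such a directory using nothing beyond the standard library. It assumes each file holds a single message with ordinary mail headers followed by a body; the function name summarise_mail_dir and the directory layout are my own illustration, not anything taken from Gnus itself.

```python
import email
from pathlib import Path


def summarise_mail_dir(directory):
    """Return (date, sender, subject) for each message file in a directory.

    Assumption: every regular file in the directory contains exactly one
    plain-text mail message (headers, blank line, body).
    """
    summaries = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        with path.open("rb") as handle:
            message = email.message_from_binary_file(handle)
        # Missing headers simply come back as None.
        summaries.append((message["Date"], message["From"], message["Subject"]))
    return summaries
```

Calling summarise_mail_dir on a folder of old mail gives a list of header tuples that can be sorted or searched; even without the email module, the same files could be read with any text editor, which is really the point.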

My total collection of email is 1.4G in size; I've been reasonably careful about dumping 100M attachments over the years. The earliest email sent by me talks about SET domains in a Drosophila gene. The oldest email I can find sent to me comes from 1994. It's from a nice bloke I remember meeting on one of the guitar boards, called Paul R. Leach. At that time he was at Colorado. He was kind enough to send me some Herco Flex 50s from the US. These are guitar plectrums that seemed to have disappeared from the market at the time. I think I still have a few of them left. Thanks Paul! An act of generosity that I still remember 13 years later. The internet was a kinder place in those days.


Aging File Formats

An interesting article on the BBC today about digital preservation. The issue is a well-known one: file formats go out of date very quickly. They have a chap from Microsoft showing that, using a virtual machine, you can still open Word 3.0 documents; this seems to miss the point, to my mind. Great, so I can still read it, with my eyes, by looking at it. But can I compute over it? If we are to take this approach, then it might make more sense to just print out everything that we want to store and save the paper.

I think that it's good that we are moving toward open documentation standards. Microsoft's standardisation of their file formats is welcome, if belated. However, it has to be acknowledged that a large, 6000 page specification is going to be a problem in the future. It's notable that I have 15 year old LaTeX documents on my machine and on the whole they still just work; when they do not, almost all of the knowledge in them is easily recoverable with a text editor. As far as I can see, the only way that you can guarantee that a file format will be usable into the future is to make it as simple as possible.


Page by Phillip Lord
Disclaimer: This is my personal website, and represents my opinion.