CSC8312 -- Bioinformatics Theory and Applications
New
The IDs are now available.
This assignment is designed to test your ability to understand and
computationally derive useful biological information from a
DNA/Protein sequence. You will need to write a short report on
protein sequences that you have been given, checking the annotation,
the genomic context and sequence features.
You will be provided with Uniprot IDs for 4 proteins from different
organisms. You should do an initial analysis of all 4 proteins and then pick
the 1 you think is most interesting. You need to
write a short report on it, checking the accuracy of the Uniprot record, and
providing evidence for as many of the statements in the record as possible.
In particular, you should:
- Retrieve the nucleotide sequence of the genomic segment from which
the transcript is derived. You will need to retrieve upstream and
downstream sequence also.
- Using relevant tools, check whether predicted coding regions
correspond to the amino acid sequence of the protein when
translated, identifying relevant introns and exons if possible. Do
not rely on the annotations in the EMBL/GENBANK records, but
reproduce the results.
- Check any genomic features that you can find in the genomic DNA
records, and see which tools predict these computationally.
- Many Uniprot records have functional assignments. Using
appropriate tools, report on whether you agree with this functional
assignment.
- For those records with associated publications, describe the
relationship between the sequence and the publications.
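As an illustration of the second point above (checking that predicted coding regions translate to the protein), the following is a minimal Python sketch, not a required method. It assumes you already have the genomic segment as a string and a list of exon coordinates from your own gene prediction. The function names (`splice`, `translate`, `first_mismatch`), the toy sequence and the coordinates are all hypothetical, and the standard genetic code is assumed, which may not be correct for your organism or organelle.

```python
from itertools import product

# Standard genetic code (an assumption; some organisms and organelles differ).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def splice(genomic, exons, strand="+"):
    """Join exons given as 1-based inclusive (start, end) pairs on the
    strand of `genomic` as supplied; reverse-complement if strand is '-'."""
    cds = "".join(genomic[s - 1:e] for s, e in sorted(exons)).upper()
    return reverse_complement(cds) if strand == "-" else cds


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon.
    Codons containing unexpected characters are reported as 'X'."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def first_mismatch(predicted, reference):
    """Return the 1-based position of the first difference, or None."""
    for i, (a, b) in enumerate(zip(predicted, reference)):
        if a != b:
            return i + 1
    if len(predicted) != len(reference):
        return min(len(predicted), len(reference)) + 1
    return None


# Toy example (hypothetical data): two exons separated by a GT...AG intron.
toy_genomic = "CCATGGCCGTAAGTTTTCAGAAATGATT"
toy_exons = [(3, 8), (21, 26)]
toy_protein = translate(splice(toy_genomic, toy_exons))  # "MAK"
```

A result of `None` from `first_mismatch(toy_protein, reference)` means the predicted exons reproduce the reference protein exactly; any other value tells you where to start looking for a wrong splice site or frame shift. In your report, the exon coordinates should come from your own analysis rather than from the EMBL/GENBANK annotation.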
Deliverables
A finished paper produced in the style of a Bioinformatics
paper. You can see some examples of the style at
http://www.oxfordjournals.org/bioinformatics/for_authors/general.html
The paper should obey a strict 6 page limit. People exceeding this
page limit will be penalised. You must obey the style guidelines for
Bioinformatics; you should not shrink font sizes.
Marks will be given for scientific writing style (10%), background
(10%), methods (35%), results of analysis (30%) and discussion of the
results (15%).
In particular, additional marks will be awarded for:
- The application of a wide range of bioinformatics tools. The
discipline is constantly growing and changing. We do not teach all
the tools that you should be aware of.
- You should record your work, and the methods that you have used,
accurately and consistently.
- The reasons for any conclusions you make should be described
accurately. It is as important to know why you have concluded
something as what you have concluded.
- You should present the work in as concise a manner as possible. Use
tables, figures and graphs wherever appropriate.
- Adequate references and reference lists.