This is the sole assignment for the CSC8303 course. Do not worry if you are unsure of the techniques needed; they will become clear during the lectures and practicals.
You have been asked, as part of a larger bioinformatics project, to design a package that parses EMBL files. As your work will be utilised by other members of the lab, it is important that you test your API thoroughly. You should know how to find an EMBL file; please ask if you are not sure. It is important that you comment your code and provide suitable documentation (5 marks). Finally, you must follow a suitable style for your code, including appropriate variable names, indenting and spacing. Up to 25% of available marks will be awarded for clearly written code that conforms to standard coding conventions.
You are required to carry out all of steps 1-3 and one part of step 4.
EMBL files adhere to a clear, well-defined structure: each line starts with a two-character tag, followed by three spaces, then a value. For example, the first line of the file is the ID line, which includes the entry ID as well as the sequence length.
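The line layout described above can be illustrated with a minimal sketch. It is not a required design: the class EmblLineDemo, the holder type TaggedLine and the method parseLine are hypothetical names chosen for this example, and the ID line used in main is only an illustrative sample. The sketch assumes that any line without a two-character tag and a three-space separator is simply skipped.

```java
import java.util.Optional;

class EmblLineDemo {

    /** Hypothetical holder for one tag/value pair; not part of any library. */
    static final class TaggedLine {
        final String tag;
        final String value;

        TaggedLine(String tag, String value) {
            this.tag = tag;
            this.value = value;
        }
    }

    /**
     * Splits a line of the form "TT   value" into its tag and value.
     * Returns an empty Optional when the line does not have that shape.
     */
    static Optional<TaggedLine> parseLine(String line) {
        // Need two tag characters followed by the three-space separator.
        if (line.length() < 5 || !line.startsWith("   ", 2)) {
            return Optional.empty();
        }
        String tag = line.substring(0, 2);
        if (tag.trim().isEmpty()) {
            // Assumption: a blank tag means an indented line with no tag.
            return Optional.empty();
        }
        return Optional.of(new TaggedLine(tag, line.substring(5).trim()));
    }

    public static void main(String[] args) {
        // Illustrative sample line, not taken from the assignment.
        String idLine = "ID   X56734; SV 1; linear; mRNA; STD; PLN; 1859 BP.";
        parseLine(idLine).ifPresent(t ->
                System.out.println(t.tag + " -> " + t.value));
    }
}
```

Returning an Optional keeps the caller in charge of what to do with lines that carry no tag; whether such lines should be ignored or handled separately is a design decision for your own parser.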
1. [...] EMBL.java and Sequence.java, with EMBL classes having a Sequence object as a field. (8 marks)
2. [...] Scanner object, and then parses the EMBL file into the classes you have defined in 1. (4 marks)
3. The Sequence class should implement the java.lang.CharSequence interface. It should store the sequence as a List of java.lang.Character objects. In particular, charAt(int) should extract the appropriate Character from the list and return the equivalent char. (6 marks)
4. One of the following:
   - [...] (AGGAGGU) in the DNA sequence.
   - [...]
     OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; Gorilla.
     OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; Homo.
     Hence, the list "Eukaryota" "Metazoa" "Chordata" "Craniata" "Vertebrata" "Euteleostomi" "Mammalia" "Eutheria" "Euarchontoglires" "Primates" "Catarrhini" "Hominidae" should be returned.
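The expected result in the example above matches the longest shared prefix of the two lineages, read from the root downwards. The sketch below works through that reading under stated assumptions: the class CommonLineageDemo and the methods parseLineage and commonLineage are hypothetical names, the input is the value of an OC line with the tag and separator already removed, and a lineage that spans several OC lines is assumed to have been joined into one string beforehand.

```java
import java.util.ArrayList;
import java.util.List;

class CommonLineageDemo {

    /** Splits an OC value such as "A; B; C." into its taxon names. */
    static List<String> parseLineage(String ocValue) {
        List<String> taxa = new ArrayList<>();
        for (String part : ocValue.split(";")) {
            String name = part.trim();
            // The last taxon in a lineage ends with a full stop.
            if (name.endsWith(".")) {
                name = name.substring(0, name.length() - 1);
            }
            if (!name.isEmpty()) {
                taxa.add(name);
            }
        }
        return taxa;
    }

    /** Returns the taxa shared by both lineages, stopping at the first difference. */
    static List<String> commonLineage(List<String> a, List<String> b) {
        List<String> shared = new ArrayList<>();
        int limit = Math.min(a.size(), b.size());
        for (int i = 0; i < limit && a.get(i).equals(b.get(i)); i++) {
            shared.add(a.get(i));
        }
        return shared;
    }

    public static void main(String[] args) {
        String prefix = "Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; "
                + "Euteleostomi; Mammalia; Eutheria; Euarchontoglires; "
                + "Primates; Catarrhini; Hominidae; ";
        List<String> gorilla = parseLineage(prefix + "Gorilla.");
        List<String> human = parseLineage(prefix + "Homo.");
        System.out.println(commonLineage(gorilla, human));
    }
}
```

For the two lineages in the example, the printed list runs from Eukaryota to Hominidae and omits Gorilla and Homo, because the comparison stops at the first position where the names differ.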
Files should be uploaded as Java source files. Any files that you wish to include in support of the code should be .txt files.