Publication Title

“Optimisation of Parameters for the Assessment of Unclassified Disease Gene Sequence Variants”

Authors

“Jennifer D. Warrender, Daniel C. Swan, Ciaron McAnulty”

Abstract

“Bioinformatics tools are routinely employed to assess the clinical significance of unclassified missense variants identified following mutation screening. There is a wide variety of ‘Pathogenic or Not’ (PON) tools available, some of which require a Multiple Sequence Alignment (MSA) of the gene in question to give the best results. We have investigated the production of qualitatively ‘best’ MSAs and a number of PON tools to determine practices that can be used to maximise the quality of PON predictor results. To optimise parameters for the PON predictors, we used four approaches: investigation into which of the seven MSA tools (ClustalW, DIALIGN, HMMER, MAFFT, MUSCLE, SAM, T-Coffee) gave the best quality alignment, using known MSA benchmarks (BAliBASE and PREFAB); obtaining control data (known variants for test genes); investigation of six PON predictors (AlignGVGD, MAPP, PolyPhen, PMut, SIFT, SNPs&GO) and the best thresholds for AlignGVGD and PolyPhen PON predictions; and investigation into the optimal MSAs for each PON tool and the idea of a generic optimal MSA, using combinations of different phylogenetic species data scored with the Matthews Correlation Coefficient (MCC). We draw a number of conclusions from this work, including rules on selecting the best MSA tools, thresholds for AlignGVGD and PolyPhen, rules for selecting representative sequences to build MSAs, and the dependence of PON tool performance on the disease gene. From this work we conclude that predictions based on custom MSAs in conjunction with these PON predictors should be interpreted with caution, given the confidence levels we have seen with our data sets.”
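
For readers unfamiliar with the scoring metric named in the abstract, the following is a minimal sketch of how a Matthews Correlation Coefficient can be computed from the four confusion-matrix counts of a binary predictor. It is not the authors' code. The function name, the mapping of "pathogenic" to the positive class and "neutral" to the negative class, and the example counts are all assumptions made for illustration only.

```python
import math


def matthews_corrcoef_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return the MCC for a binary classifier given its confusion-matrix counts.

    Assumed mapping (hypothetical, not taken from the abstract):
      tp = pathogenic variants predicted pathogenic
      tn = neutral variants predicted neutral
      fp = neutral variants predicted pathogenic
      fn = pathogenic variants predicted neutral
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By common convention, MCC is reported as 0 when any marginal total is zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the study.
    score = matthews_corrcoef_from_counts(tp=40, tn=30, fp=10, fn=20)
    print(f"MCC = {score:.3f}")
```

The score ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction correct), and it uses all four counts, which makes it less sensitive than plain accuracy to an imbalance between pathogenic and neutral control variants.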

Reference

"J. D. Warrender, D. C Swan and C. McAnulty. Optimisation of Parameters for the Assessment of Unclassified Disease Gene Sequence Variants. ACC/CMGS Spring Meeting, 2011."