Signal-processing-based bioinformatics approach for the identification of influenza A virus subtypes in Neuraminidase genes.
Neuraminidase (NA) genes of the influenza A virus are highly promising candidates for antiviral drug development, but this potential can only be realized if the virus subtypes are identified correctly. In this paper, a hybrid predictive model is developed to detect the subtypes accurately and is tested on proteins obtained from four subtypes of the influenza A virus, namely H1N1, H2N2, H3N2 and H5N1, that caused major pandemics in the twentieth century. The predictive model is built in four main steps: (i) encoding the protein sequences as numerical signals by means of the EIIP (electron-ion interaction potential) amino acid scale, (ii) analysing these signals with the Discrete Fourier Transform (DFT) and extracting DFT-based features, (iii) selecting the most influential subset of the features with the F-score statistical feature selection method, and finally (iv) building a predictive model on the feature subset with a support vector machine classifier. The protein sequences were chosen for the high percentage identity they demonstrate within individual influenza subtype classes and for the high variation they display in percentage identity. This makes the proteins very difficult to distinguish from each other even when they belong to different subtypes. On this set of proteins, the predictive model yielded 98.3% accuracy under 5-fold cross-validation. The procedure also yields a twenty-feature subset that can help reveal spectral characteristics of the subtypes. The proposed model is promising and can easily be generalized to other similar studies.
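Steps (i) to (iii) of the pipeline can be illustrated with a minimal Python sketch. It is not the authors' implementation. The EIIP table holds the commonly cited values and should be checked against the scale used in the paper. The function names (`encode`, `dft_power`, `f_scores`, `top_k`), the zero-padding to a fixed length, the choice of the power spectrum as the feature, and the multi-class form of the F-score are all assumptions made for illustration. Step (iv) would pass the selected columns to any support vector machine implementation.

```python
import cmath

# Commonly cited EIIP values per amino acid (assumption: verify against the paper's scale).
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}

def encode(seq):
    """Step (i): map a protein sequence to a numerical signal, skipping unknown symbols."""
    return [EIIP[a] for a in seq.upper() if a in EIIP]

def dft_power(signal, n_points):
    """Step (ii): zero-pad or truncate to n_points (assumption), then return
    the power |X(k)|^2 for k = 1 .. n_points // 2."""
    x = (list(signal) + [0.0] * n_points)[:n_points]
    feats = []
    for k in range(1, n_points // 2 + 1):
        s = sum(v * cmath.exp(-2j * cmath.pi * k * n / n_points)
                for n, v in enumerate(x))
        feats.append(abs(s) ** 2)
    return feats

def f_scores(X, y):
    """Step (iii): per-feature F-score, here in an assumed multi-class form:
    between-class spread of the means over the sum of within-class sample variances.
    Each class needs at least two samples."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_all = sum(col) / len(col)
        between, within = 0.0, 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            m = sum(vals) / len(vals)
            between += (m - mean_all) ** 2
            within += sum((v - m) ** 2 for v in vals) / (len(vals) - 1)
        scores.append(between / within if within > 0 else 0.0)
    return scores

def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
```

With a feature matrix `X` built as `[dft_power(encode(s), n_points) for s in sequences]` and subtype labels `y`, `top_k(f_scores(X, y), 20)` would pick a twenty-feature subset of the kind the abstract describes, although the indices it returns depend entirely on the data and on the assumptions above.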
Citation: Chrysostomou, C. & Seker, H. (2013) Signal-processing based bioinformatics approach for the identification of influenza A virus subtypes in Neuraminidase genes. Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE, pp. 3066-3069
ISSN: 1557-170X