DEVELOPMENT AND APPLICATION OF NOVEL COMPUTATIONAL INTELLIGENCE TECHNIQUES TO THE MULTIVARIATE ANALYSIS OF METABOLOMICS BIOFLUIDS DATASETS
Recent decades have witnessed major advances in the development and application of computational intelligence techniques (CITs), which are now commonly applied in metabolomics and other omics analyses for disease diagnosis. This includes, amongst others, research performed on Niemann-Pick type C1 and C2 diseases (NPC1 and NPC2 respectively), the severest form of which may involve liver dysfunction. The frequent use of CITs in metabolomics studies is driven largely by the need for techniques that detect the major discriminatory metabolite variables for diagnosing disease and monitoring its progression. Alongside this is the pressing requirement to characterise the metabolic pathways involved, in order to improve our understanding of the molecular mechanisms underlying NPC1 disease. NPC1 is a rare neurodegenerative disorder attributable to loss of NPC1 gene function, which causes adverse fat storage at the lysosomal level (Mathieson, 2013; Xu et al., 2010). Plasma metabolite profiling can provide insights into disease diagnosis and prognosis, while providing a clear ‘picture’ of the metabolites altered during disease processes, including their early stages. Currently, biomarker discovery appears to be the most effective approach for monitoring disease progression (Mathieson, 2013; Ruiz-Rodado et al., 2014). In the present thesis, the intelligent tri-modelling techniques (ITMTs), a combination of CITs applied to the multivariate (MV) analysis of biofluid datasets, are proposed. Within the ITMTs, the scalar visualisation algorithm (SVA) is used for data visualisation and for the representation of high-dimensional data in two- or three-dimensional spaces. The optimum super support vector machine (OSVM) is employed for the MV classification of metabolic datasets.
Moreover, principal component regression (PCR) was employed for probabilistic classification and regression of the data. This was followed by investigations of correlations between these biomolecular disease features. Furthermore, the tri-ranking techniques (TRTs) were developed in order to rank the NPC1 disease features, together with those available for NPC1 liver dysfunction in a mouse model system, and so determine their importance in the further development of these diseases. High-resolution proton nuclear magnetic resonance (1H NMR) spectroscopy is used as a high-throughput multicomponent analytical technique to generate very large quantities of metabolic data, which hold essential and useful information regarding the metabolites analysed. Prior to the MV metabolomics analysis, a robust data handling technique based on balancing the dataset, feature selection, and stratified cross-validation is applied. Furthermore, the intelligent task technology fit theory has been proposed here, enabling swift, consistent and rational model development through threshold settings for model validation. Application of the ITMTs using SVA, OSVM and PCR, combined with the TRTs, has allowed the discovery of major discriminatory variables for NPC1 disease. Hence, using the blood plasma dataset, the SVA could diagnose NPC1 disease through the following potential biomarkers: hexacosanoate, (R)-3-hydroxybutyrate, L-fucose, lactate, 3-hydroxyisovalerate, citrate, N-acetyl-4-O-acetylneuraminate, methionine, and glutamine. Additionally, the SVA strategy highlighted the following major biomarkers in the 1H NMR NPC liver dysfunction dataset: glycogen, glutamate, glutamine, taurine, glycerophosphocholine, acetoacetate, myo-inositol, lactate, leucine, isoleucine, and alanine.
The PCR approach established a significant correlation between biomarker features for NPC1 disease, and likewise for the mouse model of NPC liver dysfunction progression. Moreover, the OSVM technique could clearly segregate the two classes of patients/animals in both disease pathogenesis studies. This thesis presents the intelligent tri-modelling techniques (ITMTs); using these together with 1H NMR-linked metabolite profiling, new biomarkers for the diagnosis of NPC1 disease and of NPC1-based liver dysfunction were discovered, and these biomarkers displayed very high accuracies. This may represent a major advance regarding the diagnosis of NPC1 disease and its pathological sequelae. Such biomarkers may serve as valuable assets for monitoring the effectiveness of appropriate treatments for this debilitating condition, for example miglustat.
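As an illustration of the stratified cross-validation step named in the data handling stage, the following is a minimal sketch only, and not the implementation used in the thesis. The function names `stratified_folds` and `train_test_splits`, the class labels, and the sample counts are all hypothetical, chosen for illustration. The idea is that each fold keeps roughly the same ratio of disease to control samples as the full dataset, which matters when the classes are small or unbalanced.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds that preserve class proportions.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin so that every fold
        # receives a near-equal share of this class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def train_test_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test)


if __name__ == "__main__":
    # Assumed toy labels: 6 disease samples and 9 controls.
    labels = ["NPC1"] * 6 + ["control"] * 9
    for train, test in train_test_splits(labels, k=3):
        counts = {c: sum(labels[i] == c for i in test) for c in ("NPC1", "control")}
        print(len(train), len(test), counts)
```

In this sketch each class is shuffled and dealt separately, so no fold can end up without any disease samples as long as each class has at least k members.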
- PhD