Self organising maps for variable selection: application to human saliva analysed by nuclear magnetic resonance spectroscopy to investigate the effect of an oral healthcare product
Self Organising Maps (SOMs) are derived from the machine learning literature and serve as a valuable method for representing data. In this paper, the use of SOMs as a technique for determining the most significant variables (or markers) in a dataset is described. The method is applied to the NMR spectra of 96 human saliva samples, half of which have been treated with an oral rinse formulation and half of which are controls, with 49 variables consisting of bucketed intensities. In addition, three simulations are described: two consist of the same number of samples and variables as the experimental dataset, and a third contains a much larger number of variables. Two of the simulations contain known discriminatory variables, and the remaining one is treated as a null dataset without any specific discriminatory variables added. The described SOM method is compared with Partial Least Squares Discriminant Analysis; lists of the markers determined to be most significant by each approach are obtained, and the differences between them are discussed. A SOM Discrimination Index (SOMDI) is defined, whose magnitude relates to how strongly a variable is considered to be a discriminator. In order to ensure that the model is stable and not dependent on the random starting point of the SOM, one hundred iterations were performed and variables that were consistently of high rank were selected. A variety of approaches for data representation are illustrated, and the main theoretical principles of employing SOMs for determining which variables are most significant are outlined. The software used in this paper was written in-house, allowing greater flexibility than existing packages, and was tailored to the specific application.
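The stability step described in the abstract (repeat the SOM from different random starting points and keep the variables that rank highly every time) can be illustrated with a minimal sketch. This is not the authors' in-house software, and the abstract does not give the SOMDI formula, so the per-variable score below (`stand_in_score`) is a hypothetical placeholder rather than the SOMDI. The grid size, training schedule, `top_k` and `min_fraction` are likewise assumptions chosen only for illustration.

```python
import numpy as np

def train_som(X, grid=(6, 6), n_iter=500, rng=None):
    """Train a small rectangular SOM online; returns a (units, variables) codebook."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = grid
    # Grid coordinates of every map unit, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Random starting point: initialise units from randomly chosen samples.
    W = X[rng.integers(0, len(X), size=rows * cols)].astype(float)
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac) + 0.01        # decaying learning rate (assumed schedule)
        sigma = sigma0 * (1.0 - frac) + 0.5   # shrinking neighbourhood width (assumed)
        x = X[rng.integers(0, len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))      # best-matching unit
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)   # grid distance to the BMU
        h = np.exp(-d2 / (2.0 * sigma ** 2))             # Gaussian neighbourhood
        W += lr * h[:, None] * (x - W)
    return W

def stand_in_score(W, X, y):
    """Hypothetical per-variable score; a placeholder, NOT the paper's SOMDI."""
    # Map every sample to its best-matching unit.
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    bmus = d.argmin(axis=1)
    n_units = W.shape[0]
    # Fraction of each class landing on each unit.
    hits_a = np.bincount(bmus[y == 0], minlength=n_units) / max((y == 0).sum(), 1)
    hits_b = np.bincount(bmus[y == 1], minlength=n_units) / max((y == 1).sum(), 1)
    # Difference between the class-weighted averages of each variable's weights.
    return np.abs((hits_a - hits_b) @ W)

def stable_variables(X, y, n_runs=100, top_k=5, min_fraction=0.9, seed=0):
    """Keep variables that reach the top_k in at least min_fraction of the runs."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(X.shape[1], dtype=int)
    for _ in range(n_runs):
        W = train_som(X, rng=rng)             # new random start on every run
        score = stand_in_score(W, X, y)
        counts[np.argsort(score)[::-1][:top_k]] += 1
    return np.flatnonzero(counts >= min_fraction * n_runs), counts / n_runs
```

On a simulated dataset with the same shape as the experimental one (96 samples, 49 variables) and a few variables shifted in one class, calling `stable_variables(X, y)` returns the indices of the variables that were consistently highly ranked, together with the fraction of runs in which each variable reached the top `top_k`.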
Citation : Lloyd, G. R. et al. (2009) Self organising maps for variable selection: application to human saliva analysed by nuclear magnetic resonance spectroscopy to investigate the effect of an oral healthcare product. Chemometrics and Intelligent Laboratory Systems, 98 (2), pp. 149-161.
ISSN : 0169-7439
- Leicester School of Pharmacy