Improving the drug discovery process by using multiple classifier systems
Machine learning methods have become an indispensable tool for exploiting large knowledge and data repositories in science and technology. In the pharmaceutical domain, the amount of knowledge acquired about the design and synthesis of pharmaceutical agents and bioactive molecules (drugs) is enormous. The primary challenge in automatically discovering new drugs from molecular screening information is the high dimensionality of the datasets, in which a wide range of features is recorded for each candidate drug. Improved techniques for handling and interpreting such data are therefore required. To mitigate this problem, our tool (called D2-MCS) homogeneously splits the dataset into several groups (subsets of features) and subsequently determines the most suitable classifier for each group. Finally, the tool determines the biological activity of each molecule through a voting scheme. D2-MCS was tested on a standardized, high-quality dataset gathered from ChEMBL, where it outperformed well-known single classification models.
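The three steps named in the abstract (splitting the features into groups, choosing a classifier per group, and voting) can be illustrated with a minimal Python sketch. It is not the D2-MCS implementation: every name below (`split_features`, `fit_mcs`, `predict_mcs`, the two toy classifiers) is hypothetical, the round-robin split is only an assumed stand-in for the tool's homogeneous split, selection by validation accuracy is an assumed criterion, and the data are made up for illustration.

```python
from collections import Counter

def split_features(n_features, n_groups):
    """Round-robin split of feature indices (assumed stand-in for the homogeneous split)."""
    return [list(range(g, n_features, n_groups)) for g in range(n_groups)]

def project(X, cols):
    """Keep only the columns that belong to one feature group."""
    return [[row[c] for c in cols] for row in X]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

class NearestCentroid:
    """Toy classifier: predict the class whose mean vector is closest."""
    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, v in enumerate(row):
                acc[i] += v
        self.centroids = {l: [v / counts[l] for v in s] for l, s in sums.items()}
        return self
    def predict(self, X):
        return [min(self.centroids, key=lambda l: sq_dist(r, self.centroids[l])) for r in X]

class OneNearestNeighbour:
    """Toy classifier: predict the label of the closest training sample."""
    def fit(self, X, y):
        self.X, self.y = X, y
        return self
    def predict(self, X):
        return [self.y[min(range(len(self.X)), key=lambda i: sq_dist(r, self.X[i]))] for r in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_mcs(X_train, y_train, X_val, y_val, n_groups, candidates):
    """For each feature group, keep the candidate with the best validation accuracy."""
    ensemble = []
    for cols in split_features(len(X_train[0]), n_groups):
        fitted = [cls().fit(project(X_train, cols), y_train) for cls in candidates]
        best = max(fitted, key=lambda m: accuracy(y_val, m.predict(project(X_val, cols))))
        ensemble.append((cols, best))
    return ensemble

def predict_mcs(ensemble, X):
    """Majority vote over the per-group predictions."""
    votes = [model.predict(project(X, cols)) for cols, model in ensemble]
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*votes)]

# Made-up toy data: six features per molecule, two activity classes.
X_train = [
    [0.9, 0.8, 1.0, 0.9, 0.7, 0.8],
    [1.0, 0.9, 0.8, 1.0, 0.9, 0.9],
    [0.1, 0.2, 0.0, 0.1, 0.3, 0.2],
    [0.0, 0.1, 0.2, 0.0, 0.1, 0.1],
]
y_train = ["active", "active", "inactive", "inactive"]
X_val = [[0.8, 0.9, 0.9, 0.8, 0.8, 1.0], [0.2, 0.1, 0.1, 0.2, 0.2, 0.0]]
y_val = ["active", "inactive"]
X_new = [[0.95, 0.85, 0.9, 0.9, 0.8, 0.9], [0.05, 0.1, 0.1, 0.0, 0.2, 0.1]]

ensemble = fit_mcs(X_train, y_train, X_val, y_val, 3, [NearestCentroid, OneNearestNeighbour])
predictions = predict_mcs(ensemble, X_new)
print(predictions)
```

An odd number of groups is used here so that a two-class vote cannot tie; a real system would need an explicit tie-breaking rule and a more careful model-selection procedure than a single validation split.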
The file attached to this record is the author's final peer-reviewed version. The publisher's final version can be found by following the DOI link.
Citation : Ruano-Ordas, D. et al. (2019) Improving the drug discovery process by using multiple classifier systems. Expert Systems with Applications, 121, pp. 292-303
Research Group : Cyber Technology Institute (CTI)
Research Institute : Cyber Technology Institute (CTI)
Peer Reviewed : Yes