dc.contributor.author     Iliya, Sunday  en
dc.contributor.author     Neri, Ferrante  en
dc.date.accessioned       2016-05-10T10:09:37Z
dc.date.available         2016-05-10T10:09:37Z
dc.date.issued            2016-03-17
dc.identifier.citation    Iliya, S. and Neri, F. (2016) Towards Artificial Speech Therapy: A Neural System for Impaired Speech Segmentation. International Journal of Neural Systems. 26 (6), 1650023  en
dc.identifier.issn        0129-0657
dc.identifier.issn        1793-6462
dc.identifier.uri         http://hdl.handle.net/2086/12051
dc.description.abstract   This paper presents a neural system-based technique for segmenting short impaired speech utterances into silent, unvoiced, and voiced sections. Moreover, the proposed technique identifies those points of the (voiced) speech where the spectrum becomes steady. The resulting technique thus aims at detecting the limited section of the speech that contains the information about the potential impairment of the speech. This section is of interest to the speech therapist as it corresponds to the possibly incorrect movements of speech organs (lower lip and tongue with respect to the vocal tract). Two segmentation models to detect and identify the various sections of the disordered (impaired) speech signals have been developed and compared. The first makes use of a combination of four artificial neural networks. The second is based on a support vector machine (SVM). The SVM has been trained by means of an ad hoc nested algorithm whose outer layer is a metaheuristic while the inner layer is a convex optimization algorithm. Several metaheuristics have been tested and compared, leading to the conclusion that some variants of the compact differential evolution (CDE) algorithm appear to be well suited to this problem. Numerical results show that the SVM model with a radial basis function is capable of effective detection of the portion of speech that is of interest to a therapist. The best performance has been achieved when the system is trained by the nested algorithm whose outer layer is a hybrid population-based/CDE metaheuristic. A population-based approach displays the best performance for the isolation of silence/noise sections and the detection of unvoiced sections. On the other hand, a compact approach appears to be clearly well suited to detecting the beginning of the steady state of the voiced signal. Both proposed segmentation models outperformed two modern segmentation techniques based on a Gaussian mixture model and deep learning.  en
dc.language.iso           en  en
dc.publisher              World Scientific  en
dc.subject                Impaired speech  en
dc.subject                steady state  en
dc.subject                artificial neural network  en
dc.subject                support vector machine  en
dc.subject                compact differential evolution  en
dc.subject                metaheuristics  en
dc.title                  Towards Artificial Speech Therapy: A Neural System for Impaired Speech Segmentation  en
dc.type                   Article  en
dc.identifier.doi         http://dx.doi.org/10.1142/S0129065716500234
dc.researchgroup          DIGITS  en
dc.peerreviewed           Yes  en
dc.funder                 N/A  en
dc.projectid              N/A  en
dc.cclicence              N/A  en
dc.date.acceptance        2016-03-17  en
dc.researchinstitute      Institute of Artificial Intelligence (IAI)  en


