Extraction of Arabic Word Roots: An Approach Based on Computational Model and Multi-Backpropagation Neural Networks
Stemming is the process of extracting the root of a given word by stripping off the affixes attached to it. Many attempts have been made to address the problem of stemming Arabic words. The majority of existing Arabic stemming algorithms require a complete set of morphological rules and large vocabulary lookup tables. Furthermore, many of them return more than one potential stem or root for a given Arabic word. According to Ahmad, Arabic stemming based on the morphological rules of the language remains a very difficult task because of the nature of the language itself. The limitations of current Arabic stemming methods motivated this research, in which we investigate a novel approach to extracting the roots of Arabic words, named here MUAIDI-STEMMER (Muaidi is the name of the author's father). The approach attempts to exploit numerical relations between Arabic letters, which avoids the need for a list of the root and pattern of each word in the language, and it yields a single root for each word. It is composed of two phases. Phase I depends on basic calculations derived from a linguistic analysis of Arabic patterns and affixes. Phase II is based on an artificial neural network trained with the backpropagation learning rule. In this phase, we formulate root extraction as a classification problem, with the neural network as the classifier. This study demonstrates that a neural network can be used effectively to extract the roots of Arabic words. The stemmer was tested on 46,895 Arabic word types (distinct, unique words). Error-counting accuracy evaluation was used to assess its performance. The stemmer successfully produced the stems of 44,107 Arabic words from the test datasets, with an accuracy of 94.81%.
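Error-counting accuracy is taken here to mean the fraction of test words whose extracted root matches a reference root; the abstract does not define it further, so this reading is an assumption. A minimal Python sketch under that assumption follows. The function name `error_counting_accuracy` and the three transliterated roots are hypothetical illustrations and are not drawn from the thesis or its datasets.

```python
def error_counting_accuracy(predicted_roots, reference_roots):
    """Return the fraction of words whose predicted root equals the reference root."""
    if len(predicted_roots) != len(reference_roots):
        raise ValueError("prediction and reference lists must have the same length")
    if not reference_roots:
        raise ValueError("cannot compute accuracy on an empty test set")
    # Count every mismatch between the stemmer output and the reference root.
    errors = sum(1 for p, r in zip(predicted_roots, reference_roots) if p != r)
    return 1.0 - errors / len(reference_roots)


# Hypothetical toy data in Latin transliteration (assumed, not from the thesis).
reference = ["ktb", "drs", "Elm"]
predicted = ["ktb", "drs", "lwm"]  # the third prediction is a deliberate error
accuracy = error_counting_accuracy(predicted, reference)
print(f"{accuracy:.2%}")
```

Because each word type is counted once, the measure weights rare and frequent words equally, which fits an evaluation over distinct word types rather than running text.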
- PhD