Rule-based Approach for Arabic Root Extraction: New Rules to Directly Extract Roots of Arabic Words
Abstract
Extracting word roots in Arabic is difficult because of the language's specific morphological and structural changes. Several techniques have been proposed to address this problem. This paper continues the work begun in [1] on identifying and exploiting relationships among Arabic letters for root extraction. Eight rules that detect the root letters according to the other letters in the word are proposed and tested; four of them rely on the idea of morphological substitution (mutation). The approach is evaluated on words from the Holy Quran, and the results indicate that the root extraction algorithm is promising.
Keywords
rule-based stemmer, word root, suffixes, prefixes, word patterns
DOI: https://doi.org/10.2498/cit.1002174
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.