Recognition of Linguistic Features by Hidden Markov Model (HMM)

Božidar Tepeš; Lajos Szirovicza; Anita Sujoldžić; Martina Primorac

Abstract

Building on recent results in creating automatic taggers for different European languages, including Croatian, an attempt has been made to use Hidden Markov Models (HMMs) to analyze the linguistic (dialectal) microdifferentiation of reproductively isolated populations in the Eastern Adriatic. Because two main dialects are spoken in this geographic area, two different HMMs were created: one for recognizing the "čakavian" dialect and the other for recognizing the "štokavian" dialect. Recognition of the dialects is based on their differential phonetic characteristics. The paper gives a short introduction to HMMs as a potential mathematical background for future research and results, and describes the development of HMMs for dialect classification ("čakavian" and "štokavian"), the corpora available at the moment, and the results obtained.
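
As an illustration of how classification with one HMM per dialect can work in general, the following minimal Python sketch scores a symbol sequence under two models with the forward algorithm and picks the model with the higher likelihood. It is not the authors' implementation: the two-state topology, all probabilities, the symbols "x" and "y" (standing in for two phonetic variants), and the names `log_likelihood`, `MODELS`, and `classify` are assumptions made up for this example.

```python
import math

UNSEEN = 1e-6  # assumed floor probability for symbols a state never emits


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: return log P(obs | model)."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n_states = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [start[s] * emit[s].get(obs[0], UNSEEN) for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: propagate through transitions, then emit obs[t].
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states))
                * emit[s].get(obs[t], UNSEEN)
                for s in range(n_states)
            ]
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


# Hypothetical toy models (start, transition, emission); values are invented.
MODELS = {
    "čakavian": (
        [0.6, 0.4],
        [[0.7, 0.3], [0.4, 0.6]],
        [{"x": 0.9, "y": 0.1}, {"x": 0.6, "y": 0.4}],
    ),
    "štokavian": (
        [0.6, 0.4],
        [[0.7, 0.3], [0.4, 0.6]],
        [{"x": 0.1, "y": 0.9}, {"x": 0.4, "y": 0.6}],
    ),
}


def classify(obs):
    """Return the label of the model that gives obs the highest likelihood."""
    return max(MODELS, key=lambda name: log_likelihood(obs, *MODELS[name]))
```

For example, `classify(list("xxxx"))` selects the first toy model, because each of its states emits "x" with higher probability than the corresponding state of the second model. In a real system the parameters would be estimated from dialect corpora rather than set by hand.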

Keywords

Hidden Markov Model, HMM, stochastic tagging, language processing

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

Journal of Computing and Information Technology