Recognition of Linguistic Features by Hidden Markov Model (HMM)

Božidar Tepeš, Lajos Szirovicza, Anita Sujoldžić, Martina Primorac

Abstract


Building on recent results in creating automatic taggers for several European languages, including Croatian, an attempt has been made to use Hidden Markov Models (HMMs) to analyze the linguistic (dialectal) microdifferentiation of reproductively isolated populations in the Eastern Adriatic. Because two main dialects are spoken in this geographic area, two HMMs were created: one for recognizing the "čakavian" dialect and the other for recognizing the "štokavian" dialect. Recognition of the dialects is based on their differential phonetic characteristics. The paper gives a short introduction to HMMs as a potential mathematical background for future research, describes the development of the HMMs for dialect classification ("čakavian" and "štokavian") and the corpora currently available, and presents the results obtained.
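The idea of one HMM per dialect can be illustrated with a minimal sketch. It is not the authors' implementation: the two-state models, all probabilities, and the symbols "a", "b", and "n" (standing for a feature typical of the first dialect, a feature typical of the second, and a neutral feature) are hypothetical values chosen only for illustration. The sketch scores a sequence of observed features under each model with the standard scaled forward algorithm and picks the model with the higher likelihood.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm."""
    n = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: sum over predecessor states, then emit obs[t].
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Hypothetical toy parameters, shared by both models.
START = [0.5, 0.5]
TRANS = [[0.7, 0.3], [0.4, 0.6]]

# Hypothetical emission tables: the first model favours "a", the second "b".
MODELS = {
    "cakavian": [{"a": 0.6, "b": 0.1, "n": 0.3}, {"a": 0.3, "b": 0.1, "n": 0.6}],
    "stokavian": [{"a": 0.1, "b": 0.6, "n": 0.3}, {"a": 0.1, "b": 0.3, "n": 0.6}],
}

def classify(obs):
    """Pick the dialect whose HMM assigns the higher likelihood to obs."""
    scores = {
        name: forward_log_likelihood(obs, START, TRANS, emit)
        for name, emit in MODELS.items()
    }
    return max(scores, key=scores.get)

if __name__ == "__main__":
    print(classify(["a", "n", "a"]))
    print(classify(["b", "n", "b"]))
```

In practice the model parameters would be estimated from the dialect corpora rather than set by hand, and the observation alphabet would come from the phonetic features that actually distinguish the dialects.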

Keywords


Hidden Markov Model, HMM, stochastic tagging, language processing

Creative Commons License
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
