Scopus Indexed Publications

Paper Details


Title
PMM: A Model for Bangla Parts-of-Speech Tagging Using Sentence Map
Author
Prosanta Kumar Chaki, Md. Mozammel Hossain Sazal, Shikha Anirban
Email
anirban@daffodilvarsity.edu.bd
Abstract

Part-of-speech (POS) tagging is a prerequisite for almost all Natural Language Processing (NLP) tasks, such as grammar checking, machine translation, summarization, sentiment analysis, information retrieval, and speech processing. Little successful research on computational linguistics exists for the Bangla language, so the demand for such technology remains. Existing work on Bangla parts-of-speech tagging requires large training data sets and is not applicable to all language styles. In this research, we propose the Prediction Maximization Model (PMM) for Bangla parts-of-speech tagging. We use statistical data for learning together with rule-based analysis. A Hidden Markov Model (HMM) is applied with tag mapping and scoring in PMM to maximize accuracy while using relatively little statistical training data. PMM achieved 95.6% accuracy, which is relatively high compared with two existing POS taggers that claim the nearest accuracy but rely on a much larger amount of training data. In our experiment, we used around 14K unique tokens as training data for PMM and for the two existing systems, and PMM performed best.

Keywords
Part-of-speech, Bangla language, Machine learning, Hidden Markov Model, Lexicon, Maps, PMM
Journal or Conference Name
Communications in Computer and Information Science
Publication Year
2020
Indexing
Scopus