Scopus Indexed Publications

Paper Details


Title
Suffix Based Automated Parts of Speech Tagging for Bangla Language
Author
Monjoy Kumar Roy, Pinto Kumar Paul, Sheak Rashed Haider Noori, S.M. Hasan Mahmud
Abstract

Natural language processing (NLP) is the technique by which human language is processed by computer. Parts-of-Speech (POS) tagging is one of the fundamental requirements for some NLP applications. It is considered a solved problem for some foreign languages, such as English and Chinese, because of the high accuracy achieved (97%), whereas it is still an unsolved problem for Bangla because of the language's ambiguity. Although building a POS tagger for Bangla is not new work, each of the available POS taggers has its own limitations. We chose to develop an unsupervised system rather than a supervised one, because a supervised system needs a large data resource for training and the resources available for Bangla are very limited. Here we develop a POS tagger based mainly on Bangla grammar, especially suffixes, because Bangla is a highly inflectional language in which a single word has many variants depending on its suffix. In this POS tagger, we assign 8 base POS tags; rules based on Bangla grammar and suffixes are applied to identify the POS tags with the help of a verb root dataset. To handle non-suffix words, a dataset of almost 14,500 Bangla words with their default POS tags is added to the system, which helps to increase the efficiency of the tagger. A modified version of a previously used algorithm for suffix analysis is applied, which results in a satisfactory level of about 94.2%
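
The pipeline outlined in the abstract (a default-tag lookup for non-suffix words, plus suffix rules checked against a verb root dataset) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny suffix lists, the verb roots, the tag names, the order in which the steps are tried, and the function names `tag_word` and `tag_sentence` are hypothetical and are not taken from the paper, whose actual rules and datasets are not reproduced in this record.

```python
# Toy, hypothetical data. The paper's real rule set, verb root dataset and
# ~14,500-word default lexicon are NOT reproduced here.
VERB_ROOTS = {"কর", "খা", "যা"}
VERB_SUFFIXES = ("ছিলাম", "ছি", "ছে", "বে")
NOUN_SUFFIXES = ("দের", "রা", "কে")
DEFAULT_TAGS = {"এবং": "CONJUNCTION", "আমি": "PRONOUN", "ভালো": "ADJECTIVE"}


def tag_word(word):
    """Return an illustrative POS tag for a single Bangla word."""
    # Step 1 (assumed order): words without a useful suffix are looked up
    # in a default-tag lexicon.
    if word in DEFAULT_TAGS:
        return DEFAULT_TAGS[word]

    # Step 2: try verb suffixes, longest first, and accept the match only
    # if the remaining stem is a known verb root.
    for suffix in sorted(VERB_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and word[:-len(suffix)] in VERB_ROOTS:
            return "VERB"

    # Step 3: try noun suffixes, longest first, requiring a non-empty stem.
    for suffix in sorted(NOUN_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return "NOUN"

    # No rule fired; a real system would need a better fallback.
    return "UNKNOWN"


def tag_sentence(sentence):
    """Tag each whitespace-separated token of a sentence."""
    return [(word, tag_word(word)) for word in sentence.split()]


if __name__ == "__main__":
    for word, tag in tag_sentence("আমি করছি"):
        print(word, tag)
```

Checking the stem against a verb root list before accepting a verb suffix is one simple way to reduce false matches on words that merely happen to end in the same characters; whether the paper resolves such ambiguity this way is not stated in the abstract.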

Keywords
Parts of Speech (POS) Tagger, Natural Language Processing (NLP), Bangla Language, Suffix Analysis
Journal or Conference Name
2nd International Conference on Electrical, Computer and Communication Engineering, ECCE 2019
Publication Year
2019
Indexing
Scopus