Scopus Indexed Publications

Paper Details


Title
Bangla Document Classification using Character Level Deep Learning
Author
Al Amin Biswas
Email
alamin.cse@diu.edu.bd
Abstract
Over the last few decades, the availability and accessibility of Bangla documents and their content have increased rapidly, driven by technological advancement. Intensive research on Bangla documents is needed because of the diversity of the language and its associated sentiment. Document classification is one of the fundamental problems of Natural Language Processing. To reduce misclassification and to support convenient indexing and searching of Bangla documents on the web, researchers are now exploring different fields of computer science to classify Bangla documents. In this paper, Deep Learning based approaches are implemented to classify Bangla text documents. A Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network are used for the classification task. We implement an advanced technique that encodes the documents at the character level. Documents from three different data sources are used to validate and test the models. The highest classification accuracy, 95.42%, is achieved on the Prothom Alo data set using LSTM. Furthermore, we present a comparison between the two models and explain how the classification task can be carried out with high accuracy using our character-level approach.
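
The abstract does not specify how the character-level encoding is built, so the following is only a minimal sketch of one common scheme, not the authors' method. It assumes each Unicode code point is mapped to an integer index and each document is truncated or padded to a fixed length before being fed to a CNN or LSTM. The names `build_char_vocab`, `encode`, `PAD`, `UNK`, and `max_len` are hypothetical and chosen for illustration.

```python
from typing import Dict, List

PAD = 0  # assumed index reserved for padding
UNK = 1  # assumed index for characters absent from the vocabulary


def build_char_vocab(documents: List[str]) -> Dict[str, int]:
    """Map every distinct Unicode code point in the corpus to an integer index."""
    chars = sorted({ch for doc in documents for ch in doc})
    # Indices start at 2 because 0 and 1 are reserved for PAD and UNK.
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode(document: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Encode a document as a fixed-length list of character indices."""
    ids = [vocab.get(ch, UNK) for ch in document[:max_len]]
    # Right-pad short documents so every sequence has length max_len.
    return ids + [PAD] * (max_len - len(ids))


# Toy corpus of two Bangla words, used only to exercise the functions.
docs = ["বাংলা", "ভাষা"]
vocab = build_char_vocab(docs)
encoded = [encode(d, vocab, max_len=8) for d in docs]
```

Working on code points keeps the sketch dependency-free, but it treats Bangla vowel signs and other combining marks as separate symbols. Whether the paper does the same is not stated in the abstract.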

Keywords
Bangla Documents, Classification, CNN, LSTM
Journal or Conference Name
4th International Symposium on Multidisciplinary Studies and Innovative Technologies, ISMSIT 2020 - Proceedings
Publication Year
2020
Indexing
Scopus