Scopus Indexed Publications

Paper Details


Title
Vector Representation of Bengali Word Using Various Word Embedding Model
Author
Ashik Ahamed Aman Rafat, Fazle Rabby Khan, Mushfiqus Salehin, Sheikh Abujar, Syed Akhter Hossain
Email
sheikh.cse@diu.edu.bd
Abstract
To transfer human understanding of language to a machine, we need word embeddings. Skip-gram, CBOW, and fastText are models that generate word embeddings. However, pretrained word embedding models for the Bengali language are difficult for researchers to find, and training word embeddings is time-consuming. In this paper, we discuss different word embedding models. To train these models, we collected around 500,000 Bengali articles from various sources on the internet. Among them, we randomly chose 105,000 articles, containing 32 million words. We trained them on the Skip-gram and CBOW models of Word2Vec and on fastText. We also trained those words with the GloVe model. Among all the results, fastText (Word2Vec) gave us a satisfactory result.
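The abstract describes training Skip-gram, CBOW, and fastText embeddings on a tokenized Bengali corpus. Below is a minimal sketch of such a setup using the gensim library; the paper does not name its tooling, so gensim, the corpus file path, the whitespace tokenizer, and the hyperparameter values are all assumptions for illustration only. GloVe has no trainer in gensim and is typically trained with the separate Stanford GloVe toolkit, so it is not shown here.

```python
# Illustrative sketch (assumed tooling: gensim 4.x; corpus path, tokenizer,
# and hyperparameters are hypothetical, not taken from the paper).
from gensim.models import Word2Vec, FastText

def read_corpus(path="bengali_articles.txt"):
    """Yield one token list per article; assumes one article per line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            tokens = line.strip().split()  # naive whitespace tokenization
            if tokens:
                yield tokens

sentences = list(read_corpus())

# Word2Vec: sg=1 trains Skip-gram, sg=0 trains CBOW.
skipgram = Word2Vec(sentences, vector_size=300, window=5, min_count=5, sg=1)
cbow = Word2Vec(sentences, vector_size=300, window=5, min_count=5, sg=0)

# fastText adds character n-gram (subword) information on top of the same objective,
# which helps with Bengali's rich morphology and out-of-vocabulary words.
ft = FastText(sentences, vector_size=300, window=5, min_count=5, sg=1)

# Inspect the learned space: nearest neighbours of a word by cosine similarity.
print(ft.wv.most_similar("বাংলা", topn=5))
```

Embedding dimensionality, context window, and minimum word frequency are the usual knobs when comparing such models; the values above (300, 5, 5) are common defaults in the literature, not the settings reported in this paper.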

Keywords
Bengali Words, Skip-gram, CBOW, Word2Vec, fastText, GloVe, Word Embedding
Journal or Conference Name
Proceedings of the 2019 8th International Conference on System Modeling and Advancement in Research Trends, SMART 2019
Publication Year
2020
Indexing
Scopus