Scopus Indexed Publications

Paper Details


Title
Automatic Bangla Image Captioning Based on Transformer Model in Deep Learning
Author
Md. Suhag Ali
Abstract

"Image captioning has become a crucial aspect of contemporary artificial intelligence because it brings together two central parts of the AI field: Computer Vision and Natural Language Processing. Bangla is currently the seventh most widely spoken language in the world, and for this reason image captioning in Bangla has gained recognition as a significant research area. Many established datasets exist for English, but there is no standard dataset for Bangla. For our research, we used the BAN-Cap dataset, which contains 8,091 images with 40,455 sentences. Many effective encoder-decoder and visual attention approaches are used for image captioning, in which a CNN serves as the encoder and an RNN as the decoder. In this study, however, we propose a transformer-based image captioning model with different pre-trained image feature extraction models (ResNet50, InceptionV3, and VGG16) using the BAN-Cap dataset. We evaluate its efficiency and accuracy with several performance metrics, including BLEU, METEOR, ROUGE, and CIDEr, and we also identify the drawbacks of other models."


Keywords
"Bangla image captioning; image processing; natural language processing; attention mechanism; transformer model"
Journal or Conference Name
International Journal of Advanced Computer Science and Applications
Publication Year
2023
Indexing
Scopus