Scopus Indexed Publications

Paper Details


Title
Lyricist Identification using Stylometric Features utilizing BanglaMusicStylo Dataset
Author
Ahmed Al Marouf, Rafayet Hossian
Email
marouf.cse@diu.edu.bd
Abstract
This paper presents a profile-based approach that uses supervised learning methods to identify the lyricist of Bangla songs written by two legendary poets and novelists, Kazi Nazrul Islam and Rabindranath Tagore. The problem can be framed as authorship attribution using stylometric features on Bangla lyrics. We use the BanglaMusicStylo dataset, which consists of 856 songs by Rabindranath Tagore and 620 songs by Kazi Nazrul Islam. Traditional authorship attribution work in the literature is based on novels written by the authors, not on Bangla song lyrics. Song lyrics make the task challenging, because an author's word choices in songs depend on rhythm, completeness, situation, and many other factors. In this paper, we try to fuse different types of stylometric features, such as lexical, structural, and stylistic features. For experimentation, we designed a prediction model based on supervised learning using Naïve Bayes (NB), Simple Logistic Regression (SLR), Decision Tree (DT), Support Vector Machine (SVM), and Multilayer Perceptron (MLP). The experimental pipeline consists of several steps, including data pre-processing, feature extraction, data processing, and classification. In our performance evaluation, SLR achieved approximately 86.29% accuracy, which is quite satisfactory.
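
As a rough illustration of what lexical and structural stylometric features can look like, the sketch below computes a few simple statistics from a lyric text. The function name `stylometric_features`, the four feature names, and the whitespace-based tokenization are all assumptions made for this example; the record does not list the features actually used in the paper.

```python
import string

# Punctuation stripped from token edges; the Bangla danda is added
# as an assumption about how sentence ends appear in the lyrics.
PUNCT = string.punctuation + "\u0964"


def stylometric_features(lyrics: str) -> dict:
    """Compute a few illustrative lexical and structural features.

    Hypothetical helper, not the paper's feature extractor.
    """
    # Structural unit: non-empty lines of the lyric.
    lines = [ln for ln in lyrics.splitlines() if ln.strip()]

    # Lexical unit: whitespace-separated tokens with edge punctuation removed.
    words = [tok.strip(PUNCT) for tok in lyrics.split()]
    words = [w for w in words if w]

    if not lines or not words:
        return {
            "num_lines": 0,
            "avg_words_per_line": 0.0,
            "avg_word_length": 0.0,
            "type_token_ratio": 0.0,
        }

    return {
        "num_lines": len(lines),
        "avg_words_per_line": len(words) / len(lines),
        # Length is counted in Unicode code points, so Bangla vowel
        # signs and other combining marks each count as one unit.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Vocabulary richness: distinct words divided by total words.
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    print(stylometric_features("la la la\nsing a song"))
```

Feature dictionaries like this one could be turned into numeric vectors and passed to any of the classifiers named in the abstract; that step is left out here to keep the example free of third-party dependencies.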

Keywords
Authorship Attribution, Linguistic Feature, Stylometric Features, BanglaMusicStylo Dataset, Supervised Learning
Journal or Conference Name
2019 International Conference on Bangla Speech and Language Processing, ICBSLP 2019
Publication Year
2019
Indexing
Scopus