DoR - Division of Research

Scopus Indexed Publications

Paper Details

Title: A Machine Learning Based Approach to Classify Tense from English Text

Author: Umme Ayman, Dewan Mamun Raza, Md. Azmain Mahtab Rahat, Md. Hasan Imam Bijoy, Md. Shafiqul Islam, Narayan Ranjan Chakraborty

Email

Abstract: This paper investigates the classification of tense in English text using machine learning algorithms. The study uses six classifiers: Support Vector Machine (SVM), Random Forest (RF), Multinomial Naive Bayes (MNB), Decision Tree (DT), XGBoost, and K-Nearest Neighbors (KNN). The dataset was collected from diverse sources, including novels, books, blogs, articles, social media platforms, newspapers, and websites, as well as some self-made samples. The data underwent preprocessing steps such as cleaning and normalization, followed by feature extraction using TfidfVectorizer. Among the six classifiers, SVM achieved the highest accuracy at 97.17%. Classifier performance was assessed with accuracy, precision, recall, and F1-score; ROC curves and confusion matrices were also examined. The study underlines the need for focused approaches and draws attention to significant gaps in natural language processing (NLP) research on tense classification. By leveraging machine learning, this research aims to enhance the accuracy and contextual appropriateness of tense classification, thereby improving cross-cultural communication and understanding in machine translation systems. This research contributes to NLP by offering a robust approach to tense classification and demonstrates the potential of SVM to achieve high accuracy on this task. Future work will focus on addressing limitations such as limited training data, overfitting, and tense conversion.
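
The evaluation metrics named in the abstract (accuracy, precision, recall, F1-score, and the confusion matrix) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the tense labels `past`, `present`, and `future`, the toy predictions, and all function names are hypothetical and chosen only for illustration, since the record does not list the classes or any per-class results.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Return a dict mapping (true_label, predicted_label) to a count."""
    counts = Counter(zip(y_true, y_pred))
    return {(t, p): counts[(t, p)] for t in labels for p in labels}

def per_class_metrics(y_true, y_pred, labels):
    """Compute precision, recall, and F1-score for each label."""
    cm = confusion_matrix(y_true, y_pred, labels)
    metrics = {}
    for label in labels:
        tp = cm[(label, label)]
        # False positives: other classes predicted as this label.
        fp = sum(cm[(other, label)] for other in labels if other != label)
        # False negatives: this label predicted as another class.
        fn = sum(cm[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical labels and toy predictions, NOT data from the paper.
LABELS = ["past", "present", "future"]
y_true = ["past", "past", "present", "present", "future", "future"]
y_pred = ["past", "present", "present", "present", "future", "past"]

if __name__ == "__main__":
    print("accuracy:", round(accuracy(y_true, y_pred), 4))
    for label, scores in per_class_metrics(y_true, y_pred, LABELS).items():
        print(label, {name: round(value, 4) for name, value in scores.items()})
```

In practice the same quantities are usually obtained from a library such as scikit-learn; the hand-written version here only makes explicit how each score follows from the confusion-matrix counts.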

Keywords

Journal or Conference Name: 2024 IEEE Conference on Computing Applications and Systems, COMPAS 2024

Publication Year: 2024

Indexing: Scopus