Scopus Indexed Publications

Paper Details


Title
Algorithms efficiency measurement on imbalanced data using geometric mean and cross validation
Author
Mustakim Al Helal, Mohammad Salman Haydar, Seraj Al Mahmud Mostafa
Email
mustakimsunny.cse@diu.edu.bd
Abstract
Recent computing trends produce large volumes of data every minute, and a high proportion of real-life data sets are imbalanced. In practical data mining, an imbalanced data set is prone to misguide a data mining model, so the data set needs pre-processing before mining. This work focuses on some practical data mining techniques and produces a valid evaluation process for imbalanced data sets. A critical comparison of a few well-established algorithms is illustrated: the accuracy of Decision Tree Classifier (DTC), Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and Random Forest (RF) is compared. The data was tested before and after over-sampling with the Synthetic Minority Over-sampling Technique (SMOTE), and the results were then verified using Geometric Mean (GM) and cross-validation techniques. The results demonstrate a critical comparison of these algorithms and, most importantly, a performance measure that is valid for imbalanced data.
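
To illustrate why a measure such as the geometric mean suits imbalanced data better than plain accuracy, here is a minimal Python sketch. It assumes the common binary definition, GM = sqrt(sensitivity x specificity); the abstract does not state the exact formula the authors used, and the function name and the labels below are hypothetical, chosen only for illustration.

```python
from math import sqrt


def geometric_mean_score(y_true, y_pred, positive=1):
    """Return sqrt(sensitivity * specificity) for a binary problem.

    Assumed definition; the paper's exact formulation is not given
    in the abstract.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sqrt(sensitivity * specificity)


# Hypothetical labels: 8 majority-class (0) and 2 minority-class (1) samples.
y_true = [0] * 8 + [1] * 2

# A degenerate classifier that always predicts the majority class.
always_majority = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
gmean = geometric_mean_score(y_true, always_majority)

print(f"accuracy = {accuracy:.2f}, G-mean = {gmean:.2f}")
```

In this toy case the classifier never detects the minority class, yet its accuracy still looks respectable, while the geometric mean collapses to zero because sensitivity is zero. That is the kind of misleading result the evaluation process described in the abstract is meant to expose.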

Keywords
Imbalanced data, SMOTE, Geometric Mean, Cross Validation, SVM, KNN, Decision Tree Classifier, Random Forest
Journal or Conference Name
2016 International Workshop on Computational Intelligence (IWCI)
Publication Year
2017
Indexing
Scopus