Scopus Indexed Publications
Paper Details
- Title
-
Clustering-Based Under-Sampling with Normalization in Class-Imbalanced Data
- Author
-
Tanmoy Mondal
- Email
-
- Abstract
-
In some real-world data sets,
there is a class imbalance: one class (the minority class) has a
limited number of data points, while the other class (the majority
class) has a large number of data points. With state-of-the-art machine
learning approaches, it is extremely challenging to build an effective
model unless data preparation is used to balance the imbalanced data
set. To ensure that each class has the same number of data points,
random under-sampling has been used in numerous studies. During the
data preparation phase, this study instead experiments with
under-sampling based on a clustering technique, combined with
normalization, on an imbalanced data set with a majority class and a
minority class, each consisting of individual data points. The study
first uses k-fold cross-validation to separate the imbalanced data set
into training and testing sets. After normalization, the data are
separated into a majority-class subset and a minority-class subset. The
number of majority-class data points is then reduced using a
clustering-based under-sampling method. The minority-class subset is
combined with the reduced majority-class subset to create an updated
training set, which is normalized once more. The classifier is then
trained on the updated training set and evaluated separately on the
testing set.
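The data preparation steps described in the abstract can be sketched roughly as follows for a single training fold. This is a minimal illustration, not the authors' implementation: the abstract does not name the normalization method, the clustering algorithm, or how the reduced majority subset is chosen, so min-max scaling, plain k-means, one cluster per minority point, and "keep the majority point nearest each centroid" are all assumptions. The function names are hypothetical, and the k-fold split and the classifier are omitted.

```python
import random

def min_max_normalize(rows):
    """Scale each feature column to [0, 1] (assumed normalization method)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # A constant column has zero range; divide by 1.0 so it maps to 0.0.
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering algorithm); returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the previous centroid if a cluster is empty
                centroids[i] = [sum(c) / len(group) for c in zip(*group)]
    return centroids

def cluster_undersample(X, y, majority_label, minority_label, seed=0):
    """Reduce the majority class of one training fold to the minority size."""
    X = min_max_normalize(X)  # first normalization
    majority = [x for x, label in zip(X, y) if label == majority_label]
    minority = [x for x, label in zip(X, y) if label == minority_label]

    # Assumption: one cluster per minority point, so both classes end up equal.
    centroids = kmeans(majority, k=len(minority), seed=seed)

    # Assumption: represent each cluster by the real majority point nearest
    # its centroid, never picking the same point twice.
    remaining = list(majority)
    reduced = []
    for c in centroids:
        best = min(remaining, key=lambda p: sq_dist(p, c))
        reduced.append(best)
        remaining.remove(best)

    X_new = min_max_normalize(reduced + minority)  # second normalization
    y_new = [majority_label] * len(reduced) + [minority_label] * len(minority)
    return X_new, y_new

# Toy fold: six majority points (label 0) and two minority points (label 1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [6, 5], [10, 10], [11, 10]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = cluster_undersample(X, y, majority_label=0, minority_label=1)
```

In a full experiment this function would be applied to the training portion of each cross-validation fold only, with the testing portion left at its original class distribution.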
- Keywords
-
Imbalanced Data-sets, Under-sampling, Clustering, Classification, Normalization
- Journal or Conference Name
- Proceedings of 2022 IEEE International Conference on Current Development in Engineering and Technology, CCET 2022
- Publication Year
-
2022
- Indexing
-
Scopus