Predicting Absenteeism at Work Using Tree-Based Learners
Absenteeism in the workplace plays a crucial role in determining the productive and profitable capacity of a company. Knowledge of employee absenteeism is therefore fundamental to an organization across its multiple dimensions, because properly determining employees' profiles allows the identification of excess occurrences of certain morbidities. Early absenteeism research focused primarily on predicting the characteristics of employees, and the categories of disease, associated with higher absenteeism at work. However, predicting employees' absenteeism time with different machine learning classifiers can give this research a new dimension by revealing the underlying causes and patterns of absenteeism. In this paper, we apply four prominent machine learning algorithms, namely Decision Tree, Gradient Boosted Tree, Random Forest, and Tree Ensemble, to the absenteeism dataset of a courier company in Brazil in order to predict employees' absenteeism time at work and to identify the best-performing classifier. Based on seven evaluation metrics (True Positive, True Negative, False Positive, False Negative, Sensitivity, Specificity, and Accuracy), we found that Gradient Boosted Tree produced the best result, with an accuracy of 82%, whereas Tree Ensemble performed worst, with an accuracy of 79%.
Zaman Wahid, A. K. M. Zaidi Satter, Abdullah Al Imran, Touhid Bhuiyan
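
The three rate metrics named in the abstract (Sensitivity, Specificity, and Accuracy) follow directly from the four confusion-matrix counts (True Positive, True Negative, False Positive, False Negative). The sketch below shows the standard definitions for a binary setting. The function name `binary_metrics` and the counts in the usage line are hypothetical illustrations, not the authors' code or results.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Derive rate metrics from confusion-matrix counts (hypothetical helper).

    tp, tn, fp, fn are non-negative counts of true positives, true negatives,
    false positives, and false negatives.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")

    # Sensitivity (true positive rate): share of actual positives found.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    # Specificity (true negative rate): share of actual negatives found.
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    # Accuracy: share of all predictions that were correct.
    accuracy = (tp + tn) / total

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Illustrative counts only; these are not taken from the paper.
example = binary_metrics(tp=30, tn=50, fp=10, fn=10)
```

Comparing classifiers on accuracy alone can hide an imbalance between the two error types, which is one reason to report sensitivity and specificity alongside it.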