Cardiovascular disease (CVD) is a leading cause of death globally, particularly in South Asia, where high cholesterol intake contributes to its prevalence. This study aims to predict CVD by applying machine learning models to patient data from healthcare systems in Dhaka, Bangladesh, and to identify the most reliable model for early diagnosis and decision support. The dataset comprises 1019 patient records collected from two prominent hospitals in Dhaka and includes nine critical features. Several machine learning models were trained and evaluated using stratified fivefold cross-validation, with hyperparameters selected using GridSearchCV. Model performance was evaluated using accuracy, precision, recall, F1-score, and AUC, together with ROC curve analysis. To enhance interpretability, SHapley Additive exPlanations (SHAP) analysis was applied, focusing on global feature importance. Among the models tested, XGBoost achieved the highest testing accuracy, with 97.12% training accuracy and 86.07% testing accuracy (AUC = 0.91). Random Forest also performed strongly, with 83.08% testing accuracy and the highest AUC (0.92). Decision Tree and K-Nearest Neighbors achieved moderate results, with testing accuracies of 78% and 74.63%, respectively. Logistic Regression and Support Vector Machine showed lower overall accuracy (~66%), though both attained high recall (0.91 and 0.95), indicating sensitivity to positive cases. These results highlight XGBoost’s robustness while also demonstrating the trade-offs of alternative models, and they suggest that machine learning, particularly XGBoost with SHAP-based explainability, offers a promising approach for diagnosing CVD with high accuracy. Incorporating this model into medical diagnostic systems could assist healthcare specialists in making more informed and accurate decisions, potentially reducing the morbidity and mortality associated with CVD.
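The trade-off noted for Logistic Regression and Support Vector Machine, namely modest accuracy alongside high recall, can be illustrated with a minimal sketch of how these metrics follow from a confusion matrix. The counts below are hypothetical and chosen only for illustration; they are not taken from the study's data, and the names `ConfusionCounts` and `classification_metrics` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper, not from the study)."""
    tp: int  # CVD patients correctly flagged as positive
    fp: int  # healthy patients incorrectly flagged as positive
    fn: int  # CVD patients missed by the model
    tn: int  # healthy patients correctly flagged as negative


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1-score from the counts."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 100 positive and 100 negative patients, with a model
# that labels most patients as positive. These numbers are illustrative only.
example = ConfusionCounts(tp=92, fp=60, fn=8, tn=40)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, the model misses few positive cases, so recall is high, but the many false positives pull accuracy and precision down. This is one way a classifier can be sensitive to positive cases while still scoring lower overall.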