1 School of Computing and Data Science, Wentworth Institute of Technology, USA.
2 Department of Finance and Economics, Faculty of Business and Law, Manchester Metropolitan University, UK.
3 Department of Computer Science and Quantitative Methods, Austin Peay State University, Tennessee, USA.
4 Department of Health Care Administration, University of the Potomac, Washington, DC, USA.
5 Department of Computer Science, Metropolitan College, Boston University, USA.
6 Booth School of Business, University of Chicago, USA.
World Journal of Advanced Research and Reviews, 2025, 28(01), 914-929
Article DOI: 10.30574/wjarr.2025.28.1.3457
Received on 31 August 2025; revised on 11 October 2025; accepted on 13 October 2025
Thirty-day hospital readmissions represent a critical challenge in healthcare: they impose significant financial burdens, increase patient morbidity, and reflect gaps in care continuity. This study aimed to develop and evaluate machine learning models for predicting 30-day hospital readmissions using a comprehensive, statewide healthcare dataset from Massachusetts. Employing a quantitative, predictive modeling design, this research compared the performance of Ridge regression with two advanced ensemble methods: Random Forest and Gradient Boosting. The models were trained and tested on a hospital-year panel dataset derived from the Massachusetts readmissions data book. Performance was evaluated using Root Mean Squared Error (RMSE) and the coefficient of determination (R²). The results demonstrated the superior predictive power of the ensemble methods over the traditional linear model. Gradient Boosting emerged as the top-performing model, achieving the lowest RMSE of 1.48 and the highest R² of 0.81, followed closely by Random Forest (RMSE = 1.52, R² = 0.80). In contrast, Ridge regression showed limited predictive capability (RMSE = 2.54, R² = 0.43). Feature importance analysis from the Gradient Boosting model identified the number of deaths/readmissions and the number of cases as the most influential predictors, with hospital quality ratings and geographic factors also contributing significantly. The findings indicate that machine learning, particularly Gradient Boosting, provides a robust and accurate tool for identifying patients at high risk of readmission. Implementing such models can enable healthcare systems to better allocate resources, tailor discharge planning, and ultimately improve patient outcomes by reducing costly and disruptive readmissions.
Machine Learning; Predictive Modeling; Gradient Boosting; Random Forest; Healthcare Analytics; Risk Stratification; Data-Driven Healthcare; Transitional Care
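The two evaluation metrics named in the abstract, RMSE and R², can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and chosen only for illustration; they are not drawn from the Massachusetts dataset and do not reproduce the authors' pipeline.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted values (made-up numbers for illustration).
observed = [14.0, 16.0, 18.0, 20.0]
predicted = [15.0, 15.0, 19.0, 19.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R^2  = {r_squared(observed, predicted):.2f}")
```

A lower RMSE means smaller typical prediction errors in the units of the target, while an R² closer to 1 means the model explains more of the variance in the observed values. This is the sense in which the abstract ranks Gradient Boosting above Random Forest and Ridge regression.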
Awele Okolie, Dumebi Okolie, Callistus Obunadike, Darlington Ekweli, Pearlrose Nwuke, Bello Abdul-Waliyyu and Paschal Alumona. Machine Learning Approaches for Predicting 30-Day Hospital Readmissions: Evidence from Massachusetts Healthcare Data. World Journal of Advanced Research and Reviews, 2025, 28(01), 914-929. Article DOI: https://doi.org/10.30574/wjarr.2025.28.1.3457.
Copyright © 2025. The author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.