Developing a Robust Statistical Model for Predicting Mean Healthcare Costs across Different Sample Size Distributions in the Volta Region of Ghana
Abstract
Modelling healthcare costs, particularly positively skewed cost data, is an important part of health policy formulation because it gives policymakers valuable information on the appropriate distribution and on the important covariates to use in cost minimization programmes. Previous studies have attempted this, but with simulated data, inadequate sample sizes, or different distributions. This study aimed to determine the robustness of selected statistical models using large, real-life healthcare cost data, and to assess how those models performed at different sample sizes (n = 1100 and n = 2444). Data were obtained from Ghana Health Service (GHS) facilities in the Volta Region of Ghana. We extracted records for 1st January to 31st December 2021 from the District Health Information Management System 2 (DHIMS 2) database, with variables such as patient gender, age, length of hospital stay, and cost incurred. We used both descriptive and inferential statistical techniques to analyze the data. The models employed included ordinary least squares (OLS), the OLS log (y), the log-normal (log(y)), the Poisson, the Cox proportional hazard, the Weibull, and the Gamma models. The best model(s) were selected using standard statistical metrics, including the Akaike Information Criterion (AIC), the mean absolute percentage error (MAPE), and the mean squared error (MSE). The OLS log (y) model outperformed all other models across the different sample sizes. Policymakers could adopt the OLS log (y) model to predict healthcare costs in order to make a stronger case for adequate budgetary and logistic support.
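For readers unfamiliar with the selection metrics, the following is a minimal sketch of how an OLS model on log-transformed cost might be fitted and scored with MSE and MAPE. It is not the study's implementation. It assumes a single covariate (length of stay) and a naive exponential back-transformation with no retransformation bias correction. The function names and the data values are hypothetical and chosen only for illustration.

```python
import math


def mse(actual, predicted):
    """Mean squared error between observed and predicted costs."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def fit_ols_log_y(x, y):
    """Fit log(y) = b0 + b1 * x by ordinary least squares with one covariate."""
    log_y = [math.log(v) for v in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y))
    b1 = sxy / sxx
    b0 = mean_ly - b1 * mean_x
    return b0, b1


def predict_cost(b0, b1, x):
    """Naive back-transformation exp(b0 + b1 * x); an assumption, no bias correction applied."""
    return [math.exp(b0 + b1 * xi) for xi in x]


# Hypothetical illustrative values, NOT data from the study.
length_of_stay = [1, 2, 3, 4, 5, 6]
cost = [20.0, 35.0, 50.0, 90.0, 160.0, 300.0]

b0, b1 = fit_ols_log_y(length_of_stay, cost)
pred = predict_cost(b0, b1, length_of_stay)
print(f"MSE = {mse(cost, pred):.2f}, MAPE = {mape(cost, pred):.2f}%")
```

Lower MSE and MAPE indicate closer agreement between predicted and observed costs on the original cost scale. Comparing candidate models on the same data with these metrics, alongside AIC, is the kind of selection the abstract describes.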