A comparison of parametric and non-parametric methods to estimate the costs of obesity
Presenter: Stephan Gohmann, University of Louisville
Abstract
Rationale:
Many methods are available to determine the association between patient characteristics and health care expenditures. However, the quality of the predictions depends on the method used. Most existing research in this area has focused on regression-based methods. This research explores the feasibility of non-parametric methods and compares their performance in cost estimation with that of regression-based methods.
Objectives:
This study compares four parametric methods for estimating costs with three non-parametric methods. We use a sample of individuals from the Medical Expenditure Panel Survey to estimate costs as a function of demographic characteristics and individual health conditions. We then examine how well costs are predicted for a subsample of obese individuals.
Methodology:
This paper compares the predictions of four parametric alternatives with those of three non-parametric methods. The parametric alternatives are ordinary least squares, a generalized linear model with a log link and Gaussian family, a generalized linear model with a log link and Poisson family, and the retransformation smearing model proposed by Duan (1983). The non-parametric methods are neural networks, case-based reasoning, and decision trees. The data are health expenditures in 2002 from the Medical Expenditure Panel Survey. We use k-fold cross-validation to build and test the models on our sample data. We evaluate the predicted values against six criteria to determine which models predict best: root mean squared error, mean absolute error, maximum absolute error, root relative squared error, root absolute error, and correlation. We also apply feature reduction methods, such as principal component analysis and variable selection, to see whether they improve the final results. We compare the results for the full sample and then for a subsample of obese individuals.
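The evaluation loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic rather than MEPS records, only two of the parametric models are shown (ordinary least squares and a log-OLS model retransformed with Duan's smearing factor), and every function name is hypothetical. It assumes strictly positive costs so that the log is defined, and it assumes that root relative squared error uses the mean of the evaluated costs as its baseline. "Root absolute error" is left out because the abstract does not define it.

```python
import numpy as np

def _design(X):
    # Add an intercept column to the covariates.
    return np.column_stack([np.ones(len(X)), X])

def ols_predict(X_train, y_train, X_test):
    # Ordinary least squares on the raw cost scale.
    beta, *_ = np.linalg.lstsq(_design(X_train), y_train, rcond=None)
    return _design(X_test) @ beta

def duan_smearing_predict(X_train, y_train, X_test):
    # OLS on log(cost), then retransform with the smearing factor:
    # the mean of the exponentiated training residuals (Duan, 1983).
    A = _design(X_train)
    log_y = np.log(y_train)
    beta, *_ = np.linalg.lstsq(A, log_y, rcond=None)
    smear = np.mean(np.exp(log_y - A @ beta))
    return np.exp(_design(X_test) @ beta) * smear

def k_fold_predictions(predict_fn, X, y, k=5, seed=0):
    # Each observation is predicted by a model fit on the other k-1 folds.
    idx = np.random.default_rng(seed).permutation(len(y))
    out = np.empty(len(y))
    for test_idx in np.array_split(idx, k):
        train_idx = np.setdiff1d(idx, test_idx)
        out[test_idx] = predict_fn(X[train_idx], y[train_idx], X[test_idx])
    return out

def error_metrics(y, y_hat):
    err = y - y_hat
    return {
        "rmse": np.sqrt(np.mean(err ** 2)),
        "mae": np.mean(np.abs(err)),
        "max_ae": np.max(np.abs(err)),
        # Assumed baseline: predicting the mean of the evaluated costs.
        "rrse": np.sqrt(np.sum(err ** 2) / np.sum((y - y.mean()) ** 2)),
        "corr": np.corrcoef(y, y_hat)[0, 1],
    }

# Synthetic, right-skewed "costs" (illustrative only, not MEPS data).
rng = np.random.default_rng(42)
n = 500
X = rng.normal(size=(n, 3))
y = np.exp(6.0 + X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.8, size=n))

models = {"ols": ols_predict, "duan_smearing": duan_smearing_predict}
results = {name: error_metrics(y, k_fold_predictions(fn, X, y))
           for name, fn in models.items()}

for name, metrics in results.items():
    print(name, {key: round(float(val), 3) for key, val in metrics.items()})
```

Evaluating a subsample, as the study does for obese individuals, would amount to calling `error_metrics` on the out-of-fold predictions restricted to the rows of interest.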
Results:
Using the selection criteria above, the case-based reasoning and neural network models perform better than the parametric models under three of the six criteria. For the obese subsample, the non-parametric methods tend to predict better. The results indicate that while the non-parametric methods are not markedly superior to the parametric methods, they produce comparable results and are clearly worthy of further study.
Conclusions:
The non-parametric methods may be a viable alternative to the parametric methods used in previous studies.
Authors: Jeff Guan, Jozef Zurada
Session: Poster
Time: -
Room: No.3 Hall
