We build hedonic price models for house prices in six geographical submarkets in the Netherlands. Our models are based on boosting, a recent data mining technique. Boosting is an ensemble technique that combines multiple models, in our case decision trees, into a single prediction. It can capture complex nonlinear relationships and interaction effects between input variables. We report the mean relative error and mean absolute error for all regions and compare our models with a standard linear regression approach. Our models improve prediction performance by up to 40% compared with linear regression. Next, we interpret the boosted models: we determine the most influential characteristics and graphically depict the relationship between the most important input variables and the house price. We find the size of the house to be the most important input in all but one region, and we find some interesting nonlinear relationships between inputs and price. Finally, we construct hedonic price indices and compare them with the mean and median indices; the indices differ notably in the urban regions of Amsterdam and Rotterdam.
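
To make the idea of boosting with decision trees concrete, the following is a minimal sketch, not the report's implementation. It assumes squared-error loss, one-split trees (stumps), a single input (house size), and a small synthetic price curve invented for illustration. All function names, the data, and the settings `n_rounds` and `learning_rate` are hypothetical, and the definition of mean relative error used here (mean of |actual - predicted| / actual) is an assumption, since the abstract does not spell it out.

```python
# Toy sketch only: synthetic data and hypothetical names, not the report's model or data.

def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) of the best one-split tree."""
    best = None
    # Dropping the largest value keeps both sides of every split non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_rounds=300, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean price
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_boosted(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def fit_linear(xs, ys):
    """Ordinary least squares with one input; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


def mean_absolute_error(ys, preds):
    return sum(abs(y - p) for y, p in zip(ys, preds)) / len(ys)


def mean_relative_error(ys, preds):
    # Assumed definition: average of |actual - predicted| / actual (needs actual > 0).
    return sum(abs(y - p) / y for y, p in zip(ys, preds)) / len(ys)


def synthetic_price(size):
    # Invented curve: price rises steeply up to 120 m2, then almost flattens.
    if size <= 120:
        return 100000 + 2500 * (size - 40)
    return 300000 + 250 * (size - 120)


sizes = [float(s) for s in range(40, 250, 10)]  # 40, 50, ..., 240 m2
prices = [synthetic_price(s) for s in sizes]

intercept, slope = fit_linear(sizes, prices)
linear_preds = [intercept + slope * s for s in sizes]

model = fit_boosted(sizes, prices)
boosted_preds = [predict_boosted(model, s) for s in sizes]

linear_mae = mean_absolute_error(prices, linear_preds)
boosted_mae = mean_absolute_error(prices, boosted_preds)
linear_mre = mean_relative_error(prices, linear_preds)
boosted_mre = mean_relative_error(prices, boosted_preds)
```

Both models are scored on the data they were fitted to, so the comparison only illustrates that a sum of small trees can follow a kink that a straight line cannot. It says nothing about out-of-sample accuracy, which a real evaluation would measure on held-out sales.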

Additional Metadata
Keywords: data mining, gradient boosting, hedonic price index, hedonic price models, housing, machine learning
Persistent URL: hdl.handle.net/1765/7665
Kagie, M., & van Wezel, M.C. (2006). Hedonic price models and indices based on boosting applied to the Dutch housing market (No. EI 2006-17). Report / Econometric Institute, Erasmus University Rotterdam. Retrieved from http://hdl.handle.net/1765/7665