L.F. Hoogerheide (Lennart)
http://repub.eur.nl/ppl/2001/
List of Publications
http://repub.eur.nl/
RePub, Erasmus University Repository
Bayesian Analysis of Instrumental Variable Models: Acceptance-Rejection within Direct Monte Carlo
http://repub.eur.nl/pub/73371/
Sat, 01 Feb 2014 00:00:01 GMT<div>A. Zellner</div><div>T. Ando</div><div>N. Basturk</div><div>L.F. Hoogerheide</div><div>H.K. van Dijk</div>
We discuss Bayesian inferential procedures within the family of instrumental variables regression models and focus on two issues: existence conditions for posterior moments of the parameters of interest under a flat prior and the potential of Direct Monte Carlo (DMC) approaches for efficient evaluation of such possibly highly non-elliptical posteriors. We show that, for the general case of m endogenous variables under a flat prior, posterior moments of order r exist for the coefficients reflecting the endogenous regressors' effect on the dependent variable, if the number of instruments is greater than m + r, even though there is an issue of local non-identification that causes non-elliptical shapes of the posterior. This stresses the need for efficient Monte Carlo integration methods. We introduce an extension of DMC that incorporates an acceptance-rejection sampling step within DMC. This Acceptance-Rejection within Direct Monte Carlo (ARDMC) method has the attractive property that the generated random drawings are independent, which greatly helps the fast convergence of simulation results, and which facilitates the evaluation of the numerical accuracy. The speed of ARDMC can easily be improved further by parallelized computation on multi-core machines or computer clusters. We note that ARDMC is an analogue to the well-known "Metropolis-Hastings within Gibbs" sampling in the sense that one 'more difficult' step is used within an 'easier' simulation method. We compare the ARDMC approach with the Gibbs sampler using simulated data and two empirical data sets, involving the settler mortality instrument of Acemoglu et al. (2001) and the father's education instrument used by Hoogerheide et al. (2012a). Even without making use of parallelized computation, an efficiency gain is observed both under strong and weak instruments, where the gain can be enormous in the latter case.
Censored Posterior and Predictive Likelihood in Bayesian Left-Tail Prediction for Accurate Value at Risk Estimation
http://repub.eur.nl/pub/39847/
Mon, 15 Apr 2013 00:00:01 GMT<div>L. Gatarek</div><div>L.F. Hoogerheide</div><div>K. Hooning</div>
Accurate prediction of risk measures such as Value at Risk (VaR) and Expected Shortfall (ES) requires precise estimation of the tail of the predictive distribution. Two novel concepts are introduced that offer a specific focus on this part of the predictive density: the censored posterior, a posterior in which the likelihood is replaced by the censored likelihood; and the censored predictive likelihood, which is used for Bayesian Model Averaging. We perform extensive experiments involving simulated and empirical data. Our results show the ability of these new approaches to outperform the standard posterior and traditional Bayesian Model Averaging techniques in applications of Value-at-Risk prediction in GARCH models.
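The censored likelihood idea in the abstract above can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the abstract: an i.i.d. normal model stands in for the GARCH models, observations below a threshold contribute their full density, and observations at or above it contribute only the probability mass of the non-tail region. The function names and the threshold are hypothetical.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def normal_cdf(y, mu, sigma):
    """Distribution function of N(mu, sigma^2) evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def censored_loglik(returns, mu, sigma, threshold):
    """Censored log-likelihood focused on the left tail (illustrative).

    Observations below `threshold` enter with their exact log density.
    Observations at or above it are treated as censored: each one
    contributes only log P(y >= threshold) under the model.
    """
    log_upper_mass = math.log(1.0 - normal_cdf(threshold, mu, sigma))
    total = 0.0
    for y in returns:
        if y < threshold:
            total += normal_logpdf(y, mu, sigma)
        else:
            total += log_upper_mass
    return total


# Hypothetical daily returns (in percent) and a hypothetical threshold.
sample = [-2.1, 0.3, -0.4, 1.2, -3.0, 0.8]
print(censored_loglik(sample, mu=0.0, sigma=1.5, threshold=-1.0))
```

Multiplying a prior by the exponential of this quantity would give a kernel of what the abstract calls the censored posterior. The shape of the model above the threshold then affects the fit only through the total probability mass placed there.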
Genome-wide analysis of macrosatellite repeat copy number variation in worldwide populations: Evidence for differences and commonalities in size distributions and size restrictions
http://repub.eur.nl/pub/40840/
Mon, 04 Mar 2013 00:00:01 GMT<div>M. Schaap</div><div>R.J.L.F. Lemmers</div><div>R. Maassen</div><div>P.J. van der Vliet</div><div>L.F. Hoogerheide</div><div>H.K. van Dijk</div><div>N. Basturk</div><div>P. de Knijff</div><div>S.M. van der Maarel</div>
Background: Macrosatellite repeats (MSRs), usually spanning hundreds of kilobases of genomic DNA, comprise a significant proportion of the human genome. Because of their highly polymorphic nature, MSRs represent an extreme example of copy number variation, but their structure and function is largely understudied. Here, we describe a detailed study of six autosomal and two X chromosomal MSRs among 270 HapMap individuals from Central Europe, Asia and Africa. Copy number variation, stability and genetic heterogeneity of the autosomal macrosatellite repeats RS447 (chromosome 4p), MSR5p (5p), FLJ40296 (13q), RNU2 (17q) and D4Z4 (4q and 10q) and X chromosomal DXZ4 and CT47 were investigated. Results: Repeat array size distribution analysis shows that all of these MSRs are highly polymorphic with the most genetic variation among Africans and the least among Asians. A mitotic mutation rate of 0.4-2.2% was observed, exceeding meiotic mutation rates and possibly explaining the large size variability found for these MSRs. By means of a novel Bayesian approach, statistical support for a distinct multimodal rather than a uniform allele size distribution was detected in seven out of eight MSRs, with evidence for equidistant intervals between the modes. Conclusions: The multimodal distributions with evidence for equidistant intervals, in combination with the observation of MSR-specific constraints on minimum array size, suggest that MSRs are limited in their configurations and that deviations thereof may cause disease, as is the case for facioscapulohumeral muscular dystrophy. However, at present we cannot exclude that there are mechanistic constraints for MSRs that are not directly disease-related. This study represents the first comprehensive study of MSRs in different human populations by applying novel statistical methods and identifies commonalities and differences in their organization and function in the human genome. 
A class of adaptive importance sampling weighted EM algorithms for efficient and robust posterior and predictive simulation
http://repub.eur.nl/pub/37738/
Sat, 01 Dec 2012 00:00:01 GMT<div>L.F. Hoogerheide</div><div>A. Opschoor</div><div>H.K. van Dijk</div>
A class of adaptive sampling methods is introduced for efficient posterior and predictive simulation. The proposed methods are robust in the sense that they can handle target distributions that exhibit non-elliptical shapes such as multimodality and skewness. The basic method makes use of sequences of importance weighted Expectation Maximization steps in order to efficiently construct a mixture of Student-t densities that accurately approximates the target distribution (typically a posterior distribution, of which we only require a kernel) in the sense that the Kullback-Leibler divergence between target and mixture is minimized. We label this approach Mixture of t by Importance Sampling weighted Expectation Maximization (MitISEM). The constructed mixture is used as a candidate density for quick and reliable application of either Importance Sampling (IS) or the Metropolis-Hastings (MH) method. We also introduce three extensions of the basic MitISEM approach. First, we propose a method for applying MitISEM in a sequential manner, so that the candidate distribution for posterior simulation is cleverly updated when new data become available. Our results show that the computational effort reduces enormously, while the quality of the approximation remains almost unchanged. This sequential approach can be combined with a tempering approach, which facilitates the simulation from densities with multiple modes that are far apart. Second, we introduce a permutation-augmented MitISEM approach. This is useful for importance or Metropolis-Hastings sampling from posterior distributions in mixture models without the requirement of imposing identification restrictions on the model's mixture regimes' parameters. Third, we propose a partial MitISEM approach, which aims at approximating the joint distribution by estimating a product of marginal and conditional distributions. This division can substantially reduce the dimension of the approximation problem, which facilitates the application of adaptive importance sampling for posterior simulation in more complex models with larger numbers of parameters. Our results indicate that the proposed methods can substantially reduce the computational burden in econometric models like DCC or mixture GARCH models and a mixture instrumental variables model.
The R Package MitISEM: Mixture of Student-t Distributions using Importance Sampling Weighted Expectation Maximization for Efficient and Robust Simulation
http://repub.eur.nl/pub/37313/
Thu, 20 Sep 2012 00:00:01 GMT<div>N. Basturk</div><div>L.F. Hoogerheide</div><div>A. Opschoor</div><div>H.K. van Dijk</div>
This paper presents the R package MitISEM, which provides an automatic and flexible method to approximate a non-elliptical target density using adaptive mixtures of Student-t densities, where only a kernel of the target density is required. The approximation can be used as a candidate density in Importance Sampling or Metropolis-Hastings methods for Bayesian inference on model parameters and probabilities. The package also provides an extended MitISEM algorithm, 'sequential MitISEM', which substantially decreases the computational time when the target density has to be approximated for increasing data samples. This occurs when the posterior distribution is updated with new observations and/or when one computes model probabilities using predictive likelihoods. We illustrate the MitISEM algorithm using three canonical statistical and econometric models that are characterized by several types of non-elliptical posterior shapes and that describe well-known data patterns in econometrics and finance. We show that the candidate distribution obtained by MitISEM outperforms those obtained by 'naive' approximations in terms of numerical efficiency. Further, the MitISEM approach can be used for Bayesian model comparison, using the predictive likelihoods.
Are education and entrepreneurial income endogenous? A Bayesian analysis.
http://repub.eur.nl/pub/38059/
Sun, 01 Jul 2012 00:00:01 GMT<div>J.H. Block</div><div>L.F. Hoogerheide</div><div>A.R. Thurik</div>
Education is a well-known driver of (entrepreneurial) income. The measurement of its influence, however, suffers from endogeneity suspicion. For instance, ability and occupational choice are mentioned as driving both the level of (entrepreneurial) income and of education. Using instrumental variables can provide a way out. However, two questions remain: whether endogeneity is really present and whether it matters for the size of the estimated relationship. Using Bayesian methods, we find that the relationship between education and entrepreneurial income is indeed endogenous and that the impact of endogeneity on the estimated relationship between education and income is sizeable. Implications of our findings for research and practice are discussed.
Family background variables as instruments for education in income regressions: A Bayesian analysis
http://repub.eur.nl/pub/32155/
Fri, 16 Mar 2012 00:00:01 GMT<div>L.F. Hoogerheide</div><div>J.H. Block</div><div>A.R. Thurik</div>
The validity of family background variables instrumenting education in income regressions has been much criticized. In this paper, we use data from the 2004 German Socio-Economic Panel and Bayesian analysis to analyze to what degree violations of the strict validity assumption affect the estimation results. We show that, in case of moderate direct effects of the instrument on the dependent variable, the results do not deviate much from the benchmark case of no such effect (perfect validity of the instrument's exclusion restriction). In many cases, the size of the bias is smaller than the width of the 95% posterior interval for the effect of education on income. Thus, a violation of the strict validity assumption does not necessarily lead to results which are strongly different from those of the strict validity case. This finding provides confidence in the use of family background variables as instruments in income regressions.
► The paper analyzes to what degree violations of the perfect validity of the exclusion restriction for family background variables in income regression affect the estimation results.
► In case of moderate direct effects of the instrument on the dependent variable, the results do not deviate much from the benchmark case of no such effect (perfect validity of the instrument's exclusion restriction).
► The finding provides confidence in the use of family background variables as instruments in income regressions.
Forecast rationality tests based on multi-horizon bounds: Comment
http://repub.eur.nl/pub/32010/
Sun, 01 Jan 2012 00:00:01 GMT<div>L.F. Hoogerheide</div><div>F. Ravazzolo</div><div>H.K. van Dijk</div>
Instrumental Variables, Errors in Variables, and Simultaneous Equations Models: Applicability and Limitations of Direct Monte Carlo
http://repub.eur.nl/pub/26507/
Tue, 27 Sep 2011 00:00:01 GMT<div>A. Zellner</div><div>T. Ando</div><div>N. Basturk</div><div>L.F. Hoogerheide</div><div>H.K. van Dijk</div>
A Direct Monte Carlo (DMC) approach is introduced for posterior simulation in the Instrumental Variables (IV) model with one possibly endogenous regressor, multiple instruments and Gaussian errors under a flat prior. This DMC method can also be applied in an IV model (with one or multiple instruments) under an informative prior for the endogenous regressor's effect. This DMC approach cannot be applied to more complex IV models or Simultaneous Equations Models with multiple endogenous regressors. An Approximate DMC (ADMC) approach is introduced that makes use of the proposed Hybrid Mixture Sampling (HMS) method, which facilitates Metropolis-Hastings (MH) or Importance Sampling from a proper marginal posterior density with highly non-elliptical shapes that tend to infinity for a point of singularity. After one has simulated from the irregularly shaped marginal distribution using the HMS method, one easily samples the other parameters from their conditional Student-t and Inverse-Wishart posteriors. An example illustrates the close approximation and high MH acceptance rate, while using a simple candidate distribution such as the Student-t may lead to an infinite variance of Importance Sampling weights. The choice between the IV model and a simple linear model under the restriction of exogeneity may be based on predictive likelihoods, for which the efficient simulation of all model parameters may be quite useful. In future work the ADMC approach may be extended to more extensive IV models such as IV with non-Gaussian errors, panel IV, or probit/logit IV.
Backtesting Value-at-Risk using Forecasts for Multiple Horizons, a Comment on the Forecast Rationality Tests of A.J. Patton and A. Timmermann
http://repub.eur.nl/pub/26505/
Thu, 01 Sep 2011 00:00:01 GMT<div>L.F. Hoogerheide</div><div>F. Ravazzolo</div><div>H.K. van Dijk</div>
Patton and Timmermann (2011, 'Forecast Rationality Tests Based on Multi-Horizon Bounds', Journal of Business & Economic Statistics, forthcoming) propose a set of useful tests for forecast rationality or optimality under squared error loss, including an easily implemented test based on a regression that only involves (long-horizon and short-horizon) forecasts and no observations on the target variable. We propose an extension, a simulation-based procedure that takes into account the presence of errors in parameter estimates. This procedure can also be applied in the field of 'backtesting' models for Value-at-Risk. Applications to simple AR and ARCH time series models show that its power in detecting certain misspecifications is larger than the power of well-known tests for correct Unconditional Coverage and Conditional Coverage.
Education and entrepreneurial choice: An instrumental variables analysis
http://repub.eur.nl/pub/37789/
Wed, 01 Jun 2011 00:00:01 GMT<div>J.H. Block</div><div>L.F. Hoogerheide</div><div>A.R. Thurik</div>
Education is argued to be an important driver of the decision to start a business. However, the measurement of its influence is difficult since it is considered to be an endogenous variable. This study accounts for this endogeneity by using an instrumental variables approach and a dataset of more than 10,000 individuals from 27 European countries and the USA. The effect of education on the decision to become self-employed is found to be strongly positive, much higher than the estimated effect in case no instrumental variables are used. That is, the higher the respondent’s level of education, the greater the likelihood that they will start a business. Implications for entrepreneurship research and practice are discussed.
A Class of Adaptive EM-based Importance Sampling Algorithms for Efficient and Robust Posterior and Predictive Simulation
http://repub.eur.nl/pub/22332/
Sat, 01 Jan 2011 00:00:01 GMT<div>L.F. Hoogerheide</div><div>A. Opschoor</div><div>H.K. van Dijk</div>
A class of adaptive sampling methods is introduced for efficient posterior and predictive simulation. The proposed methods are robust in the sense that they can handle target distributions that exhibit non-elliptical shapes such as multimodality and skewness. The basic method makes use of sequences of importance weighted Expectation Maximization steps in order to efficiently construct a mixture of Student-t densities that accurately approximates the target distribution (typically a posterior distribution, of which we only require a kernel) in the sense that the Kullback-Leibler divergence between target and mixture is minimized. We label this approach Mixture of t by Importance Sampling and Expectation Maximization (MitISEM). We also introduce three extensions of the basic MitISEM approach. First, we propose a method for applying MitISEM in a sequential manner, so that the candidate distribution for posterior simulation is cleverly updated when new data become available. Our results show that the computational effort reduces enormously. This sequential approach can be combined with a tempering approach, which facilitates the simulation from densities with multiple modes that are far apart. Second, we introduce a permutation-augmented MitISEM approach, for importance sampling from posterior distributions in mixture models without the requirement of imposing identification restrictions on the model's mixture regimes' parameters. Third, we propose a partial MitISEM approach, which aims at approximating the marginal and conditional posterior distributions of subsets of model parameters, rather than the joint. This division can substantially reduce the dimension of the approximation problem.
Stock Index Returns' Density Prediction using GARCH Models: Frequentist or Bayesian Estimation?
http://repub.eur.nl/pub/22344/
Sat, 01 Jan 2011 00:00:01 GMT<div>L.F. Hoogerheide</div><div>D. David</div><div>N. Corre</div>
Using well-known GARCH models for density prediction of daily S&P 500 and Nikkei 225 index returns, a comparison is provided between frequentist and Bayesian estimation. No significant difference is found in the quality of the forecasts of the whole density, whereas the Bayesian approach exhibits significantly better left-tail forecast accuracy.
A comparative study of Monte Carlo methods for efficient evaluation of marginal likelihood
http://repub.eur.nl/pub/21335/
Mon, 18 Oct 2010 00:00:01 GMT<div>D. David</div><div>D. David</div><div>L.F. Hoogerheide</div><div>H.K. van Dijk</div>
Strategic choices for efficient and accurate evaluation of marginal likelihoods by means of Monte Carlo simulation methods are studied for the case of highly non-elliptical posterior distributions. A comparative analysis is presented of possible advantages and limitations of different simulation techniques; of possible choices of candidate distributions and choices of target or warped target distributions; and finally of numerical standard errors. The importance of a robust and flexible estimation strategy is demonstrated where the complete posterior distribution is explored. Given an appropriately yet quickly tuned adaptive candidate, straightforward importance sampling provides a computationally efficient estimator of the marginal likelihood (and a reliable and easily computed corresponding numerical standard error) in the cases investigated, which include a non-linear regression model and a mixture GARCH model. Warping the posterior density can lead to a further gain in efficiency, but it is more important that the posterior kernel be appropriately wrapped by the candidate distribution than that it is warped.
Family Background Variables as Instruments for Education in Income Regressions: A Bayesian Analysis
http://repub.eur.nl/pub/20281/
Thu, 01 Jul 2010 00:00:01 GMT<div>L.F. Hoogerheide</div><div>J.H. Block</div><div>A.R. Thurik</div>
The validity of family background variables instrumenting education in income regressions has been much criticized. In this paper, we use data from the 2004 German Socio-Economic Panel and Bayesian analysis in order to analyze to what degree violations of the strong validity assumption affect the estimation results. We show that, in case of moderate direct effects of the instrument on the dependent variable, the results do not deviate much from the benchmark case of no such effect (perfect validity of the instrument). The size of the bias is in many cases smaller than the standard error of education’s estimated coefficient. Thus, the violation of the strict validity assumption does not necessarily lead to strongly different results when compared to the strict validity case. This provides confidence in the use of family background variables as instruments in income regressions.
A Comparative Study of Monte Carlo Methods for Efficient Evaluation of Marginal Likelihoods
http://repub.eur.nl/pub/19830/
Tue, 01 Jun 2010 00:00:01 GMT<div>D. David</div><div>N. Basturk</div><div>L.F. Hoogerheide</div><div>H.K. van Dijk</div>
Strategic choices for efficient and accurate evaluation of marginal likelihoods by means of Monte Carlo simulation methods are studied for the case of highly non-elliptical posterior distributions. A comparative analysis is presented of possible advantages and limitations of different simulation techniques; of possible choices of candidate distributions and choices of target or warped target distributions; and finally of numerical standard errors. The importance of a robust and flexible estimation strategy is demonstrated where the complete posterior distribution is explored. Given an appropriately yet quickly tuned adaptive candidate, straightforward importance sampling provides a computationally efficient estimator of the marginal likelihood (and a reliable and easily computed corresponding numerical standard error) in the cases investigated in this paper, which include a non-linear regression model and a mixture GARCH model. Warping the posterior density can lead to a further gain in efficiency, but it is more important that the posterior kernel is appropriately wrapped by the candidate distribution than that it is warped.
Efficient Bayesian Estimation and Combination of GARCH-Type Models
http://repub.eur.nl/pub/19380/
Tue, 27 Apr 2010 00:00:01 GMT<div>D. David</div><div>L.F. Hoogerheide</div>
This paper proposes an up-to-date review of estimation strategies available for the Bayesian inference of GARCH-type models. The emphasis is put on a novel efficient procedure named AdMitIS. The methodology automatically constructs a mixture of Student-t distributions as an approximation to the posterior density of the model parameters. This density is then used in importance sampling for model estimation, model selection and model combination. The procedure is fully automatic, which avoids difficult and time-consuming tuning of MCMC strategies. The AdMitIS methodology is illustrated with an empirical application to S&P index log-returns. Several non-nested GARCH-type models are estimated and combined to predict the distribution of next-day ahead log-returns.
Bayesian forecasting of Value at Risk and Expected Shortfall using adaptive importance sampling
http://repub.eur.nl/pub/76547/
Thu, 01 Apr 2010 00:00:01 GMT<div>L.F. Hoogerheide</div><div>H.K. van Dijk</div>
An efficient and accurate approach is proposed for forecasting the Value at Risk (VaR) and Expected Shortfall (ES) measures in a Bayesian framework. This consists of a new adaptive importance sampling method for the Quick Evaluation of Risk using Mixture of t approximations (QERMit). As a first step, the optimal importance density is approximated, after which multi-step 'high loss' scenarios are efficiently generated. Numerical standard errors are compared in simple illustrations and in an empirical GARCH model with Student-t errors for daily S&P 500 returns. The results indicate that the proposed QERMit approach outperforms alternative approaches, in the sense that it produces more accurate VaR and ES estimates given the same amount of computing time, or, equivalently, that it requires less computing time for the same numerical accuracy.
Forecast accuracy and economic gains from Bayesian model averaging using time-varying weights
http://repub.eur.nl/pub/18574/
Mon, 01 Mar 2010 00:00:01 GMT<div>L.F. Hoogerheide</div><div>R.H. Kleijn</div><div>F. Ravazzolo</div><div>H.K. van Dijk</div><div>M.J.C.M. Verbeek</div>
Several Bayesian model combination schemes, including some novel approaches that simultaneously allow for parameter uncertainty, model uncertainty and robust time-varying model weights, are compared in terms of forecast accuracy and economic gains using financial and macroeconomic time series. The results indicate that the proposed time-varying model weight schemes outperform other combination schemes in terms of predictive and economic gains. In an empirical application using returns on the S&P 500 index, time-varying model weights provide improved forecasts with substantial economic gains in an investment strategy including transaction costs. Another empirical example refers to forecasting US economic growth over the business cycle. It suggests that time-varying combination schemes may be very useful in business cycle analysis and forecasting, as these may provide an early indicator for recessions.
Are Education and Entrepreneurial Income Endogenous and do Family Background Variables make Sense as Instruments? A Bayesian Analysis
http://repub.eur.nl/pub/18349/
Fri, 26 Feb 2010 00:00:01 GMT<div>J.H. Block</div><div>L.F. Hoogerheide</div><div>A.R. Thurik</div>
Education is a well-known driver of (entrepreneurial) income. The measurement of its influence, however, suffers from endogeneity suspicion. For instance, ability and occupational choice are mentioned as driving both the level of (entrepreneurial) income and of education. Using instrumental variables can provide a way out. However, three questions remain: whether endogeneity is really present, whether it matters and whether the selected instruments make sense. Using Bayesian methods, we find that the relationship between education and entrepreneurial income is indeed endogenous and that the impact of endogeneity on the estimated relationship between education and income is sizeable. We do so using family background variables and show that relaxing the strict validity assumption of these instruments does not lead to strongly different results. This is an important finding because family background variables are generally strongly correlated with education and are available in most datasets. Our approach is applicable beyond the field of returns to education for income. It applies wherever endogeneity suspicion arises and the three questions become relevant.
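The endogeneity problem described in the abstract above can be made concrete with a small simulation. This is only a sketch under stated assumptions: it uses simple frequentist estimators, not the Bayesian analysis of the paper, and the data-generating process, variable names and parameter values are all hypothetical. An unobserved "ability" term drives both education and income, so the ordinary least squares slope is biased, while a simple instrumental variables estimator based on a valid instrument recovers the true effect approximately.

```python
import random


def covariance(a, b):
    """Sample covariance of two equally long lists (divisor n)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


def simulate(n, beta, seed=0):
    """Simulate instrument z, education x and income y (hypothetical model)."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)       # instrument, e.g. family background
        ability = rng.gauss(0.0, 1.0)  # unobserved confounder
        xi = zi + ability              # education depends on both
        error = 0.8 * ability + rng.gauss(0.0, 1.0)
        z.append(zi)
        x.append(xi)
        y.append(beta * xi + error)    # income; the error correlates with x
    return z, x, y


def ols_slope(x, y):
    """Least squares slope of y on x; biased when x is endogenous."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Simple IV estimator with one instrument: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


z, x, y = simulate(20000, beta=0.1)
print("OLS slope:", ols_slope(x, y))
print("IV slope: ", iv_slope(z, x, y))
```

In this setup the least squares slope overstates the true effect of 0.1 because the error term shares the ability component with education. The IV slope relies on the assumption that the instrument affects income only through education, which is the strict validity assumption the abstract discusses relaxing.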