Goodness-of-fit tests for a heavy tailed distribution
Introduction
We say that a distribution function $F$ has a heavy tail with tail index $\alpha$ if
$$\lim_{t\to\infty}\frac{1-F(tx)}{1-F(t)} = x^{-\alpha} \quad \text{for all } x>0 \tag{1}$$
holds for some $\alpha>0$. Heavy tailed distributions have been applied in many different areas such as population size, random graphs, Internet traffic, hydrology, market research and finance; see Zipf (1949), Reittu and Norros (2004), Resnick (1997b), Katz et al. (2002), Anderson (2006) and Danielsson and de Vries (1997).
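Most of the theoretical work on heavy tailed distributions concentrates on estimating the tail index $\alpha$. One of the best known estimators is the Hill estimator
$$\hat\alpha_H = \left\{\frac{1}{k}\sum_{i=1}^{k}\log X_{n,n-i+1} - \log X_{n,n-k}\right\}^{-1},$$
see Hill (1975), which employs a certain fraction of the upper order statistics $X_{n,n-k} \le \cdots \le X_{n,n}$, $1 \le k < n$, where $X_{n,1} \le \cdots \le X_{n,n}$ denote the order statistics of the sample. Several data-driven methods for choosing the sample fraction $k$ have been proposed in the literature; see Drees and Kaufmann (1998), Danielsson et al. (2001) and Guillou and Hall (2001).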
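As an illustration of the Hill estimator, the following is a minimal Python sketch, assuming NumPy is available and that the $(k+1)$-th largest observation is positive. The function name `hill_estimator` and the Pareto test sample are illustrative choices, not part of the original paper.

```python
import numpy as np

def hill_estimator(sample, k):
    """Hill (1975) estimate of the tail index alpha from the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = x[n - k - 1]              # X_{n,n-k}, the random threshold
    if threshold <= 0:
        raise ValueError("the threshold X_{n,n-k} must be positive")
    top = x[n - k:]                       # X_{n,n-k+1}, ..., X_{n,n}
    # Mean log-excess over the threshold estimates 1/alpha.
    gamma_hat = np.mean(np.log(top) - np.log(threshold))
    return 1.0 / gamma_hat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha = 2.0
    # Exact Pareto sample with P(X > x) = x^(-alpha), x >= 1 (inverse transform).
    sample = (1.0 - rng.uniform(size=5000)) ** (-1.0 / alpha)
    print(hill_estimator(sample, k=500))
```

The estimate depends on the choice of $k$; the data-driven selection methods cited above are not implemented in this sketch.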
As far as we know, goodness-of-fit testing has not received as much attention as tail index estimation. A goodness-of-fit test for a heavy tailed distribution is a test of the null hypothesis $H_0$: “$F$ is heavy tailed” versus the alternative hypothesis $H_1$: “$F$ is not heavy tailed”, that is, $H_0$: “(1) holds” versus $H_1$: “(1) does not hold”. An interesting aspect of such tests is that the null hypothesis provides a description of the heavy tailed distribution which is incomplete in the following two respects:
- (i) The tail index $\alpha$ is unknown under the null hypothesis, and hence must be estimated. It is well known that estimating unknown parameters has a nonnegligible effect on the distributions of test statistics; see Durbin (1973a, 1973b).
- (ii) The tail index describes only the tail behaviour of the distribution, and thus gives only a partial description of the underlying distribution.
Recently, Beirlant et al. (2006) studied tests for Pareto-type data, and Drees et al. (2006) employed the Cramér–von Mises statistic to test the null hypothesis that a distribution is in the domain of attraction of an extreme value distribution with extreme value index larger than $-1/2$. This may be thought of as a generalization of Choulakian and Stephens (2001), since the deterministic threshold in Choulakian and Stephens (2001) is replaced by a random threshold in Drees et al. (2006).
In contrast to Beirlant et al. (2006), in this paper we compare three tests, the Kolmogorov–Smirnov test, the Berk–Jones test and the “estimated score” test, in terms of Bahadur efficiency. The Kolmogorov–Smirnov test has a long tradition in statistics; see Kolmogorov (1933), Smirnov (1948) and Durbin (1973a). The Berk–Jones test may be viewed as a nonparametric likelihood test, and was derived for the situation where the null hypothesis completely specifies the distribution of the observations; see Berk and Jones (1978, 1979). In this particular situation, the Berk–Jones test was shown to be more efficient, in the sense of Bahadur efficiency, than any weighted Kolmogorov test at any alternative. Following Li (2003), one may argue that the Berk–Jones test should also perform better than the Kolmogorov–Smirnov test in situations where the null hypothesis does not completely specify the distribution of the observations. In contrast to the Berk–Jones test, the “estimated score” test in Hjort and Koning (2002) was specifically proposed for the situation where parameters are unknown.
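To make the comparison concrete, here is a minimal Python sketch of the two statistics in the classical setting where the null hypothesis completely specifies a continuous distribution function $F_0$. It assumes a common two-sided form of the Berk–Jones statistic, $\sup_x K(F_n(x), F_0(x))$ with $K(p,q) = p\log(p/q) + (1-p)\log((1-p)/(1-q))$; the tail versions with estimated tail index studied in this paper are not reproduced here, and the function names are illustrative.

```python
import numpy as np

def _kl_bernoulli(p, q):
    """K(p, q) = p log(p/q) + (1-p) log((1-p)/(1-q)), with 0 log 0 taken as 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        a = np.where(p > 0, p * np.log(p / q), 0.0)
        b = np.where(p < 1, (1 - p) * np.log((1 - p) / (1 - q)), 0.0)
    return a + b

def ks_and_bj(sample, cdf):
    """Return (KS, BJ) statistics for a fully specified continuous null cdf."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Null cdf at the order statistics, kept strictly inside (0, 1).
    u = np.clip(cdf(x), 1e-12, 1 - 1e-12)
    i = np.arange(1, n + 1)
    upper, lower = i / n, (i - 1) / n     # F_n just after / just before each point
    ks = max(np.max(upper - u), np.max(u - lower))
    # K(p, q) is convex in q, so the supremum over x is attained at jump points.
    bj = max(np.max(_kl_bernoulli(upper, u)), np.max(_kl_bernoulli(lower, u)))
    return float(ks), float(bj)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.uniform(size=200)
    print(ks_and_bj(sample, lambda t: t))  # null: uniform on (0, 1)
```

Because $K(p,q) \ge 2(p-q)^2$, the Berk–Jones value is never smaller than twice the squared Kolmogorov–Smirnov value, and it weights discrepancies near the ends of the distribution more heavily.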
The paper is organized as follows. In Section 2 we present the Kolmogorov–Smirnov test, the Berk–Jones test, the score test and their integrated versions. Large deviation results for tail empirical processes are obtained as a byproduct of studying Bahadur efficiency. A simulation study and real applications are given in Section 3. All proofs are gathered in Section 4.
The KS, BJ and SC supremum tests
Suppose that $X_1, \ldots, X_n$ are i.i.d. observations with distribution function $F$, and define where , and $X_{n,1} \le \cdots \le X_{n,n}$ are the order statistics of $X_1, \ldots, X_n$. One may think of as the empirical distribution function corresponding to the ordered sample , . Observe that Throughout we shall assume that $k = k(n) \to \infty$ and $k/n \to 0$ as $n \to \infty$.
Our methods are motivated by the following fact. Let
Simulation study
First we simulate 100,000 random samples from the Fréchet distribution with sample size , and then compute the test statistics , , , , and for . Based on these computed test statistics, we obtain the critical values at the 0.95 level; see Table 1.
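The Monte Carlo procedure for critical values can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: `tail_ks_statistic` is a hypothetical Kolmogorov–Smirnov-type tail statistic (the relative excesses over the random threshold $X_{n,n-k}$ are compared with a Pareto distribution fitted by the Hill estimator), and the values of `n`, `k`, `alpha` and `reps` are arbitrary illustrative choices, with far fewer replications than the 100,000 used in the study.

```python
import numpy as np

def rfrechet(rng, n, alpha):
    """Draw n Frechet(alpha) variates, F(x) = exp(-x^(-alpha)), by inverse transform."""
    u = np.maximum(rng.uniform(size=n), np.finfo(float).tiny)  # avoid log(0)
    return (-np.log(u)) ** (-1.0 / alpha)

def tail_ks_statistic(sample, k):
    """Hypothetical KS-type tail statistic (an assumption, not the paper's definition)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    y = x[n - k:] / x[n - k - 1]           # relative excesses, ascending, all >= 1
    alpha_hat = 1.0 / np.mean(np.log(y))   # Hill estimate of the tail index
    fitted = 1.0 - y ** (-alpha_hat)       # fitted Pareto cdf at each excess
    i = np.arange(1, k + 1)
    d = max(np.max(i / k - fitted), np.max(fitted - (i - 1) / k))
    return float(np.sqrt(k) * d)

def mc_critical_value(stat, n, k, alpha=1.0, reps=2000, level=0.95, seed=0):
    """Empirical `level` quantile of `stat` over `reps` simulated Frechet samples."""
    rng = np.random.default_rng(seed)
    values = np.array([stat(rfrechet(rng, n, alpha), k) for _ in range(reps)])
    return float(np.quantile(values, level))

if __name__ == "__main__":
    print(mc_critical_value(tail_ks_statistic, n=500, k=50))
```

The power of a test at a given alternative can then be estimated in the same way, as the fraction of samples simulated from that alternative whose statistic exceeds the critical value.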
Using Table 1, we compute the powers of these six test statistics by simulating 10,000 random samples of size from distributions
Proofs
In this section we provide the proofs of Theorems 1, 4 and 6.
Proof of Theorem 1. It follows from de Haan and Resnick (1998, Propositions 2.2, 2.3, 3.1), see also de Haan and Resnick (1993, Proposition 4.1), that there exists a Wiener process such that and Let denote the Brownian bridge defined by . Then we may rewrite the two previous equations as and
Acknowledgement
We thank two reviewers for their helpful comments, which corrected some errors in the proofs.
References (46)
- Beirlant et al. A goodness-of-fit statistic for Pareto-type behaviour. J. Comput. Appl. Math. (2006)
- Danielsson and de Vries. Tail index and quantile estimation with very high frequency data. J. Empirical Finance (1997)
- Danielsson et al. Using a bootstrap method to choose the sample fraction in tail index estimation. J. Multivariate Anal. (2001)
- Drees and Kaufmann. Selecting the optimal sample fraction in univariate extreme value estimation. Stochastic Process. Appl. (1998)
- Drees et al. Approximations to the tail empirical distribution function with application to testing extreme value conditions. J. Statist. Plann. Inference (2006)
- Katz et al. Statistics of extremes in hydrology. Adv. Water Resources (2002)
- Li. Nonparametric likelihood ratio goodness-of-fit tests for survival data. J. Multivariate Anal. (2003)
- Reittu and Norros. On the power-law random graph model of massive data networks. Performance Evaluation (2004)
- Exact Bahadur efficiencies for the Kolmogorov–Smirnov and Kuiper one- and two-sample statistics. Ann. Math. Statist. (1967)
- Some test statistics based on the martingale term of the empirical distribution function. Ann. Inst. Statist. Math. (1986)
- Anderson. The Long Tail: The Revolution Changing Small Markets into Big Business (2006)
- Some Limit Theorems in Statistics
- Berk and Jones. Relatively optimal combinations of test statistics. Scand. J. Statist. (1978)
- Berk and Jones. Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Z. Wahrscheinlichkeitstheorie Verwandte Gebiete (1979)
- Large deviation theorem for Hill's estimator. Acta Math. Sinica New Ser.
- A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Statist.
- Choulakian and Stephens. Goodness-of-fit tests for the generalized Pareto distribution. Technometrics (2001)
- Models for exceedances over high thresholds. J. Roy. Statist. Soc. Ser. B: Methodological
- Extreme Value Theory: An Introduction. Springer Series in Operations Research and Financial Engineering
- de Haan and Resnick. On asymptotic normality of the Hill estimator. Comm. Statist. Stochastic Models (1998)
- de Haan and Resnick. Estimating the limit distribution of multivariate extremes. Comm. Statist. Stochastic Models (1993)
- Generalized regular variation of second order. Australian Math. Soc. J. Ser A: Pure Math. Statist.
- 1 The author is grateful to the School of Mathematics at the Georgia Institute of Technology for hospitality and financial support.
- 2 The author's research was partly supported by NSF Grant SES 0631608.