Goodness-of-fit tests for a heavy tailed distribution

https://doi.org/10.1016/j.jspi.2008.02.013

Abstract

We study the Kolmogorov–Smirnov test, Berk–Jones test, score test and their integrated versions in the context of testing the goodness-of-fit of a heavy tailed distribution function. A comparison of these tests is conducted via Bahadur efficiency and simulations.

In the simulations, the score test and the integrated score test show the best performance. Although the Berk–Jones test is more powerful than the Kolmogorov–Smirnov test, this does not hold true for their integrated versions. This differs from the results in Einmahl et al. [2003. Empirical likelihood based hypothesis testing. Bernoulli 9(2), 267–290], and shows that the Berk–Jones test behaves differently when testing tails rather than whole distributions.

Introduction

We say that a distribution has a heavy tail with tail index $\alpha$ if

$$\lim_{t\to\infty}\frac{1-F(tx)}{1-F(t)}=x^{-\alpha}\quad\text{for all } x>0 \tag{1}$$

holds for some $\alpha>0$ (notation: $1-F\in RV_{-\alpha}$). Heavy tailed distributions have been applied in many different areas such as population size, random graphs, Internet traffic, hydrology, market research and finance, see Zipf (1949), Reittu and Norros (2004), Resnick (1997b), Katz et al. (2002), Anderson (2006) and Danielsson and deVries (1997).
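As a rough numerical illustration of the defining limit, the sketch below evaluates the ratio (1-F(tx))/(1-F(t)) for increasing t. The Fréchet distribution F(x)=exp{-1/x} (tail index 1) and the exponential distribution are chosen here only as examples, and the function names are hypothetical helpers, not part of the paper.

```python
import numpy as np

def tail_ratio(survival, t, x):
    """Return (1 - F(t x)) / (1 - F(t)) for a survival function 1 - F."""
    return survival(t * x) / survival(t)

def frechet_survival(x):
    # 1 - F(x) for the Frechet distribution F(x) = exp(-1/x), tail index 1.
    # -expm1(-1/x) evaluates 1 - exp(-1/x) without cancellation for large x.
    return -np.expm1(-1.0 / x)

def exponential_survival(x):
    # 1 - F(x) = exp(-x): a light tail, for which the limit is 0 when x > 1.
    return np.exp(-x)

x = 2.0
for t in (5.0, 50.0, 500.0):
    # The Frechet ratio approaches x**(-1) = 0.5; the exponential ratio collapses to 0.
    print(t, tail_ratio(frechet_survival, t, x), tail_ratio(exponential_survival, t, x))
```

The heavy tailed case settles at a power of x, whereas the light tailed ratio vanishes, so no α>0 can satisfy the limit for the exponential distribution.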

Most of the theoretical work on heavy tailed distributions concentrates on estimating the tail index $\alpha$. One of the best known estimators is the Hill estimator

$$\hat\alpha=\Big\{k^{-1}\sum_{i=1}^{k}\big(\ln X_{n,n-k+i}-\ln X_{n,n-k}\big)\Big\}^{-1},$$

see Hill (1975), which employs a certain fraction of upper order statistics $X_{n,n-k}, X_{n,n-k+1},\ldots,X_{n,n}$. Several data-driven methods for choosing the sample fraction $k/n$ have been proposed in the literature; see Drees and Kaufmann (1998), Danielsson et al. (2001) and Guillou and Hall (2001).
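The Hill estimator is simple to compute from the sorted sample. The following is a minimal sketch, assuming positive observations near the threshold and a user-chosen k; the function name `hill_estimator` and the Pareto test sample are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hill_estimator(sample, k):
    """Hill estimate of the tail index alpha from the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))   # X_{n,1} <= ... <= X_{n,n}
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = x[n - k - 1]                        # X_{n,n-k}
    if threshold <= 0:
        raise ValueError("X_{n,n-k} must be positive to take logarithms")
    top = x[n - k:]                                 # X_{n,n-k+1}, ..., X_{n,n}
    # Inverse of the mean log-excess over the random threshold.
    return 1.0 / np.mean(np.log(top) - np.log(threshold))

rng = np.random.default_rng(0)
alpha = 2.0
# Pareto sample with 1 - F(x) = x^(-alpha) for x > 1, by inverse transform.
sample = (1.0 - rng.uniform(size=5000)) ** (-1.0 / alpha)
print(hill_estimator(sample, k=500))
```

For an exact Pareto sample any k works well; for other heavy tailed distributions the estimate depends on k, which is why data-driven choices of the sample fraction matter.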

As far as we know, goodness-of-fit testing has not received as much attention as tail index estimation. A goodness-of-fit test for a heavy tail distribution is a test of the null hypothesis H0: “F is heavy tailed” versus the alternative hypothesis Ha: “F is not heavy tailed”, that is, H0: “(1) holds” versus Ha: “(1) does not hold”. An interesting aspect of such tests is that the null hypothesis provides a description of the heavy tail distribution which is incomplete in the following two aspects:

  • (i) The tail index is unknown under the null hypothesis, and hence should be estimated. It is well-known that estimation of unknown parameters has a nonnegligible effect on the distributions of test statistics, see Durbin (1973a, 1973b).

  • (ii) The tail index only describes the tail behaviour of the distribution, and thus only gives a partial description of the underlying distribution.

By fitting a generalized Pareto distribution to exceedances over a high threshold, Davison and Smith (1990) employed Kolmogorov and Anderson–Darling statistics to test the fit, and compared these statistics by employing 5% critical values derived from testing an exponential distribution with unknown mean. However, using these critical values in this context “is suspect since the exponential distribution is only a submodel of the generalized Pareto distribution, so the true critical points are smaller”, see Davison and Smith (1990, p. 414). Other studies on testing a generalized Pareto distribution include Marohn (2002) and Choulakian and Stephens (2001), where critical values for Cramer–von Mises and Anderson–Darling statistics are given.

Recently, Beirlant et al. (2006) studied tests for Pareto-type data and Drees et al. (2006) employed the Cramer–von Mises statistic to test the null hypothesis that a distribution is in the domain of attraction with extreme value index larger than $-\tfrac{1}{2}$. This may be thought of as a generalization of Choulakian and Stephens (2001), since a deterministic threshold in Choulakian and Stephens (2001) is replaced by a random threshold in Drees et al. (2006).

In contrast to Beirlant et al. (2006), we compare in this paper three tests, the Kolmogorov–Smirnov test, the Berk–Jones test and the “estimated score” test in terms of Bahadur efficiency. The Kolmogorov–Smirnov test has a long tradition in statistics (see Kolmogorov, 1933, Smirnov, 1948, Durbin, 1973a). The Berk–Jones test may be viewed as a nonparametric likelihood test, and was derived for the situation where the null hypothesis completely specifies the distribution of the observations (see Berk and Jones, 1978, Berk and Jones, 1979). In this particular situation, the Berk–Jones test was shown to be more efficient, in the sense of Bahadur efficiency, than any weighted Kolmogorov test at any alternative. Like Li (2003), one may argue that the Berk–Jones test should also perform better than the Kolmogorov–Smirnov test in situations where the null hypothesis does not completely specify the distribution of the observations. In contrast to the Berk–Jones test, the “estimated score” test in Hjort and Koning (2002) was specifically proposed for the situation where parameters are unknown.

The paper is organized as follows. In Section 2 we present the Kolmogorov–Smirnov test, the Berk–Jones test, the score test and their integrated versions. Large deviation results for tail empirical processes are obtained as a byproduct of studying Bahadur efficiency. A simulation study and real applications are given in Section 3. All proofs are gathered in Section 4.


The KS, BJ and SC supremum tests

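Suppose that $X_1,\ldots,X_n$ are i.i.d. observations with distribution function $F$, and define

$$G_k(r)=k^{-1}\sum_{i=1}^{k}I\!\left(\frac{X_{n,n-k+i}}{X_{n,n-k}}\le r\right),$$

where $1\le k\le n$, and $X_{n,1}\le X_{n,2}\le\cdots\le X_{n,n}$ are the order statistics of $X_1,\ldots,X_n$. One may think of $G_k(r)$ as the empirical distribution function corresponding to the ordered sample $X_{n,n-k+1}/X_{n,n-k}, X_{n,n-k+2}/X_{n,n-k},\ldots,X_{n,n}/X_{n,n-k}$. Observe that

$$1-G_k(r)=\frac{1}{k}\sum_{i=1}^{n}I(X_i>rX_{n,n-k})\quad\text{for } r>1.$$

Throughout we shall assume that $k=k(n)\to\infty$ and $k/n\to0$ as $n\to\infty$.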
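The tail empirical distribution function and the identity for 1-G_k(r) can be checked numerically. The sketch below assumes a positive threshold X_{n,n-k} and no ties at the threshold; the function names are hypothetical.

```python
import numpy as np

def tail_empirical_cdf(sample, k, r):
    """G_k(r): fraction of the ratios X_{n,n-k+i}/X_{n,n-k}, i=1..k, that are <= r."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    ratios = x[n - k:] / x[n - k - 1]          # already in increasing order
    r = np.atleast_1d(np.asarray(r, dtype=float))
    # side="right" counts the ratios that are <= r, for each r.
    return np.searchsorted(ratios, r, side="right") / k

def tail_survival_direct(sample, k, r):
    """(1/k) * sum over all i of I(X_i > r X_{n,n-k}), written via X_i / X_{n,n-k} > r."""
    x = np.asarray(sample, dtype=float)
    threshold = np.sort(x)[x.size - k - 1]
    r = np.atleast_1d(np.asarray(r, dtype=float))
    return (x[None, :] / threshold > r[:, None]).sum(axis=1) / k

rng = np.random.default_rng(1)
sample = (1.0 - rng.uniform(size=2000)) ** (-1.0)   # Pareto with tail index 1
grid = np.linspace(1.01, 20.0, 50)
lhs = 1.0 - tail_empirical_cdf(sample, 100, grid)
rhs = tail_survival_direct(sample, 100, grid)
print(np.allclose(lhs, rhs))
```

The identity holds because, for r>1, only the k largest observations can exceed r times the threshold, so summing over all n observations adds nothing.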

Our methods are motivated by the following fact. Let U(x)

Simulation study

First we simulate 100,000 random samples from the Fréchet distribution $F(x)=\exp\{-x^{-1}\}$ with sample size $n=1000$, and then compute the test statistics $KS=\sup_{r>1}|\sqrt{k}\,KS(r;\hat\alpha)|$, $BJ=\sup_{r>1}k\,BJ(r;\hat\alpha)$, $SC=\sup_{r>1}|\sqrt{k}\,SC(r;\hat\alpha)|$, KSI, BJI and SCI for $k=20,30,\ldots,200$. Based on these computed test statistics, we obtain the 0.95 level critical values; see Table 1.
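The Monte Carlo step can be sketched as follows. The definition of the process KS(r; α̂) is not reproduced in this excerpt, so the code assumes the natural form (1-G_k(r)) - r^(-α̂), the difference between the tail empirical survival function and the fitted Pareto tail; this is an assumption, as are all function names and the reduced number of replications.

```python
import numpy as np

def hill(sorted_x, k):
    n = sorted_x.size
    return 1.0 / np.mean(np.log(sorted_x[n - k:] / sorted_x[n - k - 1]))

def ks_tail_statistic(sample, k):
    """sup_{r>1} |sqrt(k) * {(1 - G_k(r)) - r^(-alpha_hat)}| (assumed form of KS)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    alpha_hat = hill(x, k)
    # Fitted tail r^(-alpha_hat) evaluated at the jump points of G_k.
    fitted = (x[n - k:] / x[n - k - 1]) ** (-alpha_hat)
    i = np.arange(1, k + 1)
    # 1 - G_k is a step function and the fitted tail is monotone, so the
    # supremum is attained just before or at one of the k jumps.
    before = np.abs(1.0 - (i - 1) / k - fitted)
    at = np.abs(1.0 - i / k - fitted)
    return np.sqrt(k) * max(before.max(), at.max())

def simulate_critical_value(n, k, reps, level=0.95, seed=0):
    """Empirical `level` quantile of the statistic under Frechet sampling."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for j in range(reps):
        # If E is standard exponential, 1/E has distribution exp(-1/x).
        frechet = 1.0 / rng.standard_exponential(size=n)
        stats[j] = ks_tail_statistic(frechet, k)
    return np.quantile(stats, level)

# Far fewer replications than the 100,000 used in the paper, for speed only.
print(simulate_critical_value(n=1000, k=100, reps=500))
```

Because α is re-estimated in every replication, the simulated critical values account for the effect of parameter estimation on the null distribution, which tabulated Kolmogorov–Smirnov values would not.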

Using Table 1, we compute the powers of these six test statistics by simulating 10,000 random samples of size n=1000 from distributions 1-F(x)=1+αlnx1/δ-1/δ(x>1)

Proofs

In this section we provide the proofs of Theorems 1, 4 and 6.

Proof of Theorem 1

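It follows from de Haan and Resnick (1998, Propositions 2.2, 2.3, 3.1), see also de Haan and Resnick (1993, Proposition 4.1), that there exists a Wiener process $W(v)$ such that

$$\sup_{r>1}\Big|\sqrt{k}\{(1-G_k(r))-r^{-\alpha}\}-\{W(r^{-\alpha})-r^{-\alpha}W(1)\}\Big|\xrightarrow{p}0$$

and

$$\sqrt{k}\{\hat\alpha^{-1}-\alpha^{-1}\}\xrightarrow{d}\int_1^\infty s^{-1}W(s^{-\alpha})\,ds-\alpha^{-1}W(1).$$

Let $B(v)$ denote the Brownian bridge defined by $B(v)=W(v)-vW(1)$. Then we may rewrite the two previous equations as

$$\sup_{r>1}\Big|\sqrt{k}\{(1-G_k(r))-r^{-\alpha}\}-B(r^{-\alpha})\Big|\xrightarrow{p}0$$

and $\sqrt{k}\{\hat\alpha-\alpha\}\xrightarrow{d}-\alpha^{2}\int_1^\infty B$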

Acknowledgement

We thank two reviewers for their helpful comments, which corrected some errors in the proofs.

References (46)

  • C. Anderson. The Long Tail: The Revolution Changing Small Markets into Big Business (2006)
  • R.R. Bahadur. Some Limit Theorems in Statistics (1971)
  • R.H. Berk et al. Relatively optimal combinations of test statistics. Scand. J. Statist. (1978)
  • R.H. Berk et al. Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Z. Wahrscheinlichkeitstheorie Verwandte Gebiete (1979)
  • Bernstein, S., 1924. Sur une modification de l’inégalité de Tchebichef. Annals Science Institute Sav. Ukraine Sect....
  • S.H. Cheng. Large deviation theorem for Hill's estimator. Acta Math. Sinica New Ser. (1992)
  • H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Statist. (1952)
  • V. Choulakian et al. Goodness-of-fit tests for the generalized Pareto distribution. Technometrics (2001)
  • A.C. Davison et al. Models for exceedances over high thresholds. J. Roy. Statist. Soc. Ser. B: Methodological (1990)
  • L. de Haan et al. Extreme Value Theory: An Introduction. Springer Series in Operations Research and Financial Engineering (2006)
  • L. de Haan et al. On asymptotic normality of the Hill estimator. Comm. Statist. Stochastic Models (1998)
  • L. de Haan et al. Estimating the limit distribution of multivariate extremes. Comm. Statist. Stochastic Models (1993)
  • L. de Haan et al. Generalized regular variation of second order. Australian Math. Soc. J. Ser A: Pure Math. Statist. (1996)
    1

    The author is grateful to the School of Mathematics at the Georgia Institute of Technology for hospitality and financial support.

    2

    The author's research was partly supported by NSF Grant SES 0631608.
