Department of Mathematics
Çukurova University
Key Words: Univariate GT Distribution; Box and Tiao (BT) Distribution; Generalized Gamma (GG) Distribution; Maximum Likelihood Estimation; Robustness; EM Algorithm
In this paper, we consider the univariate generalized t (GT) distribution introduced by McDonald and Newey [1]. We derive the maximum likelihood estimators, which can be considered as alternative redescending M-estimators for the location and scale parameters of a univariate data set. We give an iterative reweighting (IR) algorithm to compute the location and scale estimates and show that this algorithm can be identified as an EM algorithm. We give some examples to illustrate the performance of the location and scale estimators based on the GT distribution.
McDonald and Newey [1] introduced the univariate GT distribution as an alternative to the normal and t distributions for modeling errors in regression. They used the GT distribution to develop a robust, partially adaptive estimation procedure. This procedure includes least squares, LAD (Least Absolute Deviation), $L_p$ and several other estimators as special cases.
Statistical distributions of returns on
financial instruments such as stocks, bonds or options play a central role in
much of the financial literature. The GT
family applies to symmetric, fat-tailed distributions and therefore adjusts for
the leptokurtosis of the nonnormal security returns. McDonald and Nelson [2]
and Butler, McDonald, Nelson and White [3] applied the GT-based partially adaptive estimation procedure for the robust
estimation of the market model. Partially adaptive estimates of ARMA time
series models based on the GT
distribution were given in McDonald [4], and applications of the GT to U.S. stock index returns were
presented in Bollerslev, Engle and Nelson [5, pp. 3017-3027].
Despite its usefulness in statistical economics, the GT distribution has not received much interest in the statistical literature. Although it has been widely used as a robust alternative to the normal distribution for modeling, its theoretical properties have not been studied much. Recently, Arslan and Genç [6] investigated the existence and uniqueness of the solutions to the maximum likelihood estimating equations of the GT distribution.
One of the main objectives of this paper is to use the GT distribution to find estimates for the location and scale parameters of a univariate data set. We model the data set with the GT distribution with known shape parameters and unknown location and scale parameters. The maximum likelihood estimators for the location and scale parameters of the GT distribution then provide alternative robust estimators for the location and scale of a univariate data set. The maximum likelihood estimating equations can be viewed as a set of redescending M-estimating equations.
Like most robust estimates, location and scale estimates based on the GT distribution cannot be computed explicitly. Numerical methods have to be used to find the estimates. Besides some fast-converging algorithms, an IR algorithm can be derived from the estimating equations. Furthermore, this IR algorithm can be easily identified as an EM algorithm.
The paper is organized as follows. Section 2 introduces the GT distribution. Section 3 derives the maximum likelihood estimating equations. In Section 4, an IR algorithm to compute the maximum likelihood estimates and its relation to the EM algorithm are given. In Section 5, examples based on four different data sets are given to compare the performance of the GT M-estimators with that of other robust location and scale estimators. The paper ends with a conclusions section.
2. THE GENERALIZED T (GT) DISTRIBUTION
Arslan and Genç [6] show that the GT distribution is a scale mixture of a BT and a GG distribution. They obtained the GT distribution as the ratio of two independent random variables. Their result can be summarized as follows.
Let the random variable U have a BT(u; p) distribution with the shape parameter $p$ and let the random variable T have a GG(t; p/2, 1, q) distribution with the shape parameters $p/2$, 1 and $q$. Assume that U and T are independent. Then the random variable
X=
(1)
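has a GT distribution with the density function
$$f(x; \mu, \sigma, p, q) = \frac{p}{2 \sigma q^{1/p} B(1/p, q)} \left( 1 + \frac{|x - \mu|^p}{q \sigma^p} \right)^{-(q + 1/p)}, \qquad (2)$$
where B(.) denotes the beta function, $\sigma > 0$ is a scale parameter, $\mu \in \mathbb{R}$ is a location parameter and $p > 0$, $q > 0$ are shape parameters.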
The details of this result can be found in [6].
As we vary the parameters p and q we obtain densities with very different tail behavior. Larger values of $p$ and $q$ are associated with thinner tails of the density. Similarly, smaller values of $p$ and $q$ correspond to thicker tails. It can be easily shown that the density function (2) is symmetric and unimodal.
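The effect of the shape parameters on the tails can be checked numerically. The sketch below is a minimal illustration and assumes the McDonald-Newey form of the GT density, $f(x) = p \,/\, [2 \sigma q^{1/p} B(1/p, q) (1 + |x-\mu|^p / (q \sigma^p))^{q+1/p}]$. The function names `gt_pdf` and `student_t_pdf` are hypothetical helpers, not part of any library.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def gt_pdf(x, mu=0.0, sigma=1.0, p=2.0, q=1.0):
    """GT density in the (assumed) McDonald-Newey parametrization."""
    z = abs(x - mu) ** p / (q * sigma ** p)
    log_const = (math.log(p) - math.log(2.0 * sigma)
                 - math.log(q) / p - log_beta(1.0 / p, q))
    return math.exp(log_const - (q + 1.0 / p) * math.log1p(z))


def student_t_pdf(x, nu, loc=0.0, scale=1.0):
    """Location-scale Student t density with nu degrees of freedom."""
    z = (x - loc) / scale
    log_const = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                 - 0.5 * math.log(nu * math.pi) - math.log(scale))
    return math.exp(log_const - (nu + 1.0) / 2.0 * math.log1p(z * z / nu))


# Smaller q gives a thicker tail: compare the density far from the center.
tail_thick = gt_pdf(5.0, p=2.0, q=0.5)
tail_thin = gt_pdf(5.0, p=2.0, q=5.0)

# With p = 2 the GT density coincides with a t density with 2q degrees
# of freedom and scale sigma / sqrt(2).
gt_value = gt_pdf(0.3, mu=0.0, sigma=1.0, p=2.0, q=1.5)
t_value = student_t_pdf(0.3, nu=3.0, loc=0.0, scale=1.0 / math.sqrt(2.0))
```

Working on the log scale with `math.lgamma` avoids overflow in the beta function for large shape parameters.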
The GT family includes several important distributions as special or limiting cases [1]. For example, for the case $p = 2$ we get the usual t distribution with $2q$ degrees of freedom. In this case the location and scale parameters are $\mu$ and $\sigma/\sqrt{2}$, respectively. The density function of the GT distribution approaches the BT and uniform densities as $q \to \infty$ and $p \to \infty$, respectively.
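The moments of the GT distribution about the origin can be obtained easily. Since the density of the GT distribution is symmetric, the odd ordered moments are zero when $\mu = 0$. The even ordered moments are
$$E(X^{2k}) = \sigma^{2k} q^{2k/p} \, \frac{B\left(\frac{2k+1}{p},\, q - \frac{2k}{p}\right)}{B\left(\frac{1}{p},\, q\right)}, \qquad (3)$$
if $pq > 2k$. For the standard GT density the expected value is E(X)=0 and the variance is
$$\mathrm{Var}(X) = q^{2/p} \, \frac{B\left(\frac{3}{p},\, q - \frac{2}{p}\right)}{B\left(\frac{1}{p},\, q\right)}, \qquad (4)$$
if $pq > 2$ [1].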
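As a small worked check of the moment condition, the sketch below evaluates the absolute moment $E|X-\mu|^h = \sigma^h q^{h/p} B((h+1)/p,\, q - h/p) / B(1/p, q)$, which is the standard expression for the GT density in the McDonald-Newey parametrization (assumed here) and is finite only when $pq > h$. The helper name `gt_abs_moment` is hypothetical.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b), computed via log-gamma for stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def gt_abs_moment(h, sigma=1.0, p=2.0, q=1.0):
    """E|X - mu|^h for the GT distribution; requires p * q > h."""
    if p * q <= h:
        raise ValueError("the moment of order h exists only when p * q > h")
    return (sigma ** h * q ** (h / p)
            * beta_fn((h + 1.0) / p, q - h / p) / beta_fn(1.0 / p, q))


# For p = 2 the GT is a t distribution with 2q degrees of freedom and scale
# sigma / sqrt(2), so the variance should equal sigma^2 * q / (2 * (q - 1)).
variance_gt = gt_abs_moment(2, sigma=1.0, p=2.0, q=3.0)
variance_t = 3.0 / (2.0 * (3.0 - 1.0))
```

The explicit `ValueError` makes the existence condition visible: for heavy-tailed choices such as p = 2, q = 1 the variance does not exist.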
3. MAXIMUM LIKELIHOOD ESTIMATION
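Let $x_1, \dots, x_n$ be a data set in $\mathbb{R}$. Suppose we model $x_1, \dots, x_n$ with the GT distribution with known shape parameters and unknown location and scale parameters. We would like to find estimates for the location and scale parameters $\mu$ and $\sigma$ of the GT distribution. The likelihood function up to a scalar constant is
$$L(\mu, \sigma) = \sigma^{-n} \prod_{i=1}^{n} \left( 1 + \frac{|x_i - \mu|^p}{q \sigma^p} \right)^{-(q + 1/p)}, \qquad (5)$$
and the corresponding log-likelihood function is
$$\ell(\mu, \sigma) = -n \log \sigma - \left( q + \frac{1}{p} \right) \sum_{i=1}^{n} \log \left( 1 + \frac{|x_i - \mu|^p}{q \sigma^p} \right). \qquad (6)$$
Taking derivatives of (6) with respect to $\mu$ and $\sigma$, and setting them to zero, gives the following estimating equations:
$$\sum_{i=1}^{n} \frac{|x_i - \mu|^{p-1} \, \mathrm{sign}(x_i - \mu)}{q \sigma^p + |x_i - \mu|^p} = 0, \qquad (7)$$
$$\frac{pq + 1}{n} \sum_{i=1}^{n} \frac{|x_i - \mu|^p}{q \sigma^p + |x_i - \mu|^p} = 1. \qquad (8)$$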
Note that
exists for
. (When p<1, it
can be defined only at the points
.)
The equations (7) and (8)
can be rewritten as
, (9)
, (10)
where
. (11)
Here the weight function
, where
, is a nonnegative, decreasing function of s so that the outlying observations will
receive very small weights and the effect of outlying observations on the
estimators will be reduced. It can be easily seen that
. The equations given in (9) and (10) can be considered as
redescending M-estimators for $\mu$ and $\sigma$.
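To make the passage from the likelihood equations to weighted equations concrete, the following worked derivation assumes the GT log-likelihood $\ell(\mu, \sigma) = -n \log \sigma - (q + 1/p) \sum_i \log(1 + |x_i - \mu|^p / (q \sigma^p))$ and one possible choice of weight function. It is a sketch only: the weight function in (11) may be normalized differently, for example in terms of $|x_i - \mu|^p / \sigma^p$.

```latex
\begin{align*}
s_i &= \frac{(x_i - \mu)^2}{\sigma^2}, \qquad
w(s) = \frac{(pq + 1)\, s^{(p-2)/2}}{q + s^{p/2}}, \\[4pt]
\frac{\partial \ell}{\partial \mu} = 0
&\iff \sum_{i=1}^{n} w(s_i)\,(x_i - \mu) = 0
\iff \mu = \frac{\sum_{i=1}^{n} w(s_i)\, x_i}{\sum_{i=1}^{n} w(s_i)}, \\[4pt]
\frac{\partial \ell}{\partial \sigma} = 0
&\iff \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} w(s_i)\,(x_i - \mu)^2 .
\end{align*}
```

Under this assumed form, $p = 2$ gives $w(s) = (2q+1)/(q+s)$, the familiar weight of the t distribution, and $w(s)$ behaves like $(pq+1)/s$ for large $s$, so that distant observations are strongly downweighted.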
Let
, where m is the
number of coincident observations. Then
and
are the sufficient
conditions for the likelihood function for the location and the scale
parameters of the
distribution to have a
unique maximum in the parameter space $\mathbb{R} \times \mathbb{R}^{+}$ (see [6]).
If these conditions do not hold, the uniqueness of the maximum cannot be guaranteed, and the likelihood function may have other critical points: local maxima or saddle points. However, it never has a local minimum (see [6]).
In the location-only case (scale parameter being fixed), the likelihood function always has a local maximum in the parameter space $\mathbb{R}$. In particular, if the parameter q is chosen small enough then it will have n local maxima, each close to one sample point, and n-1
local minima. The likelihood function will have cusps at each observation when
(see [6]).
In the scale-only case (location parameter
being fixed) the necessary and sufficient condition for the likelihood function
to have a unique maximum in the parameter space
is
, where
(see [6]).
4. COMPUTATION OF THE ESTIMATES
4.1. The IR Algorithm
The estimating equations (9) and (10) give no closed-form expressions for $\mu$ and $\sigma$, since the weights $w_i$ depend on the parameters to be estimated. Thus a numerical method is needed to compute the estimates. Newton-Raphson type algorithms could be used to solve the estimating equations (9) and (10). However, one simple and natural choice is the following IR algorithm suggested by the updating equations (9) and (10).
, (12)
, (13)
where
, and
is the iteration number.
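An iteration of this kind can be sketched in a few lines of code. The version below is a minimal illustration under stated assumptions, not necessarily the exact updates (12) and (13): it assumes the weights $w(s) = (pq+1) s^{(p-2)/2} / (q + s^{p/2})$ with $s_i = (x_i - \mu)^2 / \sigma^2$, updates the location as a weighted mean and the squared scale as a weighted mean of squared residuals, and starts from the median and a median absolute deviation. The function names, the starting values, the small guard `eps` and the data are all hypothetical choices.

```python
import math


def gt_weights(x, mu, sigma, p, q, eps=1e-12):
    """Weights w(s_i), s_i = ((x_i - mu) / sigma)^2 (assumed form)."""
    weights = []
    for xi in x:
        # Guard: for p < 2 the weight is unbounded at s = 0.
        s = max(((xi - mu) / sigma) ** 2, eps)
        weights.append((p * q + 1.0) * s ** (p / 2.0 - 1.0) / (q + s ** (p / 2.0)))
    return weights


def gt_ir(x, p, q, tol=1e-10, max_iter=5000):
    """Iteratively reweighted location and scale estimates (sketch)."""
    n = len(x)
    xs = sorted(x)
    # Robust starting values: median and median absolute deviation.
    mu = xs[n // 2] if n % 2 else 0.5 * (xs[n // 2 - 1] + xs[n // 2])
    sigma = sorted(abs(xi - mu) for xi in x)[n // 2] or 1.0
    for _ in range(max_iter):
        w = gt_weights(x, mu, sigma, p, q)
        mu_new = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
        sigma_new = math.sqrt(
            sum(wi * (xi - mu_new) ** 2 for wi, xi in zip(w, x)) / n)
        converged = abs(mu_new - mu) + abs(sigma_new - sigma) < tol
        mu, sigma = mu_new, sigma_new
        if converged:
            break
    return mu, sigma


# Hypothetical data: eight well-behaved points and one gross outlier.
x = [-1.2, -0.5, -0.1, 0.0, 0.3, 0.4, 0.9, 1.1, 10.0]
mu_hat, sigma_hat = gt_ir(x, p=2.0, q=1.0)
mu_hat_16, sigma_hat_16 = gt_ir(x, p=1.6, q=1.0)
```

With p = 2 this sketch coincides with the usual reweighting scheme for the t distribution with 2q degrees of freedom. Because redescending estimating equations can have several solutions, the result may depend on the starting values.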
4.2. Relation Between the IR and the EM Algorithms
The EM algorithm (Dempster, Laird and Rubin [7]) is an iterative computational method for finding a maximum likelihood estimate when the data can be conveniently viewed as incomplete. Suppose we observe X directly but regard T in the representation (1) as missing. Thus the data set $x_1, \dots, x_n$ can be viewed as incomplete with the missing information $t_1, \dots, t_n$, so that $(x_i, t_i)$, $i = 1, \dots, n$, will be regarded as the complete data. The joint density of the random variables X and T is then the density of the complete data. That is,
,
(14)
where
.
The complete data
log-likelihood function (ignoring the constant) is

. (15)
In the E-step of the EM algorithm we take the conditional expectation of (15) given x and the current estimates of $\mu$ and $\sigma$. This conditional expectation is
|
. (16)
In the M-step of the algorithm, we maximize the conditional expectation (16) with respect to $\mu$ and $\sigma$. Setting the corresponding derivatives to zero yields the equations given in (12) and (13). Therefore the IR algorithm derived from the estimating equations (9) and (10) is an EM algorithm.
5. EXAMPLES
To assess the performance of the estimators from the GT distribution, we have used two artificial samples and two real data sets. We have modeled each data set by T and GT distributions for comparison. Table 1 and Table 2 show the estimates of the location and scale parameters for the data sets obtained from the T and GT distributions with known shape parameters. The estimates for the GT distribution have been calculated using the IR algorithm given above. For the T distribution a similar algorithm, labeled Algorithm I, is given in [8]. Note that the algorithm given in this paper reduces to the one given in [8] when we set p=2. The convergence behavior of this algorithm is under consideration. In our limited experience, the convergence rate of the algorithm is very similar to that of the algorithm given in [8].
Sample 1 consists of 20 normal N(0,1) and 5 normal N(10,1) random numbers. Sample 2 consists of 20 normal N(0,1) and 10 normal N(10,1) random numbers. The real data sets are from Cushny & Peebles (1905) and Rosner (1977), which are often used in the literature to illustrate various robust estimators of location. The Cushny & Peebles data consist of the differences of excess hours of 10 patients’ sleep under the influence of two different drugs (Fig. 1). The Rosner data consist of 10 monthly diastolic blood pressure measurements (Fig. 2).
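The construction of the two artificial samples can be sketched as follows. This is an illustration only: the seeds are arbitrary, so the numbers produced will not reproduce the samples used in the tables, and the helper names are hypothetical. The moment estimates (mean and variance with divisor n) show how strongly the N(10,1) observations pull the non-robust estimates.

```python
import random


def contaminated_sample(n_clean, n_outliers, shift=10.0, seed=0):
    """n_clean draws from N(0,1) followed by n_outliers draws from N(shift,1)."""
    rng = random.Random(seed)
    clean = [rng.gauss(0.0, 1.0) for _ in range(n_clean)]
    outliers = [rng.gauss(shift, 1.0) for _ in range(n_outliers)]
    return clean + outliers


def mean_and_variance(x):
    """Sample mean and variance with divisor n (normal-theory MLEs)."""
    n = len(x)
    m = sum(x) / n
    v = sum((xi - m) ** 2 for xi in x) / n
    return m, v


sample1 = contaminated_sample(20, 5, seed=1)   # 20 N(0,1) + 5 N(10,1)
sample2 = contaminated_sample(20, 10, seed=2)  # 20 N(0,1) + 10 N(10,1)
mean1, var1 = mean_and_variance(sample1)
mean2, var2 = mean_and_variance(sample2)
```

For samples built this way the mean is drawn toward the contaminating component (about 2 for the first design and about 3.3 for the second, on average), while the bulk of the data is centered at 0.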
Table 1. Location and scale estimates of two samples generated from the normal distribution (mve = minimum volume ellipsoid estimate for the univariate data).

| Distribution | Sample 1 location | Sample 1 scale | Sample 2 location | Sample 2 scale |
|---|---|---|---|---|
| Normal | 1.8074 | 15.3355 | 3.2414 | 24.0842 |
| Cauchy | .0603 | .9658 | .0560 | 1.7597 |
| T( | .1118 | 1.9399 | .7609 | 8.4400 |
| T( | .1045 | .5076 | -.0385 | 0.6938 |
| GT(p=1.6,q=1) | .0943 | 2.7244 | .2130 | 7.5065 |
| GT(p=1.6,q=.3) | .1234 | .9466 | -.0508 | 1.3253 |
| GT(p=2.6,q=.4) | .0302 | 1.9747 | .0896 | 3.6265 |
| GT(p=2.8,q=.3) | .0235 | 1.6004 | .0330 | 2.5647 |
| GT(p=2.8,q=.2) | .0530 | 1.0989 | -.0410 | 1.4602 |
| GT(p=2.9,q=.1) | .1482 | .5441 | -.0403 | .6521 |
| mve | -.0848 | 1.1542 | -.1597 | .9720 |
Table 2. Location and scale estimates of two real data sets (mve = minimum volume ellipsoid estimate for the univariate data).

| Distribution | Cushny & Peebles location | Cushny & Peebles scale | Rosner location | Rosner scale |
|---|---|---|---|---|
| Normal | 1.5800 | 1.3616 | 82.2000 | 232.3600 |