Variance analysis of the advantages and disadvantages of the services that active financial institutions provide their customers: solution by the two-factor hierarchical (nested) classification model, and an application
Doç. Dr. Adnan MAZMANOĞLU
Department of Mathematics
Marmara University.
Göztepe Campus/İstanbul
E-Mail:amazman@marun.edu.tr
Key Words: Linear Models, Two-Way Nested Classification
V.A.: Variance Analysis
The data table below was formed from face-to-face interviews with customers chosen at random from different geographical regions and of different status. Its purpose is to test the advantage that customers obtain from the consumer credit, credit card and commercial credit services (the levels of factor $\beta$, nested in factor $\alpha$), which are among the economic activities of the public and special (private) banks (factor $\alpha$) and which are assumed by our model.
Table 1: Observation Table

| Banks | Service type | Points (observation values) | Sum | Number | Average |
|---|---|---|---|---|---|
| Public | Credit card (1) | 5, 3 | 8 | 2 | 4 |
| | Consumer credit (2) | 9, 3 | 12 | 2 | 6 |
| | Commercial credit (3) | 8, 8 | 16 | 2 | 8 |
| | Sum | | 36 | 6 | 6 |
| Special Banks | Credit card (1) | 9, 9, 6 | 24 | 3 | 8 |
| | Consumer credit (2) | 9 | 9 | 1 | 9 |
| | Commercial credit (3) | 3, 8, 10 | 21 | 3 | 7 |
| | Sum | | 54 | 7 | 8 (rounded from 54/7 ≈ 7.71) |
| General Sum | | | 90 | 13 | 7 (rounded from 90/13 ≈ 6.92) |
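The sums, numbers and averages of the observation table can be recomputed with a few lines of code. The following is only a minimal sketch: the dictionary layout and the names `data`, `summarize`, `cell_summary`, `bank_summary` and `general_summary` are illustrative assumptions, not part of the original study.

```python
# Observation values from Table 1, keyed by (bank type i, service type j).
# i: 1 = public, 2 = special; j: 1 = credit card, 2 = consumer, 3 = commercial.
data = {
    (1, 1): [5, 3],
    (1, 2): [9, 3],
    (1, 3): [8, 8],
    (2, 1): [9, 9, 6],
    (2, 2): [9],
    (2, 3): [3, 8, 10],
}


def summarize(values):
    """Return (sum, number, average) for one list of points."""
    return sum(values), len(values), sum(values) / len(values)


# One summary per cell (bank type, service type).
cell_summary = {cell: summarize(points) for cell, points in data.items()}

# One summary per bank type, pooling its three service types.
bank_summary = {
    i: summarize([x for (bank, _), points in data.items() if bank == i for x in points])
    for i in (1, 2)
}

# Summary over all thirteen observations.
general_summary = summarize([x for points in data.values() for x in points])

for cell, (total, number, average) in cell_summary.items():
    print(cell, total, number, average)
print(bank_summary)
print(general_summary)
```

The unrounded averages for the special banks (54/7) and for the whole sample (90/13) are slightly below the rounded values 8 and 7 printed in the table.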
In this table, the credit card, consumer credit and commercial credit service types are shown as being among the busiest activities of the public and special banks. The wish to find out how satisfied the customers are with these services has directed us to the two-factor hierarchical (nested) variance analysis model, in which the different levels of two hierarchically arranged factors are tested.
Model
Looking at the data table, we can see that the most suitable model is

$$y_{ijk} = \mu + \alpha_i + \beta_{ij} + e_{ijk} \qquad (1)$$

Let us explain these parameters:
$y_{ijk}$: the $k$-th observation value of the $i$-th type of bank at the $j$-th service level
$\mu$: the general mean
$\alpha_i$: the effect of the $i$-th type of bank
$\beta_{ij}$: the effect of the $j$-th service type within the $i$-th type of bank
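The model consists of two factors, $\alpha$ and $\beta$, one nested within the other. The $\alpha$ factor has two levels, named public and special banks, and the $\beta$ factor has six levels, three of them within the first level of $\alpha$ and three of them within the second level of $\alpha$:

$p = 2$; $i = 1, \dots, p$
$q_1 = 3$, $q_2 = 3$; $j = 1, \dots, q_i$

Assuming that there are $n_{ij}$ observation values for service type $j$ of bank type $i$, we have $k = 1, \dots, n_{ij}$, and $e_{ijk}$ is the error term. The marginal numbers of observations are

$$n_{i.} = \sum_{j=1}^{q_i} n_{ij} \quad \text{and} \quad n_{..} = \sum_{i=1}^{p} n_{i.}$$

For example, using the data table,

$$n_{1.} = \sum_{j=1}^{3} n_{1j} = n_{11} + n_{12} + n_{13} = 2 + 2 + 2 = 6$$

$$n_{2.} = \sum_{j=1}^{3} n_{2j} = n_{21} + n_{22} + n_{23} = 3 + 1 + 3 = 7$$

$$n_{..} = \sum_{i=1}^{2} n_{i.} = 6 + 7 = 13$$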
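Let us write the equations of these thirteen observations using model (1).

$$
\begin{aligned}
y_{111} &= 5 = \mu + \alpha_1 + \beta_{11} + e_{111} \\
y_{112} &= 3 = \mu + \alpha_1 + \beta_{11} + e_{112} \\
y_{121} &= 9 = \mu + \alpha_1 + \beta_{12} + e_{121} \\
y_{122} &= 3 = \mu + \alpha_1 + \beta_{12} + e_{122} \\
y_{131} &= 8 = \mu + \alpha_1 + \beta_{13} + e_{131} \\
y_{132} &= 8 = \mu + \alpha_1 + \beta_{13} + e_{132} \\
y_{211} &= 9 = \mu + \alpha_2 + \beta_{21} + e_{211} \\
y_{212} &= 9 = \mu + \alpha_2 + \beta_{21} + e_{212} \\
y_{213} &= 6 = \mu + \alpha_2 + \beta_{21} + e_{213} \\
y_{221} &= 9 = \mu + \alpha_2 + \beta_{22} + e_{221} \\
y_{231} &= 3 = \mu + \alpha_2 + \beta_{23} + e_{231} \\
y_{232} &= 8 = \mu + \alpha_2 + \beta_{23} + e_{232} \\
y_{233} &= 10 = \mu + \alpha_2 + \beta_{23} + e_{233}
\end{aligned}
$$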
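If we write these equations again with the (1, 0) indicator values of all nine parameters, we obtain

$$
\begin{aligned}
y_{111} &= 5 = \mu(1) + \alpha_1(1) + \alpha_2(0) + \beta_{11}(1) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(0) + e_{111} \\
y_{112} &= 3 = \mu(1) + \alpha_1(1) + \alpha_2(0) + \beta_{11}(1) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(0) + e_{112} \\
y_{121} &= 9 = \mu(1) + \alpha_1(1) + \alpha_2(0) + \beta_{11}(0) + \beta_{12}(1) + \beta_{13}(0) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(0) + e_{121} \\
y_{122} &= 3 = \mu(1) + \alpha_1(1) + \alpha_2(0) + \beta_{11}(0) + \beta_{12}(1) + \beta_{13}(0) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(0) + e_{122} \\
y_{131} &= 8 = \mu(1) + \alpha_1(1) + \alpha_2(0) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(1) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(0) + e_{131} \\
y_{132} &= 8 = \mu(1) + \alpha_1(1) + \alpha_2(0) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(1) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(0) + e_{132} \\
y_{211} &= 9 = \mu(1) + \alpha_1(0) + \alpha_2(1) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(1) + \beta_{22}(0) + \beta_{23}(0) + e_{211} \\
y_{212} &= 9 = \mu(1) + \alpha_1(0) + \alpha_2(1) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(1) + \beta_{22}(0) + \beta_{23}(0) + e_{212} \\
y_{213} &= 6 = \mu(1) + \alpha_1(0) + \alpha_2(1) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(1) + \beta_{22}(0) + \beta_{23}(0) + e_{213} \\
y_{221} &= 9 = \mu(1) + \alpha_1(0) + \alpha_2(1) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(0) + \beta_{22}(1) + \beta_{23}(0) + e_{221} \\
y_{231} &= 3 = \mu(1) + \alpha_1(0) + \alpha_2(1) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(1) + e_{231} \\
y_{232} &= 8 = \mu(1) + \alpha_1(0) + \alpha_2(1) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(1) + e_{232} \\
y_{233} &= 10 = \mu(1) + \alpha_1(0) + \alpha_2(1) + \beta_{11}(0) + \beta_{12}(0) + \beta_{13}(0) + \beta_{21}(0) + \beta_{22}(0) + \beta_{23}(1) + e_{233}
\end{aligned}
$$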
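The result is as shown above. Let us now write the system in matrix form, with the indicator matrix (design matrix) shown explicitly:

$$
\begin{pmatrix} 5 \\ 3 \\ 9 \\ 3 \\ 8 \\ 8 \\ 9 \\ 9 \\ 6 \\ 9 \\ 3 \\ 8 \\ 10 \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \beta_{11} \\ \beta_{12} \\ \beta_{13} \\ \beta_{21} \\ \beta_{22} \\ \beta_{23} \end{pmatrix}
+
\begin{pmatrix} e_{111} \\ e_{112} \\ e_{121} \\ e_{122} \\ e_{131} \\ e_{132} \\ e_{211} \\ e_{212} \\ e_{213} \\ e_{221} \\ e_{231} \\ e_{232} \\ e_{233} \end{pmatrix}
$$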
We can show this system of equations with matrices and vectors as follows:

$$Y = Xb + e$$

$Y$ = vector of observations
$X$ = the matrix of coefficients, consisting of 0 and 1
$b$ = the unknown parameter vector, consisting of three sub-vectors ($b' = (\mu,\ \alpha',\ \beta')$)
$e$ = vector of errors
In the system above, we see that each equation contains exactly one member of each sub-vector of $b$. This gives the indicator matrix $X$ a definite structure:
1) It consists only of the values 0 and 1.
2) Because of its special structure, the $X$ matrix is not of full rank, whether it is square or rectangular. (The first column is equal to the sum of the 2nd and 3rd columns, and also to the sum of the 4th, 5th, 6th, 7th, 8th and 9th columns.)
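Both properties of the indicator matrix can be checked numerically. The sketch below assumes NumPy is available; the arrays `bank` and `cell` are an illustrative encoding of the thirteen observations, chosen here and not taken from the original paper.

```python
import numpy as np

# Bank type (0 = public, 1 = special) and nested cell index (0..5) of each
# of the thirteen observations, in the order y111, y112, ..., y233.
bank = np.array([0] * 6 + [1] * 7)
cell = np.array([0, 0, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 5])

# Columns: mu | alpha_1 alpha_2 | beta_11 beta_12 beta_13 beta_21 beta_22 beta_23
X = np.hstack([np.ones((13, 1)), np.eye(2)[bank], np.eye(6)[cell]])

# Property 1: only the values 0 and 1 occur.
only_zeros_and_ones = bool(np.isin(X, [0, 1]).all())

# Property 2: X is not of full column rank because of the column relations.
rank_X = int(np.linalg.matrix_rank(X))
col1_is_sum_of_alpha_cols = bool(np.array_equal(X[:, 0], X[:, 1:3].sum(axis=1)))
col1_is_sum_of_beta_cols = bool(np.array_equal(X[:, 0], X[:, 3:].sum(axis=1)))

print(X.shape, rank_X)
print(only_zeros_and_ones, col1_is_sum_of_alpha_cols, col1_is_sum_of_beta_cols)
```

With nine columns and rank six, three columns are redundant, which is why an ordinary inverse of $X'X$ does not exist for this model.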
This means that we are dealing with models that are not of full rank. We will solve these models with a method called the "g-inverse" (generalized inverse). The normal equations are

$$(X'X)\,b = X'Y \qquad (2)$$

For the solution of the normal equations we would need $\hat{b} = (X'X)^{-1}X'Y$. But (2) has just one solution only for full-rank regression models. In that case $b$ is estimable; in other words, $E(\hat{b}) = b$.
In our model, it is easily seen that the $X$ matrix (the design matrix, or indicator matrix) is not of full rank. The values 0 and 1, called "dummy variables", have a special meaning: the distribution of the 0s and 1s in $X$ shows us how the terms of the model line up among the observations, in other words what kind of classification these observations have.
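When $X$ is not of full rank, the normal equations have infinitely many solutions $b^{\circ}$. To be able to speak of a single estimated vector, some writers such as Kempthorne, Federer and Steel-Torrie have chosen to make the solution of $Y = Xb + e$ unique by imposing restrictions on the parameters (such as the usual sum-to-zero constraints). The second approach is to obtain the solution with a "g-inverse" (generalized inverse). It has been shown that, if the system of equations is consistent, a solution of the system is

$$b^{\circ} = G\,X'Y$$

where $G$ is a generalized inverse of $X'X$, that is, a matrix satisfying $X'X\,G\,X'X = X'X$. Once the generalized inverse matrix $G$ is obtained, the general solution of the normal equations is given by the equality

$$b^{\circ} = G\,X'Y + (G\,X'X - I)\,Z$$

where $Z$ is an arbitrary vector. There are different methods for calculating g-inverses. Now let us go back to our model and find a solution $b^{\circ}$. The normal equations are $X'X\,b^{\circ} = X'Y$. Let us express the matrices here numerically.

$$
X'X =
\begin{pmatrix}
13 & 6 & 7 & 2 & 2 & 2 & 3 & 1 & 3 \\
6 & 6 & 0 & 2 & 2 & 2 & 0 & 0 & 0 \\
7 & 0 & 7 & 0 & 0 & 0 & 3 & 1 & 3 \\
2 & 2 & 0 & 2 & 0 & 0 & 0 & 0 & 0 \\
2 & 2 & 0 & 0 & 2 & 0 & 0 & 0 & 0 \\
2 & 2 & 0 & 0 & 0 & 2 & 0 & 0 & 0 \\
3 & 0 & 3 & 0 & 0 & 0 & 3 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
3 & 0 & 3 & 0 & 0 & 0 & 0 & 0 & 3
\end{pmatrix},
\qquad
X'Y =
\begin{pmatrix} 90 \\ 36 \\ 54 \\ 8 \\ 12 \\ 16 \\ 24 \\ 9 \\ 21 \end{pmatrix}
$$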
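That the normal equations of a model not of full rank have more than one solution can be illustrated numerically. The sketch below assumes NumPy; it uses `numpy.linalg.lstsq`, which returns the minimum-norm least-squares solution, as one convenient way of obtaining a solution. This is a swapped-in technique for illustration, not the g-inverse used in the paper, and the names `b_a`, `b_b` and `v` are hypothetical.

```python
import numpy as np

# Observations in the order y111, y112, ..., y233 (Table 1).
y = np.array([5, 3, 9, 3, 8, 8, 9, 9, 6, 9, 3, 8, 10], dtype=float)
bank = np.array([0] * 6 + [1] * 7)
cell = np.array([0, 0, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 5])
X = np.hstack([np.ones((13, 1)), np.eye(2)[bank], np.eye(6)[cell]])

XtX = X.T @ X
Xty = X.T @ y

# One solution of the normal equations: the minimum-norm least-squares solution.
b_a, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# Column 1 of X equals column 2 + column 3, so X @ v = 0 for this v,
# and adding any multiple of v to a solution gives another solution.
v = np.array([1, -1, -1, 0, 0, 0, 0, 0, 0], dtype=float)
b_b = b_a + 3.0 * v

both_solve_normal_equations = bool(
    np.allclose(XtX @ b_a, Xty) and np.allclose(XtX @ b_b, Xty)
)
same_fitted_values = bool(np.allclose(X @ b_a, X @ b_b))

print(rank, both_solve_normal_equations, same_fitted_values)
print(np.round(X @ b_a, 6))  # fitted values are the cell averages
```

The two solution vectors differ, yet both reproduce the same fitted values, which is the reason only estimable functions of the parameters are interpreted.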
The order of the $X'X$ matrix is 9 and its rank is 6, because:
The sum of the 2nd and 3rd rows gives the first row.
The sum of the 4th, 5th and 6th rows gives the second row.
The sum of the 7th, 8th and 9th rows gives the third row.
Due to these three linear relations, $\operatorname{rank}(X'X) = 9 - 3 = 6$.
We can therefore say that, for the two-factor hierarchical classification model, the rank of the $X'X$ matrix is equal to the number of its sub-classes, whose total, which we denote by $q$, is 6. In $X'X$, let $p$ be the number of equations belonging to the $\alpha$ factor, $q$ the number of equations belonging to the $\beta$ factor, and 1 the single equation belonging to $\mu$. The order of the $X'X$ matrix is then $m = 1 + p + q$, and its rank is $\operatorname{rank}(X'X) = 1 + p + q - (1 + p) = q$.
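The order and rank of $X'X$ can be verified from the cell counts alone. The sketch below assumes NumPy and builds $X'X$ as $T\,D(n_{ij})\,T'$, where the 9 by 6 matrix `T` stacks the $\mu$ row, the $\alpha$ membership rows and an identity block. This factorisation is a shortcut chosen here for illustration, not a construction given in the paper, and it makes visible why the rank cannot exceed the number of cells $q$.

```python
import numpy as np

# Numbers of observations n_ij per cell, one inner list per bank type.
counts = [[2, 2, 2], [3, 1, 3]]

p = len(counts)                       # levels of alpha
q = sum(len(row) for row in counts)   # total nested levels of beta
m = 1 + p + q                         # order of X'X

n_flat = np.array([n for row in counts for n in row], dtype=float)

# membership[i, c] = 1 when cell c is nested within bank type i.
membership = np.zeros((p, q))
start = 0
for i, row in enumerate(counts):
    membership[i, start:start + len(row)] = 1
    start += len(row)

# Rows of T: mu, alpha_1..alpha_p, beta cells; X'X = T diag(n_ij) T'.
T = np.vstack([np.ones((1, q)), membership, np.eye(q)])
XtX = T @ np.diag(n_flat) @ T.T

rank = int(np.linalg.matrix_rank(XtX))
print(m, rank, q)

# The three row relations quoted in the text.
print(np.allclose(XtX[0], XtX[1] + XtX[2]))
print(np.allclose(XtX[1], XtX[3:6].sum(axis=0)))
print(np.allclose(XtX[2], XtX[6:9].sum(axis=0)))
```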
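The normal equations are also solved by taking the first $(1 + p)$ members of the solution vector as zero ($\mu^{\circ} = 0$, $\alpha_i^{\circ} = 0$). In other words, a restriction is imposed, and the result is $\beta_{ij}^{\circ} = \bar{y}_{ij.}$:

$b^{\circ\prime} = [\,0\ \ 0\ \ 0\ \ 4\ \ 6\ \ 8\ \ 8\ \ 9\ \ 7\,]$

is the row vector formed by the cell averages. The g-inverse (generalized inverse matrix) corresponding to the $X'X$ matrix is

$$G = \begin{pmatrix} 0 & 0 \\ 0 & D(1/n_{ij}) \end{pmatrix}$$

where $D(1/n_{ij})$ is the diagonal matrix whose diagonal members are 1/2, 1/2, 1/2, 1/3, 1 and 1/3. It is obtained by inverting the lower-right square sub-matrix of order six of the $X'X$ matrix, which is itself diagonal. With this $G$, let us calculate the product $b^{\circ} = G\,X'Y$:

$$G\,X'Y = \left(0,\ 0,\ 0,\ \tfrac{8}{2},\ \tfrac{12}{2},\ \tfrac{16}{2},\ \tfrac{24}{3},\ \tfrac{9}{1},\ \tfrac{21}{3}\right)'$$

and

$$b^{\circ} = (0,\ 0,\ 0,\ 4,\ 6,\ 8,\ 8,\ 9,\ 7)'.$$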
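The g-inverse solution can be reproduced in a few lines. The sketch below assumes NumPy; the entries of `XtX` and `Xty` are computed from the design matrix and observation values of Table 1, and the variable names are illustrative only.

```python
import numpy as np

# X'X and X'Y for the thirteen observations of Table 1
# (parameter order: mu, alpha_1, alpha_2, beta_11, ..., beta_23).
XtX = np.array([
    [13, 6, 7, 2, 2, 2, 3, 1, 3],
    [6, 6, 0, 2, 2, 2, 0, 0, 0],
    [7, 0, 7, 0, 0, 0, 3, 1, 3],
    [2, 2, 0, 2, 0, 0, 0, 0, 0],
    [2, 2, 0, 0, 2, 0, 0, 0, 0],
    [2, 2, 0, 0, 0, 2, 0, 0, 0],
    [3, 0, 3, 0, 0, 0, 3, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 1, 0],
    [3, 0, 3, 0, 0, 0, 0, 0, 3],
], dtype=float)
Xty = np.array([90, 36, 54, 8, 12, 16, 24, 9, 21], dtype=float)

# G is zero except for D(1/n_ij) in the lower-right 6 x 6 block.
G = np.zeros((9, 9))
G[3:, 3:] = np.diag(1.0 / np.diag(XtX)[3:])

# Solution of the normal equations: b0 = G X'Y (the cell averages).
b0 = G @ Xty

# Defining property of a generalized inverse: X'X G X'X = X'X.
is_g_inverse = bool(np.allclose(XtX @ G @ XtX, XtX))
solves_normal_equations = bool(np.allclose(XtX @ b0, Xty))

print(b0)
print(is_g_inverse, solves_normal_equations)
```

Because the lower-right block of $X'X$ is diagonal, inverting it only requires the reciprocals of the cell counts, which is what makes this choice of $G$ convenient for nested models.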
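Variance Analysis of the Model
Let us calculate the sums of squares of the model from the observation data in Table 1.

$$R(\mu) = SSM = n_{..}\,\bar{y}_{...}^{2} = 13\,(7^{2}) = 637$$

$$R(\mu, \alpha, \beta) = SSR = b^{\circ\prime}X'Y = \sum_{i}\sum_{j} \frac{y_{ij.}^{2}}{n_{ij}} = \frac{8^{2}}{2} + \frac{12^{2}}{2} + \frac{16^{2}}{2} + \frac{24^{2}}{3} + \frac{9^{2}}{1} + \frac{21^{2}}{3} = 652$$

$$R(\alpha, \beta \mid \mu) = R(\mu, \alpha, \beta) - R(\mu) = 652 - 637 = 15$$

$$SST = \sum_{i}\sum_{j}\sum_{k} y_{ijk}^{2} = 5^{2} + 3^{2} + 9^{2} + 3^{2} + 8^{2} + 8^{2} + 9^{2} + 9^{2} + 6^{2} + 9^{2} + 3^{2} + 8^{2} + 10^{2} = 704$$

and $SSE = SST - SSR = 704 - 652 = 52$.

Here $\bar{y}_{...} = 7$ is the rounded general average from Table 1; the unrounded value is $90/13 \approx 6.92$.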
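The sums of squares can be recomputed directly from the observation values. The sketch below is plain Python with illustrative names. It reproduces the paper's value of $R(\mu)$ with the rounded general average 7, and, as an addition not in the original, also evaluates the same quantity with the unrounded average $90/13$ for comparison.

```python
# Observation values per cell (bank type i, service type j), from Table 1.
cells = {
    (1, 1): [5, 3],
    (1, 2): [9, 3],
    (1, 3): [8, 8],
    (2, 1): [9, 9, 6],
    (2, 2): [9],
    (2, 3): [3, 8, 10],
}

all_points = [x for points in cells.values() for x in points]
n_total = len(all_points)
general_mean = sum(all_points) / n_total          # 90 / 13

# R(mu) as in the paper, with the general average rounded to 7.
ssm_rounded = n_total * round(general_mean) ** 2
# R(mu) with the unrounded general average (comparison only).
ssm_exact = n_total * general_mean ** 2

# R(mu, alpha, beta): sum over cells of (cell total)^2 / n_ij.
ssr = sum(sum(points) ** 2 / len(points) for points in cells.values())

# Total and error sums of squares.
sst = sum(x ** 2 for x in all_points)
sse = sst - ssr

# Reduction due to alpha and beta after the mean, using the paper's R(mu).
r_after_mean = ssr - ssm_rounded

print(ssm_rounded, round(ssm_exact, 2), ssr, sst, sse, r_after_mean)
```

With the unrounded average, $R(\mu)$ is about 623.08, so the reduction after the mean would be about 28.92 instead of 15; $SSR$, $SST$ and $SSE$ do not depend on this rounding.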
Let us explain the meanings of the $R(\cdot)$ (reduction) functions used here:
$R(\mu)$: the reduction in the sum of squares due to fitting the model $y_{ijk} = \mu + e_{ijk}$; the sum of squares due to the average.
$R(\alpha, \beta \mid \mu)$: the reduction due to the $\alpha$ and $\beta$ factors joining the model after $\mu$; the sum of squares after the average.
SST: total sum of squares
SSE: sum of squares of the errors
Variance Analysis (V.A.) Table

| Source of variation | d.f. | Sum of squares | Mean square | F statistic |
|---|---|---|---|---|
| Average | 1 | $R(\mu) = 637$ | 637/1 = 637 | F(M) = 637/7.43 = 85.73 |
| After the average (model) | b − 1 = 6 − 1 = 5 | $R(\alpha, \beta \mid \mu) = 15$ | 15/5 = 3 | F(Rm) = 3/(SSE/(N − r)) = 0.4 |
| Error | N − b = 13 − 6 = 7 | SSE = 52 | 52/7 = 7.43 | |
| Sum | 13 | SST = 704 | | |
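The mean squares and F statistics of the variance analysis table follow from the sums of squares and degrees of freedom by simple division. A minimal sketch in plain Python, with the paper's sums of squares entered as constants and with illustrative variable names:

```python
# Sums of squares from the variance analysis above.
r_mean = 637.0         # R(mu)
r_after_mean = 15.0    # R(alpha, beta | mu)
sse = 52.0             # error sum of squares

n_obs = 13             # N, number of observations
n_cells = 6            # b, number of nested cells (rank of X'X)

# Degrees of freedom.
df_mean = 1
df_after_mean = n_cells - 1      # 5
df_error = n_obs - n_cells       # 7

# Mean squares.
ms_mean = r_mean / df_mean
ms_after_mean = r_after_mean / df_after_mean
ms_error = sse / df_error

# F statistics: each mean square divided by the error mean square.
f_mean = ms_mean / ms_error
f_after_mean = ms_after_mean / ms_error

print(df_after_mean, df_error)
print(round(ms_error, 2), round(f_mean, 2), round(f_after_mean, 2))
```

Without intermediate rounding the F statistic for the average is 85.75; the table value 85.73 results from dividing by the rounded error mean square 7.43.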
Looking up the table values of the F statistic at the 5% significance level, we find $F_{1,7;0.05} = 5.59$ and, for the 5 degrees of freedom of the model row, $F_{5,7;0.05} = 3.97$. The result of this test can be commented on as follows.

Comment: At the 5% significance level, the table value $F_{1,7;0.05} = 5.59$ is smaller than the calculated value $F(M) = 85.73$, so the hypothesis $H_0: E(\bar{y}) = 0$ is rejected: the model $E(y_{ijk}) = \mu$ explains the changes in $y$ much better than the model $E(y_{ijk}) = 0$. On the other hand, the hypothesis that $\alpha$ and $\beta$ have no effect cannot be rejected, because the table value $F_{5,7;0.05} = 3.97$ is bigger than the calculated value $F(R_m) = 0.4$. With these thirteen observations the test therefore does not show that the model $E(y_{ijk}) = \mu + \alpha_i + \beta_{ij}$ explains the changes in $y$ significantly better than the model $E(y_{ijk}) = \mu$. Even so, we regard an explanation of the model with only the average as not enough: the $\alpha$ (banks) factor and all of its levels (commercial activities) must also join the model.
Conclusion
The estimated solution vector

$b^{\circ\prime} = [\,0\ \ 0\ \ 0\ \ 4\ \ 6\ \ 8\ \ 8\ \ 9\ \ 7\,]$

shows us that customers of the public banks benefit less from the service provided to them through "credit card activities", while the special banks provide this service better ($\beta_{11}^{\circ} = 4$; $\beta_{21}^{\circ} = 8$). It is observed that the level of "consumer credit" provided by the special banks to their customers is very useful, and that it creates a more advantageous condition than in the public banks ($\beta_{12}^{\circ} = 6$; $\beta_{22}^{\circ} = 9$). We can also say that the "commercial credits" provided by the public banks are more advantageous than those provided by the special banks ($\beta_{13}^{\circ} = 8 > \beta_{23}^{\circ} = 7$). The model shows that, without the $\alpha$ and $\beta$ factors joining the model, these advantages and disadvantages cannot be tested.
References
1) İpek, M., "Genelleştirilmiş Ters Matrislerle ve Rankı Tam Olmayan Modellere Uygulama", İ.Ü. İktisat Fak., İST., 1980
2) Rao, C. R., "Linear Statistical Inference and Its Applications", John Wiley & Sons, 1965
3) Searle, S. R., "Linear Models", John Wiley & Sons, 1971
4) Graybill, F. A., "An Introduction to Linear Statistical Models", McGraw-Hill, N.Y., 1961
5) Mazmanoğlu, A., "Etkileşimsiz Çapraz 2-Faktörlü Varyans Analizi Modellerinde Matrislerle Çözümleme", Doktora tezi, İ.Ü. İktisat Fak., 1984
6) Mazmanoğlu, A., "Otomotiv Sektöründen Alınan Rastgele Örneklemenin Kalite Saptanmasında Çapraz 2-Faktörlü Dengeli Verili V.A. Yönteminin Kullanımı ve Bir Uygulama", YA/EM'96 Bildiri Kitabı, S. 176-179, İ.T.Ü., İST., 1996
7) Mazmanoğlu, A., "Tek Yönlü Sınıflama Lineer Modeli ve İki Faktörlü Çapraz Lineer Modelleri Kullanarak Büyük Marketlerden Yapılan Alışverişlerin Getirdiği Faydanın Ölçülmesi", IV. Ulusal Ekonometri ve İstatistik Sempozyumu, 14-16 Mayıs 1999, Sol Bevil Hotel, Belek-ANTALYA
8) Kari, G., "İç-İçe 2-Faktörlü Varyans Analizi Modellerinde Matrislerle Çözümleme ve Bir Uygulama", Doktora tezi, Y.T.Ü. Fen Bilimleri Enstitüsü, İST., 1994