Variance Analysis of the Advantages and Disadvantages of the Services That Active Financial Institutions Provide to Their Customers: Solution by the Two-Factor Hierarchical (Nested) Classification Model and an Application

 

Assoc. Prof. Dr. Adnan MAZMANOĞLU

Department of Mathematics

Marmara University.

Göztepe Campus/İstanbul

E-Mail:amazman@marun.edu.tr

           

Keywords: Linear Models, Two-Way Nested Classification

Abbreviation: V.A. = Variance Analysis

 

           

The data table below was compiled from face-to-face interviews with customers chosen at random from different geographical regions and of different status. Its purpose is to test how much customers benefit from three services that are among the economic activities of public and special (private) banks: credit cards, consumer credit and commercial credit. In our model the bank type is the factor $\alpha$, and the service types are the levels of the factor $\beta$ nested within $\alpha$.

Table 1: Observation table

Bank type     Service type             Points (observations)   Sum   Number   Average
Public        Credit card (1)          5, 3                      8        2   4
Public        Consumer credit (2)      9, 3                     12        2   6
Public        Commercial credit (3)    8, 8                     16        2   8
Public        Sum                                               36        6   6
Special       Credit card (1)          9, 9, 6                  24        3   8
Special       Consumer credit (2)      9                         9        1   9
Special       Commercial credit (3)    3, 8, 10                 21        3   7
Special       Sum                                               54        7   8 (rounded; 54/7 ≈ 7.71)
General sum                                                     90       13   7 (rounded; 90/13 ≈ 6.92)
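The sums, counts and averages in the table can be recomputed from the raw scores. The following is a minimal sketch; the dictionary layout and the names `scores` and `summarize` are illustrative assumptions, not part of the original study. It also shows that the exact bank and grand means are 54/7 ≈ 7.71 and 90/13 ≈ 6.92, which the table reports rounded to 8 and 7.

```python
# Scores from Table 1, keyed by (bank type, service type).
# The layout and names are illustrative assumptions.
scores = {
    ("public", "credit card"): [5, 3],
    ("public", "consumer credit"): [9, 3],
    ("public", "commercial credit"): [8, 8],
    ("special", "credit card"): [9, 9, 6],
    ("special", "consumer credit"): [9],
    ("special", "commercial credit"): [3, 8, 10],
}


def summarize(values):
    """Return (sum, count, mean) for one list of scores."""
    total = sum(values)
    return total, len(values), total / len(values)


# One summary per subclass (bank type, service type).
cell_summary = {cell: summarize(v) for cell, v in scores.items()}

# One summary per bank type, pooling its three service types.
bank_summary = {}
for bank in ("public", "special"):
    pooled = [y for (b, _), v in scores.items() if b == bank for y in v]
    bank_summary[bank] = summarize(pooled)

# Summary over all thirteen observations.
grand_summary = summarize([y for v in scores.values() for y in v])

for cell, (total, n, mean) in cell_summary.items():
    print(cell, total, n, round(mean, 2))
print(bank_summary)
print(grand_summary)
```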

 

Table 1 shows the scores for credit card, consumer credit and commercial credit services, which are among the busiest activities of the public and special banks. Measuring how satisfied customers are with these services leads us to the two-factor hierarchical (nested) variance analysis model, in which the levels of two hierarchically arranged factors are tested.

                       

Model

Looking at the data table, we can see that the most suitable model is

$$y_{ijk} = \mu + \alpha_i + \beta_{ij} + e_{ijk} \qquad (1)$$

The terms of this model are:

$y_{ijk}$: the $k$-th observation for service type $j$ within bank type $i$

$\mu$: the general mean

$\alpha_i$: the effect of bank type $i$

$\beta_{ij}$: the effect of service type $j$ within bank type $i$

 

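The model consists of two factors, $\alpha$ and $\beta$, with $\beta$ nested within $\alpha$. The factor $\alpha$ has two levels, public and special banks, and the factor $\beta$ has six levels, three within the first level of $\alpha$ and three within the second level of $\alpha$.

$p = 2$;   $i = 1, \dots, p$

$q_1 = 3$;   $j = 1, \dots, q_1$

$q_2 = 3$;   $j = 1, \dots, q_2$

Let $n_{ij}$ be the number of observations for service type $j$ of bank type $i$, with $k = 1, \dots, n_{ij}$, and let $e_{ijk}$ be the error term. Then

$$n_{i.} = \sum_{j=1}^{q_i} n_{ij} \quad \text{and} \quad n_{..} = \sum_{i=1}^{p} n_{i.}$$

From the data table, for example,

$$n_{1.} = \sum_{j=1}^{3} n_{1j} = n_{11} + n_{12} + n_{13} = 2 + 2 + 2 = 6$$

$$n_{2.} = \sum_{j=1}^{3} n_{2j} = n_{21} + n_{22} + n_{23} = 3 + 1 + 3 = 7$$

$$n_{..} = n_{1.} + n_{2.} = 6 + 7 = 13$$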

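Let us write the model equations for these thirteen observations using model (1).

$$\begin{aligned}
y_{111} &= 5 = \mu + \alpha_1 + \beta_{11} + e_{111} \\
y_{112} &= 3 = \mu + \alpha_1 + \beta_{11} + e_{112} \\
y_{121} &= 9 = \mu + \alpha_1 + \beta_{12} + e_{121} \\
y_{122} &= 3 = \mu + \alpha_1 + \beta_{12} + e_{122} \\
y_{131} &= 8 = \mu + \alpha_1 + \beta_{13} + e_{131} \\
y_{132} &= 8 = \mu + \alpha_1 + \beta_{13} + e_{132} \\
y_{211} &= 9 = \mu + \alpha_2 + \beta_{21} + e_{211} \\
y_{212} &= 9 = \mu + \alpha_2 + \beta_{21} + e_{212} \\
y_{213} &= 6 = \mu + \alpha_2 + \beta_{21} + e_{213} \\
y_{221} &= 9 = \mu + \alpha_2 + \beta_{22} + e_{221} \\
y_{231} &= 3 = \mu + \alpha_2 + \beta_{23} + e_{231} \\
y_{232} &= 8 = \mu + \alpha_2 + \beta_{23} + e_{232} \\
y_{233} &= 10 = \mu + \alpha_2 + \beta_{23} + e_{233}
\end{aligned}$$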

If we write the models consisting of the (1, 0) indicator values of these equations again,

 

 

 

 

 

 

 

y111 =   5+ + 1(1)+2(0) + (1) + (0) +(0) +(0)+(0) +(0)+ 111

y112 =   3+ + 1(1)+2(0) + (1) + (0) +(0) +(0)+(0) +(0)+ 112

y121 =   9+ + 1(1)+2(0) + (0) + (1) +(0) +(0)+(0) +(0)+ 121

y122 =   3+ + 1(1)+2(0) + (0) + (1) +(0) +(0)+(0) +(0)+ 122

y131 =   8+ + 1(1)+2(0) + (0) + (0) +(1) +(0)+(0) +(0)+ 131

y132 =   8+ + 1(1)+2(0) + (0) + (1) +(1) +(0)+(0) +(0)+ 132

y211 =   9+ + 1(0)+2(1) + (0) + (0) +(0) +(1)+(0) +(0)+ 211

y212 =   9+ + 1(0)+2(1) + (0) + (0) +(0) +(1)+(0) +(0)+ 212

y213 =   6+ + 1(0)+2(1) + (0) + (0) +(0) +(1)+(0) +(0)+ 213

y221 =   9+ + 1(0)+2(1) + (0) + (0) +(0) +(0)+(1) +(0)+ 221

y231 =   3+ + 1(0)+2(1) + (0) + (0) +(0) +(0)+(0) +(1)+ 231

y232 =   8+ + 1(0)+2(1) + (0) + (1) +(0) +(0)+(0) +(1)+ 232

y233 = 10+ + 1(0)+2(1) + (0) + (1) +(0) +(0)+(0) +(1)+ 233

 

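The result is as shown above. Collecting the indicator values into the indicator (design) matrix, the system takes the matrix form

$$
\begin{bmatrix} y_{111}\\ y_{112}\\ y_{121}\\ y_{122}\\ y_{131}\\ y_{132}\\ y_{211}\\ y_{212}\\ y_{213}\\ y_{221}\\ y_{231}\\ y_{232}\\ y_{233} \end{bmatrix}
=
\begin{bmatrix} 5\\ 3\\ 9\\ 3\\ 8\\ 8\\ 9\\ 9\\ 6\\ 9\\ 3\\ 8\\ 10 \end{bmatrix}
=
\begin{bmatrix}
1&1&0&1&0&0&0&0&0\\
1&1&0&1&0&0&0&0&0\\
1&1&0&0&1&0&0&0&0\\
1&1&0&0&1&0&0&0&0\\
1&1&0&0&0&1&0&0&0\\
1&1&0&0&0&1&0&0&0\\
1&0&1&0&0&0&1&0&0\\
1&0&1&0&0&0&1&0&0\\
1&0&1&0&0&0&1&0&0\\
1&0&1&0&0&0&0&1&0\\
1&0&1&0&0&0&0&0&1\\
1&0&1&0&0&0&0&0&1\\
1&0&1&0&0&0&0&0&1
\end{bmatrix}
\begin{bmatrix} \mu\\ \alpha_1\\ \alpha_2\\ \beta_{11}\\ \beta_{12}\\ \beta_{13}\\ \beta_{21}\\ \beta_{22}\\ \beta_{23} \end{bmatrix}
+
\begin{bmatrix} e_{111}\\ e_{112}\\ e_{121}\\ e_{122}\\ e_{131}\\ e_{132}\\ e_{211}\\ e_{212}\\ e_{213}\\ e_{221}\\ e_{231}\\ e_{232}\\ e_{233} \end{bmatrix}
$$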

 

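This system of equations can be written with matrices and vectors as follows:

$$\mathbf{Y} = \mathbf{X}\mathbf{b} + \mathbf{e}$$

$\mathbf{Y}$: vector of observations

$\mathbf{X}$: matrix of coefficients, consisting of 0s and 1s

$\mathbf{b}$: vector of unknown parameters, consisting of three sub-vectors, $\mathbf{b}' = (\mu,\ \boldsymbol{\alpha}',\ \boldsymbol{\beta}')$

$\mathbf{e}$: vector of errors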

 

In the system above, each equation contains one element from each sub-vector of the parameter vector. This gives the indicator matrix $\mathbf{X}$ a particular structure:

1)      It consists only of the values 0 and 1.

2)      Whether it is square or rectangular, $\mathbf{X}$ is not of full rank because of this special structure. (The first column equals the sum of columns 2 and 3, and also the sum of columns 4 through 9.)

This means that we are dealing with models that are not of full rank. We will solve these models with the method of the generalized inverse ("g-inverse").
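The rank deficiency described above can be checked numerically. The sketch below builds the 13 x 9 indicator matrix from the subclass sizes of Table 1 and computes its rank by exact Gaussian elimination; the helper names `design_row` and `rank` are hypothetical, and the column order (general mean, two bank types, six subclasses) is an assumption matching the parameter order used in the model equations.

```python
from fractions import Fraction

# Subclass sizes n_ij from Table 1: public banks (i=1), special banks (i=2).
n = {(1, 1): 2, (1, 2): 2, (1, 3): 2, (2, 1): 3, (2, 2): 1, (2, 3): 3}
cells = sorted(n)  # column order of the six subclass indicators


def design_row(i, j):
    """One row of X: [mean, bank 1, bank 2, six subclass indicators]."""
    bank_part = [1 if i == a else 0 for a in (1, 2)]
    cell_part = [1 if (i, j) == c else 0 for c in cells]
    return [1] + bank_part + cell_part


# Repeat each subclass row once per observation in that subclass.
X = [design_row(i, j) for (i, j) in cells for _ in range(n[(i, j)])]


def rank(matrix):
    """Rank by Gauss-Jordan elimination over exact fractions."""
    rows = [[Fraction(v) for v in r] for r in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((k for k in range(r, len(rows)) if rows[k][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it depends on earlier columns
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(len(rows)):
            if k != r and rows[k][col] != 0:
                factor = rows[k][col] / rows[r][col]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Column 1 equals columns 2+3 and also the sum of columns 4..9 in every row.
dependent = all(row[0] == row[1] + row[2] == sum(row[3:]) for row in X)
print(len(X), len(X[0]), rank(X), dependent)
```

Exact fractions are used instead of floating point so that the rank does not depend on a numerical tolerance.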

 

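The normal equations are

$$\mathbf{X}'\mathbf{X}\,\hat{\mathbf{b}} = \mathbf{X}'\mathbf{Y} \qquad (2)$$

and their solution must be of the form

$$\hat{\mathbf{b}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$$

However, (2) has a unique solution only for full-rank regression models. In that case $\hat{\mathbf{b}}$ is an unbiased estimator of the parameter vector; in other words,

$$E(\hat{\mathbf{b}}) = \mathbf{b}.$$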

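In our model it is easy to see that the design (indicator) matrix $\mathbf{X}$ is not of full rank. The values 0 and 1, called "dummy variables", have a special meaning: their arrangement in $\mathbf{X}$ shows how the terms of the model are distributed among the observations, that is, how the observations are classified. When $\mathbf{X}'\mathbf{X}$ is not of full rank, the normal equations have infinitely many solutions. Two approaches are used to single out one solution vector. In the first, writers such as Kempthorne, Federer and Steel-Torrie made the solution of $\mathbf{Y} = \mathbf{X}\mathbf{b} + \mathbf{e}$ unique by imposing restrictions on the parameters. In the second, the solution is obtained with a generalized inverse (g-inverse) $\mathbf{G}$ of $\mathbf{X}'\mathbf{X}$. It has been shown that, if the system of equations is consistent, a solution of the system is

$$\mathbf{b}^0 = \mathbf{G}\mathbf{X}'\mathbf{Y}$$

where the generalized inverse $\mathbf{G}$ satisfies

$$\mathbf{X}'\mathbf{X}\,\mathbf{G}\,\mathbf{X}'\mathbf{X} = \mathbf{X}'\mathbf{X}.$$

With $\mathbf{G}$ and $\mathbf{b}^0$ in hand, the general solution of the normal equations is given by the equality

$$\tilde{\mathbf{b}} = \mathbf{G}\mathbf{X}'\mathbf{Y} + (\mathbf{G}\mathbf{X}'\mathbf{X} - \mathbf{I})\,\mathbf{z}$$

where $\mathbf{z}$ is an arbitrary vector. There are different methods for calculating g-inverses.

Now let us return to our model and look for a solution $\mathbf{b}^0$ of the normal equations $\mathbf{X}'\mathbf{X}\,\mathbf{b}^0 = \mathbf{X}'\mathbf{Y}$. Numerically, these equations are

$$
\begin{bmatrix}
13&6&7&2&2&2&3&1&3\\
6&6&0&2&2&2&0&0&0\\
7&0&7&0&0&0&3&1&3\\
2&2&0&2&0&0&0&0&0\\
2&2&0&0&2&0&0&0&0\\
2&2&0&0&0&2&0&0&0\\
3&0&3&0&0&0&3&0&0\\
1&0&1&0&0&0&0&1&0\\
3&0&3&0&0&0&0&0&3
\end{bmatrix}
\begin{bmatrix} \mu^0\\ \alpha_1^0\\ \alpha_2^0\\ \beta_{11}^0\\ \beta_{12}^0\\ \beta_{13}^0\\ \beta_{21}^0\\ \beta_{22}^0\\ \beta_{23}^0 \end{bmatrix}
=
\begin{bmatrix} 90\\ 36\\ 54\\ 8\\ 12\\ 16\\ 24\\ 9\\ 21 \end{bmatrix}
$$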
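The coefficient matrix and right-hand side of the normal equations can be formed directly from the thirteen observations. This is a minimal sketch; the `(i, j, score)` tuple layout and the helper `row` are illustrative assumptions, with `i` the bank type, `j` the service type, and the columns ordered as general mean, two bank types, six subclasses.

```python
# Observations in the order y111, y112, ..., y233 as (i, j, score).
data = [
    (1, 1, 5), (1, 1, 3), (1, 2, 9), (1, 2, 3), (1, 3, 8), (1, 3, 8),
    (2, 1, 9), (2, 1, 9), (2, 1, 6), (2, 2, 9),
    (2, 3, 3), (2, 3, 8), (2, 3, 10),
]
cells = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]


def row(i, j):
    """Indicator row: [mean, bank 1, bank 2, six subclass indicators]."""
    return [1, int(i == 1), int(i == 2)] + [int((i, j) == c) for c in cells]


X = [row(i, j) for i, j, _ in data]
y = [score for _, _, score in data]

# X'X is 9 x 9 and holds the observation counts; X'y holds the totals.
XtX = [[sum(r[a] * r[b] for r in X) for b in range(9)] for a in range(9)]
Xty = [sum(r[a] * v for r, v in zip(X, y)) for a in range(9)]

for line in XtX:
    print(line)
print(Xty)
```

The first row of `XtX` contains the counts $n_{..}$, $n_{i.}$ and $n_{ij}$, and `Xty` contains the grand total, the two bank totals and the six subclass totals.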

The order of the matrix $\mathbf{X}'\mathbf{X}$ is 9 and its rank is 6, because:

            The sum of rows 2 and 3 gives row 1.

            The sum of rows 4, 5 and 6 gives row 2.

            The sum of rows 7, 8 and 9 gives row 3.

Because of these three linear relations, rank$(\mathbf{X}'\mathbf{X}) = 9 - 3 = 6$.

We can therefore say that, for the two-factor hierarchical classification model, the rank of the $\mathbf{X}'\mathbf{X}$ matrix equals the number of its subclasses, which we denote by $q$; here $q = q_1 + q_2 = 6$. The normal equations contain one equation for $\mu$, $p$ equations for the levels of $\alpha$ and $q$ equations for the levels of $\beta$, so the order of $\mathbf{X}'\mathbf{X}$ is $m = 1 + p + q$ and its rank is rank$(\mathbf{X}'\mathbf{X}) = 1 + p + q - (1 + p) = q$.

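The normal equations are solved by setting the first $(1 + p)$ elements of the solution vector to zero. In other words, a restriction is imposed:

$$\mu^0 = \alpha_1^0 = \alpha_2^0 = 0.$$

The solution is then

$$\mathbf{b}^{0\prime} = [\,0 \quad 0 \quad 0 \quad 4 \quad 6 \quad 8 \quad 8 \quad 9 \quad 7\,]$$

a row vector whose last six elements are the cell averages. The g-inverse (generalized inverse matrix) corresponding to the $\mathbf{X}'\mathbf{X}$ matrix is

$$\mathbf{G} = \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{D}(1/n_{ij}) \end{bmatrix}$$

where $\mathbf{D}(1/n_{ij})$ is the diagonal matrix with diagonal elements 1/2, 1/2, 1/2, 1/3, 1 and 1/3. It is obtained by inverting the lower-right square submatrix of order six of $\mathbf{X}'\mathbf{X}$, and the zero blocks fill the first three rows and columns.

Since $\mathbf{b}^0 = \mathbf{G}\mathbf{X}'\mathbf{Y}$, let us calculate this product:

$$\mathbf{X}'\mathbf{Y} = [\,90 \quad 36 \quad 54 \quad 8 \quad 12 \quad 16 \quad 24 \quad 9 \quad 21\,]' \quad \text{and} \quad \mathbf{b}^0 = \mathbf{G}\mathbf{X}'\mathbf{Y} = [\,0 \quad 0 \quad 0 \quad 4 \quad 6 \quad 8 \quad 8 \quad 9 \quad 7\,]'.$$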
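The two properties used above, that the matrix of zeros and reciprocal subclass sizes is a generalized inverse of X'X and that it reproduces the cell averages, can be verified with exact arithmetic. This is a sketch under the assumption that the parameters are ordered as general mean, two bank types, six subclasses; the names `XtX`, `Xty`, `G`, `b0` and `matmul` are illustrative.

```python
from fractions import Fraction

n = [2, 2, 2, 3, 1, 3]                 # subclass sizes, public then special
cell_totals = [8, 12, 16, 24, 9, 21]   # subclass totals in the same order

# Build X'X from the counts (9 x 9, symmetric).
XtX = [[0] * 9 for _ in range(9)]
XtX[0][0] = sum(n)
for k, size in enumerate(n):
    bank = 1 if k < 3 else 2           # index of the bank-type parameter
    cell = 3 + k                       # index of this subclass parameter
    XtX[bank][bank] += size
    XtX[0][bank] += size
    XtX[bank][0] += size
    for a in (0, bank, cell):
        XtX[a][cell] = XtX[cell][a] = size

Xty = [sum(cell_totals), sum(cell_totals[:3]), sum(cell_totals[3:])] + cell_totals

# G: all zeros except 1/n_ij on the last six diagonal positions.
G = [[Fraction(0)] * 9 for _ in range(9)]
for k, size in enumerate(n):
    G[3 + k][3 + k] = Fraction(1, size)


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]


b0 = [sum(g * t for g, t in zip(r, Xty)) for r in G]

# Defining property of a generalized inverse: X'X G X'X = X'X.
is_g_inverse = matmul(matmul(XtX, G), XtX) == XtX
# b0 must also satisfy the normal equations X'X b0 = X'y.
solves_normal_equations = [sum(x * b for x, b in zip(r, b0)) for r in XtX] == Xty

print([int(v) for v in b0], is_g_inverse, solves_normal_equations)
```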

 

Variance Analysis Of The Model

 

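Let us calculate the sums of squares of the model from the observations in Table 1.

$$R(\mu) = SSM = n_{..}\,\bar{y}_{...}^{2} = 13\,(7^2) = 637$$

$$R(\mu, \alpha, \beta) = SSR = \mathbf{b}^{0\prime}\mathbf{X}'\mathbf{Y} = \sum_{i}\sum_{j} \frac{y_{ij.}^{2}}{n_{ij}} = \frac{8^2}{2} + \frac{12^2}{2} + \frac{16^2}{2} + \frac{24^2}{3} + \frac{9^2}{1} + \frac{21^2}{3} = 652$$

$$R(\alpha, \beta \mid \mu) = R(\mu, \alpha, \beta) - R(\mu) = 652 - 637 = 15$$

$$SST = \sum_{i}\sum_{j}\sum_{k} y_{ijk}^{2} = 5^2 + 3^2 + 9^2 + 3^2 + 8^2 + 8^2 + 9^2 + 9^2 + 6^2 + 9^2 + 3^2 + 8^2 + 10^2 = 704$$

and $SSE = SST - SSR = 704 - 652 = 52$.

Here the general mean $\bar{y}_{...} = 90/13 \approx 6.92$ has been rounded to 7. Without rounding, $R(\mu) = 90^2/13 \approx 623.08$ and $R(\alpha, \beta \mid \mu) \approx 28.92$; neither of the test decisions in the variance analysis that follows is affected by this rounding.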
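These sums of squares can be recomputed from the raw scores. The sketch below uses illustrative variable names and reports the mean sum of squares twice: once with the grand mean rounded to 7, which reproduces 637, and once with the exact mean 90/13.

```python
# Scores from Table 1, one list per subclass (public banks first).
scores = {
    ("public", "credit card"): [5, 3],
    ("public", "consumer credit"): [9, 3],
    ("public", "commercial credit"): [8, 8],
    ("special", "credit card"): [9, 9, 6],
    ("special", "consumer credit"): [9],
    ("special", "commercial credit"): [3, 8, 10],
}

all_y = [y for v in scores.values() for y in v]
n_total = len(all_y)        # 13 observations
grand_total = sum(all_y)    # 90

# Sum of squares due to the mean, with the rounded and the exact grand mean.
ssm_rounded = n_total * round(grand_total / n_total) ** 2
ssm_exact = grand_total ** 2 / n_total

# Sum of squares due to the full model: subclass total squared over size.
ssr = sum(sum(v) ** 2 / len(v) for v in scores.values())

# Total (uncorrected) and error sums of squares.
sst = sum(y * y for y in all_y)
sse = sst - ssr

print(ssm_rounded, round(ssm_exact, 2), ssr, sst, sse)
print(ssr - ssm_rounded, round(ssr - ssm_exact, 2))
```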

The reductions in sum of squares, $R(\cdot)$, used here have the following meanings:

$R(\mu)$: the reduction due to fitting the model $y_{ijk} = \mu + e_{ijk}$; the sum of squares due to the mean.

$R(\alpha, \beta \mid \mu)$: the reduction due to adding the factors $\alpha$ and $\beta$ to the model after $\mu$.

$SST$: the total sum of squares.

$SSE$: the error sum of squares.

 

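Variance Analysis (V.A.) Table

Source of variation     d.f.                    Sum of squares                      Mean square     F statistic
Mean                    1                       $R(\mu) = 637$                      637/1 = 637     $F(\mu) = 637/(SSE/(N-r)) = 85.73$
Model after the mean    r - 1 = 6 - 1 = 5       $R(\alpha, \beta \mid \mu) = 15$    15/5 = 3        $F(R_m) = 3/(SSE/(N-r)) = 0.40$
Error                   N - r = 13 - 6 = 7      $SSE = 52$                          52/7 = 7.43
Total                   13                      $SST = 704$

Here $N = n_{..} = 13$ is the number of observations and $r = 6$ is the rank of $\mathbf{X}'\mathbf{X}$, that is, the number of subclasses.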

 

 

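The table values of the F statistic at the 5% significance level are

            $F_{1,7;\,0.05} = 5.59$               and      $F_{5,7;\,0.05} = 3.97$

where the second value uses the 5 and 7 degrees of freedom of the variance analysis table. The results of these tests can be interpreted as follows.

            Comment: At the 5% significance level, the table value $F_{1,7;\,0.05} = 5.59$ is smaller than the calculated value $F(\mu) = 85.73$, so the hypothesis $H_0: E(\bar{y}) = 0$ is rejected. The model $E(y_{ijk}) = \mu$ explains the variation in $y$ significantly better than the model $E(y_{ijk}) = 0$.

For the second test, the table value $F_{5,7;\,0.05} = 3.97$ is larger than the calculated value $F(R_m) = 0.40$, so the hypothesis that all $\alpha_i$ and $\beta_{ij}$ effects are zero cannot be rejected. With these 13 observations, the model $E(y_{ijk}) = \mu + \alpha_i + \beta_{ij}$ does not explain the variation in $y$ significantly better than the model $E(y_{ijk}) = \mu$. Describing the data by the mean alone, however, does not allow the bank types and their service levels to be compared, so the bank factor and all of its levels (the credit services) are kept in the model for the comparisons made in the Conclusion.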
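The mean squares, F statistics and test decisions can be reproduced from the reported sums of squares. In this sketch the 5% critical values are entered as inputs read from standard F tables rather than computed; the numerator degrees of freedom of the second test are 5, for which the tabulated value is 3.97. The variable names are illustrative assumptions.

```python
# Sums of squares and degrees of freedom from the variance analysis table.
ss_mean, ss_model, ss_error = 637, 15, 52
df_mean, df_model, df_error = 1, 5, 7

# 5% critical values F(1,7) and F(5,7), read from standard F tables.
critical = {"mean": 5.59, "model": 3.97}

# Error mean square is the common denominator of both F statistics.
ms_error = ss_error / df_error

f_stats = {
    "mean": (ss_mean / df_mean) / ms_error,
    "model": (ss_model / df_model) / ms_error,
}

# Reject the null hypothesis when the statistic exceeds the critical value.
reject = {name: f_stats[name] > critical[name] for name in f_stats}

for name in f_stats:
    print(name, round(f_stats[name], 2), critical[name], reject[name])
```

Computed without intermediate rounding, the first statistic is 85.75; the reported 85.73 results from dividing by the error mean square rounded to 7.43.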

 

Conclusion

 

 

$$\mathbf{b}^{0\prime} = [\,0 \quad 0 \quad 0 \quad 4 \quad 6 \quad 8 \quad 8 \quad 9 \quad 7\,]$$

The solution vector, whose last six elements are the subclass means, shows that customers of the public banks benefit less from credit card services than customers of the special banks, which provide this service better ($\beta_{11}^0 = 4$; $\beta_{21}^0 = 8$).

The consumer credit service that special banks provide to their customers is rated very highly and is more advantageous than that of the public banks ($\beta_{12}^0 = 6$; $\beta_{22}^0 = 9$). The commercial credit provided by public banks, on the other hand, is more advantageous than that provided by special banks ($\beta_{13}^0 = 8$; $\beta_{23}^0 = 7$). Without including the factors $\alpha$ and $\beta$ in the model, these advantages and disadvantages cannot be tested.
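The comparisons above amount to reading off, for each service type, which bank type has the higher subclass mean. A minimal sketch, with the dictionary layout as an illustrative assumption and the values taken from the solution vector:

```python
# Subclass means from the solution vector, grouped by service type.
cell_means = {
    "credit card": {"public": 4, "special": 8},
    "consumer credit": {"public": 6, "special": 9},
    "commercial credit": {"public": 8, "special": 7},
}

# For each service, pick the bank type with the larger mean score.
better = {
    service: max(means, key=means.get)
    for service, means in cell_means.items()
}

for service, bank in better.items():
    means = cell_means[service]
    print(f"{service}: {bank} banks ({means['public']} vs {means['special']})")
```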

 

 

 

 

 

 

           

 

 

 

 

References

1)       İpek, M., "Genelleştirilmiş Ters Matrislerle ve Rankı Tam Olmayan Modellere Uygulama", İ.Ü. İktisat Fak., İstanbul, 1980.

2)       Rao, C. R., "Linear Statistical Inference and Its Applications", John Wiley & Sons, 1965.

3)       Searle, S. R., "Linear Models", John Wiley & Sons, 1971.

4)       Graybill, F. A., "An Introduction to Linear Statistical Models", McGraw-Hill, New York, 1961.

5)       Mazmanoğlu, A., "Etkileşimsiz Çapraz 2-Faktörlü Varyans Analizi Modellerinde Matrislerle Çözümleme", doctoral thesis, İ.Ü. İktisat Fak., 1984.

6)       Mazmanoğlu, A., "Otomotiv Sektöründen Alınan Rastgele Örneklemenin Kalite Saptanmasında Çapraz 2-Faktörlü Dengeli Verili V.A. Yönteminin Kullanımı ve Bir Uygulama", YA/EM'96 proceedings, pp. 176-179, İ.T.Ü., İstanbul, 1996.

7)       Mazmanoğlu, A., "Tek Yönlü Sınıflama Lineer Modeli ve İki Faktörlü Çapraz Lineer Modelleri Kullanarak Büyük Marketlerden Yapılan Alışverişlerin Getirdiği Faydanın Ölçülmesi", IV. Ulusal Ekonometri ve İstatistik Sempozyumu, 14-16 May 1999, Sol Bevil Hotel, Belek, Antalya.

8)       Kari, G., "İç-İçe 2-Faktörlü Varyans Analizi Modellerinde Matrislerle Çözümleme ve Bir Uygulama", doctoral thesis, Y.T.Ü. Fen Bilimleri Enstitüsü, İstanbul, 1994.