# Multivariate Analysis

## Content

### Nature of Multivariate Analysis

Most business problems are multi-dimensional, and multivariate analysis (MVA) helps to solve such complex problems.

- Investigation of one variable: univariate analysis
- Investigation of two variables: bivariate analysis
- Investigation of three or more variables: multivariate analysis

For example, treating sales as dependent on profits alone involves only two variables.

### Classification of Multivariate Techniques

1. Dependence methods
2. Interdependence methods

If a multivariate technique attempts to explain or predict the dependent variables on the basis of two or more independent variables, we are analyzing dependence. Multiple regression analysis, multiple discriminant analysis, multivariate analysis of variance and canonical correlation analysis are all dependence methods.

In the analysis of interdependence, the goal is to give meaning to a set of variables or to seek to group things together. No one variable or variable subset is to be predicted from the others or explained by them. The most common of these methods are factor analysis, cluster analysis and multidimensional scaling. A manager might use these techniques to identify profitable market segments or clusters. They can also be used to classify similar cities on the basis of population size, income distribution, etc.

As in other forms of data analysis, the nature of the measurement scales determines which multivariate technique is appropriate for the data. The exhibits below show that selecting a multivariate technique requires consideration of the types of measures used for the dependent and independent variables.

- Non-metric: nominal and ordinal scales
- Metric: interval and ratio scales

Exhibit 1: independent variable is metric

Classification of dependence Methods

### Analysis of Dependence

#### Multiple Regression Analysis

Multiple regression analysis is an extension of bivariate regression analysis that allows for the simultaneous investigation of the effect of two or more independent variables on a single interval-scaled dependent variable. In reality, several factors are likely to affect such a dependent variable. An example of a multiple regression equation is

$$Y = a + B_1X_1 + B_2X_2 + B_3X_3 + \dots + B_nX_n + e$$

where

- $a$ (also written $B_0$) = a constant, the value of $Y$ when all $X$ values are 0
- $B_i$ = slope of the regression surface, the regression coefficient associated with each $X_i$
- $e$ = an error term, normally distributed about a mean of 0

Let us look at a forecasting example. Suppose a toy manufacturer wishes to forecast sales by sales territory. It is thought that competitors' sales ($X_1$), the presence or absence of a company salesperson in the territory ($X_2$, a binary variable) and grammar school enrollment ($X_3$) are the independent variables that might explain the variation in the sales of a toy. The data are fitted and the results of the computations are as follows:

$$Y = 102.18 + 0.387X_1 + 115.2X_2 + 6.73X_3$$

$R^2 = 0.845$, $F$ value = 14.6

The regression equation indicates that sales are positively related to $X_1$, $X_2$ and $X_3$. The coefficients $B$ show the effect on the dependent variable of a unit increase in any independent variable. The value $B_2 = 115.2$ indicates that an increase of Rs 115,200 in toy sales is expected with an additional unit of $X_2$. Thus it appears that adding a company salesperson has a very positive effect on sales. Grammar school enrollment also helps predict sales: an increase of one unit of enrollment (1,000 students) indicates a sales increase of Rs 6,730. A one-unit increase in competitor sales volume $X_1$ does not add much to the toy manufacturer's sales.

The regression coefficients can be stated either in raw-score units (actual $X$ values) or as standardized coefficients ($X$ values in terms of their standard deviations).
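As a minimal sketch of the raw-score interpretation above, the fitted equation can be evaluated for two territories that differ only in whether a salesperson is present. The function name and the territory values are hypothetical. The competitor-sales coefficient is taken as 0.387, an assumption consistent with the remark that $X_1$ adds little to sales, and sales are assumed to be in thousands of rupees, as the Rs 115,200 reading of $B_2$ implies.

```python
def predict_toy_sales(competitor_sales, has_salesperson, enrollment_thousands):
    """Evaluate the fitted toy-sales equation (result in thousands of rupees)."""
    intercept = 102.18
    b1 = 0.387   # competitor sales (decimal placement is an assumption)
    b2 = 115.2   # company salesperson present (dummy variable: 0 or 1)
    b3 = 6.73    # grammar school enrollment, in thousands of students
    return (intercept
            + b1 * competitor_sales
            + b2 * has_salesperson
            + b3 * enrollment_thousands)


# Hypothetical territory: competitor sales of 200 units, 30 thousand pupils.
without_rep = predict_toy_sales(200, 0, 30)
with_rep = predict_toy_sales(200, 1, 30)

# The gap between the two forecasts is exactly B2, the salesperson effect.
salesperson_effect = with_rep - without_rep
print(round(without_rep, 2), round(with_rep, 2), round(salesperson_effect, 1))
```

Holding the other two inputs fixed while switching the dummy from 0 to 1 is what "the effect of a unit increase in one independent variable" means in practice.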
When regression coefficients are standardized they are called beta weights ($\beta$), and their values indicate the relative importance of the associated $X$ variables, especially when the predictors are unrelated. If $\beta_1 = .60$ and $\beta_2 = .20$, then $X_1$ has three times the influence on $Y$ that $X_2$ has.

In multiple regression the coefficients $B_1$, $B_2$, etc. are called coefficients of partial regression because the independent variables are correlated with other independent variables. The correlation between $Y$ and $X_1$, with the correlation that $X_1$ and $X_2$ have in common with $Y$ held constant, is the partial correlation. Because the partial correlation between sales and $X_1$ has been adjusted for the effect produced by variation in $X_2$ (and other independent variables), the correlation coefficient obtained from the bivariate regression will not be the same as the partial coefficient in the multiple regression. In multiple regression the coefficient $B_1$ is defined as a partial regression coefficient for which the other independent variables are held constant.

The coefficient of multiple determination, $R^2$, indicates the percentage of variation in $Y$ explained by the variation in the independent variables. $R^2 = .845$ tells us that the variation in the independent variables accounted for 84.5% of the variance in the dependent variable. Adding more independent variables to the equation explains more of the variation in $Y$.

To test for statistical significance, an F-test comparing the different sources of variation is necessary.
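As a rough cross-check on the toy example, $R^2$ and the F ratio are linked by the standard identity $F = (R^2/k) / ((1 - R^2)/(n - k - 1))$. The sketch below assumes $k = 3$ predictors and 8 denominator degrees of freedom, the figures reported for the example's F-test; the function name is illustrative.

```python
def f_from_r_squared(r_squared, k, df_error):
    """F ratio implied by R^2 with k predictors and df_error = n - k - 1."""
    explained_per_df = r_squared / k
    unexplained_per_df = (1.0 - r_squared) / df_error
    return explained_per_df / unexplained_per_df


f_value = f_from_r_squared(0.845, k=3, df_error=8)
print(round(f_value, 2))
```

The result is close to the reported F value of 14.6; a small gap of this size is what one would expect if $R^2$ had been rounded to three decimals.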
The F-test compares the relative magnitudes of the sum of squares due to the regression ($SS_r$) and the error sum of squares ($SS_e$), each divided by its appropriate degrees of freedom:

$$F = \frac{SS_r / k}{SS_e / (n - k - 1)}$$

where $k$ = number of independent variables and $n$ = number of respondents or observations.

Refer to the F tables and test the hypothesis at the .05 level of significance. In the example, the F ratio = 14.6, with degrees of freedom for the numerator = $k$ = 3 and for the denominator = $n - k - 1$ = 8. Accept or reject $H_0$ on the basis of a comparison between the calculated value and the table value.

A continuous, interval-scaled dependent variable is required in multiple regression, as in bivariate regression. Interval scaling is also required for the independent variables. However, dummy variables, such as the binary variable in our example, may be used. A dummy variable is one that has two or more distinct levels, coded 0 and 1.

Multiple regression is used as a descriptive tool in three types of situations:

1. It is often used to develop a self-weighting estimating equation by which to predict values for a criterion variable (DV) from the values of several predictor variables (IVs).
2. A descriptive application of multiple regression calls for controlling for confounding variables to better evaluate the contribution of other variables, e.g. controlling for brand to study the effect of price alone.
3. It is used to test and explain causal theories. In this approach, referred to as path analysis, regression is used to describe an entire structure of linkages that have been advanced from causal theories.

In addition, multiple regression is used as an inference tool to test hypotheses and estimate population values.

Let us look at the following example for SPSS. Assume that we use multiple regression to arrive at the key drivers of customer usage for hybrid mail. Among the explanatory variables are customer perceptions of (1) cost/speed valuation, (2) security, (3) reliability, (4) receiver technology and (5) impact/emotional value. Let us choose the first three variables, all measured on a 5-point scale:

- $Y$ = customer usage
- $X_1$ = cost/speed valuation
- $X_2$ = security
- $X_3$ = reliability

SPSS computes the model and the regression coefficients. The equation can be built with

1. specific variables,
2. all variables, or
3. a method that sequentially adds or removes variables.

Forward selection starts with the constant and adds variables that result in large $R^2$ increases. Backward elimination begins with a model containing all independent variables and removes the variables that change $R^2$ the least. Stepwise selection, the most popular, combines the two. The independent variable that contributes the most to explaining the dependent variable is added first. Subsequent variables are added based on their incremental contribution over the first variable whenever they meet the criterion for entering the equation (e.g. a significance level of .01). Variables may be removed at each step if they meet the removal criterion, which is a larger significance level than for entry.

The standard elements of a stepwise output are shown in the exhibit.

##### Collinearity and Multicollinearity

Collinearity, or multicollinearity, is a situation where two or more of the independent variables are highly correlated, and this can have a damaging effect on multiple regression. When this condition exists, the estimated regression coefficients can fluctuate widely from sample to sample, making it risky to interpret the coefficients as an indicator of the relative importance of predictor variables. Just how high can acceptable correlations between independent variables be? There is no definitive answer, but correlations of .80 or greater should be dealt with in one of the following two ways:

1. Choose one of the variables and delete the other.
2. Create a new variable that is a composite of the highly intercorrelated variables and use this variable in place of its components.

Making this decision with a correlation matrix alone is not sufficient. The exhibit shows a VIF (variance inflation factor) index. This is a measure of the effect of the other independent variables on a regression coefficient. Large values, of 10 or more, suggest collinearity or multicollinearity.
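To make the VIF threshold concrete, here is a minimal sketch for the simplest case of exactly two predictors, where the VIF reduces to $1/(1 - r^2)$ with $r$ the correlation between them. With more predictors, $r^2$ is replaced by the $R^2$ from regressing each predictor on all the others. The ratings below are hypothetical and are not taken from the exhibit.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when there are exactly two: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical 5-point ratings from eight respondents.
security = [1, 2, 2, 3, 3, 4, 5, 5]
reliability = [1, 2, 2, 3, 4, 4, 5, 5]   # tracks security closely
cost_speed = [3, 5, 1, 4, 2, 5, 1, 3]    # largely unrelated to security

vif_high = vif_two_predictors(security, reliability)
vif_low = vif_two_predictors(security, cost_speed)
print(round(vif_high, 1), round(vif_low, 2))
```

The closely related pair lands above the rule-of-thumb value of 10, while the unrelated pair stays near 1, the value a predictor has when it shares nothing with the others.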
In the hybrid mail example, with only three predictors, collinearity is not a problem.

Another difficulty with regression occurs when researchers fail to evaluate the equation with data beyond those used originally to calculate it. A solution is to set aside a portion of the data and use only the remainder to estimate the equation. The portion set aside is called a holdout sample. One then uses the equation on the holdout data to calculate $R^2$. This can then be compared to the original $R^2$ to determine how well the equation predicts beyond the database.

#### Discriminant Analysis

In a myriad of situations the researcher's purpose is to classify objects, by a set of independent variables, into two or more mutually exclusive categories. A manager might want to distinguish between applicants as those to hire and those not to hire. The challenge is to find the discriminating variables to be used in a predictive equation that will produce better-than-chance assignment of the individuals to the groups.

The prediction of a categorical variable (rather than a continuous, interval-scaled variable as in multiple regression) is the purpose of multiple discriminant analysis. In each such problem the researcher must determine which variables are associated with the probability of an object falling into a particular group. In a statistical sense, the problem of studying the direction of group differences is the problem of finding a linear combination of independent variables, the discriminant function, that shows large differences in group means. Discriminant analysis is a statistical tool for determining such linear combinations. Deriving the coefficients of the linear function is the task of the researcher.

We will consider a two-group discriminant analysis problem where the dependent variable $Y$ is measured on a nominal scale (n-way discriminant analysis is also possible). Suppose a personnel manager believes that it is possible to predict whether an applicant will be successful on the basis of age, sales aptitude test scores and mechanical ability test scores. As stated at the outset, the problem is to find a linear combination of the independent variables that shows large differences in group means. The first task is to estimate the coefficients of the individuals' discriminant scores. The following linear function is used:

$$Z_i = b_1X_{1i} + b_2X_{2i} + \dots + b_nX_{ni}$$

where

- $X_{ni}$ = the ith applicant's value on the nth independent variable
- $b_n$ = discriminant coefficient for the nth variable
- $Z_i$ = the ith applicant's discriminant score

Using scores for all individuals in the sample, a discriminant function is determined based on the criterion that the groups be maximally differentiated on the set of independent variables. Returning to the example with three independent variables, suppose the personnel manager calculates the standardized weights in the equation to be

$$Z = b_1X_1 + b_2X_2 + b_3X_3 = .069X_1 + .013X_2 + .0007X_3$$

This means that age ($X_1$) is much more important than the sales aptitude test score ($X_2$), and that mechanical ability ($X_3$) has relatively little discriminating power.

In the computation of the linear discriminant function, weights are assigned to the variables such that the ratio of the difference between the means of the two groups to the standard deviation within the groups is maximized. The standardized discriminant coefficients, or weights, provide information about the relative importance of each of these variables in discriminating between the groups.

An important goal of discriminant analysis is to perform a classification function.
The object of classification in our example is to predict which applicants will be successful and which will be unsuccessful, and to group them accordingly. To determine whether discriminant analysis can be used as a good predictor, the information provided in the "confusion matrix" is used. Suppose the personnel manager has 40 successful and 45 unsuccessful employees in the sample. The confusion matrix shows that the number of correctly classified employees (76%) is much higher than would be expected by chance. Tests can be performed to determine whether the rate of correct classification is statistically significant.

A second example will allow us to portray discriminant analysis (DA) from a graphic perspective. Suppose a bank loan officer wants to segregate corporate loan applicants into those likely to default and those not likely to default. Assume that some data are available on a group of firms that went bankrupt and another group that did not. For simplicity we assume that only the current ratio and the debt/asset ratio are analysed. The ratios for the sample firms are given.

The data in the table have been plotted in the graph. Xs represent firms that went bankrupt and dots represent firms that remained solvent. For example, point A in the upper left segment is the point for firm 2, which had a current ratio of 3.0 and a debt/asset ratio of 20%. The dot at point A indicates that the firm did not go bankrupt. From a graphic perspective, we construct a boundary line (the discriminant function) through the graph such that a firm to the left of the line is not likely to become insolvent. In our example the line takes this form:

$$Z = a + b_1(\text{current ratio}) + b_2(\text{debt/asset ratio})$$

Here $a$ is a constant term, and $b_1$ and $b_2$ indicate the effect that the current ratio and the debt/asset ratio have on the probability of a firm going bankrupt. The following discriminant function is obtained:

$$Z = -.3877 - 1.0736(\text{current ratio}) + .0579(\text{debt/asset ratio})$$

This equation may be plotted in the graph as the locus of points for which $Z = 0$. All combinations of current ratio and debt/asset ratio on the line result in $Z = 0$. Companies that lie to the left of the line are not likely to go bankrupt, while those to the right are likely to fail. It can be seen from the graph that one X, indicating a failed company, lies to the left of the line, while two dots, indicating non-bankrupt companies, lie to the right of the line. Thus the DA failed to classify three companies properly.

Once we have determined the parameters of the discriminant function, we can calculate the Z scores for our hypothetical companies. They may be interpreted as follows:

- $Z = 0$: a 50-50 probability of future bankruptcy (say within 2 years). The company lies on the boundary line.
- $Z < 0$: there is less than a 50% probability of bankruptcy. The smaller the Z score, the lower the probability of bankruptcy.
- $Z > 0$: the probability of bankruptcy is greater than 50%. The larger the Z score, the greater the probability of bankruptcy.

The mean Z score of the companies that did not go bankrupt is -.583, while that of the bankrupt firms is +.648. These means, along with approximations of the Z-score probability distributions of the two groups, are graphed. We may interpret this graph as indicating that if Z is less than about -.3 there is only a small probability that a firm will turn out to be bankrupt, while if Z is greater than about +.3 there is only a small probability that it will remain solvent. If Z is in the range ±.3, we are highly uncertain as to how the firm will be classified. This range is the zone of ignorance.

The signs of the coefficients of the discriminant function are logical.
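The scoring and the ±.3 zone of ignorance described above can be sketched as follows. The function names and labels are illustrative, and the debt/asset ratio is assumed to be entered in percentage points (20 for 20%), which is the scale on which the fitted coefficients give a plausible boundary line.

```python
def z_score(current_ratio, debt_asset_pct):
    """Discriminant score from the fitted function in the text."""
    return -0.3877 - 1.0736 * current_ratio + 0.0579 * debt_asset_pct


def classify(z, zone=0.3):
    """Label a firm using the +/-0.3 zone of ignorance described above."""
    if z < -zone:
        return "likely solvent"
    if z > zone:
        return "likely bankrupt"
    return "zone of ignorance"


# Firm 2 from the text: current ratio 3.0, debt/asset ratio 20%.
z_firm2 = z_score(3.0, 20)
print(round(z_firm2, 4), classify(z_firm2))
```

Firm 2 gets a clearly negative score, which agrees with the dot at point A marking a firm that did not go bankrupt.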
Since the coefficient of the current ratio is negative, the larger the current ratio, the lower a company's Z score, and the lower the Z score, the smaller the probability of failure. Similarly, high debt ratios produce high Z scores, which means a higher probability of bankruptcy.

The discriminant function we have been discussing has only two variables, but other variables such as ROA (rate of return on assets) can be introduced:

$$Z = a + b_1(\text{current ratio}) + b_2(\text{D/A}) + b_3(\text{ROA})$$

#### Multivariate Analysis of Variance (MANOVA)

MANOVA is used when there are multiple interval- or ratio-scaled dependent variables. There may be one or more nominally scaled independent variables. By manipulating the sales compensation system in an experimental situation, and by holding the compensation system constant in a control situation, a researcher may be able to identify the effect of a new compensation system on sales volume as well as on job satisfaction and turnover. With MANOVA, a significance test of mean differences between groups can be made simultaneously for two or more dependent variables.

MANOVA assesses the relationship between two or more dependent variables and classificatory variables or factors. In business research MANOVA can be used to test differences among samples of employees, customers, manufactured items, etc.

MANOVA is similar to univariate ANOVA, with the added ability to handle several dependent variables. MANOVA employs sums-of-squares-and-cross-products (SSCP) matrices to test for differences among groups. The variance between groups is determined by partitioning the total SSCP matrix and testing for significance. The F ratio, generalized to a ratio of the within-group variance and total-group variance matrices, tests for equality among treatment groups.

The central hypothesis of MANOVA is that all centroids (multivariate means) are equal. When $H_0$ is rejected, additional tests are done to better understand the data. Several alternatives may be considered:

1. Univariate F-tests can be run on the dependent variables.
2. Simultaneous confidence intervals can be produced for each variable.
3. Stepdown analysis, like stepwise regression, can be run by computing F values successively. Each value is computed after the effects of the previous dependent variables are eliminated.
4. Multiple discriminant analysis can be used on the SSCP matrices. This aids in the discovery of which variables contribute to the MANOVA significance.

When MANOVA is applied properly, the dependent variables are correlated. If the dependent variables are unrelated, there would be no need for a multivariate test, and we could use separate F-tests for each characteristic.

#### Conjoint Analysis

In management research the most common applications for conjoint analysis are market research and product development. A customer buying a computer may evaluate a set of attributes to choose the product that best meets their needs. They may consider brand, speed, price, educational value, games or capacity for work-related tasks. The attributes and their features require the buyer to make trade-offs in the final decision making.

##### Method

Conjoint analysis (CA) uses input from nonmetric independent variables. Normally we would use cross-classification tables to handle such data, but even multiway tables become complex. Suppose there were three prices, three brands, three speeds, two levels of educational value, two categories for games and two categories for work assistance. The model would have 3 × 3 × 3 × 2 × 2 × 2 = 216 combinations. This poses enormous difficulties for respondents and researchers.
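The size of the full design can be checked by enumerating every combination of factor levels. This is only a sketch: the level labels are placeholders, and only the number of levels per factor comes from the text.

```python
from itertools import product

# Level labels are placeholders; only the counts (3, 3, 3, 2, 2, 2) come from the text.
factors = {
    "price": ["low", "medium", "high"],
    "brand": ["brand A", "brand B", "brand C"],
    "speed": ["slow", "medium", "fast"],
    "educational_value": ["low", "high"],
    "games": ["no", "yes"],
    "work_assistance": ["no", "yes"],
}

# Every combination of one level per factor is one candidate product card.
profiles = [dict(zip(factors, levels)) for levels in product(*factors.values())]
print(len(profiles))
print(profiles[0])
```

Asking a respondent to rank a deck of this size is impractical, which is why only a subset of the cards is normally presented.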
Conjoint analysis handles this large number of combinations with various optimal scaling approaches, often with log-linear models, to provide reliable answers. The objective of CA is to obtain utility scores that represent the importance of each aspect of the product in the subject's overall preference rankings or ratings of a set of cards. Each card in the deck describes one possible configuration of combined product attributes.

The first step in a conjoint analysis is to select the attributes most pertinent to the purchase decision. This may require an exploratory study such as a focus group, or it may be done by an expert. The attributes selected are the independent variables, called factors. The possible values for an attribute are called factor levels. Factors like speed can be quantified, while others like brand are discrete variables.

After the factors and their factor levels are selected, a computer programme determines the number of product descriptions necessary to estimate the utilities. The SPSS procedures ORTHOPLAN, PLANCARDS and CONJOINT build a file structure for all possible combinations, generate the subset required for testing, produce the card descriptions and analyse the results. The command structure within these procedures provides for holdout sampling, simulations and other requirements frequently used in commercial applications.

### Interdependence Methods

#### Factor Analysis

Factor analysis (FA) is a general term for several specific computational techniques. It has the objective of reducing to a manageable number many variables that belong together and have many overlapping measurement characteristics. The predictor-criterion relationship that was found in the dependence situation is replaced by a matrix of intercorrelations among several variables, none of which is viewed as dependent on another. For example, one may have data on 100 employees with scores on 6 attitude scale items.

##### Method

FA begins with the construction of a new set of variables based on the relationships in the correlation matrix. While this can be done in a number of ways, the most frequently used approach is principal components analysis. This method transforms a set of variables into a new set of composite variables, or principal components, that are not correlated with one another. These linear combinations of variables, called factors, account for the variance in the data as a whole.
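As an illustration of how a principal component is obtained from a correlation matrix, the sketch below extracts only the first component by power iteration, a simple numerical method chosen here for brevity (statistical packages use full eigen-decompositions). The 3 × 3 correlation matrix is hypothetical.

```python
import math

# Hypothetical correlation matrix for three attitude items (A, B, C).
R = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]


def first_principal_component(corr, iterations=200):
    """Power iteration: largest eigenvalue of corr and its unit eigenvector."""
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' R v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v


eigenvalue, weights = first_principal_component(R)
# Loadings: correlations between each item and the component.
loadings = [w * math.sqrt(eigenvalue) for w in weights]
share_of_variance = eigenvalue / len(R)
print(round(eigenvalue, 3), [round(x, 2) for x in loadings], round(share_of_variance, 2))
```

The squared loadings sum to the eigenvalue, and dividing the eigenvalue by the number of variables gives the share of total variance the component explains.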
The best combination makes up the first principal component. The second principal component is defined as the best linear combination of variables for explaining the variance not accounted for by the first factor. In turn there may be a third, fourth and kth component, each being the best linear combination of variables not accounted for by the previous factors.

The process continues until all the variance has been accounted for, but it is usually stopped after a few factors have been extracted.

The values in the table are correlation coefficients between the factors and the variables (.70 is the r between factor I and variable A). These correlation coefficients are called loadings. Eigenvalues are the sum of the variances of the factor values (for factor I, .70² + .60² + … + .60²). When divided by the number of variables, an eigenvalue yields an estimate of the amount of total variance explained by the factor. For example, factor I accounts for 36% of the total variance. The column h² gives the communalities, or estimates of the variance in each variable that is explained by the two factors. With variable A the communality is .70² + (-.40)² = .65, indicating that 65% of the variance in A is statistically explained in terms of factors I and II.

In this case the unrotated factor loadings are not enlightening. We would like to find a pattern in which factor I has a high r on some variables and factor II on others. We can attempt to secure this less ambiguous condition between factors and variables by rotation. This procedure can be carried out by either orthogonal or oblique methods.

The interpretation of factor loadings is largely subjective. There is no way to calculate the meaning of the factors. They are what one sees in them. For this reason factor analysis is largely used for exploration. One can detect patterns in latent variables, discover new concepts and reduce data.

In order to clarify the factors further, a varimax rotation is used to secure the rotated matrix. Varimax can clarify relationships, but interpretation remains largely subjective.

#### Cluster Analysis

Cluster analysis is a technique for grouping similar objects or people. It shares some similarities with FA, especially when FA is applied to people instead of variables. It differs from discriminant analysis in that DA begins with a well-defined group composed of two or more distinct sets of characteristics, in search of a set of variables to separate them. Cluster analysis starts with an undifferentiated group of people, events or objects and attempts to reorganize them into homogeneous subgroups.

##### Method

Five steps are basic to the application of cluster studies:

1. Selection of the sample to be clustered (e.g. buyers, employees, etc.).
2. Definition of the variables on which to measure the objects, events or people (financial status, political affiliation, etc.).
3. Computation of similarities among the entities through correlation and other techniques.
4. Selection of mutually exclusive clusters (maximization of within-cluster similarity and between-cluster differences) or hierarchically arranged clusters.
5. Cluster comparison and validation.

Different clustering methods can and do produce different solutions. It is important to have enough information about the data to know when the derived groups are real and not merely imposed on the data by the method.

Cluster analysis can be used to plan marketing campaigns and develop strategies.

#### Multidimensional Scaling

Multidimensional scaling (MDS) creates a spatial description of a respondent's perception about a product, service or any other object of interest. This helps the business researcher to understand difficult-to-measure constructs such as product quality or desirability. In contrast to variables that can be measured directly, many constructs are perceived and cognitively mapped in different ways by individuals.
With MDS, items that are perceived to be similar will fall close together in multidimensional space, and items that are dissimilar will be farther apart.

##### Method

We may think of three types of attribute space, each representing a multidimensional map:

1. Objective space, in which an object can be positioned in terms of its measurable attributes: flavour, weight, nutritional value.
2. Subjective space, in which perceptions about the object's flavour, weight and nutritional value can be positioned. Objective and subjective attribute assessments may coincide, but often they do not. A comparison of the two allows us to judge how accurately an object is being perceived. Individuals may hold different perceptions of an object simultaneously, and these may be averaged to present a summary measure of perception. A person's perception may also vary over time and in different circumstances. Such measurements are valuable for gauging the impact of various perception-affecting actions such as advertising programmes.
3. Preference space, which describes respondents' preferences using the object's attributes. This represents their ideal. All objects close to this ideal point are interpreted as preferred by respondents to those that are more distant. Ideal points from many people can be positioned in this preference space to reveal the pattern and size of preference clusters. These can be compared to the subjective space to assess how well the preferences correspond to perception clusters. In this way cluster analysis and MDS can be combined to map market segments and then design products for those segments.
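The ideal-point reading of preference space can be sketched with plain Euclidean distances: the closer an object sits to a respondent's ideal point, the more it is taken to be preferred. The brands, attribute scales and coordinates below are hypothetical, and a real MDS study would first derive such coordinates from similarity judgments.

```python
import math

# Hypothetical perceived positions on two attributes (flavour, nutritional value),
# each on a 1-10 scale. Names and numbers are illustrative only.
perceived = {
    "brand A": (8.0, 3.0),
    "brand B": (5.0, 6.0),
    "brand C": (2.0, 9.5),
}
ideal_point = (6.0, 7.0)   # one respondent's ideal combination (assumed)


def distance(p, q):
    """Euclidean distance between two points in attribute space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Objects closer to the ideal point are read as more preferred.
ranking = sorted(perceived, key=lambda name: distance(perceived[name], ideal_point))
print(ranking)
```

Repeating this for many respondents' ideal points is one simple way to see where preferences cluster relative to the perceived positions.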
