
ONE SAMPLE T-TEST

OBJECTIVE To conduct one sample T-test using SPSS.

PROBLEM A major oil company developed a petrol additive that was supposed to increase engine efficiency. Twenty-two cars were test driven both with and without the additive, and the number of kilometers per liter was recorded. Whether the car was automatic or manual was also recorded and coded as 1 = manual and 2 = automatic. During an earlier trial, 22 cars were test driven using the additive. The mean number of kilometers per liter was 10.5.

NULL HYPOTHESIS There is no significant difference in engine efficiency between the present trial and the earlier trial.

ALTERNATE HYPOTHESIS There is a significant difference in engine efficiency between the present trial and the earlier trial.

PROCEDURE

1. Select the Analyze menu.
2. Click on Compare Means and then One-Sample T Test to open the One-Sample T Test dialogue box.
3. Select withadd and move the variable into the Test Variable(s): box.
4. In the Test Value: box, type the mean score (10.5).
5. Click OK.

OUTPUT

ONE SAMPLE T-TEST

One-Sample Statistics

           N    Mean    Std. Deviation   Std. Error Mean
withadd    22   13.86   2.748            .586

One-Sample Test (Test Value = 10.5)

                                                             95% Confidence Interval of the Difference
           t       df   Sig. (2-tailed)   Mean Difference    Lower    Upper
withadd    5.741   21   .000              3.364              2.15     4.58

INFERENCE

1. The difference between the sample mean and the hypothesized mean is assessed by consulting the t-value, degrees of freedom (df) and two-tail significance.
2. If the value for two-tail significance is less than .05 (p < .05), then the difference between the means is significant.
3. The cars in the present trial appear to have greater engine efficiency than those in the earlier trial, t(21) = 5.74, p < .05.
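The t-value in the output can be checked by hand from the summary statistics. The following is a minimal sketch, not part of the SPSS procedure: the function name is illustrative, the sample mean is taken as 10.5 + 3.364 because the table rounds it to 13.86, and the critical value 2.080 is the two-tailed 5% value of t for 21 degrees of freedom taken from a standard t table.

```python
import math

def one_sample_t(mean, sd, n, test_value):
    """Return (t, df, standard error) for a one-sample t-test from summary statistics."""
    se = sd / math.sqrt(n)          # standard error of the mean
    t = (mean - test_value) / se    # distance from the test value in standard errors
    return t, n - 1, se

# Figures read from the One-Sample Statistics and One-Sample Test tables.
mean_diff = 3.364
t_stat, df, se = one_sample_t(mean=10.5 + mean_diff, sd=2.748, n=22, test_value=10.5)

# Assumption: two-tailed 5% critical value of t for 21 df, from a t table.
T_CRIT_21 = 2.080
ci = (mean_diff - T_CRIT_21 * se, mean_diff + T_CRIT_21 * se)

print(f"t({df}) = {t_stat:.3f}, 95% CI of the difference = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The computed t and confidence limits should agree with the SPSS table to within rounding.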

RESULT The output indicates that there is a significant difference in engine efficiency between the present trial and the earlier trial.

INDEPENDENT SAMPLE T-TEST

OBJECTIVE:

To test whether the opinions of two groups of people differ, using the independent-samples t-test in SPSS.

PROBLEM:

As marketers of brand jeans, we want to find out whether a set of customers in Delhi and set of customers in Mumbai thought of our brand in the same way or not. A small survey was conducted in both the cities and the ratings were obtained on an interval scale 1-7. We want to find out whether the two sets of rating are significantly different.

S.NO. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

RATING 2 3 3 4 5 4 4 5 3 4 5 4 3 3 4

CITY 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

S.NO. 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

RATING 3 4 5 6 5 5 5 4 3 3 5 6 6 6 5

CITY 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

NULL HYPOTHESIS There is no significant difference between the ratings given by the customers in Mumbai and Delhi at the 95% confidence level.

ALTERNATE HYPOTHESIS

There is a significant difference between the ratings given by the customers in Mumbai and Delhi at the 95% confidence level.

PROCEDURE:

1. The variables are entered in the variable view of the SPSS data editor, where city is a categorical variable with nominal measure and the respondents' ratings use scale measure.
2. In the value cell for city, enter the label values as 1 - Mumbai and 2 - Delhi.
3. The given data is entered in the data view.
4. Choose Analyze from the main menu.
5. Then choose Compare Means > Independent-Samples T Test.
6. In the Independent-Samples T Test dialogue box, the ratings given by the respondents are entered as the test variable and the city they belong to is entered as the grouping variable.
7. Click Define Groups and enter the specified values for the groups.
8. The output is generated and analyzed, and the inference is obtained.

OUTPUT:

Group Statistics

                      Respondent's city   N    Mean   Std. Deviation   Std. Error Mean
Respondent's rating   Mumbai              15   3.73   .884             .228
                      Delhi               15   4.73   1.100            .284

Independent Samples Test

Levene's Test for Equality of Variances (Ratings)

F       Sig.
.727    .401

t-test for Equality of Means (Ratings)

                                                                                                   95% Confidence Interval of the Difference
                              t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower    Upper
Equal variances assumed       -2.745   28       .010              -1.000            .364                    -1.746   -.254
Equal variances not assumed   -2.745   26.759   .011              -1.000            .364                    -1.748   -.252

INFERENCE:

1. The independent-samples t-test procedure compares the two group means (Mumbai and Delhi).
2. The mean values for the two groups are displayed in the Group Statistics table (3.73 - 4.73 = -1.00).
3. One test assumes that the variances of the two groups are equal. Levene's test checks this assumption.
4. The significance value for Levene's test is high (0.401, which is greater than the usual cut-off of 0.10), so equal variances are assumed for the two groups and the second test (equal variances not assumed) is ignored.
5. The significance value for the t-test (0.010) is less than 0.05, and the confidence interval for the mean difference does not contain zero.
6. So the null hypothesis is rejected and the alternate hypothesis is accepted. This indicates that there is a significant difference between the two group means.
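The "equal variances assumed" row can be reproduced from the raw ratings. This is a minimal sketch using only the Python standard library; the function name is illustrative, and the city labels follow the coding in the procedure (1 = Mumbai, 2 = Delhi).

```python
import math
from statistics import mean, variance

# Ratings from the data table: S.No. 1-15 are city 1, S.No. 16-30 are city 2.
mumbai = [2, 3, 3, 4, 5, 4, 4, 5, 3, 4, 5, 4, 3, 3, 4]
delhi = [3, 4, 5, 6, 5, 5, 5, 4, 3, 3, 5, 6, 6, 6, 5]

def pooled_t(a, b):
    """Independent-samples t-test with a pooled variance (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # std. error of the difference
    diff = mean(a) - mean(b)
    return diff / se, df, diff, se

t_stat, df, diff, se = pooled_t(mumbai, delhi)
print(f"t({df}) = {t_stat:.3f}, mean difference = {diff:.3f}, SE = {se:.3f}")
```

The result should match the first row of the t-test table; the significance value itself still has to be read from the SPSS output or a t table.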

RESULT:

There is a significant difference in the ratings on the brand, given by the respondents in the cities of Mumbai and Delhi.

PAIRED SAMPLE T-TEST

OBJECTIVE To conduct a Paired Sample T-Test using SPSS.

PROBLEM A major oil company developed a petrol additive that was supposed to increase engine efficiency. Twenty-two cars were test driven both with and without the additive, and the number of kilometers per liter was recorded. Whether the car was automatic or manual was also recorded and coded as 1 = manual and 2 = automatic. Does engine efficiency improve when the additive is used? This is a repeated-measures t-test design.

NULL HYPOTHESIS There is no significant difference between engine efficiency with and without the additive.

ALTERNATE HYPOTHESIS There is a significant difference between engine efficiency with and without the additive.

PROCEDURE

1. Select the Analyze menu.
2. Click on Compare Means and then Paired-Samples T Test to open the Paired-Samples T Test dialogue box.
3. Select the variables without and withadd and move them into the Paired Variables: box.
4. Click OK.

OUTPUT

PAIRED SAMPLE T-TEST

Paired Samples Statistics

                   Mean    N    Std. Deviation   Std. Error Mean
Pair 1   without   8.50    22   (illegible)      (illegible)
         withadd   13.86   22   2.748            .586

Paired Samples Correlations

                             N    Correlation   Sig.
Pair 1   without & withadd   22   (illegible)   (illegible)

Paired Samples Test (Paired Differences)

                                                                 95% Confidence Interval of the Difference
                             Mean     Std. Deviation   Std. Error Mean   Lower    Upper    t        df   Sig. (2-tailed)
Pair 1   without - withadd   -5.364   2.904            .619              -6.651   -4.076   -8.663   21   .000

INFERENCE

1. It can be determined whether the groups come from the same or different populations.
2. The significance is determined by looking at the probability level (p) specified under the heading two-tail significance.
3. If the probability value is less than the specified alpha value, then the observed t-value is significant.
4. The 95 percent confidence interval indicates that 95 percent of the time the interval specified will contain the true difference between the population means.
5. The additive significantly improves the number of kilometers to the liter, t(21) = 8.66, p < .05.

RESULT The output shows that there is a significant difference between engine efficiency with and without the additive.
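The paired t-statistic is the mean of the pairwise differences divided by its standard error. The sketch below illustrates that calculation only. The raw readings for the 22 cars are not listed in this exercise, so the five pairs of values are hypothetical and the function name is illustrative.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Paired-samples t-test: returns (t, df) for the differences first - second."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return mean(diffs) / se, n - 1

# HYPOTHETICAL km-per-liter readings for five cars (not the data behind the SPSS output).
without = [8.0, 9.0, 7.5, 10.0, 8.5]
withadd = [12.0, 14.0, 13.5, 15.0, 12.5]

t_stat, df = paired_t(without, withadd)
print(f"t({df}) = {t_stat:.3f}")
```

As in the SPSS table, a negative t here means that the second variable (withadd) has the higher mean.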

ONE WAY ANOVA

OBJECTIVE:

To test the preferred ad copy by the target population before the launch of its campaign.

PROBLEM:

There are three different versions of advertising copy created by an advertising agency for a campaign. Let us call these versions of copy as adcopy 1, 2 and 3. A sample of 18 respondents is selected from the target population in the nearby areas of the city. At random, these 18 respondents are assigned to the 3 versions of ad copy. Each version of ad copy is thus shown to six of the respondents. The respondents are asked to rate their liking for the ad copy shown to them on a scale of 1 to 10. (1 = Not liked at all, 10 = Liked a lot, and other values in between these two).

S. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18

Ad copy 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3

Rating 6.00 7.00 5.00 8.00 8.00 8.00 4.00 4.00 5.00 7.00 7.00 6.00 5.00 5.00 4.00 7.00 8.00 7.00

Null Hypothesis There is no difference in the ratings between the three versions of the ad copy at 95% confidence level.

Alternative Hypothesis

There is a significant difference between the three versions of the ad copy at 95% confidence level.

PROCEDURE:

1. The variables are defined in the variable view and the given data is entered in the data view.
2. Choose Analyze > Compare Means > One-Way ANOVA.
3. In the One-Way ANOVA dialog box, select ratings for the Dependent List and ad copy as the Factor.
4. Select other options as required.
5. The output is generated and analyzed, and the inference is obtained.

OUTPUT:

Descriptives (Ratings)

                                                       95% Confidence Interval for Mean
           N    Mean     Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Ad copy1   6    7.0000   1.26491          .51640       5.6726        8.3274        5.00      8.00
Ad copy2   6    5.5000   1.37840          .56273       4.0535        6.9465        4.00      7.00
Ad copy3   6    6.0000   1.54919          .63246       4.3742        7.6258        4.00      8.00
Total      18   6.1667   1.46528          .34537       5.4380        6.8953        4.00      8.00

Test of Homogeneity of Variances (Ratings)

Levene Statistic   df1   df2   Sig.
.536               2     15    .596

ANOVA (Ratings)

                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   7.000            2    3.500         1.780   .203
Within Groups    29.500           15   1.967
Total            36.500           17

INFERENCE:

1. The descriptives of the ratings are obtained in terms of mean and standard deviation.

2. The mean values of the three versions of ad copy are displayed.

3. The significance value for Levene's test of homogeneity of variances is high (0.596 > 0.05), so the variances for the three versions can be treated as equal and the assumption underlying the ANOVA is justified.

4. In the ANOVA table, Sig. represents the significance level of the F-test. Since 0.203 > 0.05, the null hypothesis is not rejected and the alternate hypothesis is not accepted.
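The sums of squares and the F ratio in the ANOVA table can be recomputed from the 18 ratings. This is a minimal sketch with the standard library only; the function and variable names are illustrative.

```python
from statistics import mean

# Ratings grouped by ad copy version, taken from the data table.
ratings = {
    1: [6, 7, 5, 8, 8, 8],
    2: [4, 4, 5, 7, 7, 6],
    3: [5, 5, 4, 7, 8, 7],
}

def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between groups: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: spread of each rating around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(list(ratings.values()))
print(f"SSB = {ss_b:.3f}, SSW = {ss_w:.3f}, F({df_b}, {df_w}) = {f_stat:.3f}")
```

The F ratio is the between-groups mean square divided by the within-groups mean square, which is how SPSS arrives at the value in the ANOVA table.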

RESULT:

There is no significant difference in the preferences over the three versions of the ad copy by a target population before the launch of its campaign.

CORRELATION

OBJECTIVE:

To find the interrelationship between the dependent and the independent variables.

PROBLEM:

A manufacturer and marketer of electric motors would like to build a regression model relating sales to 5 or 6 independent variables. Past data has been collected for 15 sales territories on sales and 6 different independent variables. Build a regression model and recommend whether or not it should be used by the company.

Dependent Variable:

Sales (in Rs. Lakhs) in the territory.

Independent Variables:

X1   Market potential of the territory.
X2   No. of dealers of the company in the territory.
X3   No. of sales people in the territory.
X4   Index of the competitor activity in the territory on a 5-point scale.
X5   No. of service people in the territory.
X6   No. of existing customers in the territory.

S. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Sales (Y) 5 60 20 11 45 6 15 22 29 3 16 8 18 23 81

Potential (X1) 25 150 45 30 75 10 29 43 70 40 40 25 32 73 150

Dealers (X2) 1 12 5 2 12 3 5 7 4 1 4 2 7 10 15

People (X3) 6 30 15 10 20 8 18 16 15 6 11 9 14 10 35

Competition (X4) 5 4 3 3 2 2 4 3 2 5 4 3 3 4 4

Service (X5) 2 5 2 2 4 3 5 6 5 2 2 3 4 3 7

Customers (X6) 20 50 25 20 30 16 30 40 39 5 17 10 31 43 70

Null Hypothesis

There is no significant relationship between the independent and the dependent variables at the 95% confidence level.

Alternative Hypothesis

There is a significant relationship between the independent and the dependent variables at the 95% confidence level.

PROCEDURE: Let the estimating equation be Y = a0 + a1X1 + a2X2 + a3X3 + a4X4 + a5X5 + a6X6, where a0 is the constant.

1. The variables are defined in the variable view of the SPSS data editor.
2. Enter the data in the data view.
3. Choose Analyze > Correlate > Bivariate from the main menu.
4. In the Bivariate Correlations dialogue box, select all the dependent and independent variables.
5. Select the Pearson correlation coefficient, with the test of significance being one-tailed.
6. Also include the statistics for mean and standard deviation.
7. The output is generated and analyzed, and the inference is obtained.

OUTPUT:

Descriptive Statistics

                                                   Mean    Std. Deviation   N
Sales in Rs. lakhs in the territory                24.13   21.980           15
Market potential in the territory (in Rs. lakhs)   55.80   42.543           15
No. of dealers of the company in the territory     6.00    4.408            15
No. of sales people in the territory               14.87   8.340            15
Index of competitor activity in the territory      3.40    .986             15
No. of service people in the territory             3.67    1.633            15
No. of existing customers in the territory         29.73   16.829           15

Correlations

Pearson Correlation (N = 15 for every pair; values marked ** are flagged by SPSS as significant)

              Sales     Potential   Dealers   People    Competition   Service   Customers
Sales         1         .945**      .908**    .953**    -.046         .726**    .878**
Potential     .945**    1           .837**    .877**    .140          .613**    .831**
Dealers       .908**    .837**      1         .855**    -.082         .685**    .860**
People        .953**    .877**      .855**    1         -.036         .794**    .854**
Competition   -.046     .140        -.082     -.036     1             -.178     -.015
Service       .726**    .613**      .685**    .794**    -.178         1         .818**
Customers     .878**    .831**      .860**    .854**    -.015         .818**    1

Sig. (1-tailed)

              Sales   Potential   Dealers   People   Competition   Service   Customers
Sales                 .000        .000      .000     .436          .001      .000
Potential     .000                .000      .000     .309          .008      .000
Dealers       .000    .000                  .000     .385          .002      .000
People        .000    .000        .000               .449          .000      .000
Competition   .436    .309        .385      .449                   .263      .479
Service       .001    .008        .002      .000     .263                    .000
Customers     .000    .000        .000      .000     .479          .000

INFERENCE:

The correlations table shows Pearson correlation coefficients, significance values, and the number of cases with non-missing values.

1. The Pearson correlation coefficient measures the linear association between two variables. Its value ranges from -1 to 1.
2. The sign of the correlation coefficient indicates the direction of the relationship. Here there is a negative relationship between sales and the index of competitor activity, and a positive relationship between sales and market potential, number of dealers, number of sales people, number of service people and number of existing customers.
3. The absolute value of the correlation coefficient indicates the strength, with larger absolute values indicating stronger relationships.
4. The significance levels for market potential (0.000), number of service people (0.001) and number of existing customers (0.000) are less than 0.05, so the null hypothesis is rejected and the alternate hypothesis is accepted. The correlations are significant and these variables are linearly related to sales.
5. The significance level for the index of competitor activity (0.436) is greater than 0.05, so this correlation is not significant and the variable is not linearly related to sales.
6. This indicates that the manufacturer need not consider the index of competitor activity, since it is not significantly correlated with sales.
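Two of the coefficients discussed above can be recomputed directly from the territory data. This is a minimal sketch of the Pearson formula using the standard library; the function name is illustrative, and only the sales, market potential and competitor-index columns are used.

```python
import math

# Columns from the data table for the 15 territories.
sales = [5, 60, 20, 11, 45, 6, 15, 22, 29, 3, 16, 8, 18, 23, 81]
potential = [25, 150, 45, 30, 75, 10, 29, 43, 70, 40, 40, 25, 32, 73, 150]
competition = [5, 4, 3, 3, 2, 2, 4, 3, 2, 5, 4, 3, 3, 4, 4]

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their individual spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r_potential = pearson_r(sales, potential)
r_competition = pearson_r(sales, competition)
print(f"sales vs potential: {r_potential:.3f}, sales vs competition: {r_competition:.3f}")
```

The strong positive value for market potential and the near-zero negative value for competitor activity correspond to points 4 and 5 of the inference.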

RESULT:

Hence there is dependence between sales (dependent variable) and the market potential of the territory, number of dealers of the company in the territory, number of sales people in the territory, number of service people in the territory and number of existing customers in the territory (independent variables). The index of competitor activity in the territory and sales are negatively correlated, but this correlation is not significant.

FACTOR ANALYSIS

OBJECTIVE:

To find a smaller number of underlying factors that are linear combinations of the original 10 variables.

PROBLEM:

A two-wheeler manufacturer is interested in determining which variables potential customers think about when they consider his product. Twenty two-wheeler owners were surveyed by the manufacturer. They were asked to indicate, on a 7-point scale (1 = Completely agree to 7 = Completely disagree), their agreement or disagreement with a set of 10 statements relating to their perceptions and some attributes of the two-wheeler. Use factor analysis to find underlying factors which are fewer in number but are linear combinations of the original 10 variables.

TEN STATEMENTS:

1. I use a two-wheeler because it is affordable.
2. It gives me a sense of freedom to own a two-wheeler.
3. Low maintenance cost makes a two-wheeler very economical in the long run.
4. A two-wheeler is essentially a man's vehicle.
5. I feel very powerful when I am on my two-wheeler.
6. Some of my friends who don't have their own vehicle are jealous of me.
7. I feel good whenever I see the ad of my two-wheeler.
8. My vehicle gives me a comfortable ride.
9. I think two-wheelers are a safe way to travel.
10. Three people should be legally allowed to travel on a two-wheeler.

S.No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Q1 1 2 2 5 1 3 2 4 2 1 1 1 3 2 2 5 1 2 3 4

Q2 4 3 2 1 2 2 2 4 3 4 5 6 1 2 5 6 4 3 3 3

Q3 1 2 2 4 2 3 5 3 2 2 1 1 4 2 1 3 2 1 2 2

Q4 6 4 1 2 5 3 1 4 6 2 3 1 4 2 3 2 2 1 3 7

Q5 5 3 2 2 4 3 2 4 5 1 2 1 4 2 2 1 1 2 4 6

Q6 6 3 1 2 4 3 1 5 6 2 3 1 3 2 3 3 2 2 3 6

Q7 5 3 1 2 4 3 2 3 5 1 2 1 3 2 2 2 1 2 4 6

Q8 2 5 7 3 1 6 4 2 1 4 2 1 6 1 2 5 1 3 3 2

Q9 3 5 6 2 1 5 4 3 4 4 2 2 5 3 1 5 1 2 3 3

Q10 2 2 2 3 2 3 5 3 1 1 1 2 3 2 6 4 3 2 3 6

PROCEDURE:

1. The variables are defined in the variable view of the SPSS data editor.

2. Enter the given data in the data view.

3. Choose Analyze > Data reduction > Factor analysis from the main menu and enter the variables.

4. In the factor analysis dialogue box select the analysis variables and check the options as required.

5. The output chart is generated, analyzed and inference obtained.

OUTPUT:

Descriptive Statistics

                                    Mean   Std. Deviation   Analysis N
It is affordable                    2.35   1.309            20
Gives sense of freedom              3.25   1.482            20
Economical                          2.25   1.118            20
Man's vehicle                       3.10   1.804            20
Feel powerful                       2.80   1.508            20
Friends would be jealous            3.05   1.605            20
Feel good to see ad of my vehicle   2.70   1.455            20
Comfortable driving                 3.05   1.905            20
Safe way to travel                  3.20   1.508            20
3 people should be legally allowed  2.80   1.473            20

Communalities

                                    Initial   Extraction
It is affordable                    1.000     .722
Gives sense of freedom              1.000     .452
Economical                          1.000     .731
Man's vehicle                       1.000     .945
Feel powerful                       1.000     .950
Friends would be jealous            1.000     .914
Feel good to see ad of my vehicle   1.000     .955
Comfortable driving                 1.000     .799
Safe way to travel                  1.000     .777
3 people should be legally allowed  1.000     .789

Extraction Method: Principal Component Analysis.

Component Matrix (a)

                                    Component
                                    1       2       3
It is affordable                    .176    .670    .493
Gives sense of freedom              -.136   -.608   .254
Economical                          -.107   .820    .218
Man's vehicle                       .966    -.036   -.097
Feel powerful                       .951    .166    -.136
Friends would be jealous            .952    -.084   -.025
Feel good to see ad of my vehicle   .971    .096    -.046
Comfortable driving                 -.322   .775    -.308
Safe way to travel                  -.069   .735    -.482
3 people should be legally allowed  .161    .319    .814

Extraction Method: Principal Component Analysis.
a. 3 components extracted.

Component Score Coefficient Matrix

                                    Component
                                    1       2       3
It is affordable                    .004    .023    .434
Gives sense of freedom              -.063   -.278   .043
Economical                          -.041   .176    .283
Man's vehicle                       .256    .003    -.042
Feel powerful                       .257    .081    -.030
Friends would be jealous            .245    -.038   -.007
Feel good to see ad of my vehicle   .253    .026    .014
Comfortable driving                 -.047   .360    -.057
Safe way to travel                  .033    .406    -.166
3 people should be legally allowed  -.032   -.203   .568

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. Component Scores.

Component Score Covariance Matrix

Component   1       2       3
1           1.000   .000    .000
2           .000    1.000   .000
3           .000    .000    1.000

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. Component Scores.

INFERENCE:

Factor analysis is primarily used for data reduction or structure detection.

1. Communalities indicate the amount of variance in each variable that is accounted for by the extracted components.
2. The Component Matrix table reports the factor loadings for each variable on the unrotated components or factors.
3. The Rotated Component Matrix table (called the Pattern Matrix for oblique rotations) reports the factor loadings for each variable on the components or factors after rotation.
4. Group the variables which have high values on the same component.
5. Here man's vehicle, feel powerful, friends would be jealous and feel good to see ad of my vehicle have high values (greater than 0.5), so we can group them into component 1.
6. Similarly, economical, comfortable driving and safe way to travel have high values and hence are grouped in component 2.
7. Finally, it is affordable and three people should be legally allowed are grouped into component 3.
8. Sense of freedom has no high positive value on any of the three components, so this variable is eliminated.
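For orthogonal components, the extraction communality of a variable equals the sum of its squared loadings across the extracted components. The sketch below checks that relationship against the Communalities table. It assumes the loadings are the three columns of the Component Matrix in the order component 1, 2, 3; the dictionary and function names are illustrative.

```python
# Unrotated loadings (component 1, 2, 3) read from the Component Matrix.
loadings = {
    "It is affordable": (0.176, 0.670, 0.493),
    "Gives sense of freedom": (-0.136, -0.608, 0.254),
    "Economical": (-0.107, 0.820, 0.218),
    "Man's vehicle": (0.966, -0.036, -0.097),
    "Feel powerful": (0.951, 0.166, -0.136),
    "Friends would be jealous": (0.952, -0.084, -0.025),
    "Feel good to see ad of my vehicle": (0.971, 0.096, -0.046),
    "Comfortable driving": (-0.322, 0.775, -0.308),
    "Safe way to travel": (-0.069, 0.735, -0.482),
    "3 people should be legally allowed": (0.161, 0.319, 0.814),
}

def communality(row):
    """Share of a variable's variance explained by the extracted components."""
    return sum(value ** 2 for value in row)

communalities = {name: communality(row) for name, row in loadings.items()}

for name, value in communalities.items():
    print(f"{name:<36} {value:.3f}")
```

Small differences in the third decimal place from the SPSS Extraction column are expected, because the loadings themselves are rounded to three decimals.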

RESULT: The ten variables are clustered into three components.

DISCRIMINANT ANALYSIS

OBJECTIVE:

To conduct Discriminant Analysis for the given data using SPSS software

PROBLEM:

Conduct a discriminant analysis that predicts membership of two groups based on the dependent variable category, and create the discriminant equation from the 17 independent variables, selected by a stepwise procedure based on minimization of Wilks' Lambda at each step.

NULL HYPOTHESIS There is no discrimination in membership of two groups.

ALTERNATE HYPOTHESIS There is discrimination in membership of two groups.

PROCEDURE:

1. Select Analyze > Classify > Discriminant from the menu.
2. Select the grouping variable (1, 2).
3. Define the range: min 1 and max 2.
4. Select all the variables as independent variables and select Use stepwise method.
5. Click Statistics, check Means, Univariate ANOVAs, Box's M and Unstandardized, then click Continue.
6. Click Method, select Wilks' lambda and enter the F values Entry: 1.15 and Removal: 1, then click Continue.
7. Click Classify, check All groups equal, Casewise results, Summary table and Combined-groups, then click Continue.
8. Click OK.

OUTPUT:

Analysis Case Processing Summary

Unweighted Cases                                                             N    Percent
Valid                                                                        50   100.0
Excluded   Missing or out-of-range group codes                               0    .0
           At least one missing discriminating variable                     0    .0
           Both missing or out-of-range group codes and at least one
           missing discriminating variable                                   0    .0
           Total                                                             0    .0
Total                                                                        50   100.0

Box's Test of Equality of Covariance Matrices

Log Determinants

1=COMPLETED PHD, 2=DID NOT COMPLETE PHD   Rank   Log Determinant
FINISH                                    6      3.031
NOT FINISH                                6      2.918
Pooled within-groups                      6      3.800

The ranks and natural logarithms of determinants printed are those of the group covariance matrices.

Test Results

Box's M          39.633
F    Approx.     1.633
     df1         21
     df2         8474.108
     Sig.        .034

Tests null hypothesis of equal population covariance matrices.

Tests of Equality of Group Means

                                                    Wilks' Lambda   F        df1   df2   Sig.
OVERALL COLLEGE GPA                                 .795            12.356   1     48    .001
MAJOR AREA GPA                                      .951            2.493    1     48    .121
GRE SCORE ON SPECIALITY EXAM                        .998            .113     1     48    .738
GRE SCORE ON QUANTATIVE                             .628            28.393   1     48    .000
GRE SCORE ON VERBAL                                 .974            1.283    1     48    .263
FIRST LETTER OF RECOMMENDATION                      .650            25.889   1     48    .000
SECOND LETTER OF RECOMMENDATION                     .756            15.476   1     48    .000
THIRD LETTER OF RECOMMENDATION                      .534            41.969   1     48    .000
STUDENTS MOTIVATION                                 .679            22.722   1     48    .000
STUDENTS EMOTIONAL STABILITY                        1.000           .007     1     48    .934
FINAICIAL/PERSONAL RESOURCES TO COMPLETE            .993            .337     1     48    .564
AGE IN YEARS AT ENTRY                               .768            14.526   1     48    .000
ABILITY TO INTERACT EASILY                          .904            5.079    1     48    .029
RATING OF STUDENT HOSTILITY                         .787            13.017   1     48    .001
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT    .972            1.378    1     48    .246

Stepwise Statistics:

Variables Entered/Removed (a,b,c,d): [table not legible in the source]

Variables in the Analysis

Step   Variable                                           Tolerance   F to Remove   Wilks' Lambda
1      THIRD LETTER OF RECOMMENDATION                     1.000       41.969
2      THIRD LETTER OF RECOMMENDATION                     .987        23.774        .679
       STUDENTS MOTIVATION                                .987        8.633         .534
3      THIRD LETTER OF RECOMMENDATION                     .957        15.183        .552
       STUDENTS MOTIVATION                                .934        4.617         .457
       FIRST LETTER OF RECOMMENDATION                     .909        3.943         .451
4      THIRD LETTER OF RECOMMENDATION                     .955        12.842        .503
       STUDENTS MOTIVATION                                .913        3.122         .419
       FIRST LETTER OF RECOMMENDATION                     .908        3.418         .421
       AGE IN YEARS AT ENTRY                              .969        2.733         .415
5      THIRD LETTER OF RECOMMENDATION                     .935        13.679        .481
       STUDENTS MOTIVATION                                .898        3.642         .397
       FIRST LETTER OF RECOMMENDATION                     .897        2.417         .387
       AGE IN YEARS AT ENTRY                              .943        3.452         .396
       MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT   .926        2.941         .391
6      THIRD LETTER OF RECOMMENDATION                     .935        12.445        .448
       STUDENTS MOTIVATION                                .882        4.182         .381
       FIRST LETTER OF RECOMMENDATION                     .892        2.580         .369
       AGE IN YEARS AT ENTRY                              .885        4.599         .385
       MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT   .926        2.755         .370
       FINAICIAL/PERSONAL RESOURCES TO COMPLETE           .897        2.372         .367

Summary of Canonical Discriminant Functions

Eigenvalues

Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          1.876 (a)    100.0           100.0          .808

a. First 1 canonical discriminant functions were used in the analysis.

Wilks' Lambda

Test of Function(s)   Wilks' Lambda   Chi-square   df            Sig.
1                     .348            47.541       (illegible)   .000

Standardized Canonical Discriminant Function Coefficients (Function 1)

The six variables in the function are FIRST LETTER OF RECOMMENDATION, THIRD LETTER OF RECOMMENDATION, STUDENTS MOTIVATION, FINAICIAL/PERSONAL RESOURCES TO COMPLETE, AGE IN YEARS AT ENTRY and MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT. The coefficients as printed are .312, .607, .393, .[?]99, .409 and .316; the pairing of the last three values with their variables, and one digit, are not legible in the source.

Structure Matrix

                                                        Function 1
THIRD LETTER OF RECOMMENDATION                          .683
GRE SCORE ON QUANTATIVE (a)                             .547
FIRST LETTER OF RECOMMENDATION                          .536
STUDENTS MOTIVATION                                     .502
AGE IN YEARS AT ENTRY                                   .402
RATING OF STUDENT HOSTILITY (a)                         -.335
SECOND LETTER OF RECOMMENDATION (a)                     .278
OVERALL COLLEGE GPA (a)                                 .237
ABILITY TO INTERACT EASILY (a)                          .178
MAJOR AREA GPA (a)                                      .129
GRE SCORE ON SPECIALITY EXAM (a)                        -.126
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT        .124
GRE SCORE ON VERBAL (a)                                 -.068
FINAICIAL/PERSONAL RESOURCES TO COMPLETE                .061
STUDENTS EMOTIONAL STABILITY (a)                        -.027

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.
a. This variable not used in the analysis.

Canonical Discriminant Function Coefficients (unstandardized)

                                                        Function 1
FIRST LETTER OF RECOMMENDATION                          .288
THIRD LETTER OF RECOMMENDATION                          .617
STUDENTS MOTIVATION                                     .490
FINAICIAL/PERSONAL RESOURCES TO COMPLETE                .175
AGE IN YEARS AT ENTRY                                   .091
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT        .262
(Constant)                                              15.564

Unstandardized coefficients.

Functions at Group Centroids

1=COMPLETED PHD, 2=DID NOT COMPLETE PHD   Function 1
FINISH                                    1.342
NOT FINISH                                1.342

Unstandardized canonical discriminant functions evaluated at group means. Minus signs are not legible in the last two tables of the source; with two groups of equal size, the two centroids have opposite signs.

Casewise Statistics (Original; each row below lists the values for cases 1 to 48 in order)

Case Number: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48

Actual Group: 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 2 2 2 2 1 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2

Highest Group

Predicted Group (** = misclassified case): 1 1 1 1 1** 1 1 1 1 1 1 1 1 1** 1 1 1 1 1 1 2 1 1 1 1 2 2 2 2 2** 2 2 2 2 2 2 2 2 2** 2 2 2 2 2 2 2 2** 2

P(D>d | G=g), p: .422 .698 .934 .699 .755 .178 .947 .765 .439 .795 .345 .620 .668 .381 .573 .239 .804 .065 .231 .970 .708 .677 .583 .290 .980 .644 .854 .598 .580 .277 .903 .555 .322 .999 .432 .401 .345 .649 .800 .701 .326 .152 .986 .202 .973 .709 .892 .426

P(D>d | G=g), df: 1 for every case

P(G=g | D=d): .997 .990 .967 .929 .941 .999 .968 .988 .997 .987 .998 .906 .991 .777 .994 .609 .986 1.000 .595 .976 .931 .991 .994 .998 .975 .992 .957 .993 .893 .664 .964 .883 .998 .973 .997 .997 .744 .992 .949 .990 .724 .999 .972 .999 .976 .990 .962 .812

Squared Mahalanobis Distance to Centroid: .645 .150 .007 .149 .097 1.812 .004 .090 .599 .068 .891 .246 .184 .769 .318 1.386 .062 3.393 1.436 .001 .141 .173 .301 1.118 .001 .214 .034 .278 .306 1.184 .015 .348 .982 .000 .617 .706 .893 .207 .064 .148 .966 2.054 .000 1.629 .001 .140 .018 .634

Second Highest Group

Group: 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

P(G=g | D=d): .003 .010 .033 .071 .059 .001 .032 .012 .003 .013 .002 .094 .009 .223 .006 .391 .014 .000 .405 .024 .069 .009 .006 .002 .025 .008 .043 .007 .107 .336 .036 .117 .002 .027 .003 .003 .256 .008 .051 .010 .276 .001 .028 .001 .024 .010 .038 .188

Squared Mahalanobis Distance to Centroid: 12.160 9.436 6.766 5.281 5.629 16.244 6.851 8.901 11.958 8.669 13.165 4.788 9.693 3.266 10.551 2.271 8.601 20.487 2.207 7.408 5.332 9.611 10.450 13.998 7.340 9.902 6.251 10.316 4.541 2.547 6.564 4.385 13.508 7.195 12.037 12.422 3.025 9.852 5.909 9.414 2.894 16.953 7.108 15.687 7.385 9.350 6.495 3.565

Discriminant Scores: [column not present in the source]

Classification Results (a)

                                                             Predicted Group Membership
                   1=COMPLETED PHD, 2=DID NOT COMPLETE PHD   FINISH   NOT FINISH   Total
Original   Count   FINISH                                    22       3            25
                   NOT FINISH                                2        23           25
           %       FINISH                                    88.0     12.0         100.0
                   NOT FINISH                                8.0      92.0         100.0

a. 90.0% of original grouped cases correctly classified (45 of 50).

INFERENCE:

1. The F values and their significance levels identify the variables on which the two groups differ significantly.
2. The canonical correlation coefficient is .808, which shows a strong correlation.
3. The significance values are < 0.05.

RESULT:

There is discrimination in membership between the two groups.
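Three of the summary figures in this output are linked by simple formulas: for a single discriminant function, Wilks' Lambda is 1 / (1 + eigenvalue), the canonical correlation is the square root of eigenvalue / (1 + eigenvalue), and the hit ratio is the share of cases on the diagonal of the classification table. The sketch below checks these. It assumes the eigenvalue 1.876 and the classification counts 22, 3, 2, 23 as read from the output; the variable and function names are illustrative.

```python
import math

# Rows = actual group, columns = predicted group, both in the order (FINISH, NOT FINISH).
confusion = [
    [22, 3],
    [2, 23],
]

def hit_ratio(matrix):
    """Proportion of cases on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

eigenvalue = 1.876  # from the Eigenvalues table

wilks_lambda = 1 / (1 + eigenvalue)
canonical_corr = math.sqrt(eigenvalue / (1 + eigenvalue))
accuracy = hit_ratio(confusion)

print(f"Wilks' Lambda = {wilks_lambda:.3f}")
print(f"Canonical correlation = {canonical_corr:.3f}")
print(f"Correctly classified = {accuracy:.1%}")
```

With equal prior probabilities for two groups, a hit ratio well above 50% indicates that the function classifies better than chance.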

CLUSTER ANALYSIS

OBJECTIVE: To conduct K-means cluster analysis.

PROBLEM: Twenty-one brands of VCRs are given along with their attributes. Perform a K-means cluster analysis on the brands.

PROCEDURE:

1. Select Analyze > Classify > K-Means Cluster.
2. Select all variables and move them into the Variables box.
3. Label cases by brand.
4. Enter the number of clusters as 3.
5. Click Save, select the required options, then click Continue.
6. Click Options, check Initial cluster centers, click Continue and then OK.

OUTPUT: Quick Cluster

Initial Cluster Centers

            Cluster
            1     2     3
price       200   535   380
pictur1     3     5     2
pictur2     3     5     2
pictur3     3     5     2
pictur4     3     5     2
pictur5     3     5     2
program     3     3     3
recept1     4     4     3
recept3     2     3     3
audio1      4     5     4
audio2      4     4     4
audio3      4     4     4
features    4     4     4
events      8     6     4
days        365   365   30
remote1     3     3     4
remote2     3     4     4
remote3     3     4     4
extras1     3     12    6
extras2     3     12    6
extras3     3     12    6

Iteration History (a)

            Change in Cluster Centers
Iteration   1        2        3
1           52.791   70.267   80.393
2           14.037   12.547   .000
3           .000     .000     .000

a. Convergence achieved due to no or small change in cluster centers. The maximum absolute coordinate change for any center is .000. The current iteration is 3. The minimum distance between initial centers is 335.404.

Final Cluster Centers

            Cluster
            1     2     3
price       239   453   460
pictur1     3     4     4
pictur2     3     4     4
pictur3     3     4     4
pictur4     3     4     4
pictur5     3     4     4
program     3     3     3
recept1     4     4     3
recept3     2     3     3
audio1      4     4     4
audio2      4     4     3
audio3      4     4     4
features    4     4     4
events      8     7     6
days        365   365   30
remote1     3     3     4
remote2     3     3     4
remote3     3     3     4
extras1     3     8     10
extras2     3     8     10
extras3     3     8     10

Number of Cases in each Cluster

Cluster   1   8.000
          2   8.000
          3   5.000
Valid         21.000
Missing       .000

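INFERENCE: Three clusters were formed, containing 8, 8 and 5 brands respectively. The solution converged at the third iteration.

RESULT: Thus K-means cluster analysis was carried out using SPSS.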
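
The iteration history above (assign each case to the nearest center, recompute the centers, stop when they no longer move) can be illustrated with a minimal sketch in Python. This is not SPSS's Quick Cluster routine: the function name `kmeans`, the six toy (price, events) points and the hand-picked initial centers are all assumptions made for illustration, since the VCR data set itself is not listed in this manual.

```python
import math

def kmeans(points, centers, max_iter=10):
    """Plain K-means: returns (final centers, cluster index of each point)."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center
        if new_centers == centers:  # no change in centers: converged
            break
        centers = new_centers
    return centers, labels

# Hypothetical (price, events) pairs, NOT the manual's VCR data.
toy_points = [(200, 8), (240, 7), (450, 7), (470, 6), (330, 4), (350, 5)]
toy_initial = [toy_points[0], toy_points[2], toy_points[4]]

final_centers, membership = kmeans(toy_points, toy_initial)
print(final_centers)
print(membership)
```

Because price is on a much larger scale than the other attribute, it dominates the Euclidean distance here, just as price dominates the cluster centers in the output above; standardizing the variables first is a common way to avoid this.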

REGRESSION

OBJECTIVE:

To determine how the sales of the company depend on the independent variables.

PROBLEM:

A manufacturer and marketer of electric motors would like to build a regression model relating sales to 5 or 6 independent variables. Past data has been collected for 15 sales territories on sales and 6 different independent variables. Build a regression model and recommend whether or not it should be used by the company.

Dependent Variable:

Sales (in Rs. Lakhs) in the territory.

Independent Variables:

X1  Market potential of the territory.
X2  No. of dealers of the company in the territory.
X3  No. of sales people in the territory.
X4  Index of the competitor activity in the territory on a 5 point scale.
X5  No. of service people in the territory.
X6  No. of existing customers in the territory.

S. No.  Sales (Y)  Potential (X1)  Dealers (X2)  People (X3)  Competition (X4)  Service (X5)  Customers (X6)
1       5          25              1             6            5                 2             20
2       60         150             12            30           4                 5             50
3       20         45              5             15           3                 2             25
4       11         30              2             10           3                 2             20
5       45         75              12            20           2                 4             30
6       6          10              3             8            2                 3             16
7       15         29              5             18           4                 5             30
8       22         43              7             16           3                 6             40
9       29         70              4             15           2                 5             39
10      3          40              1             6            5                 2             5
11      16         40              4             11           4                 2             17
12      8          25              2             9            3                 3             10
13      18         32              7             14           3                 4             31
14      23         73              10            10           4                 3             43
15      81         150             15            35           4                 7             70

Null Hypothesis
There is no dependence of the dependent variable, sales, on the independent variables (market potential, no. of dealers, no. of sales people, index of the competitor activity, no. of service people and no. of existing customers) at the 95% confidence level.

Alternative Hypothesis
There is dependence between the independent and dependent variables at the 95% confidence level.

PROCEDURE: Let the estimating equation be Y = a0 + a1X1 + a2X2 + a3X3 + a4X4 + a5X5 + a6X6.
1. Define the variables in the Variable View of the SPSS Data Editor.
2. Enter the data in the Data View.
3. Choose Analyze > Regression > Linear from the main menu.
4. In the Linear Regression dialogue box, enter sales as the dependent variable and all the other variables as the independent variables.
5. Click the Statistics button and check the regression coefficient Estimates, Model fit and Descriptives boxes.
6. The output is generated and analyzed, and the inference is drawn.
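
For readers without SPSS, the same kind of fit (sales regressed on all six independent variables plus an intercept) can be sketched in plain Python by solving the normal equations. This is only an illustrative least-squares sketch, not SPSS's algorithm; the helper names `solve` and `ols` are hypothetical, and the data are the 15 territories from the problem statement. If the data were transcribed correctly, the coefficients and R Square should come out close to the SPSS output in this section.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """Least squares with intercept; returns ([a0, a1, ...], R squared)."""
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return coef, 1 - ss_resid / ss_total

# Territory data from the problem statement: (Y, X1, X2, X3, X4, X5, X6)
DATA = [
    (5, 25, 1, 6, 5, 2, 20), (60, 150, 12, 30, 4, 5, 50),
    (20, 45, 5, 15, 3, 2, 25), (11, 30, 2, 10, 3, 2, 20),
    (45, 75, 12, 20, 2, 4, 30), (6, 10, 3, 8, 2, 3, 16),
    (15, 29, 5, 18, 4, 5, 30), (22, 43, 7, 16, 3, 6, 40),
    (29, 70, 4, 15, 2, 5, 39), (3, 40, 1, 6, 5, 2, 5),
    (16, 40, 4, 11, 4, 2, 17), (8, 25, 2, 9, 3, 3, 10),
    (18, 32, 7, 14, 3, 4, 31), (23, 73, 10, 10, 4, 3, 43),
    (81, 150, 15, 35, 4, 7, 70),
]

sales = [row[0] for row in DATA]
predictors = [row[1:] for row in DATA]
coef, r_squared = ols(predictors, sales)
print("a0..a6:", [round(c, 3) for c in coef])
print("R squared:", round(r_squared, 3))
```

Solving the normal equations directly is adequate for a small, well-scaled data set like this one; statistical packages generally use more numerically stable decompositions.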

OUTPUT:

Model Summary (b)

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .989(a)    .977        .960                 4.391

a Predictors: (Constant), No. of existing customers in the territory, Index of competitor activity in the territory, No. of service people in the territory, Market potential in the territory (in Rs. lakhs), No. of dealers of the company in the territory, No. of sales people in the territory b Dependent Variable: Sales in Rs.lakhs in the territory

ANOVA (b)

Model 1       Sum of Squares    df    Mean Square    F         Sig.
Regression    6609.485          6     1101.581       57.133    .000(a)
Residual      154.249           8     19.281
Total         6763.733          14

a Predictors: (Constant), No. of existing customers in the territory, Index of competitor activity in the territory, No. of service people in the territory, Market potential in the territory (in Lakhs), No. of dealers of the company in the territory, No. of sales people in the territory b Dependent Variable: Sales in Lakhs in the territory
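
The Model Summary figures follow arithmetically from the ANOVA table. The short sketch below recomputes R Square, Adjusted R Square, F and the standard error of the estimate from the reported sums of squares and degrees of freedom; the variable names are illustrative choices, and the inputs are the rounded values printed above, so the last digit can differ slightly.

```python
import math

# Values read from the ANOVA table above.
ss_regression = 6609.485
ss_residual = 154.249
df_regression = 6   # number of independent variables
df_residual = 8     # n - k - 1 = 15 - 6 - 1

ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

# R Square: share of total variation explained by the regression.
r_square = ss_regression / ss_total

# Adjusted R Square penalises the number of predictors.
adjusted_r_square = 1 - (ss_residual / df_residual) / (ss_total / df_total)

# F is the ratio of the two mean squares.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

# Standard error of the estimate is the root of the residual mean square.
std_error_estimate = math.sqrt(ms_residual)

print(round(r_square, 3), round(adjusted_r_square, 3))
print(round(f_statistic, 3), round(std_error_estimate, 3))
```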

Coefficients (a)

B and Std. Error are the Unstandardized Coefficients; Beta is the Standardized Coefficient; Lower Bound and Upper Bound give the 95% Confidence Interval for B.

Model 1                                              B         Std. Error    Beta     t         Sig.    Lower Bound    Upper Bound
(Constant)                                           -3.173    5.813                  -.546     .600    -16.579        10.233
Market potential in the territory (in Rs. Lakhs)     .227      .075          .439     3.040     .016    .055           .399
No. of dealers of the company in the territory       .819      .631          .164     1.298     .230    -.636          2.275
No. of sales people in the territory                 1.091     .418          .414     2.609     .031    .127           2.055
Index of competitor activity in the territory        -1.893    1.340         -.085    -1.413    .195    -4.982         1.197
No. of service people in the territory               -.549     1.568         -.041    -.350     .735    -4.166         3.067
No. of existing customers in the territory           .066      .195          .050     .338      .744    -.384          .516

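a Dependent Variable: Sales in Rs.lakhs in the territory

Therefore the estimating equation in terms of the standardized (Beta) coefficients is: Y = 0.439X1 + 0.164X2 + 0.414X3 - 0.085X4 - 0.041X5 + 0.05X6, where Y and X1 to X6 are standardized scores. In the original units, using the unstandardized coefficients B, it is: Y = -3.173 + 0.227X1 + 0.819X2 + 1.091X3 - 1.893X4 - 0.549X5 + 0.066X6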

INFERENCE:
1. The variables are entered using the Enter method.
2. R ranges from 0 to 1, and larger values indicate a stronger relationship. Here R = .989 and R Square = .977.
3. The significance value .000 obtained from the ANOVA is less than 0.05, so the null hypothesis is rejected and the alternate hypothesis is accepted. Hence the independent variables, taken together, explain the variation in the dependent variable.
4. The t statistics show the relative importance of each variable in the model. As a rule of thumb, variables with t values below -2 or above +2 are good predictors.
5. The t statistic and its significance value are used to test the null hypothesis that the regression coefficient is zero (that is, that there is no linear relationship between the dependent and the independent variable).
6. The significance levels of market potential (0.016) and no. of sales people (0.031) are less than 0.05, so the null hypothesis is rejected for these variables. Hence they are linearly related with sales.
7. The significance levels of no. of dealers (0.230), index of competitor activity (0.195), no. of service people (0.735) and no. of existing customers (0.744) are greater than 0.05, so the null hypothesis is not rejected for these variables. Hence they show no significant linear relationship with sales.
8. Residuals are estimates of the true errors in the model. Each residual is the difference between the observed value of the dependent variable and the value predicted by the model.
9. The residual sum of squares (154.249) is small relative to the regression sum of squares (6609.485), so the estimating equation fits the data well.
10. Since the significance value is less than 0.05, the fit of the estimating equation is statistically significant.
11. A histogram of the residuals that indicates an approximately normal distribution is consistent with the model being appropriate for the data.
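RESULT There is dependency between sales (the dependent variable) and the market potential of the territory and the number of sales people in the territory (independent variables). The other independent variables (number of dealers, number of service people, index of the competitor activity and number of existing customers in the territory) do not show a statistically significant linear relationship with sales. The estimating equation in terms of the standardized (Beta) coefficients is: Y = 0.439X1 + 0.164X2 + 0.414X3 - 0.085X4 - 0.041X5 + 0.05X6. In the original units, using the unstandardized coefficients B, it is: Y = -3.173 + 0.227X1 + 0.819X2 + 1.091X3 - 1.893X4 - 0.549X5 + 0.066X6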
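
The t values and 95% confidence intervals in the Coefficients table can be checked from B and its standard error: t = B / Std. Error, and the interval is B plus or minus the critical t value times the standard error. The sketch below assumes the two-sided 95% critical value of t with 8 degrees of freedom (2.306, taken from standard t tables); the dictionary and function names are illustrative, and because the inputs are the rounded table values the results agree with SPSS only to about two decimals.

```python
T_CRIT = 2.306  # two-sided 95% critical t value for 8 degrees of freedom

# (B, Std. Error) for each independent variable, read from the output.
coefficients = {
    "Market potential": (0.227, 0.075),
    "No. of dealers": (0.819, 0.631),
    "No. of sales people": (1.091, 0.418),
    "Index of competitor activity": (-1.893, 1.340),
    "No. of service people": (-0.549, 1.568),
    "No. of existing customers": (0.066, 0.195),
}

def t_and_interval(b, se, t_crit=T_CRIT):
    """Return (t statistic, lower bound, upper bound) for one coefficient."""
    return b / se, b - t_crit * se, b + t_crit * se

significant = []
for name, (b, se) in coefficients.items():
    t, lower, upper = t_and_interval(b, se)
    # A coefficient is significant at the 5% level when its 95% interval
    # excludes zero, which is equivalent to |t| exceeding the critical value.
    if lower > 0 or upper < 0:
        significant.append(name)
    print(f"{name}: t = {t:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")

print("Significant at the 5% level:", significant)
```

Only market potential and no. of sales people have intervals that exclude zero, which matches the significance values used in the inference.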

CORRELATION

OBJECTIVE:

To find the interrelationship between the dependent and the independent variables.

PROBLEM:

A manufacturer and marketer of electric motors would like to build a regression model relating sales to 5 or 6 independent variables. Past data has been collected for 15 sales territories on sales and 6 different independent variables. Build a regression model and recommend whether or not it should be used by the company.

Dependent Variable:

Sales (in Rs. Lakhs) in the territory.

Independent Variables:

X1  Market potential of the territory.
X2  No. of dealers of the company in the territory.
X3  No. of sales people in the territory.
X4  Index of the competitor activity in the territory on a 5 point scale.
X5  No. of service people in the territory.
X6  No. of existing customers in the territory.

S. No.  Sales (Y)  Potential (X1)  Dealers (X2)  People (X3)  Competition (X4)  Service (X5)  Customers (X6)
1       5          25              1             6            5                 2             20
2       60         150             12            30           4                 5             50
3       20         45              5             15           3                 2             25
4       11         30              2             10           3                 2             20
5       45         75              12            20           2                 4             30
6       6          10              3             8            2                 3             16
7       15         29              5             18           4                 5             30
8       22         43              7             16           3                 6             40
9       29         70              4             15           2                 5             39
10      3          40              1             6            5                 2             5
11      16         40              4             11           4                 2             17
12      8          25              2             9            3                 3             10
13      18         32              7             14           3                 4             31
14      23         73              10            10           4                 3             43
15      81         150             15            35           4                 7             70

Null Hypothesis

There is no significant relationship between the independent and the dependent variables at the 95% confidence level.

Alternative Hypothesis

There is a significant relationship between the independent and the dependent variables at the 95% confidence level.

PROCEDURE: Let the estimating equation be Y = a0 + a1X1 + a2X2 + a3X3 + a4X4 + a5X5 + a6X6.
1. Define the variables in the Variable View of the SPSS Data Editor.
2. Enter the data in the Data View.
3. Choose Analyze > Correlate > Bivariate from the main menu.
4. In the Bivariate Correlations dialogue box, select all the dependent and independent variables.
5. Select the Pearson correlation coefficient, with a one-tailed test of significance.
6. Also include the statistics for means and standard deviations.
7. The output is generated and analyzed, and the inference is drawn.

OUTPUT:

Descriptive Statistics

                                                     Mean     Std. Deviation    N
Sales in Rs.lakhs in the territory                   24.13    21.980            15
Market potential in the territory (in Rs. lakhs)     55.80    42.543            15
No. of dealers of the company in the territory       6.00     4.408             15
No. of sales people in the territory                 14.87    8.340             15
Index of competitor activity in the territory        3.40     .986              15
No. of service people in the territory               3.67     1.633             15
No. of existing customers in the territory           29.73    16.829            15
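
The descriptive statistics above, and the Pearson coefficient that the Bivariate procedure reports for each pair of variables, can be reproduced with a few lines of Python. The sketch below does this for one pair only, sales and market potential, using the data from the problem statement; the function name `pearson_r` is an illustrative choice, and the one-tailed significance that SPSS also prints is not computed here.

```python
import math
import statistics

# Sales (Y) and market potential (X1) for the 15 territories.
sales = [5, 60, 20, 11, 45, 6, 15, 22, 29, 3, 16, 8, 18, 23, 81]
potential = [25, 150, 45, 30, 75, 10, 29, 43, 70, 40, 40, 25, 32, 73, 150]

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

mean_sales = sum(sales) / len(sales)
sd_sales = statistics.stdev(sales)  # sample standard deviation (n - 1)
r_sales_potential = pearson_r(sales, potential)

print(round(mean_sales, 2), round(sd_sales, 3))
print(round(r_sales_potential, 3))
```

Repeating `pearson_r` over every pair of the seven variables would give the full correlation matrix that the procedure produces.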

CHI-SQUARE TEST

OBJECTIVE:

To find out whether there is a significant association between the income of the individuals and intention to purchase.

PROBLEM:

A customer survey was conducted for a brand of detergent. One of the questions dealt with the income category and the other asked the respondent to rate his purchase intention. These two variables are listed in the table below. Both variables are coded as follows:

INCOME TABLE

Code    Income in Rs./month
1       <=5000
2       5001-10000
3       10001-20000
4       >20000

PURCHASE TABLE

Code    Intention
1       None
2       Low
3       High
4       Very high
5       Certain

S. No.  Income         Code  Intent      Intent Code
1       <5000          1     None        1
2       <5000          1     Low         2
3       <5000          1     Low         2
4       <5000          1     None        1
5       <5000          1     High        3
6       5001-10000     2     Low         2
7       5001-10000     2     High        3
8       5001-10000     2     Very high   4
9       5001-10000     2     High        3
10      5001-10000     2     Low         2
11      10001-20000    3     High        3
12      10001-20000    3     Very high   4
13      10001-20000    3     Certain     5
14      10001-20000    3     High        3
15      10001-20000    3     Very high   4
16      >20000         4     High        3
17      >20000         4     Certain     5
18      >20000         4     Very high   4
19      >20000         4     Certain     5
20      >20000         4     Certain     5

Null Hypothesis

There is no significant association between income and purchase intention.

Alternate Hypothesis There is significant association between income and purchase intention.

PROCEDURE:
1. Enter the field names and the corresponding data types in the Variable View, with income and purchase intention set to the nominal measure.
2. Enter the given data in the Data View.
3. Choose Analyze > Descriptive Statistics > Crosstabs from the main menu.
4. Move the income of the respondent (the independent variable) into Rows and the intention of the respondent (the dependent variable) into Columns.
5. Click Statistics, select Chi-square and check the other required boxes.
6. Compare the Value and significance columns in the output and draw the inference.

OUTPUT: CHI-SQUARE TEST

Case Processing Summary

                                                    Cases
                                                    Valid            Missing          Total
                                                    N     Percent    N     Percent    N     Percent
Income of the respondent * Intention to purchase    20    100.0%     0     .0%        20    100.0%

Income of the respondent * Intention to purchase Cross tabulation

Count
                            Intention to purchase
Income of the respondent    None    Low    High    Very high    Certain    Total
<5000                       2       2      1       0            0          5
5001-10000                  0       2      2       1            0          5
10001-20000                 0       0      2       2            1          5
>20000                      0       0      1       1            3          5
Total                       2       4      6       4            4          20

Chi-Square Tests

                                Value        df    Asymp. Sig. (2-sided)
Pearson Chi-Square              18.667(a)    12    .097
Likelihood Ratio                21.134       12    .048
Linear-by-Linear Association    11.790       1     .001
N of Valid Cases                20

a. 20 cells (100.0%) have expected count less than 5. The minimum expected count is .50.

Symmetric Measures

                                    Value    Approx. Sig.
Nominal by Nominal    Phi           .966     .097
                      Cramer's V    .558     .097
N of Valid Cases                    20

a. Not assuming the null hypothesis. b. Using the asymptotic standard error assuming the null hypothesis.

[Bar Chart: Count by Income of the respondent (<5000, 5001-10000, 10001-20000, >20000), clustered by Intention to purchase (none, low, high, very high, certain)]

INFERENCE:

1. The chi-square test examines the hypothesis that the row and column variables in a cross tabulation are independent.
2. The significance value of the Pearson chi-square (0.097) is greater than 0.05.
3. So the null hypothesis is not rejected. Hence there is no statistically significant association between the two variables, income and purchase intention. Because all 20 cells have an expected count of less than 5, this result should be read with caution.
4. The nominal symmetric measures (Phi = .966, Cramer's V = .558) indicate the strength of the relationship between the row and column variables of the cross tabulation, and their approximate significance indicates whether that relationship is significant.
5. Cramer's V ranges from 0 to 1, with larger values indicating a stronger association.
6. The approximate significance of these measures (.097) is also greater than 0.05, indicating that there is no significant relationship between the two variables.
7. Hence the two attributes are treated as independent.
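
The Pearson chi-square, its degrees of freedom, Phi and Cramer's V can be recomputed directly from the coded survey data. The sketch below uses only the Python standard library and the income and intention codes from the problem table; the variable names are illustrative, and the significance value is left out because it needs the chi-square distribution function, which the standard library does not provide.

```python
import math
from collections import Counter

# Income code and intention code for the 20 respondents, in S. No. order.
income = [1] * 5 + [2] * 5 + [3] * 5 + [4] * 5
intent = [1, 2, 2, 1, 3, 2, 3, 4, 3, 2, 3, 4, 5, 3, 4, 3, 5, 4, 5, 5]

n = len(income)
observed = Counter(zip(income, intent))   # cell counts of the cross tabulation
row_totals = Counter(income)
col_totals = Counter(intent)

chi_square = 0.0
min_expected = float("inf")
for r in row_totals:
    for c in col_totals:
        # Expected count under independence: row total * column total / n.
        expected = row_totals[r] * col_totals[c] / n
        min_expected = min(min_expected, expected)
        chi_square += (observed[(r, c)] - expected) ** 2 / expected

n_rows, n_cols = len(row_totals), len(col_totals)
degrees_of_freedom = (n_rows - 1) * (n_cols - 1)
phi = math.sqrt(chi_square / n)
cramers_v = math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))

print(round(chi_square, 3), degrees_of_freedom, min_expected)
print(round(phi, 3), round(cramers_v, 3))
```

The minimum expected count of 0.5 is the figure quoted in footnote a of the Chi-Square Tests table.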

RESULT:

There is no statistically significant association between the income and the purchase intention of the individual.
