
Research Methodology Handout

Parametric Tests
Tests based on assumptions about population distributions and parameters. The assumptions for parametric tests are:
- Interval or ratio level data (SPSS Scale data)
- Normal distribution, or closely so
- Homogeneity of variance: the variance (standard deviation squared) should be similar in each group
- Samples randomly drawn from the population

t-Test
The t-test is a parametric test used for comparing sample means to see if there is sufficient evidence to infer that the means of the corresponding population distributions also differ.

Independent-Samples t-Test
The independent-samples t-test compares the means of two different samples. The two samples share some variable of interest in common, but there is no overlap between the memberships of the two groups.
Example: the difference between males and females on an exam score.
Activity: To test whether there is a difference between males and females on total points earned (Data File used grades.sav).

Paired-Samples t-Test
The paired-samples t-test is usually based on groups of individuals who experience both conditions of the variable of interest.
Example: students' scores on the first quiz versus the same students' scores on the second quiz.
Activity: To compare the distribution of scores on quiz1 with scores on quiz2 (Data File used grades.sav).

One-Sample t-Test
A one-sample t-test allows us to test whether a sample mean (of a normally distributed interval variable) differs significantly from a hypothesized (preset) value.
Example: Does a course offered to college seniors result in a GRE score greater than or equal to 1200?
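As a rough illustration of the one-sample and paired-samples cases described above, the t statistic can be computed by hand as follows. This is a minimal sketch using only the Python standard library; the function names and the sample values are made up for illustration and are not taken from grades.sav. It returns only the t statistic and its degrees of freedom; the significance level would still have to be looked up in a t table or obtained from SPSS.

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """Return (t, df) for a one-sample t-test against a preset value."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    standard_error = sd / math.sqrt(n)
    t = (mean - hypothesized_mean) / standard_error
    return t, n - 1


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test.

    A paired test is a one-sample test on the pairwise differences,
    with a hypothesized mean difference of zero.
    """
    differences = [a - b for a, b in zip(first, second)]
    return one_sample_t(differences, 0.0)


# Hypothetical quiz scores for four students (illustrative only).
quiz_a = [8, 7, 9, 6]
quiz_b = [6, 6, 7, 5]
t_paired, df_paired = paired_t(quiz_a, quiz_b)
print(f"paired t = {t_paired:.3f}, df = {df_paired}")
```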

Activity: To determine if the percent values for the entire class differed significantly from 85 (Data File used grades.sav).

Non-Parametric Tests
Tests that make no assumptions about population parameters or distributions.

Mann-Whitney U Test
The Mann-Whitney U test (also called the Mann-Whitney-Wilcoxon (MWW) or Wilcoxon rank-sum test) is a non-parametric statistical hypothesis test for assessing whether one of two independent samples of observations tends to have larger values than the other. It is one of the best-known non-parametric significance tests. The Mann-Whitney U test accomplishes essentially what a t-test does when the distributions of the two samples deviate significantly from normal. If the distributions do not differ significantly from normal, the t-test should be used because it has greater power.
Assumptions:
1. All the observations from both groups are independent of each other.
2. The responses are ordinal or continuous measurements.
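To make the idea of "which sample tends to have larger values" concrete, the U statistic can be obtained by comparing every observation in one group with every observation in the other. The sketch below is an illustration only: the function name is hypothetical, ties are counted as one half (a common convention, assumed here), and no p-value is computed.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by direct pairwise comparison.

    U_a counts the pairs in which the value from group_a exceeds the
    value from group_b; a tie contributes 0.5 to each group's U.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical final exam scores (illustrative only).
women = [68, 72, 75, 61]
men = [59, 66, 70]
u_women, u_men = mann_whitney_u(women, men)
print(f"U(women) = {u_women}, U(men) = {u_men}")
```

The smaller of the two U values is the one usually compared against a table of critical values.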

Activity: To determine whether women score higher than men on the final exam (Data File used grades.sav).

The Sign Test
The sign test uses pairwise comparisons of two distributions to identify which is larger in each pair, and from this information determines whether the two distributions differ significantly from each other. It can be used to test the hypothesis that there is "no difference in medians" between the continuous distributions of two random variables X and Y, in the situation when we can draw paired samples from X and Y.
Assumptions: Let Zi = Yi - Xi for i = 1, ... , n.
1. The differences Zi are assumed to be independent.
2. Each Zi comes from the same continuous population.
3. The values of Xi and Yi are ordered (at least on the ordinal scale), so the comparisons "greater than", "less than", and "equal to" are meaningful.
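Under the null hypothesis of no difference in medians, each difference Zi is equally likely to be positive or negative, so the number of positive signs follows a binomial distribution with p = 0.5. The following sketch computes an exact two-sided p-value on that basis. The function name is hypothetical, and dropping zero differences is an assumed convention for handling ties in real data.

```python
from math import comb


def sign_test(x, y):
    """Return (n_positive, n_negative, two_sided_p) for paired samples.

    Differences are Z_i = Y_i - X_i; zero differences are dropped
    (an assumed convention, since ties carry no sign).
    """
    differences = [yi - xi for xi, yi in zip(x, y)]
    n_pos = sum(1 for d in differences if d > 0)
    n_neg = sum(1 for d in differences if d < 0)
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # Probability of k or fewer signs of the rarer kind when p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return n_pos, n_neg, p_value


# Hypothetical paired quiz scores (illustrative only).
quiz_x = [7, 8, 6, 9, 5, 7]
quiz_y = [8, 9, 7, 10, 6, 8]
print(sign_test(quiz_x, quiz_y))
```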

Activity: To determine whether the scores on quiz1 were significantly higher than the scores on quiz2 (Data File used grades.sav).

Wilcoxon Matched-Pairs Signed-Ranks Test
The Wilcoxon test is a nonparametric test that compares two paired groups. It calculates the difference within each pair and analyzes that list of differences. The difficulty with the sign test is that a difference of 10 between paired quizzes (e.g. 10 on one, 0 on the other) and a difference of 1 (e.g. 6 on one, 5 on the other) are coded identically, because only the sign of the difference is recorded and not its magnitude. The Wilcoxon matched-pairs signed-ranks test incorporates information about the magnitude of the differences between paired values.
Activity: To compare scores on quiz1 with scores on quiz2 (Data File used grades.sav).

The Runs Test
The runs test is used to see if the elements of a particular data set are randomly distributed. If the sequence HHTHTTHTTHTHTTTTHTH resulted from flipping a coin, does this sequence differ significantly from randomness? In other words, are we flipping a biased coin? Unfortunately this procedure works only with dichotomous data; it is not possible to test, for instance, whether we are rolling a loaded die. The runs test is a non-parametric statistical test that checks a randomness hypothesis for a two-valued data sequence. More precisely, it can be used to test the hypothesis that the elements of the sequence are mutually independent.
Runs tests can be used to test:
- The randomness of a distribution, by taking the data in the given order and marking with + the data greater than the median and with - the data less than the median (numbers equaling the median are omitted).
- Whether a function fits a data set well, by marking the data exceeding the function value with + and the other data with -.
For the second use, the runs test, which takes into account the signs but not the distances, is complementary to the chi square test, which takes into account the distances but not the signs.
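The coin-flip sequence quoted above can be checked by counting its runs and comparing that count with the number expected under randomness. The sketch below uses the usual large-sample normal approximation for the number of runs, which is an assumption of this example; for short sequences such as this one, exact tables are normally preferred. The function name is illustrative.

```python
import math


def runs_test(sequence):
    """Return (runs, expected_runs, z) for a two-valued sequence."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("the runs test needs exactly two distinct values")
    n1 = sum(1 for s in sequence if s == symbols[0])
    n2 = len(sequence) - n1
    n = n1 + n2
    # A new run starts wherever a value differs from the one before it.
    runs = 1 + sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z


runs, expected, z = runs_test("HHTHTTHTTHTHTTTTHTH")
print(f"runs = {runs}, expected = {expected:.2f}, z = {z:.2f}")
```

For this sequence there are 13 runs (8 heads, 11 tails), and the z value stays below 1.96 in absolute size, so under this approximation the sequence does not differ significantly from randomness at the 5% level.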
Activity: To see if males and females are distributed randomly (Data File used grades.sav).

Kolmogorov-Smirnov One-Sample Test
The Kolmogorov-Smirnov (KS) test is a nonparametric test for the equality of continuous, one-dimensional probability distributions. It can be used to compare a sample with a reference probability distribution (one-sample KS test) or to compare two samples (two-sample KS test). The one-sample test is designed to measure whether a particular distribution differs significantly from:
- a normal distribution (skewness and kurtosis equal to zero),
- a uniform distribution (values are distributed evenly, such as the numbers 1-100 consecutively),
- a Poisson distribution (the parameter λ equals both the mean and the variance of the distribution; as λ becomes large, the distribution approximates normality), or
- an exponential distribution.
Activity: To test whether the final variable deviates significantly from normal (Data File used grades.sav).
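The one-sample KS statistic D is the largest vertical gap between the sample's cumulative distribution and the reference distribution. A minimal sketch of that calculation follows, assuming a normal reference distribution whose mean and standard deviation are taken from the sample. The function names and the scores are hypothetical. Estimating the parameters from the same sample changes the reference distribution of D, so standard KS critical values are then only approximate.

```python
import math
import statistics


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """Return D, the largest gap between the empirical and reference CDFs."""
    values = sorted(sample)
    n = len(values)
    d = 0.0
    for i, x in enumerate(values):
        reference = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
        d = max(d, (i + 1) / n - reference, reference - i / n)
    return d


# Hypothetical final exam scores (illustrative only).
final = [55, 61, 58, 70, 66, 49, 63, 72, 60, 57]
mean, sd = statistics.mean(final), statistics.stdev(final)
d = ks_statistic(final, lambda x: normal_cdf(x, mean, sd))
print(f"D = {d:.3f}")
```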

ANOVA
Analysis of variance is a procedure used for comparing sample means to see if there is sufficient evidence to infer that the means of the corresponding population distributions also differ. Whereas t-tests compare only two distributions, analysis of variance is able to compare many. For example, if we want to see whether, among five ethnic groups, any groups' scores differed significantly from each other on the same quiz, a one-way analysis of variance would be required.

One-Way ANOVA
Using the One-Way ANOVA command, we may have exactly one dependent variable (always continuous) and exactly one independent variable (always categorical).
Activity: To conduct a one-way analysis of variance to see if any of four ethnic groups differ on their quiz4 scores (Data File used grades.sav).
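The F ratio behind a one-way ANOVA compares the variation between group means with the variation within groups. The sketch below shows that arithmetic with the standard library only. The function name and the three small groups are invented for illustration and are not the quiz4 data.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)
    n = len(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum(
        (value - statistics.mean(group)) ** 2 for group in groups for value in group
    )
    df_between = k - 1
    df_within = n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical quiz scores for three groups (illustrative only).
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```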

Multivariate Analysis
Multiple Regression
Multiple regression is a technique for estimating the value of the criterion variable (Y) from values on two or more other predictor variables (Xs). It employs the same rationale as simple regression, and the formula is a logical extension of that for linear regression:
Y = b0 + b1X1 + b2X2 + b3X3 + ... etc.

Types of Multiple Regression
1. Standard Multiple Regression
2. Hierarchical Multiple Regression
3. Stepwise Multiple Regression
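Once the coefficients have been estimated, the regression equation above is simply evaluated for a given set of predictor values. A small sketch of that step follows. The function name, the coefficients and the predictor values are all hypothetical and are not results from any data file.

```python
def predict(intercept, coefficients, predictors):
    """Evaluate Y = b0 + b1*X1 + b2*X2 + ... for one case."""
    if len(coefficients) != len(predictors):
        raise ValueError("need exactly one coefficient per predictor")
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))


# Hypothetical values: b0 = 1.0, b1 = 2.0, b2 = 0.5, with X1 = 3 and X2 = 4.
y_hat = predict(1.0, [2.0, 0.5], [3.0, 4.0])
print(y_hat)  # 1.0 + 2.0*3.0 + 0.5*4.0 = 9.0
```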

In stepwise regression, decisions about the order of entry for predictors are made solely on statistical criteria by the SPSS programme. Forward entry involves entering the IVs one at a time as selected by SPSS. Backward selection is the reverse of forward entry: it commences with the insertion of all IVs, and SPSS successively deletes those that fail to meet certain critical significance values. SPSS allows you to choose either forward or backward entry by clicking on the Method box.

Key Terminology
Predictor variable: A variable (IV) from which a value is used to estimate a value on another variable (DV).
Criterion variable: A variable (DV), a value of which is estimated from a value of the predictor variable (IV).
Coefficient of Determination (R Square): This represents the proportion of variation in the criterion variable (Y) which is explained (or accounted for) by variation in the predictor variable (X).
Adjusted R Square: This is the modified version of R Square which takes into account the number of independent variables in the equation and the sample size.
Multicollinearity: Create a correlation matrix and inspect it for high correlations of 0.90 and above between predictors, as this implies that the two variables are measuring the same variance and will over-inflate R. In that case only one of the two variables is needed.
Activity: To run the regression procedure with a dependent variable of zhelp and independent variables of sympathy, severity, empatend and anger, using the Stepwise method (Data File used helping1.sav).

Factor Analysis

Factor analysis originated in psychometrics, and is used in behavioral sciences, social sciences, marketing, product management, operations research, and other applied sciences that deal with large quantities of data. The primary objective of factor analysis is data reduction, in order to simplify the data and reduce noise by identifying basic underlying latent factors or components that parsimoniously explain a large portion of the variation in the data set.

Types of factor analysis
Exploratory Factor Analysis (EFA) aims to reduce a large number of variables into a smaller number of factors and thereby identify the factor structure or model. This type is exploratory in nature.
Confirmatory Factor Analysis (CFA) aims to confirm theoretical predictions about whether a specified set of constructs is influencing responses in a predicted way. It provides a way of confirming that the factor structure or model obtained through an EFA study is robust.

Types of factoring
Principal component analysis (PCA): This is the most common form of factor analysis. It seeks a linear combination of variables such that the maximum variance is extracted from the variables. It then removes this variance and seeks a second linear combination which explains the maximum proportion of the remaining variance, and so on.
Canonical factor analysis: Also known as Rao's canonical factoring, this is a different method of computing the same model as PCA. It seeks factors which have the highest canonical correlation with the observed variables, and it is unaffected by arbitrary rescaling of the data.
Common factor analysis: It seeks the least number of factors which can account for the common variance (correlation) of a set of variables. It is also called principal factor analysis (PFA) or principal axis factoring (PAF).
Image factoring: It is based on the correlation matrix of predicted variables rather than actual variables, where each variable is predicted from the others using multiple regression.
Alpha factoring: It is based on maximizing the reliability of factors, assuming variables are randomly sampled from a universe of variables. All other methods assume cases to be sampled and variables fixed.

Key Terminology
Factor loadings: Also known as component loadings, these are the correlation coefficients between the individual variables and their respective factors.
Communality: The communality measures the percent of variance in a given variable explained by all the factors jointly and may be interpreted as the reliability of the indicator.
Eigenvalues/Characteristic roots: The eigenvalue for a given factor measures the variance in all the variables which is accounted for by that factor. Eigenvalues measure the amount of variation in the total sample accounted for by each factor.
Factor scores: Also called component scores in PCA, factor scores are the scores of each case on each factor. A factor score is a composite measure for each observation on each factor extracted in the factor analysis. Factor scores may be used as variables in subsequent modeling.
Kaiser criterion: The Kaiser rule is to drop all components with eigenvalues under 1.0.
Scree plot: The scree test says to drop all further components after the one starting the elbow.
Varimax rotation: An orthogonal rotation of the factor axes to maximize the variance of the squared loadings of a factor on all the variables in a factor matrix. Each factor will tend to have either large or small loadings of any particular variable. A varimax solution yields results which make it as easy as possible to identify each variable with a single factor. This is the most common rotation option.
Quartimax rotation: An orthogonal alternative which minimizes the number of factors needed to explain each variable. This type of rotation often generates a general factor on which most variables are loaded to a high or medium degree. Such a factor structure is usually not helpful to the research purpose.
Equimax rotation: A compromise between the Varimax and Quartimax criteria.
Direct oblimin rotation: The standard method when one wishes a non-orthogonal (oblique) solution, that is, one in which the factors are allowed to be correlated.
Activity: To conduct a factor analysis with the fifteen efficacy items (effic1 to effic15) with all the default options (plus the varimax rotation) (Data File used helping2.sav).
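The Kaiser criterion described above amounts to a simple filter on the eigenvalues. The sketch below applies it to a made-up set of five eigenvalues; the function names and the numbers are hypothetical and are not output from helping2.sav. It also assumes the analysis is run on a correlation matrix, in which case the eigenvalues sum to the number of variables, so each eigenvalue divided by that number gives the proportion of total variance accounted for.

```python
def kaiser_retained(eigenvalues, threshold=1.0):
    """Keep components whose eigenvalue is not under the threshold."""
    return [value for value in eigenvalues if value >= threshold]


def proportion_explained(eigenvalues):
    """Share of total variance per component (correlation-matrix case)."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


# Hypothetical eigenvalues for five variables (illustrative only).
eigenvalues = [2.5, 1.2, 0.7, 0.4, 0.2]
retained = kaiser_retained(eigenvalues)
shares = proportion_explained(eigenvalues)
print(retained)                       # [2.5, 1.2]
print(sum(shares[: len(retained)]))   # proportion explained by retained components
```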

Chi Square
Chi Square is the most common and simple non-parametric test of significance. It investigates associations between categories of nominal variables, where observations can be classified into discrete categories and treated as frequencies. For example:
- Is there a significant preference for one of three brands of toothpaste among a sample of children?
- Is there a significant association between membership or not of a trade union among full-time and part-time employees?
- Are there gender preferences for various types of investment category?

USE OF CHI SQUARE
Chi Square tests hypotheses about the independence (or association) of frequency counts in various categories. The hypotheses are:
- H0, where the variables are statistically independent (no statistical association), and
- H1, where the variables are statistically dependent (associated).
For example, H0 would state that there is no significant association between your gender and which toothpaste you prefer, or that union membership is independent of (not associated with) type of employment, i.e. that the cross-categories from each variable are independent of each other.

TWO FORMS OF CHI SQUARE
There are two forms:
1. Goodness-of-Fit Chi Square
2. Cross-tabulations (contingency tables)
Whichever of these uses chi square is put to, the general principle remains the same. We compare the observed proportions in a sample with the expected proportions and apply the chi-square test to determine whether the difference between observed and expected proportions is likely to be a function of sampling error (non-significant: retain the null hypothesis H0) or unlikely to be a function of sampling error (significant association: reject the null hypothesis and support the alternate hypothesis H1).

GOODNESS OF FIT
A goodness-of-fit test asks how well an observed distribution fits a hypothesized or theoretical distribution. For example:
- Are some brands of frozen peas chosen by consumers more than others?
- Is absence through sickness evenly distributed through the working week, or is sick leave more frequent on some days than on others?
- Are choices on a survey item with a three-point response scale of yes, no opinion, no, equally divided, or is there a significant preference for one choice?
The formula for chi square is the summation for each cell:
Chi² = Σ (O - E)² / E
Where:
O = observed frequency, the data observed in our research/survey
E = expected frequency, and
Σ = the summation over all the cells in the table

CROSS-TABULATION
This is a two-dimensional table showing frequencies in each combination of categories for two nominal variables, each of which can be divided into two or more sub-categories, e.g.
- preference for type of music (classical, jazz, country and western, rock) against age group (below 21; 21 - 45; above 45)
- length of service in year groupings against job position level

CONTINGENCY AND CROSS-TABULATION TABLES
1. The 2 x 2 contingency table has two variables, each divided into two categories only, organized by rows and columns, i.e. 4 cells.
2. Cross-tabulation tables have more than two rows and two columns, e.g. are investment types associated with age groups? But with increasing rows and columns, interpretation of results becomes more complex, and sample sizes must be larger so that sufficient observed counts occur in each cell.

RESTRICTIONS IN THE USE OF THE CHI SQUARE
- Chi square is only appropriate for data that are classified as frequency of occurrence (counts) within categories (nominal data).
- It must only be used on frequencies, never on percentages.
- Categories must be mutually exclusive: each response can be classified into only one cell.
- Larger samples are needed when there are many categories within each variable. A rule of thumb is that the expected frequency in every cell should be at least 5. Fusing of categories is not really desirable, since it involves a reduction in the amount of information available.
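Both forms of chi square use the same observed-versus-expected sum; they differ only in where the expected frequencies come from. The sketch below is illustrative: the function names and the counts are invented, and the goodness-of-fit version assumes equal expected frequencies unless others are supplied. The cross-tabulation version derives each expected count from the row and column totals and also reports the smallest expected count, so the rule of thumb of at least 5 per cell can be checked.

```python
def chi_square_goodness_of_fit(observed, expected=None):
    """Return (chi2, df); expected counts default to an equal split."""
    if expected is None:
        expected = [sum(observed) / len(observed)] * len(observed)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1


def chi_square_independence(table):
    """Return (chi2, df, smallest_expected) for a table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    smallest_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * column_totals[j] / total
            smallest_expected = min(smallest_expected, expected)
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(column_totals) - 1)
    return chi2, df, smallest_expected


# Hypothetical counts (illustrative only).
print(chi_square_goodness_of_fit([20, 30, 40]))       # three drink choices
print(chi_square_independence([[10, 20], [20, 10]]))  # 2 x 2 table
```

The resulting chi square value is then compared with the critical value for the given degrees of freedom, or its significance is read from the SPSS output.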

Activity
(a) To test the hypothesis that the distribution of choices for soft drink is random, that is, there is no significant preference for any specific drink (Data File used chi square).
(b) To test the hypothesis that there is no significant relationship between gender and whether the person smokes or not (Data File used chi square).

Data Files

Grades.sav
The data file is the raw data for calculating the grades in a particular class.
Variable: Description
ID: six-digit student ID number
Last Name: the last name of the student
First Name: the first name of the student
Gender: the gender of the student: 1=F, 2=M
Ethnic: the ethnicity of the student: 1=Native, 2=Asian, 3=Black, 4=White, 5=Hispanic
Year: year in school: 1=1st year, 2=2nd year, 3=3rd year, 4=4th year
Lowup: lower or upper division student: 1=lower, 2=upper
Section: section of the class (1-3)
GPA: cumulative GPA at the beginning of the course
EXTCR: whether or not the student did the extra credit project: 1=no, 2=yes
Review: whether or not the student attended the review sessions: 1=no, 2=yes
Quiz1 to Quiz5: scores out of 10 points on 5 quizzes throughout the term
Final: final exam worth 75 points

Helping1.sav
This data file is related to a study of helping behavior; it is real data derived from a sample of 81 subjects. The following variables will be used in the description; all variables except Z help (z score between -3 and +3) are measured on a little (1) to much (7) scale.
Variable: Description
Z help: The dependent variable (a measure of total amount of time spent helping a friend with a problem).
Sympathy: The feelings of sympathy aroused in the helper by the friend's need.
Anger: The feelings of anger or irritation aroused in the helper by the friend's need.
Efficacy: Self-efficacy of the helper in relation to the friend's need.
Severity: Helper's rating of how severe the friend's problem was.
Empatend: Empathic tendency of the helper as measured by a personality test.

Helping2.sav
In the helping2 file, self-efficacy (the belief that one has the ability to help effectively) was measured by 15 questions, each paired with an amount-of-help question that measured a particular type of helping. An example of one of the paired questions follows:

a. Time spent expressing sympathy, empathy or understanding
None    0-15 minutes    15-30 minutes    30-60 minutes    1-2 hours    2-5 hours    --- hours

b. Did you believe you were capable of expressing sympathy, empathy or understanding to your friend?
1 (Not at all)    2    3    4 (some)    5    6    7 (very much so)

There were three categories of help represented in the 15 questions: 6 questions were intended to measure empathic types of helping, 4 questions were intended to measure informational types of helping, 4 questions were intended to measure instrumental types of helping, and the 15th question was open-ended to allow any additional type of help given to be inserted. Factor analysis was conducted on the 15 self-efficacy questions to see if the results would yield the three categories of self-efficacy that were originally intended.

Variable: Description
effic1: efficacy for encourage reassure
effic2: efficacy for tasks or services
effic3: efficacy for appraise/clarify
effic4: efficacy for validate affirm
effic5: efficacy for loaning materials
effic6: efficacy for information advice
effic7: efficacy for express willingness to help
effic8: efficacy for participate in activities
effic9: efficacy for find someone to help
effic10: efficacy for express sympathy empathy concern
effic11: efficacy for reduce tension tell jokes
effic12: efficacy for teach to do better
effic13: efficacy for empathic listening
effic14: efficacy for relieve of self blame
effic15: efficacy for open-ended question

Graduate.sav
A wealth of information is collected about each applicant prior to acceptance, and department records indicate whether that student was successful in completing the course. Our example uses the information collected prior to acceptance to predict successful completion of a graduate program. The file is called graduate.sav and consists of 50 students admitted into the program between 7 and 11 years ago. The dependent variable is category (1=finished the PhD, 2=did not finish), and 27 predictor variables are utilized to predict category membership in one of these two groups:
Variable: Description
Gender: 1=female, 2=male
Age: age in years at the time of application
Marital: 1=M, 2=S
GPA: overall undergraduate GPA
Area GPA: GPA in the area of specialty
Grearea: score on the major area section of the GRE
Grequent: score on the quantitative section of the GRE
Greverbal: score on the verbal section of the GRE
Letter1: first of the three recommendation letters (rated 1=weak through 9=strong)
Letter2: second of the three recommendation letters (rated 1=weak through 9=strong)
Letter3: third of the three recommendation letters
Motive: applicant's level of motivation (1=low to 9=high)
Stable: applicant's emotional stability (same scale for this and all that follow)
Resource: financial resources and support system in place
Interact: applicant's ability to interact comfortably with peers and superiors
Hostile: applicant's level of inner hostility
Impress: impression of selectors who conducted an interview
