Statistical Package for the Social Sciences (SPSS; Armonk, NY, IBM Corp.) is a statistical software application that allows researchers to enter and manipulate data and conduct various statistical analyses. Step-by-step methods for conducting and interpreting over 60 statistical tests in SPSS are available in Research Engineer.
Spearman's rho vs. Pearson's r
Bivariate associations between variables
Surveys and the outcomes they generate often fail to meet the assumption of normality, as judged by skewness and kurtosis statistics. Some types of variables are also naturally skewed (e.g., income, length of stay at a hospital) and thus call for non-parametric statistics.
Spearman's rho correlation is considered non-parametric because it is computed on the ranks of the observations rather than on their raw values, so it does not assume that either variable is normally distributed. It is the correlational test used when finding the association between two variables measured at an ordinal level. Ordinal level measurement conveys order but not equal distances between categories, and therefore cannot possess the precision and accuracy of continuous variables.
Pearson's r is used when correlating two continuous variables. However, one MUST check for the assumption of normality and identify and make decisions about any outliers (observations more than 3.29 standard deviations away from the mean). This is of PARAMOUNT IMPORTANCE because correlations are highly influenced by outlying observations. Just ONE outlier can artificially push a correlation in a positive or negative direction, and in a statistically significant fashion!
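To make the outlier point concrete, here is a minimal Python sketch (not SPSS output) that screens for observations beyond 3.29 standard deviations and compares Pearson's r with and without a single extreme case. The data, the function names `pearson_r` and `flag_outliers`, and the added observation of (200, 200) are all hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson's r computed from deviations about the two means."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def flag_outliers(values, cutoff=3.29):
    """Return the indices of observations whose |z-score| exceeds the cutoff."""
    m, s = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - m) / s > cutoff]

# Hypothetical data: 30 observations with essentially no linear relationship.
x_clean = list(range(1, 31))
y_clean = [(7 * i) % 11 for i in x_clean]

# The same data with ONE extreme observation appended (made-up values).
x_with = x_clean + [200]
y_with = y_clean + [200]

r_clean = pearson_r(x_clean, y_clean)
r_with = pearson_r(x_with, y_with)

print("Flagged indices in x:", flag_outliers(x_with))
print(f"Pearson's r without the outlier: {r_clean:.2f}")
print(f"Pearson's r with the outlier:    {r_with:.2f}")
```

In this made-up example the correlation without the extreme case is close to zero, while adding the single flagged observation pushes it above 0.9. What to do with a flagged case (retain, transform, or remove) remains a judgment call for the analyst.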
Going back to the introduction, remember to use Spearman's rho on ordinal variables, as well as on interval or continuous variables that are naturally skewed. Statistics, as a science, has its limits: parametric tests assume distributions that real data often do not follow. Not everything you come across in existence will fit the normal curve. Luckily, we have non-parametric statistics that are robust to these common violations of the assumptions of parametric tests.
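One way to see why Spearman's rho tolerates skewed data is that it amounts to Pearson's r computed on ranks. The sketch below is a plain Python illustration under that definition, with tied values given the average of their ranks. The helper names and the skewed "income" values are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r computed from deviations about the two means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical, heavily right-skewed outcome that rises with x but not linearly.
years = list(range(1, 11))
income = [2 ** year for year in years]

print(f"Pearson's r:    {pearson_r(years, income):.2f}")
print(f"Spearman's rho: {spearman_rho(years, income):.2f}")
```

Because the made-up relationship is perfectly monotonic, the ranks agree exactly and rho reaches 1, while Pearson's r is pulled down by the skew in the raw values.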
Correlations and regression are used to establish this kind of evidence
Predictive validity evidence means that a survey instrument has the ability to predict some sort of occurrence in the future. The most common application of predictive validity occurs in tests like the ACT, SAT, GRE, MCAT, LSAT, and GMAT. These tests are given before entering various phases of higher education to assess an individual's potential to graduate from either undergraduate or graduate school. Interestingly enough, the correlation between these prevalent (and expensive) tests and graduation is only 0.3! Squaring that correlation gives 0.09, which means that 91% of the variance in graduation is NOT associated with test scores on these instruments. And we are talking about a multi-BILLION dollar business...but I digress.
Predictive validity is calculated using simple correlation coefficients. A correlation of 0.1 is considered weak evidence, a correlation of 0.3 denotes moderate evidence, and a correlation of 0.5 would make most social scientists jump for joy. Remember, in order to understand the amount of shared variance between two constructs, you simply "square" the correlation coefficient to yield the coefficient of determination. Even at the highest level of predictive evidence, a predictive validity coefficient of 0.5, you are only accounting for 25% of the variance shared between the two constructs!
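The squaring step can be checked with a few lines of Python. This is only an arithmetic sketch, and the function name `shared_variance` is an illustrative choice, not an SPSS term.

```python
def shared_variance(r):
    """Coefficient of determination: the squared correlation coefficient."""
    return r ** 2

# The three benchmark correlations discussed above.
for r in (0.1, 0.3, 0.5):
    r_squared = shared_variance(r)
    print(f"r = {r:.1f}: shared variance = {r_squared:.0%}, "
          f"unshared variance = {1 - r_squared:.0%}")
```

For r = 0.3 this gives 9% shared and 91% unshared variance, and for r = 0.5 it gives 25% shared variance.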
Within medicine, I believe that predictive validity plays an important role in imaging and early diagnosis. One of the benefits of working in medicine is that the measures are more objective, concrete, observable, validated, and measurable than those in the social sciences. Correlations of 0.9 are common between various etiological, prognostic, confounding, clinical, and demographic phenomena within medicine. If an imaging or diagnostic method can detect the earlier stages of a progressing disease state, then future outcomes can be mitigated with earlier, preventive treatment.
Correlations are used to generate validity evidence
Concurrent, predictive, convergent, and divergent validity
Correlations play a central role in applied psychometrics.
The inter-correlations among survey instrument items play a role in calculating internal consistency reliability coefficients (Cronbach's alpha, KR-20, and split-half reliability corrected with the Spearman-Brown formula), test-retest reliability, and inter-rater reliability (Kappa, ICC). Correlation matrices also play a significant role in principal components analysis (eigenvalues, component loadings).
Correlations are used to generate convergent, predictive, and concurrent validity evidence. Significant correlations with theoretically or conceptually similar constructs/survey instruments denote evidence of validity. In the social sciences, a validity coefficient (or correlation coefficient) of 0.3 is considered evidence of validity.
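Two of the coefficients named above can be sketched in a few lines of Python. The first function uses the standard variance-based formula for Cronbach's alpha, and the second applies the Spearman-Brown formula to step a split-half correlation up to full test length. The item scores and function names are hypothetical, and this is an illustration rather than a reproduction of SPSS's reliability output.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha from a list of per-item score lists.

    Each inner list holds one item's scores, in the same respondent order.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

def spearman_brown(split_half_r):
    """Estimate full-length reliability from the correlation between two halves."""
    return 2 * split_half_r / (1 + split_half_r)

# Hypothetical responses from six people to three 5-point survey items.
item1 = [4, 5, 3, 2, 4, 5]
item2 = [4, 4, 3, 2, 5, 5]
item3 = [3, 5, 3, 1, 4, 4]

alpha = cronbach_alpha([item1, item2, item3])
print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Spearman-Brown for a split-half r of 0.50: {spearman_brown(0.5):.2f}")
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why the inter-item correlations matter so much for internal consistency.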
Pearson's r and Spearman's rho are the most prevalent correlation tests used to generate validity evidence. These correlations are used with survey instruments that generate ordinal or continuous outcomes.
Eric Heidel, Ph.D. is Owner and Operator of Scalë, LLC.