Transformations for ANOVA
Account for non-normal distributions when comparing three independent groups
Normality is one of the foundational assumptions of parametric statistics. Parametric tests such as ANOVA assume that the outcome is approximately normally distributed, and their results can be misleading when that assumption is badly violated. If a variable's distribution is non-normal, researchers have several options for still answering the research question.
1. Conduct a logarithmic transformation of the variable, which can "normalize" a positively skewed distribution. The means and standard deviations are then on the log scale and lose their direct interpretability, but the p-value and the effect size are still interpretable.
2. Identify any outliers (values that are more than 3.29 standard deviations away from the mean) and check them for data entry errors. If any observations meet this criterion and outliers make up no more than 10% of all observations, researchers can delete the outliers in a "listwise" fashion, meaning the entire observation is removed from the analysis.
3. Run a non-parametric Kruskal-Wallis test. Non-parametric tests do not assume a normal distribution, so they remain valid when normality is violated.
The steps for conducting logarithmic transformations for ANOVA in SPSS
1. Click Transform.
2. Click Compute Variable.
3. In the Target Variable: box, type a new name for the outcome that reflects that it has been transformed.
4. Click on the continuous outcome variable to highlight it.
5. Click on the arrow to move the variable into the Numeric Expression: box.
6. Type "ln" and put parentheses around the variable. Example: ln(outcome). The natural log is defined only for values greater than zero.
7. Click OK.
8. In the Data View tab of SPSS, the transformed outcome appears as a new variable.
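For readers who want to check the transformation outside SPSS, the computation behind ln(outcome) can be sketched in Python. This is only an illustration: the `outcome` values and the `ln_transform` helper are hypothetical and are not part of the SPSS workflow above.

```python
import math

# Hypothetical outcome values with a long right tail (not real data).
outcome = [1.2, 3.5, 2.8, 40.0, 7.1, 150.0]


def ln_transform(values):
    """Return the natural log of each value, like ln(outcome) in Compute Variable."""
    # The natural log is undefined for zero and negative values,
    # so stop early instead of silently producing invalid results.
    if any(v <= 0 for v in values):
        raise ValueError("natural log requires strictly positive values")
    return [math.log(v) for v in values]


ln_outcome = ln_transform(outcome)
print([round(v, 3) for v in ln_outcome])
```

The transformed values are on the log scale, which is why their means and standard deviations no longer read in the outcome's original units.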
The steps for identifying outliers for ANOVA in SPSS
1. When researchers checked the Save standardized values as variables box while testing the assumption of normality, SPSS created a new variable named with a "Z" followed by the name of the outcome. Example: Zoutcome
2. Click Data.
3. Click Sort Cases.
4. Click on the outcome variable that has a "Z" in front of it.
5. Click on the arrow to move the "Z" outcome into the Sort by: box.
6. Click OK.
7. In the Data View, look at the "Z" outcome variable and identify any observations with an absolute value above 3.29.
8. Look at the original outcome variable and identify the observations that correspond to the "Z" outcome values with an absolute value above 3.29.
9. Make a decision on whether to delete the observation, transform the outcome variable using the steps above, or run a non-parametric Kruskal-Wallis test.
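The outlier screen above can also be sketched in Python as a cross-check. The data and the `flag_outliers` helper are hypothetical, and the sketch assumes the standardized values are computed with the sample standard deviation (n - 1 denominator).

```python
import statistics

# Hypothetical data: 29 typical values plus one extreme value (not real data).
outcome = [9, 10, 11] * 9 + [10, 10] + [100]


def flag_outliers(values, cutoff=3.29):
    """Return z-scores, the observations whose |z| exceeds cutoff, and their proportion."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1), an assumption of this sketch
    z_scores = [(v - mean) / sd for v in values]
    flagged = [
        (index, value, z)
        for index, (value, z) in enumerate(zip(values, z_scores))
        if abs(z) > cutoff
    ]
    proportion = len(flagged) / len(values)
    return z_scores, flagged, proportion


z_scores, flagged, proportion = flag_outliers(outcome)
for index, value, z in flagged:
    print(f"row {index}: value={value}, z={z:.2f}")
print(f"proportion flagged: {proportion:.3f}")
```

If the flagged proportion stays at or below 0.10, the 10% guideline described earlier would permit listwise deletion; otherwise the transformation or the Kruskal-Wallis test is the safer route.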
The steps for conducting a Kruskal-Wallis test when violating assumptions of ANOVA in SPSS
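1. Enter the data in a between-subjects fashion: one row per observation, with one column for the continuous outcome and one column for the "grouping" variable.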
2. Click Analyze.
3. Drag the cursor over the Nonparametric Tests drop-down menu.
4. Drag the cursor over the Legacy Dialogs drop-down menu.
5. Click K Independent Samples.
6. Click on the continuous outcome variable to highlight it.
7. Click on the arrow button to move the outcome variable into the Test Variable List: box.
8. Click on the "grouping" variable to highlight it and then click on the arrow to move the "grouping" variable into the Grouping Variable: box.
9. Click on the Define Range button.
10. Enter the categorical value for the independent group that has the smallest value into the Minimum: box. Example: "0"
11. Enter the categorical value for the independent group that has the largest value into the Maximum: box. Example: "2"
12. Click Continue.
13. Click OK.
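To make clear what the test computes, here is a minimal Python sketch of the Kruskal-Wallis H statistic for exactly three groups, using only the standard library. The group data and function names are hypothetical. The p-value uses the chi-square approximation with 2 degrees of freedom, for which the upper-tail probability is exactly exp(-H/2); this shortcut does not hold for other numbers of groups.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(group0, group1, group2):
    """Return (H, p) with tie correction; p assumes chi-square with df = 2."""
    groups = [group0, group1, group2]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    h /= correction

    p = math.exp(-h / 2)  # chi-square upper tail, valid only for df = 2
    return h, p


# Hypothetical outcome values for groups coded 0, 1, and 2.
h, p = kruskal_wallis_three_groups(
    [1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]
)
print(f"H = {h:.3f}, p = {p:.4f}")
```

The sketch ranks all observations together, which is why the test depends only on the ordering of the values and not on their distribution.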
The steps for interpreting the SPSS output for a Kruskal-Wallis test
1. In the Test Statistics table, look at the p-value in the Asymp. Sig. row. This is the p-value that is interpreted.
If the p-value is LESS THAN .05, researchers have evidence of a statistically significant difference in the continuous outcome variable among the independent groups.
If the p-value is MORE THAN .05, researchers do not have evidence of a statistically significant difference in the continuous outcome variable among the independent groups.
2. Medians and interquartile ranges are reported for each independent group when using the Kruskal-Wallis test.
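The reporting step can be sketched in Python as well. The group data, the example p-values, and the helper names are hypothetical. Quartile conventions differ between software packages, so values from `statistics.quantiles` may not match another program's output exactly.

```python
import statistics


def median_and_quartiles(values):
    """Return (median, Q1, Q3) using the standard library's default quartile method."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return median, q1, q3


def interpret(p_value, alpha=0.05):
    """Translate a p-value into the decision rule described above."""
    if p_value < alpha:
        return "statistically significant difference among the groups"
    return "no statistically significant difference detected among the groups"


# Hypothetical outcome values for groups coded 0, 1, and 2.
groups = {
    0: [1, 2, 3, 4, 5],
    1: [6, 7, 8, 9, 10],
    2: [11, 12, 13, 14, 15],
}

for code, values in groups.items():
    median, q1, q3 = median_and_quartiles(values)
    print(f"group {code}: median = {median}, IQR = {q1} to {q3}")

print(interpret(0.002))  # hypothetical p-value from the Asymp. Sig. row
```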
Copyright © 2024 Scalë. All Rights Reserved. Patent Pending.