# Multiple regression

## Test multivariate associations when predicting for a continuous outcome

Multiple regression is used to predict continuous outcomes. In multiple regression, it is hypothesized that a series of predictor, demographic, clinical, and confounding variables have some sort of association with the outcome. The continuous outcome in multiple regression should be approximately normally distributed; strictly speaking, the normality assumption applies to the residuals of the model. The predictor, demographic, clinical, and confounding variables can be entered into a **simultaneous model** (all together at the same time), in a **stepwise model** (where an algorithm picks the best group of variables that account for the most variance in the outcome), or in a **hierarchical model** (a theoretical framework is used to choose the order of entry into a model). Multiple regression yields an algorithm that can predict a continuous outcome.

The figure below depicts the use of multiple regression (simultaneous model). Predictor, clinical, confounding, and demographic variables are being used to predict a continuous outcome that is normally distributed. Multiple regression is a multivariate test that yields beta weights, standard errors, and a measure of observed variance.
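To make the simultaneous model concrete, here is a minimal sketch in Python with NumPy (not SPSS) that enters all predictors at once and returns the three quantities named above: beta weights, standard errors, and R squared. The function name `fit_simultaneous` and the simulated data are assumptions made for illustration only; they do not come from the original article.

```python
import numpy as np

def fit_simultaneous(X, y):
    """Ordinary least squares with every predictor entered at the same time.

    Returns (coefficients, standard errors, R squared); the first
    coefficient is the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])       # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    df_resid = n - k - 1                            # residual degrees of freedom
    mse = residuals @ residuals / df_resid          # residual variance estimate
    cov = mse * np.linalg.inv(design.T @ design)    # coefficient covariance
    std_err = np.sqrt(np.diag(cov))
    ss_total = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - (residuals @ residuals) / ss_total
    return coef, std_err, r_squared

# Simulated (made-up) data: three predictors, the third unrelated to the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=200)

coef, std_err, r_squared = fit_simultaneous(X, y)
print("B:", np.round(coef, 3))
print("Std. Error:", np.round(std_err, 3))
print("R Square:", round(r_squared, 3))
```

A stepwise or hierarchical model would call the same fit repeatedly on different subsets or blocks of columns; only the order of entry changes, not the estimation itself.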

### The steps for conducting multiple regression in SPSS

1. The data is entered in a multivariate fashion.

2. Click **Analyze**.

3. Drag the cursor over the **Regression** drop-down menu.

4. Click **Linear**.

5. Click on the continuous outcome variable to highlight it.

6. Click on the **arrow** to move the variable into the **Dependent:** box.

7. Click on the first predictor variable to highlight it.

8. Click on the **arrow** to move the variable into the **Independent(s):** box.

9. Repeat Steps 7 and 8 until all of the predictor variables are in the **Independent(s):** box.

10. Click on the **Statistics** button.

11. Click on the **R squared change**, **Collinearity diagnostics**, **Durbin-Watson**, and **Casewise diagnostics** boxes to select them.

12. Click on the **Plots** button.

13. Click on the DEPENDNT variable to highlight it.

14. Click on the **arrow** to move the variable into the **X:** box.

15. Click on the *ZRESID variable to highlight it.

16. Click on the **arrow** to move the variable into the **Y:** box.

17. In the **Standardized Residual Plots** table, click on the **Histogram** and **Normal probability plot** boxes to select them.

18. Click **Continue**.

19. Click **OK**.

### The steps for interpreting the SPSS output for multiple regression
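Two columns of the **Coefficients** table, **Beta** and **Tolerance**, are derived quantities, and it can help to see how they are computed before reading them. The sketch below is a minimal NumPy illustration under stated assumptions, not SPSS's own implementation: a standardized Beta is taken as the unstandardized B rescaled by the ratio of the predictor's standard deviation to the outcome's, and Tolerance is taken as 1 minus the R squared obtained by regressing each predictor on all the others. The function names and the simulated data are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R squared from an ordinary least squares fit of y on X (with intercept)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def tolerance(X):
    """Tolerance per predictor: 1 - R^2 of that predictor on the remaining ones."""
    X = np.asarray(X, dtype=float)
    return np.array([
        1.0 - r_squared(np.delete(X, j, axis=1), X[:, j])
        for j in range(X.shape[1])
    ])

def standardized_betas(X, y):
    """Standardized coefficients: B_j * sd(x_j) / sd(y)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    b = coef[1:]                                    # drop the intercept
    return b * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Simulated (made-up) data: x3 is nearly a copy of x1, so both should
# show low Tolerance, while x2 is independent of the others.
rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = rng.normal(size=300)
x3 = x1 + 0.3 * rng.normal(size=300)
X = np.column_stack([x1, x2, x3])
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=300)

tol = tolerance(X)
betas = standardized_betas(X, y)
print("Tolerance:", np.round(tol, 3))
print("Beta:", np.round(betas, 3))
print("Below .75:", [f"x{j + 1}" for j in np.flatnonzero(tol < 0.75)])
```

In this simulated example the two overlapping predictors fall below the .75 cutoff used in this article, which is the situation where combining them or dropping one is worth considering.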

1. Look in the **Model Summary** table, under the **R Square** and the **Sig. F Change** columns. These are the values that are interpreted.

   The **R Square** value is the amount of variance in the outcome that is accounted for by the predictor variables you have used.

   If the *p*-value is **LESS THAN .05**, the model has accounted for a statistically significant amount of variance in the outcome.

   If the *p*-value is **MORE THAN .05**, the model has not accounted for a significant amount of variance in the outcome.

2. Look in the **Coefficients** table, under the **B**, **Std. Error**, **Beta**, **Sig.**, and **Tolerance** columns.

   The **B** column contains the unstandardized beta coefficients that depict the magnitude and direction of the effect on the outcome variable.

   The **Std. Error** column contains the standard errors associated with the unstandardized beta coefficients.

   The **Beta** column presents standardized beta coefficients for each predictor variable.

   The **Sig.** column shows the *p*-value associated with each predictor variable.

   If a *p*-value is **LESS THAN .05**, then that variable has a significant association with the outcome variable.

   If a *p*-value is **MORE THAN .05**, then that variable does not have a significant association with the outcome variable.

   The **Tolerance** column presents values related to assessing multicollinearity among the predictor variables.

   If any of the Tolerance values are **BELOW .75**, consider creating a new variable or deleting one of the predictor variables.

### Residuals

At this point, researchers need to construct and interpret several plots of the raw and standardized residuals to fully assess the fit of the model. Residuals can be thought of as **the error associated with predicting or estimating outcomes using predictor variables**. Residual analysis is **extremely important** for checking the linearity, normality, and homogeneity of variance assumptions of multiple regression.

Scroll down to the bottom of the SPSS output to the **Scatterplot**. If the plot is linear, then researchers can assume linearity.

### Outliers

**Normality and equal variance** assumptions apply to multiple regression analyses.
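The residual checks described above can be sketched outside SPSS as well. The example below is an illustration under assumptions, not a reproduction of SPSS output: standardized residuals are computed as raw residuals divided by the root mean squared error, cases beyond a cutoff of 3 standard deviations are flagged as potential outliers (the cutoff is a parameter, set here to a commonly used value), and a simple normality check compares observed and expected cumulative probabilities using the plotting position (i - 0.5) / n, which is one of several conventions. All function names and data are hypothetical.

```python
from math import erf, sqrt

import numpy as np

def standardized_residuals(X, y):
    """Raw OLS residuals divided by the root mean squared error."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    rmse = np.sqrt(resid @ resid / (n - k - 1))
    return resid / rmse

def flag_outliers(z, cutoff=3.0):
    """Indices of cases whose standardized residual exceeds the cutoff."""
    return np.flatnonzero(np.abs(z) > cutoff)

def pp_points(z):
    """Observed vs. expected cumulative probabilities under a normal model."""
    z_sorted = np.sort(z)
    n = len(z_sorted)
    observed = (np.arange(1, n + 1) - 0.5) / n      # assumed plotting position
    expected = np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in z_sorted])
    return observed, expected

# Simulated (made-up) data with one deliberately corrupted case at index 10.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
y = 3.0 + 1.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=100)
y[10] += 10.0

z = standardized_residuals(X, y)
outliers = flag_outliers(z)
observed, expected = pp_points(z)
print("Flagged cases:", outliers)
print("Largest P-P deviation:", round(float(np.max(np.abs(observed - expected))), 3))
```

Points that track closely along the diagonal of observed against expected probabilities are consistent with normally distributed residuals; a flagged case is a prompt to inspect the data, not an automatic reason to delete it.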

Look at the **P-P Plot of Regression Standardized Residual** graph. If there are **no significant deviations of residuals from the line** and the **line is not curved**, then normality and homogeneity of variance can be assumed.

Click on the **Download Database** and **Download Data Dictionary** buttons for a configured database and data dictionary for multiple regression. Click on the **Validation of Statistical Findings** button to learn more about bootstrap, split-group, and jack-knife validation methods.

## Hire A Statistician

**DO YOU NEED TO HIRE A STATISTICIAN?**

Eric Heidel, Ph.D., PStat will provide the following statistical consulting services for undergraduate and graduate students at $100/hour. Secure checkout is available with Stripe, Venmo, Zelle, or PayPal.

- Statistical Analysis
- Research Design
- Sample Size Calculations
- Diagnostic Testing and Epidemiological Calculations
- Survey Design and Psychometrics