Propensity score matching
Match cases to more comparable controls using propensity score matching
Propensity score matching is a statistical technique used in observational research designs to estimate each participant's probability of being in the treatment group rather than the control group, based on pertinent demographic, confounding, and predictor variables. It is primarily employed to match each treatment or case participant to a comparable control participant from the population. Because group membership is measured at a categorical level, logistic regression (two groups) or multinomial logistic regression (three or more groups) is the statistical model used to estimate propensity scores.
At first, the statistical reasoning associated with propensity score matching may seem backwards, because group membership (treatment or control) is normally used to predict some sort of outcome (yes or no). The only difference between estimating propensity scores and running a regular logistic or multinomial logistic regression is that, instead of predicting an outcome (dependent variable), you are predicting what would normally be a predictor (independent variable): membership in the treatment or control group.
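In symbols, the idea can be sketched as follows for the two-group case. The notation is an assumption introduced here for illustration: $T$ is group membership (1 = treatment, 0 = control), $X_1, \dots, X_k$ are the variables entered into the model, and $e(x)$ is the propensity score.

```latex
% Propensity score: probability of treatment given the observed variables
e(x) = P(T = 1 \mid X = x)

% Logistic regression model for the propensity score (two groups)
\log\frac{e(x)}{1 - e(x)} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
```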
Also, just as with regular regression, the independent, predictor, demographic, prognostic, clinical, and confounding variables entered into the model must have some theoretical, conceptual, or physiological relevance to group membership. The propensity scores, and the matches built from them, are only as valid as the predictor variables used in the regression model. This is the primary weakness of propensity score matching: the predicted group membership is adjusted only for the variables included in the model. Results can easily become biased when important variables are left out.
The four steps for selecting controls using propensity score matching
1. Define inclusion and exclusion criteria for case and control group membership.
2. Identify the variables that affect group membership (independent, predictor, demographic, prognostic, clinical, and confounding variables).
3. Enter the variables into either a logistic regression (two groups) or multinomial logistic regression (three or more groups) model predicting for group membership.
4. Match cases to controls with similar propensity scores (estimated likelihood of group membership).
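Steps 3 and 4 can be sketched in code. The example below is a minimal illustration using only the Python standard library, not a definitive implementation: the simulated data, the covariate names (`age`, `severity`), the gradient-descent fit, the greedy 1:1 nearest-neighbor rule, and the caliper of 0.05 are all assumptions chosen for demonstration. In practice a statistical package would normally handle the model fitting.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp()
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Propensity score: estimated probability of being a case."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Logistic regression by batch gradient descent.
    Returns [intercept, coefficient_1, ...]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def match_cases(scores, treated, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.
    A case stays unmatched if no control is within the caliper."""
    cases = [i for i, t in enumerate(treated) if t == 1]
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for c in cases:
        if not controls:
            break
        best = min(controls, key=lambda k: abs(scores[k] - scores[c]))
        if abs(scores[best] - scores[c]) <= caliper:
            pairs.append((c, best))
            controls.remove(best)  # each control is used at most once
    return pairs

# Simulated (hypothetical) data: two standardized covariates that
# influence group membership.
random.seed(0)
X, treated = [], []
for _ in range(300):
    age = random.gauss(0, 1)
    severity = random.gauss(0, 1)
    p_case = sigmoid(-1.0 + 0.8 * age + 0.5 * severity)
    X.append([age, severity])
    treated.append(1 if random.random() < p_case else 0)

# Step 3: model group membership and compute propensity scores.
weights = fit_logistic(X, treated)
scores = [predict_proba(weights, x) for x in X]

# Step 4: match each case to the control with the closest score.
pairs = match_cases(scores, treated)
print(f"{len(pairs)} of {sum(treated)} cases matched")
```

Greedy matching depends on the order in which cases are processed, and the caliper trades sample size for closeness of the matches; both are design choices rather than fixed parts of the method.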
There are several other uses for propensity scores. One is to enter the propensity score as an independent predictor variable in a regular logistic or multinomial logistic regression. A single propensity score is often a better representation of several variables in a multivariable model. Essentially, rather than using four variables to predict an outcome, estimate propensity scores from those four variables and use the resulting scores as just one variable. That one variable already accounts for the shared variance among the predictor variables across the groups.
Another use of the propensity score is to statistically adjust for baseline differences in non-probability samples. The propensity score is entered into the regression model or used as a covariate in a factorial analysis. Unlike in probability samples, baseline differences in non-probability samples cannot be assumed to be due to chance. Potential selection biases must be accounted for to truly understand the treatment effects observed in observational studies, and propensity scores can serve as this adjustment in the statistical analysis.
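The covariate-adjustment use described above can be illustrated with a small simulation. This is a hedged sketch under stated assumptions, using only the Python standard library: the data are simulated, the single confounder `x`, the true treatment effect of 2.0, and the helper functions are all hypothetical, and the outcome is continuous, so ordinary least squares is used for the outcome model.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp()
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_1d(x, t, lr=0.5, epochs=300):
    """One-predictor logistic regression by batch gradient descent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for xi, ti in zip(x, t):
            err = sigmoid(b0 + b1 * xi) - ti
            g0 += err
            g1 += err * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def ols(X, y):
    """Ordinary least squares via the normal equations.
    Returns [intercept, coefficient_1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated non-randomized data: x drives both treatment and outcome.
random.seed(1)
TRUE_EFFECT = 2.0  # assumed value, used only to generate the data
x = [random.gauss(0, 1) for _ in range(1000)]
t = [1 if random.random() < sigmoid(xi) else 0 for xi in x]
y = [TRUE_EFFECT * ti + 1.5 * xi + random.gauss(0, 1) for xi, ti in zip(x, t)]

# Unadjusted comparison: difference in group means (confounded by x).
mean = lambda v: sum(v) / len(v)
naive = (mean([yi for yi, ti in zip(y, t) if ti == 1])
         - mean([yi for yi, ti in zip(y, t) if ti == 0]))

# Adjusted comparison: the propensity score enters as a covariate.
b0, b1 = fit_logistic_1d(x, t)
ps = [sigmoid(b0 + b1 * xi) for xi in x]
adjusted = ols([[ti, pi] for ti, pi in zip(t, ps)], y)[1]

print(f"unadjusted estimate: {naive:.2f}")
print(f"propensity-adjusted estimate: {adjusted:.2f}")
```

In this simulation the unadjusted difference mixes the treatment effect with the baseline difference in `x`, while the propensity-adjusted coefficient should land closer to the value used to generate the data. The adjustment only covers variables included in the propensity model, so an omitted confounder would still bias the result.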