Positive Predictive Value - Eric Heidel, PhD PStat

Tags

Published on

April 1, 2015

Categorical measurement caveats

95% Confidence Interval Categorical Diagnostic Testing Inter-rater Reliability Intraclass Correlation Coefficient Kappa Statistic Multivariate Statistics Negative Predictive Value Odds Ratio With 95% CI Positive Predictive Value Relative Risk Sample Size Sensitivity Specificity Statistical-power-test

Effects of categorical measurement

Decrease statistical power and increase sample size

Categorical variables are very prevalent in medicine. Measures like presence of comorbidities, mortality, and test results are categorical in nature. Here are some general caveats associated with categorical measurement and sample size:

1. Compared with continuous outcomes, categorical outcomes will generally DECREASE statistical power and INCREASE the needed sample size. This is due to the lack of precision and accuracy in categorical measurement.

2. The underlying algebra for calculating 95% confidence intervals of odds ratios and relative risks depends directly on the sample size, and specifically on the number of observations in each cell of the cross-tabulation table. With smaller sample sizes, wider and less precise 95% confidence intervals will be found. If one of the cells has far fewer observations than the other cells, then the 95% confidence interval will be wider and potentially not truly interpretable. A 95% confidence interval becomes narrower, or more precise, only with larger sample sizes.

3. When using categorical variables for diagnostic testing purposes, larger sample sizes will be needed to calculate precise measures of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). With smaller sample sizes in diagnostic studies, a change in one or two observations can have drastic effects on the diagnostic values.

This is especially true when a subjective rating is used to diagnose someone as "positive" or "negative" for a given disease state (for example, a radiologist reading an X-ray). Inter-rater reliability coefficients such as Kappa or the intraclass correlation coefficient (ICC) should be employed to ensure consistency and reliability among subsequent ratings and raters. Sensitivity, specificity, and PPV will be affected by inter-rater reliability. Receiver Operating Characteristic (ROC) curves can be used to find the cutoff value at which the combination of sensitivity and specificity of a test is maximized. ROC curves can also be used to compare the area under the curve (AUC) between several diagnostic tests at the same time so that the best can be chosen.

4. For each categorical predictor parameter (or variable) that you want to include in a multivariate model, you have to increase your sample size by at least 20-40 observations of the outcome. This is due to the limited precision, accuracy, and statistical power associated with categorical measurement. Researchers HAVE to collect more observations in order to detect any potential significant multivariate associations.

If a polychotomous variable is to be used in a model, create (a-1) dichotomous variables, where a is the number of categories, with "0" coded as not being in that category and "1" as being in that category. For each level, 20-40 more observations of the outcome will be needed to have enough statistical power to detect differences amongst the multiple groups.
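
The second caveat above, on sample size and confidence interval width, can be illustrated with a short calculation. The following is a minimal sketch, assuming the common Wald (Woolf) log-based method for the 95% confidence interval of an odds ratio; the function name, the cell labels a, b, c, d, and the example counts are all hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval for a 2x2 table.

    Hypothetical cell layout (an assumption for this sketch):
      a = exposed with outcome,   b = exposed without outcome
      c = unexposed with outcome, d = unexposed without outcome
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("this method needs a positive count in every cell")
    odds_ratio = (a * d) / (b * c)
    # The standard error of log(OR) is driven by the reciprocal of each cell
    # count, so the smallest cell dominates the width of the interval.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Same odds ratio (2.5) at two sample sizes, plus a table with sparse cells.
small = odds_ratio_ci(10, 20, 5, 25)       # n = 60
large = odds_ratio_ci(100, 200, 50, 250)   # n = 600
sparse = odds_ratio_ci(2, 28, 1, 29)       # n = 60, two very small cells

for label, (or_, lo, hi) in [("small", small), ("large", large), ("sparse", sparse)]:
    print(f"{label:>6}: OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

In this sketch the small and large tables share the same odds ratio, but only the larger table yields an interval that excludes 1. The sparse table produces the widest interval of the three, which is the situation described as "potentially not truly interpretable."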
Published on

September 10, 2014

Positive Predictive Value and Prevalence

Diagnostic Testing Positive Predictive Value PPV Prevalence

Positive Predictive Value (PPV) and Prevalence

Increased prevalence of an outcome will lead to more cases being "picked up"

Positive predictive value (PPV) is the likelihood that a person with a "+" on a diagnostic test actually has the disease state as it is detected using a "gold standard." Another way of defining PPV is how believable a "+" test result is in a given population.

As the prevalence of a disease state in a given population increases, the positive predictive value of a test will increase, even when the sensitivity and specificity of the test stay the same. This is simply because there are more true cases to be detected, so a larger share of "+" results are true positives.
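
The relationship between prevalence and PPV follows from Bayes' theorem. As a sketch, writing Se for sensitivity, Sp for specificity, and p for prevalence (symbols chosen here for illustration, not taken from the original post):

```latex
\[
\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)}
\]
```

With Se and Sp held fixed, the false positive term (1 - Sp)(1 - p) shrinks as p grows, so PPV rises toward 1. As p approaches 0, the true positive term Se · p vanishes and PPV falls toward 0.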

If you are working with a rare outcome in a given population, be aware that less prevalent outcomes increase the share of "+" results that are false positives. By definition, lower prevalence dictates that there are not many true positives or "sick" people in a given population. With so few actual cases and so many disease-free people being tested, the inherent measurement error associated with diagnostic testing will yield false positives that can outnumber the true positives.
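
A small numeric example makes the effect of rare outcomes concrete. The following is a minimal sketch, assuming a hypothetical test with 95% sensitivity and 95% specificity applied to 10,000 people; these numbers and the function names are illustrative assumptions, not values from any real diagnostic test.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def expected_counts(n, sensitivity, specificity, prevalence):
    """Expected true positives and false positives among n people tested."""
    diseased = n * prevalence
    healthy = n - diseased
    return sensitivity * diseased, (1 - specificity) * healthy


# Hypothetical test characteristics and population size (assumptions).
SENS, SPEC, N = 0.95, 0.95, 10_000

for prev in (0.01, 0.10, 0.30):
    tp, fp = expected_counts(N, SENS, SPEC, prev)
    print(
        f"prevalence {prev:.0%}: about {tp:.0f} true positives, "
        f"{fp:.0f} false positives, PPV = {ppv(SENS, SPEC, prev):.1%}"
    )
```

Under these assumptions, a prevalence of 1% gives roughly 95 true positives against roughly 495 false positives, so most "+" results are wrong even though the test itself is fairly accurate. At 30% prevalence the same test has a PPV close to 90%.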

So, in conclusion, it is very important to know the baseline prevalence of your outcome or disease state in your population of interest when assessing diagnostic tests. Higher prevalence leads to increased PPV, and lower prevalence leads to a larger share of false positives among "+" results.

Contact Dr. Eric Heidel
consultation@scalelive.com
(865) 742-7731

Copyright © 2026 Scalë. All Rights Reserved. Patent Pending.
