- Published on
Sensitivity and specificity are important diagnostic measures that provide evidence of a diagnostic test's ability to either "detect disease" or "identify the healthy," depending upon the clinical context.
To evaluate a diagnostic test, its results have to be compared to the results of an existing "gold standard" method of diagnosis in a defined population. The results of both the diagnostic test and the "gold standard" have to be recorded as a dichotomous categorical variable (positive or negative, "+" or "-").
Sensitivity is the ability of a diagnostic test to detect disease. It is the percentage of people who are positive ("+") according to the "gold standard" and who also test positive with the diagnostic test. A diagnostic test with high sensitivity is good at picking up cases of a given disease state. Because such a test misses few true cases, a negative result helps to "rule out" the disease state.
Specificity is the ability of a diagnostic test to identify the healthy. It is the percentage of people who are negative ("-") according to the "gold standard" and who also test negative with the diagnostic test. A diagnostic test with high specificity is good at identifying people who do not require more intensive treatment. Because such a test produces few false positives, a positive result helps to "rule in" the disease state.
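As a minimal sketch of the two definitions above, the calculation from a 2x2 table of diagnostic test results against the "gold standard" can be written as follows. The function name and the cell counts are hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table.

    tp: gold standard "+", diagnostic test "+"
    fn: gold standard "+", diagnostic test "-"
    fp: gold standard "-", diagnostic test "+"
    tn: gold standard "-", diagnostic test "-"
    """
    diseased = tp + fn  # everyone the gold standard calls positive
    healthy = fp + tn   # everyone the gold standard calls negative
    if diseased == 0 or healthy == 0:
        raise ValueError("need at least one gold-standard positive and one negative")
    return tp / diseased, tn / healthy


# Hypothetical counts, for illustration only
sens, spec = sensitivity_specificity(tp=45, fn=5, fp=10, tn=40)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
# prints: sensitivity = 0.90, specificity = 0.80
```

Note that the denominators come from the "gold standard" column totals, not from the number of people who tested positive or negative on the diagnostic test.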
It is optimal to have a diagnostic test that can both detect disease (sensitivity) AND identify the healthy (specificity). For a given test, however, there is usually a trade-off between the two: moving the cutoff for a "positive" result so that sensitivity goes up will tend to push specificity down, and vice versa. A commonly cited rule of thumb for a balanced diagnostic test is 80% for both sensitivity and specificity. However, given the clinical context, a diagnostic test with either higher sensitivity or higher specificity may be warranted.
If the diagnostic test results are measured along a numerical continuum, then receiver operating characteristic (ROC) curves can be plotted to find the cutoff value that best balances sensitivity and specificity. ROC curves can also be used to compare the diagnostic efficacy of several tests concurrently by comparing the area under the curve (AUC).
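The idea of scanning cutoffs can be sketched in a few lines. This example assumes that higher scores indicate disease and uses Youden's J (sensitivity + specificity - 1) as the balancing criterion, which is one common choice and not necessarily the one a given study would use. The data and function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return a list of (cutoff, sensitivity, specificity) tuples.

    scores: numeric test results; labels: 1 = gold-standard "+", 0 = "-".
    A result counts as test-positive when score >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def best_cutoff(points):
    """Pick the cutoff with the largest Youden's J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test results and gold-standard labels, for illustration only
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

cutoff, sens, spec = best_cutoff(roc_points(scores, labels))
print(cutoff, sens, spec)   # prints: 6.0 0.75 1.0
print(auc(scores, labels))  # prints: 0.875
```

Computing `auc` for each of several tests measured on the same patients gives a simple way to compare them, although a formal comparison would also need a confidence interval or a statistical test for the difference.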
- Published on
Categorical variables are very prevalent in medicine. Measures like presence of comorbidities, mortality, and test results are categorical in nature. Here are some general caveats associated with categorical measurement and sample size:
1. Compared with continuous outcomes, categorical outcomes will generally DECREASE statistical power and INCREASE the needed sample size. This is because categorical measurement carries less information and precision than continuous measurement.
2. The underlying algebra associated with calculating 95% confidence intervals of odds ratios and relative risk depends heavily on the sample size. With smaller sample sizes, wider and less precise 95% confidence intervals will be found. If one of the cells of a cross-tabulation table has far fewer observations than the other cells, then the 95% confidence interval will be wider and potentially not truly interpretable. A 95% confidence interval will become narrower, or more precise, only with larger sample sizes.
3. When using categorical variables for diagnostic testing purposes, larger sample sizes will be needed to calculate precise measures of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). With smaller sample sizes in diagnostic studies, a change in one or two observations can have drastic effects on the diagnostic values.
This is especially true when a subjective rating is used to diagnose someone as "positive" or "negative" for a given disease state (for example, a radiologist reading an X-ray). Inter-rater reliability coefficients such as Kappa or the intraclass correlation coefficient (ICC) should be employed to ensure consistency and reliability among subsequent ratings and raters. Sensitivity, specificity, and PPV will be affected by inter-rater reliability. Receiver operating characteristic (ROC) curves can be used to find the value at which the sensitivity and specificity of a test are best balanced. ROC curves can also be used to compare the area under the curve (AUC) between several diagnostic tests at the same time so that the best can be chosen.
4. For each categorical predictor (or variable) that you want to include in a multivariate model, you have to increase your sample size by at least 20-40 observations of the outcome. This is due to the limited precision, accuracy, and statistical power associated with categorical measurement. Researchers HAVE to collect more observations in order to detect any potential significant multivariate associations.
In the case that a polychotomous variable is to be used in a model, create (a-1) dichotomous variables, where a is the number of categories, with "0" as not being that category and "1" as being that category. For each level, 20-40 more observations of the outcome will be needed to have enough statistical power to detect differences amongst the multiple groups.
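Two of the points in the list above lend themselves to a short numerical sketch. The first function computes an odds ratio with an approximate 95% confidence interval using the log (Woolf) method, which is assumed here as one standard textbook approach and is not necessarily what a particular statistics package reports. The second builds the (a-1) dichotomous variables for a polychotomous predictor. All names and counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (log / Woolf method) for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("the log method needs every cell count to be > 0")
    odds_ratio = (a * d) / (b * c)
    # The standard error depends only on the cell counts: small cells inflate it
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def dummy_code(values, reference):
    """Turn a variable with a categories into a-1 indicator (0/1) columns."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


# Same odds ratio (4.0), ten times the sample size: only the interval changes
small = odds_ratio_ci(10, 5, 5, 10)
large = odds_ratio_ci(100, 50, 50, 100)
print("n = 30: ", [round(x, 2) for x in small])
print("n = 300:", [round(x, 2) for x in large])

# Three categories become two dichotomous variables; "A" is the reference level
print(dummy_code(["A", "B", "C", "A"], reference="A"))
# prints: {'B': [0, 1, 0, 0], 'C': [0, 0, 1, 0]}
```

In this illustration the smaller table gives an interval that crosses 1.0 while the larger table does not, even though the point estimate is identical.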
- Published on
The methodology or research design used in a study is employed to answer the research question. Without a research question, there is no reason to have a methodological approach. Observational research designs like case series, case-control, cross-sectional, retrospective cohort, and prospective cohort are used to answer research questions related to associations between variables. Experimental research designs are used to answer research questions related to causal effects.
When choosing a research methodology, one word should always come to mind: feasibility. What you can and cannot do given time, money, resources, and collaborators must be taken into consideration before conducting a study. Researchers who have limited amounts of the aforementioned may be better served by retrospective observational designs, where data on predictors and outcomes already exist. Prospective and experimental designs require much more time and effort to conduct, and considerably more methodological expertise and experience. There must also be sufficient time to follow up on the outcomes of interest.
The PICO (population, intervention, comparator, and outcome) mnemonic is an excellent tool for defining important parts of a research methodology. The population should be defined with regard to inclusion and exclusion criteria. In order for studies and experiments to be replicated, the intervention or treatment must be explicitly described. If the goal of a research study is to show evidence of a treatment effect, then a comparison, control, or comparator group is necessary. Comparator participants should possess similar demographic and clinical characteristics to treatment participants so that any associations or effects can be properly understood. Finally, the primary outcome should be measured at the current "gold standard" level to increase the precision and accuracy of research findings. The "gold standard" outcome is also more generalizable and better understood by clinicians because it is part of their lexicon and cognitive schema.