
Categorical measurement caveats

4/1/2015

0 Comments

 

Effects of categorical measurement

Decrease statistical power and increase sample size

Categorical variables are very prevalent in medicine. Measures like presence of comorbidities, mortality, and test results are categorical in nature. Here are some general caveats associated with categorical measurement and sample size:  

1. Relative to continuous measurement, categorical outcomes DECREASE statistical power and INCREASE the needed sample size. This is because categorical measurement captures less precision and accuracy, so each observation carries less information.

2. The width of a 95% confidence interval for an odds ratio or relative risk depends directly on the sample size, and more specifically on the cell counts of the cross-tabulation table. With smaller sample sizes, wider and less precise 95% confidence intervals will be found by default. If one cell of the cross-tabulation table has far fewer observations than the other cells, then the 95% confidence interval will be wide and potentially uninterpretable. A 95% confidence interval becomes narrower, and therefore more precise, only with larger sample sizes (see the odds ratio sketch after this list).

3. When categorical variables are used for diagnostic testing purposes, larger sample sizes are needed to calculate precise measures of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). With smaller sample sizes in diagnostic studies, a change in one or two observations can have drastic effects on these diagnostic values (see the diagnostic testing sketch after this list).

This is especially true when a subjective rating is used to classify someone as "positive" or "negative" for a given disease state (e.g., a radiologist reading an X-ray). Inter-rater reliability coefficients such as Kappa or the ICC should be employed to ensure consistency and reliability across raters and repeated ratings. Sensitivity, specificity, and PPV will all be affected by inter-rater reliability. Receiver Operating Characteristic (ROC) curves can be used to find the cut point at which the sensitivity and specificity of a test are jointly maximized. ROC curves can also be used to compare the area under the curve (AUC) across several diagnostic tests at the same time so that the best one can be chosen.

4. For each categorical predictor variable (or parameter) that you want to include in a multivariate model, you have to increase your sample size by at least 20-40 observations of the outcome. This is due to the limited precision, accuracy, and statistical power associated with categorical measurement. Researchers HAVE to collect more observations in order to detect any potentially significant multivariate associations.

If a polychotomous variable with a categories is to be used in a model, create (a - 1) dichotomous dummy variables, each coded "0" for not being in that category and "1" for being in that category (see the dummy-coding sketch below). For each of these levels, 20-40 more observations of the outcome will be needed to have enough statistical power to detect differences among the multiple groups.
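
To make point 2 concrete, here is a minimal sketch, with hypothetical cell counts, of the Wald 95% confidence interval for an odds ratio from a 2x2 table. The standard error of the log odds ratio depends only on the four cell counts, so the same odds ratio estimated from ten times the sample yields a much narrower interval.

```python
# Minimal sketch: Wald 95% CI for an odds ratio from a 2x2 table
# (hypothetical cell counts a, b, c, d).
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of ln(OR) uses only the cell counts
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

print(odds_ratio_ci(8, 12, 5, 15))      # small cells -> OR = 2.0 with a wide interval
print(odds_ratio_ci(80, 120, 50, 150))  # same OR = 2.0, 10x the sample -> narrower interval
```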
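
For point 3, here is a minimal sketch, again with hypothetical counts, of how sensitivity, specificity, PPV, and NPV are computed from a 2x2 table, and how far they move when a single case is reclassified in a small diagnostic sample.

```python
# Minimal sketch: diagnostic test measures from a 2x2 table
# (hypothetical counts: tp = true positives, fp = false positives,
#  fn = false negatives, tn = true negatives).
def diagnostics(tp, fp, fn, tn):
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }

print(diagnostics(tp=9, fp=3, fn=1, tn=7))  # small sample
print(diagnostics(tp=8, fp=3, fn=2, tn=7))  # one reclassified case: sensitivity falls from 0.90 to 0.80
```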
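
For the dummy coding described in the last paragraph, here is a minimal sketch using hypothetical category labels; pandas is one convenient way to generate the (a - 1) dichotomous variables.

```python
# Minimal sketch: coding a polychotomous variable with a categories into
# (a - 1) dichotomous dummy variables (hypothetical "stage" labels).
import pandas as pd

df = pd.DataFrame({"stage": ["I", "II", "III", "II", "I", "III"]})

# drop_first=True keeps (a - 1) dummies; the dropped level is the reference category.
dummies = pd.get_dummies(df["stage"], prefix="stage", drop_first=True)
print(dummies)
```

The dropped level serves as the reference category against which the coefficients or odds ratios for the remaining levels are interpreted.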

Scale, LLC
0 Comments

New propensity score matching, calculators, reliability, and regression diagnostics pages in Research Engineer

1/12/2015

0 Comments

 

New pages for Research Engineer

Increased content validity for the website

Propensity score matching is a statistical methodology used in observational research designs. It is also very useful for controlling for confounding variables in multivariate models. A new page has been added describing its use in research.
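
As a rough illustration of the general idea (not the Research Engineer page itself), here is a minimal sketch using simulated data and hypothetical variable names: a logistic regression estimates each subject's propensity score, and each treated subject is then matched to the nearest-scoring control.

```python
# Minimal propensity score matching sketch (simulated data, hypothetical names).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "age": rng.normal(60, 10, n),
    "comorbidity": rng.integers(0, 2, n),
})
# Treatment assignment depends on the confounders (an observational design).
logit = -4 + 0.05 * df["age"] + 0.8 * df["comorbidity"]
df["treated"] = rng.random(n) < 1 / (1 + np.exp(-logit))

# 1. Estimate propensity scores: P(treated | confounders).
ps_model = LogisticRegression().fit(df[["age", "comorbidity"]], df["treated"])
df["pscore"] = ps_model.predict_proba(df[["age", "comorbidity"]])[:, 1]

# 2. Greedy 1:1 nearest-neighbor matching on the propensity score, without replacement.
treated = df[df["treated"]]
controls = df[~df["treated"]]
pairs = []
for i, row in treated.iterrows():
    j = (controls["pscore"] - row["pscore"]).abs().idxmin()
    pairs.append((i, j))
    controls = controls.drop(j)

print(f"{len(pairs)} matched treated/control pairs")
```

After matching, covariate balance between the matched groups should be checked before estimating the treatment effect.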

I have published all of the research calculators available in Research Engineer on one page for easier access. New pages are also available for internal consistency reliability and inter-rater reliability.

If you use Research Engineer for regression analysis, then you have seen the methods associated with residual analysis and with meeting statistical assumptions such as linearity, normality, and equal variances. New pages are available to give deeper insight into these important statistical assumptions.
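
For readers who want to reproduce those checks on their own data, here is a minimal sketch using simulated data: a residuals-versus-fitted plot to inspect linearity and equal variances, and a normal probability (Q-Q) plot to inspect normality of the residuals.

```python
# Minimal residual-diagnostics sketch for a fitted linear regression (simulated data).
import numpy as np
import statsmodels.api as sm
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=200)

model = sm.OLS(y, sm.add_constant(x)).fit()
resid, fitted = model.resid, model.fittedvalues

# Linearity / equal variances: residuals vs. fitted values should show no pattern.
plt.scatter(fitted, resid)
plt.axhline(0, color="gray")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")

# Normality: normal probability (Q-Q) plot of the residuals.
stats.probplot(resid, dist="norm", plot=plt.figure().gca())
plt.show()
```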

Click on a button below to get started! Thank you!
Propensity Score Matching
Calculators

Cronbach's Alpha
Split-Half Reliability
Kuder-Richardson 20 (KR-20)
Kappa
Intraclass Correlation Coefficient

Regression
Residual Analysis
Normal Probability Plot
Linearity

Scale, LLC
0 Comments

The Kappa statistic

12/6/2014

0 Comments

 

Kappa is a measure of inter-rater reliability

Rating performance or constructs at a dichotomous categorical level

The Kappa statistic is a measure of inter-rater reliability when a construct or behavior is rated using a dichotomous categorical outcome. When a sequential series of steps must be completed to yield an end product, as in performance assessment, a "checklist" or series of "yes/no" responses is scored by independent raters. The Kappa statistic can then be used to assess the level of agreement, consistency, and reliability between raters across these dichotomous responses.
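
As a minimal sketch, Kappa is the observed agreement corrected for the agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). Below, two hypothetical raters score the same ten dichotomous checklist items; scikit-learn is one convenient implementation.

```python
# Minimal Cohen's kappa sketch for two raters scoring the same
# dichotomous checklist items (hypothetical ratings).
from sklearn.metrics import cohen_kappa_score

rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
rater_b = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]

# kappa = (p_o - p_e) / (1 - p_e): observed agreement corrected for chance agreement
print(cohen_kappa_score(rater_a, rater_b))
```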

It is important that raters have an operational definition of what constitutes a "yes" or "no" with regard to performance. The construct or behavior of interest must be standardized between raters so that unsystematic bias can be reduced. A lack of operationalization and standardization in performance assessment significantly DECREASES the chances of obtaining evidence of inter-rater reliability when using the Kappa statistic.

Kappa is not a "powerful" statistic because of the dichotomous categorical variables used in the analysis. Larger sample sizes are needed to achieve adequate statistical power when categorical outcomes are utilized. So, many observations of performance or simulation may be needed to adequately assess BOTH inter-rater reliability and the outcomes of interest. The chances of having adequate inter-rater reliability decrease with fewer observations of performance or simulation.

Scale, LLC
0 Comments

The role of correlations in psychometrics

11/29/2014

1 Comment

 

Correlations are used to generate validity evidence

Concurrent, predictive, convergent, and divergent validity

Correlations play a central role in applied psychometrics.

The inter-correlations among survey instrument items play a role in calculating internal consistency reliability coefficients (Cronbach's alpha, split-half reliability with the Spearman-Brown correction, KR-20), test-retest reliability, and inter-rater reliability (Kappa, ICC). Correlation matrices also play a significant role in principal components analysis (eigenvalues, factor loadings).
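
As a minimal sketch of how item inter-correlations feed an internal consistency coefficient, here is Cronbach's alpha computed from hypothetical item-level Likert responses.

```python
# Minimal Cronbach's alpha sketch (hypothetical 5-point Likert responses;
# rows = respondents, columns = survey items).
import numpy as np

items = np.array([
    [4, 5, 4, 5],
    [3, 3, 2, 3],
    [5, 5, 4, 4],
    [2, 1, 2, 2],
    [4, 4, 5, 4],
])

k = items.shape[1]
item_variances = items.var(axis=0, ddof=1)
total_variance = items.sum(axis=1).var(ddof=1)

# alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(round(alpha, 3))
```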

Correlations are used to generate convergent, predictive, and concurrent validity evidence. Significant correlations with theoretically or conceptually similar constructs or survey instruments denote evidence of validity. In the social sciences, a validity coefficient (a correlation coefficient) of .30 or higher is commonly considered evidence of validity.

Pearson's r and Spearman's rho are the most prevalent correlation tests used to generate validity evidence. These correlations are used with survey instruments that generate continuous (Pearson's r) or ordinal (Spearman's rho) outcomes.
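
As a minimal sketch, here are both correlation tests run on hypothetical scores from a new scale and an established scale; scipy is one convenient implementation.

```python
# Minimal sketch: Pearson's r (continuous outcomes) and Spearman's rho
# (ordinal outcomes) for hypothetical scale scores.
from scipy import stats

new_scale = [12, 15, 11, 18, 20, 14, 16, 19]
established_scale = [30, 34, 28, 40, 44, 31, 35, 41]

r, p_r = stats.pearsonr(new_scale, established_scale)
rho, p_rho = stats.spearmanr(new_scale, established_scale)
print(f"Pearson r = {r:.2f} (p = {p_r:.3f}), Spearman rho = {rho:.2f} (p = {p_rho:.3f})")
```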

Scale, LLC
1 Comment

    Archives

    March 2016
    January 2016
    November 2015
    October 2015
    September 2015
    August 2015
    July 2015
    May 2015
    April 2015
    March 2015
    February 2015
    January 2015
    December 2014
    November 2014
    October 2014
    September 2014

    Author

    Eric Heidel, Ph.D. is Owner and Operator of Scalë, LLC.

    Categories

    All
    95% Confidence Interval
    Absolute Risk Reduction
    Accuracy
    Acquiring Clinical Evidence
    Adjusted Odds Ratio
    Affordable Care Act
    Alpha Value
    ANCOVA Test
    ANOVA Test
    Applying Clinical Evidence
    Appraisal Of The Literature
    Appraising Clinical Evidence
    A Priori
    Area Under The Curve
    Asking Clinical Questions
    Assessing Clinical Practice
    AUC
    Basic Science
    Beta Value
    Between-subjects
    Biserial
    Blinding
    Bloom's Taxonomy
    Bonferroni
    Boolean Operators
    Calculator
    Case-control Design
    Case Series
    Categorical
    Causal Effects
    Chi-square
    Chi-square Assumption
    Chi-square Goodness-of-fit
    Classical Test Theory
    Clinical Pathways
    Clustered Random Sampling
    Cochran-Mantel-Haenszel
    Cochran's Q Test
    Coefficient Of Determination
    Cognitive Dissonance
    Cohort
    Comparative Effectiveness Research
    Comparator
    Concurrent Validity
    Confidence Interval
    Confirmatory Factor Analysis
    Construct Specification
    Construct Validity
    Continuous
    Control Event Rate
    Convenience Sampling Method
    Convergent Validity
    Copyright
    Correlations
    Count Variables
    Cox Regression
    Cronbach's Alpha
    Cross-sectional
    Curriculum Vitae
    Database Management
    Diagnostic Testing
    EBM
    Education
    Effect Size
    Empirical Literature
    Epidemiology
    Equivalency Trial
    Eric Heidel
    Evidence-based Medicine
    Exclusion Criteria
    Experimental Designs
    Experimental Event Rate
    Facebook
    Factorial ANOVA
    Feasible Research Questions
    FINER
    Fisher's Exact Tests
    Friedman's ANOVA
    Generalized Estimating Equations (GEE)
    "gold Standard" Outcome
    G*Power
    Guidelines For Authors
    Hazard Ratio
    Hierarchical Regression
    Homogeneity Of Variance
    Hypothesis Testing
    ICC
    Incidence
    Inclusion Criteria
    Independence Of Observations Assumption
    Independent Samples T-test
    Intention-to-treat
    Internal Consistency Reliability
    Interquartile Range
    Inter-rater Reliability
    Interval Variables
    Intervention
    Intraclass Correlation Coefficient
    Isomorphism
    Item Response Theory
    Kaplan-Meier Curve
    Kappa Statistic
    KR-20
    Kruskal-Wallis
    Kurtosis
    Levene's Test
    Likert Scales
    Linearity
    Listwise Deletion
    Logarithmic Transformations
    Logistic Regression
    Log-Rank Test
    Longitudinal Data
    MANCOVA
    Mann-Whitney U
    MANOVA
    Mass Emails In Survey Research
    Math
    Mauchly's Test
    McNemar's Test
    Mean
    Measurement
    Median
    Medicine
    Merging Databases
    Missing Data
    Mode
    Multinomial Logistic Regression
    Multiple Regression
    Multivariate Statistics
    Negative Binomial Regression
    Negative Predictive Value
    Nominal Variables
    Nonequivalent Control Group Design
    Non-inferiority
    Non-inferiority Trial
    Non-parametric Statistics
    Non-probability Sampling
    Normality
    Normality Of Difference Scores
    Normal Probability Plot
    Novel Research Question
    Number Needed To Treat
    Observational Research
    Odds Ratio With 95% CI
    One-sample Median Tests
    One-sample T-test
    One-sided Hypothesis
    One-Way Random
    Operationalization
    Ordinal
    Outcome
    Outliers
    Parametric Statistics
    Pearson's R
    Ph.D.
    Phi Coefficient
    PICO
    Pilot Study
    Point Biserial
    Poisson Regression
    Population
    Positive Predictive Value
    Post Hoc
    Post-positivism
    PPACA
    PPV
    Precision
    Predictive Validity
    Prevalence
    Principal Components Analysis
    Probability Sampling
    Propensity Score Matching
    Proportion
    Proportional Odds Regression
    Prospective Cohort
    Psychometrics
    Psychometric Tests
    Publication
    Publication Bias
    Purposive Sampling
    P-value
    Random Assignment
    Randomized Controlled Trial
    Random Selection
    Rank Biserial
    Ratio Variables
    Receiver Operator Characteristic
    Regression
    Regression Analysis
    Relative Risk
    Relevant Research Question
    Reliability
    Repeated-measures ANOVA
    Repeated-measures T-test
    Research
    Research Design
    Research Engineer
    Research Journal
    Research Question
    Residual Analysis
    Retrospective Cohort
    ROC Curve
    Sample Size
    Sampling
    Sampling Error
    Sampling Method
    Scales Of Measurement
    Science
    Search Engine
    Search Query
    Sensitivity
    Simple Random Sampling
    Sitemap
    Skewness
    Social Science
    Spearman-Brown
    Spearman's Rho
    Specificity
    Specificity In Literature Searching
    Sphericity Assumption
    Split-half Reliability
    SPSS
    Standard Deviation
    Standards Of Care
    Statistical Analysis
    Statistical Assumptions
    Statistical Consultation
    Statistical Power
    Statistical Power Analysis
    Statistical-power-test
    Statistician
    Statistics
    Stratified Random Sampling
    Survey
    Survey Construct Specification
    Survey Methods
    Systematic Review
    Test-Retest Reliability
    Twitter
    Two-sided Hypothesis
    Two-Way Mixed
    Two-Way Random
    Type I Error
    Type II Error
    Unadjusted Odds Ratio
    Validity
    Variables
    Variance
    Wilcoxon
    Within-subjects
    YouTube


    Contact Form

Contact Dr. Eric Heidel
[email protected]
(865) 742-7731

Copyright © 2024 Scalë. All Rights Reserved. Patent Pending.