Statistical Consultation Line: (865) 742-7731
Accredited Professional Statistician For Hire

Values needed for sample size calculations

10/8/2014


Evidence-based measures of effect

Use the empirical literature to your advantage

One of the most important things you can do when designing your study is to conduct an a priori power analysis. Doing so tells you how many people you will need in your sample to detect the effect size or treatment effect in your study.

Without an a priori calculation, you could waste months or years of your life conducting a study only to find out that you needed just 100 people in each group to achieve significance. Or, the inverse: you conduct a study with only 50 patients and find out in a post hoc fashion that you would have needed 10,000 to demonstrate your effect!

If you are using Research Engineer and G*Power to run your analyses, here are the things you will need:

1. An evidence-based measure of effect from the literature is the first thing you should seek out. Find a study that is theoretically, conceptually, or clinically similar to your own. Try to find a study that uses the same outcome you plan to use in your study.  

2. Use the means, standard deviations, and proportions from these published studies as evidence-based measures of effect size to calculate how large a sample you will need. These values will be reported in the body of the results section or in tables within the manuscript. Basing your a priori power analysis on a well-known study in the field also shows more empirical rigor on your part.

3. Plug these values into G*Power using the steps published on the sample size page to find out how many people you will need to collect for your study.
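If you want to sanity-check what G*Power will tell you, the normal-approximation formula behind a two-group comparison of means can be sketched in a few lines of Python. The means and standard deviation below are hypothetical stand-ins for the values you would pull from a published study:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(mean1, mean2, sd_pooled, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sample comparison of means
    via the normal approximation: n = 2 * ((z_alpha + z_beta) / d) ** 2."""
    d = abs(mean1 - mean2) / sd_pooled             # Cohen's d from published values
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# Hypothetical values "from the literature": means of 50 and 45, pooled SD of 10,
# giving d = 0.5 (a medium effect).
print(n_per_group(50, 45, 10))  # 63 per group (G*Power's t-based answer is 64)
```

The normal approximation runs about one subject per group under the exact t-based calculation, so treat it as a ballpark and let G*Power give the final number.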

Scale, LLC

Measurement at continuous levels

10/7/2014


Measure variables at the highest level possible

Don't discount your continuous variables!

There is a tendency for researchers to take continuous variables and recode them into ordinal or categorical variables. For example, researchers may ask participants to report whether they are 20-30 years old, 31-40 years old, 41-50 years old, 51-60 years old, or 60+ years old. Or, they may set an arbitrary "cut-off" above or below a certain value (people who are 55 years and older versus everyone younger than 55 years).

Researchers lose valuable precision and accuracy in measurement when continuous variables are demoted to ordinal or categorical levels. It is ALWAYS better to take an actual numerical value with a "true zero" and analyze it using parametric statistics. Only if there is a theoretical, conceptual, or empirical basis for paring down continuous measures into lower levels of measurement should it be done. If you were a researcher and wanted the most precise and accurate measure possible of my age, which of the following is the best way to ask?

1. How many years old are you?  (continuous)

2. How old are you? (circle one)  20-30    31-40    41-50    51-60   60+  (ordinal)

3. Are you above or below the age of 55?  (categorical)

The continuous method will give you a stronger measure of age, which can then be broken down into separate ordinal or categorical levels, AT YOUR DISCRETION. So, always measure at the continuous level if at all possible.
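The one-way nature of this hierarchy is easy to demonstrate in code. A short Python sketch, using hypothetical ages and the bracket scheme from above:

```python
def age_bracket(age):
    """Derive the ordinal bracket from a continuous age; the reverse is impossible."""
    if age <= 30:
        return "20-30"
    if age <= 40:
        return "31-40"
    if age <= 50:
        return "41-50"
    if age <= 60:
        return "51-60"
    return "60+"

ages = [23, 37, 55, 68]                       # continuous: full precision retained
brackets = [age_bracket(a) for a in ages]     # ordinal: precision discarded
over_55 = [a >= 55 for a in ages]             # categorical: even less information
print(brackets)   # ['20-30', '31-40', '51-60', '60+']
```

Notice that given only `brackets` or `over_55`, there is no function you could write to recover `ages`; the information is gone for good.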

With this being said, PLEASE realize that while we can go from continuous down to ordinal and categorical levels of measurement, it is IMPOSSIBLE to change a categorical or ordinal variable into a continuous level of measurement.

Let's use a basic example:

Gender - 0 = male and 1 = female

Is there any way to convert this into a continuous variable? No.

Here is another example:

How old are you? (circle one)  20-30    31-40    41-50    51-60   60+

Can you convert this into a continuous variable? No, again.

In conclusion, ALWAYS try to measure your variables at a continuous level, if at all possible or feasible. They can be broken down into ordinal and categorical variables as needed. Also, REALIZE that once you have decided to measure something at a categorical or ordinal level, it cannot be converted to continuous.

Scale, LLC

Publication bias

10/6/2014


Publishing only significant research findings

The collective unconscious and statistics

Admit it, it just feels better when a statistically significant difference or treatment effect is found! Promotion, tenure, benefits, and perks...all relying on the one p-value being below .05! Statistics has done something for you when you have found statistical significance!

Truth be told, this line of thinking has led to a gross overestimation OR underestimation of important treatment effects in the clinical literature. Publication bias is a rampant, unconscious, and deleterious phenomenon within science. But, so long as human beings with presuppositions, biases, and knowledge gaps conduct research and statistics AND so long as human beings are responsible for peer review and gate-keeping of the literature, publication bias will continue to exist. This means that important and potentially life-saving or cost-saving treatments will not be represented in the clinical literature, SIMPLY BECAUSE STATISTICAL SIGNIFICANCE WAS NOT ACHIEVED.

What can be done? I proffer the following:

1.  I think that math, science, and statistics need a complete "makeover" in the collective unconscious. These things are very cool and it is completely awesome to be a nerd and work hard towards mastering content areas within these fields. College degrees in these fields lead to job security and better pay.

In my experience, few people recall their experiences with statistics with much zeal. People have an automatic recoil towards statistics. This MUST change. If statistics and hypothesis testing are going to be the methods by which we conduct, communicate, and interpret research findings, then a drastic change in the collective orientation towards statistics must occur.

2. In tune with #1, statistical scientists and educators must do a better job of teaching the lexicon or language of our mathematical science to the general population. Skewness, kurtosis, effect size, sample size, statistical power, confidence interval, probability, reliability, validity, precision, accuracy, sampling error, normality, homogeneity of variance, sphericity, standard deviation, variance, covariance, confounding, hypothesis testing, reject, do not reject, Type I error, Type II error...WHAT DO ALL OF THESE WORDS MEAN???

Most people do not use these words every day. People experience cognitive dissonance when knowledge and sensory gaps occur. Educators need to take a deductive approach towards imparting the language and "meaning" within content areas such as 1) hypothesis testing, 2) measurement, 3) statistical power, and 4) statistics. Without a basic working knowledge and understanding of these critical terms in applied statistics, the collective unconscious will continue to recoil at the very sight, sound, or presence of statistics, and even some statisticians.

3. Master's- and doctoral-level researchers and clinicians need to possess the knowledge and experience to conduct applied statistics in the correct fashion. I could really get on my "soapbox" here but I've almost taken too much of your time. Simply put, seek out assistance before publishing data that do not meet statistical assumptions.

In conclusion, I hope you can appreciate my candor in this regard. Publication bias is a REAL phenomenon and it has DRASTIC implications on clinical treatment. You want your clinician to be informed by unbiased clinical evidence. However, that is probably not the case. Let's work towards changing this scary truth!   

Scale, LLC

95% confidence intervals

10/5/2014


Precision and consistency of treatment effects

95% confidence intervals are dependent upon sample size

If there is ANY statistical calculation that holds true value for researchers and clinicians on a day-to-day basis, it is the 95% confidence interval wrapped around the findings of inferential analyses. Statistics is not an exact science in the way other mathematical sciences are: measurement error is inherent when measuring anything related to human beings, and FEW tried-and-true causal effects have been proven scientifically. Statistics' strength as a mathematical science lies in its ability to build confidence intervals around findings and put them into a relative context.

Also, 95% confidence intervals act as the primary inference associated with unadjusted odds ratios, relative risks, hazard ratios, and adjusted odds ratios. If the confidence interval crosses 1.0, the effect is non-significant. Wide 95% confidence intervals are indicative of small sample sizes and decreased precision of the effect. Constricted or narrow 95% confidence intervals reflect increased precision and consistency of a treatment effect.
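The dependence of interval width on sample size can be sketched with the normal-approximation interval for a mean, z * SD / sqrt(n). The SD of 10 and the two sample sizes below are hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def ci_half_width(sd, n, level=0.95):
    """Half-width of a normal-approximation CI for a mean: z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sd / sqrt(n)

# Same SD of 10; only the sample size changes.
print(round(ci_half_width(10, 25), 2))    # n = 25  -> +/- 3.92 (wide, less precise)
print(round(ci_half_width(10, 400), 2))   # n = 400 -> +/- 0.98 (narrow, precise)
```

A sixteen-fold increase in sample size cuts the width by a factor of four, which is the sqrt(n) in the denominator doing its work.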

In essence, p-values should not be what people get excited about when it comes to statistical analyses. The interpretation of your findings within the context of the subsequent population means, odds, risk, hazard, and 95% confidence intervals IS the real "meat" of applied statistics.

Scale, LLC

Non-parametric statistics and small sample sizes

10/5/2014


Non-parametric statistics are robust to small sample sizes

The right way to conduct statistics

Mark Twain popularized the saying, "There are lies, damned lies, and statistics." Statistics can be misleading both from the standpoint of the person conducting the analysis and the person interpreting it. Many between-subjects studies have small sample sizes (n < 20), and the statistical assumptions for parametric statistics cannot be met.

For basic researchers that operate day in and day out with small sample sizes, the answer is to use non-parametric statistics. Non-parametric statistical tests such as the Mann-Whitney U, Kruskal-Wallis, Wilcoxon, and Friedman's ANOVA are robust to violations of statistical assumptions and skewed distributions. These tests can yield interpretable medians, interquartile ranges, and p-values.
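As a sketch of what these rank-based tests are doing under the hood, the Mann-Whitney U statistic is just a count of pairwise "wins" between the two groups. The two small samples below are hypothetical; in practice you would use a statistical package for the full test and p-value:

```python
def mann_whitney_u(x, y):
    """U statistic for group x: the number of (x_i, y_j) pairs with x_i > y_j,
    counting ties as 0.5. This is the quantity the rank-based test is built on."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Two small (n < 20) hypothetical groups of scores.
treated = [1, 3, 5, 7]
control = [2, 4, 6, 8]
print(mann_whitney_u(treated, control))  # 6.0
```

Because U depends only on the ordering of the observations, a wild outlier in one group shifts it by at most the number of pairs it participates in, which is exactly the robustness property described above.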

Non-parametric statistics are also useful in the social sciences due to the inherent measurement error associated with assessing human behaviors, thoughts, feelings, intelligence, and emotional states. The underlying algebra associated with psychometrics relies on intercorrelations amongst constructs or items.  Correlations can easily be skewed by outlying observations and measurement error.  Therefore, in concordance with mathematical and empirical reasoning, non-parametric statistics should be used often for between-subjects comparisons of surveys, instruments, and psychological measures.

Scale, LLC

Chi-square p-values are not enough

10/3/2014


Chi-square p-value

Odds ratio with 95% confidence interval should be reported and interpreted

Most people who need statistics are focused only on the almighty p-value of less than .05. When running Chi-square analyses between a dichotomous categorical predictor and a dichotomous categorical outcome, p-values are not the primary inference that should be interpreted for practical purposes. The lack of precision and accuracy in categorical measures, coupled with sampling error, makes the p-values yielded from Chi-square analyses virtually worthless in the applied sense.

The correct statistic to run is an unadjusted odds ratio with 95% confidence interval. This is the best measure for interpreting the magnitude of the association between two dichotomous categorical variables collected in a retrospective fashion. Relative risk can be calculated when the association is assessed in a prospective fashion.

The width of the 95% confidence interval and it crossing over 1.0 dictate the significance and precision of the association between the variables.  With smaller sample sizes, the 95% confidence interval will be wider and less precise. Larger sample sizes will yield more precise effects.
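A minimal sketch of the unadjusted odds ratio with a Woolf (log-based) 95% confidence interval, computed from a hypothetical retrospective 2x2 table:

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted OR from a 2x2 table [[a, b], [c, d]] with a Woolf (log-based)
    95% CI: exp(ln(OR) +/- z * sqrt(1/a + 1/b + 1/c + 1/d))."""
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of ln(OR)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts: exposed cases, exposed controls, unexposed cases, unexposed controls.
or_, lo, hi = odds_ratio_ci(30, 70, 15, 85)
print(round(or_, 2), round(lo, 2), round(hi, 2))  # 2.43 1.21 4.87 (CI excludes 1.0)
```

Here the interval excludes 1.0, so the association is significant, but its width (1.21 to 4.87) tells you the estimate is still fairly imprecise at this sample size.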

Scale, LLC

Ordinal measures becoming continuous with normality

10/2/2014


Ordinal measures and normality

Ordinal level measurement can become interval level with assumed normality

Here is an interesting trick I picked up along the way when it comes to ordinal outcomes and some unvalidated measures. If you run skewness and kurtosis statistics on the ordinal variable and its distribution meets the assumption of normality (skewness and kurtosis statistics are less than an absolute value of 2.0), then you can "upgrade" the variable to a continuous level of measurement and analyze it using more powerful parametric statistics.  

This type of thinking is the reason that scores from the SAT, ACT, GRE, MCAT, LSAT, and validated psychological instruments are treated as continuous. The scores yielded from these instruments are not, by definition, continuous because a "true zero" does not exist. Scores from these tests are often norm- or criterion-referenced to the population so that they can be interpreted in the correct context. Therefore, with the subjectivity and measurement error associated with classical test theory and item response theory, the scores are actually ordinal.

With that being said, if the survey instrument or ordinal outcome is used often in the empirical literature and it meets the assumption of normality as per skewness and kurtosis statistics, treat the ordinal variable as continuous and run analyses using parametric statistics (t-tests, ANOVA, regression) versus non-parametric statistics (Chi-square, Mann-Whitney U, Kruskal-Wallis, McNemar's, Wilcoxon, Friedman's ANOVA, logistic regression).
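As a sketch of the screening step described above, skewness and excess kurtosis can be computed from moments and checked against the absolute-value-of-2.0 rule. The population (moment-based) formulas are used here for simplicity, and the Likert-style scores are hypothetical:

```python
from statistics import mean

def skew_kurtosis(values):
    """Population skewness (m3 / m2**1.5) and excess kurtosis (m4 / m2**2 - 3)."""
    mu = mean(values)
    m2 = mean([(v - mu) ** 2 for v in values])
    m3 = mean([(v - mu) ** 3 for v in values])
    m4 = mean([(v - mu) ** 4 for v in values])
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

# Hypothetical summed Likert scores from a survey instrument.
scores = [12, 14, 15, 15, 16, 16, 17, 17, 18, 20]
skew, kurt = skew_kurtosis(scores)
# Both statistics fall within an absolute value of 2.0, so per the rule above
# this score could be "upgraded" and analyzed with parametric statistics.
print(abs(skew) < 2.0 and abs(kurt) < 2.0)  # True
```

SPSS reports sample-adjusted versions of these statistics, so its values will differ slightly from the moment-based ones here, but the |2.0| screen works the same way.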

Scale, LLC

Statistical Designs

10/1/2014


Research questions lead to choice of statistical design

Differences between-subjects and within-subjects designs

There are terms in statistics that many people do not understand from a practical standpoint. I'm a biostatistical scientist and it took me YEARS to wrap my head around some fundamental aspects of statistical reasoning, much less the lexicon. I would hypothesize that 90% of the statistics reported in the empirical literature as a whole fall into two broad categories, between-subjects and within-subjects. Here is a basic breakdown of the differences between these types of statistical tests:

1. Between-subjects - When you are comparing independent groups on a categorical, ordinal, or continuous outcome variable, you are conducting between-subjects analyses. The "between-" denotes the differences between mutually exclusive groups or levels of a categorical predictor variable. Chi-square, Mann-Whitney U, independent-samples t-tests, odds ratio, Kruskal-Wallis, and one-way ANOVA are all considered between-subjects analyses because of the comparison of independent groups.  

2. Within-subjects - When you are comparing THE SAME GROUP on a categorical, ordinal, or continuous outcome ACROSS TIME OR WITHIN THE SAME OBJECT OF MEASUREMENT MULTIPLE TIMES, then you are conducting within-subjects analyses. The "within-" relates to the differences within the same object of measurement across multiple observations, time, or literally, "within-subjects." Chi-square Goodness-of-fit, Wilcoxon, repeated-measures t-tests, relative risk, Friedman's ANOVA, and repeated-measures ANOVA are within-subjects analyses because the same group or cohort of individuals is measured at several different time-points or observations.
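The practical difference between the two framings can be sketched with t statistics computed by hand. The before/after scores below are hypothetical; note how the within-subjects (paired) test capitalizes on the consistency of the difference scores:

```python
from math import sqrt
from statistics import mean, stdev

before = [5.0, 6.0, 7.0, 8.0, 9.0]   # the SAME five subjects measured twice
after = [4.0, 5.5, 6.0, 7.5, 8.0]

# Between-subjects framing: independent-samples t, (wrongly) treating the
# two lists as unrelated groups of equal size n.
n = len(before)
sp = sqrt((stdev(before) ** 2 + stdev(after) ** 2) / 2)   # pooled SD (equal n)
t_between = (mean(before) - mean(after)) / (sp * sqrt(2 / n))

# Within-subjects framing: repeated-measures (paired) t on difference scores.
diffs = [b - a for b, a in zip(before, after)]
t_within = mean(diffs) / (stdev(diffs) / sqrt(n))

print(round(t_between, 2), round(t_within, 2))  # 0.79 6.53
```

The same data yield a tiny t when analyzed between-subjects and a large one within-subjects, because the paired analysis removes the variability between individuals and tests only the change within each person.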

Scale, LLC

    Author

    Eric Heidel, Ph.D. is Owner and Operator of Scalë, LLC.




Contact Dr. Eric Heidel
[email protected]
(865) 742-7731

Copyright © 2024 Scalë. All Rights Reserved. Patent Pending.