Archive - Eric Heidel, PhD PStat - Statistician For Hire

Published on

October 17, 2014

Multivariate statistical designs

ANCOVA Test Cochran-Mantel-Haenszel Cox Regression Factorial ANOVA Kaplan-Meier Curve Logistic Regression MANCOVA MANOVA Multinomial Logistic Regression Multiple Regression Multivariate Statistics Negative Binomial Regression Poisson Regression Proportional Odds Regression

Multivariate statistical tests show evidence of association between predictor variables and an outcome, when controlling for demographic, confounding, and other patient data.

Multivariate statistics are more reflective of real-world medicine

We covered between-subjects and within-subjects analyses in the first Statistical Designs post. Multivariate statistics will be the focus in Statistical Designs 2.

While 90% of statistics reported in the literature fall under the umbrella of between-subjects and within-subjects analyses, those analyses do not properly account for all of the variance and confounding effects that exist in reality. Multivariate statistics play an important role in empirical reasoning because they allow us to control for the demographic, confounding, clinical, or prognostic variables that mitigate, mediate, or otherwise affect the association between a predictor and an outcome variable. They are also much more representative of reality and of the true effects that exist within human populations.

Very few if any relationships or treatment effects in physiology, psychology, education, or life in general are bivariate in nature. Relationships and treatment effects in reality ARE multivariate, diverse, and confounded by any number of characteristics. Therefore, it makes sense that researchers should be conducting multivariate statistics to truly understand human phenomena.

With this being said, it is important to use multivariate statistics ONLY when you are asking a multivariate research question. Throwing a bunch of variables into a model without some sort of theoretical or conceptual reason for including them can yield false treatment effects and increase Type I errors. Also, these spurious variables can create "statistical noise" which detracts from a model's capability for detecting significant associations.

Choosing the correct multivariate statistic to answer your question is simple. You choose the multivariate analysis based on the outcome.

1. Categorical outcomes - Logistic regression (dichotomous), multinomial logistic regression (polychotomous), Kaplan-Meier, Cochran-Mantel-Haenszel, Cox regression (dichotomous/survival/time-to-event)

2. Ordinal outcomes - Proportional odds regression

3. Continuous outcomes - Factorial ANOVA with fixed effects, factorial ANOVA with random effects, factorial ANOVA with mixed effects, ANCOVA, multiple regression, MANOVA, MANCOVA

4. Count outcomes - Poisson regression (variance approximately equal to the mean) and negative binomial regression (variance larger than the mean, also called overdispersion)
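
For count outcomes, a common first look is to compare the sample variance with the sample mean. A ratio well above 1 suggests overdispersion, which points towards negative binomial regression. The sketch below is only an illustration: the function names, the made-up `visits` data, and the 1.5 threshold are assumptions for this example, not a formal test and not something taken from this post.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean of count data."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the dispersion ratio is undefined")
    return variance(counts) / m


def suggest_count_model(counts, threshold=1.5):
    """Suggest a starting model for a count outcome.

    The threshold of 1.5 is an arbitrary, illustrative cut-off (an assumption),
    not an established rule.
    """
    if dispersion_ratio(counts) > threshold:
        return "negative binomial"
    return "Poisson"


# Hypothetical data: number of clinic visits for ten patients.
visits = [0, 1, 0, 2, 1, 0, 9, 0, 1, 12]
print(round(dispersion_ratio(visits), 2), suggest_count_model(visits))
```

In practice the dispersion would be judged from the fitted model (with covariates included) rather than from the raw outcome alone, so treat this as a screening step.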
Published on

October 12, 2014

Publication of Research Findings

Guidelines For Authors Publication Research Journal

Publish the results of your study in a top-tier journal and become famous!

Be Meticulous, Scrupulous, Obsessive, and Objective

I'm trying to get my applied and methodological research published just like everyone else out there in academia. The one thing I have learned is that all journals have different methods of submission, all with different expectations of writing styles, and varying methods for structuring the paper and citing prior research.

Submitting a research manuscript is a tedious and anxiety-ridden process. It is by no means easy, user-friendly, or logical. I have to make a conscious effort to keep track of the email address, ID, password, and contact information used for each submission. You MUST do this, because editors and reviewers will NOT do it for you.

Upon rejection (and believe me, it is coming), you then have to completely reformat your manuscript, with font and boldface changes and a new writing style, to meet the requirements of the next journal. You do this numerous times at the expense of the Impact Factor just to get published...pretty soon, what you end up with is nothing like what you started with some months or years ago. It's no longer an original thought, but a mish-mash of editorial comments. What is one to do?!

1. First and foremost, follow EXACTLY what the "Guideline for Authors" section tells you to do. Because top-tier journals receive many manuscripts, the author guidelines are an easy way to "weed out" manuscripts. ONE simple mistake in a citation or subheading, and they will reject it or send it back for revisions. You may have to COMPLETELY revamp the structure and writing style of the paper, but at least you have the body of the manuscript put together!

2. Also, the editors may have good ideas on how to make your paper better. If you get a rejection, integrate any pertinent changes into your manuscript. However, if you can tell that the reviewer barely read the manuscript (if at all) and gave you superfluous remarks, then you do not want to submit to that publication anyway.

3. DO NOT GIVE UP! Keep submitting your work. Do not ever give up on your manuscript. If it is rejected ten times, make changes and revisions, and then send it in an eleventh time. Check the "Information for Authors" section of each publication and make sure that the journal is focused on the correct audience.

4. Feel free to email or call the editors of the publication you are interested in. Ask if they would be interested in the study you are going to submit and, if not, whether they have any ideas for other potential publications. And go ahead and email them when you do not hear back; it is their job to get back to you.
Published on

October 8, 2014

Values needed for sample size calculations

A Priori Effect Size G*Power Mean Proportion Research Engineer Sample Size Standard Deviation Statistical Power Analysis

Evidence-based measures of effect

Use the empirical literature to your advantage

One of the most important things you can do when designing your study is to conduct an a priori power analysis. Doing so will tell you how many people you will need in your sample to detect the effect size or treatment effect in your study.

Without an a priori calculation, you could waste months or years of your life conducting a study only to find out that you needed just 100 in each group to achieve significance. Or, conversely, you conduct a study with only 50 patients and find out in a post hoc fashion that you would have needed 10,000 to detect your effect!

If you are using Research Engineer and G*Power to run your analyses, here are the things you will need:

1. An evidence-based measure of effect from the literature is the first thing you should seek out. Find a study that is theoretically, conceptually, or clinically similar to your own. Try to find a study that uses the same outcome you plan to use in your study.

2. Use the means, standard deviations, and proportions from these published studies as evidence-based measures of effect size to calculate how large a sample you will need. These values will be reported in the body of the results section or in tables within the manuscript. It shows more empirical rigor on your part if you conduct an a priori power analysis based on a well-known study in the field.

3. Plug these values into G*Power using the steps published on the sample size page to find out how many people you will need to collect for your study.
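
The steps above can be sketched by hand for the simplest case, a two-group comparison of means. The example below is a rough normal-approximation, assuming equal group sizes and a two-sided test; the means and standard deviations are hypothetical stand-ins for values you would take from a published study, and the function names are made up for this illustration. G*Power bases its t-test calculations on the noncentral t distribution, so its answer will typically be slightly larger than this approximation.

```python
from math import ceil
from statistics import NormalDist


def cohens_d(mean_1, mean_2, sd_1, sd_2):
    """Standardized mean difference using a simple pooled SD (equal group sizes assumed)."""
    pooled_sd = ((sd_1 ** 2 + sd_2 ** 2) / 2) ** 0.5
    return abs(mean_1 - mean_2) / pooled_sd


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group comparison of means.

    Uses the normal approximation n = 2 * ((z_(1-alpha/2) + z_power) / d) ** 2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Hypothetical published values: group means of 120 and 115, SD of 10 in each group.
d = cohens_d(120.0, 115.0, 10.0, 10.0)
print(d, n_per_group(d))
```

Smaller effect sizes drive the required sample size up quickly, which is why the choice of the published study in step 1 matters so much.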
Published on

October 7, 2014

Measurement at continuous levels

Categorical Continuous Ordinal Parametric Statistics Variables

Measure variables at the highest level possible

Don't discount your continuous variables!

There is a tendency for researchers to take continuous variables and recode them into ordinal or categorical variables. For example, researchers may ask participants to answer if they are 20-30 years old, 31-40 years old, 41-50 years old, 51-60 years old, or 60+ years old. Or, they may set an arbitrary "cut-off" of values above or below a certain value (People who are 55 years and older versus everyone younger than 55 years).

Researchers lose valuable precision and accuracy in measurement when continuous variables are demoted to ordinal or categorical levels. It is ALWAYS better to take an actual numerical value with a "true zero" and analyze it using parametric statistics. If there is a theoretical, conceptual, or empirical basis for paring down continuous measures into lower levels of measurement, then and only then should it be done. If you were a researcher and wanted to know the most precise and accurate measure possible of my age, which of the following is the best way to ask?

1. How many years old are you? (continuous)

2. How old are you? (circle one) 20-30 31-40 41-50 51-60 60+ (ordinal)

3. Are you above or below the age of 55? (categorical)

The continuous method will give you a stronger measure of age, which can then be broken down into separate ordinal or categorical levels, AT YOUR DISCRETION. So, always measure at the continuous level if at all possible.
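
A small sketch of that one-way street, using the age bands and the 55-year cut-off from the questions above. The function names and the sample ages are made up for illustration, and because the bands "51-60" and "60+" overlap at exactly 60, the code assumes that 60 belongs to "51-60".

```python
from bisect import bisect_left

AGE_BANDS = ["20-30", "31-40", "41-50", "51-60", "60+"]
UPPER_BOUNDS = [30, 40, 50, 60]  # inclusive upper limit of each band except the last


def to_ordinal(age):
    """Recode a continuous age (in years) into one of the ordinal age bands."""
    if age < 20:
        raise ValueError("the bands start at 20 years")
    return AGE_BANDS[bisect_left(UPPER_BOUNDS, age)]


def to_categorical(age, cutoff=55):
    """Recode a continuous age into a two-level categorical variable."""
    return "55 and older" if age >= cutoff else "younger than 55"


ages = [23, 37, 38, 55, 61]  # hypothetical continuous measurements
print([to_ordinal(a) for a in ages])
print([to_categorical(a) for a in ages])

# 37 and 38 land in the same band, so the band alone cannot tell them apart.
print(to_ordinal(37) == to_ordinal(38))
```

Going from `ages` to the bands is a simple lookup, but nothing in the band "31-40" says whether the person was 37 or 38, which is why no function can be written for the reverse direction.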

With this being said, PLEASE realize that while we can go from continuous to ordinal and categorical levels of measurement, it is IMPOSSIBLE to change categorical and ordinal variables into a continuous level of measurement.

Let's use a basic example:

Gender - 0 = male and 1 = female

Is there any way to convert this into a continuous variable? No.

Here is another example:

How old are you? (circle one) 20-30 31-40 41-50 51-60 60+

Can you convert this into a continuous variable? No, again.

In conclusion, ALWAYS try to measure your variables at a continuous level, if at all possible or feasible. They can be broken down into ordinal and categorical variables as needed. Also, REALIZE that once you have decided to measure something at a categorical or ordinal level, it cannot be converted to continuous.
Published on

October 6, 2014

Publication bias

Publication Bias

Publishing only significant research findings

The collective unconscious and statistics

Admit it, it just feels better when a statistically significant difference or treatment effect is found! Promotion, tenure, benefits, and perks...all relying on the one p-value being below .05! Statistics has done something for you when you have found statistical significance!

Truth be told, this line of thinking has led to a gross overestimation OR underestimation of important treatment effects in the clinical literature. Publication bias is a rampant, unconscious, and deleterious phenomenon within science. But, so long as human beings with presuppositions, biases, and knowledge gaps conduct research and statistics AND so long as human beings are responsible for peer-review and gate-keeping of the literature, publication bias will continue to exist. This means that important and potentially life-saving or cost-saving treatments will not be represented in the clinical literature, SIMPLY BECAUSE STATISTICAL SIGNIFICANCE WAS NOT ACHIEVED.

What can be done? I proffer the following:

1. I think that math, science, and statistics need a complete "makeover" in the collective unconscious. These things are very cool and it is completely awesome to be a nerd and work hard towards mastering content areas within these fields. College degrees in these fields lead to job security and better pay.

In my experience, few people recall their experiences with statistics with much zeal. People have an automatic recoil towards statistics. This MUST change. If statistics and hypothesis testing are going to be the methods by which we conduct, communicate, and interpret research findings, then a drastic change in the collective orientation towards statistics must occur.

2. In tune with #1, statistical scientists and educators must do a better job of teaching the lexicon or language of our mathematical science to the general population. Skewness, kurtosis, effect size, sample size, statistical power, confidence interval, probability, reliability, validity, precision, accuracy, sampling error, normality, homogeneity of variance, sphericity, standard deviation, variance, covariance, confounding, hypothesis testing, reject, do not reject, Type I error, Type II error...WHAT DO ALL OF THESE WORDS MEAN???

Most people do not use these words every day. People experience cognitive dissonance when knowledge and sensory gaps occur. Educators need to take a deductive approach towards imparting the language and "meaning" within content areas such as 1) hypothesis testing, 2) measurement, 3) statistical power, and 4) statistics. Without a basic working knowledge and understanding of these critical terms in applied statistics, the collective unconscious will continue to recoil at the very sight, sound, or presence of statistics, and even some statisticians.

3. Master's and doctoral level researchers and clinicians need to possess the knowledge and experience to conduct applied statistics in the correct fashion. I could really get on my "soapbox" here but I've almost taken too much of your time. Simply put, seek out assistance before publishing data that do not meet statistical assumptions.

In conclusion, I hope you can appreciate my candor in this regard. Publication bias is a REAL phenomenon and it has DRASTIC implications on clinical treatment. You want your clinician to be informed by unbiased clinical evidence. However, that is probably not the case. Let's work towards changing this scary truth!
Published on

October 5, 2014

95% confidence intervals

95% Confidence Interval Adjusted Odds Ratio Confidence Interval Hazard Ratio Measurement Odds Ratio With 95% CI P-value Relative Risk Statistics

Precision and consistency of treatment effects

95% confidence intervals are dependent upon sample size

If there is ANY statistical calculation that holds true value for researchers and clinicians on a day-to-day basis, it is the 95% confidence interval wrapped around the findings of inferential analyses. Statistics is not an exact science in the way that other mathematical sciences are. Measurement error is inherent when attempting to measure anything related to human beings, and FEW tried and true causal effects have been proven scientifically. Statistics' strength as a mathematical science is in its ability to build confidence intervals around findings to put them into a relative context.

Also, 95% confidence intervals act as the primary inference associated with unadjusted odds ratios, relative risk, hazard ratios, and adjusted odds ratios. If the confidence interval crosses over 1.0, there is a non-significant effect. Wide 95% confidence intervals are indicative of small sample sizes and lead to decreased precision of the effect. Constricted or narrow 95% confidence intervals reflect increased precision and consistency of a treatment effect.
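
To make the link between sample size and interval width concrete, here is a minimal sketch that computes an unadjusted odds ratio and its 95% confidence interval from a 2x2 table using the standard log (Woolf) method. The cell counts are invented for illustration, and the function name is an assumption of this example. The second table is the first one multiplied by ten, so the odds ratio is identical and only the sample size changes.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald-type 95% CI on the log scale.

    a = exposed with outcome,   b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    All four cells must be greater than zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


small = odds_ratio_ci(10, 20, 5, 25)       # hypothetical study, n = 60
large = odds_ratio_ci(100, 200, 50, 250)   # same proportions, n = 600

for label, (or_, lo, hi) in (("n = 60", small), ("n = 600", large)):
    verdict = "crosses 1.0" if lo <= 1.0 <= hi else "excludes 1.0"
    print(f"{label}: OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f} ({verdict})")
```

Both tables give an odds ratio of 2.5, but only the larger sample produces an interval narrow enough to exclude 1.0.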

In essence, p-values should not be what people get excited about when it comes to statistical analyses. The interpretation of your findings within the context of the subsequent population means, odds, risk, hazard, and 95% confidence intervals IS the real "meat" of applied statistics.

Contact Dr. Eric Heidel
consultation@scalelive.com
(865) 742-7731

Copyright © 2026 Scalë. All Rights Reserved. Patent Pending.
