Here are some personal thoughts about how to report the statistical analysis in an applied paper (e.g. a medical study).

Structure of the ‘statistical analysis’ paragraph

In the method section, the statistical paragraph should relate the research questions, generally phrased in plain English at the end of the introduction, e.g.:

investigate the effect of psilocybin intake on the change in depressive symptoms.

to a statistical procedure producing, ultimately, the numerical values reported in the result section. A careful description should make three concepts explicit:

  • the parameter of interest, e.g. the mean difference in 1-month change in depression score between the psilocybin and placebo groups.
  • the statistical model used to estimate the parameter of interest, e.g. a linear regression was used to model the mean difference as a function of psilocybin intake, age, sex, …
  • the testing procedure used to quantify the level of evidence against a null hypothesis, e.g. a Wald test was used to test whether the difference in change was 0.

The definition of the parameter of interest is the most important aspect, followed by the modeling procedure, since that is where assumptions are made. In most studies, the testing procedure is more of a technicality that does not matter much for interpreting the results. The precise definition of the exposure (e.g. dosage in mg) or of the outcome (e.g. clinical scale used) should be given in the method section, before the statistical analysis paragraph.
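As a hypothetical illustration (simulated data, made-up variable names), the three concepts map onto R code as follows:

```r
## hypothetical illustration with simulated data (variable names are made up)
set.seed(1)
n <- 100
df <- data.frame(
  group = factor(rep(c("placebo", "active"), each = n/2),
                 levels = c("placebo", "active")),
  age   = rnorm(n, mean = 40, sd = 10),
  sex   = rep(c("female", "male"), n/2)
)
## simulated 1-month change in depression score
df$change <- 2 - 3 * (df$group == "active") + 0.05 * df$age + rnorm(n)

## statistical model: linear regression of the change on treatment and covariates
e.lm <- lm(change ~ group + age + sex, data = df)

## parameter of interest: adjusted mean difference between the groups
coef(e.lm)["groupactive"]
## testing procedure: Wald(-type) test of a null group difference
summary(e.lm)$coefficients["groupactive", ]
```

Each of the three concepts corresponds to a distinct line of output; reporting only "a linear regression was used" would leave the first and last implicit.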

Unfortunately, statistical paragraphs are often more of a list of statistical tools, e.g.:

a logistic model was used to compare the groups in terms of HAMD score

making it hard for the reader to decipher what the authors are doing. The parameter of interest is not explicitly defined, and readers not versed in statistics will find it hard to follow. Statisticians will also be in doubt about what the authors are actually doing, since a statistical model can be used to target different parameters (e.g. an odds ratio, or the marginal difference in outcome probability between treated and untreated).

Citing software

A common mistake is to confuse RStudio and the R software. RStudio is a possible user interface to the R software, whereas the R software is where the actual calculations happen. I usually do not find it relevant to cite RStudio in a scientific article.

A precise reference for the R software can be found using the following command in R:

citation()

For reproducibility purposes, mentioning only the version of R is of little use. Instead, one should provide the code used as well as the output of:

sessionInfo()

which should contain the version of R and each of the packages used in the analysis.
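The citation function also provides references for individual packages, via its package argument (here with the base package stats as an example; replace with the packages actually used in the analysis):

```r
## reference for a specific package
## (here "stats" as an example; e.g. citation(package = "LMMstar") for LMMstar)
citation(package = "stats")
```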

Backtransformation

Consider a study where two groups are compared (say active and placebo) and the log-transformed outcome is analyzed using a linear regression. On the log scale, we obtain an intercept of 0.5 (placebo group) and a treatment effect of 0.25, so an expected value of 0.75 in the active group.

Estimates and confidence intervals (but not p-values!) should typically be back-transformed to ease interpretation. The intercept of 0.5 back-transforms to exp(0.5), approximately 1.65; the active group gets exp(0.75), approximately 2.12; and the back-transformed treatment effect is exp(0.25), approximately 1.28.

Assuming homoscedastic residuals between the two groups (on the log scale), 1.28 is the ratio between the expected outcomes in the two groups. In other words, the mean outcome in the active group is 28% higher than in the placebo group. WARNING: 1.65 and 2.12 are NOT the expected mean outcome values in each group (one would need to multiply by the exponential of half the residual variance). Assuming a log-normal distribution, 1.65 and 2.12 can be interpreted as the median value in each group, and 1.28 as the ratio between the median values.
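These numbers can be checked directly in R (a minimal sketch using the estimates above):

```r
## log-scale estimates from the example above
intercept <- 0.50  # placebo group (log scale)
effect    <- 0.25  # treatment effect (log scale)

exp(intercept)           # ~1.65: back-transformed placebo value
exp(intercept + effect)  # ~2.12: back-transformed active value
exp(effect)              # ~1.28: back-transformed treatment effect (a ratio)
```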

Partial residuals

Providing a graphical representation of the results of a statistical analysis is a common request. Displaying the raw data together with the results of a statistical test (on the same graph, or in the title or caption) can be misleading when adjusting for covariates: the raw data correspond to an unadjusted analysis, which may differ greatly from the adjusted analysis. The possible mismatch between the graphical representation and the estimated summary statistic(s) may confuse the reader.

When working with a continuous outcome, one can instead display partial residuals. Consider the following two-arm trial:

> data(ckdW, package = "LMMstar")
> ggplot(ckdW, aes(x = allocation, y = pwv0)) + geom_boxplot()



where the following statistical model was used to test the group difference (allocation variable):

> e.lmm <- lmm(pwv0 ~ allocation + sex + age, data = ckdW)
> model.tables(e.lmm)
              estimate         se      df      lower    upper      p.value
(Intercept) -0.6701892 1.98683016 47.0094 -4.6671548 3.326776 7.373804e-01
allocationB  0.3269616 0.82004599 47.0094 -1.3227494 1.976673 6.919113e-01
sexmale      0.8474481 0.95876696 47.0094 -1.0813321 2.776228 3.812519e-01
age          0.1682865 0.03246632 47.0094  0.1029731 0.233600 4.510612e-06

We obtain a mean group difference of 0.327 [-1.323;1.977], instead of 0.266 [-1.806;2.339] had we not adjusted for sex and age. Partial residuals can be extracted with the residuals method:

> ckdW$pres <- residuals(e.lmm, type = "partial", var = "allocation")
> head(ckdW$pres)
[1] -2.7098727 -2.3524574 -1.6281556 -0.1672843 -0.5647286 -2.4024574

One counter-intuitive feature of partial residuals is that they may not follow the original scale (they are typically centered around 0, whereas the outcome values were strictly positive). But they do exactly reflect the estimated difference:

> tapply(ckdW$pres,ckdW$allocation,mean)
           A            B 
1.064773e-16 3.269616e-01

So the graphical display based on the partial residuals will match the result of the statistical test:

> plot(e.lmm, type = "partial", var = "allocation")



Assuming that the model holds, the display is to be understood as the outcome we would have observed had all individuals had the same sex (here female) and the same age (here age 0). Another counter-intuitive feature is that continuous variables have, by default, reference level 0, which in the case of age is unrealistic. The first issue can be fixed by keeping the intercept (argument var) and the second by specifying the reference levels:

> mytype <- "partial"
> attr(mytype, "reference") <- data.frame(sex = factor("female", levels(ckdW$sex)),
                                          age = 58)
> plot(e.lmm, type = mytype, var = c("(Intercept)","allocation"))



Greek letters and notations

There is no intrinsic meaning to $\beta$, $\rho$, … One should define what each refers to in the statistical analysis paragraph. Avoid using the same letter to refer to two different things. It can be a good idea to use \(p\) to refer to p-values relative to a single test and \(p_{adj}\) to FWER-adjusted p-values. In the latter case, it should be made clear over which (and how many) tests the FWER is evaluated.
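For instance, FWER-adjusted p-values can be obtained in R with the p.adjust function (here Holm's method over three hypothetical tests):

```r
## unadjusted p-values from three hypothetical tests
p <- c(0.01, 0.04, 0.20)

## FWER-adjusted p-values (Holm's step-down method)
p.adjust(p, method = "holm")
## [1] 0.03 0.08 0.20
```

Reporting these as \(p_{adj}\), together with the statement that the FWER is controlled over these three tests, makes the adjustment unambiguous.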
