STATISTICS
Distributions, outliers and confidence intervals in laboratory data
With the first in a new series of articles, Stephen MacDonald returns to the pages of Pathology in Practice. Here, he begins with an introduction to statistical interpretation focusing on improving the quality of the first statistical encounter with a dataset.
Laboratory professionals spend much of their working lives generating, reviewing, reporting and discussing numerical data. Yet many of the statistical mistakes made in routine laboratory work occur before any formal analysis has begun. A dataset is collected, a spreadsheet is opened, and familiar numbers are produced: a mean, a standard deviation, perhaps a P value later on. The problem is that these outputs are often generated before anyone has stopped to ask a more basic question: what sort of data are these, and what is the most honest way to describe them? That question matters because
laboratory data do not always behave like textbook examples. Some datasets are approximately symmetric and well-behaved, but many are not. They may be right-skewed, constrained by analytical limits, distorted by a small number of extreme observations, or made up of more than one underlying population. Even where a dataset looks fairly simple at first glance, the reason for its shape may not be simple at all. Biological heterogeneity, pre-analytical variation, analytical effects, and errors in data capture can all leave visible marks on the distribution. The reference interval literature has wrestled with this problem for decades, which is why it provides such a useful foundation for the present topic: it recognises that quantitative laboratory data often require careful descriptive and statistical treatment rather than default reliance on Gaussian assumptions.

This article therefore focuses on three connected themes: how laboratory data are distributed, how unusual values should be interpreted, and how confidence intervals help express uncertainty around the quantities we report. These are not advanced topics in the sense of requiring complex mathematics, but they are fundamental. If a dataset is described badly at the outset, everything that follows becomes harder to trust. If the centre and spread are summarised inappropriately, the conclusions may be distorted. If an extreme value is removed too quickly, an important clue may disappear. If a point estimate is reported with no indication of uncertainty, readers may place more confidence in it than it deserves.

This first article in the series is designed to establish that foundation. It is not intended to be a general statistics primer, nor a detailed guide to formal hypothesis testing, nor a full treatment of reference interval methodology. Those topics have their own place. The aim here is narrower and more practical: to help laboratory professionals recognise how data behave in practice, choose summaries that fit the data rather than

| Data pattern | Useful summaries | Main reason | Common mistake |
| --- | --- | --- | --- |
| Approximately symmetric | Mean, SD | Mean and SD describe centre and spread reasonably well | Assuming all laboratory data behave this way |
| Right-skewed | Median, IQR; percentiles | Less influenced by a long upper tail | Reporting mean and SD alone |
| One or two extreme values | Median, IQR; mean if impact of extremes is relevant | Robust summaries reduce distortion from isolated values | Removing extreme values without investigation |
| Mixed or bimodal data | Separate subgroup summaries where justified | One overall summary may hide distinct populations | Pooling different groups into one summary |
| Small dataset with uncertain shape | Median, IQR; confidence intervals around the main estimate | Small samples are more sensitive to individual values and imprecision | Over-interpreting shape or reporting point estimates without intervals |
Table 1. Summary statistics should be chosen according to the shape of the data and the purpose of the analysis. SD: standard deviation; IQR: interquartile range.
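
As a rough illustration of the choices in Table 1, the sketch below computes both sets of summaries for a small, right-skewed dataset and attaches an interval to the median. It is a minimal example under stated assumptions, not a recommended procedure: the values in `results` are invented for illustration, the helper names `describe` and `bootstrap_median_ci` are hypothetical, and the percentile bootstrap is only one of several ways to obtain a confidence interval for a median. It uses only the Python standard library.

```python
import random
import statistics


def describe(values):
    """Return mean/SD alongside median and quartiles for a list of numbers."""
    # 'inclusive' treats the data as covering the full range of interest;
    # other quantile conventions give slightly different quartiles.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }


def bootstrap_median_ci(values, n_resamples=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for the median (illustrative only)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower = medians[int(alpha * n_resamples)]
    upper = medians[int((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical results: mostly low values with one extreme observation.
results = [2.1, 2.4, 2.6, 2.9, 3.0, 3.3, 3.7, 4.2, 5.8, 19.5]

summary = describe(results)
low, high = bootstrap_median_ci(results)

print(f"mean = {summary['mean']:.2f}, SD = {summary['sd']:.2f}")
print(f"median = {summary['median']:.2f}, "
      f"IQR = {summary['q1']:.2f} to {summary['q3']:.2f}")
print(f"approx. 95% interval for the median: {low:.2f} to {high:.2f}")
```

In this invented dataset the single extreme value pulls the mean (4.95) above the upper quartile, while the median (3.15) stays close to the bulk of the results. The extreme value is kept in the data and described, not removed, which is the behaviour Table 1 recommends until it has been investigated.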
WWW.PATHOLOGYINPRACTICE.COM May 2026