Pathology In Practice May 2026

STATISTICS

Distributions, outliers and confidence intervals in laboratory data

With the first in a new series of articles, Stephen MacDonald returns to the pages of Pathology in Practice. Here, he begins with an introduction to statistical interpretation focusing on improving the quality of the first statistical encounter with a dataset.

Laboratory professionals spend much of their working lives generating, reviewing, reporting and discussing numerical data. Yet many of the statistical mistakes made in routine laboratory work occur before any formal analysis has begun. A dataset is collected, a spreadsheet is opened, and familiar numbers are produced: a mean, a standard deviation, perhaps a P value later on. The problem is that these outputs are often generated before anyone has stopped to ask a more basic question: what sort of data are these, and what is the most honest way to describe them?

That question matters because laboratory data do not always behave like textbook examples. Some datasets are approximately symmetric and well-behaved, but many are not. They may be right-skewed, constrained by analytical limits, distorted by a small number of extreme observations, or made up of more than one underlying population. Even where a dataset looks fairly simple at first glance, the reason for its shape may not be simple at all. Biological heterogeneity, pre-analytical variation, analytical effects, and errors in data capture can all leave visible marks on the distribution. The reference interval literature has wrestled with this problem for decades, which is why it provides such a useful foundation for the present topic: it recognises that quantitative laboratory data often require careful descriptive and statistical treatment rather than default reliance on Gaussian assumptions.

This article therefore focuses on three connected themes: how laboratory data are distributed, how unusual values should be interpreted, and how confidence intervals help express uncertainty around the quantities we report. These are not advanced topics in the sense of requiring complex mathematics, but they are fundamental. If a dataset is described badly at the outset, everything that follows becomes harder to trust. If the centre and spread are summarised inappropriately, the conclusions may be distorted. If an extreme value is removed too quickly, an important clue may disappear. If a point estimate is reported with no indication of uncertainty, readers may place more confidence in it than it deserves.

This first article in the series is designed to establish that foundation. It is not intended to be a general statistics primer, nor a detailed guide to formal hypothesis testing, nor a full treatment of reference interval methodology. Those topics have their own place. The aim here is narrower and more practical: to help laboratory professionals recognise how data behave in practice, choose summaries that fit the data rather than

| Data pattern | Useful summaries | Main reason | Common mistake |
| --- | --- | --- | --- |
| Approximately symmetric | Mean, SD | Mean and SD describe centre and spread reasonably well | Assuming all laboratory data behave this way |
| Right-skewed | Median, IQR; percentiles | Less influenced by a long upper tail | Reporting mean and SD alone |
| One or two extreme values | Median, IQR; mean if impact of extremes is relevant | Robust summaries reduce distortion from isolated values | Removing extreme values without investigation |
| Mixed or bimodal data | Separate subgroup summaries where justified | One overall summary may hide distinct populations | Pooling different groups into one summary |
| Small dataset with uncertain shape | Median, IQR; confidence intervals around the main estimate | Small samples are more sensitive to individual values and imprecision | Over-interpreting shape or reporting point estimates without intervals |

Table 1. Summary statistics should be chosen according to the shape of the data and the purpose of the analysis. SD: standard deviation; IQR: interquartile range.
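As a rough illustration of the contrast drawn in Table 1, the short Python sketch below summarises a small right-skewed dataset twice: once with the mean and SD, and once with the median and IQR. The values are invented purely for illustration and do not come from any real assay. The helper name `summarise` is hypothetical, and the choice of the "inclusive" quartile method is an assumption, since software packages differ in how they compute quartiles.

```python
import statistics


def summarise(values):
    """Return mean, SD, median and quartiles for a list of numbers."""
    # "inclusive" treats the data as the whole population of interest;
    # other quartile conventions give slightly different Q1 and Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }


# Hypothetical right-skewed results (arbitrary units), invented for illustration.
results = [4, 5, 5, 6, 6, 7, 8, 9, 12, 38]

summary = summarise(results)
# Same data with the single largest value set aside, for comparison only;
# this is not a recommendation to delete extreme values.
without_extreme = summarise(results[:-1])

for label, s in (("all values", summary), ("largest value set aside", without_extreme)):
    print(
        f"{label}: mean={s['mean']:.2f}, SD={s['sd']:.2f}, "
        f"median={s['median']:.2f}, IQR={s['q1']:.2f} to {s['q3']:.2f}"
    )
```

In this invented example the single high result moves the mean from about 6.9 to 10, while the median moves only from 6 to 6.5. That is the behaviour the table describes for right-skewed data and isolated extreme values. It does not show that the extreme value is wrong, only that the choice of summary changes how much influence it has.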
