Oncology available at the time. IHC is a semi-quantitative, subjective
technique that introduces two major sources of variability: human interpretation and technical inconsistency. There is no globally standardised protocol for HER2 IHC, and results can vary depending on pre-analytical variables such as staining procedure, fixation time, antibody clone, laboratory conditions, detection system and scoring thresholds. These issues are most pronounced in the low HER2 expression range, where distinguishing IHC 0 from 1+ or 2+ is particularly challenging.
Table 1: HER2 classification categories with associated IHC/ISH results and treatment options.
The lack of concordance in low range IHC classifications
A study by Zaakouk et al.9 in the UK and Republic of Ireland examined classifications by 16 expert pathologists from the UK National Coordinating Committee for Breast Pathology. Each pathologist independently evaluated 50 digitally scanned HER2 IHC slides. Statistical measures, including Fleiss’ kappa and Cohen’s kappa, were employed to determine agreement levels. Slides with low concordance were re-evaluated after a washout period to assess consistency over time. Results showed absolute agreement in just 6 per cent of cases, all of which were IHC 3+. Poor agreement was observed in 10 per cent of cases, primarily due to heterogeneous HER2 expression, cytoplasmic staining or low expression levels near the 10 per cent positivity threshold. When scores for HER2-low were grouped as IHC 0 versus others, concordance improved to 86 per cent, and combining scores of IHC 1+ and 2+ also enhanced agreement levels. For HER2-low cases, inter-observer agreement was only fair to moderate, highlighting the difficulties that even experienced pathologists face in making consistent low range classifications.
This issue is not unique to the UK and Ireland. A multi-centre European study by Baez-Navarro et al.12
aimed to assess inter-observer agreement in HER2 IHC scoring for breast cancer cases classified as HER2-negative, and to determine if adjusted scoring criteria or the addition of FISH improved consistency. The study involved a two-round evaluation of 105 HER2-negative non-amplified breast cancer cases. In the first round, 16 pathologists scored digital slides using the 2018 ASCO/CAP guidelines. Following a consensus meeting, the same pathologists re-evaluated the slides using modified criteria based on the 2007 ASCO/CAP guidelines, introducing an additional ultralow category. Pathologists achieved complete agreement in only 4.7 per cent of cases and, even when using clustered scoring – for example, combining IHC 1+ and 2+ – concordance improved only modestly, with the most frequent agreement pattern being ≥12 observers agreeing on a score in only 56.2 per cent of cases. Fleiss’ kappa remained fair to moderate at 0.32 and, for binary HER2-negative vs HER2-low, it was still only fair at 0.39.
US-based data also support these findings. A study by Robbins et al.13
involving 18 specialist pathologists from 15 institutions showed substantial discordance in HER2 IHC scoring within the intermediate categories: <1 per cent agreement for IHC 1+ and 3.6 per cent agreement for 2+. The discordance within the IHC 0 cases was also substantial, with an overall agreement of only 25 per cent and poor inter-rater reliability metrics. Since the emergence of lower levels of HER2 as targets for therapy, determining whether a case is 0 or not-0 has become an important clinical decision threshold for prescription of HER2-low therapies. Similar findings were reported by Fernandez et al.,14 in a study that analysed datasets from
both the College of American Pathologists (CAP) and Yale University to evaluate the accuracy and consistency of standard ERBB2 IHC assays in distinguishing between low ERBB2 expression levels and negative expression. In the CAP survey, data from over 1,400 laboratories participating in proficiency testing were analysed. Each lab assessed 80 tissue cores over the course of two years. Overall, 65 per cent achieved a concordance rate of 90 per cent or higher among participating laboratories, and high agreement was primarily observed in cases scored as IHC 0 or 3+. The lowest agreement occurred between IHC scores of 0 and 1+, with 25 per cent of negative cores showing less than 70 per cent concordance. In the Yale University cohort study, a set of 170 breast cancer biopsy cases was evaluated by 18 pathologists from 15 institutions. Each pathologist independently scored ERBB2 expression using standard IHC assays, simulating real-world diagnostic conditions. Among 92 cases read as IHC 0 by at least one pathologist, only 26 per cent achieved 90 per cent or greater agreement. In contrast, among 45 cases read as IHC 3+ by at least one pathologist, 58 per cent reached the same level of agreement. The overall kappa score was 0.39 for all HER2 categories, reflecting fair to moderate agreement. When the scores were collapsed into binary categories, kappa for IHC 0 vs non-0 was 0.26, indicating poor agreement in identifying true negatives versus low expressors. For IHC 3+ vs non-3+, kappa was 0.61, suggesting substantial agreement when identifying strong positive cases. Essentially, the Fernandez et al. study found poor agreement between pathologists in both datasets, especially in IHC 0 and 1+ cases.
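The kappa statistics quoted in these studies can be reproduced in outline. The Python sketch below is illustrative only: it computes Fleiss’ kappa from a table of scores and shows how collapsing categories (for example, IHC 0 versus non-0) changes the statistic. The function names fleiss_kappa and collapse, and the five-case, four-reader score table, are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of scores.

    Assumes every case was scored by the same number of readers (>= 2).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")

    counts = [Counter(case) for case in ratings]
    categories = {score for case in ratings for score in case}

    # Observed agreement: mean proportion of agreeing reader pairs per case.
    pairs = n_raters * (n_raters - 1)
    p_bar = sum(
        (sum(n * n for n in c.values()) - n_raters) / pairs for c in counts
    ) / n_cases

    # Chance agreement: sum of squared overall category proportions.
    total = n_cases * n_raters
    p_e = sum((sum(c[cat] for c in counts) / total) ** 2 for cat in categories)

    if p_e == 1.0:
        # Every rating fell in one category; kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


def collapse(ratings, mapping):
    """Re-label every score through `mapping` (e.g. four categories to two)."""
    return [[mapping[score] for score in case] for case in ratings]


# Hypothetical scores: 5 cases, each read by 4 pathologists.
scores = [
    ["0", "0", "1+", "0"],
    ["1+", "2+", "1+", "2+"],
    ["3+", "3+", "3+", "3+"],
    ["0", "1+", "1+", "0"],
    ["2+", "1+", "2+", "2+"],
]

zero_vs_rest = {"0": "0", "1+": "non-0", "2+": "non-0", "3+": "non-0"}
clustered = {"0": "0", "1+": "1+/2+", "2+": "1+/2+", "3+": "3+"}

print("four categories:", round(fleiss_kappa(scores), 3))
print("IHC 0 vs non-0: ", round(fleiss_kappa(collapse(scores, zero_vs_rest)), 3))
print("1+ and 2+ merged:", round(fleiss_kappa(collapse(scores, clustered)), 3))
```

Applied to a laboratory’s own multi-reader scores, the same two functions would show which grouping of categories drives disagreement, in the way the clustered and binary analyses above were reported.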
Table 2: Summary of concordance from studies looking at IHC inter-observer agreement.
www.clinicalservicesjournal.com I November 2025
Insufficiencies of ISH techniques
ISH methods – such as FISH and CISH – can offer more objective assessment than IHC by detecting HER2 gene amplification, but their use is also limited in several ways. Most notably, ISH methods are only routinely used as confirmatory tests for IHC 2+ cases, not to clarify HER2 status