Posted on Sep 27, 2023

Recent studies have yet again highlighted problems with the reproducibility of immunohistochemistry (IHC) for diagnosing breast cancer. MammaTyper® may be a reliable and easy-to-use alternative, but how does its reliability compare? We revisit a 2017 study which provides us with an answer.

New developments in HER2-directed therapies mean that people with low levels of HER2 expression may benefit from treatments such as trastuzumab deruxtecan (Enhertu). Previously this group would have been deemed ‘HER2-negative’, but identifying the HER2-low group is now clinically important.

In light of this, two recent studies have looked at the concordance between different laboratories on HER2-low scoring using immunohistochemistry (IHC). A study of 16 expert pathologists from the UK National Coordinating Committee for Breast Pathology found overall agreement was low. A separate European study, led by researchers at Erasmus University in the Netherlands, came to a similar conclusion.

These two studies add to a long line of evidence that we believe demonstrates IHC is no longer good enough to help pathologists do their job. RT-qPCR-based assays, like MammaTyper®, may be a better option. But how does MammaTyper® compare in terms of reproducibility between different laboratories?

Testing the reproducibility of MammaTyper®

To help answer this question, we can look at a study carried out in 2017, led by Prof Zsuzsanna Varga from University Hospital Zurich in Switzerland. Ten international pathology institutions participated, all with expert-level experience in breast cancer diagnostics. They were asked to use MammaTyper® to score HER2, ER, PgR and Ki67 mRNA levels in a total of 24 samples.

The study team assessed inter-lab reproducibility of the quantitative results of the qPCR assay, the categorical results of the scoring (i.e. positive/negative for each marker), and agreement on subtypes based upon St. Gallen categorisation. The authors conclude that in their study, MammaTyper® “showed reliable reproducibility in the quantitative assessment of the single markers ERBB2, ESR1, PGR, and MKI67, as well as in the subtype determination, and thereby overcomes the variability known on the basis of diagnostic experience with IHC.”

IHC vs MammaTyper® – which is more reliable?

So how does the reproducibility of MammaTyper® compare with that of IHC? We can attempt a side-by-side comparison of the Varga 2017 paper with the recent studies of HER2 IHC scoring concordance from 2023. These studies measured agreement using statistics such as Fleiss’ kappa or Krippendorff’s alpha. Both give a value of 1 for perfect agreement and a value around 0 for agreement no better than chance; values below 0 are possible and indicate agreement worse than chance.
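For readers who want to see what Fleiss’ kappa actually measures, here is a minimal Python sketch of the standard formula. It is an illustration only, not the code used in any of the studies discussed here; the function name `fleiss_kappa` and the toy rating tables are our own assumptions and contain no study data.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters (e.g. laboratories) that put sample i into category j.

    Hypothetical helper for illustration; every sample must have been
    rated by the same number of raters.
    """
    n_samples = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per sample are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every sample needs the same number of ratings")

    # Observed agreement: for each sample, the share of rater pairs that agree.
    per_sample = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_sample) / n_samples

    # Chance agreement: based on how often each category is used overall.
    n_categories = len(counts[0])
    total_ratings = n_samples * n_raters
    category_share = [
        sum(row[j] for row in counts) / total_ratings
        for j in range(n_categories)
    ]
    expected = sum(p * p for p in category_share)

    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Toy example (made-up data): 10 raters, 4 samples, categories [positive, negative].
unanimous = [[10, 0], [0, 10], [10, 0], [0, 10]]
print(fleiss_kappa(unanimous))  # 1.0, every rater agrees on every sample

# Toy example (made-up data): 2 raters who disagree on every sample.
always_split = [[1, 1], [1, 1]]
print(fleiss_kappa(always_split))  # -1.0, worse than chance
```

The second toy example shows why the statistic is not strictly bounded at 0: systematic disagreement pushes it below zero.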

Firstly, looking solely at HER2, in the Varga study of MammaTyper®, agreement between pathologists on HER2-positive vs. HER2-negative gave a Fleiss’ kappa value of 1.00 – the highest value possible, representing perfect agreement.

In the UK National Coordinating Committee for Breast Pathology (UKNCCBP) study, the overall agreement of HER2 scoring using IHC (when broken into 0, 1+, 2+, and 3+) resulted in a Fleiss’ kappa value of 0.548, representing only ‘moderate’ agreement.

In the Erasmus study, the Krippendorff’s alpha value – a similar statistic to Fleiss’ kappa – for HER2 IHC scoring was 0.63, an interobserver reliability deemed ‘low’.

Looking at Ki67 tells a similar story. In the Varga study of MammaTyper®, agreement on Ki67-positive vs. Ki67-negative gave a Fleiss’ kappa value of 0.94, ‘almost perfect’ agreement between laboratories.

But a separate study by Varga and colleagues in 2015 showed that with IHC, the best Fleiss’ kappa score they could achieve for Ki67 scoring was 0.58, indicating only ‘moderate’ agreement.

Fleiss’ kappa scores for ER and PgR were also excellent in Varga’s study of MammaTyper®: 0.91 and 0.94 respectively, representing ‘almost perfect’ agreement. What’s more, the agreement on St. Gallen classification gave another ‘almost perfect’ kappa score of 0.90.
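The verbal labels attached to the kappa values above (‘moderate’, ‘almost perfect’) are consistent with the widely used Landis and Koch (1977) bands. Assuming that is the scale in use, the mapping can be sketched in a few lines of Python. The function name `landis_koch_label` is our own, and the mapping applies to kappa only: Krippendorff’s alpha is usually read against different cut-offs, so the Erasmus value is not included.

```python
def landis_koch_label(kappa: float) -> str:
    """Return the Landis and Koch (1977) description for a kappa value.

    Hypothetical helper for illustration; intended for kappa statistics,
    not for Krippendorff's alpha.
    """
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise AssertionError("unreachable: every value up to 1 falls in a band")


# Kappa values quoted in this post.
quoted = {
    "HER2, MammaTyper (Varga 2017)": 1.00,
    "HER2, IHC (UKNCCBP)": 0.548,
    "Ki67, MammaTyper (Varga 2017)": 0.94,
    "Ki67, IHC (Varga 2015)": 0.58,
    "ER, MammaTyper (Varga 2017)": 0.91,
    "PgR, MammaTyper (Varga 2017)": 0.94,
    "St. Gallen subtype, MammaTyper (Varga 2017)": 0.90,
}
for name, value in quoted.items():
    print(f"{name}: {value} -> {landis_koch_label(value)}")
```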

Well-established issues with IHC

The issues with the reproducibility of IHC were already well known in 2017 when the Varga study of MammaTyper® was published. The authors note that “the intra- and inter-observer variability of IHC is a concern”, with discrepancies of up to 20% for HER2, ER, and PgR, and “prominent and challenging” inconsistency for Ki67.

Based upon the two IHC concordance studies, it doesn’t appear that much has improved since 2017. It’s worth pointing out that the participants in the recent UKNCCBP and Erasmus studies were all expert pathologists with extensive experience, using the latest ‘standardised’ techniques for HER2 IHC staining. If this is the best concordance that IHC can produce under those conditions, we can expect that ‘real-world’ practice in local hospital pathology departments will give poorer results.

A real impact for patients

The disappointing agreement on scoring HER2 has become all the more relevant now that ‘HER2-low’ patients could benefit from HER2-directed therapies. Ki67 scoring with IHC has always had problems, and since Ki67 helps distinguish between the St. Gallen Luminal A and Luminal B categories, this affects decisions about chemotherapy for patients.

In conclusion, the reliability of IHC (or lack thereof) is not merely an academic question: it has a real impact on patients, who could run the risk of being over- or under-treated, depending on where they are diagnosed. Patients deserve a diagnosis from pathologists that is based upon a more reliable tool than IHC. We believe the evidence shows that MammaTyper® is that tool.
