Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology

Tellez, David; Litjens, Geert; Bandi, Peter; Bulten, Wouter; Bokhorst, John-Melle; Ciompi, Francesco; van der Laak, Jeroen

doi:10.1016/j.media.2019.101544

arXiv:1902.06543 (cs)

[Submitted on 18 Feb 2019 (v1), last revised 15 Apr 2020 (this version, v2)]

Abstract: Stain variation is a phenomenon observed when distinct pathology laboratories stain tissue slides that exhibit similar but not identical color appearance. Due to this color shift between laboratories, convolutional neural networks (CNNs) trained with images from one lab often underperform on unseen images from the other lab. Several techniques have been proposed to reduce the generalization error, mainly grouped into two categories: stain color augmentation and stain color normalization. The former simulates a wide variety of realistic stain variations during training, producing stain-invariant CNNs. The latter aims to match training and test color distributions in order to reduce stain variation. For the first time, we compared some of these techniques and quantified their effect on CNN classification performance using a heterogeneous dataset of hematoxylin and eosin histopathology images from 4 organs and 9 pathology laboratories. Additionally, we propose a novel unsupervised method to perform stain color normalization using a neural network. Based on our experimental results, we provide practical guidelines on how to use stain color augmentation and stain color normalization in future computational pathology applications.

Comments:	Accepted in the Medical Image Analysis journal
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1902.06543 [cs.CV]
	(or arXiv:1902.06543v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1902.06543
Related DOI:	https://doi.org/10.1016/j.media.2019.101544

Submission history

From: David Tellez
[v1] Mon, 18 Feb 2019 12:36:58 UTC (7,586 KB)
[v2] Wed, 15 Apr 2020 14:03:56 UTC (9,274 KB)
