Investigating the power of deep learning for predicting breast cancer from whole slide images

A third-year project submitted in partial fulfilment of the requirements for a Computer Science BSc at the University of Warwick, supervised by Professor Fayyaz Minhas.

Abstract

Both in the UK and globally, breast cancer is the most commonly diagnosed cancer in females. A key component in the diagnosis and management of breast cancer is the analysis of the tumour samples from the diagnostic biopsy.
The computational pathology literature reports high average performance when predicting tumour properties from microscope slides using deep learning models, giving hope that the need for specific laboratory tests could be eliminated. However, recent literature shows that these models rely excessively on correlated properties and so perform poorly in subgroups of patients who do not follow these correlations.
In this project we gathered data on the relationship between the complexity of models and their overall performance. We also investigated the feasibility of making models ignore these undesired correlations.
We find that the performance gap between simple and complex models is small, suggesting that the popular status of ER/PR (oestrogen receptor/progesterone receptor) prediction from H&E (haematoxylin and eosin) slides as a benchmark is not deserved. We also find that none of the cutting-edge techniques we tried for learning robust models improves performance on the difficult subset in this task.
We conclude that H&E stains do not contain data from which ER/PR can be independently predicted, and so IHC testing cannot be directly disrupted by computational pathology.

Copyright

Code structure

datasets/

Contains code used to process the datasets. The datasets themselves must be obtained separately:

The TCGA graphs are available on request from Dr Mark Eastwood. The TCGA-BRCA clinical supplements used to find the ER/PR labels are publicly available for download from the NIH NCI Genomic Data Commons (GDC)
The ABCTB graphs (which already contain ER/PR labels) are also available on request from Dr Mark Eastwood. Note, however, that the ABCTB dataset must only be used for research purposes

experiments/

The bulk of the code for the project:

Each model architecture defines a subclass of torch.nn.Module
5-fold cross-validation is carried out by creating an instance of GNNModelTrainer and calling the train_and_validate method. The model architecture class is passed to train_and_validate and a new instance of it is constructed in each fold
GNNModelTrainer trains the model using PyTorch Lightning's Trainer object with ModelCheckpoint and EarlyStopping callbacks
GNNModelTrainer must create a LightningModule from the torch Module that defines the architecture. This is achieved using the wrapper class LightningModel, which defines the loss function and optimizer. Experiments that correspond to changes to LightningModel are implemented as feature switches in GNNModelTrainer, which are passed through to LightningModel
ModelEvaluationUtils contains the definitions of our metrics
GNNModelTrainer creates an instance of ModelEvaluator. The evalaute_fold method is called at the end of each fold and the close method is called once there are no folds left to evaluate. This produces a .metrics file, our own format designed to be both human-readable and machine-readable
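
The control flow described above (a model class passed in, a fresh instance built per fold, an evaluator called once per fold and closed at the end) can be illustrated with a minimal pure-Python sketch. This is not the project's code: it uses no PyTorch or PyTorch Lightning, and every name in it (MajorityClassModel, FoldEvaluator, k_fold_indices, and this simplified train_and_validate) is a hypothetical stand-in chosen for illustration. The accuracy metric and the round-robin fold split are likewise assumptions.

```python
import random
from typing import List, Sequence, Tuple, Type


class MajorityClassModel:
    """Hypothetical stand-in for a model architecture class."""

    def __init__(self) -> None:
        self.prediction = 0

    def fit(self, labels: Sequence[int]) -> None:
        # Predict whichever binary label is most common in the training data.
        self.prediction = int(sum(labels) * 2 >= len(labels))

    def predict(self, n_samples: int) -> List[int]:
        return [self.prediction] * n_samples


class FoldEvaluator:
    """Hypothetical evaluator: collects one score per fold, then summarises."""

    def __init__(self) -> None:
        self.fold_accuracies: List[float] = []

    def evaluate_fold(self, y_true: Sequence[int], y_pred: Sequence[int]) -> None:
        correct = sum(t == p for t, p in zip(y_true, y_pred))
        self.fold_accuracies.append(correct / len(y_true))

    def close(self) -> float:
        # Called once no folds remain; returns the mean accuracy over folds.
        return sum(self.fold_accuracies) / len(self.fold_accuracies)


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs over a shuffled index list."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        splits.append((train, folds[i]))
    return splits


def train_and_validate(model_class: Type[MajorityClassModel], labels: Sequence[int], k: int = 5) -> float:
    evaluator = FoldEvaluator()
    for train_idx, val_idx in k_fold_indices(len(labels), k):
        model = model_class()  # a new, untrained instance in every fold
        model.fit([labels[i] for i in train_idx])
        y_val = [labels[i] for i in val_idx]
        evaluator.evaluate_fold(y_val, model.predict(len(y_val)))
    return evaluator.close()
```

Passing the class rather than an instance means no learned state can leak from one fold into the next, which is the point of constructing the model inside the loop.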
