A Comparison of Self-Supervised Pretraining Approaches for Predicting Disease Risk from Chest Radiograph Images (2306.08955v1)
Abstract: Deep learning is the state of the art for medical imaging tasks, but it requires large, labeled datasets. For risk prediction, large datasets are rare since they require both imaging and follow-up (e.g., diagnosis codes). However, the release of publicly available imaging data with diagnostic labels presents an opportunity for self- and semi-supervised approaches to improve label efficiency for risk prediction. Though several studies have compared self-supervised approaches in natural image classification, object detection, and medical image interpretation, there is limited data on which approaches learn robust representations for risk prediction. We present a comparison of semi- and self-supervised learning to predict mortality risk using chest x-ray images. We find that a semi-supervised autoencoder outperforms contrastive and transfer learning in internal and external validation.
- Yanru Chen
- Michael T Lu
- Vineet K Raghu