Dive into Self-Supervised Learning for Medical Image Analysis: Data, Models and Tasks

Published 25 Sep 2022 in cs.CV | (2209.12157v2)

Abstract: Self-supervised learning (SSL) has achieved remarkable performance in various medical imaging tasks by exploiting priors learned from massive unlabelled data. However, for a specific downstream task, there is still no practical guide on how to select suitable pretext tasks and implementation details throughout the standard "pretrain-then-finetune" workflow. In this work, we focus on exploiting the capacity of SSL with respect to four realistic and significant issues: (1) the impact of SSL on imbalanced datasets, (2) the network architecture, (3) the applicability of upstream tasks to downstream tasks, and (4) the stacking effect of SSL and common policies for deep learning. We provide a large-scale, in-depth and fine-grained study through extensive experiments on predictive, contrastive, generative and multi-SSL algorithms. The results yield several insights. On the positive side, SSL advances class-imbalanced learning mainly by boosting the performance of the rare class, which is of particular interest in clinical diagnosis. On the negative side, SSL offers marginal or even negative returns in some cases, including severely imbalanced and relatively balanced data regimes, as well as when combined with common training policies. Our findings provide practical guidelines for the use of SSL in the medical context and highlight the need to develop universal pretext tasks that accommodate diverse application scenarios.