- The paper presents the Deep Symmetric Adaptation Network (DSAN), a novel unsupervised domain adaptation framework for cross-modality medical image segmentation built on a symmetric architecture.
- DSAN employs bidirectional feature alignment and adversarial losses to extract domain-invariant features and enhance semantic mining.
- Experimental results on Cardiac and BraTS datasets show improvements in Dice scores and reductions in segmentation errors.
Deep Symmetric Adaptation Network for Cross-modality Medical Image Segmentation
This paper introduces a novel approach to unsupervised domain adaptation for medical image segmentation, focusing on cross-modality tasks such as MRI-to-CT segmentation. The proposed method, termed Deep Symmetric Adaptation Network (DSAN), leverages a symmetric architecture to perform feature alignment and semantic mining. Its key innovations are the bidirectional alignment of features between the source and target domains and the training of the segmentation network on images in multiple styles generated by adversarial networks.
Method Overview
The DSAN framework is characterized by a fully symmetric architecture that combines shared and domain-specific components. The network is composed of a common encoder shared across domains, two domain-specific private decoders, and a pixel-wise classifier. The shared encoder and the private decoders form the translation sub-networks, which reconstruct images and generate cross-domain images. The pixel-wise classifier and the encoder form the segmentation sub-network, which exploits semantic information from the stylized images produced by adversarial training.
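The wiring described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' implementation: the class `SymmetricNetwork`, its method names, and the toy lambda "networks" are hypothetical, and images are stood in for by plain lists of floats.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[float]     # toy stand-in for an image tensor (assumption)
Features = List[float]  # toy stand-in for an encoder feature map (assumption)


@dataclass
class SymmetricNetwork:
    """Hypothetical wiring: shared encoder, two private decoders, one classifier."""
    encoder: Callable[[Image], Features]         # shared by both domains
    decoder_source: Callable[[Features], Image]  # private: renders source style
    decoder_target: Callable[[Features], Image]  # private: renders target style
    classifier: Callable[[Features], List[int]]  # pixel-wise labels

    def reconstruct(self, image: Image, domain: str) -> Image:
        """Encode, then decode with the decoder of the image's own domain."""
        decoder = self.decoder_source if domain == "source" else self.decoder_target
        return decoder(self.encoder(image))

    def translate(self, image: Image, domain: str) -> Image:
        """Encode, then decode with the decoder of the other domain."""
        decoder = self.decoder_target if domain == "source" else self.decoder_source
        return decoder(self.encoder(image))

    def segment(self, image: Image) -> List[int]:
        """Segmentation path: shared encoder followed by the classifier."""
        return self.classifier(self.encoder(image))


# Toy callables standing in for real networks (purely illustrative).
net = SymmetricNetwork(
    encoder=lambda img: [2.0 * p for p in img],
    decoder_source=lambda f: [v / 2.0 for v in f],
    decoder_target=lambda f: [v / 2.0 + 1.0 for v in f],
    classifier=lambda f: [1 if v > 1.0 else 0 for v in f],
)

source_image = [0.2, 0.8]
reconstructed = net.reconstruct(source_image, "source")  # source -> source
fake_target = net.translate(source_image, "source")      # source -> target style
labels = net.segment(source_image)                       # encoder + classifier
```

The point of the sketch is that all three paths reuse the same `encoder`; only the head that follows it changes.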
Figure 1: An overview of the proposed method, highlighting the symmetric architecture for cross-domain adaptation in medical image segmentation.
Translation Sub-networks
The translation sub-networks are trained with adversarial losses to mitigate domain shift. They work bidirectionally, translating from the source domain to the target domain and vice versa, which aligns features across domains. The private decoders handle domain-specific rendering, so the encoder can concentrate on domain-invariant features. The adversarial loss encourages each generated cross-domain image to be indistinguishable from real images of the domain it is translated into.
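As a rough illustration of the adversarial objective, the sketch below uses the standard binary cross-entropy GAN loss. The summary does not state which GAN formulation the paper uses, so this loss form is an assumption, and every function name here is hypothetical. Discriminator outputs are taken to be probabilities in (0, 1).

```python
import math
from typing import List


def bce(prob: float, label: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one discriminator output in (0, 1)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def discriminator_loss(d_real: List[float], d_fake: List[float]) -> float:
    """The discriminator wants real images scored 1 and translated images scored 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake: List[float]) -> float:
    """The encoder and decoder want translated images to be scored as real."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


def bidirectional_generator_loss(
    d_target_on_s2t: List[float], d_source_on_t2s: List[float]
) -> float:
    """Sum of the two translation directions (source->target and target->source)."""
    return generator_loss(d_target_on_s2t) + generator_loss(d_source_on_t2s)
```

For example, `generator_loss([0.9])` is smaller than `generator_loss([0.1])`: the translation path is rewarded when the discriminator of the destination domain believes the generated image is real.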
Segmentation Sub-network
The segmentation sub-network is trained on images generated from both the source and target domains, exploiting semantic information across different styles. It employs deep supervision through additional classifiers attached to lower feature maps. Semantic mining is further enhanced by adversarial losses on the prediction maps, which align the segmentation outputs of the different domains.
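Deep supervision is commonly implemented as a weighted sum of the main classifier's loss and the auxiliary classifiers' losses. The sketch below shows that pattern only; it is not the paper's exact loss. The function names and the `aux_weight` value of 0.1 are hypothetical, and auxiliary predictions are assumed to be already upsampled to label resolution and given as per-pixel class probabilities.

```python
import math
from typing import List, Sequence


def pixel_cross_entropy(
    probs: Sequence[Sequence[float]], labels: Sequence[int], eps: float = 1e-7
) -> float:
    """Mean cross-entropy; probs[i][c] is the predicted probability of class c at pixel i."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must cover the same pixels")
    total = -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return total / len(labels)


def deep_supervision_loss(
    main_probs: Sequence[Sequence[float]],
    aux_probs_list: List[Sequence[Sequence[float]]],
    labels: Sequence[int],
    aux_weight: float = 0.1,  # hypothetical weight, not taken from the paper
) -> float:
    """Main segmentation loss plus down-weighted losses from auxiliary classifiers."""
    loss = pixel_cross_entropy(main_probs, labels)
    for aux_probs in aux_probs_list:
        loss += aux_weight * pixel_cross_entropy(aux_probs, labels)
    return loss
```

Because the setting is unsupervised with respect to the target domain, a supervised term like this can only be computed where labels exist, that is, for source images and for stylized images derived from them.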
Experimental Results
The efficacy of DSAN is demonstrated through experiments on two medical datasets: the Cardiac dataset and the BraTS dataset. The results show improvements over state-of-the-art methods on these tasks, with higher Dice scores and lower average surface distance (ASD) and Hausdorff distance. Combining bidirectional feature alignment with comprehensive semantic mining proves advantageous compared to methods that use image translation or feature alignment alone.
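For readers unfamiliar with the metrics, here is a minimal sketch of the Dice score and the symmetric Hausdorff distance. It is illustrative only and not the paper's evaluation code: masks are flattened binary lists, the Hausdorff distance is computed over explicit point sets rather than extracted surface voxels, and returning 1.0 when both masks are empty is a convention chosen for this example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A and B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention for this sketch: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def hausdorff_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance between two non-empty point sets."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def directed(x: Sequence[Point], y: Sequence[Point]) -> float:
        # Largest distance from a point in x to its nearest point in y.
        return max(min(math.dist(p, q) for q in y) for p in x)

    return max(directed(a, b), directed(b, a))
```

Higher Dice means better overlap, while lower Hausdorff distance means the worst-case boundary disagreement is smaller, which is why the two are reported together.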
Figure 2: Cardiac segmentation results comparing different domain adaptation methods on the MRI-to-CT task.
Figure 3: Brain tumor segmentation results of various methods on the unsupervised domain adaptation task.
Ablation Study
The ablation studies highlight the importance of each component of DSAN. Specific experiments verify the impact of bidirectional feature alignment, of semantic mining with adversarial loss, and of the architectural choice between shared and private network components. The findings indicate that training on all stylized images further improves segmentation performance.
Discussion
The DSAN framework’s design choices, such as sharing the encoder across the segmentation and translation tasks, suggest that extracting domain-invariant features is an effective way to mitigate domain shift. The paper also discusses variations in network components, such as private decoders and shared discriminators, and their contribution to improved performance.
Figure 4: Training progress showing the effect of different components and settings on segmentation performance.
Conclusion
The DSAN model is a robust method for unsupervised domain adaptation in medical image segmentation. Its symmetric architecture aligns cross-modality features and exploits diverse semantic information, achieving better segmentation results than the compared methods. Future work may integrate self-training or pseudo-labeling strategies to make further use of unlabeled target-domain data.
Figure 5: Comparison of initialization strategies, showing the training benefit of pre-trained initialization.