Overview of "A Deep Learning Framework for Unsupervised Affine and Deformable Image Registration"
The paper presents an innovative deep learning-based framework for performing unsupervised image registration, specifically targeting the challenges in medical imaging. This approach circumvents the need for predefined example registrations by utilizing convolutional neural networks (ConvNets) within the Deep Learning Image Registration (DLIR) framework to perform both affine and deformable image registration. The authors demonstrate the framework's capability to execute image registration tasks with accuracy comparable to conventional methods but with significantly reduced computational time.
Methodology
The DLIR framework incorporates ConvNets designed to predict transformation parameters for image registration by leveraging image similarity between fixed and moving image pairs. The ConvNets are trained without supervision, meaning no example registrations are required during training. This unsupervised approach addresses the challenge of acquiring manually labeled training data, which is particularly arduous in medical imaging contexts.
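Training minimizes a dissimilarity loss between the fixed image and the warped moving image; the paper uses normalized cross-correlation (NCC) as the similarity metric. A minimal NumPy sketch of such a metric (the in-network version must be differentiable and is implemented with the framework's tensor operations; this standalone function is illustrative):

```python
import numpy as np

def ncc(fixed: np.ndarray, moving: np.ndarray, eps: float = 1e-8) -> float:
    """Normalized cross-correlation between two images (1.0 = identical up to
    affine intensity scaling; higher means more similar)."""
    f = fixed.ravel() - fixed.mean()
    m = moving.ravel() - moving.mean()
    return float((f @ m) / (np.sqrt((f @ f) * (m @ m)) + eps))

# During unsupervised training, the loss would be the *negative* NCC between
# the fixed image and the moving image warped by the predicted transformation,
# so that gradient descent increases similarity.
```

Because the loss is computed directly from image intensities, no example registrations or manual labels are ever needed.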
Affine and Deformable Registration
The paper details distinct ConvNet architectures tailored for affine and deformable registration:
- Affine Image Registration: ConvNets for affine registration analyze the fixed and moving images to predict the parameters of a global affine transformation, and can accommodate input images of differing sizes.
- Deformable Image Registration: ConvNets for deformable registration predict displacements for a grid of B-spline control points, a smooth and memory-efficient parameterization that captures local deformations.
The framework also supports multi-stage ConvNets that perform sequential affine and deformable registration, effectively refining registration through coarse-to-fine resolutions.
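To make the affine stage concrete, the sketch below applies a predicted 2-D affine transform to an image by mapping each output coordinate back into the moving image. This is a simplified NumPy illustration: the paper's implementation resamples differentiably inside the network, and the nearest-neighbour sampling here is a stand-in for that.

```python
import numpy as np

def affine_warp(moving: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Warp a 2-D image with a 2x3 affine matrix A that maps output
    coordinates to input coordinates, using nearest-neighbour sampling."""
    h, w = moving.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel(), np.ones(h * w)])  # homogeneous
    src = A @ coords                                             # (2, h*w)
    sy = np.clip(np.round(src[0]).astype(int), 0, h - 1)
    sx = np.clip(np.round(src[1]).astype(int), 0, w - 1)
    return moving[sy, sx].reshape(h, w)

# The identity transform leaves the image unchanged:
identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
```

In a multi-stage pipeline, the affine stage's output grid becomes the input to the deformable stages, which add local B-spline displacements on top of the global transform.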
Experimental Evaluation
The authors conduct extensive experiments to validate the DLIR framework using intra-patient cardiac cine MRIs, inter-patient low-dose chest CTs, and publicly available 4D chest CT data. They demonstrate that:
- Intra-patient Cardiac MRI Registration: The DLIR framework performs comparably to conventional iterative methods while significantly reducing the risk of image folding.
- Inter-patient Chest CT Registration: Although conventional iterative methods achieve slightly higher accuracy at the finest stages, the DLIR framework remains competitive with fewer outliers and registers an image pair in 0.43 seconds on average on a GPU.
- DIR-Lab Data Evaluation: On the publicly available dataset, the DLIR framework achieves reasonable accuracy, indicating its robustness even with limited training data.
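Folding, as evaluated above, is commonly quantified via the Jacobian determinant of the transformation x + u(x): wherever the determinant is non-positive, the deformation locally folds onto itself. A finite-difference sketch for a dense 2-D displacement field (an illustrative check, not the paper's evaluation code):

```python
import numpy as np

def jacobian_det_2d(disp: np.ndarray) -> np.ndarray:
    """Jacobian determinant of the transform x + u(x) for a 2-D displacement
    field disp of shape (2, H, W), where disp[0] is the y-displacement and
    disp[1] the x-displacement. Values <= 0 indicate folding."""
    duy_dy, duy_dx = np.gradient(disp[0])
    dux_dy, dux_dx = np.gradient(disp[1])
    return (1 + duy_dy) * (1 + dux_dx) - duy_dx * dux_dy

# A zero displacement field has determinant 1 everywhere: no folding.
```

The fraction of voxels with a non-positive determinant gives a simple, widely used folding score for comparing registration methods.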
Technical and Theoretical Implications
The DLIR framework's unsupervised approach has significant implications for medical image analysis:
- Resource Efficiency: Eliminating the need for curated labels or example registrations decreases the dependence on expert annotations, reducing time and cost.
- Speed: The GPU-accelerated ConvNets register images in well under a second, facilitating near-real-time applications in clinical settings.
- Scalability: The framework is adaptable to various transformation models and image modalities, suggesting broad applicability in other domains requiring complex image registration.
Future Directions
Continued advancements in the DLIR framework could incorporate additional deep learning strategies to enhance robustness and accuracy further. Potential areas of exploration include:
- Investigating more complex ConvNet architectures to improve registration precision without escalating memory consumption.
- Extending the framework to include multi-modality registration using different similarity metrics.
- Enhancing regularization techniques to enforce diffeomorphism and reduce folding.
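A standard smoothness regularizer in this family is the bending energy penalty, which the paper already applies to its B-spline stages; stronger or complementary variants are what the direction above points toward. The sketch below approximates bending energy with finite differences on a dense 2-D displacement field (illustrative only, not the paper's analytic B-spline formulation):

```python
import numpy as np

def bending_energy_2d(disp: np.ndarray) -> float:
    """Approximate bending energy of a 2-D displacement field of shape
    (2, H, W): mean squared second spatial derivatives, penalising sharply
    bending (and hence folding-prone) deformations."""
    energy = 0.0
    for u in disp:                        # each displacement component
        d_dy, d_dx = np.gradient(u)
        d2_dyy, _ = np.gradient(d_dy)
        d2_dxy, d2_dxx = np.gradient(d_dx)
        energy += np.mean(d2_dyy**2 + 2 * d2_dxy**2 + d2_dxx**2)
    return float(energy)

# Affine (globally linear) displacements have zero bending energy, so the
# penalty discourages only local, high-curvature deformation.
```

Adding such a term to the negative-similarity loss trades a small amount of matching accuracy for smoother, more plausible deformations.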
In conclusion, the DLIR framework represents a significant step toward efficient and accurate unsupervised image registration. Its ability to generalize across distinct medical imaging tasks with minimal computation highlights its potential utility in developing more sophisticated AI tools for clinical use.