Emergent Mind

Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV

(2407.11087)
Published Jul 14, 2024 in eess.IV and cs.CV

Abstract

Transformers have revolutionized medical image restoration, but the quadratic complexity still poses limitations for their application to high-resolution medical images. The recent advent of RWKV in the NLP field has attracted much attention as it can process long sequences efficiently. To leverage its advanced design, we propose Restore-RWKV, the first RWKV-based model for medical image restoration. Since the original RWKV model is designed for 1D sequences, we make two necessary modifications for modeling spatial relations in 2D images. First, we present a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity. Re-WKV incorporates bidirectional attention as the basis for a global receptive field and recurrent attention to effectively model 2D dependencies from various scan directions. Second, we develop an omnidirectional token shift (Omni-Shift) layer that enhances local dependencies by shifting tokens from all directions and across a wide context range. These adaptations make the proposed Restore-RWKV an efficient and effective model for medical image restoration. Extensive experiments demonstrate that Restore-RWKV achieves superior performance across various medical image restoration tasks, including MRI image super-resolution, CT image denoising, PET image synthesis, and all-in-one medical image restoration. Code is available at: https://github.com/Yaziwel/Restore-RWKV

Restore-RWKV architecture overview and detailed R-RWKV block with Re-WKV attention and Omni-Shift layer.

Overview

  • Restore-RWKV adapts the Receptance Weighted Key Value (RWKV) model from natural language processing to medical image restoration, addressing the limited receptive fields of CNNs and the quadratic computational complexity of Transformers.

  • Extensive experiments demonstrate that Restore-RWKV outperforms state-of-the-art methods across various medical imaging tasks, including MRI image super-resolution, CT image denoising, and PET image synthesis.

  • The model reaches competitive performance at lower computational complexity than Transformers, which makes it a candidate for real-world medical imaging systems where both diagnostic accuracy and efficiency matter.

Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV

"Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV" is a substantial contribution to the field of medical image restoration (MedIR). It addresses some of the core limitations associated with existing approaches, including convolutional neural networks (CNNs), Transformers, and Mamba-based models. This paper presents an innovative adaptation of the Receptance Weighted Key Value (RWKV) model, originally designed for 1D sequences in NLP, to the domain of 2D medical image restoration.

Key Contributions

  1. Restore-RWKV Model Design: The paper introduces Restore-RWKV, the first RWKV-based model applied to MedIR. This model effectively addresses the high computational complexity and limited receptive fields of previous approaches by leveraging two novel components:
  • Recurrent WKV (Re-WKV) Attention: The authors propose a Re-WKV attention mechanism that captures global dependencies with linear computational complexity. Unlike the unidirectional WKV attention in the original RWKV, Re-WKV utilizes a bidirectional attention mechanism to achieve a global receptive field and models 2D dependencies by recurrently processing various scan directions.
  • Omnidirectional Token Shift (Omni-Shift): This layer enhances local dependencies by shifting tokens from all directions across a wide context range. It employs a structural re-parameterization strategy; such strategies typically train with several parallel branches and merge them into a single equivalent operator for inference, so the efficiency gain applies at test time.
  2. Extensive Experimental Validation: The efficacy of Restore-RWKV is demonstrated through extensive experiments across multiple MedIR tasks, such as MRI image super-resolution, CT image denoising, PET image synthesis, and a composite all-in-one medical image restoration task. The results show that Restore-RWKV consistently outperforms state-of-the-art methods in these domains.
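
Structural re-parameterization, as mentioned for Omni-Shift, generally means computing several parallel branches during training and folding them into one equivalent operator for inference. The sketch below illustrates that idea on a single channel. The branch layout (identity, 1×1, 3×3, and 5×5 kernels with scalar weights `alphas`) and all function names are assumptions made for illustration, not details confirmed by this summary.

```python
def pad_kernel(k, size=5):
    """Zero-pad a small odd square kernel to size x size, keeping it centered."""
    off = (size - len(k)) // 2
    out = [[0.0] * size for _ in range(size)]
    for i, row in enumerate(k):
        for j, val in enumerate(row):
            out[i + off][j + off] = val
    return out


def conv2d_same(img, k):
    """Single-channel 2D correlation with zero padding (output size = input size)."""
    h, w, r = len(img), len(img[0]), len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(len(k)):
                for j in range(len(k)):
                    yy, xx = y + i - r, x + j - r
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx] * k[i][j]
            out[y][x] = acc
    return out


def multi_branch(img, kernels, alphas):
    """Training-time form: run every branch, then take the weighted sum."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for k, a in zip(kernels, alphas):
        branch = conv2d_same(img, k)
        for y in range(h):
            for x in range(w):
                out[y][x] += a * branch[y][x]
    return out


def merge_branches(kernels, alphas, size=5):
    """Inference-time form: fold all weighted branches into one size x size kernel."""
    merged = [[0.0] * size for _ in range(size)]
    for k, a in zip(kernels, alphas):
        padded = pad_kernel(k, size)
        for i in range(size):
            for j in range(size):
                merged[i][j] += a * padded[i][j]
    return merged
```

Because convolution is linear, `conv2d_same(img, merge_branches(kernels, alphas))` matches `multi_branch(img, kernels, alphas)` up to floating-point rounding, with the identity branch written as the 1×1 kernel `[[1.0]]`. The merged form needs only one pass over the image.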

Numerical Results and Implications

The paper provides comprehensive numerical results evidencing the superiority of Restore-RWKV:

  • MRI Image Super-Resolution: Restore-RWKV achieves a PSNR of 32.0913, SSIM of 0.9408, and RMSE of 28.9713, outperforming the second-best method, Restormer, by over 0.20 dB in PSNR.
  • CT Image Denoising: Restore-RWKV achieves a PSNR of 33.7988, SSIM of 0.9198, and RMSE of 8.3600, surpassing other models including MambaIR.
  • PET Image Synthesis: The model attains a PSNR of 37.3314, SSIM of 0.9474, and RMSE of 0.0852, showing its robustness across different medical imaging modalities.
  • All-in-One MedIR: While Restore-RWKV does not include specialized modules for handling task disparities, it achieves second-best results on average, indicating strong capacity and generalizability across multiple tasks.
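
To make the PSNR and RMSE figures above easier to interpret, here is a minimal sketch of how the two metrics relate. It assumes images are given as flat sequences of pixel values and that the caller supplies the intensity range (`data_range`); the paper's exact evaluation protocol and intensity scaling are not stated in this summary, and SSIM is omitted.

```python
import math


def rmse(ref, out):
    """Root-mean-square error between two equally sized pixel sequences."""
    n = len(ref)
    return math.sqrt(sum((r - o) ** 2 for r, o in zip(ref, out)) / n)


def psnr(ref, out, data_range):
    """Peak signal-to-noise ratio in dB: 20 * log10(data_range / RMSE)."""
    err = rmse(ref, out)
    if err == 0.0:
        return math.inf  # identical images
    return 20.0 * math.log10(data_range / err)


def rmse_ratio_for_psnr_gain(gain_db):
    """Factor by which RMSE shrinks for a given PSNR gain at a fixed data range."""
    return 10.0 ** (-gain_db / 20.0)
```

At a fixed data range, `rmse_ratio_for_psnr_gain(0.20)` is about 0.977, so a 0.20 dB PSNR gain corresponds to roughly 2.3% lower RMSE.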

Theoretical and Practical Implications

Theoretical Implications: Restore-RWKV advances theoretical understanding by showing that the RWKV architecture can be adapted from NLP to vision tasks with specific modifications. The introduction of Re-WKV attention and Omni-Shift mechanisms provides insights into effective ways to model both global and local dependencies in high-dimensional data. This demonstrates that RWKV-based models can achieve competitive performance with lower computational complexity compared to traditional Transformers.
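
The claim that bidirectional WKV attention reaches a global receptive field at linear cost can be illustrated with a small sketch. It assumes a distance-decayed weighting in the style of the bidirectional WKV used in vision adaptations of RWKV, reduced to one channel with a scalar decay `w` and a current-token bonus `u`. These names, the omission of any sequence-length normalization, and the lack of numerical stabilization are assumptions of this sketch, not details taken from the paper. The recurrent multi-direction scanning of Re-WKV is not shown.

```python
import math


def bi_wkv_naive(k, v, w, u):
    """O(T^2) reference: every token attends to all others with distance decay."""
    T = len(k)
    out = []
    for t in range(T):
        num = math.exp(u + k[t]) * v[t]  # current token gets bonus u
        den = math.exp(u + k[t])
        for i in range(T):
            if i != t:
                weight = math.exp(-(abs(t - i) - 1) * w + k[i])
                num += weight * v[i]
                den += weight
        out.append(num / den)
    return out


def bi_wkv_linear(k, v, w, u):
    """O(T) version: one forward and one backward scan over running sums."""
    T = len(k)
    decay = math.exp(-w)
    fwd = [(0.0, 0.0)] * T
    num = den = 0.0
    for t in range(T):
        fwd[t] = (num, den)  # sums over tokens strictly before t
        num = decay * num + math.exp(k[t]) * v[t]
        den = decay * den + math.exp(k[t])
    out = [0.0] * T
    num = den = 0.0  # now sums over tokens strictly after t
    for t in reversed(range(T)):
        cur = math.exp(u + k[t])
        out[t] = (fwd[t][0] + num + cur * v[t]) / (fwd[t][1] + den + cur)
        num = decay * num + math.exp(k[t]) * v[t]
        den = decay * den + math.exp(k[t])
    return out
```

Both functions compute the same weighted average for every position, but the second visits each token a constant number of times, whereas pairwise attention cost grows quadratically with the number of tokens.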

Practical Implications: Practically, Restore-RWKV can be implemented in real-world medical imaging systems to improve diagnostic accuracy and efficiency. Its ability to handle various MedIR tasks makes it a versatile tool that can be integrated into systems where resources are constrained, yet high resolution and accuracy are required.

Future Directions

The promising results of Restore-RWKV suggest several directions for future research:

  • Generalization to Other Imaging Tasks: Extend Restore-RWKV to other types of medical imaging, such as ultrasound or X-ray, to further validate its versatility.
  • Real-Time Processing: Optimize Restore-RWKV for real-time applications, especially in clinical settings, where fast and accurate image restoration is crucial.
  • Hybrid Models: Investigate hybrid architectures that combine the strengths of Restore-RWKV with other advanced models to push the boundaries of current MedIR capabilities.
  • Large-Scale Clinical Trials: Conduct large-scale, real-world clinical trials to assess the practical benefits and limitations of Restore-RWKV in diverse healthcare environments.

In conclusion, Restore-RWKV marks a significant advancement in medical image restoration, offering efficient and effective solutions through novel adaptations of RWKV mechanisms. Its strong results on the single-task benchmarks, together with second-best average results in the all-in-one setting, highlight its potential to become a new standard in MedIR applications.
