Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV (2407.11087v3)

Published 14 Jul 2024 in eess.IV and cs.CV

Abstract: Transformers have revolutionized medical image restoration, but the quadratic complexity still poses limitations for their application to high-resolution medical images. The recent advent of the Receptance Weighted Key Value (RWKV) model in the natural language processing field has attracted much attention due to its ability to process long sequences efficiently. To leverage its advanced design, we propose Restore-RWKV, the first RWKV-based model for medical image restoration. Since the original RWKV model is designed for 1D sequences, we make two necessary modifications for modeling spatial relations in 2D medical images. First, we present a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity. Re-WKV incorporates bidirectional attention as the basis for a global receptive field and recurrent attention to effectively model 2D dependencies from various scan directions. Second, we develop an omnidirectional token shift (Omni-Shift) layer that enhances local dependencies by shifting tokens from all directions and across a wide context range. These adaptations make the proposed Restore-RWKV an efficient and effective model for medical image restoration. Even a lightweight variant of Restore-RWKV, with only 1.16 million parameters, achieves comparable or even superior results compared to existing state-of-the-art (SOTA) methods. Extensive experiments demonstrate that the resulting Restore-RWKV achieves SOTA performance across a range of medical image restoration tasks, including PET image synthesis, CT image denoising, MRI image super-resolution, and all-in-one medical image restoration. Code is available at: https://github.com/Yaziwel/Restore-RWKV.

Summary

The paper presents the first RWKV-based model for MedIR, featuring Re-WKV attention and Omni-Shift for enhanced global and local image dependencies.
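It reports the best results among the compared methods on MRI super-resolution, CT denoising, and PET synthesis, including a PSNR gain of more than 0.20 dB over Restormer on MRI super-resolution.
Its attention has linear rather than quadratic computational complexity, which makes it a candidate for high-resolution images and resource-constrained settings without sacrificing restoration quality.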

Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV

"Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV" is a substantial contribution to the field of medical image restoration (MedIR). It addresses some of the core limitations associated with existing approaches, including convolutional neural networks (CNNs), Transformers, and Mamba-based models. This paper presents an innovative adaptation of the Receptance Weighted Key Value (RWKV) model, originally designed for 1D sequences in NLP, to the domain of 2D medical image restoration.

Key Contributions

Restore-RWKV Model Design: The paper introduces Restore-RWKV, the first RWKV-based model applied to MedIR. This model effectively addresses the high computational complexity and limited receptive fields of previous approaches by leveraging two novel components:

Recurrent WKV (Re-WKV) Attention: The authors propose a Re-WKV attention mechanism that captures global dependencies with linear computational complexity. Unlike the unidirectional WKV attention in the original RWKV, Re-WKV utilizes a bidirectional attention mechanism to achieve a global receptive field and models 2D dependencies by recurrently processing various scan directions.
Omnidirectional Token Shift (Omni-Shift): This layer enhances local dependencies by shifting tokens from all directions across a wide context range. It uses a structural re-parameterization strategy: convolution branches with different kernel sizes are learned during training and merged into a single kernel for testing, so the wider context adds no inference cost.
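
The bidirectional, linear-time attention described in the Re-WKV item can be sketched as follows. This is a minimal single-channel illustration in plain Python, not the authors' implementation. The weighting follows the bidirectional WKV form used in Vision-RWKV; the function names, the decay `w`, the current-token bonus `u`, and the choice of a row-major pass followed by a column-major pass are assumptions made for illustration.

```python
import math

def bi_wkv_naive(k, v, w, u):
    """Reference O(T^2) bidirectional WKV for one channel.

    out[t] = (sum_{i != t} exp(-(|t-i|-1)/T * w + k[i]) * v[i] + exp(u + k[t]) * v[t])
             / (same sums without the v factors)
    """
    T = len(k)
    out = []
    for t in range(T):
        cur = math.exp(u + k[t])
        num, den = cur * v[t], cur
        for i in range(T):
            if i != t:
                weight = math.exp(-(abs(t - i) - 1) / T * w + k[i])
                num += weight * v[i]
                den += weight
        out.append(num / den)
    return out

def bi_wkv_linear(k, v, w, u):
    """Same result in O(T): one forward and one backward decayed running sum."""
    T = len(k)
    decay = math.exp(-w / T)              # per-step decay factor
    ek = [math.exp(x) for x in k]
    fwd_num, fwd_den = [0.0] * T, [0.0] * T
    for t in range(1, T):                 # contributions from tokens i < t
        fwd_num[t] = decay * fwd_num[t - 1] + ek[t - 1] * v[t - 1]
        fwd_den[t] = decay * fwd_den[t - 1] + ek[t - 1]
    bwd_num, bwd_den = [0.0] * T, [0.0] * T
    for t in range(T - 2, -1, -1):        # contributions from tokens i > t
        bwd_num[t] = decay * bwd_num[t + 1] + ek[t + 1] * v[t + 1]
        bwd_den[t] = decay * bwd_den[t + 1] + ek[t + 1]
    out = []
    for t in range(T):
        cur = math.exp(u + k[t])
        out.append((fwd_num[t] + bwd_num[t] + cur * v[t])
                   / (fwd_den[t] + bwd_den[t] + cur))
    return out

def re_wkv_2d(k, v, w, u):
    """Hypothetical two-pass scan over an H x W single-channel map.

    Assumption: pass 1 scans row-major; its output is fed as the values
    of pass 2, which scans column-major with the same keys.
    """
    H, W = len(k), len(k[0])
    row_k = [k[r][c] for r in range(H) for c in range(W)]
    row_v = [v[r][c] for r in range(H) for c in range(W)]
    y = bi_wkv_linear(row_k, row_v, w, u)          # indexed as y[r * W + c]
    col_k = [k[r][c] for c in range(W) for r in range(H)]
    col_v = [y[r * W + c] for c in range(W) for r in range(H)]
    z = bi_wkv_linear(col_k, col_v, w, u)          # indexed as z[c * H + r]
    return [[z[c * H + r] for c in range(W)] for r in range(H)]
```

Because every output is a weighted average of the values with positive weights, each token can draw on every other token (a global receptive field), while the two running sums keep the cost linear in the number of tokens. The plain `exp` calls are fine for small inputs; a practical kernel would need a numerically stabilized form.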

Extensive Experimental Validation: The efficacy of Restore-RWKV is demonstrated through extensive experiments across multiple MedIR tasks: MRI image super-resolution, CT image denoising, PET image synthesis, and a composite all-in-one medical image restoration task. The results show that Restore-RWKV outperforms the compared state-of-the-art methods on the three single-task benchmarks and remains competitive (second-best on average) in the all-in-one setting.

Numerical Results and Implications

The paper provides comprehensive numerical results evidencing the superiority of Restore-RWKV:

MRI Image Super-Resolution: Restore-RWKV achieves a PSNR of 32.0913 dB, an SSIM of 0.9408, and an RMSE of 28.9713, outperforming the second-best method, Restormer, by more than 0.20 dB in PSNR.
CT Image Denoising: Restore-RWKV achieves a PSNR of 33.7988 dB, an SSIM of 0.9198, and an RMSE of 8.3600, surpassing other models including MambaIR.
PET Image Synthesis: The model attains a PSNR of 37.3314 dB, an SSIM of 0.9474, and an RMSE of 0.0852, indicating that it carries over to a different imaging modality.
All-in-One MedIR: While Restore-RWKV does not include specialized modules for handling task disparities, it achieves second-best results on average, indicating strong capacity and generalizability across multiple tasks.
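
For readers less familiar with the metrics in the list above, the following is a minimal sketch of RMSE and PSNR over flattened pixel lists, using their standard definitions. The function names and the `data_range` argument are illustrative assumptions; the summary does not state the intensity range or normalization used in the paper's evaluation, so these functions will not reproduce the reported numbers without that information.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def rmse(pred, target):
    """Root mean squared error, in the same units as the pixel values."""
    return math.sqrt(mse(pred, target))

def psnr(pred, target, data_range):
    """Peak signal-to-noise ratio in dB; data_range is the maximum possible
    intensity span (an assumption the caller must supply)."""
    err = mse(pred, target)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / err)

def mse_ratio_for_psnr_gain(gain_db):
    """Ratio (improved MSE / baseline MSE) implied by a PSNR gain in dB,
    assuming both methods are scored with the same data_range."""
    return 10.0 ** (-gain_db / 10.0)
```

Under these definitions, a 0.20 dB PSNR gain at a fixed data range corresponds to roughly 4.5% lower mean squared error, which gives a sense of scale for the margin over Restormer.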

Theoretical and Practical Implications

Theoretical Implications:

Restore-RWKV shows that the RWKV architecture can be adapted from NLP to vision tasks with targeted modifications. Re-WKV attention and Omni-Shift illustrate how to model global and local dependencies, respectively, in 2D image data. The results indicate that RWKV-based models can match or exceed Transformer-based restoration models while their attention cost grows linearly, rather than quadratically, with the number of tokens.
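
The structural re-parameterization behind Omni-Shift relies on the linearity of convolution: parallel branches with small kernels can be folded into one larger kernel. The sketch below illustrates that general idea for a single channel in plain Python. The branch set (identity, 1x1, 3x3, 5x5), the scaling factors `alphas`, and all function names are assumptions for illustration, not the paper's exact layer.

```python
def pad_kernel(kernel, size=5):
    """Zero-pad a square, odd-sized kernel to size x size, keeping it centered."""
    n = len(kernel)
    off = (size - n) // 2
    out = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            out[i + off][j + off] = kernel[i][j]
    return out

def merge_branches(kernels, alphas, size=5):
    """Fold several scaled branches into one size x size kernel."""
    merged = [[0.0] * size for _ in range(size)]
    for kern, alpha in zip(kernels, alphas):
        padded = pad_kernel(kern, size)
        for i in range(size):
            for j in range(size):
                merged[i][j] += alpha * padded[i][j]
    return merged

def conv2d_same(img, kernel):
    """Single-channel cross-correlation with zero padding ('same' output size)."""
    H, W, n = len(img), len(img[0]), len(kernel)
    r = n // 2
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    yy, xx = y + i - r, x + j - r
                    if 0 <= yy < H and 0 <= xx < W:
                        acc += kernel[i][j] * img[yy][xx]
            out[y][x] = acc
    return out

def multi_branch(img, kernels, alphas):
    """Training-time form: run every branch separately and sum the results."""
    H, W = len(img), len(img[0])
    total = [[0.0] * W for _ in range(H)]
    for kern, alpha in zip(kernels, alphas):
        branch = conv2d_same(img, kern)
        for y in range(H):
            for x in range(W):
                total[y][x] += alpha * branch[y][x]
    return total
```

Here the identity branch is simply the 1x1 kernel `[[1.0]]`. Calling `conv2d_same(img, merge_branches(kernels, alphas))` gives the same output as `multi_branch(img, kernels, alphas)` up to floating-point rounding, but with one convolution instead of four, which is why the merged form is attractive at test time.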

Practical Implications:

Practically, Restore-RWKV could be deployed in medical imaging pipelines to improve image quality ahead of diagnosis. Because it handles several MedIR tasks and its lightweight variant has only 1.16 million parameters, it is a plausible fit for resource-constrained systems that still require high-resolution, accurate restoration.

Future Directions

The promising results of Restore-RWKV suggest several directions for future research:

Generalization to Other Imaging Tasks: Extend Restore-RWKV to other medical imaging modalities, such as ultrasound or X-ray, to further test its versatility.
Real-Time Processing: Optimize Restore-RWKV for real-time use, especially in clinical settings where fast and accurate image restoration is crucial.
Hybrid Models: Investigate hybrid architectures that combine Restore-RWKV with other advanced models to improve on current MedIR capabilities.
Large-Scale Clinical Trials: Conduct large-scale, real-world clinical trials to assess the practical benefits and limitations of Restore-RWKV in diverse healthcare environments.

In conclusion, Restore-RWKV marks a significant advance in medical image restoration, offering an efficient and effective solution through novel adaptations of RWKV mechanisms. Its leading results on individual tasks, together with second-best average results as an all-in-one model, highlight its potential as a strong baseline for MedIR applications.
