Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 62 tok/s
Gemini 2.5 Pro 48 tok/s Pro
GPT-5 Medium 17 tok/s Pro
GPT-5 High 13 tok/s Pro
GPT-4o 101 tok/s Pro
Kimi K2 217 tok/s Pro
GPT OSS 120B 474 tok/s Pro
Claude Sonnet 4 36 tok/s Pro
2000 character limit reached

Dependability Analysis of Data Storage Systems in Presence of Soft Errors (2112.12520v1)

Published 23 Dec 2021 in cs.PF, cs.AR, and cs.IR

Abstract: In recent years, high availability and reliability of Data Storage Systems (DSS) have been significantly threatened by soft errors occurring in storage controllers. Due to their specific functionality and hardware-software stack, error propagation and manifestation in DSS is quite different from general-purpose computing architectures. To our knowledge, no previous study has examined the system-level effects of soft errors on the availability and reliability of data storage systems. In this paper, we first analyze the effects of soft errors occurring in the server processors of storage controllers on the entire storage system dependability. To this end, we implemented the major functions of a typical data storage system controller, running on a full stack of storage system operating system, and developed a framework to perform fault injection experiments using a full system simulator. We then propose a new metric, Storage System Vulnerability Factor (SSVF), to accurately capture the impact of soft errors in storage systems. By conducting extensive experiments, it is revealed that depending on the controller configuration, up to 40% of cache memory contains end-user data where any unrecoverable soft errors in this part will result in Data Loss (DL) in an irreversible manner. However, soft errors in the rest of cache memory filled by Operating System (OS) and storage applications will result in Data Unavailability (DU) at the storage system level. Our analysis also shows that Detectable Unrecoverable Errors (DUEs) on the cache data field are the major cause of DU in storage systems, while Silent Data Corruptions (SDCs) in the cache tag and data field are mainly the cause of DL in storage systems.

Citations (15)

Summary

We haven't generated a summary for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Lightbulb On Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube