Relax: An Architectural Framework for Software Recovery of Hardware Faults
Marc de Kruijf, Shuou Nomura, and Karthikeyan Sankaralingam. Relax: An Architectural Framework for Software Recovery of Hardware Faults. In Proceedings of the 37th International Symposium on Computer Architecture (ISCA), 2010.
Abstract
As technology scales ever further, device unreliability is creating excessive complexity for hardware to maintain the illusion of perfect operation. In this paper, we consider whether exposing hardware fault information to software and allowing software to control fault recovery simplifies hardware design and helps technology scaling. The combination of emerging applications and emerging many-core architectures makes software recovery a viable alternative to hardware-based fault recovery. Emerging applications tend to have few I/O and memory side-effects, which limits the amount of information that needs checkpointing, and they allow discarding individual sub-computations with small qualitative impact. Software recovery can harness these properties in ways that hardware recovery cannot. We describe Relax, an architectural framework for software recovery of hardware faults. Relax includes three core components: (1) an ISA extension that allows software to mark regions of code for software recovery, (2) a hardware organization that simplifies reliability considerations and provides energy efficiency with hardware recovery support removed, and (3) software support for compilers and programmers to utilize the Relax ISA. Applying Relax to counter the effects of process variation, our results show a 20% energy efficiency improvement for PARSEC applications with only minimal source code changes and simpler hardware.
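The abstract mentions two ways software can respond to a reported fault: re-executing a marked region, or discarding a sub-computation whose loss has small qualitative impact. The Python sketch below only simulates that choice and is not taken from the paper. The helper `run_relaxed`, its parameters, and the random fault model are all hypothetical, and it does not reflect the actual Relax ISA or compiler interface.

```python
import random


def run_relaxed(region, recover, fault_rate, rng, max_retries=1000):
    """Simulate a region of code marked for software recovery.

    Hypothetical helper: `region` is a side-effect-free callable, so nothing
    needs checkpointing. A fault is simulated with probability `fault_rate`
    per attempt, standing in for hardware fault detection.
    """
    for _ in range(max_retries):
        result = region()
        faulted = rng.random() < fault_rate  # simulated fault report
        if not faulted:
            return result
        if recover == "discard":
            return None  # drop this sub-computation entirely
        # recover == "retry": loop around and re-execute the region
    raise RuntimeError("retry budget exhausted")


def total_difference(blocks_a, blocks_b, fault_rate, seed=0):
    """Sum per-block absolute differences, skipping blocks that faulted."""
    rng = random.Random(seed)
    total = 0
    for a, b in zip(blocks_a, blocks_b):
        diff = run_relaxed(
            lambda: sum(abs(x - y) for x, y in zip(a, b)),
            "discard",
            fault_rate,
            rng,
        )
        if diff is not None:
            total += diff
    return total
```

With `recover="retry"` the caller always receives a fault-free result at the cost of extra executions. With `recover="discard"` the caller accepts a slightly degraded answer, which is only reasonable for computations that tolerate missing contributions.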
BibTeX
@inproceedings{isca10:relax,
  author = {Marc de Kruijf and Shuou Nomura and Karthikeyan Sankaralingam},
  title = {Relax: An Architectural Framework for Software Recovery of Hardware Faults},
  booktitle = "{Proceedings of the 37th International Symposium on Computer Architecture (ISCA)}",
  year = {2010},
  abstract = {As technology scales ever further, device unreliability is creating excessive complexity for hardware to maintain the illusion of perfect operation. In this paper, we consider whether exposing hardware fault information to software and allowing software to control fault recovery simplifies hardware design and helps technology scaling. The combination of emerging applications and emerging many-core architectures makes software recovery a viable alternative to hardware-based fault recovery. Emerging applications tend to have few I/O and memory side-effects, which limits the amount of information that needs checkpointing, and they allow discarding individual sub-computations with small qualitative impact. Software recovery can harness these properties in ways that hardware recovery cannot. We describe Relax, an architectural framework for software recovery of hardware faults. Relax includes three core components: (1) an ISA extension that allows software to mark regions of code for software recovery, (2) a hardware organization that simplifies reliability considerations and provides energy efficiency with hardware recovery support removed, and (3) software support for compilers and programmers to utilize the Relax ISA. Applying Relax to counter the effects of process variation, our results show a 20\% energy efficiency improvement for PARSEC applications with only minimal source code changes and simpler hardware.},
  bib_dl_pdf = {http://www.cs.wisc.edu/vertical/papers/2010/isca10-relax.pdf},
  bib_dl_ppt = {http://www.cs.wisc.edu/vertical/talks/2010/isca10-relax.pptx},
  bib_pubtype = {Refereed Conference},
  bib_rescat = {proj-relax}
}
Generated by bib.pl (written by Patrick Riley) on Sun Sep 26, 2021 16:14:28