Reducing the performance overhead of resilient CMPs with substitutable resources

Malek, A.; Tzilis, S.; Khan, D.A.; Sourdis, Ioannis; Smaragdos, Georgios; Strydis, Christos

doi:10.1109/DFT.2015.7315161

A. Malek, S. Tzilis, D.A. Khan, I. Sourdis (Ioannis), G. Smaragdos (Georgios) and C. Strydis (Christos)

2015-11-02

Presented at the 28th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015 (October 2015), Amherst

Permanent faults on a chip are often tolerated using spare resources. In the past, sparing has been applied to Chip Multiprocessors (CMPs) at various granularities of substitutable units (SUs). Entire processors, pipeline stages, or even individual functional units are isolated when faulty and replaced by spare ones using flexible, reconfigurable interconnects. Although spare resources increase a system's fault tolerance, the extra delay imposed by the reconfigurable interconnects limits performance. In this paper, we study two options for dealing with this delay: (i) pipelining the reconfigurable interconnects and (ii) scaling down the operating frequency. The former keeps the frequency close to that of the baseline processor but increases the number of cycles required to execute a program. The latter keeps the number of execution cycles constant but requires a slower clock. We investigate this performance tradeoff using an adaptive 4-core CMP design with substitutable pipeline stages. We obtain post-place-and-route results for different designs running two sets of benchmarks and evaluate their performance. Our experiments indicate that adding reconfigurable interconnects for wiring the SUs of a 4-core CMP imposes significant delay, increasing the critical path of the design by almost 3.5 times. On the other hand, pipelining the reconfigurable interconnects increases cycle time by 41% and, depending on the processor configuration, reduces the performance overhead to 1.4-2.9× the execution time of the baseline.
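The tradeoff described in the abstract can be illustrated with the standard relation execution time = cycle count × cycle time. The sketch below is not from the paper: the function and constant names are hypothetical, and it assumes that the frequency-scaling option stretches the clock by the full critical-path increase (about 3.5×) while leaving the cycle count unchanged. The cycle-count ratios for the pipelined option are back-computed from the rounded figures in the abstract, so they are only approximate.

```python
# Illustrative arithmetic only; names are hypothetical, figures are the
# rounded values quoted in the abstract.

def relative_execution_time(cycle_ratio: float, cycle_time_ratio: float) -> float:
    """Execution time relative to the baseline: cycles ratio x cycle-time ratio."""
    return cycle_ratio * cycle_time_ratio


def implied_cycle_ratio(exec_time_ratio: float, cycle_time_ratio: float) -> float:
    """Cycle-count ratio implied by a given slowdown and cycle-time ratio."""
    return exec_time_ratio / cycle_time_ratio


# Option (ii), frequency scaling: same cycle count, clock slowed by the
# critical-path increase (assumed here to be the full ~3.5x).
FREQ_SCALING_CYCLE_TIME = 3.5
freq_scaling_slowdown = relative_execution_time(1.0, FREQ_SCALING_CYCLE_TIME)

# Option (i), pipelined interconnects: cycle time grows by 41%, and the
# reported slowdown ranges from 1.4x to 2.9x depending on configuration.
PIPELINED_CYCLE_TIME = 1.41
pipelined_cycle_ratios = [
    implied_cycle_ratio(slowdown, PIPELINED_CYCLE_TIME) for slowdown in (1.4, 2.9)
]

print(f"frequency scaling slowdown: {freq_scaling_slowdown:.2f}x")
print("pipelining, implied cycle-count ratios:",
      ", ".join(f"{r:.2f}x" for r in pipelined_cycle_ratios))
```

Under these assumptions, pipelining comes out ahead of frequency scaling as long as the extra cycles it introduces stay below roughly 3.5 / 1.41 ≈ 2.5× the baseline cycle count.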

Additional Metadata
Persistent URL	doi.org/10.1109/DFT.2015.7315161, hdl.handle.net/1765/83907
Conference	28th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015
Organisation	Department of Neuroscience
Citation	Malek, A., Tzilis, S., Khan, D. A., Sourdis, I., Smaragdos, G., & Strydis, C. (2015). Reducing the performance overhead of resilient CMPs with substitutable resources. Presented at the 28th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015. doi:10.1109/DFT.2015.7315161
