Title: QSLC: Quantization-Based, Low-Error Selective Approximation for GPUs
Authors: Sohan Lal, Jan Lucas, Ben H. H. Juurlink
Venue: Design, Automation and Test in Europe Conference and Exhibition (DATE 2021)
Published: 2021-02
Available: 2022-04-07
Type: Conference Paper
DOI: 10.23919/DATE51398.2021.9474124
Handle: http://hdl.handle.net/11420/12245
Language: English

Abstract: GPUs use a large memory access granularity (MAG) that often results in a low effective compression ratio for memory compression techniques. The low effective compression ratio is caused by a significant fraction of compressed blocks that have a few bytes above a multiple of MAG. While MAG-aware selective approximation, based on a tree structure, has been used to increase the effective compression ratio and the performance gain, approximation results in a high error that is reduced by using complex optimizations. We propose a simple quantization-based approximation technique (QSLC) that can also selectively approximate a few bytes above MAG. While the quantization-based approximation technique has a performance similar to the state-of-the-art tree-based selective approximation, its average error is 5× lower. We further analyze the trade-offs between the two techniques and show that the area and power overheads of the quantization-based technique are 12.1× and 7.6× lower than the state-of-the-art, respectively. Our sensitivity analysis across different block sizes further shows the opportunities and the significance of MAG-aware selective approximation.
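The core observation in the abstract can be illustrated with a short sketch. This is not the paper's implementation; the MAG value, the spill test, and the bit-dropping quantizer below are hypothetical stand-ins that only show the idea: a compressed block that spills a few bytes past a multiple of MAG is a candidate for selective approximation, and quantization (here, simply dropping low-order bits) is one cheap way to approximate those bytes away.

```python
MAG = 32  # hypothetical memory access granularity, in bytes

def bytes_above_mag(compressed_size: int, mag: int = MAG) -> int:
    """Bytes by which a compressed block spills past a multiple of MAG.

    A block whose size is an exact multiple of MAG spills 0 bytes and
    needs no approximation.
    """
    return compressed_size % mag

def quantize(value: int, dropped_bits: int) -> int:
    """Toy quantizer: zero out the low-order bits of a value.

    Coarser values compress better, so a block that spills only a few
    bytes can be nudged back under a MAG boundary.
    """
    return (value >> dropped_bits) << dropped_bits

# A 35-byte compressed block spills 3 bytes past the 32-byte MAG,
# so it is a candidate for selective approximation.
spill = bytes_above_mag(35)        # 3
approx = quantize(0b10110111, 3)   # 0b10110000 (low 3 bits dropped)
```

The contrast drawn in the abstract is that a tree-based scheme decides *which* values to approximate via a tree structure, whereas a quantizer like the one sketched here is a much simpler circuit, which is consistent with the reported 12.1× area and 7.6× power savings.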