Renz, Manuel; Lal, Sohan
2023-11-22
2023
41st IEEE International Conference on Computer Design (ICCD 2023)
https://hdl.handle.net/11420/44170
Memory compression is increasingly used to synthetically increase the off-chip memory bandwidth of GPUs by transferring data in a compressed format between on-chip and off-chip memory. The increased memory bandwidth results in a speedup for bandwidth-limited applications. State-of-the-art memory compression techniques often target a high compression ratio; however, a high compression ratio alone is not sufficient for full integration into throughput-oriented GPUs. To deploy a memory compression technique, its throughput has to keep up with the off-chip memory bandwidth of GPUs. Unfortunately, the throughput of state-of-the-art memory compression techniques is rarely discussed in detail; the emphasis is mostly placed on the achieved compression ratio. In this work, we present a throughput analysis of several state-of-the-art memory compression techniques and study their ability to match the memory bandwidth of modern GPUs. We implement several memory compression techniques in hardware and synthesize the designs using Synopsys Design Compiler with 14 nm ASIC libraries to analyze throughput, area, and power consumption. Our analysis shows that simple compression techniques with moderate compression ratios but higher throughput are more suitable for practical implementation in GPUs, considering the area (up to 11.8× lower), power consumption (up to 7× lower), and the complexity of implementation.
en
http://rightsstatements.org/vocab/InC/1.0/
GPUs; memory compression; throughput; bandwidth; off-chip memory
Computer Sciences
Beyond compression ratio: a throughput analysis of memory compression techniques for GPUs
Conference Paper
10.15480/882.8824
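To make the throughput constraint stated in the abstract concrete, the following minimal Python sketch estimates how many bytes per cycle a compressor pipeline would have to sustain to keep pace with a given off-chip bandwidth. The bandwidth and clock-frequency values are illustrative assumptions, not figures taken from this paper.

    # Illustrative back-of-the-envelope check (assumed numbers, not results from the paper):
    # how much data a memory compressor must process per cycle to keep pace with a GPU's
    # off-chip memory bandwidth.

    def required_bytes_per_cycle(mem_bandwidth_gb_s: float, compressor_clock_ghz: float) -> float:
        """Bytes the (de)compressor must handle per cycle to match memory bandwidth."""
        # GB/s divided by Gcycles/s gives bytes per cycle.
        return mem_bandwidth_gb_s / compressor_clock_ghz

    if __name__ == "__main__":
        # Assumed example: ~900 GB/s off-chip bandwidth (HBM2-class) and a 1 GHz compressor pipeline.
        bw, clk = 900.0, 1.0
        print(f"{required_bytes_per_cycle(bw, clk):.0f} bytes/cycle at {clk} GHz for {bw} GB/s")
        # -> 900 bytes/cycle, i.e. roughly seven 128-byte cache lines every cycle,
        #    which is why compressor throughput, not just compression ratio, matters.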