Title: GPGPU workload characteristics and performance analysis
Authors: Lal, Sohan; Lucas, Jan; Andersch, Michael; Alvarez-Mesa, Mauricio; Elhossini, Ahmed; Juurlink, Ben H. H.
Venue: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS 2014)
Date: 2014-07
Type: Conference Paper
DOI: 10.1109/SAMOS.2014.6893202
Handle: http://hdl.handle.net/11420/12321
Record date: 2022-04-20

Abstract: GPUs are much more power-efficient devices compared to CPUs, but due to several performance bottlenecks, the performance per watt of GPUs is often much lower than what could be achieved theoretically. To sustain and continue high performance computing growth, new architectural and application techniques are required to create power-efficient computing systems. To find such techniques, however, it is necessary to study the power consumption at a detailed level and understand the bottlenecks which cause low performance. Therefore, in this paper, we study GPU power consumption at component level and investigate the bottlenecks that cause low performance and low energy efficiency. We divide the low performance kernels into low occupancy and full occupancy categories. For the low occupancy category, we study if increasing the occupancy helps in increasing performance and energy efficiency. For the full occupancy category, we investigate if these kernels are limited by memory bandwidth, coalescing efficiency, or SIMD utilization.