The neuro vector engine: Flexibility to improve convolutional net efficiency for wearable vision
Publication Type
Conference Paper
Publication Date
2016-03
Language
English
Start Page
1604
End Page
1609
Article Number
7459569
Citation
Design, Automation and Test in Europe Conference and Exhibition (DATE 2016)
Contribution to Conference
Publisher DOI
Scopus ID
Abstract
Deep Convolutional Networks (ConvNets) currently deliver superior benchmark performance, but their demands on computation and data transfer prohibit straightforward mapping onto energy-constrained wearable platforms. The computational burden can be overcome by dedicated hardware accelerators, yet it is the sheer amount of data transfer and the level of hardware utilization that determine the energy efficiency of these implementations. This paper presents the Neuro Vector Engine (NVE), a SIMD accelerator for ConvNet-based visual object classification, targeting portable and wearable devices. The accelerator is highly flexible due to its VLIW ISA, at the cost of instruction-fetch overhead. We show that this overhead is insignificant when the extra flexibility enables advanced data locality optimizations and improves hardware utilization across ConvNet vision applications. By co-optimizing the accelerator architecture and the algorithm loop structure, 30 Gops is achieved within a power envelope of 54 mW and a silicon footprint of only 0.26 mm² in TSMC 40 nm technology, enabling high-end visual object recognition on portable and even wearable devices.
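The kind of loop restructuring for data locality that the abstract refers to can be illustrated with an output-tiled convolution loop nest. The sketch below is a generic, hypothetical example: the tile sizes (TILE_H, TILE_W) and layer dimensions are made up for illustration and do not describe the NVE's actual instruction scheduling or code generation; it only shows how tiling lets a small working set be reused many times before new data must be fetched from external memory.

```c
#include <stddef.h>

/* Hypothetical layer dimensions, for illustration only. */
#define OUT_C 16   /* output feature maps             */
#define IN_C   8   /* input feature maps              */
#define OUT_H 32   /* output rows                     */
#define OUT_W 32   /* output columns                  */
#define K      5   /* kernel size (K x K)             */
#define TILE_H 8   /* output tile height kept on-chip */
#define TILE_W 8   /* output tile width kept on-chip  */

/* Sliding-window convolution with output tiling: the loops inside one
 * (th, tw) tile reuse a small slice of the input and the weights many
 * times, which is the data-locality property a flexible accelerator can
 * exploit to cut external data transfer. */
void conv_layer_tiled(const float in[IN_C][OUT_H + K - 1][OUT_W + K - 1],
                      const float w[OUT_C][IN_C][K][K],
                      const float bias[OUT_C],
                      float out[OUT_C][OUT_H][OUT_W])
{
    for (int th = 0; th < OUT_H; th += TILE_H)          /* tile rows    */
        for (int tw = 0; tw < OUT_W; tw += TILE_W)      /* tile columns */
            for (int oc = 0; oc < OUT_C; oc++)
                for (int y = th; y < th + TILE_H; y++)
                    for (int x = tw; x < tw + TILE_W; x++) {
                        float acc = bias[oc];
                        for (int ic = 0; ic < IN_C; ic++)
                            for (int ky = 0; ky < K; ky++)
                                for (int kx = 0; kx < K; kx++)
                                    acc += w[oc][ic][ky][kx] *
                                           in[ic][y + ky][x + kx];
                        out[oc][y][x] = acc;  /* activation omitted */
                    }
}
```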