Low-latency real-time inference for multilayer perceptrons on FPGAs
Publication Type
Conference Paper
Date Issued
2023
Language
English
Start Page
123
End Page
133
Citation
International Workshop on Boolean Problems (2023)
Publisher
Springer International Publishing
ISBN
978-3-031-28916-3
978-3-031-28915-6
978-3-031-28917-0
978-3-031-28918-7
Application domains such as process control, particle accelerator control systems, autonomous driving, and monitoring of critical infrastructures are latency critical. However, most studies and commercial processors focus on the throughput of the machine learning algorithms deployed in these domains. Given the wide use of multilayer perceptron neural networks in fast inference tasks and their competitive accuracy, we propose an efficient, latency-optimized architecture for multilayer perceptrons, implemented on field-programmable gate arrays (FPGAs). The proposed architecture exploits the inherent parallel computation model of the multilayer perceptron together with our proposed implementation of segmented activation functions. We analyze the latency, accuracy, and power consumption of the proposed architecture in comparison with state-of-the-art implementations. Experimental results show that, for a 7-9-9-9-5 network topology, the proposed architecture achieves a latency of 86.58 ns and a power consumption of 3.731 W at an accuracy of 98.21%. For the same topology, it outperforms the state of the art in latency by a factor of 2.1x against a customized implementation and 332.81x against a commercial IP.
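To illustrate the idea behind segmented activation functions, the sketch below approximates the sigmoid with a small number of linear pieces, which is the kind of approximation that maps well onto FPGA lookup-and-interpolate hardware. This is an illustrative model only: the segment count, breakpoints, and interval are assumptions for demonstration, not the parameters or implementation used in the paper.

```python
import numpy as np

def segmented_sigmoid(x, n_segments=8, x_min=-4.0, x_max=4.0):
    """Piecewise-linear ("segmented") sigmoid approximation.

    Uses n_segments linear pieces on [x_min, x_max]; inputs outside
    the interval saturate to the endpoint values. All parameters here
    are illustrative choices, not the paper's design.
    """
    # Precompute breakpoints and the exact sigmoid value at each one.
    xs = np.linspace(x_min, x_max, n_segments + 1)
    ys = 1.0 / (1.0 + np.exp(-xs))
    # np.interp performs linear interpolation between breakpoints and
    # clamps to the endpoint values outside the interval (saturation).
    return np.interp(x, xs, ys)

# Usage: measure the worst-case error against the exact sigmoid
# on a grid inside the approximated interval.
grid = np.linspace(-4.0, 4.0, 101)
exact = 1.0 / (1.0 + np.exp(-grid))
max_err = np.max(np.abs(segmented_sigmoid(grid) - exact))
print(f"max in-range error with 8 segments: {max_err:.4f}")
```

On hardware, the breakpoint values would typically sit in a small lookup table and the interpolation would reduce to one fixed-point multiply-add per evaluation, which is what keeps the per-neuron activation latency low.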
Subjects
MLE@TUHH
DDC Class
004: Computer Sciences
510: Mathematics