TUHH Open Research
Latency-optimized hardware acceleration of multilayer perceptron inference

Publication Type
Conference Paper
Date Issued
2023
Language
English
Author(s)
Zoubi, Ahmad al-  
Eingebettete Systeme E-13  
Schaible, Benedikt  
Martino, Gianluca  orcid-logo
Eingebettete Systeme E-13  
Fey, Görschwin  orcid-logo
Eingebettete Systeme E-13  
TORE-URI
https://hdl.handle.net/11420/47298
Start Page
235
End Page
241
Citation
26th Euromicro Conference on Digital System Design: 235-241 (2023)
Contribution to Conference
26th Euromicro Conference on Digital System Design, DSD 2023  
Publisher DOI
10.1109/DSD60849.2023.00042
Scopus ID
2-s2.0-85189170213
Publisher
IEEE
ISBN
979-8-3503-4419-6
Abstract
Decreasing the inference latency of neural networks is crucial in situations where real-time responses are necessary. We propose a new neuron architecture for parallel computations, targeting the MLP implementation on an FPGA. The parallelism in the proposed architecture is exposed through the segmentation of non-linear activation functions into a set of linear segments, delivering highly accurate approximations of the original function. The implementation combines various other optimization techniques, such as fixed-point arithmetic, pipelining, array partitioning, and loop unrolling. To validate the proposed architecture using the Xilinx Vitis HLS toolchain, four MLPs with a mix of non-linear activation functions were implemented and evaluated against accelerated models produced by the open-source tool hls4ml, a Python package for latency-optimized machine learning inference on FPGAs. Experimental results show that our proposed architecture outperforms the corresponding hls4ml models, with speedups of up to 3×.
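The segmentation idea in the abstract, approximating a non-linear activation function with a set of linear segments so each evaluation reduces to one multiply and one add, can be sketched as follows. This is an illustrative software model, not the paper's implementation: the segment count (32), the input range [-4, 4], and the clamping behavior are all assumptions, and on an FPGA the slope/intercept table would become fixed-point ROM entries rather than Python floats.

```python
import math

# Hypothetical sketch: piecewise-linear approximation of tanh over [-4, 4].
# Segment count and range are illustrative assumptions, not the paper's values.
N_SEGMENTS = 32
X_MIN, X_MAX = -4.0, 4.0
STEP = (X_MAX - X_MIN) / N_SEGMENTS

# Precompute one (slope, intercept) pair per segment by connecting the
# function values at the segment endpoints (chord interpolation).
# In hardware these pairs would be stored in a small lookup table.
segments = []
for i in range(N_SEGMENTS):
    x0 = X_MIN + i * STEP
    x1 = x0 + STEP
    slope = (math.tanh(x1) - math.tanh(x0)) / STEP
    intercept = math.tanh(x0) - slope * x0
    segments.append((slope, intercept))

def tanh_pwl(x: float) -> float:
    """Evaluate the piecewise-linear tanh: clamp, index the segment,
    then compute y = slope * x + intercept."""
    if x <= X_MIN:
        return math.tanh(X_MIN)
    if x >= X_MAX:
        return math.tanh(X_MAX)
    i = min(int((x - X_MIN) / STEP), N_SEGMENTS - 1)
    slope, intercept = segments[i]
    return slope * x + intercept

# Worst-case error over a dense sweep of [-5, 5] stays small for 32 segments.
max_err = max(abs(tanh_pwl(x / 100.0) - math.tanh(x / 100.0))
              for x in range(-500, 501))
```

Because each call is just a table lookup followed by a fused multiply-add, all neurons in a layer can evaluate their activations in parallel with a fixed, short latency, which is the property the proposed architecture exploits.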
Subjects
FPGA
MLP
Non-linear activation function
Parallel
Segmentation
MLE@TUHH
DDC Class
004: Computer Sciences
620: Engineering