Title: Streamlined NTRU Prime on FPGA
Authors: Peng, Bo-Yuan; Marotzke, Adrian; Tsai, Ming-Han; Yang, Bo-Yin; Chen, Ho-Lin
Type: Journal Article
Published: 2023-06
Record date: 2022-12-12
Citation: Journal of Cryptographic Engineering 13 (2): 167-186 (2023-06)
Journal: Journal of Cryptographic Engineering
ISSN: 2190-8516
Year: 2023
Issue: 2
Pages: 167-186
Publisher: Springer
Language: en
License: https://creativecommons.org/licenses/by/4.0/
Handle: http://hdl.handle.net/11420/14343
DOI (repository): 10.15480/882.4862
DOI (publisher): 10.1007/s13389-022-00303-z
Keywords: FPGA; Hardware Implementation; Lattice Cryptography; NTRU Prime; Post-Quantum Cryptography
Subject: Technology

Abstract: We present a novel full hardware implementation of Streamlined NTRU Prime, with two variants: a high-speed, high-area implementation and a slower, low-area implementation. We introduce several new techniques that improve performance, including a batch inversion for key generation, a high-speed schoolbook polynomial multiplier, an NTT polynomial multiplier combined with a CRT map, a new DSP-free modular reduction method, a high-speed radix sorting module, and new encoders and decoders. With the high-speed design, we achieve the fastest speeds reported to date for Streamlined NTRU Prime: 5,007, 10,989, and 64,026 cycles for encapsulation, decapsulation, and key generation, respectively, while running at 285 MHz on a Xilinx Zynq UltraScale+. The entire design uses 40,060 LUTs, 26,384 flip-flops, 36.5 BRAMs, and 31 DSPs.