FPGA Acceleration With Hessian-Based Comprehensive Intra-Layer Mixed-Precision Quantization for Transformer Models

Recent advances in using FPGAs as co-processors for language-model acceleration, valued for their energy efficiency and flexibility, are constrained by limited on-chip memory capacity, which hinders the deployment of transformer-based language models. To address this challenge, we propose...
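For orientation, the sketch below illustrates the general idea behind Hessian-based mixed-precision quantization: estimate each parameter group's Hessian trace with Hutchinson's method and grant higher bit-widths to more sensitive groups. This is a minimal HAWQ-style illustration, not the authors' intra-layer scheme; the function names (hessian_trace, assign_bits), the sample count, and the bit budgets are all assumptions for demonstration.

    # Hypothetical sketch of Hessian-trace-guided bit allocation (not the paper's method).
    import torch

    def hessian_trace(loss, param, n_samples=8):
        """Estimate trace(H) for `param` via Hutchinson's method.
        `loss` must be built with a graph (create_graph=True upstream)."""
        grad, = torch.autograd.grad(loss, param, create_graph=True)
        trace = 0.0
        for _ in range(n_samples):
            # Rademacher probe vector v with entries in {-1, +1}
            v = torch.randint_like(param, high=2, dtype=param.dtype) * 2 - 1
            # Hessian-vector product H v via a second backward pass
            hv, = torch.autograd.grad(grad, param, grad_outputs=v, retain_graph=True)
            trace += (v * hv).sum().item()
        return trace / n_samples

    def assign_bits(traces, budgets=(8, 4, 2)):
        """Assign higher precision to parameter groups with larger Hessian trace."""
        order = sorted(range(len(traces)), key=lambda i: -traces[i])
        bits = [0] * len(traces)
        per_tier = max(1, len(traces) // len(budgets))
        for rank, idx in enumerate(order):
            bits[idx] = budgets[min(rank // per_tier, len(budgets) - 1)]
        return bits

In this style of approach, groups whose loss curvature is high (large Hessian trace) keep more bits, while flat directions tolerate aggressive quantization, which is what frees FPGA memory without a steep accuracy cost.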


Bibliographic Details
Main Authors: Woohong Byun, Jongseok Woo, Saibal Mukhopadhyay
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10973048/