Enhancing DNN Computational Efficiency via Decomposition and Approximation

The increasing computational demands of emerging deep neural networks (DNNs) are fueled by their extensive computation intensity across various tasks, placing a significant strain on resources. This paper introduces DART, an adaptive microarchitecture that enhances area, power, and energy efficiency of DNN accelerators through approximated computations and decomposition, while preserving accuracy. DART improves DNN efficiency by leveraging adaptive resource allocation and simultaneous multi-threading (SMT). It exploits two prominent attributes of DNNs: resiliency and sparsity, of both magnitude and bit-level. Our microarchitecture decomposes the Multiply-and-Accumulate (MAC) into fine-grained elementary computational resources. Additionally, DART employs an approximate representation that leverages dynamic and flexible allocation of decomposed computational resources through SMT (Simultaneous Multi-Threading), thereby enhancing resource utilization and optimizing power consumption. We further improve efficiency by introducing a new Temporal SMT (tSMT) technique, which suggests processing computations from temporally adjacent threads by expanding the computational time window for resource allocation. Our simulation analysis, using a systolic array accelerator as a case study, indicates that DART can achieve more than 30% reduction in area and power, with an accuracy degradation of less than 1% in state-of-the-art DNNs in vision and natural language processing (NLP) tasks, compared to conventional processing elements (PEs) using 8-bit integer MAC units.
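The abstract describes decomposing the 8-bit integer MAC into fine-grained elementary computational resources and exploiting bit-level sparsity. The Python sketch below is a rough illustration of that general idea only, not the paper's actual DART datapath: it splits an 8-bit by 8-bit multiply into four 4-bit by 4-bit partial products and skips any partial product whose operand nibble is zero. The 4-bit granularity, the zero-skipping policy, and all function names are assumptions made for this example.

# Illustrative sketch only: NOT the DART microarchitecture from the paper.
# It demonstrates, in software, decomposing an 8-bit integer multiplication
# into 4-bit x 4-bit elementary products and skipping products whose operand
# nibble is zero (a simple form of bit-level sparsity).
# The 4-bit granularity and all names here are assumptions for illustration.

def split_nibbles(x: int) -> tuple[int, int]:
    """Return the (low, high) 4-bit nibbles of an unsigned 8-bit value."""
    return x & 0xF, (x >> 4) & 0xF

def decomposed_mac(acc: int, a: int, b: int) -> int:
    """Accumulate a * b into acc using four 4x4-bit elementary products."""
    a_lo, a_hi = split_nibbles(a)
    b_lo, b_hi = split_nibbles(b)
    # Each tuple is (nibble of a, nibble of b, left shift of that partial product).
    partials = [(a_lo, b_lo, 0), (a_lo, b_hi, 4), (a_hi, b_lo, 4), (a_hi, b_hi, 8)]
    for na, nb, shift in partials:
        if na == 0 or nb == 0:
            # A zero nibble contributes nothing, so the elementary multiplier it
            # would have occupied can stay idle or be reassigned.
            continue
        acc += (na * nb) << shift
    return acc

# Sanity checks against a plain 8-bit integer MAC.
assert decomposed_mac(0, 0x5A, 0x03) == 0x5A * 0x03
assert decomposed_mac(7, 0xF0, 0x0F) == 7 + 0xF0 * 0x0F

Per the abstract, DART shares such freed-up elementary resources across threads via SMT and tSMT; that allocation logic is not modeled in this sketch.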

Bibliographic Details
Main Authors: Ori Schweitzer, Uri Weiser, Freddy Gabbay
Format: Article
Language: English
Published: IEEE, 2024-01-01
Series: IEEE Access
Subjects: Approximate computing; computer architectures; deep neural networks; machine learning accelerators
Online Access:https://ieeexplore.ieee.org/document/10813351/
author Ori Schweitzer
Uri Weiser
Freddy Gabbay
collection DOAJ
description The increasing computational demands of emerging deep neural networks (DNNs) are fueled by their extensive computation intensity across various tasks, placing a significant strain on resources. This paper introduces DART, an adaptive microarchitecture that enhances area, power, and energy efficiency of DNN accelerators through approximated computations and decomposition, while preserving accuracy. DART improves DNN efficiency by leveraging adaptive resource allocation and simultaneous multi-threading (SMT). It exploits two prominent attributes of DNNs: resiliency and sparsity, of both magnitude and bit-level. Our microarchitecture decomposes the Multiply-and-Accumulate (MAC) into fine-grained elementary computational resources. Additionally, DART employs an approximate representation that leverages dynamic and flexible allocation of decomposed computational resources through SMT (Simultaneous Multi-Threading), thereby enhancing resource utilization and optimizing power consumption. We further improve efficiency by introducing a new Temporal SMT (tSMT) technique, which suggests processing computations from temporally adjacent threads by expanding the computational time window for resource allocation. Our simulation analysis, using a systolic array accelerator as a case study, indicates that DART can achieve more than 30% reduction in area and power, with an accuracy degradation of less than 1% in state-of-the-art DNNs in vision and natural language processing (NLP) tasks, compared to conventional processing elements (PEs) using 8-bit integer MAC units.
format Article
id doaj-art-84aeace6586a4ea48df8e35a62d6bb1c
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-84aeace6586a4ea48df8e35a62d6bb1c (indexed 2024-12-31T00:00:31Z)
Ori Schweitzer (https://orcid.org/0009-0007-8481-7710), Uri Weiser (https://orcid.org/0009-0005-3800-8272), Freddy Gabbay (https://orcid.org/0000-0002-6549-7957), "Enhancing DNN Computational Efficiency via Decomposition and Approximation," IEEE Access, vol. 12, pp. 197347-197362, 2024-01-01. ISSN: 2169-3536. DOI: 10.1109/ACCESS.2024.3521980. IEEE Xplore article no. 10813351. Language: English. Publisher: IEEE.
Affiliations: Ori Schweitzer and Uri Weiser - Faculty of Electrical and Computer Engineering, Technion - Israel Institute of Technology, Haifa, Israel; Freddy Gabbay - Faculty of Sciences, Institute of Applied Physics, The Hebrew University of Jerusalem, Jerusalem, Israel.
title Enhancing DNN Computational Efficiency via Decomposition and Approximation
topic Approximate computing
computer architectures
deep neural networks
machine learning accelerators
url https://ieeexplore.ieee.org/document/10813351/