Enhancing DNN Computational Efficiency via Decomposition and Approximation

The increasing computational demands of emerging deep neural networks (DNNs) are fueled by their extensive computation intensity across various tasks, placing a significant strain on resources. This paper introduces DART, an adaptive microarchitecture that enhances area, power, and energy efficiency of DNN accelerators through approximated computations and decomposition, while preserving accuracy. DART improves DNN efficiency by leveraging adaptive resource allocation and simultaneous multi-threading (SMT). It exploits two prominent attributes of DNNs: resiliency and sparsity, of both magnitude and bit-level. Our microarchitecture decomposes the Multiply-and-Accumulate (MAC) into fine-grained elementary computational resources. Additionally, DART employs an approximate representation that leverages dynamic and flexible allocation of decomposed computational resources through SMT (Simultaneous Multi-Threading), thereby enhancing resource utilization and optimizing power consumption. We further improve efficiency by introducing a new Temporal SMT (tSMT) technique, which suggests processing computations from temporally adjacent threads by expanding the computational time window for resource allocation. Our simulation analysis, using a systolic array accelerator as a case study, indicates that DART can achieve more than 30% reduction in area and power, with an accuracy degradation of less than 1% in state-of-the-art DNNs in vision and natural language processing (NLP) tasks, compared to conventional processing elements (PEs) using 8-bit integer MAC units.
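The abstract describes decomposing the 8-bit integer MAC into fine-grained elementary computational resources and exploiting bit-level sparsity. The Python sketch below is a rough illustration of that general idea only, not the paper's actual DART datapath: it splits an 8-bit by 8-bit multiply into four 4-bit by 4-bit partial products and skips any partial product whose operand nibble is zero. The 4-bit granularity, the zero-skipping policy, and all function names are assumptions made for this example.

# Illustrative sketch only: NOT the DART microarchitecture from the paper.
# It demonstrates, in software, decomposing an 8-bit integer multiplication
# into 4-bit x 4-bit elementary products and skipping products whose operand
# nibble is zero (a simple form of bit-level sparsity).
# The 4-bit granularity and all names here are assumptions for illustration.

def split_nibbles(x: int) -> tuple[int, int]:
    """Return the (low, high) 4-bit nibbles of an unsigned 8-bit value."""
    return x & 0xF, (x >> 4) & 0xF

def decomposed_mac(acc: int, a: int, b: int) -> int:
    """Accumulate a * b into acc using four 4x4-bit elementary products."""
    a_lo, a_hi = split_nibbles(a)
    b_lo, b_hi = split_nibbles(b)
    # Each tuple is (nibble of a, nibble of b, left shift of that partial product).
    partials = [(a_lo, b_lo, 0), (a_lo, b_hi, 4), (a_hi, b_lo, 4), (a_hi, b_hi, 8)]
    for na, nb, shift in partials:
        if na == 0 or nb == 0:
            # A zero nibble contributes nothing, so the elementary multiplier it
            # would have occupied can stay idle or be reassigned.
            continue
        acc += (na * nb) << shift
    return acc

# Sanity checks against a plain 8-bit integer MAC.
assert decomposed_mac(0, 0x5A, 0x03) == 0x5A * 0x03
assert decomposed_mac(7, 0xF0, 0x0F) == 7 + 0xF0 * 0x0F

Per the abstract, DART shares such freed-up elementary resources across threads via SMT and tSMT; that allocation logic is not modeled in this sketch.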

Bibliographic Details
Main Authors: Ori Schweitzer, Uri Weiser, Freddy Gabbay
Format: Article
Language: English
Published: IEEE, 2024-01-01
Series: IEEE Access
Subjects: Approximate computing; computer architectures; deep neural networks; machine learning accelerators
Online Access:https://ieeexplore.ieee.org/document/10813351/
author Ori Schweitzer
Uri Weiser
Freddy Gabbay
collection DOAJ
description The increasing computational demands of emerging deep neural networks (DNNs) are fueled by their extensive computation intensity across various tasks, placing a significant strain on resources. This paper introduces DART, an adaptive microarchitecture that enhances area, power, and energy efficiency of DNN accelerators through approximated computations and decomposition, while preserving accuracy. DART improves DNN efficiency by leveraging adaptive resource allocation and simultaneous multi-threading (SMT). It exploits two prominent attributes of DNNs: resiliency and sparsity, of both magnitude and bit-level. Our microarchitecture decomposes the Multiply-and-Accumulate (MAC) into fine-grained elementary computational resources. Additionally, DART employs an approximate representation that leverages dynamic and flexible allocation of decomposed computational resources through SMT (Simultaneous Multi-Threading), thereby enhancing resource utilization and optimizing power consumption. We further improve efficiency by introducing a new Temporal SMT (tSMT) technique, which suggests processing computations from temporally adjacent threads by expanding the computational time window for resource allocation. Our simulation analysis, using a systolic array accelerator as a case study, indicates that DART can achieve more than 30% reduction in area and power, with an accuracy degradation of less than 1% in state-of-the-art DNNs in vision and natural language processing (NLP) tasks, compared to conventional processing elements (PEs) using 8-bit integer MAC units.
format Article
id doaj-art-84aeace6586a4ea48df8e35a62d6bb1c
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-84aeace6586a4ea48df8e35a62d6bb1c (indexed 2024-12-31T00:00:31Z)
Ori Schweitzer (https://orcid.org/0009-0007-8481-7710), Uri Weiser (https://orcid.org/0009-0005-3800-8272), Freddy Gabbay (https://orcid.org/0000-0002-6549-7957), "Enhancing DNN Computational Efficiency via Decomposition and Approximation," IEEE Access, vol. 12, pp. 197347-197362, 2024-01-01. ISSN: 2169-3536. DOI: 10.1109/ACCESS.2024.3521980. IEEE Xplore article no. 10813351. Language: English. Publisher: IEEE.
Affiliations: Ori Schweitzer and Uri Weiser - Faculty of Electrical and Computer Engineering, Technion - Israel Institute of Technology, Haifa, Israel; Freddy Gabbay - Faculty of Sciences, Institute of Applied Physics, The Hebrew University of Jerusalem, Jerusalem, Israel.
title Enhancing DNN Computational Efficiency via Decomposition and Approximation
topic Approximate computing
computer architectures
deep neural networks
machine learning accelerators
url https://ieeexplore.ieee.org/document/10813351/