Enhancing DNN Computational Efficiency via Decomposition and Approximation
The increasing computational demands of emerging deep neural networks (DNNs) are fueled by their extensive computation intensity across various tasks, placing a significant strain on resources. This paper introduces DART, an adaptive microarchitecture that enhances area, power, and energy efficiency...
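To make the decomposition-and-approximation idea described in the abstract concrete, below is a minimal illustrative sketch, not the DART microarchitecture itself: it splits an 8-bit multiply into 4-bit "nibble" partial products and skips any partial product with a zero nibble, a simple stand-in for exploiting bit-level sparsity. The function names, the 4-bit granularity, and the skip rule are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: decomposing an 8-bit integer multiply into 4-bit
# "nibble" partial products, and skipping partial products whose nibble is zero
# (a simple stand-in for exploiting bit-level sparsity). This is NOT the DART
# design; the granularity and helper names are assumptions for illustration.

def nibbles(x: int) -> tuple[int, int]:
    """Split an unsigned 8-bit value into (low, high) 4-bit nibbles."""
    return x & 0xF, (x >> 4) & 0xF

def decomposed_mac(acc: int, a: int, b: int) -> int:
    """Accumulate a*b using four 4x4-bit partial products.

    Partial products with a zero nibble are skipped entirely, mimicking how a
    decomposed datapath could leave those elementary resources idle (or lend
    them to another thread under SMT).
    """
    a_lo, a_hi = nibbles(a)
    b_lo, b_hi = nibbles(b)
    # (weight shift, nibble operands) for each elementary 4x4-bit product
    partials = [(0, a_lo, b_lo), (4, a_lo, b_hi), (4, a_hi, b_lo), (8, a_hi, b_hi)]
    for shift, x, y in partials:
        if x == 0 or y == 0:      # bit-level sparsity: nothing to compute
            continue
        acc += (x * y) << shift   # elementary 4x4-bit multiply, shift, and add
    return acc

# Sanity check: the decomposed result matches a conventional 8-bit MAC.
assert decomposed_mac(0, 0x35, 0x12) == 0x35 * 0x12
```

In a hardware datapath, each skipped partial product would correspond to an elementary resource left idle, which an SMT-style allocation scheme could reassign to another thread; here the skip is only a software analogy of that idea.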
Saved in:
| Main Authors: | Ori Schweitzer, Uri Weiser, Freddy Gabbay |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | Approximate computing; computer architectures; deep neural networks; machine learning accelerators |
| Online Access: | https://ieeexplore.ieee.org/document/10813351/ |
| Field | Value |
|---|---|
| _version_ | 1846099961757827072 |
| author | Ori Schweitzer; Uri Weiser; Freddy Gabbay |
| collection | DOAJ |
| description | The increasing computational demands of emerging deep neural networks (DNNs) are fueled by their extensive computation intensity across various tasks, placing a significant strain on resources. This paper introduces DART, an adaptive microarchitecture that enhances area, power, and energy efficiency of DNN accelerators through approximated computations and decomposition, while preserving accuracy. DART improves DNN efficiency by leveraging adaptive resource allocation and simultaneous multi-threading (SMT). It exploits two prominent attributes of DNNs: resiliency and sparsity, of both magnitude and bit-level. Our microarchitecture decomposes the Multiply-and-Accumulate (MAC) into fine-grained elementary computational resources. Additionally, DART employs an approximate representation that leverages dynamic and flexible allocation of decomposed computational resources through SMT (Simultaneous Multi-Threading), thereby enhancing resource utilization and optimizing power consumption. We further improve efficiency by introducing a new Temporal SMT (tSMT) technique, which suggests processing computations from temporally adjacent threads by expanding the computational time window for resource allocation. Our simulation analysis, using a systolic array accelerator as a case study, indicates that DART can achieve more than 30% reduction in area and power, with an accuracy degradation of less than 1% in state-of-the-art DNNs in vision and natural language processing (NLP) tasks, compared to conventional processing elements (PEs) using 8-bit integer MAC units. |
| format | Article |
| id | doaj-art-84aeace6586a4ea48df8e35a62d6bb1c |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-84aeace6586a4ea48df8e35a62d6bb1c (indexed 2024-12-31T00:00:31Z); IEEE Access, vol. 12, pp. 197347-197362, 2024-01-01; DOI: 10.1109/ACCESS.2024.3521980; IEEE document 10813351; Ori Schweitzer (https://orcid.org/0009-0007-8481-7710) and Uri Weiser (https://orcid.org/0009-0005-3800-8272), Faculty of Electrical and Computer Engineering, Technion—Israel Institute of Technology, Haifa, Israel; Freddy Gabbay (https://orcid.org/0000-0002-6549-7957), Faculty of Sciences, Institute of Applied Physics, The Hebrew University of Jerusalem, Jerusalem, Israel |
| title | Enhancing DNN Computational Efficiency via Decomposition and Approximation |
| topic | Approximate computing; computer architectures; deep neural networks; machine learning accelerators |
| url | https://ieeexplore.ieee.org/document/10813351/ |