ELO-Mask: Effective and Layerwise Optimization of Mask for Sparse LLMs

Large language models consume substantial computational resources during inference because of their vast number of parameters, and model sparsification is an effective way to reduce that cost. However, current sparsification methods for large models are costly. We propose a comp...

Bibliographic Details
Main Authors: Bingjie Xiang, Jiarui Wu, Xiaoying Han, Qian Gu, Fei Chao, Xiao Yang, Fan Wu, Xin Fu
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10753603/