Aligning to the teacher: multilevel feature-aligned knowledge distillation

Knowledge distillation is a technique for transferring knowledge from a large (teacher) model to a small (student) model. Usually, the teacher model's features contain richer information, while the student model's features carry less. This leads to a poor distillation effec...
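For orientation, below is a minimal PyTorch sketch of feature-aligned distillation. It is an illustrative assumption, not the method proposed in this article: a hypothetical 1x1-convolution aligner projects the student's intermediate feature map to the teacher's channel width, and an MSE term on the aligned features is combined with the usual temperature-scaled soft-label term. The layer choices, loss weighting, and temperature are placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAligner(nn.Module):
    """Projects a student feature map to the teacher's channel width (hypothetical aligner)."""
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        self.proj = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat: torch.Tensor) -> torch.Tensor:
        return self.proj(student_feat)

def distillation_loss(student_logits, teacher_logits,
                      student_feat, teacher_feat, aligner,
                      temperature: float = 4.0, alpha: float = 0.5):
    # Soft-label term: KL divergence between temperature-scaled distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    # Feature-alignment term: match the projected student feature to the teacher's
    # (spatial sizes are assumed to already agree in this sketch).
    feat = F.mse_loss(aligner(student_feat), teacher_feat)
    return alpha * kd + (1.0 - alpha) * feat

# Toy usage with random tensors standing in for real model outputs.
if __name__ == "__main__":
    aligner = FeatureAligner(student_channels=64, teacher_channels=256)
    s_logits, t_logits = torch.randn(8, 10), torch.randn(8, 10)
    s_feat, t_feat = torch.randn(8, 64, 7, 7), torch.randn(8, 256, 7, 7)
    print(distillation_loss(s_logits, t_logits, s_feat, t_feat, aligner).item())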

Bibliographic Details
Main Authors: Yang Zhang, Pan He, Chuanyun Xu, Jingyan Pang, Xiao Wang, Xinghai Yuan, Pengfei Lv, Gang Li
Format: Article
Language: English
Published: PeerJ Inc. 2025-08-01
Series: PeerJ Computer Science
Online Access: https://peerj.com/articles/cs-3075.pdf