JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition

Visual Speech Recognition (VSR) tasks are generally recognized to have a lower theoretical performance ceiling than Automatic Speech Recognition (ASR), owing to the inherent limitations of conveying semantic information visually. To mitigate this challenge, this paper introduces an advanced knowledg...

Bibliographic Details
Main Authors: Chang Sun, Bo Qin, Hong Yang
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Open Journal of Signal Processing
Online Access: https://ieeexplore.ieee.org/document/10750407/