Text this: Semantically-Enhanced Feature Extraction with CLIP and Transformer Networks for Driver Fatigue Detection