Text this: Nonlinear multi-head cross-attention network and programmable gradient information for gaze estimation