Text this: A Spatio-Temporal Deep Learning Approach for Efficient Deepfake Video Detection