Convolutional Swin Encoder