End-to-end scene text detection and recognition algorithm based on Transformer decoders