An end-to-end homography estimation method for large-baseline scenes with an attention mechanism