Adjacent-Scale Multimodal Fusion Networks for Semantic Segmentation of Remote Sensing Data