Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation