Deep Fusion of Skeleton Spatial–Temporal and Dynamic Information for Action Recognition