Spatial-temporal initialization dilemma: towards realistic visual tracking