Learning‐based tracking control of AUV: Mixed policy improvement and game‐based disturbance rejection