Text this: Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism