Bayesian Q learning method with Dyna architecture and prioritized sweeping