DequeReplayRollerEpsGreedy samples transitions from the replay buffer one by one, uniformly
sample_batch_uniform from the buffer "backend"
sample you can see that it samples full trajectories
env_roller would be able to work with both the
PrioritizedReplayBackend. I was thinking of how the Rainbow DQN could be added as well.
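To make the idea of one roller working with either backend concrete, here is a minimal sketch. All class and function names in it (UniformBackend, PrioritizedBackend, sample_for_training) are hypothetical and are not the library's actual API. It only assumes that both backends expose the same sample_batch method, and it uses plain proportional prioritization without importance weights.

```python
import random
from collections import deque


class UniformBackend:
    """Hypothetical uniform replay backend (illustration only)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, transition, priority=1.0):
        # The priority argument is accepted but ignored, so both
        # backends can be fed through the same call.
        self.buffer.append(transition)

    def sample_batch(self, batch_size, rng=random):
        # Every stored transition is equally likely to be drawn.
        return [self.buffer[rng.randrange(len(self.buffer))]
                for _ in range(batch_size)]


class PrioritizedBackend:
    """Hypothetical proportional-priority backend (illustration only)."""

    def __init__(self, capacity, alpha=0.6):
        self.buffer = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.alpha = alpha

    def store(self, transition, priority=1.0):
        # Both deques evict their oldest entry together once full.
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample_batch(self, batch_size, rng=random):
        # Draw index i with probability proportional to priority_i ** alpha.
        weights = [p ** self.alpha for p in self.priorities]
        indexes = rng.choices(range(len(self.buffer)),
                              weights=weights, k=batch_size)
        return [self.buffer[i] for i in indexes]


def sample_for_training(backend, batch_size):
    """Roller-side code depends only on the shared sample_batch method."""
    return backend.sample_batch(batch_size)
```

With this shape, swapping uniform for prioritized sampling is a matter of passing a different backend object, and the roller code itself does not change.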