Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation