Minimax Optimal Variance-Aware Regret Bounds for Multinomial Logistic MDPs