Provably Efficient Exploration in Reward Machines with Low Regret