PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation