TracrBench: Generating Interpretability Testbeds with Large Language Models