Loading paper
Scratchpad Patching: Decoupling Compute from Patch Size in Byte-Level Language Models | Tomesphere