Loading paper
$\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution | Tomesphere