CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios