IAM: Efficient Inference through Attention Mapping between Different-scale LLMs