RoPE-LIME: RoPE-Space Locality + Sparse-K Sampling for Efficient LLM Attribution
Isaac Picov, Ritesh Goru

TL;DR
RoPE-LIME is an explanation method for closed-source LLMs that combines a locality kernel in RoPE embedding space with Sparse-K sampling to produce more accurate token attributions at lower cost.
Contribution
The paper introduces RoPE-LIME, which combines a RoPE-based similarity kernel with Sparse-K sampling to improve the efficiency and accuracy of LLM attribution without requiring model gradients.
Findings
Outperforms leave-one-out sampling in attribution quality
Reduces API calls compared to gSMILE
Effective on HotpotQA and MMLU datasets
Abstract
Explaining closed-source Large Language Model (LLM) outputs is challenging because API access prevents gradient-based attribution, while perturbation methods are costly and noisy when they depend on regenerated text. We introduce Rotary Positional Embedding Linear Local Interpretable Model-agnostic Explanations (RoPE-LIME), an open-source extension of gSMILE that decouples reasoning from explanation: given a fixed output from a closed model, a smaller open-source surrogate computes token-level attributions from probability-based objectives (negative log-likelihood and divergence targets) under input perturbations. RoPE-LIME incorporates (i) a locality kernel based on Relaxed Word Mover's Distance computed in RoPE embedding space for stable similarity under masking, and (ii) Sparse-K sampling, an efficient perturbation strategy that improves interaction…
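The general recipe the abstract describes (perturb the input, score a fixed output, weight each perturbation by its similarity to the original, fit a linear surrogate) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: the embeddings are random vectors rather than RoPE embeddings, the score is a toy additive log-likelihood rather than a surrogate model's negative log-likelihood, the kernel width is arbitrary, and `sample_sparse_masks` simply masks at most k tokens per sample, since the abstract is truncated before it defines Sparse-K sampling. All function names are hypothetical.

```python
import numpy as np

def relaxed_wmd(a, b):
    """Relaxed Word Mover's Distance between two vector sets (uniform weights)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise distances
    # Each relaxation lets every vector move to its nearest neighbour; take the tighter bound.
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())

def sample_sparse_masks(n_tokens, k, n_samples, rng):
    """Hypothetical sparse sampler: each sample masks between 1 and k tokens."""
    masks = np.ones((n_samples, n_tokens), dtype=float)
    for row in masks:
        n_drop = rng.integers(1, k + 1)
        row[rng.choice(n_tokens, size=n_drop, replace=False)] = 0.0
    return masks

def fit_weighted_surrogate(masks, targets, weights, ridge=1e-3):
    """Weighted ridge regression of targets on masks; returns per-token coefficients."""
    X = np.hstack([masks, np.ones((len(masks), 1))])  # append intercept column
    Xw = X * weights[:, None]
    A = X.T @ Xw + ridge * np.eye(X.shape[1])
    coef = np.linalg.solve(A, Xw.T @ targets)
    return coef[:-1]  # drop the intercept

rng = np.random.default_rng(0)
n_tokens, dim = 6, 4
embeddings = rng.normal(size=(n_tokens, dim))            # stand-in for RoPE-space embeddings
importance = np.array([0.1, 2.0, 0.3, 0.0, 1.0, 0.2])    # toy ground-truth token effects

masks = sample_sparse_masks(n_tokens, k=3, n_samples=200, rng=rng)
targets = masks @ importance                             # toy log-likelihood of the fixed output

# Locality kernel: perturbations closer to the full input get larger weights.
dists = np.array([relaxed_wmd(embeddings, embeddings[m > 0]) for m in masks])
kernel_width = 1.0                                       # arbitrary illustrative value
weights = np.exp(-(dists / kernel_width) ** 2)

attr = fit_weighted_surrogate(masks, targets, weights)
print(np.round(attr, 2))
```

Because the toy score is exactly linear in the mask, the recovered coefficients should track `importance` closely; with a real model the surrogate is only a local approximation, and its quality depends on the kernel and the sampling scheme.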
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
