An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention
Madhusudan Ghosh, Rishabh Gupta

TL;DR
This paper investigates methods to improve large language models' ability to handle longer code sequences by enhancing position encodings and attention mechanisms, addressing fixed context length limitations in code tasks.
Contribution
It provides a thorough analysis of zero-shot, inference-only techniques for context length extrapolation in long code completion using positional embeddings and efficient attention.
Findings
Enhanced position encoding methods improve long code handling.
Efficient attention mechanisms extend effective context lengths.
Analysis identifies strengths and limitations of current approaches.
Abstract
The rapid advancement of large language models (LLMs) has led to a significant increase in automated tools in software engineering that can perform code-related tasks such as code generation, completion, and translation. Despite these advances, their effectiveness is constrained by fixed context lengths, which limits their ability to generalize across long, domain-specific code sequences. To address this challenge, we investigate zero-shot, inference-only methods aimed at improving position encodings and optimizing attention mechanisms. Our goal is to provide a thorough analysis of current approaches that facilitate context length extrapolation in code, particularly for long code completion tasks.
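The abstract does not name the specific position-encoding methods it evaluates. As an illustration only, the sketch below shows one widely known inference-only adjustment, linear position interpolation on rotary position embeddings, which is assumed here and not taken from the paper. The function names, `TRAINED_LEN`, and `TARGET_LEN` are hypothetical.

```python
import math

def rope_angles(position, dim, base=10000.0, scale=1.0):
    """Rotation angles for one token position under rotary embeddings.

    scale > 1 compresses positions (linear position interpolation), so a
    position beyond the trained window maps back inside it.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    scaled_pos = position / scale
    # One angle per pair of channels; frequency decays with the pair index.
    return [scaled_pos * base ** (-2.0 * i / dim) for i in range(dim // 2)]

def apply_rope(vec, position, base=10000.0, scale=1.0):
    """Rotate consecutive channel pairs of vec by the position's angles."""
    angles = rope_angles(position, len(vec), base, scale)
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x * c - y * s, x * s + y * c])
    return out

# Hypothetical lengths: a model trained on 2048 tokens, run on 8192 tokens.
TRAINED_LEN = 2048
TARGET_LEN = 8192
SCALE = TARGET_LEN / TRAINED_LEN  # 4.0

# With interpolation, the last target position reuses the angles of the
# last trained position, with no change to the model weights.
interpolated = rope_angles(TARGET_LEN, 8, scale=SCALE)
trained = rope_angles(TRAINED_LEN, 8)
```

Because only the position index is rescaled, this kind of change can be applied at inference time without retraining, which is the setting the abstract describes. Whether it helps on long code completion is an empirical question that the sketch does not answer.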
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques
