RIRO: Reshaping Inputs, Refining Outputs Unlocking the Potential of Large Language Models in Data-Scarce Contexts
Ali Hamdi, Hozaifa Kassab, Mohamed Bahaa, and Marwa Mohamed

TL;DR
RIRO is a two-layer architecture that enhances large language models' performance in data-scarce settings by reformulating inputs and refining outputs, demonstrated through fine-tuning and new benchmarking metrics.
Contribution
The paper introduces RIRO, a novel approach combining prompt engineering and output refinement to improve LLMs in low-data environments, with extensive evaluation and benchmarking.
Findings
Phi-2 outperforms Falcon models after fine-tuning.
Benchmarking with cosine similarity, BLEU, and ROUGE metrics shows performance gains.
Challenges like computational costs and overfitting remain significant.
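As a rough illustration of the kind of text-overlap scoring named in the findings above, the sketch below computes a bag-of-words cosine similarity and a unigram precision/recall pair (the basic ingredients of BLEU-1 and ROUGE-1, without brevity penalty or higher-order n-grams). The function names and the whitespace tokenizer are assumptions made for this example; the paper's actual benchmark implementation is not described here.

```python
import math
from collections import Counter


def tokens(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def cosine_similarity(a, b):
    # Cosine similarity between token-count vectors of two strings.
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def unigram_overlap(candidate, reference):
    # Clipped unigram overlap, returned as (precision, recall).
    # Precision is the core of BLEU-1; recall is the core of ROUGE-1.
    c, r = Counter(tokens(candidate)), Counter(tokens(reference))
    overlap = sum((c & r).values())
    precision = overlap / max(sum(c.values()), 1)
    recall = overlap / max(sum(r.values()), 1)
    return precision, recall
```

In practice, a model's generated output would be passed as `candidate` and the ground-truth text as `reference`; established libraries are usually preferable to hand-rolled metrics for reported results.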
Abstract
Large language models (LLMs) have significantly advanced natural language processing, excelling in areas like text generation, summarization, and question-answering. Despite their capabilities, these models face challenges when fine-tuned on small, domain-specific datasets, often struggling to generalize and deliver accurate results with unfamiliar inputs. To tackle this issue, we introduce RIRO, a novel two-layer architecture designed to improve performance in data-scarce environments. The first layer leverages advanced prompt engineering to reformulate inputs, ensuring better alignment with training data, while the second layer focuses on refining outputs to minimize inconsistencies. We fine-tune models such as Phi-2, Falcon 7B, and Falcon 1B, and find that Phi-2 outperforms the others. Additionally, we introduce a benchmark using evaluation metrics such as cosine similarity, Levenshtein…
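The two-layer flow described in the abstract can be pictured as a simple composition: reshape the input, generate with the fine-tuned model, then refine the output. The sketch below is only a minimal illustration under stated assumptions. The name `riro_pipeline` and the three stub functions are hypothetical stand-ins; in the paper, the reshaping and refining layers rely on prompt-engineered LLM calls and the generator is a fine-tuned model such as Phi-2, none of which is reproduced here.

```python
from typing import Callable

TextFn = Callable[[str], str]


def riro_pipeline(raw_input: str, reshape: TextFn, generate: TextFn, refine: TextFn) -> str:
    # Layer 1: reformulate the input so it resembles the training data.
    reshaped = reshape(raw_input)
    # Core model: produce a draft from the reshaped input.
    draft = generate(reshaped)
    # Layer 2: clean up the draft to reduce inconsistencies.
    return refine(draft)


# Hypothetical stand-ins so the sketch runs without any model.
def reshape_stub(text: str) -> str:
    # Normalize whitespace and add a fixed prefix (assumed format).
    return "Requirement: " + " ".join(text.split())


def generate_stub(prompt: str) -> str:
    # Placeholder for a fine-tuned model call.
    return f"draft for [{prompt}]  "


def refine_stub(draft: str) -> str:
    # Placeholder refinement: trim stray whitespace.
    return draft.strip()


result = riro_pipeline("  login   page ", reshape_stub, generate_stub, refine_stub)
```

Passing the layers as callables keeps the sketch independent of any particular model or prompting library, so each stage can be swapped or evaluated on its own.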
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
