Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning
Deepanway Ghosal, Yew Ken Chia, Navonil Majumder, Soujanya Poria

TL;DR
This paper demonstrates that fine-tuning VICUNA with a specialized instruction dataset called FLANMINI significantly enhances its problem-solving capabilities, outperforming other large language models on various benchmarks.
Contribution
The study introduces FLACUNA, a fine-tuned VICUNA model using FLANMINI, showing improved problem-solving performance over existing models.
Findings
FLACUNA outperforms baseline models on multiple benchmarks.
Fine-tuning with FLANMINI enhances problem-solving skills.
VICUNA's performance is significantly improved through targeted instruction tuning.
Abstract
Recently, the release of INSTRUCTEVAL has provided valuable insights into the performance of large language models (LLMs) that utilize encoder-decoder or decoder-only architecture. Interestingly, despite being introduced four years ago, T5-based LLMs, such as FLAN-T5, continue to outperform the latest decoder-based LLMs, such as LLAMA and VICUNA, on tasks that require general problem-solving skills. This performance discrepancy can be attributed to three key factors: (1) Pre-training data, (2) Backbone architecture, and (3) Instruction dataset. In this technical report, our main focus is on investigating the impact of the third factor by leveraging VICUNA, a large language model based on LLAMA, which has undergone fine-tuning on ChatGPT conversations. To achieve this objective, we fine-tuned VICUNA using a customized instruction dataset collection called FLANMINI. This collection…
Taxonomy
Topics: Machine Learning and Data Classification · Topic Modeling · Advanced Neural Network Applications
