Safurai 001: New Qualitative Approach for Code LLM Evaluation
Davide Cifarelli, Leonardo Boiardi, Alessandro Puppo

TL;DR
Safurai-001 is a new conversational coding LLM that performs on par with recent models. The paper also introduces GPT4-based MultiParameters, a multi-parameter evaluation benchmark on which Safurai-001 scores higher than GPT-3.5 and WizardCoder in code readability.
Contribution
The paper introduces Safurai-001, a novel coding LLM with conversational capabilities, and proposes GPT4-based MultiParameters as a new evaluation benchmark for coding models.
Findings
Safurai-001 outperforms GPT-3.5 by 1.58% in code readability.
Safurai-001 outperforms WizardCoder by 18.78% in code readability.
The GPT4-based MultiParameters benchmark provides comprehensive insights into model performance.
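The readability comparisons above can be illustrated with a minimal sketch of how per-parameter judge scores might be aggregated and compared between two models. This is not the paper's implementation. The function names (`mean_scores`, `compare`), the `correctness` parameter, and all scores are hypothetical placeholders. The summary does not say whether the reported percentages are relative gaps or percentage-point gaps, so the sketch computes both.

```python
from statistics import mean


def mean_scores(judgements):
    """Average the judge's scores per parameter.

    `judgements` is a list of dicts, one per evaluated answer,
    mapping a parameter name to a numeric score (hypothetical layout).
    """
    params = judgements[0].keys()
    return {p: mean(j[p] for j in judgements) for p in params}


def compare(model_a, model_b, parameter):
    """Return the gap between two models on one parameter, in two forms."""
    a = mean_scores(model_a)[parameter]
    b = mean_scores(model_b)[parameter]
    return {
        "point_gap": a - b,                         # absolute difference in mean score
        "relative_gap_pct": 100.0 * (a - b) / b,    # difference relative to model_b
    }


# Placeholder scores for illustration only; these are NOT results from the paper.
toy_a = [{"readability": 8.0, "correctness": 7.0},
         {"readability": 9.0, "correctness": 6.0}]
toy_b = [{"readability": 8.0, "correctness": 5.0},
         {"readability": 8.0, "correctness": 7.0}]

print(compare(toy_a, toy_b, "readability"))
```

Reporting both forms of the gap avoids ambiguity when a figure such as "1.58%" is quoted without stating the scale of the underlying scores.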
Abstract
This paper presents Safurai-001, a new Large Language Model (LLM) with significant potential in the domain of coding assistance. Driven by recent advancements in coding LLMs, Safurai-001 competes in performance with the latest models like WizardCoder [Xu et al., 2023], PanguCoder [Shen et al., 2023] and Phi-1 [Gunasekar et al., 2023] but aims to deliver a more conversational interaction. By capitalizing on the progress in data engineering (including the latest techniques of data transformation and prompt engineering) and instruction tuning, this new model promises to stand toe-to-toe with recent closed and open source developments. Recognizing the need for an efficacious evaluation metric for coding LLMs, this paper also introduces GPT4-based MultiParameters, an evaluation benchmark that harnesses varied parameters to present a comprehensive insight into the models' functioning and…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Attention Dropout · Residual Connection · Adam · Linear Layer · Weight Decay
