"Which LLM should I use?": Evaluating LLMs for tasks performed by Undergraduate Computer Science Students

Vibhor Agarwal; Madhav Krishan Garg; Sahiti Dharmavaram; Dhruv Kumar

arXiv:2402.01687·cs.CY·April 4, 2024·2 cites

Open Access

TL;DR

This paper systematically evaluates multiple large language models to determine their effectiveness for tasks commonly performed by undergraduate computer science students, providing guidance on model selection.

Contribution

It offers a comprehensive comparison of popular LLMs across diverse student tasks, filling a gap in computing education research.

Findings

1. Google Bard and ChatGPT excel in code explanation and documentation.
2. GitHub Copilot Chat performs best in coding tasks.
3. Models show varied strengths in learning and communication tasks.

Abstract

This study evaluates the effectiveness of various large language models (LLMs) in performing tasks common among undergraduate computer science students. Although a number of research studies in the computing education community have explored the possibility of using LLMs for a variety of tasks, there is a lack of comprehensive research comparing different LLMs and evaluating which LLMs are most effective for different tasks. Our research systematically assesses some of the publicly available LLMs such as Google Bard, ChatGPT (3.5), GitHub Copilot Chat, and Microsoft Copilot across diverse tasks commonly encountered by undergraduate computer science students in India. These tasks include code explanation and documentation, solving class assignments, technical interview preparation, learning new concepts and frameworks, and email writing. Evaluation for these tasks was carried out by…

Taxonomy

Topics: Artificial Intelligence in Law