TL;DR
This paper introduces LLMChain, a blockchain-based reputation system that combines automatic and human evaluations to assess and improve the trustworthiness of large language models, addressing concerns about their reliability and ethics.
Contribution
It presents the first blockchain-based decentralized framework for sharing and evaluating LLMs, integrating automatic and human feedback for reputation scoring.
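As a rough illustration of what combining automatic and human feedback into one reputation score could look like, here is a minimal sketch. It is not the paper's formula: the function name `reputation_score`, the `human_weight` parameter, the [0, 1] score range, and the weighted-average rule are all assumptions made for this example.

```python
def reputation_score(automatic_score, human_ratings, human_weight=0.5):
    """Blend an automatic metric with the mean of human ratings.

    Hypothetical helper: every score is assumed to lie in [0, 1], and the
    weighted average is an illustrative choice, not LLMChain's actual rule.
    """
    if not 0.0 <= human_weight <= 1.0:
        raise ValueError("human_weight must be in [0, 1]")
    if not human_ratings:
        # No human feedback yet: fall back to the automatic score alone.
        return automatic_score
    human_mean = sum(human_ratings) / len(human_ratings)
    return (1.0 - human_weight) * automatic_score + human_weight * human_mean


# Example: automatic score 0.8, two human ratings averaging 0.75.
score = reputation_score(0.8, [1.0, 0.5], human_weight=0.5)
```

A weighted average keeps the two signals separable: raising `human_weight` lets user feedback dominate once enough ratings have accumulated, while a model with no ratings yet is scored on the automatic metric alone.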
Findings
Effective assessment of seven LLMs demonstrated
Scalable evaluation across two benchmark datasets
Enhanced transparency in LLM trustworthiness
Abstract
Large Language Models (LLMs) have seen rapid growth in their capabilities for language understanding, generation, and reasoning, along with a growing set of challenges. Despite their remarkable performance in natural language processing applications, LLMs are susceptible to undesirable and erratic behaviors, including hallucinations, unreliable reasoning, and the generation of harmful content. These flaws undermine trust in LLMs and pose significant hurdles to their adoption in real-world applications, such as legal assistance and medical diagnosis, where precision, reliability, and ethical considerations are paramount. They can also lead to user dissatisfaction, which is currently inadequately assessed and captured. Therefore, to effectively and transparently assess users' satisfaction and trust in their interactions with LLMs, we design and develop LLMChain, a decentralized…
