Measuring source code conciseness across programming languages using compression
Lodewijk Bergmans, Xander Schrijen, Edwin Ouwehand, Magiel Bruntink

TL;DR
This paper introduces an information-theoretic model to objectively measure and compare the conciseness of different programming languages, validated through large-scale analysis and surveys.
Contribution
It presents a novel, evidence-based approach for quantifying programming language conciseness using compression, applicable to diverse software applications.
Findings
Strong correlation with an alternative analytical approach
Strong correlation with a large-scale developer survey
Useful for improving multi-language software metrics
Abstract
It is well known, and often a topic of heated debate, that programs in some programming languages are more concise than in others. This is a relevant factor when comparing or aggregating volume-impacted metrics on source code written in a combination of programming languages. In this paper, we present a model for measuring the conciseness of programming languages in a consistent, objective and evidence-based way. We present the approach, explain how it is founded on information-theoretic principles, present detailed analysis steps and show the quantitative results of applying this model to a large benchmark of diverse commercial software applications. We demonstrate that our metric for language conciseness is strongly correlated with both an alternative analytical approach and a large-scale developer survey, and show how its results can be applied to improve software metrics…
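The abstract does not spell out the model itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the compressed size of source code approximates its information content, and that dividing this by the number of non-blank lines gives a rough "information per line" figure that can be compared between languages. The choice of zlib as the compressor and the function names `information_per_line` and `relative_conciseness` are hypothetical and chosen purely for illustration.

```python
import zlib


def information_per_line(source: str) -> float:
    """Return compressed bytes per non-blank line of source code.

    Assumption: compressed size is a proxy for information content.
    zlib is used only because it ships with Python; the paper may use
    a different compressor and a different normalization.
    """
    lines = [line for line in source.splitlines() if line.strip()]
    if not lines:
        raise ValueError("source has no non-blank lines")
    payload = "\n".join(lines).encode("utf-8")
    compressed = zlib.compress(payload, 9)  # 9 = maximum compression level
    return len(compressed) / len(lines)


def relative_conciseness(corpus_a: str, corpus_b: str) -> float:
    """Ratio of information per line in corpus_a to that in corpus_b.

    A value above 1.0 suggests a line of corpus_a carries more
    information than a line of corpus_b under this rough proxy.
    """
    return information_per_line(corpus_a) / information_per_line(corpus_b)
```

In this sketch, each corpus would be the concatenated source files of one language, and the resulting ratio could serve as a weighting factor when aggregating volume-based metrics such as lines of code across languages. The corpora need to be large for the estimate to be meaningful, since compressor overhead dominates on small inputs.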
