On some properties of medians, percentiles, baselines, and thresholds in empirical bibliometric analysis
Vladimir Pislyakov

TL;DR
This paper critically examines the use of medians, percentiles, baselines, and thresholds in bibliometric analysis, revealing inconsistencies and potential misapplications in empirical research.
Contribution
It provides a theoretical analysis of common bibliometric measures and highlights discrepancies between their definitions and practical usage across fields.
Findings
Quartiles often do not contain a quarter of the items in a distribution
Medians are not always equivalent to 50% points in practice
World baselines and thresholds can lead to biased evaluations
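The first two findings can be illustrated with a small sketch. It assumes that tied integer citation counts are one mechanism behind the mismatch; this is an illustration under that assumption, not the paper's own demonstration. The citation counts and the helper names `share_at_or_above` and `share_strictly_above` are hypothetical.

```python
from statistics import median

# Hypothetical citation counts for ten papers in one field (illustrative data only).
citations = [0, 0, 0, 0, 0, 0, 1, 2, 5, 40]


def share_at_or_above(values, threshold):
    """Fraction of values greater than or equal to the threshold."""
    return sum(v >= threshold for v in values) / len(values)


def share_strictly_above(values, threshold):
    """Fraction of values strictly greater than the threshold."""
    return sum(v > threshold for v in values) / len(values)


med = median(citations)
at_or_above = share_at_or_above(citations, med)
strictly_above = share_strictly_above(citations, med)

# With many uncited papers the median is 0, so "at or above the median"
# covers every paper, while "above the median" covers only 40% of them.
print(f"median = {med}")
print(f"share at or above median = {at_or_above:.0%}")
print(f"share strictly above median = {strictly_above:.0%}")
```

In this toy sample neither reading of "the upper half" contains half of the papers, which is the kind of gap between definition and practice that the findings describe.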
Abstract
One of the most useful and correct methodological approaches in bibliometrics is ranking. Given the highly skewed nature of bibliometric distributions and the severe distortions caused by outliers, it is often the preferable method of analysis. Ranking methodology strictly implies that "oranges should be compared with oranges, apples with apples". We should make a "like with like" comparison. Ranks in different fields show how a unit under study compares with others in its field. But do we always apply the "apples approach" appropriately? Is the median really 50%, a quartile 25%, the 10th percentile 10%? The paper compares the theoretical definitions of these terms with the meaning they actually take on in bibliometric research. It finds that in many empirical cases quartiles are not quarters, medians are not halves, world baselines are not unity, and integer thresholds lead to inequality of…
