Field- and time-normalization of data with many zeros: An empirical analysis using citation and Twitter data
Robin Haunschild, Lutz Bornmann

TL;DR
This paper evaluates a new field- and time-normalized indicator, MHq, for sparse data such as citations and Twitter mentions. It finds that MHq distinguishes quality levels in most cases and questions the usefulness of Twitter counts in research evaluation.
Contribution
The paper introduces the MHq indicator, which is based on the Mantel-Haenszel (MH) analysis, and tests whether it can distinguish between quality levels defined by peer assessments (F1000Prime recommendations), using both citation and Twitter data.
Findings
MHq can distinguish between quality levels in most cases.
Other indicators in the family often fail to distinguish quality levels.
Twitter mentions show a weak correlation with scientific quality.
Abstract
Thelwall (2017a, 2017b) proposed a new family of field- and time-normalized indicators, which is intended for sparse data. These indicators are based on units of analysis (e.g., institutions) rather than on the paper level. They compare the proportion of mentioned papers (e.g., on Twitter) of a unit with the proportion of mentioned papers in the corresponding fields and publication years (the expected values). We propose a new indicator (Mantel-Haenszel quotient, MHq) for the indicator family. The MHq goes back to the MH analysis. This analysis is an established method, which can be used to pool the data from several 2x2 cross tables based on different subgroups. We investigate (using citations and assessments by peers, i.e., F1000Prime recommendations) whether the indicator family (including the MHq) can distinguish between quality levels defined by the assessments of peers. Thus, we…
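The pooling step mentioned in the abstract can be illustrated with the classical Mantel-Haenszel pooled odds ratio over several 2x2 tables. This is only a minimal sketch of the textbook estimator, not necessarily the exact definition of the MHq used in the paper. The layout of each table (a = mentioned papers of the unit, b = unmentioned papers of the unit, c = mentioned papers of the rest of the field/year, d = unmentioned papers of the rest), the function name, and the example counts are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

# One 2x2 table per stratum (e.g., one field and publication year).
# Assumed layout, chosen for this illustration only:
#   a = unit's papers that were mentioned
#   b = unit's papers that were not mentioned
#   c = remaining papers in the stratum that were mentioned
#   d = remaining papers in the stratum that were not mentioned
Table = Tuple[int, int, int, int]


def mantel_haenszel_odds_ratio(tables: Iterable[Table]) -> float:
    """Classical Mantel-Haenszel pooled odds ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined (zero denominator)")
    return numerator / denominator


# Hypothetical counts for two strata; these are not data from the paper.
example_tables = [
    (3, 7, 20, 170),
    (1, 9, 10, 80),
]
pooled = mantel_haenszel_odds_ratio(example_tables)
print(f"Pooled MH odds ratio: {pooled:.3f}")
```

A pooled value above 1 would indicate that, across the strata, the unit's papers are mentioned more often than expected from the rest of the corresponding fields and publication years. Weighting each stratum's contribution by 1/n keeps strata with many zero cells from dominating the estimate, which is one reason this kind of pooling suits sparse data.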
