Examining Scientific Writing Styles from the Perspective of Linguistic Complexity
Chao Lu, Yi Bu, Jie Wang, Ying Ding, Vetle Torvik, Matthew Schnaars, Chengzhi Zhang

TL;DR
This study compares the linguistic complexity of scientific writing by native and non-native English-speaking authors across more than 150,000 full-text PLoS articles, finding only marginal differences in syntactic and lexical features.
Contribution
It provides a large-scale empirical comparison of scientific writing styles between native English-speaking scholars (NESs) and non-native English-speaking scholars (NNESs), based on linguistic complexity metrics.
Findings
Marginal differences in syntactic complexity between groups.
Marginal differences in lexical diversity and density.
Analysis based on more than 150,000 full-text articles published in PLoS between 2006 and 2015.
Abstract
Publishing articles in high-impact English journals is difficult for scholars around the world, especially for non-native English-speaking scholars (NNESs), most of whom struggle with proficiency in English. In order to uncover the differences in English scientific writing between native English-speaking scholars (NESs) and NNESs, we collected a large-scale data set containing more than 150,000 full-text articles published in PLoS between 2006 and 2015. We divided these articles into three groups according to the ethnic backgrounds of the first and corresponding authors, obtained by Ethnea, and examined the scientific writing styles in English from a two-fold perspective of linguistic complexity: (1) syntactic complexity, including measurements of sentence length and sentence complexity; and (2) lexical complexity, including measurements of lexical diversity, lexical density, and…
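To make the metric families named in the abstract concrete, the following is a minimal sketch of how sentence length, lexical diversity, and lexical density might be computed for a passage. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the regex-based sentence splitter and tokenizer, and the small FUNCTION_WORDS list are all hypothetical. Lexical diversity is approximated here by a plain type-token ratio, and lexical density by the share of tokens outside the function-word list; a real analysis would more likely rely on a part-of-speech tagger.

```python
import re

# Hypothetical, deliberately tiny function-word list (an assumption for
# illustration only); it stands in for proper part-of-speech tagging.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "in", "into", "on", "and", "or", "to",
    "is", "are", "was", "were", "be", "we", "it", "that", "this",
    "with", "for", "by", "as", "at", "from",
}


def split_sentences(text):
    """Naively split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def complexity_profile(text):
    """Return three simple complexity measures for a passage."""
    sentences = split_sentences(text)
    tokens = tokenize(text)
    if not sentences or not tokens:
        raise ValueError("text must contain at least one word")
    content_tokens = [t for t in tokens if t not in FUNCTION_WORDS]
    return {
        # Syntactic side: average number of word tokens per sentence.
        "mean_sentence_length": len(tokens) / len(sentences),
        # Lexical diversity: distinct word forms divided by total tokens.
        "type_token_ratio": len(set(tokens)) / len(tokens),
        # Lexical density: share of tokens not in the function-word list.
        "lexical_density": len(content_tokens) / len(tokens),
    }


sample = "We collected the articles. The articles were divided into three groups."
profile = complexity_profile(sample)
```

The type-token ratio is sensitive to text length, so comparing groups of articles would normally call for a length-normalized variant or fixed-size samples.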
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
