Measuring Corporate Digital Divide with web scraping: Evidence from Italy
Mazzoni Leonardo, Pinelli Fabio, Riccaboni Massimo

TL;DR
This paper introduces a web scraping method to measure the digital divide among Italian firms, revealing significant disparities across sectors and regions and enabling near-real-time digital assessment.
Contribution
The paper develops a new corporate digital assessment index from ten features scraped from firm website homepages, addressing the scarcity of granular data for measuring the corporate digital divide.
Findings
Significant digital divide across sectors and regions
Web scraping effectively captures digital footprint characteristics
Enables near-real-time monitoring of digital disparities
Abstract
With the increasing pervasiveness of ICTs in the fabric of economic activities, the corporate digital divide has emerged as a new crucial topic to evaluate the IT competencies and the digital gap between firms and territories. Given the scarcity of available granular data to measure the phenomenon, most studies have used survey data. To bridge the empirical gap, we scrape the website homepages of 182 705 Italian firms, extracting ten features related to their digital footprint characteristics to develop a new corporate digital assessment index. Our results highlight a significant digital divide across dimensions, sectors and geographical locations of Italian firms, opening up new perspectives on monitoring and near-real-time data-driven analysis.
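Illustrative sketch
The abstract describes extracting features from each firm's homepage and combining them into an index, but it does not list the ten features or the weighting used. The Python sketch below shows only the general shape of such a pipeline. The five indicators, the names `HomepageFeatures`, `extract_features`, and `digital_index`, and the equal-weight average are all assumptions made for illustration; they are not the authors' method.

```python
from html.parser import HTMLParser

# Hypothetical list of social platforms; not taken from the paper.
SOCIAL_DOMAINS = ("facebook.com", "linkedin.com", "instagram.com", "twitter.com")


class HomepageFeatures(HTMLParser):
    """Collects a few binary digital-footprint indicators from homepage HTML.

    The indicators are illustrative assumptions, not the paper's ten features.
    """

    def __init__(self):
        super().__init__()
        self.features = {
            "has_social_link": False,
            "has_viewport_meta": False,  # rough proxy for a mobile-friendly page
            "has_script": False,
            "has_form": False,
            "has_https_link": False,
        }

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Attribute values can be None (e.g. <a href>), so fall back to "".
        href = (attributes.get("href") or "").lower()
        if tag == "a":
            if any(domain in href for domain in SOCIAL_DOMAINS):
                self.features["has_social_link"] = True
            if href.startswith("https://"):
                self.features["has_https_link"] = True
        elif tag == "meta":
            if (attributes.get("name") or "").lower() == "viewport":
                self.features["has_viewport_meta"] = True
        elif tag == "script":
            self.features["has_script"] = True
        elif tag == "form":
            self.features["has_form"] = True


def extract_features(html: str) -> dict:
    """Parse one homepage and return its indicator dictionary."""
    parser = HomepageFeatures()
    parser.feed(html)
    parser.close()
    return parser.features


def digital_index(features: dict) -> float:
    """Equal-weight share of indicators present, in [0, 1] (an assumption)."""
    return sum(features.values()) / len(features)


if __name__ == "__main__":
    sample = (
        '<html><head><meta name="viewport" content="width=device-width"></head>'
        '<body><a href="https://www.linkedin.com/company/example">LinkedIn</a>'
        '<form action="/contact"></form></body></html>'
    )
    print(digital_index(extract_features(sample)))
```

In a real pipeline the HTML would come from fetching each firm's homepage, and the resulting scores could then be aggregated by sector or region to compare groups of firms.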
Taxonomy
Topics: ICT Impact and Policies · Social Media and Politics · E-Government and Public Services
