Scanner data in inflation measurement: from raw data to price indices
Jacek Białek, Maciej Beręsewicz

TL;DR
This paper discusses the processing of scanner data for inflation measurement, covering data cleaning, classification, and index calculation, and compares various price index methods using real datasets.
Contribution
It proposes detailed procedures for handling scanner data and evaluates the sensitivity of different price index methods in inflation measurement.
Findings
Scanner data enable detailed CPI calculation because they contain complete transaction information (prices and quantities for every item sold).
Different price index methods vary in sensitivity to data filtering and aggregation.
Proper data processing is crucial for accurate inflation measurement.
Abstract
Scanner data offer new opportunities for CPI or HICP calculation. They can be obtained from a wide variety of retailers (supermarkets, home electronics, Internet shops, etc.) and provide information at the level of the barcode. One of the advantages of using scanner data is that they contain complete transaction information, i.e. prices and quantities for every sold item. Before scanner data can be used, they must be carefully processed. After cleaning the data and unifying product names, products should be carefully classified (e.g. into COICOP 5 or below), matched, filtered and aggregated. These procedures often require creating new IT tools or writing custom scripts (R, Python, Mathematica, SAS, others). One of the new challenges connected with scanner data is the appropriate choice of the index formula. In this article we present a proposal for the implementation of individual stages of handling scanner…
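To make the matching, aggregation and index-formula steps named in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy records, the function names `unit_values` and `bilateral_indices`, and the choice of two-month bilateral formulas (Jevons, Laspeyres, Paasche, Fisher) are assumptions made purely for illustration. It aggregates transactions to monthly unit values per barcode, keeps only products sold in both months, and compares how the formulas differ on the same matched sample.

```python
import math

# Hypothetical toy records: (barcode, month, total sales value, quantity sold).
records = [
    ("A", 1, 100.0, 50), ("A", 2, 110.0, 50),
    ("B", 1, 200.0, 40), ("B", 2, 180.0, 30),
    ("C", 1, 30.0, 10),  # not sold in month 2, so it drops out when matching
]

def unit_values(records, month):
    """Aggregate one month's transactions to (unit value, quantity) per barcode."""
    sales, qty = {}, {}
    for code, m, value, q in records:
        if m == month and q > 0:  # simple filter: ignore non-positive quantities
            sales[code] = sales.get(code, 0.0) + value
            qty[code] = qty.get(code, 0) + q
    return {code: (sales[code] / qty[code], qty[code]) for code in sales}

def bilateral_indices(records, base, current):
    """Compute standard bilateral price indices on the matched sample."""
    base_uv = unit_values(records, base)
    cur_uv = unit_values(records, current)
    matched = sorted(set(base_uv) & set(cur_uv))  # products sold in both months
    if not matched:
        raise ValueError("no products sold in both months")
    p0 = [base_uv[c][0] for c in matched]
    q0 = [base_uv[c][1] for c in matched]
    p1 = [cur_uv[c][0] for c in matched]
    q1 = [cur_uv[c][1] for c in matched]
    # Jevons: unweighted geometric mean of price relatives.
    jevons = math.exp(sum(math.log(b / a) for a, b in zip(p0, p1)) / len(matched))
    # Laspeyres uses base-period quantities, Paasche current-period quantities.
    laspeyres = sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))
    paasche = sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))
    # Fisher: geometric mean of Laspeyres and Paasche.
    fisher = math.sqrt(laspeyres * paasche)
    return {"matched": matched, "jevons": jevons, "laspeyres": laspeyres,
            "paasche": paasche, "fisher": fisher}

result = bilateral_indices(records, base=1, current=2)
for name, value in result.items():
    print(name, value)
```

Even on this tiny sample the unweighted Jevons index differs from the quantity-weighted formulas, which illustrates why the choice of formula and the matching and filtering rules matter once quantities are available.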
