A statistical technique for cleaning option price data
Jaco Visagie

TL;DR
This paper introduces a statistical method to identify and remove arbitrage-violating option prices from datasets, enhancing data reliability without relying on specific pricing models, and provides cleaned datasets for research use.
Contribution
The paper presents a novel, model-free statistical technique for cleaning option price data, addressing common data issues and ensuring arbitrage-free datasets.
Findings
Effective removal of arbitrage-violating prices
No reliance on specific option pricing models
Provides publicly available cleaned datasets
Abstract
Recorded option pricing datasets are not always freely available. Additionally, these datasets often contain numerous prices which are either higher or lower than can reasonably be expected. Various reasons for these unexpected observations are possible, including human error in the recording of the details associated with the option in question. In order for the analyses performed on these datasets to be reliable, it is necessary to identify and remove these options from the dataset. In this paper, we list three distinct problems often found in recorded option price datasets alongside means of addressing these. The methods used are justified using sound statistical reasoning and remove option prices violating the standard assumption of no arbitrage. An attractive aspect of the proposed technique is that no option pricing model-based assumptions are used. Although the discussion is…
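To make the idea of model-free, no-arbitrage filtering concrete, here is a minimal sketch. It is not the paper's procedure: the abstract does not say which three problems it addresses or how. The sketch applies only the textbook static bounds for a European call on a non-dividend-paying asset, max(S - K·exp(-rT), 0) ≤ C ≤ S, which hold without assuming any pricing model. The function names, the record layout, and the tolerance `tol` are hypothetical and chosen for illustration.

```python
import math


def call_price_bounds(spot, strike, rate, maturity):
    """Model-free bounds for a European call on a non-dividend-paying asset.

    Assumption: continuous compounding at a constant risk-free rate.
    """
    lower = max(spot - strike * math.exp(-rate * maturity), 0.0)
    upper = spot
    return lower, upper


def violates_no_arbitrage(price, spot, strike, rate, maturity, tol=1e-9):
    """Return True if the recorded call price lies outside the static bounds."""
    lower, upper = call_price_bounds(spot, strike, rate, maturity)
    return price < lower - tol or price > upper + tol


def clean_calls(records):
    """Keep only the records whose price respects the bounds.

    Each record is a dict with the (hypothetical) keys:
    price, spot, strike, rate, maturity.
    """
    return [r for r in records if not violates_no_arbitrage(**r)]


# Illustrative, made-up records (not taken from any real dataset).
records = [
    {"price": 12.0, "spot": 100.0, "strike": 90.0, "rate": 0.05, "maturity": 1.0},
    {"price": 16.0, "spot": 100.0, "strike": 90.0, "rate": 0.05, "maturity": 1.0},
    {"price": 105.0, "spot": 100.0, "strike": 90.0, "rate": 0.05, "maturity": 1.0},
]
cleaned = clean_calls(records)
```

In this example the first record falls below the lower bound (about 14.39) and the third exceeds the spot price, so only the second record survives. A check of this kind catches only the most basic violations. Recording errors that stay inside the bounds would need additional statistical screening of the kind the paper describes.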
Taxonomy
Topics: Stochastic processes and financial applications · Financial Risk and Volatility Modeling · Stock Market Forecasting Methods
