Size matters? Or not: A/B testing with limited sample in automotive embedded software
Yuchu Liu, David Issa Mattos, Jan Bosch, Helena Holmström Olsson, Jonn Lantz

TL;DR
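This paper applies the Balance Match Weighted group design method, previously used in domains such as medicine, to A/B testing in automotive embedded software, so that sound causal conclusions can be drawn from small samples. This matters because few eligible users are available for online experiments in the automotive domain.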
Contribution
It applies and evaluates the Balance Match Weighted method in automotive software development, demonstrating its effectiveness in small-sample A/B testing scenarios.
Findings
Effective group balancing for small samples
Successful case study with automotive manufacturer
Discussion of benefits and limitations
Abstract
A/B testing is gaining attention in the automotive sector as a promising tool to measure causal effects from software changes. Different from the web-facing businesses, where A/B testing has been well-established, the automotive domain often suffers from limited eligible users to participate in online experiments. To address this shortcoming, we present a method for designing balanced control and treatment groups so that sound conclusions can be drawn from experiments with considerably small sample sizes. While the Balance Match Weighted method has been used in other domains such as medicine, this is the first paper to apply and evaluate it in the context of software development. Furthermore, we describe the Balance Match Weighted method in detail and we conduct a case study together with an automotive manufacturer to apply the group design method in a fleet of vehicles. Finally, we…
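To make the idea of "designing balanced control and treatment groups" concrete, the following is a minimal sketch of one simple balancing strategy: draw many random splits and keep the one whose groups look most alike on the recorded covariates. It is not the Balance Match Weighted method described in the paper, whose details are not given in this summary. The function names (`imbalance`, `best_split`), the `fleet` data, and the choice of covariates (mileage and average speed) are all hypothetical and serve only as an illustration.

```python
import random
import statistics


def imbalance(group_a, group_b):
    """Sum of absolute standardized mean differences across all covariates."""
    total = 0.0
    for j in range(len(group_a[0])):
        a = [row[j] for row in group_a]
        b = [row[j] for row in group_b]
        # Scale by the spread of the whole sample; fall back to 1.0 if it is zero.
        overall_sd = statistics.pstdev(a + b) or 1.0
        total += abs(statistics.mean(a) - statistics.mean(b)) / overall_sd
    return total


def best_split(covariates, n_candidates=200, seed=0):
    """Try n_candidates random half/half splits and return the most balanced one.

    Returns (score, control_ids, treatment_ids), where a lower score means
    the two groups are more similar on the covariates.
    """
    rng = random.Random(seed)
    ids = list(range(len(covariates)))
    half = len(ids) // 2
    best = None
    for _ in range(n_candidates):
        rng.shuffle(ids)
        control, treatment = sorted(ids[:half]), sorted(ids[half:])
        score = imbalance(
            [covariates[i] for i in control],
            [covariates[i] for i in treatment],
        )
        if best is None or score < best[0]:
            best = (score, control, treatment)
    return best


# Hypothetical fleet of eight vehicles: (mileage in km, average speed in km/h).
fleet = [
    (12000, 61.5), (48000, 72.0), (30500, 55.2), (8000, 80.1),
    (51000, 66.3), (27000, 59.8), (15500, 74.4), (39000, 63.0),
]

score, control, treatment = best_split(fleet)
print("control vehicles:  ", control)
print("treatment vehicles:", treatment)
print("imbalance score:   ", round(score, 3))
```

With only a handful of vehicles, a single random split can easily place most high-mileage cars in one group. Searching over candidate splits reduces that risk, at the cost of a more constrained randomization that the later analysis should take into account.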
