Negative binomial regression and inference using a pre-trained transformer

Valentine Svensson

arXiv:2508.04111·stat.ML·August 7, 2025


TL;DR

This paper explores using a pre-trained transformer to estimate negative binomial regression parameters. The transformer is more accurate than maximum likelihood optimization and 20 times faster, but method of moments proves the most practical approach: it matches maximum likelihood in accuracy while running 1,000 times faster.

Contribution

The paper introduces a transformer-based approach to negative binomial regression parameter estimation, trained on synthetic data, and shows that it is faster and more accurate than maximum likelihood optimization.

Findings

1. Transformer estimates outperform maximum likelihood in accuracy.

2. Method of moments estimates are as accurate as MLE but much faster.

3. Method of moments provides better-calibrated and more powerful tests.

Abstract

Negative binomial regression is essential for analyzing over-dispersed count data in comparative studies, but parameter estimation becomes computationally challenging in large screens requiring millions of comparisons. We investigate using a pre-trained transformer to produce estimates of negative binomial regression parameters from observed count data, trained through synthetic data generation to learn to invert the process of generating counts from parameters. The transformer method achieved better parameter accuracy than maximum likelihood optimization while being 20 times faster. However, comparisons unexpectedly revealed that method of moments estimates performed as well as maximum likelihood optimization in accuracy, while being 1,000 times faster and producing better-calibrated and more powerful tests, making it the most efficient solution for this application.
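To make the two ideas in the abstract concrete (generating counts from parameters, and recovering parameters by method of moments), here is a minimal sketch. It is not the paper's implementation. It assumes a two-group comparison (control vs. treated), a log link, and the mean-dispersion parameterization in which the variance equals mu + alpha * mu^2. All function names are hypothetical, and pooling the dispersion by a simple average is an illustrative choice.

```python
import numpy as np


def simulate_nb(mu, alpha, size, rng):
    """Draw negative binomial counts with mean mu and variance mu + alpha * mu**2."""
    n = 1.0 / alpha        # NumPy's "number of successes" parameter
    p = n / (n + mu)       # NumPy's success probability, chosen so the mean is mu
    return rng.negative_binomial(n, p, size=size)


def nb_method_of_moments(counts):
    """Estimate (mu, alpha) by matching the sample mean and variance."""
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    var = counts.var(ddof=1)
    if mean <= 0:
        # All-zero counts carry no information about the dispersion.
        return mean, 0.0
    # Solve var = mean + alpha * mean**2 for alpha; clip at zero when the
    # sample is not over-dispersed.
    alpha = max((var - mean) / mean**2, 0.0)
    return mean, alpha


def two_group_fit(control, treated):
    """Closed-form estimates for log(mu_i) = intercept + log_fold_change * x_i,
    where x_i is 0 for control and 1 for treated (assumes both means are > 0)."""
    mu_c, alpha_c = nb_method_of_moments(control)
    mu_t, alpha_t = nb_method_of_moments(treated)
    return {
        "intercept": np.log(mu_c),
        "log_fold_change": np.log(mu_t) - np.log(mu_c),
        # Assumption: one shared dispersion, taken as the average of the two groups.
        "dispersion": 0.5 * (alpha_c + alpha_t),
    }


# Usage: simulate a two-fold change with dispersion 0.5, then recover the parameters.
rng = np.random.default_rng(0)
control = simulate_nb(mu=10.0, alpha=0.5, size=20_000, rng=rng)
treated = simulate_nb(mu=20.0, alpha=0.5, size=20_000, rng=rng)
fit = two_group_fit(control, treated)
print(fit)
```

Because every step is a closed-form expression of sample means and variances, there is no iterative optimization, which is consistent with the large speed advantage over maximum likelihood reported in the abstract.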


Code & Models

Models

valsv/nb-transformer (Hugging Face model)
