A two-step sequential approach for hyperparameter selection in finite context models

José Contente; Ana Martins; Armando J. Pinho; Sónia Gouveia

arXiv:2603.19736·stat.ML·March 23, 2026

Open Access

TL;DR

This paper introduces a two-step sequential method for selecting hyperparameters in finite-context models (FCMs). It reduces computational cost relative to exhaustive search while achieving comparable compression performance.

Contribution

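The paper proposes a statistically grounded two-stage approach to hyperparameter selection in FCMs. It decomposes the joint optimization into two sequential steps: the context length k is first chosen using categorical serial dependence measures, and the smoothing parameter α is then estimated by maximum likelihood conditional on the selected k.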
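As an illustration of the first step, the sketch below computes a sample version of Cramér's measure between a symbol and the symbol `lag` positions earlier, for a range of lags. It is a minimal sketch under stated assumptions, not the authors' implementation: the formula is one common sample form of the measure for categorical sequences, the function names `cramers_v` and `dependence_profile` are hypothetical, and the rule that turns the lag profile into a choice of k is not stated in this summary, so it is left out.

```python
import math
from collections import Counter


def cramers_v(seq, lag):
    """Sample Cramér-type dependence between X_t and X_{t-lag}.

    Assumed form: sqrt( sum_ab (p_ab - p_a p_b)^2 / (p_a p_b) / (m - 1) ),
    where m is the number of distinct symbols observed in seq.
    """
    n = len(seq)
    if not 1 <= lag < n:
        raise ValueError("lag must satisfy 1 <= lag < len(seq)")
    # Marginal relative frequencies of each observed symbol.
    marginal = {s: c / n for s, c in Counter(seq).items()}
    m = len(marginal)
    if m < 2:
        raise ValueError("need at least two distinct symbols")
    # Counts of (current symbol, symbol `lag` steps earlier) pairs.
    pairs = Counter(zip(seq[lag:], seq[:-lag]))
    total = 0.0
    for a, pa in marginal.items():
        for b, pb in marginal.items():
            p_ab = pairs.get((a, b), 0) / (n - lag)
            total += (p_ab - pa * pb) ** 2 / (pa * pb)
    return math.sqrt(total / (m - 1))


def dependence_profile(seq, max_lag):
    """Dependence values for lags 1..max_lag (hypothetical helper)."""
    return [cramers_v(seq, h) for h in range(1, max_lag + 1)]


if __name__ == "__main__":
    # A strictly periodic toy sequence is fully dependent at every lag.
    print(dependence_profile("ACGT" * 100, 3))
```

A profile that stays high up to some lag and then drops would suggest that contexts longer than that lag add little; how the paper formalizes this decision is not specified here.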

Findings

01

Dependence measures are more sensitive to the context length k than to the smoothing parameter α.

02

The method achieves comparable compression to exhaustive search with less computational effort.

03

Hyperparameter estimation accuracy improves with larger sample sizes.

Abstract

Finite-context models (FCMs) are widely used for compressing symbolic sequences such as DNA, where predictive performance depends critically on the context length k and smoothing parameter α. In practice, these hyperparameters are typically selected through exhaustive search, which is computationally expensive and scales poorly with model complexity. This paper proposes a statistically grounded two-step sequential approach for efficient hyperparameter selection in FCMs. The key idea is to decompose the joint optimization problem into two independent stages. First, the context length k is estimated using categorical serial dependence measures, including Cramér's ν, Cohen's κ and partial mutual information (pami). Second, the smoothing parameter α is estimated via maximum likelihood conditional on the selected context length k. Simulation experiments were…
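The second step described in the abstract, estimating α by maximum likelihood once k is fixed, can be sketched as follows. This is a hedged illustration rather than the paper's procedure: it assumes the usual additive-smoothing estimator P(s | c) = (n(c, s) + α) / (n(c) + α|A|) with counts updated adaptively, and it maximizes the likelihood by a simple search over a user-supplied grid of candidate α values. The function names and the grid are hypothetical.

```python
import math
from collections import defaultdict


def fcm_code_length(seq, k, alpha, alphabet):
    """Bits needed to encode seq[k:] with an adaptive order-k FCM.

    Assumes additive smoothing with alpha > 0; the code length equals the
    negative base-2 log-likelihood of the sequence under the model.
    """
    counts = defaultdict(lambda: defaultdict(int))  # counts[context][symbol]
    totals = defaultdict(int)                       # totals[context]
    size = len(alphabet)
    bits = 0.0
    for t in range(k, len(seq)):
        ctx, sym = seq[t - k:t], seq[t]
        p = (counts[ctx][sym] + alpha) / (totals[ctx] + alpha * size)
        bits -= math.log2(p)
        # Update the counts only after the symbol has been "encoded".
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


def estimate_alpha(seq, k, alphabet, grid):
    """Grid value of alpha with the highest likelihood (shortest code)."""
    return min(grid, key=lambda a: fcm_code_length(seq, k, a, alphabet))


if __name__ == "__main__":
    seq = "ACGT" * 50
    print(estimate_alpha(seq, k=1, alphabet="ACGT", grid=[1 / 16, 1 / 2, 1.0]))
```

Because k is fixed before α is searched, the cost is one pass over the sequence per candidate α, instead of one pass per (k, α) pair as in an exhaustive joint search.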

Taxonomy

Topics: Algorithms and Data Compression · Fractal and DNA sequence analysis · Speech Recognition and Synthesis