Multi-Integration of Labels across Categories for Component Identification (MILCCI)

Noga Mudrik; Yuxi Chen; Gal Mishne; Adam S. Charles

arXiv:2602.04270·cs.LG·February 5, 2026

Open Access·3 Reviews

TL;DR

MILCCI is a new data-driven method that identifies interpretable components in multi-trial temporal data, capturing cross-trial variability and integrating label information to understand category-specific representations.

Contribution

MILCCI extends sparse decomposition techniques to incorporate label similarities, enabling subtle, label-driven adjustments and disentangling category contributions in multi-trial data.

Findings

01. Successfully applied to synthetic data, demonstrating accurate component identification.

02. Effectively analyzed real-world datasets such as voting patterns and neuronal recordings.

03. Enhanced interpretability of temporal components through label integration.

Abstract

Many fields collect large-scale temporal data through repeated measurements (trials), where each trial is labeled with a set of metadata variables spanning several categories. For example, a trial in a neuroscience study may be linked to a value from category (a): task difficulty, and category (b): animal choice. A critical challenge in time-series analysis is to understand how these labels are encoded within the multi-trial observations, and disentangle the distinct effect of each label entry across categories. Here, we present MILCCI, a novel data-driven method that i) identifies the interpretable components underlying the data, ii) captures cross-trial variability, and iii) integrates label information to understand each category's representation within the data. MILCCI extends a sparse per-trial decomposition that leverages label similarities within each category to enable subtle,…
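The sparse decomposition idea mentioned in the abstract can be illustrated with a generic sketch. This is not the MILCCI algorithm: it handles a single trial, ignores labels and categories entirely, and simply alternates a least-squares update of temporal traces with one soft-thresholded (lasso-style) gradient step on the loadings. The function name `alternating_sparse_decomposition`, the model form `Y ≈ A Φᵀ`, the roles assigned to `A` and `Phi`, and the `sparsity` parameter are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of the L1 norm (elementwise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def alternating_sparse_decomposition(Y, k, sparsity=0.1, n_iters=50, seed=0):
    """Hypothetical single-trial sketch: approximate Y (channels x time) as
    A @ Phi.T, with sparse loadings A (channels x k) and temporal traces
    Phi (time x k). Not the authors' implementation."""
    rng = np.random.default_rng(seed)
    n_channels, _ = Y.shape
    A = rng.standard_normal((n_channels, k))
    for _ in range(n_iters):
        # Phi step: ordinary least squares for the traces given fixed loadings.
        Phi = np.linalg.lstsq(A, Y, rcond=None)[0].T
        # Normalize each trace and move its scale into A, so the product
        # A @ Phi.T is unchanged and the L1 penalty cannot be evaded by rescaling.
        norms = np.linalg.norm(Phi, axis=0) + 1e-12
        Phi = Phi / norms
        A = A * norms
        # A step: one proximal-gradient (ISTA) step on
        # 0.5 * ||Y - A Phi^T||_F^2 + sparsity * ||A||_1.
        lipschitz = np.linalg.norm(Phi.T @ Phi, 2) + 1e-12
        grad = (A @ Phi.T - Y) @ Phi
        A = soft_threshold(A - grad / lipschitz, sparsity / lipschitz)
    return A, Phi
```

Calling `alternating_sparse_decomposition(Y, k=3, sparsity=0.01, n_iters=200)` on a channels-by-time array `Y` returns the pair `(A, Phi)`. A label-aware method would additionally need to couple the decompositions of different trials, which this sketch does not attempt.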

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 2·Confidence 3

Strengths

- The setting is practical although challenging because we do not have ground-truth labels to justify the performance rigorously.

Weaknesses

- The writing of the paper is not good with confusing mathematical notions and formulas, making the paper hard to read.
- It is unclear how to use and interpret $A$ and $\Phi$.
- The experiments have no ground-truth. Therefore, it is hard to justify the performance of the proposed approach.

Reviewer 02·Rating 6·Confidence 3

Strengths

- Straightforward model: The model proposed here is straightforward, attributing components of the time series segment to labels in a multi-hot label vector. The optimization is done in a kind of E-M algorithm, an alternation between optimizing the time series "templates" that correspond to different labels and the "projection matrices" that determine the memberships of the templates in the labels.
- Comparison to baselines: The work compares the proposed algorithm to multiple baselines includi

Weaknesses

- No alternatives for the design choice: While the model puts forward a quite reasonable and quite principled architecture, the alternative design choices are not explored in the text. For the E-M-like algorithm that alternates between optimizing A and Phi, is lasso for MSE the best algorithm choice, or would other alternatives (e.g. MSE -> entropy) be better?
- No range description for the US election interpretability: While the readouts of the model on the US elections dataset are linked

Reviewer 03·Rating 6·Confidence 2

Strengths

1. MILCCI breaks new ground by modeling components as category-specific tensors, allowing subtle adjustments to label changes while avoiding the rigidity of fixed-component methods. This design uniquely enables disentangling multi-category label effects, a capability lacking in SiBBlInGS and tensor factorization.
2. The method is rigorously tested across synthetic and real-world data with varying characteristics. It outperforms baselines in synthetic component recovery and produces interpretable

Weaknesses

1. The iterative training process may become computationally expensive for large datasets with many categories or trials (e.g., >10,000 trials). The paper does not report runtime comparisons with baselines or discuss optimizations (e.g., mini-batch training) for scaling.
2. While MILCCI uses sparsity (γ₂) and label consistency (γ₁) hyperparameters, there is no systematic analysis of how these parameters affect performance. For example, how does varying γ₁ impact component consistency across label

Taxonomy

Topics: EEG and Brain-Computer Interfaces · Face Recognition and Perception · Functional Brain Connectivity Studies