Multi-Integration of Labels across Categories for Component Identification (MILCCI)
Noga Mudrik, Yuxi Chen, Gal Mishne, Adam S. Charles

TL;DR
MILCCI is a new data-driven method that identifies interpretable components in multi-trial temporal data, capturing cross-trial variability and integrating label information to understand category-specific representations.
Contribution
MILCCI extends sparse decomposition techniques to incorporate label similarities, enabling subtle, label-driven adjustments and disentangling category contributions in multi-trial data.
Findings
Successfully applied to synthetic data, demonstrating accurate component identification.
Effectively analyzed real-world datasets like voting patterns and neuronal recordings.
Enhanced interpretability of temporal components with label integration.
Abstract
Many fields collect large-scale temporal data through repeated measurements (trials), where each trial is labeled with a set of metadata variables spanning several categories. For example, a trial in a neuroscience study may be linked to a value from category (a): task difficulty, and category (b): animal choice. A critical challenge in time-series analysis is to understand how these labels are encoded within the multi-trial observations, and disentangle the distinct effect of each label entry across categories. Here, we present MILCCI, a novel data-driven method that i) identifies the interpretable components underlying the data, ii) captures cross-trial variability, and iii) integrates label information to understand each category's representation within the data. MILCCI extends a sparse per-trial decomposition that leverages label similarities within each category to enable subtle,…
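To make the labeling setup in the abstract concrete, here is a minimal sketch of one way to encode per-trial labels from several categories as a single multi-hot vector (one one-hot block per category, concatenated). The category names, label values, and the `multi_hot` helper are illustrative assumptions, not taken from the paper, and this is not necessarily how MILCCI represents labels internally.

```python
def multi_hot(trial_labels, categories):
    """Concatenate one one-hot block per category into a single multi-hot vector."""
    vector = []
    for name, values in categories.items():
        block = [0] * len(values)
        # Mark the position of this trial's label within its category.
        block[values.index(trial_labels[name])] = 1
        vector.extend(block)
    return vector


# Hypothetical categories, loosely following the neuroscience example above.
categories = {
    "task_difficulty": ["easy", "medium", "hard"],
    "animal_choice": ["left", "right"],
}
trials = [
    {"task_difficulty": "easy", "animal_choice": "left"},
    {"task_difficulty": "hard", "animal_choice": "right"},
]
encoded = [multi_hot(t, categories) for t in trials]
# encoded[0] is [1, 0, 0, 1, 0]; encoded[1] is [0, 0, 1, 0, 1]
```

Each trial carries exactly one active entry per category, so two trials can share a label in one category while differing in another, which is the situation the method aims to disentangle.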
Peer Reviews
Decision · Submitted to ICLR 2026
- The setting is practical although challenging because we do not have ground-truth labels to justify the performance rigorously.
- The writing of the paper is not good with confusing mathematical notions and formulas, making the paper hard to read.
- It is unclear how to use and interpret $A$ and $\Phi$.
- The experiments have no ground-truth. Therefore, it is hard to justify the performance of the proposed approach.
- Straightforward model: The model proposed here is straightforward, attributing components of the time series segment to labels in a multi-hot label vector. The optimization is done in a kind of E-M algorithm, an alternation between optimizing the time series "templates" that correspond to different labels and the "projection matrices" that determine the memberships of the templates in the labels.
- Comparison to baselines: The work compares the proposed algorithm to multiple baselines includi
- No alternatives for the design choice: While the model puts forward a quite reasonable and quite principled architecture, the alternative design choices are not explored in the text. For the E-M-like algorithm that alternates between optimizing A and Phi, is lasso for MSE the best algorithm choice, or would other alternatives (e.g., MSE -> entropy) be better?
- No range description for the US election interpretability: While the readouts of the model on the US elections dataset are linked
1. MILCCI breaks new ground by modeling components as category-specific tensors, allowing subtle adjustments to label changes while avoiding the rigidity of fixed-component methods. This design uniquely enables disentangling multi-category label effects, a capability lacking in SiBBlInGS and tensor factorization.
2. The method is rigorously tested across synthetic and real-world data with varying characteristics. It outperforms baselines in synthetic component recovery and produces interpretable
1. The iterative training process may become computationally expensive for large datasets with many categories or trials (e.g., >10,000 trials). The paper does not report runtime comparisons with baselines or discuss optimizations (e.g., mini-batch training) for scaling.
2. While MILCCI uses sparsity (γ₂) and label consistency (γ₁) hyperparameters, there is no systematic analysis of how these parameters affect performance. For example, how does varying γ₁ impact component consistency across label
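The reviews above describe an E-M-like alternation between A and Φ with a lasso-style (squared error plus sparsity) objective. The following is a minimal single-trial sketch of that general kind of alternation, not the paper's algorithm. It assumes A is a sparse loading matrix and Φ holds temporal traces (the reviews do not settle which is which), uses a single sparsity weight `gamma` loosely analogous to γ₂, omits the label-consistency term (γ₁) and the category-specific tensor structure entirely, and constrains the rows of Φ to the unit ball to fix the scale ambiguity. All function names are hypothetical.

```python
import numpy as np


def soft_threshold(x, thresh):
    """Proximal operator of thresh * ||x||_1 (elementwise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)


def objective(Y, A, Phi, gamma):
    """0.5 * ||Y - A @ Phi||_F^2 + gamma * ||A||_1."""
    return 0.5 * np.sum((Y - A @ Phi) ** 2) + gamma * np.sum(np.abs(A))


def project_rows_to_unit_ball(Phi):
    """Rescale any row whose L2 norm exceeds 1 back onto the unit ball."""
    norms = np.linalg.norm(Phi, axis=1, keepdims=True)
    return Phi / np.maximum(norms, 1.0)


def alternating_sparse_factorization(Y, n_components, gamma=0.1, n_iters=200, seed=0):
    """Alternate a lasso-type step on A with a constrained least-squares step on Phi."""
    rng = np.random.default_rng(seed)
    n_channels, n_times = Y.shape
    A = np.zeros((n_channels, n_components))
    Phi = project_rows_to_unit_ball(rng.standard_normal((n_components, n_times)))
    losses = [objective(Y, A, Phi, gamma)]
    for _ in range(n_iters):
        # A-step: one proximal-gradient (ISTA) step on the lasso subproblem.
        # ||Phi||_F^2 upper-bounds the Lipschitz constant of the gradient in A.
        step_A = 1.0 / (np.sum(Phi ** 2) + 1e-12)
        grad_A = (A @ Phi - Y) @ Phi.T
        A = soft_threshold(A - step_A * grad_A, step_A * gamma)
        # Phi-step: one projected-gradient step on the squared-error term.
        step_Phi = 1.0 / (np.sum(A ** 2) + 1e-12)
        grad_Phi = A.T @ (A @ Phi - Y)
        Phi = project_rows_to_unit_ball(Phi - step_Phi * grad_Phi)
        losses.append(objective(Y, A, Phi, gamma))
    return A, Phi, losses
```

Because each half-step uses a step size no larger than the inverse Lipschitz constant of its subproblem, the recorded objective should not increase from one iteration to the next. Swapping the squared-error term for another loss, as one reviewer suggests, would change `grad_A` and `grad_Phi` but leave the alternation structure intact.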
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Face Recognition and Perception · Functional Brain Connectivity Studies
