Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labelling

Nataliya Sokolovska (LTCI); Thomas Lavergne (LIMSI); Olivier Cappé (LTCI); François Yvon (LIMSI)

arXiv:0909.1308·cs.LG·May 14, 2015

TL;DR

This paper introduces a sparse learning method for Conditional Random Fields (CRFs) that uses L1 regularization and coordinate descent to improve training efficiency and scalability for sequence labeling tasks.

Contribution

The paper presents a novel coordinate descent algorithm for L1-regularized CRFs, enabling faster training by exploiting sparsity in model parameters.

Findings

01

Exploiting the sparsity of the parameter set significantly speeds up both CRF training and labelling.

02

The speed-up gained from sparsity makes it potentially feasible to handle higher-dimensional models.

03

Empirical comparisons with state-of-the-art CRF training strategies show that the approach takes advantage of sparsity to speed up processing.
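To make the sparsity argument concrete, here is a minimal sketch of why zeroed parameters save work at labelling time. It is an illustration under stated assumptions, not the paper's implementation: the helper names `sparsify` and `score`, the dictionary representation, and the feature names are all hypothetical.

```python
# Hypothetical illustration: parameters that L1 regularization drives to
# exactly zero can be dropped, so scoring only touches the surviving weights.

def sparsify(weights, tol=0.0):
    """Keep only the parameters whose magnitude exceeds tol."""
    return {name: w for name, w in weights.items() if abs(w) > tol}


def score(active_features, sparse_weights):
    """Sum the weights of the active features; dropped features contribute 0."""
    return sum(sparse_weights.get(f, 0.0) for f in active_features)


# Made-up weights for three features, one of which is exactly zero.
dense = {"w=the": 1.5, "w=of": 0.0, "suffix=ing": -0.5}
sparse = sparsify(dense)
total = score(["w=the", "w=of", "suffix=ing"], sparse)
```

The smaller the set of nonzero parameters, the fewer lookups and additions each position in the sequence requires, which is the intuition behind the reported speed-up.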

Abstract

Conditional Random Fields (CRFs) constitute a popular and efficient approach for supervised sequence labelling. CRFs can cope with large description spaces and can integrate some form of structural dependency between labels. In this contribution, we address the issue of efficient feature selection for CRFs based on imposing sparsity through an L1 penalty. We first show how sparsity of the parameter set can be exploited to significantly speed up training and labelling. We then introduce coordinate descent parameter update schemes for CRFs with L1 regularization. We finally provide some empirical comparisons of the proposed approach with state-of-the-art CRF training strategies. In particular, it is shown that the proposed approach is able to take advantage of the sparsity to speed up processing and hence potentially handle higher-dimensional models.
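As a rough illustration of how a coordinate-wise update under an L1 penalty produces exact zeros, the sketch below applies the standard soft-thresholding operator to a single parameter. It assumes a local quadratic model of the smooth part of the objective along that coordinate; the function names and the arguments `grad_k`, `curv_k`, and `rho` are hypothetical, and this generic textbook step is not claimed to be the exact update scheme of the paper.

```python
# Generic sketch of one L1-penalized coordinate update (assumption: the
# smooth loss is approximated along coordinate k by a quadratic with
# gradient grad_k and positive curvature curv_k at the current point).

def soft_threshold(z, t):
    """Shrink z towards zero by t, returning exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def l1_coordinate_update(theta_k, grad_k, curv_k, rho):
    """Return the x minimising
        grad_k * (x - theta_k) + 0.5 * curv_k * (x - theta_k)**2 + rho * |x|.
    """
    if curv_k <= 0:
        raise ValueError("curv_k must be positive")
    return soft_threshold(theta_k - grad_k / curv_k, rho / curv_k)


# A weak gradient is not enough to move the parameter away from zero ...
stays_zero = l1_coordinate_update(theta_k=0.0, grad_k=0.5, curv_k=1.0, rho=1.0)
# ... while a strong one moves it, shrunk by rho / curv_k.
moved = l1_coordinate_update(theta_k=1.0, grad_k=-2.0, curv_k=2.0, rho=1.0)
```

Because the threshold maps a whole interval of values to exactly zero, parameters with weak gradient signal are removed from the model, which is what makes the L1 penalty act as a feature selector.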
