Conformal prediction for text infilling and part-of-speech prediction
Neil Dey, Jing Ding, Jack Ferrell, Carolina Kapper, Maxwell Lovig, Emiliano Planchon, and Jonathan P Williams

TL;DR
This paper introduces conformal prediction algorithms for text infilling and POS tagging, providing statistically reliable confidence sets that are valid in finite samples, demonstrated on the Brown Corpus and real audio transcriptions.
Contribution
It develops new conformal prediction methods integrated with BERT and BiLSTM models for NLP tasks, ensuring valid set predictions with practical applicability.
Findings
Algorithms produce valid confidence sets in simulations
Set sizes are small enough for real-world use
Set predictions improve machine transcription accuracy
Abstract
Modern machine learning algorithms are capable of providing remarkably accurate point-predictions; however, questions remain about their statistical reliability. Unlike conventional machine learning methods, conformal prediction algorithms return confidence sets (i.e., set-valued predictions) that correspond to a given significance level. Moreover, these confidence sets are valid in the sense that they guarantee finite sample control over type 1 error probabilities, allowing the practitioner to choose an acceptable error rate. In our paper, we propose inductive conformal prediction (ICP) algorithms for the tasks of text infilling and part-of-speech (POS) prediction for natural language data. We construct new conformal prediction-enhanced bidirectional encoder representations from transformers (BERT) and bidirectional long short-term memory (BiLSTM) algorithms for POS tagging and a new…
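The inductive conformal prediction idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the common nonconformity score 1 - p(label), and it replaces the BERT and BiLSTM taggers with a hypothetical synthetic classifier (`toy_classifier`) so the example runs on the standard library alone. All function names and parameter values here are illustrative assumptions.

```python
import bisect
import math
import random


def nonconformity(probs, label):
    # Assumed score: one minus the model's probability for the label.
    return 1.0 - probs[label]


def calibrate(cal_probs, cal_labels):
    # Sorted nonconformity scores of the true labels on held-out data.
    return sorted(nonconformity(p, y) for p, y in zip(cal_probs, cal_labels))


def prediction_set(probs, cal_scores, epsilon):
    # cal_scores must be sorted in ascending order.
    n = len(cal_scores)
    keep = []
    for label in range(len(probs)):
        s = nonconformity(probs, label)
        # Number of calibration scores at least as large as s.
        n_ge = n - bisect.bisect_left(cal_scores, s)
        p_value = (n_ge + 1) / (n + 1)
        if p_value > epsilon:
            keep.append(label)
    return keep


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_classifier(rng, n_labels, true_label):
    # Hypothetical stand-in for a trained tagger: Gaussian noise logits
    # with a boost on the true label.
    logits = [rng.gauss(0.0, 1.0) for _ in range(n_labels)]
    logits[true_label] += 2.0
    return softmax(logits)


def simulate(seed=0, n_labels=5, n_cal=500, n_test=2000, epsilon=0.1):
    rng = random.Random(seed)

    def draw(n):
        labels = [rng.randrange(n_labels) for _ in range(n)]
        probs = [toy_classifier(rng, n_labels, y) for y in labels]
        return probs, labels

    cal_probs, cal_labels = draw(n_cal)
    test_probs, test_labels = draw(n_test)
    cal_scores = calibrate(cal_probs, cal_labels)
    sets = [prediction_set(p, cal_scores, epsilon) for p in test_probs]
    coverage = sum(y in s for y, s in zip(test_labels, sets)) / n_test
    avg_size = sum(len(s) for s in sets) / n_test
    return coverage, avg_size


if __name__ == "__main__":
    coverage, avg_size = simulate()
    print(f"empirical coverage: {coverage:.3f}, average set size: {avg_size:.2f}")
```

With exchangeable calibration and test data, the set returned by `prediction_set` omits the true label with probability at most `epsilon`, so the empirical coverage in this toy simulation should sit near 1 - `epsilon`. A sharper underlying classifier leaves the guarantee unchanged but tends to produce smaller sets.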
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · WordPiece · Layer Normalization · Residual Connection · Dense Connections · Attention Dropout · Softmax
