DoubleCCA: Improving Foundation Model Group Robustness with Random Sentence Embeddings
Hong Liu, Yitong Lu

TL;DR
DoubleCCA improves the robustness of foundation models to group-based biases by augmenting prompts with random sentences and using Canonical Correlation Analysis (CCA) to align the resulting embeddings and reconstruct them in the original representation space.
Contribution
We introduce DoubleCCA, a simple method combining random sentence augmentation and CCA to improve foundation model robustness to group biases.
Findings
Outperforms existing methods in robustness and performance.
Easily integrable into current foundation models.
Effective across various tasks and datasets.
Abstract
This paper presents a novel method to improve the robustness of foundation models to group-based biases. We propose a simple yet effective method, called DoubleCCA, that leverages random sentences and Canonical Correlation Analysis (CCA) to enrich the text embeddings of the foundation model. First, we generate various random sentences that augment the original prompts, extending them with random words or character sequences. Second, we use an additional sentence embedding model to generate different text embeddings for these random sentences. We then apply CCA twice to align the representations and reconstruct them back in the original representation space. We demonstrate the effectiveness of our method on a variety of tasks and datasets, showing that it outperforms existing methods in terms of both performance and robustness. Our method is simple to…
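The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `random_sentences`, `toy_embed`, `fit_cca`, and `double_cca` are hypothetical names, `toy_embed` is a stand-in for real text encoders (letter counts passed through a fixed random map), and the way the second CCA is used to map back to the original space is only one plausible reading of "apply CCA twice".

```python
import string
import numpy as np

def random_sentences(prompt, n, rng):
    """Extend a prompt with 2-4 random character sequences (assumed scheme)."""
    letters = list(string.ascii_lowercase)
    out = []
    for _ in range(n):
        words = ["".join(rng.choice(letters, size=int(rng.integers(3, 8))))
                 for _ in range(int(rng.integers(2, 5)))]
        out.append(prompt + " " + " ".join(words))
    return out

def toy_embed(sentences, dim, seed):
    """Stand-in for a real text encoder: letter counts through a random map."""
    proj = np.random.default_rng(seed).normal(size=(27, dim))
    feats = np.zeros((len(sentences), 27))
    for i, s in enumerate(sentences):
        for ch in s.lower():
            feats[i, ord(ch) - 97 if "a" <= ch <= "z" else 26] += 1
    return np.tanh(feats @ proj / 10.0)

def inv_sqrt(S):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T

def fit_cca(X, Y, k, reg=1e-3):
    """Regularized linear CCA via whitening followed by an SVD."""
    mx, my = X.mean(0), Y.mean(0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy, full_matrices=False)
    A = Wx @ U[:, :k]        # projection for X
    B = Wy @ Vt[:k].T        # projection for Y
    return mx, my, A, B, s[:k]  # s[:k] are the canonical correlations

def double_cca(X, Y, k):
    """Assumed two-step use of CCA: align, then map back to the X space."""
    # First CCA: align foundation-model (X) and auxiliary (Y) embeddings.
    mx, my, A, B, _ = fit_cca(X, Y, k)
    Z = 0.5 * ((X - mx) @ A + (Y - my) @ B)  # fused shared-space representation
    # Second CCA: relate the fused representation to the original space.
    mz, mx2, A2, B2, rho = fit_cca(Z, X, k)
    # Predict the X-side variates from the Z-side ones (scaled by rho),
    # then undo the X-side projection with a pseudo-inverse.
    return ((Z - mz) @ A2 * rho) @ np.linalg.pinv(B2) + mx2

rng = np.random.default_rng(0)
sentences = random_sentences("a photo of a bird", 200, rng)
X = toy_embed(sentences, 16, seed=1)   # plays the foundation-model text encoder
Y = toy_embed(sentences, 12, seed=2)   # plays the extra sentence-embedding model
X_hat = double_cca(X, Y, k=4)          # enriched embeddings, same shape as X
```

In practice the two `toy_embed` calls would be replaced by the foundation model's text encoder and a separate sentence embedding model, and the number of canonical components `k` and the ridge term `reg` would need tuning.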
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: ALIGN
