Improved Text Classification via Contrastive Adversarial Training
Abstract
We propose a simple and general method to regularize the fine-tuning of Transformer-based encoders for text classification tasks. Specifically, during fine-tuning we generate adversarial examples by perturbing the word embedding matrix of the model, and we perform contrastive learning on clean and adversarial examples to teach the model noise-invariant representations. By training on both clean and adversarial examples with the additional contrastive objective, we observe consistent improvements over standard fine-tuning on clean examples. On several GLUE benchmark tasks, our fine-tuned BERT-Large model outperforms the BERT-Large baseline by 1.7% on average, and our fine-tuned RoBERTa-Large improves over the RoBERTa-Large baseline by 1.3%. We additionally validate our method in different domains using three intent classification datasets, where our fine-tuned RoBERTa-Large outperforms the RoBERTa-Large baseline by 1-2% on average. For the challenging low-resource scenario, we train our system using half of the training data (per intent) in each of the three intent classification datasets, and achieve performance similar to that of the baseline trained on the full training data.
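To make the training recipe concrete, the following is a minimal NumPy sketch of one loss computation. It is an illustration only, under assumptions not stated in the abstract: the encoder is replaced by a toy mean-pooling of word embeddings with a linear classifier, the perturbation is a single gradient-sign step (FGSM-style) of hypothetical size `epsilon`, the contrastive term is an InfoNCE-style loss with a hypothetical `temperature`, and the three terms are combined with a hypothetical weight `lam`. None of these choices or names should be read as the exact formulation used in this work.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    p = softmax(logits)
    return -np.log(p[np.arange(len(labels)), labels]).mean()

def encode(E, tokens):
    # Toy stand-in for a Transformer encoder (assumption):
    # mean-pool the word embeddings of each sequence -> (batch, dim).
    return E[tokens].mean(axis=1)

def embedding_grad(E, W, tokens, labels):
    # Analytic gradient of the classification loss w.r.t. the embedding matrix E.
    batch, seq_len = tokens.shape
    p = softmax(encode(E, tokens) @ W)
    p[np.arange(batch), labels] -= 1.0           # softmax output minus one-hot
    d_pooled = (p / batch) @ W.T                 # gradient w.r.t. pooled vectors
    grad = np.zeros_like(E)
    # Each token occurrence receives a 1/seq_len share of its sequence's gradient.
    per_token = np.repeat(d_pooled[:, None, :], seq_len, axis=1) / seq_len
    np.add.at(grad, tokens, per_token)
    return grad

def contrastive_loss(h_clean, h_adv, temperature=0.5):
    # InfoNCE-style loss (assumption): each clean representation should be most
    # similar to its own adversarial view among all adversarial views in the batch.
    a = h_clean / np.linalg.norm(h_clean, axis=1, keepdims=True)
    b = h_adv / np.linalg.norm(h_adv, axis=1, keepdims=True)
    return cross_entropy(a @ b.T / temperature, np.arange(len(a)))

def training_loss(E, W, tokens, labels, epsilon=0.01, lam=0.1):
    # Adversarial example: perturb the embedding matrix along the gradient sign.
    E_adv = E + epsilon * np.sign(embedding_grad(E, W, tokens, labels))
    h_clean, h_adv = encode(E, tokens), encode(E_adv, tokens)
    ce_clean = cross_entropy(h_clean @ W, labels)
    ce_adv = cross_entropy(h_adv @ W, labels)
    ctr = contrastive_loss(h_clean, h_adv)
    # Hypothetical weighting of the three terms.
    return ce_clean + ce_adv + lam * ctr, (ce_clean, ce_adv, ctr)

# Usage on random toy data: 20-word vocabulary, 8-dim embeddings, 3 classes.
rng = np.random.default_rng(0)
E, W = rng.normal(size=(20, 8)), rng.normal(size=(8, 3))
tokens = rng.integers(0, 20, size=(4, 5))
labels = rng.integers(0, 3, size=4)
total, (ce_clean, ce_adv, ctr) = training_loss(E, W, tokens, labels)
print(f"clean CE={ce_clean:.4f}  adversarial CE={ce_adv:.4f}  contrastive={ctr:.4f}")
```

In this toy setting the classification loss is convex in the embeddings, so the gradient-sign step cannot lower it and the adversarial cross-entropy is never below the clean one. With a real Transformer encoder the gradient would come from backpropagation rather than a closed form, and that guarantee would no longer hold exactly.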