Fine-Tuning Transformers for NLP
This tutorial demonstrates how to fine-tune two pre-trained transformer models, BERT and DistilBERT, on two NLP tasks: sentiment analysis with the Stanford Sentiment Treebank v2 (SST-2) dataset and duplicate question detection with the Quora Question Pairs (QQP) dataset. Both models come from the HuggingFace Transformers repository, which provides over 60 different network architectures. Training involves creating PyTorch datasets and dataloaders for the training and validation sets, defining a loss function, and implementing training and evaluation loops. The results show that fine-tuning pre-trained Transformers on downstream tasks can save significant time, with performance often high out of the box.
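The sketch below illustrates the workflow the tutorial describes: a PyTorch dataset wrapping tokenized text, a dataloader, and a plain training loop that fine-tunes DistilBERT for binary sentiment classification. It is a minimal illustration, not the article's exact code; the model and tokenizer names are real HuggingFace identifiers, but the inline example sentences are stand-ins for the SST-2 data.

```python
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification

class SentimentDataset(Dataset):
    """Wraps tokenized texts and integer labels as a PyTorch dataset."""
    def __init__(self, texts, labels, tokenizer):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Toy stand-ins for SST-2; in practice, load the real dataset splits.
train_texts = ["a gripping, beautifully made film", "flat and lifeless"]
train_labels = [1, 0]
train_loader = DataLoader(
    SentimentDataset(train_texts, train_labels, tokenizer),
    batch_size=2,
    shuffle=True,
)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(1):
    for batch in train_loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        optimizer.zero_grad()
        # When `labels` is passed, the model computes cross-entropy loss itself.
        outputs = model(**batch)
        outputs.loss.backward()
        optimizer.step()
```

The same structure carries over to the QQP task: swap in a dataset class that tokenizes question pairs (the tokenizer accepts two text arguments) and keep the training loop unchanged.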
Company: AssemblyAI
Date published: June 15, 2021
Author(s): Dillon Pulliam
Word count: 2733
Language: English
Hacker News points: 67