Large Language Models (LLMs) are all the buzz right now. They are used for a variety of tasks, including text classification, question answering, and text generation. In this tutorial, we will show how to conformalize a transformer language model for text classification using ConformalPrediction.jl.
In particular, we are interested in the task of intent classification, as illustrated in the sketch below. First, we feed a customer query into an LLM to generate embeddings. Next, we train a classifier to match these embeddings to possible intents. Of course, for this supervised learning problem we need training data consisting of inputs (queries) and outputs (labels indicating the true intent). Finally, we apply Conformal Prediction to quantify the predictive uncertainty of our classifier.
Conformal Prediction (CP) is a rapidly emerging methodology for Predictive Uncertainty Quantification. If you are unfamiliar with CP, you may want to first check out my three-part introductory series on the topic, starting with this post.
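For readers who want the gist before diving into that series, here is a minimal sketch of split conformal classification in plain Julia. It is one common variant of CP and is chosen here as an assumption for illustration; the helper names `calibrate` and `predict_set` and the toy numbers are hypothetical and are not part of the ConformalPrediction.jl API.

```julia
# Hypothetical helper names; this is not the ConformalPrediction.jl API.

# Calibration step. `probs` is an n × K matrix of predicted probabilities for
# n held-out queries, and `labels[i]` in 1:K is the true intent of query i.
function calibrate(probs::AbstractMatrix, labels::AbstractVector{<:Integer}; coverage=0.9)
    n = length(labels)
    # Nonconformity score: one minus the probability given to the true label.
    scores = sort([1 - probs[i, labels[i]] for i in 1:n])
    # Finite-sample corrected rank of the score that serves as the threshold.
    k = ceil(Int, (n + 1) * coverage)
    # With too few calibration points, no finite threshold gives the guarantee.
    return k > n ? Inf : scores[k]
end

# Prediction step: keep every label whose score does not exceed the threshold.
predict_set(p::AbstractVector, qhat::Real) = findall(s -> s <= qhat, 1 .- p)

# Toy calibration data: 4 queries, 3 intents (made-up numbers).
probs = [0.7 0.2 0.1;
         0.1 0.8 0.1;
         0.3 0.3 0.4;
         0.5 0.4 0.1]
labels = [1, 2, 3, 2]

qhat = calibrate(probs, labels; coverage=0.5)
intent_set = predict_set([0.5, 0.45, 0.05], qhat)
```

The output for a new query is a set of intents rather than a single label. The more uncertain the classifier is, the larger that set tends to be, which is what makes the approach useful for uncertainty quantification.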
We will use the Banking77 dataset (Casanueva et al., 2020), which consists of 13,083 queries from 77 intents related to banking. On the model side, we will use the DistilRoBERTa model, which is a distilled version of RoBERTa (Liu et al., 2019) fine-tuned on the Banking77 dataset.
The model can be loaded from Hugging Face (HF) straight into our running Julia session using the Transformers.jl package.
This package makes working with HF models remarkably easy in Julia. Kudos to the devs! 🙏
Below we load the tokenizer tkr and the model mod. The tokenizer converts the text into a sequence of integers, which is then fed into the model. The model outputs a hidden state, which is passed to a classifier to get the logits for each class. Finally, the logits are passed through a softmax function to get the corresponding predicted probabilities. Below we run a few queries through the model to see how it performs.
# Load model from HF 🤗:
tkr =…