Hugging Face Transformers Tutorial: Transferring Models to the GPU

Deep learning and natural language processing (NLP) have recently attracted significant attention in the field of artificial intelligence. Hugging Face provides user-friendly Transformer models that help researchers and developers perform NLP tasks with ease. In this course, we explain in detail how to use basic Transformer models with the Hugging Face library and how to improve performance through GPU acceleration.

1. What is Hugging Face Transformers?

Hugging Face Transformers is a library that provides pre-trained models for a wide range of natural language processing tasks, including language understanding, text generation, translation, and question answering. The library is designed around an easy-to-use API so that complex deep learning models can be applied with very little code.
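
As a quick illustration of this API, the short sketch below uses the library's pipeline helper for sentiment analysis; the example sentence is made up, and the default model the pipeline downloads is simply whatever Transformers currently uses for sentiment analysis.

from transformers import pipeline

# Create a sentiment-analysis pipeline (downloads a default pre-trained model)
classifier = pipeline("sentiment-analysis")

# Run the pipeline on an example sentence
result = classifier("Hugging Face Transformers makes NLP easy.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': ...}]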

2. Environment Setup

To use Hugging Face Transformers, you need Python and pip installed, plus the necessary libraries. Let’s install them using the command below.

pip install transformers torch

The above command installs the Transformers library and PyTorch. Next, we will run the following code to check if a GPU is available.


import torch

# Check whether a CUDA-capable GPU is visible to PyTorch
print("CUDA availability:", torch.cuda.is_available())
print("Current CUDA device:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "None")

By running the above code, you can check whether CUDA is available and the name of the GPU being used.

3. Loading the Model

Now, let’s learn how to load and use the model. You can load various pre-trained models through Hugging Face’s transformers library. Here, we will demonstrate using the BERT model for text classification as an example.


import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# Send the model to the GPU if one is available, otherwise keep it on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

The above code loads the BERT model and tokenizer and transfers the model to the GPU if one is available.

4. Text Data Preprocessing

It is necessary to preprocess the data before inputting it into the model. Here, we show the process of tokenizing a sentence and generating input tensors.


# Input sentence
text = "Hugging Face's Transformers provide powerful natural language processing technology."
# Tokenization and conversion to indices
inputs = tokenizer(text, return_tensors="pt").to(device)

Here, return_tensors="pt" tells the tokenizer to return PyTorch tensors. Now we are ready to pass the input data to the model.
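
To see what the tokenizer actually produced, you can inspect the returned dictionary; this small check is optional, and the exact keys and shapes depend on the tokenizer and the input sentence.

# Inspect the tokenized inputs (keys typically include input_ids and attention_mask)
print(inputs.keys())
print("input_ids shape:", inputs["input_ids"].shape)
print("Tensor device:", inputs["input_ids"].device)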

5. Model Prediction

The process of making predictions with the model is as follows. We pass the input data to the model and interpret the results using logits.


# Model prediction
with torch.no_grad():
    outputs = model(**inputs)

# Logits output
logits = outputs.logits
predicted_class = logits.argmax(dim=1).item()
print("Predicted class:", predicted_class)

Running the above code will print the model’s predicted class for the input sentence. Note that bert-base-uncased does not include a fine-tuned classification head, so the head is randomly initialized and the prediction is not meaningful until the model has been trained on a classification task (see Section 7).
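
If you also want class probabilities rather than only the most likely class, you can apply a softmax to the logits; this is a small optional addition to the example above.

import torch.nn.functional as F

# Convert the logits into class probabilities
probabilities = F.softmax(logits, dim=1)
print("Class probabilities:", probabilities.tolist())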

6. Batch Processing of Data

In real applications, it is common to process multiple sentences at once. Here is how to process multiple sentences in batches.


texts = [
    "This is the first sentence.",
    "This is the second sentence.",
    "This is the third sentence."
]

# Tokenization and conversion to indices
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to(device)

# Model prediction
with torch.no_grad():
    outputs = model(**inputs)

# Logits output
logits = outputs.logits
predicted_classes = logits.argmax(dim=1).tolist()
print("Predicted classes:", predicted_classes)

Processing multiple sentences at once as shown above allows for more efficient acquisition of the model’s prediction results.
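
For datasets that are too large to tokenize in one go, you would typically loop over the sentences in mini-batches. The sketch below assumes a batch size of 16 and reuses the texts list from above as a stand-in for a larger dataset; both choices are illustrative, not part of the original example.

batch_size = 16    # illustrative batch size
all_texts = texts  # replace with your full list of sentences

predicted = []
model.eval()
for i in range(0, len(all_texts), batch_size):
    batch = all_texts[i:i + batch_size]
    batch_inputs = tokenizer(batch, padding=True, truncation=True, return_tensors="pt").to(device)
    with torch.no_grad():
        batch_logits = model(**batch_inputs).logits
    predicted.extend(batch_logits.argmax(dim=1).tolist())

print("Predicted classes:", predicted)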

7. Optimization and GPU Utilization

When handling large-scale data, it is important to use GPUs to speed up training. The following code shows a simple example of training the model; in this example, we use the AdamW optimizer.


from torch.optim import AdamW

# Optimizer setup
optimizer = AdamW(model.parameters(), lr=5e-5)

# Dummy data and labels
train_texts = ["This is a positive sentence.", "This is a negative sentence."]
train_labels = [1, 0]

# Batch processing
train_inputs = tokenizer(train_texts, padding=True, truncation=True, return_tensors="pt").to(device)
train_labels = torch.tensor(train_labels).to(device)

# Model training
model.train()
for epoch in range(3): # Number of epochs
    optimizer.zero_grad()
    outputs = model(**train_inputs, labels=train_labels)
    loss = outputs.loss
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch + 1}, Loss: {loss.item()}")

The above code is an example of training the model using two simple sentences. It prints the loss at each epoch to monitor the training progress.
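
If training speed or GPU memory becomes a bottleneck, one common optimization is mixed-precision training with torch.cuda.amp. The sketch below adapts the loop above and assumes a CUDA device is available; it is an optional technique, not something this small two-sentence example requires.

from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()  # scales the loss to avoid float16 underflow

model.train()
for epoch in range(3):
    optimizer.zero_grad()
    with autocast():  # run the forward pass in mixed precision
        outputs = model(**train_inputs, labels=train_labels)
        loss = outputs.loss
    scaler.scale(loss).backward()  # backward pass on the scaled loss
    scaler.step(optimizer)         # unscale gradients and update parameters
    scaler.update()
    print(f"Epoch {epoch + 1}, Loss: {loss.item()}")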

8. Saving and Loading the Model

A trained model can be saved and loaded later. The code below shows how to save and load a model.


# Save the model
model.save_pretrained("./model_directory")
tokenizer.save_pretrained("./model_directory")

# Load the model
model = BertForSequenceClassification.from_pretrained("./model_directory")
tokenizer = BertTokenizer.from_pretrained("./model_directory")
model.to(device)

You can save the model and tokenizer, and later load them when needed for use.
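
As a quick sanity check, you can run a prediction with the reloaded model to confirm that saving and loading worked; the test sentence here is just an example.

# Verify the reloaded model with a single prediction
model.eval()
test_inputs = tokenizer("A quick test sentence.", return_tensors="pt").to(device)
with torch.no_grad():
    test_logits = model(**test_inputs).logits
print("Predicted class after reload:", test_logits.argmax(dim=1).item())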

9. Conclusion

In this course, we explained how to perform NLP tasks using the BERT model through the Hugging Face Transformers library and how to optimize performance through GPU utilization. As deep learning becomes increasingly important, developing the ability to use various tools and libraries is essential. We hope to see further advancements in the fields of AI and NLP.