Hugging Face Transformers Course: Installing the Library and Loading a Pre-trained Mobile BERT Model

In the field of deep learning, natural language processing (NLP) plays a very important role, and the BERT (Bidirectional Encoder Representations from Transformers) model in particular is widely used for NLP tasks.
In this course, we will explain how to set up Hugging Face's Transformers library, load a pre-trained Mobile BERT model, and use it.
Mobile BERT is a lightweight variant of BERT, which has the advantage of running efficiently on mobile devices.

1. What is the Hugging Face Transformers Library?

The Hugging Face Transformers library is a Python library that makes it easy to use a wide range of state-of-the-art NLP models.
Through this library, you can load pre-trained models such as BERT, GPT-2, and T5 and apply them to many NLP tasks. It also provides APIs for fine-tuning models on your own data.
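
As a quick taste of the library, the pipeline API wraps model download, tokenization, and inference in a single call. This is only a minimal sketch; when no model name is given, the library downloads a default English sentiment model, and the exact checkpoint and output scores may vary with the library version.

from transformers import pipeline

# With no model specified, a default English sentiment-analysis model is downloaded
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face Transformers makes NLP easy!"))
# Expected output shape: [{'label': 'POSITIVE', 'score': ...}]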

2. Understanding Mobile BERT

Mobile BERT is a lightweight BERT model developed by Google. Traditional BERT models are pre-trained on large-scale datasets and perform strongly, but
their large size makes them difficult to deploy on mobile devices or embedded systems.
Mobile BERT, in contrast, is designed to be much smaller while preserving as much of BERT's performance as possible. Thanks to this characteristic, Mobile BERT is used in a wide range of NLP tasks.
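
If you want a concrete sense of the size difference, the following sketch compares the parameter counts of Mobile BERT and BERT-base. It downloads both checkpoints, so expect a few hundred megabytes of network traffic; the exact numbers depend on the model revision.

from transformers import AutoModel

def num_params(model):
    return sum(p.numel() for p in model.parameters())

mobilebert = AutoModel.from_pretrained("google/mobilebert-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

print(f"MobileBERT parameters: {num_params(mobilebert):,}")
print(f"BERT-base parameters:  {num_params(bert):,}")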

3. Environment Setup and Library Installation

To use Mobile BERT, you first need to install the Hugging Face Transformers library and other necessary libraries.
You can install the required libraries using pip with the following command:

pip install transformers torch

Once installation is completed with the command above, you will be ready to use Mobile BERT in your Python environment.

Note: The command above also installs PyTorch. If you want to run models on a CUDA-capable GPU, make sure the installed PyTorch build matches your CUDA version;
the official PyTorch website provides the appropriate install command for each CUDA version.
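
A quick way to confirm that the installation succeeded, and to check whether PyTorch can see a GPU, is:

import torch
import transformers

# Quick environment check after installation
print("Transformers version:", transformers.__version__)
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())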

4. Loading Pre-trained Models

Now, let’s load the Mobile BERT model. The Hugging Face Transformers library provides several classes to easily use pre-trained models.

4.1 Code Example

The following code loads Mobile BERT:

from transformers import MobileBertTokenizer, MobileBertForSequenceClassification
import torch

# Load the Mobile BERT tokenizer and model
# Note: "google/mobilebert-uncased" contains only the pre-trained encoder; the
# sequence-classification head is randomly initialized, so the prediction below
# is meaningless until the model has been fine-tuned (see section 5)
model_name = "google/mobilebert-uncased"
tokenizer = MobileBertTokenizer.from_pretrained(model_name)
model = MobileBertForSequenceClassification.from_pretrained(model_name)

# Sentence to test
input_text = "The Hugging Face transformer is very useful!"

# Tokenize the input sentence and convert to tensor
inputs = tokenizer(input_text, return_tensors="pt")

# Input the data into the model and predict
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = torch.argmax(logits, dim=-1).item()
print(f"Predicted class: {predicted_class}")

4.2 Code Explanation

Examining each element of the code, we have:

  • from transformers import MobileBertTokenizer, MobileBertForSequenceClassification:
    Imports the tokenizer and model classes for Mobile BERT.
  • model_name = "google/mobilebert-uncased": Sets the name of the pre-trained model to use.
  • tokenizer = MobileBertTokenizer.from_pretrained(model_name): Initializes the tokenizer for the model.
  • model = MobileBertForSequenceClassification.from_pretrained(model_name): Initializes the model for sentence classification.
    Note that the classification head is randomly initialized at this point and needs fine-tuning before its predictions are meaningful.
  • inputs = tokenizer(input_text, return_tensors="pt"): Tokenizes the input sentence and converts it to PyTorch tensors
    (the snippet after this list shows what this object contains).
  • with torch.no_grad():: Disables gradient tracking, which saves memory and computation during inference.
  • logits = model(**inputs).logits: Runs the model and retrieves its raw prediction scores (logits).
  • predicted_class = torch.argmax(logits, dim=-1).item(): Selects the class with the highest score.
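
To make the tokenization step more concrete, the following snippet (reusing the tokenizer loaded above) shows what the tokenizer actually returns; for BERT-style tokenizers this is typically input_ids, token_type_ids, and attention_mask.

# Inspect the tokenizer output for the example sentence
inputs = tokenizer("The Hugging Face transformer is very useful!", return_tensors="pt")
print(inputs.keys())              # typically: input_ids, token_type_ids, attention_mask
print(inputs["input_ids"].shape)  # shape (1, sequence_length) – a batch of one sentence
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))  # the actual subword tokens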

5. A Practical Example Using Mobile BERT

Let’s take a look at an example of performing sentence classification using the Mobile BERT model.
The approach is to classify whether the given sentence is positive or negative.

5.1 Preparing the Dataset

First, we need labeled data. For example, a movie review dataset contains reviews labeled as positive or negative,
and this can be used to train the model. Let's write code to load and inspect a small sample dataset.

import pandas as pd

# Load sample data (5 positive reviews, 5 negative reviews)
data = {
    "text": [
        "I really love this movie.", 
        "It's the best movie!", 
        "It was really moving.", 
        "A perfect masterpiece.", 
        "This movie touched my heart.",
        "This is a waste of time.", 
        "It's bad and boring.", 
        "I was really disappointed.", 
        "Never watch it.", 
        "This movie is the worst."
    ],
    "label": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
}

df = pd.DataFrame(data)
print(df.head())

5.2 Training the Model

Now we move on to training the model on this data. We can write a simple training loop for this purpose.
We will not delve into the details of the training process here (validation, learning-rate scheduling, checkpointing, and so on), but the minimal loop below includes an optimizer step so that the weights are actually updated.

from torch.utils.data import DataLoader, Dataset

class CustomDataset(Dataset):
    def __init__(self, texts, labels, tokenizer):
        self.texts = texts
        self.labels = labels
        self.tokenizer = tokenizer

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        text = self.texts[idx]
        label = self.labels[idx]
        # Tokenize one review; pad/truncate to a fixed length of 128 tokens
        encoding = self.tokenizer(text, return_tensors='pt', padding='max_length', truncation=True, max_length=128)
        return {
            'input_ids': encoding['input_ids'].flatten(),
            'attention_mask': encoding['attention_mask'].flatten(),
            'label': torch.tensor(label, dtype=torch.long)
        }

dataset = CustomDataset(df['text'].values, df['label'].values, tokenizer)
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

# Simple training loop with an AdamW optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()

for epoch in range(3):
    for batch in dataloader:
        optimizer.zero_grad()
        # Passing labels makes the model return the cross-entropy loss directly
        outputs = model(input_ids=batch['input_ids'], attention_mask=batch['attention_mask'], labels=batch['label'])
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        print(f"Epoch {epoch + 1}, Loss: {loss.item()}")

5.3 Predictions and Evaluation

After the model is trained, we can make predictions on new sentences. Let’s verify this with the following example:

test_text = "This movie is very good."
test_inputs = tokenizer(test_text, return_tensors="pt")

# Switch back to evaluation mode so that dropout is disabled during inference
model.eval()
with torch.no_grad():
    test_logits = model(**test_inputs).logits

test_predicted_class = torch.argmax(test_logits, dim=-1).item()
print(f"Predicted class for the test sentence '{test_text}': {test_predicted_class}")

6. Conclusion

In this course, we explored how to install the Hugging Face Transformers library,
load a pre-trained Mobile BERT model, and perform a sentence classification task. Mobile BERT is a lightweight model, making it useful in mobile environments or other resource-constrained settings.
We encourage further research into its applicability to various NLP tasks.

If you found this course helpful, please share it with others! If you have any additional materials or questions, feel free to leave a comment.