Natural language processing (NLP) is one of the most important application areas of deep learning, and BERT (Bidirectional Encoder Representations from Transformers) is one of the most widely used models in the field.
In this course, we will explain how to set up Hugging Face's Transformers library and how to load a pre-trained Mobile BERT model.
Mobile BERT is a lightweight BERT model, which has the advantage of running efficiently on mobile devices.
1. What is the Hugging Face Transformers Library?
The Hugging Face Transformers library is a Python library that makes it easy to use state-of-the-art NLP models.
Through it, you can load pre-trained models such as BERT, GPT-2, and T5 and apply them to a wide range of NLP tasks. It also provides APIs for fine-tuning models on your own data.
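For a quick first impression, the library's pipeline API lets you run a pre-trained model in a couple of lines. The sketch below is only an illustration; it downloads a default English sentiment-analysis model on first use.

from transformers import pipeline

# Create a sentiment-analysis pipeline backed by a default pre-trained model
classifier = pipeline("sentiment-analysis")

# Run inference on a sample sentence; the result is a list of label/score dicts
print(classifier("Hugging Face Transformers makes NLP easy!"))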
2. Understanding Mobile BERT
Mobile BERT is a lightweight BERT model developed by Google. Standard BERT models are pre-trained on large-scale datasets and deliver strong performance, but
their size makes them difficult to deploy on mobile devices or embedded systems.
Mobile BERT, by contrast, is designed to be much smaller while preserving as much of that performance as possible. Thanks to this characteristic, it is used across a variety of NLP tasks.
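If you want to check the size difference yourself, the short sketch below (an illustration, not part of the course code) loads the Mobile BERT encoder and counts its parameters:

from transformers import MobileBertModel

# Load the Mobile BERT encoder and count its trainable parameters
encoder = MobileBertModel.from_pretrained("google/mobilebert-uncased")
num_params = sum(p.numel() for p in encoder.parameters())
print(f"MobileBERT parameters: {num_params / 1e6:.1f}M")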
3. Environment Setup and Library Installation
To use Mobile BERT, you first need to install the Hugging Face Transformers library and other necessary libraries.
You can install the required libraries using pip with the following command:
pip install transformers torch
Once installation is completed with the command above, you will be ready to use Mobile BERT in your Python environment.
If you plan to run the model on a GPU, choose the PyTorch build that matches your CUDA version from the official PyTorch website and install it instead.
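To confirm that everything is installed correctly, you can print the library versions and check whether PyTorch can see a GPU:

import torch
import transformers

# Print the installed versions to confirm the setup
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)

# True if a CUDA-capable GPU is visible to PyTorch
print("CUDA available:", torch.cuda.is_available())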
4. Loading Pre-trained Models
Now, let’s load the Mobile BERT model. The Hugging Face Transformers library provides several classes to easily use pre-trained models.
4.1 Code Example
The following code loads Mobile BERT:
from transformers import MobileBertTokenizer, MobileBertForSequenceClassification
import torch
# Load Mobile BERT model and tokenizer
model_name = "google/mobilebert-uncased"
tokenizer = MobileBertTokenizer.from_pretrained(model_name)
model = MobileBertForSequenceClassification.from_pretrained(model_name)
# Sentence to test
input_text = "The Hugging Face transformer is very useful!"
# Tokenize the input sentence and convert to tensor
inputs = tokenizer(input_text, return_tensors="pt")
# Input the data into the model and predict
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class = torch.argmax(logits, dim=-1).item()
print(f"Predicted class: {predicted_class}")
4.2 Code Explanation
Examining each element of the code:

from transformers import MobileBertTokenizer, MobileBertForSequenceClassification: imports the Mobile BERT model and tokenizer classes.
model_name = "google/mobilebert-uncased": sets the name of the pre-trained model to use.
tokenizer = MobileBertTokenizer.from_pretrained(model_name): initializes the tokenizer for the model.
model = MobileBertForSequenceClassification.from_pretrained(model_name): initializes the model with a sequence-classification head, making it suitable for sentence classification tasks. Note that this head is not part of the pre-trained checkpoint, so its weights start out randomly initialized and should be fine-tuned (as in Section 5) before the predictions become meaningful.
inputs = tokenizer(input_text, return_tensors="pt"): tokenizes the input sentence and converts it to PyTorch tensors.
with torch.no_grad():: disables gradient tracking, which saves memory during inference.
logits = model(**inputs).logits: retrieves the raw prediction scores (logits) produced by the model.
predicted_class = torch.argmax(logits, dim=-1).item(): selects the class with the highest score.
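Note that the logits are raw scores rather than probabilities. If you need class probabilities, you can apply a softmax to the logits, as in this small sketch that reuses the logits tensor from the example above:

import torch.nn.functional as F

# Convert the raw logits into a probability distribution over the classes
probs = F.softmax(logits, dim=-1)
print(probs)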
5. A Practical Example Using Mobile BERT
Let’s take a look at an example of performing sentence classification using the Mobile BERT model.
The approach is to classify whether the given sentence is positive or negative.
5.1 Preparing the Dataset
First, we need to prepare labeled data. For example, a movie review dataset split into positive and negative reviews
can be used to train the model. Let's write code to load and preprocess the data.
import pandas as pd
# Load sample data (5 positive reviews, 5 negative reviews)
data = {
    "text": [
        "I really love this movie.",
        "It's the best movie!",
        "It was really moving.",
        "A perfect masterpiece.",
        "This movie touched my heart.",
        "This is a waste of time.",
        "It's bad and boring.",
        "I was really disappointed.",
        "Never watch it.",
        "This movie is the worst."
    ],
    "label": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
}
df = pd.DataFrame(data)
print(df.head())
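Before moving on to training, it can help to see what the tokenizer actually produces for one review. The sketch below is purely illustrative and reuses the tokenizer loaded in Section 4:

# Tokenize the first review and inspect part of the encoded output
sample = tokenizer(df['text'][0], padding='max_length', truncation=True, max_length=128)
print(sample['input_ids'][:10])       # first 10 token ids (sequence is padded to max_length=128)
print(sample['attention_mask'][:10])  # 1 for real tokens, 0 for padding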
5.2 Training the Model
Now we move on to training the model on this data. We can write a simple training loop for this.
We will not go into every detail of the fine-tuning process here; instead, we perform a simple transfer-learning run on the sample data, keeping only the essential steps.
from torch.utils.data import DataLoader, Dataset
from torch.optim import AdamW

class CustomDataset(Dataset):
    def __init__(self, texts, labels, tokenizer):
        self.texts = texts
        self.labels = labels
        self.tokenizer = tokenizer

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        text = self.texts[idx]
        label = self.labels[idx]
        encoding = self.tokenizer(text, return_tensors='pt', padding='max_length', truncation=True, max_length=128)
        return {'input_ids': encoding['input_ids'].flatten(),
                'attention_mask': encoding['attention_mask'].flatten(),
                'label': torch.tensor(label, dtype=torch.long)}

dataset = CustomDataset(df['text'].values, df['label'].values, tokenizer)
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

# Simple training loop (learning-rate scheduling, validation, etc. are omitted)
optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):
    for batch in dataloader:
        optimizer.zero_grad()
        outputs = model(input_ids=batch['input_ids'], attention_mask=batch['attention_mask'], labels=batch['label'])
        loss = outputs.loss
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch + 1}, Loss: {loss.item()}")
5.3 Predictions and Evaluation
After the model is trained, we can make predictions on new sentences. Let’s verify this with the following example:
# Switch to evaluation mode before running inference
model.eval()

test_text = "This movie is very good."
test_inputs = tokenizer(test_text, return_tensors="pt")

with torch.no_grad():
    test_logits = model(**test_inputs).logits

test_predicted_class = torch.argmax(test_logits, dim=-1).item()
print(f"Predicted class for the test sentence '{test_text}': {test_predicted_class}")
6. Conclusion
In this course, we explored how to set up the Hugging Face Transformers library,
load a pre-trained Mobile BERT model, and perform sentence classification tasks. Mobile BERT is a lightweight model, making it useful in mobile environments or other resource-constrained settings.
We encourage further research into its applicability to various NLP tasks.
If you found this course helpful, please share it with others! If you have any additional materials or questions, feel free to leave a comment.