Transformer Course with Hugging Face: Writing with DistilGPT-2

In this course, we will practice text generation using Hugging Face's Transformers library with the DistilGPT-2 model. DistilGPT-2 is a distilled version of OpenAI's GPT-2, trained via knowledge distillation to be smaller and faster while retaining much of GPT-2's generation quality. It can be used for various natural language processing (NLP) tasks and is a good fit for lightweight automatic writing.

1. Preparation

To use the DistilGPT-2 model, you need to install the libraries below.

pip install transformers torch
  • transformers: Hugging Face Transformers library
  • torch: PyTorch deep learning library

Use the command above to install all the necessary libraries.
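
To confirm the installation, you can import both packages and print their versions; this is just a quick sanity check, and both imports should succeed without errors:

import torch
import transformers

# Print the installed versions to confirm both libraries are available
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)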

2. Loading the Model

To use the model, first import the necessary libraries, then load the DistilGPT-2 model and tokenizer.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the DistilGPT-2 model and tokenizer
# (DistilGPT-2 uses the GPT-2 architecture, so the Auto classes resolve it correctly)
tokenizer = AutoTokenizer.from_pretrained('distilgpt2')
model = AutoModelForCausalLM.from_pretrained('distilgpt2')

This code downloads (on first run) and loads the pre-trained distilgpt2 checkpoint from the Hugging Face Hub. Now we can use the model to generate text.
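
Optionally, you can switch the model to evaluation mode and, if a GPU is available, move it there. This is a minimal sketch; if you move the model, any input tensors passed to generate must be moved to the same device as well:

import torch

# Inference only: evaluation mode disables dropout
model.eval()

# Use a GPU if available (inputs must then be moved with .to(device) as well)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)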

3. Generating Text

Now, let's generate text using the prepared model. Given the beginning of a sentence as a prompt, the model predicts and appends the words that follow.

# Define a text generation function
def generate_text(prompt, max_length=50):
    # Tokenize the input text into a tensor of token IDs
    inputs = tokenizer.encode(prompt, return_tensors='pt')

    # Generate text with the model (greedy decoding by default);
    # pad_token_id is set explicitly because GPT-2 has no padding token
    outputs = model.generate(
        inputs,
        max_length=max_length,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id
    )

    # Decode the generated token IDs back into a string
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Deep learning is"
generated_text = generate_text(prompt)
print(generated_text)

In the code above, the generate_text function continues the given prompt up to a total of max_length tokens (the count includes the prompt itself). It tokenizes the input text using tokenizer.encode, generates token IDs using model.generate, and then decodes them back into a string with tokenizer.decode.
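
To see what tokenizer.encode actually passes to the model, you can inspect the encoded prompt. A small sketch using the tokenizer loaded earlier:

# Inspect how a prompt is tokenized before it reaches the model
ids = tokenizer.encode("Deep learning is", return_tensors='pt')
print(ids)                                                # token IDs, shape (1, n)
print(tokenizer.convert_ids_to_tokens(ids[0].tolist()))   # the matching subword tokens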

4. Adjusting Various Text Generation Parameters

By adjusting various parameters during text generation, you can change the character of the output. Note that temperature, top_k, and top_p only take effect when sampling is enabled with do_sample=True. The main parameters are:

  • max_length: Maximum total length of the generated text, in tokens (prompt included)
  • num_return_sequences: Number of text sequences to generate
  • temperature: A lower value (e.g., 0.7) generates more predictable text, while a higher value (e.g., 1.5) increases diversity
  • top_k: Sample only from the k most likely next tokens
  • top_p: Sample only from the smallest set of tokens whose cumulative probability reaches p (nucleus sampling)

Now let’s look at an example of generating text using these parameters:

# Generate text using various sampling parameters
def generate_text_with_params(prompt, max_length=50, temperature=1.0, top_k=50, top_p=0.9):
    inputs = tokenizer.encode(prompt, return_tensors='pt')
    outputs = model.generate(
        inputs,
        max_length=max_length,
        num_return_sequences=1,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        do_sample=True,  # sampling must be on for temperature/top_k/top_p to apply
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "The advancement of artificial intelligence"
generated_text = generate_text_with_params(prompt, temperature=0.7, top_k=30, top_p=0.85)
print(generated_text)

In the generate_text_with_params function above, the additional parameters control how the next token is sampled at each step, letting you trade predictability against diversity in the generated text.
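
As a quick experiment, you can generate from the same prompt at two different temperatures and compare the outputs. A small sketch using the function above:

# Compare a conservative and a more adventurous sampling setting
prompt = "The advancement of artificial intelligence"
for temp in (0.7, 1.5):
    print(f"--- temperature={temp} ---")
    print(generate_text_with_params(prompt, temperature=temp))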

5. Practice: Writing a Novel

This time, let's write a short story using what we have learned above. We will generate a story continuation from a prompt provided by the user.

# Example of novel writing
story_prompt = "Once upon a time, a sudden storm struck a peaceful village. The residents were all"
story = generate_text_with_params(story_prompt, max_length=150, temperature=0.9)
print(story)

This code starts from the given prompt and generates up to 150 tokens of text (including the prompt). With the higher temperature setting, the output tends to be more varied and creative. The generated story gives you a feel for DistilGPT-2's text generation capability.
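
If you would like several candidate continuations to choose from, num_return_sequences can return more than one sample per call. A small sketch reusing story_prompt from above; each row of outputs is one generated sequence:

# Generate three alternative continuations of the same story prompt
inputs = tokenizer.encode(story_prompt, return_tensors='pt')
outputs = model.generate(
    inputs,
    max_length=150,
    num_return_sequences=3,
    do_sample=True,
    temperature=0.9,
    pad_token_id=tokenizer.eos_token_id
)
for i, output in enumerate(outputs, start=1):
    print(f"--- Version {i} ---")
    print(tokenizer.decode(output, skip_special_tokens=True))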

6. Results and Feedback

Read the sentences generated above and note your impressions: how the text reads, and where the model exceeded or fell short of your expectations. This feedback helps identify where the model can be improved and where it is most useful.

7. Conclusion

In this course, we learned how to use the DistilGPT-2 model through the Hugging Face Transformers library. The model shows real promise in natural language generation (NLG) and can be applied to a variety of text generation tasks, helping us further explore the creative potential of artificial intelligence.

Try applying the model to other natural language processing tasks as well. Combined with your own ideas, it can produce very interesting results.
