Fetching Pfizer COVID-19 Wikipedia Text
In this tutorial, we will fetch the Wikipedia article on Pfizer’s COVID-19 vaccine with the wikipedia-api package and summarize it with the Hugging Face Transformers library. It is aimed at readers who have a basic knowledge of natural language processing (NLP) and walks through using both libraries from Python.
1. Environment Setup
First, we need to install the necessary libraries. Run the command below to install transformers and wikipedia-api. The leading ! runs the command from a notebook cell; omit it when working in a terminal.
!pip install transformers wikipedia-api
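To confirm that the installation worked, a small check like the one below can help. It is only a sketch: the helper name missing_packages and the REQUIRED mapping are made up for this example, and listing torch assumes PyTorch is the backend you want for the summarization pipeline. Note that the pip package wikipedia-api is imported under the module name wikipediaapi.

```python
import importlib.util

# Maps the pip package name to the module name used in import statements.
# "torch" is an assumption: pipelines need a deep learning backend, and
# PyTorch is a common choice.
REQUIRED = {
    "transformers": "transformers",
    "wikipedia-api": "wikipediaapi",
    "torch": "torch",
}


def missing_packages(required):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages(REQUIRED)
if missing:
    print("Please install:", " ".join(missing))
else:
    print("All required packages are available.")
```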
2. Importing Libraries
Let’s import the necessary libraries. transformers makes it easy to use pretrained natural language processing models, and wikipedia-api (imported as wikipediaapi) provides simple access to the Wikipedia API.
import wikipediaapi
from transformers import pipeline
3. Fetching Information from Wikipedia
Now, let’s fetch the Wikipedia article on Pfizer’s COVID-19 vaccine. We will use wikipediaapi to get the text.
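# Recent versions of wikipedia-api require a descriptive user agent; the value below is a placeholder.
wiki_wiki = wikipediaapi.Wikipedia(user_agent="PfizerTutorial/0.1 (you@example.com)", language="en")
# The English Wikipedia article is titled "Pfizer–BioNTech COVID-19 vaccine".
page = wiki_wiki.page("Pfizer–BioNTech COVID-19 vaccine")
if page.exists():
    print(page.text[:1000])  # Print the first 1000 characters
else:
    print("The page does not exist.")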
Code Explanation
The code above requests the article on Pfizer’s COVID-19 vaccine from Wikipedia. If the page exists, it prints the first 1000 characters so that we can check we fetched the right content; otherwise, it reports that the page does not exist.
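Beyond the first 1000 characters, it can be useful to see how the article is organized before summarizing it. The sketch below walks a tree of sections and reports each title with the length of its text. It relies only on the title, text, and sections attributes that wikipedia-api section objects provide; the helper names outline and format_outline are made up for this example, and the stand-in data exists only so the code runs offline.

```python
from types import SimpleNamespace


def outline(sections, depth=0):
    """Return (depth, title, character count) for every section, depth-first."""
    rows = []
    for section in sections:
        rows.append((depth, section.title, len(section.text)))
        rows.extend(outline(section.sections, depth + 1))
    return rows


def format_outline(rows):
    """Render the rows as an indented, human-readable outline."""
    return "\n".join(
        f"{'  ' * depth}{title} ({count} chars)" for depth, title, count in rows
    )


# Stand-in sections so the sketch runs without network access.
# With a real page, pass page.sections instead.
trials = SimpleNamespace(title="Trials", text="Phase III results.", sections=[])
development = SimpleNamespace(
    title="Development", text="Early work.", sections=[trials]
)
safety = SimpleNamespace(title="Safety", text="", sections=[])

print(format_outline(outline([development, safety])))
```

With the page fetched earlier, the call would be print(format_outline(outline(page.sections))).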
4. Summarizing the Text
Since the fetched text is long, let’s summarize it with a natural language processing model. We will use the summarization pipeline provided by the Hugging Face transformers library.
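# With no model specified, the pipeline downloads a default summarization model.
summarizer = pipeline("summarization")
# truncation=True cuts the article to the model's input limit; a full article is too long otherwise.
summary = summarizer(page.text, max_length=130, min_length=30, do_sample=False, truncation=True)
print("Summary:")
for s in summary:
    print(s['summary_text'])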
Code Explanation
This code summarizes the text with the Hugging Face “summarization” pipeline. You can adjust the length of the summary with max_length and min_length, which are measured in tokens rather than characters.
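Summarization models accept only a limited number of input tokens, so a full Wikipedia article usually cannot be summarized in a single call. One way to cover the whole article is to split it into chunks, summarize each chunk, and join the results. The sketch below illustrates that idea under a few assumptions: the helper names split_into_chunks and summarize_long are made up for this example, the sentence splitting is deliberately naive, and the 400-word budget is a rough guess meant to stay under a typical model limit, not a value taken from any model’s documentation.

```python
import re


def split_into_chunks(text, max_words=400):
    """Group sentences into chunks of roughly max_words words.

    A single sentence longer than max_words is kept whole in its own chunk.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_long(text, summarizer, max_words=400, **kwargs):
    """Summarize each chunk with the given pipeline and join the partial summaries."""
    parts = []
    for chunk in split_into_chunks(text, max_words):
        # truncation=True is a safety net in case a chunk is still too long.
        result = summarizer(chunk, truncation=True, **kwargs)
        parts.append(result[0]["summary_text"])
    return " ".join(parts)
```

With the objects from this section, the call would look like summarize_long(page.text, summarizer, max_length=130, min_length=30, do_sample=False). Expect this to take noticeably longer than a single call, because the model runs once per chunk.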
5. Conclusion
In this tutorial, we learned how to fetch the Wikipedia article on Pfizer’s COVID-19 vaccine with the Wikipedia API and summarize it with Hugging Face Transformers. We hope this gave you a glimpse of what natural language processing can do. The same fetch-and-summarize pattern works for any other Wikipedia article, so you can reuse it in your own projects.
6. Next Steps
As a next step, try other natural language processing tasks such as sentiment analysis, question answering, and document classification. The Hugging Face model hub is a good place to find models that suit your task.
Thank you!