Research IT

Web Scraping with Python – a guide

Looking for an easy way to get started with online data collection that doesn’t involve copying and pasting? Research Software Engineer Chris Lam has written a simple guide which introduces web scraping with Python.


Researchers frequently initiate their data collection by exploring online resources, seeking valuable information from websites. While some data providers offer convenient download options, others provide no such functionality, leaving researchers to rely on manual copying and pasting. This approach, though functional, is time-consuming, error-prone, and inefficient.

Automating these repetitive tasks can significantly enhance productivity. The guide introduces web scraping using Python, focusing on three prominent libraries: requests, BeautifulSoup, and Selenium. Written for readers with minimal programming experience, the guide compares these tools and provides practical examples to facilitate effective web scraping. By the end, readers will be equipped to select the most suitable tool for their target website.

The guide is available on GitHub. It discusses the distinct capabilities of each library and illustrates them with practical examples, including one that uses Bluesky, an X (Twitter)-like website.
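
To give a flavour of what a scraping script can look like, here is a minimal sketch. It is not taken from the guide: the HTML snippet, the table id "results" and the function name extract_rows are all made up for illustration. In practice the HTML would usually be downloaded first, for example with requests.get(url).text, and a page that builds its content with JavaScript may need Selenium instead.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
# With requests, this string would typically come from requests.get(url).text.
SAMPLE_HTML = """
<html><body>
  <table id="results">
    <tr><th>Site</th><th>Count</th></tr>
    <tr><td>A</td><td>12</td></tr>
    <tr><td>B</td><td>7</td></tr>
  </table>
</body></html>
"""


def extract_rows(html):
    """Return the data rows of the (assumed) 'results' table as lists of strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="results")
    rows = []
    # Skip the first <tr>, which holds the column headers.
    for tr in table.find_all("tr")[1:]:
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
print(rows)
```

The same few lines replace what would otherwise be a copy-and-paste job for every row of the table, which is where the time saving comes from on larger pages.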

If you are interested in learning more about Python, why not join the University Python User Group to exchange insights with fellow Python enthusiasts?