How to Install the BeautifulSoup Python Module on Ubuntu Linux

Learn how to install BeautifulSoup, a Python library for web scraping and for parsing HTML and XML documents, on Ubuntu 22.04, 20.04, or any other version of this Linux distribution.

Python’s ecosystem offers many libraries that make developers’ work easier, and BeautifulSoup is one of them. Its name may not sound geeky, but it is quite useful for anyone coding in Python. It provides a convenient way to extract data from web pages, search for specific elements, and retrieve their contents or attributes, which makes it valuable for data analysis and web scraping tasks.

Its key features: it parses HTML and XML documents; lets you search for specific elements in the parse tree; lets you navigate the parse tree by traversing parents, siblings, and children; can detect the encoding of a document and convert it to Unicode; and works alongside other libraries, such as pandas or CSV libraries, for storing the extracted data.
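
To make the searching and navigating features more concrete, here is a minimal sketch. The HTML snippet and the variable names are made up for illustration only, and the code assumes BeautifulSoup is already installed as described in the steps that follow.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = """
<ul id="fruits">
  <li class="item">Apple</li>
  <li class="item">Banana</li>
  <li class="item">Cherry</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Searching: collect every <li> element that has the class "item".
items = soup.find_all("li", class_="item")
names = [li.get_text() for li in items]

# Navigating: go from the first item up to its parent,
# and across to the next <li> sibling.
first = items[0]
parent_id = first.parent["id"]
next_name = first.find_next_sibling("li").get_text()

print(names)
print(parent_id)
print(next_name)
```

Here find_all() does the searching, while .parent and find_next_sibling() move around the parse tree starting from an element you already have.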

In this article, we go through the commands to install Beautiful Soup on Ubuntu Linux.

Step 1: Install Python PIP, if not already installed

The easiest way to install a Python library is to use Python’s package manager, PIP. If you already have it installed, you can skip this step.

To check whether it is available or not, use:

pip -V

If the output shows the version details, then PIP is there on your system.
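
If you prefer to check from Python itself, the short sketch below does roughly the same thing using only the standard library. It is just an illustration; the variable names are our own, and the result depends on how Python is set up on your machine.

```python
import importlib.util
import shutil

# True if the "pip" module can be imported by this Python interpreter.
pip_module_found = importlib.util.find_spec("pip") is not None

# Full path of the pip (or pip3) command if it is on PATH, otherwise None.
pip_command_path = shutil.which("pip") or shutil.which("pip3")

print("pip module importable:", pip_module_found)
print("pip command on PATH:", pip_command_path)
```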

However, if you don’t have PIP (the Python package installer) installed, here are the commands to get it:

sudo apt update
sudo apt install python3-pip

Step 2: Install BeautifulSoup on Ubuntu

Once you have confirmed that PIP is available on your Ubuntu system, you can use it to install Python libraries, including BeautifulSoup. In your Ubuntu terminal, run:

pip install beautifulsoup4 --user

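Alternatively, if you don’t want to use PIP, or if PIP refuses to install into the system Python (newer Ubuntu releases mark it as externally managed), you can use the APT package manager. Here is the command for that: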

sudo apt install python3-bs4

Step 3: Check BeautifulSoup Version

To confirm that BeautifulSoup was actually installed by the previous step and is available to your Python scripts, use the command below. It shows the version of the specified library and the location where it is installed.

python3 -m pip show beautifulsoup4
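
The same check can also be made from inside Python. The following is a minimal sketch using the standard importlib.metadata module (available in Python 3.8 and later); the helper name installed_version is our own and not part of any library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


bs4_version = installed_version("beautifulsoup4")
if bs4_version is None:
    print("beautifulsoup4 is not installed")
else:
    print(f"beautifulsoup4 {bs4_version} is installed")
```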

Step 4: Run a simple test script

Let’s create a simple script that imports BeautifulSoup and parses a small piece of HTML.

Use any text editor to create a new Python file named, for example, test_beautifulsoup.py. Here we are using nano (nano test_beautifulsoup.py).

Add the following code to the file:

from bs4 import BeautifulSoup

html_doc = """
<html>
<head>
    <title>Example HTML Page</title>
</head>
<body>
    <h1>Welcome to Beautiful Soup</h1>
    <p>This is an example paragraph.</p>
</body>
</html>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
print(soup.prettify())

Save the file by pressing Ctrl+X, then type Y and hit the Enter key.

Run the created Python file:

python3 test_beautifulsoup.py

You will see the parsed and prettified HTML in the output on your Ubuntu terminal.
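
Printing the whole document is only a smoke test. As a small follow-up sketch, the same HTML can be used to pull out individual pieces of text; the variable names here are chosen just for this example.

```python
from bs4 import BeautifulSoup

html_doc = """
<html>
<head>
    <title>Example HTML Page</title>
</head>
<body>
    <h1>Welcome to Beautiful Soup</h1>
    <p>This is an example paragraph.</p>
</body>
</html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Access the first matching tag directly by name, then read its text.
title_text = soup.title.string
heading_text = soup.h1.get_text()

# find() returns the first element that matches the given tag name.
paragraph_text = soup.find("p").get_text()

print(title_text)
print(heading_text)
print(paragraph_text)
```

If these three lines print the title, heading, and paragraph text, the library is installed and working for basic parsing.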


How to Upgrade

If a new version of the BeautifulSoup library becomes available in the future, upgrade it with:

pip install --upgrade beautifulsoup4 --user

Uninstallation (optional)

If, after some time, you no longer need the BeautifulSoup library on your Ubuntu Linux system, you can again use PIP to remove it. Here is the command:

pip uninstall beautifulsoup4

Ending note:

Beautiful Soup is a very useful Python library for web scraping, and as you have seen, installing it on Ubuntu Linux is quite simple. You can also use the PIP command given in this article to install the Beautiful Soup module on other operating systems.
