Installing the lxml Library for Python on Ubuntu Linux

lxml is a Python library for parsing and manipulating XML and HTML documents. It is built on the C libraries libxml2 and libxslt, which makes XML and HTML processing fast. That speed makes lxml a popular choice for web scraping, data extraction, and similar tasks. If you use Python on Ubuntu Linux and want to install lxml, the steps in this tutorial will walk you through it.

Prerequisites

  • An Ubuntu system, such as Ubuntu 24.04, 22.04, or 20.04. The steps are not tied to a single version and apply to most Ubuntu releases.
  • The system must have Python 3
  • User with sudo rights
  • Internet connection

Python LXML Installation Steps

1. Start with the Ubuntu Package update:

First, refresh the package index so that apt knows about the latest available package versions, including security updates. Open your terminal and run:

sudo apt update

Although it is not required for installing lxml, you can also run the upgrade command to install the pending package and security updates:

sudo apt upgrade

2. Install Dependencies:

lxml relies on the C libraries libxml2 and libxslt. The command below installs Python 3 along with the development files for Python, libxml2, and libxslt, which are needed whenever pip has to build lxml from source:

sudo apt install libxml2-dev libxslt-dev python3-dev python3

3. Install lxml Using pip:

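The recommended way to install lxml is through pip, Python’s package installer. If pip is not installed yet, run “sudo apt install python3-pip” first, then use the command below. On recent releases such as Ubuntu 24.04, pip may refuse to install into the system Python with an “externally-managed-environment” error. In that case, run the command inside a virtual environment (created with “python3 -m venv”, which is provided by the python3-venv package) or install Ubuntu’s packaged version with “sudo apt install python3-lxml”.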

pip3 install lxml

This command will download and install the latest version of lxml from the Python Package Index (PyPI).
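Whether pip can install into the current interpreter depends on whether it runs inside a virtual environment and whether the system Python is marked as externally managed (PEP 668). The following is a minimal sketch for checking both before you run pip. The helper names in_virtualenv and is_externally_managed are made up for this illustration and are not part of pip or lxml.

```python
import sys
import sysconfig
from pathlib import Path


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def is_externally_managed() -> bool:
    """Return True when the PEP 668 marker file sits in the stdlib directory."""
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    return marker.is_file()


if __name__ == "__main__":
    # Check the virtual environment first: pip ignores the marker inside one
    if in_virtualenv():
        print("Virtual environment detected: pip should install lxml here.")
    elif is_externally_managed():
        print("System Python is externally managed: use a virtual environment or apt.")
    else:
        print("No PEP 668 marker found: pip3 should be able to install lxml.")
```

Save it under any file name and run it with python3 to see which of the three cases applies to your system.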

4. Verify the Installation:

After the installation completes, use a short Python command to confirm that lxml can be imported. The command below prints the installed lxml version, which shows that the library is available on the system:

 python3 -c "import lxml; print(lxml.__version__)"
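If you also want to know which versions of the underlying C libraries lxml is using, the etree module exposes them as version tuples. This is a small sketch of that check, which can be handy when reporting problems:

```python
from lxml import etree

# Version of lxml itself, as a string and as a tuple of integers
print("lxml:", etree.__version__)
print("lxml (tuple):", etree.LXML_VERSION)

# Versions of the C libraries that lxml is running against
print("libxml2:", etree.LIBXML_VERSION)
print("libxslt:", etree.LIBXSLT_VERSION)
```

The exact numbers depend on how lxml was installed, so they will differ between a pip install and Ubuntu’s packaged version.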

5. Example of Using lxml in Python 3:

Since lxml’s main job is parsing XML and HTML documents, this example shows how to parse an XML file with lxml, read data from it, and then modify it.

Example:

Create an XML file:

nano demo.xml

Paste the following content, then save the file by pressing Ctrl+X, typing Y, and hitting the Enter key.

<library>
    <book>
        <title>Learning Python</title>
        <author>Mark Lutz</author>
        <year>2013</year>
    </book>
    <book>
        <title>Automate the Boring Stuff with Python</title>
        <author>Al Sweigart</author>
        <year>2015</year>
    </book>
</library>

Next, create a Python script that prints the existing book titles, adds a new book to the data read from the XML file above, and saves the result to a new file.

nano book.py

Paste the code and save the file:

from lxml import etree

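# Load and parse the XML file. Dropping whitespace-only text nodes
# lets pretty_print re-indent the newly added elements when saving.
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse('demo.xml', parser)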

# Get the root element
root = tree.getroot()

# Print out the title of each book
print("Book Titles in the Library:")
for book in root.findall('book'):
    title = book.find('title').text
    print(title)

# Add a new book to the library
new_book = etree.SubElement(root, "book")
title = etree.SubElement(new_book, "title")
title.text = "Python Crash Course"
author = etree.SubElement(new_book, "author")
author.text = "Eric Matthes"
year = etree.SubElement(new_book, "year")
year.text = "2016"

# Save the modified XML to a new file
tree.write('modified_library.xml', pretty_print=True, xml_declaration=True, encoding="UTF-8")

print("\nA new book has been added to the library and saved to 'modified_library.xml'.")

Run the above-created Python script:

python3 book.py

Now, if you check the newly created XML file, i.e. “modified_library.xml”, you will see the new book entry alongside the original two.

(Screenshot: LXML Python usage example)
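Besides find() and findall(), lxml also supports XPath queries, which are convenient for filtering data. The sketch below assumes the same structure as demo.xml but inlines the XML as a string so it runs without any files. The year threshold is an arbitrary value chosen for illustration.

```python
from lxml import etree

# Same structure as demo.xml, inlined so the sketch needs no files
xml_data = b"""<library>
    <book>
        <title>Learning Python</title>
        <author>Mark Lutz</author>
        <year>2013</year>
    </book>
    <book>
        <title>Automate the Boring Stuff with Python</title>
        <author>Al Sweigart</author>
        <year>2015</year>
    </book>
</library>"""

root = etree.fromstring(xml_data)

# XPath converts the <year> text to a number for the comparison
recent_titles = root.xpath("book[year > 2014]/title/text()")
print(recent_titles)

# findtext() returns the text of the first matching child element
authors = [book.findtext("author") for book in root.iter("book")]
print(authors)
```

With the data above, the first query selects only the 2015 book, and the second list holds both authors in document order.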

For further details about LXML usage, users can refer to the official lxml documentation page.
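Because lxml handles HTML as well as XML, here is a minimal sketch of the HTML side using the lxml.html module. The page markup and the example.com URLs are made up for this illustration. In a real scraping task the string would come from a downloaded page.

```python
from lxml import html

# A small inline page; markup and URLs are invented for this example
page = """
<html>
  <body>
    <h1>Python Books</h1>
    <ul>
      <li><a href="https://example.com/learning-python">Learning Python</a></li>
      <li><a href="https://example.com/crash-course">Python Crash Course</a></li>
    </ul>
  </body>
</html>
"""

doc = html.fromstring(page)

# Text of the first <h1> anywhere in the document
heading = doc.findtext(".//h1")
print(heading)

# Collect the link text and target of every <a> element
links = [(a.text_content(), a.get("href")) for a in doc.xpath("//a")]
for text, href in links:
    print(text, "->", href)
```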

Conclusion

lxml is a simple but useful tool for Python developers working with XML and HTML data. As we have seen, installing lxml for Python on Ubuntu is straightforward, and it opens up many possibilities for web scraping, data parsing, and other tasks.
