Modify XML files with Python
Last Updated :
18 Aug, 2021
Python|Modifying/Parsing XML
Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.The design goals of XML focus on simplicity, generality, and usability across the Internet.It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services.
XML is an inherently hierarchical data format, and the most natural way to represent it is with a tree.To perform any operations like parsing, searching, modifying an XML file we use a module xml.etree.ElementTree .It has two classes.ElementTree represents the whole XML document as a tree which helps while performing the operations. Element represents a single node in this tree.Reading and writing from the whole document are done on the ElementTree level.Interactions with a single XML element and its sub-elements are done on the Element level.
Properties of Element:
Properties | Description |
---|
Tag | String identifying what kind of data the element represents. Can be accessed using elementname.tag. |
Number of Attributes | Stored as a python dictionary. Can be accesses by elementname.attrib. |
Text string | String information regarding the element. |
Child string | Optional child elements string information. |
Child Elements | Number of child elements to a particular root. |
PARSING:
We can parse XML data from a string or an XML document.Considering xml.etree.ElementTree as ET.
1. ET.parse('Filename').getroot() -ET.parse('fname')-creates a tree and then we extract the root by .getroot().
2. ET.fromstring(stringname) -To create a root from an XML data string.
Example 1:
XML document:
XML
<?xml version="1.0"?>
<!--COUNTRIES is the root element-->
<COUNTRIES>
<country name="INDIA">
<neighbor name="Dubai" direction="W"/>
</country>
<country name="Singapore">
<neighbor name="Malaysia" direction="N"/>
</country>
</COUNTRIES>
Python Code:
Python3
# importing the module.
import xml.etree.ElementTree as ET
XMLexample_stored_in_a_string ='''<?xml version ="1.0"?>
<COUNTRIES>
<country name ="INDIA">
<neighbor name ="Dubai" direction ="W"/>
</country>
<country name ="Singapore">
<neighbor name ="Malaysia" direction ="N"/>
</country>
</COUNTRIES>
'''
# parsing directly.
tree = ET.parse('xmldocument.xml')
root = tree.getroot()
# parsing using the string.
stringroot = ET.fromstring(XMLexample_stored_in_a_string)
# printing the root.
print(root)
print(stringroot)
Output:
outputexample1
Element methods:
1)Element.iter('tag') -Iterates over all the child elements(Sub-tree elements)
2)Element.findall('tag') -Finds only elements with a tag which are direct children of the current element.
3)Element.find('tag') -Finds the first Child with the particular tag.
4)Element.get('tag') -Accesses the elements attributes.
5)Element.text -Gives the text of the element.
6)Element.attrib-returns all the attributes present.
7)Element.tag-returns the element name.
Example 2:
Python3
import xml.etree.ElementTree as ET
XMLexample_stored_in_a_string ='''<?xml version ="1.0"?>
<States>
<state name ="TELANGANA">
<rank>1</rank>
<neighbor name ="ANDHRA" language ="Telugu"/>
<neighbor name ="KARNATAKA" language ="Kannada"/>
</state>
<state name ="GUJARAT">
<rank>2</rank>
<neighbor name ="RAJASTHAN" direction ="N"/>
<neighbor name ="MADHYA PRADESH" direction ="E"/>
</state>
<state name ="KERALA">
<rank>3</rank>
<neighbor name ="TAMILNADU" direction ="S" language ="Tamil"/>
</state>
</States>
'''
# parsing from the string.
root = ET.fromstring(XMLexample_stored_in_a_string)
# printing attributes of the root tags 'neighbor'.
for neighbor in root.iter('neighbor'):
print(neighbor.attrib)
# finding the state tag and their child attributes.
for state in root.findall('state'):
rank = state.find('rank').text
name = state.get('name')
print(name, rank)
Output:
Element methods output.
MODIFYING:
Modifying the XML document can also be done through Element methods.
Methods:
1)Element.set('attrname', 'value') - Modifying element attributes.
2)Element.SubElement(parent, new_childtag) -creates a new child tag under the parent.
3)Element.write('filename.xml')-creates the tree of xml into another file.
4)Element.pop() -delete a particular attribute.
5)Element.remove() -to delete a complete tag.
Example 3:
XML Document:
xml
<?xml version="1.0"?>
<breakfast_menu>
<food>
<name itemid="11">Belgian Waffles</name>
<price>5.95</price>
<description>Two of our famous Belgian Waffles
with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name itemid="21">Strawberry Belgian Waffles</name>
<price>7.95</price>
<description>Light Belgian waffles covered
with strawberries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name itemid="31">Berry-Berry Belgian Waffles</name>
<price>8.95</price>
<description>Light Belgian waffles covered with
an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name itemid="41">French Toast</name>
<price>4.50</price>
<description>Thick slices made from our
homemade sourdough bread</description>
<calories>600</calories>
</food>
</breakfast_menu>
Python Code:
Python3
import xml.etree.ElementTree as ET
mytree = ET.parse('xmldocument.xml.txt')
myroot = mytree.getroot()
# iterating through the price values.
for prices in myroot.iter('price'):
# updates the price value
prices.text = str(float(prices.text)+10)
# creates a new attribute
prices.set('newprices', 'yes')
# creating a new tag under the parent.
# myroot[0] here is the first food tag.
ET.SubElement(myroot[0], 'tasty')
for temp in myroot.iter('tasty'):
# giving the value as Yes.
temp.text = str('YES')
# deleting attributes in the xml.
# by using pop as attrib returns dictionary.
# removes the itemid attribute in the name tag of
# the second food tag.
myroot[1][0].attrib.pop('itemid')
# Removing the tag completely we use remove function.
# completely removes the third food tag.
myroot.remove(myroot[2])
mytree.write('output.xml')
Output:
Similar Reads
Interact with files in Python Python too supports file handling and allows users to handle files i.e., to read, write, create, delete and move files, along with many other file handling options, to operate on files. The concept of file handling has stretched over various other languages, but the implementation is either complica
6 min read
Working with zip files in Python This article explains how one can perform various operations on a zip file using a simple python program. What is a zip file? ZIP is an archive file format that supports lossless data compression. By lossless compression, we mean that the compression algorithm allows the original data to be perfectl
5 min read
How to Parse and Modify XML in Python? XML stands for Extensible Markup Language. It was designed to store and transport data. It was designed to be both human- and machine-readable. Thatâs why, the design goals of XML emphasize simplicity, generality, and usability across the Internet. Note: For more information, refer to XML | Basics H
4 min read
Writing to file in Python Writing to a file in Python means saving data generated by your program into a file on your system. This article will cover the how to write to files in Python in detail.Creating a FileCreating a file is the first step before writing data to it. In Python, we can create a file using the following th
4 min read
File Mode in Python In Python, the file mode specifies the purpose and the operations that can be performed on a file when it is opened. When you open a file using the open() function, you can specify the file mode as the second argument. Different File Mode in PythonBelow are the different types of file modes in Pytho
5 min read
Python File truncate() Method Prerequisites: File Objects in Python Reading and Writing to files in Python Truncate() method truncate the fileâs size. If the optional size argument is present, the file is truncated to (at most) that size. The size defaults to the current position. The current file position is not changed. Note t
2 min read
Python | shutil.copyfileobj() method Shutil module in Python provides many functions of high-level operations on files and collections of files. It comes under Pythonâs standard utility modules. This module helps in automating process of chowning and removal of files and directories.shutil.copyfileobj() method in Python is used to copy
2 min read
Shutil Module in Python Shutil module offers high-level operation on a file like a copy, create, and remote operation on the file. It comes under Pythonâs standard utility modules. This module helps in automating the process of copying and removal of files and directories. In this article, we will learn this module. Copyin
8 min read
How to copy file in Python3? Prerequisites: Shutil When we require a backup of data, we usually make a copy of that file. Python supports file handling and allows users to handle files i.e., to read and write files, along with many other file handling options, to operate on files. Here we will learn how to copy a file using Pyt
2 min read
How To Read .Data Files In Python? Unlocking the secrets of reading .data files in Python involves navigating through diverse structures. In this article, we will unravel the mysteries of reading .data files in Python through four distinct approaches. Understanding the structure of .data files is essential, as their format may vary w
4 min read