COURS XML_XML Schema
COURS XML_XML Schema
XSD Example
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The purpose of an XML Schema is to define the legal building blocks of an XML
document:
With XML Schemas, the sender can describe the data in a way that the receiver
will understand.
<date type="date">2004-03-11</date>
ensures a mutual understanding of the content, because the XML data type
"date" requires the format "YYYY-MM-DD".
Well-Formed is Not Enough
A well-formed XML document is a document that conforms to the XML syntax
rules, like:
Even if documents are well-formed they can still contain errors, and those errors
can have serious consequences.
Think of the following situation: you order 5 gross of laser printers, instead of 5
laser printers. With XML Schemas, most of these errors can be caught by your
validating software.
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
An XML Schema
The following example is an XML Schema file called "note.xsd" that defines the
elements of the XML document above ("note.xml"):
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="https://www.w3schools.com"
xmlns="https://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The note element is a complex type because it contains other elements. The
other elements (to, from, heading, body) are simple types because they do not
contain other elements. You will learn more about simple and complex types in
the following chapters.
<?xml version="1.0"?>
<note
xmlns="https://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://www.w3schools.com/xml note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
XSD - The <schema> Element
The <schema> element is the root element of every XML Schema.
<?xml version="1.0"?>
<xs:schema>
...
...
</xs:schema>
<?xml version="1.0"?>
<xs:schema
xmlns:xs ="http://www.w3.org/2001/XMLSchema"
targetNamespace ="https://www.w3schools.com"
xmlns ="https://www.w3schools.com"
elementFormDefault ="qualified">
...
...
</xs:schema>
The fragment:
xmlns:xs="http://www.w3.org/2001/XMLSchema"
indicates that the elements and data types used in the schema come from the
"http://www.w3.org/2001/XMLSchema" namespace. It also specifies that the
elements and data types that come from the
"http://www.w3.org/2001/XMLSchema" namespace should be prefixed with xs:
This fragment:
targetNamespace="https://www.w3schools.com"
indicates that the elements defined by this schema (note, to, from, heading,
body.) come from the "https://www.w3schools.com" namespace.
This fragment:
xmlns="https://www.w3schools.com"
This fragment:
elementFormDefault="qualified"
indicates that any elements used by the XML instance document which were
declared in this schema must be namespace qualified.
<note xmlns="https://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://www.w3schools.com note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
However, the "only text" restriction is quite misleading. The text can be of many
different types. It can be one of the types included in the XML Schema definition
(boolean, string, date, etc.), or it can be a custom type that you can define
yourself.
You can also add restrictions (facets) to a data type in order to limit its content,
or you can require the data to match a specific pattern.
Defining a Simple Element
The syntax for defining a simple element is:
where xxx is the name of the element and yyy is the data type of the element.
XML Schema has a lot of built-in data types. The most common types are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
Attributes
Simple elements cannot have attributes. If an element has attributes, it is
considered to be of a complex type. But the attribute itself is always declared as
a simple type.
where xxx is the name of the attribute and yyy specifies the data type of the
attribute.
XML Schema has a lot of built-in data types. The most common types are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
Example
Here is an XML element with an attribute:
<lastname lang="EN">Smith</lastname>
A fixed value is also automatically assigned to the attribute, and you cannot
specify another value.
Restrictions on Content
When an XML element or attribute has a data type defined, it puts restrictions on
the element's or attribute's content.
If an XML element is of type "xs:date" and contains a string like "Hello World",
the element will not validate.
With XML Schemas, you can also add your own restrictions to your XML elements
and attributes. These restrictions are called facets. You can read more about
facets in the next chapter.
Restrictions on Values
The following example defines an element called "age" with a restriction. The
value of age cannot be lower than 0 or greater than 120:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>––
The example below defines an element called "car" with a restriction. The only
acceptable values are: Audi, Golf, BMW:
<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The example above could also have been written like this:
<xs:simpleType name="carType">
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
Note: In this case the type "carType" can be used by other elements because it
is not a part of the "car" element.
The example below defines an element called "letter" with a restriction. The only
acceptable value is ONE of the LOWERCASE letters from a to z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "initials" with a restriction. The only
acceptable value is THREE of the UPPERCASE letters from a to z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example also defines an element called "initials" with a restriction. The
only acceptable value is THREE of the LOWERCASE OR UPPERCASE letters from a
to z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z][a-zA-Z][a-zA-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "choice" with a restriction. The only
acceptable value is ONE of the following letters: x, y, OR z:
<xs:element name="choice">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[xyz]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "prodid" with a restriction. The only
acceptable value is FIVE digits in a sequence, and each digit must be in a range
from 0 to 9:
<xs:element name="prodid">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:pattern value="[0-9][0-9][0-9][0-9][0-9]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="([a-z])*"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example also defines an element called "letter" with a restriction. The
acceptable value is one or more pairs of letters, each pair consisting of a lower
case letter followed by an upper case letter. For example, "sToP" will be validated
by this pattern, but not "Stop" or "STOP" or "stop":
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="([a-z][A-Z])+"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "gender" with a restriction. The only
acceptable value is male OR female:
<xs:element name="gender">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="male|female"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "password" with a restriction. There
must be exactly eight characters in a row and those characters must be
lowercase or uppercase letters from a to z, or a number from 0 to 9:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z0-9]{8}"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example also defines an element called "address" with a restriction. The
whiteSpace constraint is set to "replace", which means that the XML processor
WILL REPLACE all white space characters (line feeds, tabs, spaces, and carriage
returns) with spaces:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example also defines an element called "address" with a restriction. The
whiteSpace constraint is set to "collapse", which means that the XML processor
WILL REMOVE all white space characters (line feeds, tabs, spaces, carriage
returns are replaced with spaces, leading and trailing spaces are removed, and
multiple spaces are reduced to a single space):
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions on Length
To limit the length of a value in an element, we would use the length,
maxLength, and minLength constraints.
This example defines an element called "password" with a restriction. The value
must be exactly eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example defines another element called "password" with a restriction. The
value must be minimum five characters and maximum eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Constraint Description
maxExclusive Specifies the upper bounds for numeric values (the value
must be less than this value)
maxInclusive Specifies the upper bounds for numeric values (the value
must be less than or equal to this value)
maxLength Specifies the maximum number of characters or list items
allowed. Must be equal to or greater than zero
minExclusive Specifies the lower bounds for numeric values (the value must
be greater than this value)
minInclusive Specifies the lower bounds for numeric values (the value must
be greater than or equal to this value)
whiteSpace Specifies how white space (line feeds, tabs, spaces, and
carriage returns) is handled
<product pid="1345"/>
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
A complex XML element, "description", which contains both elements and text:
<description>
It happened on <date lang="norwegian"> 03.03.99 </date> ....
</description>
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
If you use the method described above, only the "employee" element can use the
specified complex type. Note that the child elements, "firstname" and
"lastname", are surrounded by the <sequence> indicator. This means that the
child elements must appear in the same order as they are declared. You will
learn more about indicators in the XSD Indicators chapter.
2. The "employee" element can have a type attribute that refers to the name of
the complex type to use:
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
You can also base a complex type on an existing complex type and add some
elements, like this:
<xs:element name="employee" type="fullpersoninfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="fullpersoninfo">
<xs:complexContent>
<xs:extension base="personinfo">
<xs:sequence>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
The "product" element above has no content at all. To define a type with no
content, we must define a type that allows elements in its content, but we do not
actually declare any elements, like this:
<xs:element name="product">
<xs:complexType>
<xs:complexContent>
<xs:restriction base="xs:integer">
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
In the example above, we define a complex type with a complex content. The
complexContent element signals that we intend to restrict or extend the content
model of a complex type, and the restriction of integer declares one attribute but
does not introduce any element content.
<xs:element name="product">
<xs:complexType>
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:complexType>
</xs:element>
Or you can give the complexType element a name, and let the "product" element
have a type attribute that refers to the name of the complexType (if you use this
method, several elements can refer to the same complex type):
<xs:element name="product" type="prodtype"/>
<xs:complexType name="prodtype">
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:complexType>
<person>
<firstname>John</firstname>
<lastname>Smith</lastname>
</person>
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Notice the <xs:sequence> tag. It means that the elements defined ("firstname"
and "lastname") must appear in that order inside a "person" element.
Or you can give the complexType element a name, and let the "person" element
have a type attribute that refers to the name of the complexType (if you use this
method, several elements can refer to the same complex type):
<xs:complexType name="persontype">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
Complex Text-Only Elements
This type contains only simple content (text and attributes), therefore we add a
simpleContent element around the content. When using simple content, you
must define an extension OR a restriction within the simpleContent element, like
this:
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="basetype">
....
....
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
OR
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="basetype">
....
....
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Tip: Use the extension/restriction element to expand or to limit the base simple
type for the element.
<shoesize country="france">35</shoesize>
<xs:element name="shoesize">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute name="country" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
We could also give the complexType element a name, and let the "shoesize"
element have a type attribute that refers to the name of the complexType (if you
use this method, several elements can refer to the same complex type):
<xs:complexType name="shoetype">
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute name="country" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<letter>
Dear Mr. <name>John Smith</name>.
Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>
<xs:element name="letter">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="orderid" type="xs:positiveInteger"/>
<xs:element name="shipdate" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
We could also give the complexType element a name, and let the "letter"
element have a type attribute that refers to the name of the complexType (if you
use this method, several elements can refer to the same complex type):
Indicators
There are seven indicators:
Order indicators:
All
Choice
Sequence
Occurrence indicators:
maxOccurs
minOccurs
Group indicators:
Group name
attributeGroup name
Order Indicators
Order indicators are used to define the order of the elements.
All Indicator
The <all> indicator specifies that the child elements can appear in any order, and
that each child element must occur only once:
<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
Note: When using the <all> indicator you can set the <minOccurs> indicator to
0 or 1 and the <maxOccurs> indicator can only be set to 1 (the <minOccurs>
and <maxOccurs> are described later).
Choice Indicator
The <choice> indicator specifies that either one child element or another can
occur:
<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear in a
specific order:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Occurrence Indicators
Occurrence indicators are used to define how often an element can occur.
Note: For all "Order" and "Group" indicators (any, all, choice, sequence, group
name, and group reference) the default value for maxOccurs and minOccurs is 1.
maxOccurs Indicator
The <maxOccurs> indicator specifies the maximum number of times an element
can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a
minimum of one time (the default value for minOccurs is 1) and a maximum of
ten times in the "person" element.
minOccurs Indicator
The <minOccurs> indicator specifies the minimum number of times an element
can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
maxOccurs="10" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a
minimum of zero times and a maximum of ten times in the "person" element.
Tip: To allow an element to appear an unlimited number of times, use the
maxOccurs="unbounded" statement:
A working example:
<persons xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="family.xsd">
<person>
<full_name>Hege Refsnes</full_name>
<child_name>Cecilie</child_name>
</person>
<person>
<full_name>Tove Refsnes</full_name>
<child_name>Hege</child_name>
<child_name>Stale</child_name>
<child_name>Jim</child_name>
<child_name>Borge</child_name>
</person>
<person>
<full_name>Stale Refsnes</full_name>
</person>
</persons>
The XML file above contains a root element named "persons". Inside this root
element we have defined three "person" elements. Each "person" element must
contain a "full_name" element and it can contain up to five "child_name"
elements.
<xs:element name="persons">
<xs:complexType>
<xs:sequence>
<xs:element name="person" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
minOccurs="0" maxOccurs="5"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Group Indicators
Group indicators are used to define related sets of elements.
Element Groups
Element groups are defined with the group declaration, like this:
<xs:group name="groupname">
...
</xs:group>
You must define an all, choice, or sequence element inside the group declaration.
The following example defines a group named "persongroup", that defines a
group of elements that must occur in an exact sequence:
<xs:group name="persongroup">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:group>
After you have defined a group, you can reference it in another definition, like
this:
<xs:group name="persongroup">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:group>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:group ref="persongroup"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
Attribute Groups
Attribute groups are defined with the attributeGroup declaration, like this:
<xs:attributeGroup name="groupname">
...
</xs:attributeGroup>
<xs:attributeGroup name="personattrgroup">
<xs:attribute name="firstname" type="xs:string"/>
<xs:attribute name="lastname" type="xs:string"/>
<xs:attribute name="birthday" type="xs:date"/>
</xs:attributeGroup>
After you have defined an attribute group, you can reference it in another
definition, like this:
<xs:attributeGroup name="personattrgroup">
<xs:attribute name="firstname" type="xs:string"/>
<xs:attribute name="lastname" type="xs:string"/>
<xs:attribute name="birthday" type="xs:date"/>
</xs:attributeGroup>
<xs:element name="person">
<xs:complexType>
<xs:attributeGroup ref="personattrgroup"/>
</xs:complexType>
</xs:element>