Document Type Definitions (DTDs)
Document Type Definition
XML data that is sent along with a DTD, and conforms to it, is known as valid XML.
In this case, an XML parser can check incoming data against
the rules defined in the DTD to make sure the data is
structured correctly.
Data sent without a DTD is known as well-formed XML.
Here an XML document instance, such as hierarchically
structured weather data, implicitly describes itself.
With both valid and well-formed XML, the encoded data
is self-describing, since descriptive tags are intermixed with
the data.
DTDs help ensure that different people and programs
can read each other's files.
The DTD defines exactly what is and is not allowed to
appear inside a document.
XML Parsers
• A parser is a piece of software that takes a physical
representation of some data and converts it into an
in-memory form for the program as a whole to
use.
• One way to classify XML parsers is as
follows:
– non-validating: the parser does not check a document
against any DTD (Document Type Definition); it only checks
that the document is well-formed (that it is properly
marked up according to XML syntax rules)
– validating: in addition to checking well-formedness, the
parser verifies that the XML document conforms to a DTD
• An XML document is well-formed if it follows
all the XML syntax rules (for example, a single root
element and properly nested, closed tags).
• It is valid if it is well-formed and also conforms to a
DTD.
• So all valid XML documents are well-formed, but
not all well-formed documents are valid.
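The well-formedness half of this distinction can be tried with a non-validating parser. The sketch below uses Python's standard xml.etree.ElementTree module, which is an illustrative choice and not something the slides prescribe. The helper name is_well_formed and the two sample strings are hypothetical. ElementTree checks syntax only and does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML.

    ElementTree is non-validating: it checks syntax only and
    does not validate the document against any DTD.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<book><title>XML Pocket Reference</title></book>"
bad = "<book><title>XML Pocket Reference</book></title>"  # tags overlap

print(is_well_formed(good))
print(is_well_formed(bad))
```

A document that passes this check is only known to be well-formed. Deciding whether it is also valid needs a validating parser and a DTD.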
Document Type Definition
External DTD files have the file extension .dtd.
A DTD is a document which serves the following
purposes:
Specify the valid tags that can be used in an
XML document.
Specify the valid tag sequences/arrangements.
Specify whether whitespace is significant or ignorable.
The DTD used by an XML document is declared in
the prolog, after the XML declaration and before the root element.
A DTD defines the structure of XML data and is similar
to a class template.
DTD for Our Simple XML
An internal DTD consists of a left square bracket character ( [ ) followed by a
series of markup declarations, followed by a right square bracket
character ( ] ), all placed inside the DOCTYPE statement.
<?xml version="1.0"?>
<!DOCTYPE message [
<!ELEMENT message ANY>
]>
This example shows:
The document type is defined with the root element named "message".
The element "message" is defined to allow any declared sub-elements and
text content. Attributes are not covered by ANY; they are declared separately with ATTLIST.
Main features of DTD:
DTD is a simple language with four main types of statements:
DOCTYPE, ELEMENT, ATTLIST, and ENTITY.
One DOCTYPE statement defines one document type.
Within the DOCTYPE statement, one or more ELEMENT
statements, plus optional ATTLIST and ENTITY
statements, define the details of the document
type.
DTD statements that define the document type can be
included inside the XML file (internal DTD).
DTD statements that define the document type can also be
stored in a separate file and linked to the XML file (external DTD).
DTD Declarations
DOCTYPE is a DTD statement included in an XML file to declare which
document type the XML document is linked to and to specify the
name of the root element of this XML document.
Valid syntax formats for the DOCTYPE statement are:
Internal DTD only:
<!DOCTYPE root_element [
internal_DTD_statements
]>
External DTD only:
<!DOCTYPE root_element SYSTEM "DTD_location">
External DTD combined with internal statements:
<!DOCTYPE root_element SYSTEM "DTD_location" [
internal_DTD_statements
]>
DTD Types
• Internal DTD:
– XML and DTD are in the same file
– <!DOCTYPE root-element [element-declarations]>
• External DTD:
– A DTD that resides in a file (.dtd extension) other
than the XML file
– The advantage of an external DTD is reusability
Internal DTD
• Sample XML file with an internal DTD:
<?xml version="1.0"?>
<!DOCTYPE book [
<!ELEMENT book (title,pages,price,edition)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT pages (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT edition (#PCDATA)>
]>
<book>
<title>Xml Pocket reference</title>
<pages>150</pages>
<price>100.00</price>
<edition>second</edition>
</book>
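To see that a parser really reads the declarations in an internal DTD, the following sketch feeds the sample document to Python's standard xml.parsers.expat module and collects every ELEMENT declaration through its ElementDeclHandler callback. The choice of Python and expat is an assumption made for illustration, and the helper name declared_elements is hypothetical. expat is non-validating, so it reports the declarations without enforcing them.

```python
import xml.parsers.expat

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE book [
<!ELEMENT book (title,pages,price,edition)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT pages (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT edition (#PCDATA)>
]>
<book>
<title>Xml Pocket reference</title>
<pages>150</pages>
<price>100.00</price>
<edition>second</edition>
</book>"""


def declared_elements(xml_text):
    """Return {element name: [child element names]} from the internal DTD."""
    declared = {}

    def on_element_decl(name, model):
        # expat passes the content model as a nested tuple
        # (type, quantifier, name, children); a child that is a
        # plain element name carries that name in position 2.
        declared[name] = [child[2] for child in model[3]]

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return declared


decls = declared_elements(DOCUMENT)
for name, children in decls.items():
    print(name, children)
```

The book entry lists its four children in declared order, while the #PCDATA elements have no child elements at all.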
External DTD
• Sample XML file with an external DTD:
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "test.dtd">
<book>
<title>Xml Pocket reference</title>
<pages>150</pages>
<price>100.00</price>
<edition>second</edition>
</book>
Contents of File test.dtd
<!ELEMENT book (title,pages,price,edition)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT pages (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT edition (#PCDATA)>
General rules about the DOCTYPE statement:
It must appear after the XML declaration (<?xml ... ?>) and
before the root element.
XML building blocks
• Any XML file can contain the following types of elements:
• An empty element:
<letter></letter> or <letter/>
• Element with an attribute:
<letter Date="2009/11/11"></letter>
• Element with text:
<letter>This is the letter content</letter>
• Element with child elements:
<letter>
<sender>Rahul</sender>
</letter>
• Element with child elements and text content (mixed content):
<chapter> introduction to xml
<Pages> 20 </Pages>
</chapter>
An element in a DTD file is declared as:
<!ELEMENT ELEMENT_NAME CONTENT_MODEL>
CONTENT_MODEL : CHILD ELEMENTS
• If BOOK contains more than one child element, list the children in order, separated by commas:
Ex:
<BOOK>
<CHAPTER>one</CHAPTER>
<PRICE>20.99</PRICE>
</BOOK>
<!ELEMENT BOOK (CHAPTER,PRICE)>
• The child elements must appear in the order
written in the DTD: CHAPTER first, then PRICE.
• Invalid XML (PRICE appears before CHAPTER):
<BOOK>
<PRICE>20.99</PRICE>
<CHAPTER>one</CHAPTER>
</BOOK>
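The ordering rule can be illustrated with a small hand-written check. This is only a sketch in Python (an assumed language choice): the constant EXPECTED_ORDER and the helper follows_sequence are hypothetical names, and the check covers just this one sequence (CHAPTER,PRICE), not DTD validation in general.

```python
import xml.etree.ElementTree as ET

# Order required by <!ELEMENT BOOK (CHAPTER,PRICE)>
EXPECTED_ORDER = ["CHAPTER", "PRICE"]


def follows_sequence(xml_text, expected=EXPECTED_ORDER):
    """Return True if the root's children match the expected order exactly."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == list(expected)


valid = "<BOOK><CHAPTER>one</CHAPTER><PRICE>20.99</PRICE></BOOK>"
invalid = "<BOOK><PRICE>20.99</PRICE><CHAPTER>one</CHAPTER></BOOK>"

print(follows_sequence(valid))
print(follows_sequence(invalid))
```

Both documents are well-formed; only the first one respects the sequence given in the DTD.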
Content Model : PCDATA
• <!ELEMENT BOOK (CHAPTER)>
<!ELEMENT CHAPTER (#PCDATA)>
• For the above DTD declarations, a valid XML document would
be
<BOOK>
<CHAPTER>
Introduction to xml dtds
</CHAPTER>
</BOOK>
11/7/2015 Minal Abhyankar
Some DTD and XML examples
DTD:
<!ELEMENT BOOK (CHAPTER+,APPENDIX?)>
<!ELEMENT CHAPTER (#PCDATA)>
<!ELEMENT APPENDIX (#PCDATA)>
XML:
• <BOOK>
<CHAPTER>1</CHAPTER>
<APPENDIX> E</APPENDIX>
</BOOK>
• OR
• <BOOK>
<CHAPTER>1</CHAPTER>
<CHAPTER>2</CHAPTER>
<APPENDIX> E</APPENDIX>
</BOOK>
• OR
• <BOOK>
<CHAPTER>1</CHAPTER>
<CHAPTER>2</CHAPTER>
</BOOK>
• Subsequences using parentheses:
DTD:
<!ELEMENT CHAPTER (NAME, (PAGENO,REFERENCE*)+)>
XML:
• <CHAPTER>
<NAME>JIM</NAME>
<PAGENO> 23</PAGENO>
<REFERENCE>a</REFERENCE>
</CHAPTER>
• OR
• <CHAPTER>
<NAME>JIM</NAME>
<PAGENO> 23</PAGENO>
<REFERENCE>a</REFERENCE>
<PAGENO> 29 </PAGENO>
<REFERENCE>C</REFERENCE>
</CHAPTER>
• OR
• <CHAPTER>
<NAME>JIM</NAME>
<PAGENO> 23</PAGENO>
<PAGENO> 90</PAGENO>
</CHAPTER>
• Choice:
DTD:
<!ELEMENT CHAPTER (NAME, PAGENO, (REFERENCE|FOOTNOTE))>
XML:
• <CHAPTER>
<NAME>JIM</NAME>
<PAGENO> 23</PAGENO>
<REFERENCE>a </REFERENCE>
</CHAPTER>
• OR
• <CHAPTER>
<NAME>JIM</NAME>
<PAGENO>23</PAGENO>
<FOOTNOTE>FT</FOOTNOTE>
</CHAPTER>
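In these content models, + means one or more, ? means zero or one, * means zero or more, a comma means "followed by" and | means "either/or". Because these operators behave like their regular-expression counterparts, a child-element content model can be tested by translating it into a regex. The Python sketch below does that; it is a teaching aid under stated assumptions, not how a real validating parser is implemented. The helper names model_to_regex and children_match are hypothetical, and the translation handles element names only (no #PCDATA, EMPTY or ANY).

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(model):
    """Translate a DTD child-element content model into a regex.

    Each element name becomes '(?:NAME )'. The DTD operators
    ( ) | + ? * keep their meaning in regex syntax, and the
    sequence comma is dropped because regex concatenation is implicit.
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[()|+?*,]", model):
        if token == ",":
            continue
        if token in "()|+?*":
            parts.append(token)
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def children_match(xml_text, model):
    """Check the root element's child sequence against the content model."""
    root = ET.fromstring(xml_text)
    sequence = "".join(child.tag + " " for child in root)
    return model_to_regex(model).fullmatch(sequence) is not None


BOOK_MODEL = "(CHAPTER+,APPENDIX?)"
NESTED_MODEL = "(NAME, (PAGENO,REFERENCE*)+)"
CHOICE_MODEL = "(NAME, PAGENO, (REFERENCE|FOOTNOTE))"

print(children_match(
    "<BOOK><CHAPTER>1</CHAPTER><CHAPTER>2</CHAPTER></BOOK>", BOOK_MODEL))
print(children_match("<BOOK><APPENDIX>E</APPENDIX></BOOK>", BOOK_MODEL))
```

The second call fails the check because CHAPTER+ demands at least one CHAPTER before the optional APPENDIX.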
CONTENT_MODEL : MIXED
• An element can contain both child elements and
text, in which case the content model is
mixed.
• A mixed content model does not let you specify the order or
number of occurrences of child elements.
• <!ELEMENT chapter (#PCDATA | NoofPages)*>
• The corresponding XML can be
<chapter>
<NoofPages> 20 </NoofPages>
introduction to xml
</chapter>
or
<chapter>
<NoofPages> 20 </NoofPages>
</chapter>
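Since a mixed content model leaves the order open, an application has to be ready to find the text on either side of the child element. The sketch below shows how Python's standard xml.etree.ElementTree (an assumed tool choice) exposes the two positions: text before the first child sits in the parent's text, and text after a child sits in that child's tail. The helper name describe and the compact sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

text_first = ET.fromstring(
    "<chapter>introduction to xml<NoofPages>20</NoofPages></chapter>"
)
child_first = ET.fromstring(
    "<chapter><NoofPages>20</NoofPages>introduction to xml</chapter>"
)


def describe(chapter):
    """Return (text before first child, child tags, text after first child)."""
    first = chapter[0]
    return chapter.text, [c.tag for c in chapter], first.tail


print(describe(text_first))
print(describe(child_first))
```

Both documents satisfy (#PCDATA | NoofPages)*, yet the same sentence is reported in a different place in each.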
Exercise:
1) Create a DTD and an XML document for a music collection that holds
CD entries. At least one CD entry is required.
Answers To Exercise
1) <?xml version="1.0"?>
<!DOCTYPE cdCollection [
<!ELEMENT cdCollection (cd+)>
<!ELEMENT cd (title, artist, year)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT artist (#PCDATA)>
<!ELEMENT year (#PCDATA)>
]>
<cdCollection>
<cd>
<title>Dark Side of the Moon</title>
<artist>Pink Floyd</artist>
<year>1973</year>
</cd>
</cdCollection>
2) <?xml version="1.0"?>
<!DOCTYPE list [
<!ELEMENT list (recipe+)>
<!ELEMENT recipe (author,recipe_name,meal,ingredients,directions)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT recipe_name (#PCDATA)>
<!ELEMENT meal (#PCDATA)>
<!ELEMENT ingredients (item+)>
<!ELEMENT item (#PCDATA)>
<!ELEMENT directions (#PCDATA)>
]>
<list>
<recipe>
<author>Carol Schmidt</author>
<recipe_name>Chocolate Chip Bars</recipe_name>
<meal>Dinner</meal>
<ingredients>
<item>2/3 C butter</item>
<item>2 C brown sugar</item>
<item>1 tsp vanilla</item>
<item>1 3/4 C unsifted all-purpose flour</item>
<item>1 1/2 tsp baking powder</item>
<item>1/2 tsp salt</item>
<item>3 eggs</item>
<item>1/2 C chopped nuts</item>
<item>2 cups (12-oz pkg.) semi-sweet choc. chips</item>
</ingredients>
<directions>
Preheat oven to 350 degrees.
Melt butter; combine with brown sugar and vanilla in large mixing bowl.
Set aside to cool. Combine flour, baking powder, and salt; set aside.
Add eggs to cooled sugar mixture; beat well.
Stir in reserved dry ingredients, nuts, and chips.
Spread in greased 13-by-9-inch pan.
Bake for 25 to 30 minutes until golden brown;
cool.
</directions>
</recipe>
</list>
3) <?xml version="1.0"?>
<!DOCTYPE page [
<!ELEMENT page (title,content,comment*)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT content (#PCDATA)>
<!ELEMENT comment (#PCDATA)>
]>
<page>
<title>Hello friend</title>
<content>Here is some content</content>
<!-- <comment>Written by Tester</comment> -->
</page>
Combination rules for elements
DTD Attributes
• Attributes provide supplementary information about a
particular element and appear as name/value pairs.
• <book id="123"></book>
• In a DTD, you define attributes as below:
– <!ATTLIST ELEMENT_NAME
ATTRIBUTE_NAME TYPE DEFAULT_VALUE
ATTRIBUTE_NAME TYPE DEFAULT_VALUE
>
• <!ATTLIST book id CDATA "123">
• where ELEMENT_NAME and ATTRIBUTE_NAME are the
names of the element and its attribute, respectively.
Attribute Default values
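• You can specify the actual default value in the
DTD:
– <!ATTLIST PAGE AUTHOR CDATA "AUTHOR_ONE">
• If the attribute is omitted, the parser supplies the default value.
• The corresponding XML element can state the default or override it:
– <PAGE AUTHOR="AUTHOR_ONE" />
Or
– <PAGE AUTHOR="Dan Brown" />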
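Default values declared in an internal DTD are applied even by a non-validating parser. The sketch below illustrates this with Python's standard xml.parsers.expat module (an assumed tool choice). The helper name page_attributes is hypothetical, and the <!ELEMENT PAGE EMPTY> declaration is an assumption added only to make the sample DTD complete.

```python
import xml.parsers.expat

# Internal DTD for the PAGE example; the EMPTY content model is assumed.
DTD = """<!DOCTYPE PAGE [
<!ELEMENT PAGE EMPTY>
<!ATTLIST PAGE AUTHOR CDATA "AUTHOR_ONE">
]>
"""


def page_attributes(page_markup):
    """Parse one PAGE element and return the attributes expat reports."""
    seen = {}

    def on_start(name, attrs):
        # attrs includes values supplied by ATTLIST defaults.
        seen.update(attrs)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(DTD + page_markup, True)
    return seen


print(page_attributes("<PAGE />"))
print(page_attributes('<PAGE AUTHOR="Dan Brown" />'))
```

The first call reports AUTHOR even though the element did not spell it out; the second shows the document's own value taking precedence over the default.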
#IMPLIED
• The default value placeholder can also be replaced
by one of the following keywords:
– #IMPLIED: indicates that
• the attribute is not mandatory for the XML element and
no default value is given
– <!ATTLIST Page author CDATA #IMPLIED>
– Valid XML for the above attribute declaration:
– <Page />
Or
– <Page author="Dan Brown" />
#REQUIRED
• #REQUIRED: indicates
– that you are not specifying a default value, but it
is mandatory for the author of the XML file to supply a value
• <!ATTLIST Page author CDATA #REQUIRED>
• Valid XML for the above attribute declaration:
– <Page author="Dan Brown" />
Or
– <Page author="J.K.Rowling" />
• Used for mandatory attributes such as the author of the document
or the creation date of the XML file.
#FIXED
• #FIXED: used when you want to fix the value
of an attribute.
• It is like creating a constant.
• <!ATTLIST Page author CDATA #FIXED "Dan Brown">
• Valid XML for the above attribute declaration:
• <Page author="Dan Brown" />
Or
• <Page />
• Even if the attribute is not supplied in the XML, the fixed
value is passed to any application that reads this XML.
• Also, <Page author="ABC" /> is invalid, because the value differs from the fixed one.
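The four kinds of attribute default (a literal value, #IMPLIED, #REQUIRED and #FIXED) can be summarised as one small decision procedure. The Python sketch below is a hand-written illustration of the rules described in these slides, not the API of any XML library. The function name resolve_attribute and its parameters are hypothetical.

```python
def resolve_attribute(default_decl, supplied=None, fixed_value=None):
    """Apply one ATTLIST default declaration to a supplied value.

    default_decl is "#IMPLIED", "#REQUIRED", "#FIXED" or a literal
    default value. Returns the value an application would see
    (None when the attribute is absent) or raises ValueError when
    the document would be invalid.
    """
    if default_decl == "#IMPLIED":
        return supplied  # optional, and no default is filled in
    if default_decl == "#REQUIRED":
        if supplied is None:
            raise ValueError("attribute is required")
        return supplied
    if default_decl == "#FIXED":
        if supplied is not None and supplied != fixed_value:
            raise ValueError("attribute must equal the fixed value")
        return fixed_value  # passed on even when omitted
    # Otherwise default_decl itself is the literal default value.
    return default_decl if supplied is None else supplied


print(resolve_attribute("#IMPLIED"))
print(resolve_attribute("#REQUIRED", "J.K.Rowling"))
print(resolve_attribute("#FIXED", None, "Dan Brown"))
print(resolve_attribute("AUTHOR_ONE"))
```

Calling resolve_attribute("#FIXED", "ABC", "Dan Brown") raises ValueError, which mirrors the invalid <Page author="ABC" /> case above.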