SAS 9.3 XML LIBNAME Engine: User's Guide, Second Edition
SAS Documentation
The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2012. SAS 9.3 XML LIBNAME Engine: User's Guide, Second
Edition. Cary, NC: SAS Institute Inc.
SAS 9.3 XML LIBNAME Engine: User's Guide, Second Edition
Copyright 2012, SAS Institute Inc., Cary, NC, USA
All rights reserved. Produced in the United States of America.
For a hardcopy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this
publication.
The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and
punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted
materials. Your support of others' rights is appreciated.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is
subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19 Commercial Computer Software-Restricted Rights
(June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.
1st electronic book, August 2012
SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For
more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at
support.sas.com/publishing or call 1-800-727-3228.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other
countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
Contents
What's New in the SAS 9.3 XML LIBNAME Engine
Recommended Reading
PART 1 Usage
Referencing a Fileref Using the URL Access Method
Specifying a Location Path on the PATH Element
Including Namespace Elements in an XMLMap
Importing an XML Document Using the AUTOMAP= Option to Generate an XMLMap
Chapter 6 Understanding and Using Tagsets for the XML Engine
What Is a Tagset?
Creating Customized Tagsets
Exporting an XML Document Using a Customized Tagset
PART 2
PART 3
PART 4 Appendixes
Overview
In SAS 9.3, the engine nickname to access the enhanced XML LIBNAME engine
functionality is XMLV2. The previous nickname XML92 is supported as an alias.
In SAS 9.3, XMLV2 functionality is production, except in the z/OS environment, where
it is preproduction.
In the second maintenance release for SAS 9.3, the LIBNAME statement for XMLV2
supports automatically generating an XMLMap file.
The XMLMap syntax for version 2.1 supports XML namespaces.
For the COLUMN element, the ordinal= attribute, which determines whether the
variable is a counter variable, is no longer supported. The functionality is provided
with the class="ORDINAL" attribute. For details, see the COLUMN element on
page 124.
Recommended Reading
For information about XML (Extensible Markup Language), see the Web site
www.w3.org/XML
Part 1
Usage
Chapter 1 Getting Started with the XML Engine
Chapter 2 Exporting XML Documents
Chapter 3 Importing XML Documents
Chapter 4 Exporting XML Documents Using an XMLMap
Chapter 5 Importing XML Documents Using an XMLMap
Chapter 6 Understanding and Using Tagsets for the XML Engine
Chapter 1
export (write to an output location) an XML document from a SAS data set of type
DATA by translating the SAS proprietary file format to XML markup. The output
XML document can then be:
moved to another host for the XML engine to process by translating the XML
markup back to a SAS data set.
import (read from an input location) an external XML document. The input XML
document is translated to a SAS data set.
Executing the DATASETS procedure shows that SAS interprets the XML document as a
SAS data set:
proc datasets library=myxml;
Display 1.1
The XML engine supports input (read) and output (create) processing. The XML
engine does not support update processing.
The XML engine is a sequential access engine in that it processes data one record
after the other. The engine starts at the beginning of the file and continues in
sequence to the end of the file. The XML engine does not provide random (direct)
access, which is required for some SAS applications and features. For example, with
the XML engine, you cannot use the SORT procedure or ORDER BY in the SQL
procedure. If you request processing that requires random access, a message in the
SAS log notifies you that the processing is not valid for sequential access. If this
message occurs, put the XML data into a temporary SAS data set before you
continue.
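The workaround described above can be sketched as follows. This is a minimal illustration, not a required pattern: the libref MYXML, the data set CLASS, and the variable NAME are placeholders, and 'XML-document' stands for the full path of an XML file.

```sas
/* Assumption: 'XML-document' is a placeholder for the complete path of an XML file. */
libname myxml xml 'XML-document';

/* Copy the XML data into a temporary SAS data set (a sequential read). */
data work.class;
   set myxml.class;
run;

/* WORK.CLASS is stored by the default SAS engine, so sorting is allowed. */
proc sort data=work.class;
   by name;   /* NAME is a hypothetical variable */
run;
```

The same copy step applies before an ORDER BY clause in the SQL procedure.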
What Is the Difference between Using the XML Engine and the ODS
MARKUP Destination?
The XML engine creates and reads XML documents. ODS MARKUP creates, but does
not read, XML documents. Typically, you use the XML engine to transport data, and you
use the ODS MARKUP destination to create XML from SAS output.
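To make the contrast concrete, here is a hedged sketch that writes the same data both ways. The file names are placeholders, and SASHELP.CLASS is used only because it ships with most SAS installations.

```sas
/* XML engine: transports the data set itself as XML markup. */
libname trans xml 'XML-document';          /* placeholder path */
data trans.class;
   set sashelp.class;
run;

/* ODS MARKUP: renders procedure output as XML; the result cannot be read back as a data set. */
ods markup body='ODS-output.xml';          /* placeholder path */
proc print data=sashelp.class;
run;
ods markup close;
```

The first file can later be imported with the XML engine. The second is a rendering of the PRINT procedure output and is intended for display or downstream XML processing.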
Chapter 2
The following SAS program exports an XML document from the SAS data set
MYFILES.CLASS:
libname myfiles 'SAS-library'; /* 1 */
libname trans xml 'XML-document' xmltype=oracle; /* 2 */
data trans.class; /* 3 */
   set myfiles.class;
run;
1 The first LIBNAME statement assigns the libref MYFILES to the physical location
of the SAS library that stores the SAS data set CLASS. The V9 engine is the default.
2 The second LIBNAME statement assigns the libref TRANS to the physical location
of the file (complete pathname, filename, and file extension) that will store the
exported XML document and specifies the XML engine. The engine option
XMLTYPE=ORACLE produces tags that are equivalent to the Oracle 8i XML
implementation.
3 The DATA step reads the SAS data set MYFILES.CLASS and writes its content in
ORACLE XML markup to the specified XML document.
Display 2.2
PRINT Procedure Output for WORK.TEST Containing SAS Dates, Times, and Datetimes
The following code exports an XML document for the GENERIC markup type that
includes the SAS date, time, and datetime information:
libname trans xml 'XML-document' xmltype=generic; /* 1 */
data trans.test; /* 2 */
   set work.test;
run;
1 The LIBNAME statement assigns the libref TRANS to the physical location of the
file (complete pathname, filename, and file extension) that will store the exported
XML document and specifies the XML engine. XMLTYPE= specifies the
GENERIC markup type, which is the default.
2 The DATA step reads the SAS data set WORK.TEST and writes its content in XML
markup to the specified XML document.
The first LIBNAME statement assigns the libref FORMAT to the file that will store
the generated XML document FORMAT.XML. The default behavior for the engine
is that an assigned SAS format controls numeric values.
The second LIBNAME statement assigns the libref PREC to the file that will store
the generated XML document PRECISION.XML. The XMLDOUBLE= option
specifies INTERNAL, which causes the engine to retrieve the stored raw values.
The DATA step creates the temporary data set NPI. The data set has a numeric
variable that contains values with a high precision. The variable has an assigned
user-defined format that specifies two decimal points.
The DATA step creates the data set FORMAT.DBLTEST from WORK.NPI.
The DATA step creates the data set PREC.RAWTEST from WORK.NPI.
From the data set FORMAT.DBLTEST, the PRINT procedure generates the XML
document FORMAT.XML, which contains numeric values controlled by the SAS
format. See Output 2.3 on page 14.
For the PRINT procedure output, a format was specified in order to show the
precision loss. In the output, the decimals after the second digit are zeros. See
Display 2.3 on page 15.
From the data set PREC.RAWTEST, the PRINT procedure generates the XML
document PRECISION.XML, which contains the stored numeric values. See Output
2.4 on page 16.
For the PRINT procedure output, a format was specified in order to show the
retained precision. See Display 2.4 on page 17.
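The program that these notes describe is not reproduced in this excerpt. A minimal sketch that is consistent with the notes might look like the following. The librefs, data set names, and the variable N_PI come from the surrounding text. The file paths, the loop that generates the values, and the use of the F14.2 format for the two-decimal format are assumptions.

```sas
/* Assumed paths; only the file names FORMAT.XML and PRECISION.XML come from the text. */
libname format xml 'C:\My Documents\format.xml';
libname prec xml 'C:\My Documents\precision.xml' xmldouble=internal;

/* Temporary data set with high-precision values and a two-decimal format. */
data npi;
   do n=1 to 10;
      n_pi = n*3.141592653589793;
      output;
   end;
   format n_pi f14.2;
run;

/* Export with the default behavior: the assigned format controls the values. */
data format.dbltest;
   set npi;
run;

/* Export with XMLDOUBLE=INTERNAL: the stored raw values are retrieved. */
data prec.rawtest;
   set npi;
run;

/* Print with a wider format to compare lost and retained precision. */
proc print data=format.dbltest;
   format n_pi f14.10;
run;
proc print data=prec.rawtest;
   format n_pi f14.10;
run;
```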
The following SAS program exports an XML document from the SAS data set
INPUT.SUPPLIERS:
libname input 'c:\My Documents\myfiles'; /* 1 */
filename xsd 'c:\My Documents\XML\suppliers.xsd'; /* 2 */
libname output xml 'c:\My Documents\XML\suppliers.xml' xmltype=msaccess
   xmlmeta=schemadata xmlschema=xsd; /* 3 */
data output.suppliers; /* 4 */
   set input.suppliers;
run;
1 The first LIBNAME statement assigns the libref INPUT to the physical location of
the SAS library that stores the SAS data set SUPPLIERS.
2 The FILENAME statement assigns the fileref XSD to the physical location of the
separate external file that will contain the metadata-related information.
3 The second LIBNAME statement assigns the libref OUTPUT to the physical
location of the file (complete pathname, filename, and file extension) that will store
the exported XML document and specifies the XML engine. The engine options
XMLTYPE=MSACCESS, XMLMETA=SCHEMADATA, and XMLSCHEMA=XSD are specified.
4 The DATA step reads the SAS data set INPUT.SUPPLIERS and writes its data
content in Microsoft Access database XML markup to the XML document
Suppliers.XML, and then writes the metadata information to the separate external
file Suppliers.XSD.
Part of the resulting XML document is shown in Output 2.5 on page 19. The separate
metadata information is shown in Output 2.6 on page 20.
Output 2.5 XML Document Suppliers.XML
The FILENAME statement assigns the fileref OUTPUT to the physical location of
the external file (complete pathname, filename, and file extension) to which the
exported information will be written.
The LIBNAME statement uses the fileref OUTPUT as the output location and
specifies the XML engine. It includes the following engine options:
The output is the same as the XML document that is shown in Example CDISC ODM
Document on page 141.
Chapter 3
The following SAS program translates the XML markup to SAS proprietary format:
libname trans xml 'XML-document'; /* 1 */
libname myfiles 'SAS-library'; /* 2 */
data myfiles.class; /* 3 */
   set trans.class;
run;
1 The first LIBNAME statement assigns the libref TRANS to the physical location of
the XML document (complete pathname, filename, and file extension) and specifies
the XML engine. By default, the XML engine expects GENERIC markup.
2 The second LIBNAME statement assigns the libref MYFILES to the physical
location of the SAS library that will store the resulting SAS data set. The V9 engine
is the default.
3 The DATA step reads the XML document and writes its content in SAS proprietary
format.
Issuing the following PRINT procedure produces the output for the data set that was
translated from the XML document:
proc print data=myfiles.class;
run;
The second SAS program imports the XML document with the XMLDOUBLE=INTERNAL
option, which changes the default behavior so that the engine retrieves the value from the
rawdata= attribute in the element:
libname new xml 'C:\My Documents\precision.xml' xmldouble=internal;
title 'Precision Method';
proc print data=new.rawtest;
format n_pi f14.10;
run;
First, using the default XML engine behavior, which expects XML markup to conform
to W3C specifications, the following SAS program imports only the first two
observations, which contain valid XML markup, and produces errors for the last two
records, which contain non-escaped characters:
libname permit xmlv2 'c:\My Documents\XML\permit.xml';
proc print data=permit.chars;
run;
Log 3.1 SAS Log Output
ERROR: There is an
encountered
occurred at
NOTE: There were 2
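For comparison, here is a hedged sketch of the same import with relaxed processing. It assumes the XMLPROCESS=PERMIT option of the LIBNAME statement, which is not shown in the excerpt above, so check it against the LIBNAME statement syntax before relying on it.

```sas
/* Assumption: XMLPROCESS=PERMIT tells the engine to accept non-escaped characters in data. */
libname permit xmlv2 'c:\My Documents\XML\permit.xml' xmlprocess=permit;

proc print data=permit.chars;
run;
```

Relaxing the processing is a convenience for reading imperfect input. Correcting the XML document so that it conforms to W3C specifications is usually the more robust fix.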
The following SAS program interprets the XML document as a SAS data set:
libname access xml '/u/myid/XML/suppliers.xml' xmltype=msaccess
   xmlmeta=schemadata; /* 1 */
proc print data=access.suppliers (obs=2); /* 2 */
   var contactname companyname;
run;
Display 3.5
1 The LIBNAME statement assigns the libref ACCESS to the physical location of the
XML document (complete pathname, filename, and file extension) and specifies the
XML engine. By default, the XML engine expects GENERIC markup, so you must
include the XMLTYPE= option in order to read the XML document in MSACCESS
markup and to obtain a variable's attributes from the embedded schema. The option
XMLMETA=SCHEMADATA specifies to import both data and metadata-related
information from the input XML document.
2 The PRINT procedure produces the output. The procedure uses the OBS= data set
option to print only the first two observations, and the VAR statement to print only
specific variables (columns).
Using the CONTENTS procedure, the output displays the file's attributes, as well as the
attributes of each interpreted column (variable), such as the variable's type and length,
which are obtained from the embedded XML schema. Without the embedded XML
schema, the results for the attributes would be default values.
proc contents data=access.suppliers;
run;
Display 3.6
First, using the default XML engine behavior, which does not support concatenated
XML documents (XMLCONCATENATE=NO), the following SAS program imports the
first XML document, which consists of three observations, and produces an error for the
second XML document:
libname concat xml '/u/My Documents/XML/ConcatStudents.xml';
proc datasets library=concat;
Log 3.2 SAS Log Output
Libref         CONCAT
Engine         XML
Physical Name  /u/My Documents/XML/ConcatStudents.xml
XMLType        GENERIC
XMLMap         NO XMLMAP IN EFFECT

Name      Member Type
STUDENTS  DATA
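A sketch of the contrasting case, assuming the XMLCONCATENATE=YES setting (the counterpart of the default XMLCONCATENATE=NO named above), might look like this. The PRINT step is added only to show all imported observations.

```sas
/* Assumption: XMLCONCATENATE=YES lets the engine read past the first embedded XML document. */
libname concat xml '/u/My Documents/XML/ConcatStudents.xml' xmlconcatenate=yes;

proc datasets library=concat;
quit;

proc print data=concat.students;
run;
```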
The FILENAME statement assigns the fileref ODM to the physical location of the
XML document (complete pathname, filename, and file extension).
The LIBNAME statement uses the fileref ODM to reference the XML document and
specifies the XML engine. If the fileref matches the libref, you do not need to specify
the physical location of the XML document in the LIBNAME statement. By default,
the XML engine expects GENERIC markup, so you must include the XMLTYPE=
option in order to read the XML document in CDISCODM markup.
The output from the CONTENTS procedure displays the file's attributes as well as
the attributes of each interpreted column (variable), such as the variable's type and
length. The attributes are obtained from the embedded ODM metadata content. The
VARNUM option causes the variables to be printed first in alphabetical order and
then in the order of their creation.
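The statements that these notes describe are not reproduced in this excerpt. A minimal sketch that matches the description follows. The fileref ODM, the matching libref, the XMLTYPE=CDISCODM option, and the VARNUM option come from the text. The file location and the data set name AE are hypothetical.

```sas
/* Placeholder for the complete path of the CDISC ODM XML document. */
filename odm 'XML-document';

/* The libref matches the fileref, so no physical location is needed here. */
libname odm xml xmltype=cdiscodm;

/* AE is a hypothetical data set name; substitute a member of the ODM library. */
proc contents data=odm.ae varnum;
run;
```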
Chapter 4
Display 4.1
If the data were exported without an XMLMap, the structure of the resulting XML
document would be rectangular and consist of a TEAMS element for each observation in
the SAS data set. For example:
<?xml version="1.0" encoding="windows-1252" ?>
<TABLE>
<TEAMS>
<NAME>Thrashers</NAME>
<ABBREV>ATL</ABBREV>
<CONFERENCE>Eastern</CONFERENCE>
<DIVISION>Southeast</DIVISION>
</TEAMS>
<TEAMS>
<NAME>Hurricanes</NAME>
<ABBREV>CAR</ABBREV>
<CONFERENCE>Eastern</CONFERENCE>
<DIVISION>Southeast</DIVISION>
</TEAMS>
.
.
.
</TABLE>
To export the SAS data set as an XML document that structures data hierarchically by
division within each conference, an XMLMap is required. The only change to the
existing XMLMap is to include the OUTPUT element. Notations in the XMLMap
syntax are explained.
To use an XMLMap to export the SAS data set as an XML document, you must
specify 1.9 or 2.1 as the XMLMap version number.
To use an XMLMap to export the SAS data set as an XML document, you must
include the OUTPUT element in the XMLMap. The OUTPUT element contains one
or more HEADING elements and one TABLEREF element.
The TABLEREF element, which references the name of the table to be exported,
specifies the table TEAMS.
The following SAS statements export the SAS data set named NHL.TEAMS to an XML
document named NHLOUT.XML, using an XMLMap named NHLEXPORT.MAP:
libname nhl 'C:\My Documents\myfiles';
filename out 'C:\My Documents\XML\NHLOUT.xml';
libname out xmlv2 xmltype=xmlmap
xmlmap='C:\My Documents\XML\NHLexport.map';
data out.TEAMS;
set nhl.teams;
run;
Chapter 5
how to interpret the XML markup into a SAS data set or data sets, variables (columns),
and observations (rows).
As an alternative to coding XMLMap syntax yourself, you can use SAS XML Mapper to
generate it. SAS XML Mapper removes the tedium of
creating and modifying an XMLMap by providing a GUI that generates the appropriate
XML elements for you. SAS XML Mapper analyzes the structure of an XML document
or an XML schema, and generates basic syntax for the XMLMap. See Using SAS XML
Mapper to Generate and Update an XMLMap on page 135.
After the XMLMap is created, use the XMLMAP= option in the LIBNAME statement to
specify the file.
The nested elements (repeating element instances) that occur within the container
begin with the second-level instance tag.
Here is an example of an XML document that illustrates the physical structure that is
required:
<?xml version="1.0" encoding="windows-1252" ?>
<LIBRARY>
<STUDENTS>
<ID> 0755 </ID>
<NAME> Brad Martin </NAME>
<ADDRESS> 1611 Glengreen </ADDRESS>
<CITY> Huntsville </CITY>
<STATE> Texas </STATE>
</STUDENTS>
<STUDENTS>
<ID> 1522 </ID>
<NAME> Zac Harvell </NAME>
<ADDRESS> 11900 Glenda </ADDRESS>
<CITY> Houston </CITY>
<STATE> Texas </STATE>
</STUDENTS>
.
.
The engine goes to the second-level instance tag, which is <STUDENTS>, translates
it as the data set name, and begins scanning the elements that are nested (contained)
between the <STUDENTS> start tag and the </STUDENTS> end tag, looking for
variables.
Because the instance tags <ID>, <NAME>, <ADDRESS>, <CITY>, and <STATE>
are contained within the <STUDENTS> start tag and </STUDENTS> end tag, the
XML engine interprets them as variables. The individual instance tag names become
the data set variable names. The repeating element instances are translated into a
collection of rows with a constant set of columns.
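Because a document with this structure needs no XMLMap, importing it can be sketched in a few statements. The libref STUD and the path placeholder are assumptions. The data set name STUDENTS follows from the second-level instance tag.

```sas
/* Placeholder for the complete path of the XML document shown above. */
libname stud xml 'XML-document';

/* The variables ID, NAME, ADDRESS, CITY, and STATE should appear in the listing. */
proc contents data=stud.students;
run;

proc print data=stud.students;
run;
```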
.
. more instances of <LOWTEMP>
.
</CLIMATE>
The XML engine recognizes the first instance tag <CLIMATE> as the root-enclosing element, which is the container for the document.
Starting with the second-level instance tag, which is <HIGHTEMP>, the XML
engine uses the repeating element instances as a collection of rows with a constant
set of columns.
When the second-level instance tag changes, the XML engine interprets that change
as a different SAS data set.
The result is two SAS data sets: HIGHTEMP and LOWTEMP. Both happen to have the
same variables but different data.
To ensure that an import result is what you expect, use the DATASETS procedure. For
example, these SAS statements result in the following:
libname climate xml 'C:\My Documents\climate.xml';
proc datasets library=climate;
quit;
abbrev="WSH" />
abbrev="DAL" />
abbrev="LA" />
abbrev="ANA" />
abbrev="PHX" />
abbrev="SJ" />
<TYPE>character</TYPE>
<DATATYPE>STRING</DATATYPE>
<LENGTH>10</LENGTH>
</COLUMN>
</TABLE>
</SXLEMAP>
The previous XMLMap syntax defines how to translate the XML markup as explained
below using the following data investigation steps:
1 Identify the SAS data set observation boundary, which translates into a collection of
rows with a constant set of columns.
In the XML document, information about individual teams occurs in a <TEAM> tag
located within <CONFERENCE> and <DIVISION> enclosures. You want a new
observation generated each time a TEAM element is read.
The first FILENAME statement assigns the file reference NHL to the physical
location (complete pathname, filename, and file extension) of the XML document
named NHL.XML.
The second FILENAME statement assigns the file reference MAP to the physical
location of the XMLMap named NHL.MAP.
The LIBNAME statement uses the file reference NHL to reference the XML
document. It specifies the XMLV2 engine and uses the file reference MAP to
reference the XMLMap.
PROC PRINT produces output, verifying that the import was successful.
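The statements that these notes describe are not reproduced in this excerpt. A minimal sketch that follows the description is shown here. The filerefs NHL and MAP and the file names NHL.XML and NHL.MAP come from the text. The directory paths and the data set name TEAMS are assumptions.

```sas
/* Assumed directory; only the file names come from the text. */
filename nhl 'C:\My Documents\XML\NHL.xml';
filename map 'C:\My Documents\XML\NHL.map';

/* The libref matches the fileref NHL; XMLMAP= references the fileref MAP. */
libname nhl xmlv2 xmlmap=map;

/* TEAMS is assumed to be the table name defined in the XMLMap. */
proc print data=nhl.teams;
run;
```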
Display 5.3
The XML document can be successfully imported by creating an XMLMap that defines
how to map the XML markup. The following is the XMLMap named RSS.MAP, which
contains the syntax that is needed to successfully import RSS.XML. The syntax tells the
XML engine how to interpret the XML markup as explained in the subsequent
descriptions. The contents of RSS.XML result in two SAS data sets: CHANNEL to
contain content information and ITEMS to contain the individual news stories.
<?xml version="1.0" encoding="UTF-8"?>
<SXLEMAP name="SXLEMap" version="2.1"> <!-- 1 -->
<TABLE name="CHANNEL"> <!-- 2 -->
<TABLE-PATH syntax="XPath">/rss/channel</TABLE-PATH> <!-- 3 -->
<TABLE-END-PATH beginend="BEGIN" syntax="XPath">
/rss/channel/item</TABLE-END-PATH> <!-- 4 -->
<COLUMN name="title"> <!-- 5 -->
<PATH syntax="XPath">/rss/channel/title</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>200</LENGTH>
</COLUMN>
<COLUMN name="link"> <!-- 6 -->
<PATH syntax="XPath">/rss/channel/link</PATH>
<DESCRIPTION>Story link</DESCRIPTION>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>200</LENGTH>
</COLUMN>
<COLUMN name="description">
<PATH syntax="XPath">/rss/channel/description</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>1024</LENGTH>
</COLUMN>
<COLUMN name="language">
<PATH syntax="XPath">/rss/channel/language</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>8</LENGTH>
</COLUMN>
<COLUMN name="version"> <!-- 7 -->
<PATH syntax="XPath">/rss@version</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>8</LENGTH>
</COLUMN>
</TABLE>
<TABLE description="Individual news stories" name="ITEMS"> <!-- 8 -->
<TABLE-PATH syntax="XPath">/rss/channel/item</TABLE-PATH>
<COLUMN name="title"> <!-- 9 -->
<PATH syntax="XPath">/rss/channel/item/title</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>200</LENGTH>
</COLUMN>
<COLUMN name="URL"> <!-- 10 -->
<PATH syntax="XPath">/rss/channel/item/link</PATH>
<DESCRIPTION>Story link</DESCRIPTION>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>200</LENGTH>
</COLUMN>
<COLUMN name="description"> <!-- 10 -->
<PATH syntax="XPath">/rss/channel/item/description</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>1024</LENGTH>
</COLUMN>
</TABLE>
</SXLEMAP>
The previous XMLMap defines how to translate the XML markup as explained below:
3 Element specifying the location path that defines where in the XML document to
collect variables for the CHANNEL data set.
4 Element specifying the location path that specifies when to stop processing data for
the CHANNEL data set.
5 Element containing the attributes for the TITLE variable in the CHANNEL data set.
The XPath construction specifies where to find the current tag and to access data
from the named element.
7 Element containing the attributes for the last variable in the CHANNEL data set,
which is VERSION. This XPath construction specifies where to find the current tag
and uses the attribute form to access data from the named attribute.
9 Element containing the attributes for the TITLE variable in the ITEMS data set.
10 Subsequent COLUMN elements define other variables for the ITEMS data set,
DATASETS Procedure Output for RSS Library Showing Two Data Sets
many relationships. Top items have one or more items below them (for example, customer
to orders).
This example explains how to define an XMLMap in order to import an XML document
as two data sets that have related information.
Here is the XML document Pharmacy.XML. The file contains hierarchical data with
related entities in the form of individual customers and their prescriptions. Each
customer can have one or multiple prescriptions. Notice that PRESCRIPTION elements
are nested within each <PERSON> start tag and </PERSON> end tag:
<?xml version="1.0" ?>
<PHARMACY>
<PERSON>
<NAME>Brad Martin</NAME>
<STREET>11900 Glenda Court</STREET>
<CITY>Austin</CITY>
<PRESCRIPTION>
<NUMBER>1234</NUMBER>
<DRUG>Tetracycline</DRUG>
</PRESCRIPTION>
<PRESCRIPTION>
<NUMBER>1245</NUMBER>
<DRUG>Lomotil</DRUG>
</PRESCRIPTION>
</PERSON>
<PERSON>
<NAME>Jim Spano</NAME>
<STREET>1611 Glengreen</STREET>
<CITY>Austin</CITY>
<PRESCRIPTION>
<NUMBER>1268</NUMBER>
<DRUG>Nexium</DRUG>
</PRESCRIPTION>
</PERSON>
</PHARMACY>
To import separate data sets, one describing the customers and the other containing
prescription information, a relation between each customer and associated prescriptions
must be designated in order to know which prescriptions belong to each customer.
An XMLMap defines how to translate the XML markup into two SAS data sets. The
Person data set imports the name and address of each customer, and the Prescription data
set imports the customer's name, prescription number, and drug. Notations in the
XMLMap syntax are explained below.
Note: The XMLMap was generated by using SAS XML Mapper.
<?xml version="1.0" encoding="UTF-8"?>
<!-- ############################################################ -->
<!-- 2011-01-10T14:39:38 -->
<!-- SAS XML Libname Engine Map -->
<!-- Generated by XML Mapper, 903000.1.0.20101208190000_v930 -->
<!-- ############################################################ -->
<!-- ### Validation report ### -->
<!-- ############################################################ -->
<!-- XMLMap validation completed successfully. -->
<!-- ############################################################ -->
SXLEMAP is the root-enclosing element for the two SAS data set definitions.
COLUMN elements contain the attributes for the Name, Street, and City variables in
the Person data set.
COLUMN element contains the attributes for the Name variable in the Prescription
data set. Specifying the retain="YES" attribute causes the name to be held for
each observation until it is replaced by a different value. (The retain= attribute is like
the SAS DATA step RETAIN statement, which causes a variable to retain its value
from one iteration of the DATA step to the next.)
COLUMN elements contain the attributes for the Number and Drug variables in the
Prescription data set.
The following SAS statements import the XML document and specify the XMLMap:
filename pharm 'c:\My Documents\Pharmacy.xml';
filename map 'c:\My Documents\Pharmacy.map';
libname pharm xmlv2 xmlmap=map;
quit;
The DATASETS procedure verifies that SAS interprets the XML document
Pharmacy.XML as two SAS data sets: PHARM.PERSON and
PHARM.PRESCRIPTION.
proc datasets library=pharm;
quit;
Display 5.5
Here is the PRINT procedure output for both of the imported SAS data sets.
Display 5.7
will not match. Each table must have the same generated key for like-named data
elements.
The following XMLMap imports the Pharmacy.XML document as two SAS data sets that
have related information and also creates a key field that holds generated numeric key
values:
<?xml version="1.0" encoding="UTF-8"?>
<!-- ############################################################ -->
<!-- 2011-01-10T14:39:38 -->
<!-- SAS XML Libname Engine Map -->
<!-- Generated by XML Mapper, 903000.1.0.20101208190000_v930 -->
<!-- ############################################################ -->
<!-- ### Validation report ### -->
<!-- ############################################################ -->
<!-- XMLMap validation completed successfully. -->
<!-- ############################################################ -->
<SXLEMAP name="AUTO_GEN" version="2.1">
<NAMESPACES count="0"/>
<!-- ############################################################ -->
<TABLE description="PERSON" name="PERSON">
<TABLE-PATH syntax="XPath">/PHARMACY/PERSON</TABLE-PATH> <!-- 1 -->
<COLUMN name="KEY" retain="YES" class="ORDINAL"> <!-- 2 -->
<INCREMENT-PATH
syntax="XPath">/PHARMACY/PERSON</INCREMENT-PATH>
<TYPE>numeric</TYPE>
<DATATYPE>integer</DATATYPE>
<FORMAT width="3">Z</FORMAT>
</COLUMN>
<COLUMN name="NAME">
<PATH syntax="XPath">/PHARMACY/PERSON/NAME</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>11</LENGTH>
</COLUMN>
<COLUMN name="STREET">
<PATH syntax="XPath">/PHARMACY/PERSON/STREET</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>18</LENGTH>
</COLUMN>
<COLUMN name="CITY">
<PATH syntax="XPath">/PHARMACY/PERSON/CITY</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>6</LENGTH>
</COLUMN>
</TABLE>
The following explains the XMLMap syntax that generates the key fields:
1 In the TABLE element that defines the Person data set, the TABLE-PATH element
identifies the observation boundary for the data set. The location path generates a
new observation each time a PERSON element is read.
2 For the Person data set, the COLUMN element for the Key variable contains the
class="ORDINAL" attribute as well as the INCREMENT-PATH element. The XML
engine follows this process to generate the key field values for the Person data set:
1. When the XML engine encounters the <PERSON> start tag, it reads the value
into the input buffer, and then increments the value for the Key variable by 1.
2. The XML engine continues reading values into the input buffer until it
encounters the </PERSON> end tag, at which time it writes the completed input
buffer to the SAS data set as one observation.
3. The process is repeated for each <PERSON> start tag (from INCREMENT-PATH)
and </PERSON> end tag (from TABLE-PATH) sequence.
4. The result is four variables and two observations.
3 In the TABLE element that defines the Prescription data set, the TABLE-PATH
element identifies the observation boundary for the data set. The location path
generates a new observation each time a PRESCRIPTION element is read.
4 For the Prescription data set, the COLUMN element for the Key variable contains
the class="ORDINAL" attribute as well as the INCREMENT-PATH element.
The XML engine follows this process to generate the key field values for the
Prescription data set:
1. When the XML engine encounters the <PERSON> start tag, it reads the value
into the input buffer, and then increments the value for the Key variable by 1.
2. The XML engine continues reading values into the input buffer until it
encounters the </PRESCRIPTION> end tag, at which time it writes the
completed input buffer to the SAS data set as one observation. Because the
location paths for the counter variables must be the same for both TABLE
elements, the behavior of the XML engine for the Prescription data set Key
variable is the same as the Person data set Key variable. Although the XML
engine tracks the occurrence of a PERSON tag as a key for both counter
variables, the observations are derived from different TABLE-PATH locations.
3. The process is repeated for each <PERSON> start tag (from INCREMENT-PATH)
and </PRESCRIPTION> end tag (from TABLE-PATH) sequence.
4. The result is three variables and three observations.
The following SAS statements import the XML document:
filename pharm 'c:\My Documents\XML\Pharmacy.xml';
filename map 'c:\My Documents\XML\PharmacyOrdinal.map';
libname pharm xmlv2 xmlmap=map;
Here is the PRINT procedure output for both of the imported SAS data sets with a
numeric key:
Display 5.8
Display 5.9
Chapter 5
The above XML document contains three sequences of element start tags and
end tags: VEHICLES, FORD, and ROW. If you specify the following table location path
and column location paths, the XML engine processes the XML document as follows:
<TABLE-PATH syntax="XPath"> /VEHICLES/FORD </TABLE-PATH>
<PATH syntax="XPath"> /VEHICLES/FORD/ROW/Model </PATH>
<PATH syntax="XPath"> /VEHICLES/FORD/ROW/Year </PATH>
1. The XML engine reads the XML markup until it encounters the <FORD> start tag,
because FORD is the last element specified in the table location path.
2. The XML engine clears the input buffer and scans subsequent elements for variables
based on the column location paths. As a value for each variable is encountered, it is
read into the input buffer. For example, after reading the first ROW element, the
input buffer contains the values Mustang and 1965.
3. The XML engine continues reading values into the input buffer until it encounters
the </FORD> end tag, at which time it writes the completed input buffer to the SAS
data set as an observation.
4. The end result is one observation, which is not what you want.
Here is the PRINT procedure listing output showing the concatenated observation. (The
data in the observation is truncated due to the LENGTH element.)
Output 5.1 PRINT Procedure Output Showing Unacceptable FORD Data Set
Model
Year
1965
To get separate observations, you must change the table location path so that the XML
engine writes separate observations to the SAS data set. Here are the correct location
paths and the process that the engine would follow:
<TABLE-PATH syntax="XPath"> /VEHICLES/FORD/ROW </TABLE-PATH>
<PATH syntax="XPath"> /VEHICLES/FORD/ROW/Model </PATH>
<PATH syntax="XPath"> /VEHICLES/FORD/ROW/Year </PATH>
1. The XML engine reads the XML markup until it encounters the <ROW> start tag,
because ROW is the last element specified in the table location path.
2. The XML engine clears the input buffer and scans subsequent elements for variables
based on the column location paths. As a value for each variable is encountered, it is
read into the input buffer.
3. The XML engine continues reading values into the input buffer until it encounters
the </ROW> end tag, at which time it writes the completed input buffer to the SAS
data set as an observation. That is, one observation is written to the SAS data set that
contains the values Mustang and 1965.
4. The process is repeated for each <ROW> start-tag and </ROW> end-tag sequence.
5. The result is four observations.
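The effect of the table location path on the number of observations can be sketched in Python. This is a simplified model of the steps above, not the XML engine: the input buffer is cleared at the start tag that matches the table path and written at the matching end tag. The sample document is hypothetical apart from the first row (Mustang, 1965), and the sketch collects repeated values in a list rather than modeling how the engine combines or truncates them.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document: only the first row (Mustang, 1965) comes from the text.
SAMPLE = b"""<VEHICLES>
  <FORD>
    <ROW><Model>Mustang</Model><Year>1965</Year></ROW>
    <ROW><Model>Model B</Model><Year>1970</Year></ROW>
    <ROW><Model>Model C</Model><Year>1980</Year></ROW>
    <ROW><Model>Model D</Model><Year>1990</Year></ROW>
  </FORD>
</VEHICLES>"""

COLUMNS = {
    "Model": "/VEHICLES/FORD/ROW/Model",
    "Year": "/VEHICLES/FORD/ROW/Year",
}


def read_observations(xml_bytes, table_path, columns):
    """Emit one buffer per start-tag/end-tag pair that matches table_path."""
    stack, buffer, rows = [], None, []
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            stack.append(elem.tag)
            if "/" + "/".join(stack) == table_path:
                buffer = {name: [] for name in columns}   # clear the input buffer
            continue
        path = "/" + "/".join(stack)
        if buffer is not None:
            for name, column_path in columns.items():
                if path == column_path:
                    buffer[name].append(elem.text.strip())
        if path == table_path:
            rows.append(buffer)                           # write one observation
            buffer = None
        stack.pop()
    return rows


one_per_ford = read_observations(SAMPLE, "/VEHICLES/FORD", COLUMNS)
one_per_row = read_observations(SAMPLE, "/VEHICLES/FORD/ROW", COLUMNS)
print(len(one_per_ford), len(one_per_row))
```

Ending the table path at FORD yields a single observation that holds every row's values, while ending it at ROW yields one observation per row.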
Here is the complete XMLMap syntax:
<?xml version="1.0" ?>
<SXLEMAP version="2.1" name="path" description="XMLMap for path">
<TABLE name="FORD">
<TABLE-PATH syntax="XPath"> /VEHICLES/FORD/ROW </TABLE-PATH>
<COLUMN name="Model">
<DATATYPE> string </DATATYPE>
<LENGTH> 20 </LENGTH>
<TYPE> character </TYPE>
<PATH syntax="XPath"> /VEHICLES/FORD/ROW/Model </PATH>
</COLUMN>
<COLUMN name="Year">
The following SAS statements import the XML document and specify the XMLMap.
The PRINT procedure verifies the results.
filename PATH 'c:\My Documents\XML\path.xml';
filename MAP 'c:\My Documents\XML\path.map';
libname PATH xmlv2 xmlmap=MAP;
proc print data=PATH.FORD noobs;
run;
Display 5.10
Here is the XMLMap syntax to use in order to import the previous XML document:
<?xml version="1.0" ?>
<SXLEMAP version="1.2">
<TABLE name="Publication">
<TABLE-PATH syntax="XPath">
/Library/Publication/Topic 1
</TABLE-PATH>
<COLUMN name="Title" retain="YES">
<PATH>
/Library/Publication/Title
</PATH>
<TYPE>character</TYPE>
<DATATYPE>STRING</DATATYPE>
<LENGTH>19</LENGTH>
</COLUMN>
<COLUMN name="Acquired" retain="YES">
<PATH>
/Library/Publication/Acquired
</PATH>
<TYPE>numeric</TYPE>
<DATATYPE>FLOAT</DATATYPE>
<LENGTH>10</LENGTH>
<FORMAT width="10" >mmddyy</FORMAT> 2
<INFORMAT width="10" >mmddyy</INFORMAT>
</COLUMN>
<COLUMN name="Topic">
<PATH>
/Library/Publication/Topic</PATH>
<TYPE>character</TYPE>
<DATATYPE>STRING</DATATYPE>
<LENGTH>9</LENGTH>
The previous XMLMap tells the XML engine how to interpret the XML markup as
explained below:
1 The TOPIC element determines the location path that defines where in the XML
document to collect variables for the SAS data set. An observation is written each
time a </TOPIC> end tag is encountered in the XML document.
2 For the ACQUIRED column, the date is constructed using the XMLMap syntax
FORMAT element. Elements like FORMAT and INFORMAT are useful for
situations where data must be converted for use by SAS. The XML engine also
supports user-written formats and informats, which can be used independently of
each other.
3 Enumerations are also supported by XMLMap syntax. The ENUM element specifies
that the value for the column MAJOR must be either Y or N. Incoming values not
contained within the ENUM list are set to MISSING.
The following SAS statements import the XML document and specify the XMLMap.
The PRINT procedure verifies the results.
filename REP 'C:\My Documents\XML\Rep.xml';
filename MAP 'C:\My Documents\XML\Rep.map';
libname REP xml xmlmap=MAP;
proc print data=REP.Publication noobs;
run;
The following XMLMap imports the XML document using the SAS informats and
formats to read and write the date values:
<?xml version="1.0" encoding="UTF-8"?>
<!-- ############################################################ -->
<!-- 2011-01-11T13:20:17 -->
<!-- SAS XML Libname Engine Map -->
<!-- Generated by XML Mapper, 903000.1.0.20101208190000_v930 -->
<!-- ############################################################ -->
<!-- ### Validation report                                    ### -->
<!-- ############################################################ -->
<!-- XMLMap validation completed successfully. -->
<!-- ############################################################ -->
The following explains the XMLMap syntax that imports the date values:
1 For the Basic variable, the FORMAT element specifies the E8601DA SAS format,
which writes date values in the extended format yyyy-mm-dd.
2 For the Basic variable, the INFORMAT element specifies the B8601DA SAS
informat, which reads date values into a variable in the basic format yyyymmdd.
Note: As recommended, when you read values into a variable with a basic format
SAS informat, this example writes the values with the corresponding extended
format SAS format.
3 For the Extended variable, the FORMAT element specifies the E8601DA SAS
format, which writes date values in the extended format yyyy-mm-dd.
4 For the Extended variable, the INFORMAT element specifies the E8601DA SAS
informat, which reads date values into a variable in the extended format yyyy-mm-dd.
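The difference between the basic and extended ISO 8601 date forms can be illustrated with a short Python sketch. This is an analogue only: the functions below are hypothetical helpers built on Python's standard library and are not the SAS informats or formats themselves. The sample value 2011-01-11 is an assumption, because the contents of ISOdate.xml are not shown here.

```python
from datetime import datetime


def read_basic(text):
    """Parse the basic form yyyymmdd (the form B8601DA is described as reading)."""
    return datetime.strptime(text, "%Y%m%d").date()


def read_extended(text):
    """Parse the extended form yyyy-mm-dd (the form E8601DA is described as reading)."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def write_extended(value):
    """Write the extended form yyyy-mm-dd."""
    return value.isoformat()


basic = read_basic("20110111")
extended = read_extended("2011-01-11")
print(write_extended(basic), write_extended(extended))
```

Both inputs produce the same stored date, and both are written in the extended form, which mirrors the recommendation in the note above.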
The following SAS statements import the XML document and display PRINT procedure
output:
filename dates 'c:\My Documents\XML\ISOdate.xml';
filename map 'c:\My Documents\XML\ISOdate.map';
libname dates xmlv2 xmlmap=map;
proc print data=dates.isodate;
run;
Using ISO 8601 SAS Informats and Formats to Import Time Values with a Time Zone
Display 5.12
The following XMLMap imports the XML document using the SAS informats and
formats to read and write the time values:
<?xml version="1.0" encoding="UTF-8"?>
<!-- ############################################################ -->
<!-- 2011-01-11T13:31:41 -->
<!-- SAS XML Libname Engine Map -->
<!-- Generated by XML Mapper, 903000.1.0.20101208190000_v930 -->
<!-- ############################################################ -->
<!-- ### Validation report                                    ### -->
<!-- ############################################################ -->
<!-- XMLMap validation completed successfully. -->
The following explains the XMLMap syntax that imports the time values:
1 For the Local variable, the INFORMAT and FORMAT elements specify the
E8601TM SAS informat and format, which read and write time values in the
extended format hh:mm:ss.ffffff. Because there is no time zone indicator, the context
of the value is local time.
2 For the Localzone variable, which reads the same value as the Local variable, the
INFORMAT element specifies the E8601TM SAS informat, which reads time values
in the extended format hh:mm:ss.ffffff. Because there is no time zone indicator, the
context of the value is local time.
The FORMAT element, however, specifies the E8601LZ SAS format, which writes
time values in the extended format hh:mm:ss+|-hh:mm. The E8601LZ format
appends the UTC offset to the value as determined by the local, current SAS session.
Using the E8601LZ format enables you to provide a time notation in order to
eliminate the ambiguity of local time.
Note: Even with the time notation, it is recommended that you do not mix time-based values.
3 For the UTC variable, the INFORMAT and FORMAT elements specify the
E8601TZ SAS informat and format, which read and write time values in the
extended format hh:mm:ss+|-hh:mm. Because there is a time zone indicator, the
value is assumed to be expressed in UTC. No adjustment or conversion is made to
the value.
4 For the Offset variable, the INFORMAT and FORMAT elements specify the
E8601TZ SAS informat and format, which read and write time values in the
extended format hh:mm:ss+|-hh:mm. Because there is a time zone offset present,
when the time value is read into the variable using the time zone-sensitive SAS
informat, the value is adjusted to UTC as requested via the time zone indicator, but
the time zone context is not stored with the value. When the time value is written
using the time zone-sensitive SAS format, the value is expressed as UTC with a
zero offset value and is not adjusted to or from local time.
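The behavior described for the UTC and Offset variables (adjust to UTC on input, discard the zone context, write with a zero offset) can be sketched in Python. This is an illustrative analogue, not the E8601TZ informat or format. The helper names and the sample values are assumptions, because the contents of Time.xml are not shown here.

```python
from datetime import date, datetime, time, timezone


def read_tz_time(text):
    """Read hh:mm:ss+|-hh:mm and return a naive time already shifted to UTC."""
    parsed = time.fromisoformat(text)
    if parsed.tzinfo is None:
        raise ValueError("expected a time zone offset")
    # An arbitrary date is attached only so that the offset arithmetic can be done.
    stamp = datetime.combine(date(2000, 1, 1), parsed)
    return stamp.astimezone(timezone.utc).time()   # zone context is dropped here


def write_tz_time(value):
    """Write a stored (UTC) time with a zero offset."""
    return value.strftime("%H:%M:%S") + "+00:00"


utc_value = read_tz_time("09:30:47+00:00")
offset_value = read_tz_time("09:30:47-05:00")
print(write_tz_time(utc_value), write_tz_time(offset_value))
```

A value that arrives with a zero offset is stored unchanged, while a value with a -05:00 offset is shifted forward five hours before it is stored. Neither stored value remembers its original offset.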
The following SAS statements import the XML document and display the PRINT
procedure output:
filename timzn 'c:\My Documents\XML\Time.xml';
filename map 'c:\My Documents\XML\Time.map';
libname timzn xmlv2 xmlmap=map;
proc print data=timzn.time;
run;
Display 5.13
The first FILENAME statement assigns the fileref NHL to the XML document by
using the URL access method.
The second FILENAME statement assigns the fileref MAP to the physical location
of the XMLMap NHL.map.
The LIBNAME statement uses the fileref NHL to reference the XML document,
specifies the XML engine, and uses the fileref MAP to reference the XMLMap.
The COPY procedure reads the XML document and writes its content as a
temporary SAS data set. When using the URL access method, you should include the
step to create the SAS data set with either a COPY procedure or a DATA step.
Here is the XMLMap used to import the XML document, with notations for each XPath
form on the PATH element:
<?xml version="1.0" ?>
<SXLEMAP version="1.2">
<TABLE name="TEAMS">
<TABLE-PATH syntax="XPath">
/NHL/CONFERENCE/DIVISION/TEAM
</TABLE-PATH>
<COLUMN name="ABBREV">
<PATH syntax="XPath">
/NHL/CONFERENCE/DIVISION/TEAM/@abbrev 1
</PATH>
<TYPE>character</TYPE>
<DATATYPE>STRING</DATATYPE>
<LENGTH>3</LENGTH>
</COLUMN>
<COLUMN name="FOUNDED">
<PATH syntax="XPath">
/NHL/CONFERENCE/DIVISION/TEAM/@founded[@abbrev="ATL"] 2
</PATH>
<TYPE>character</TYPE>
<DATATYPE>STRING</DATATYPE>
<LENGTH>10</LENGTH>
</COLUMN>
<COLUMN name="CONFERENCE" retain="YES">
<PATH syntax="XPath">
/NHL/CONFERENCE 3
</PATH>
<TYPE>character</TYPE>
<DATATYPE>STRING</DATATYPE>
<LENGTH>10</LENGTH>
</COLUMN>
<COLUMN name="TEAM">
<PATH syntax="XPath">
/NHL/CONFERENCE/DIVISION/TEAM[@founded="1993"] 4
</PATH>
<TYPE>character</TYPE>
<DATATYPE>STRING</DATATYPE>
<LENGTH>10</LENGTH>
</COLUMN>
<COLUMN name="TEAM5">
<PATH syntax="XPath">
/NHL/CONFERENCE/DIVISION/TEAM[position()=5] 5
</PATH>
1 The Abbrev variable uses the attribute form that selects values from a specific
attribute. The engine scans the XML markup until it finds the TEAM element. The
engine retrieves the value from the abbrev= attribute, which results in each team
abbreviation.
2 The Founded variable uses the attribute form that conditionally selects from a
specific attribute based on the value of another attribute. The engine scans the XML
markup until it finds the TEAM element. The engine retrieves the value from the
founded= attribute where the value of the abbrev= attribute is ATL, which results in
the value 1999. The two attributes must be for the same element.
3 The Conference variable uses the element form that selects PCDATA from a named
element. The engine scans the XML markup until it finds the CONFERENCE
element. The engine retrieves the value between the <CONFERENCE> start tag and
the </CONFERENCE> end tag, which results in the value Eastern.
4 The Team variable uses the element form that conditionally selects PCDATA from a
named element. The engine scans the XML markup until it finds the TEAM element
where the value of the founded= attribute is 1993. The engine retrieves the value
between the <TEAM> start tag and the </TEAM> end tag, which results in the value
Panthers.
5 The Team5 variable uses the element form that conditionally selects PCDATA from
a named element based on a specific occurrence of the element. The position
function tells the engine to scan the XML markup until it finds the fifth occurrence
of the TEAM element. The engine retrieves the value between the <TEAM> start tag
and the </TEAM> end tag, which results in the value Capitals.
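The five selection forms can be tried against a small document with Python's standard xml.etree.ElementTree module, which supports a limited XPath subset. This is a rough analogue, not the XML engine: ElementTree places the attribute predicate on the element rather than after the attribute name, and its [5] predicate counts position among siblings, which coincides with the fifth occurrence only because the sample has a single division. The sample document is hypothetical. Only the values quoted in the text (ATL and 1999, 1993 and Panthers, Eastern, Capitals) are taken from the document, and the remaining names and attribute values are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt standing in for NHLShort.xml.
SAMPLE = """<NHL>
  <CONFERENCE> Eastern
    <DIVISION> Southeast
      <TEAM founded="1999" abbrev="ATL"> Team1 </TEAM>
      <TEAM founded="0002" abbrev="T2"> Team2 </TEAM>
      <TEAM founded="1993" abbrev="T3"> Panthers </TEAM>
      <TEAM founded="0004" abbrev="T4"> Team4 </TEAM>
      <TEAM founded="0005" abbrev="T5"> Capitals </TEAM>
    </DIVISION>
  </CONFERENCE>
</NHL>"""

root = ET.fromstring(SAMPLE)
teams = "./CONFERENCE/DIVISION/TEAM"

# 1: attribute form, one value per TEAM element
abbrevs = [t.get("abbrev") for t in root.findall(teams)]
# 2: attribute selected conditionally on another attribute of the same element
founded = root.find(teams + "[@abbrev='ATL']").get("founded")
# 3: element form, text between the start tag and the first child
conference = root.find("./CONFERENCE").text.strip()
# 4: element selected conditionally on an attribute value
team = root.find(teams + "[@founded='1993']").text.strip()
# 5: element selected by position
team5 = root.find(teams + "[5]").text.strip()

print(abbrevs, founded, conference, team, team5)
```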
The following SAS statements import the XML document NHLShort.XML and specify
the XMLMap named NHL1.MAP. The PRINT procedure shows the resulting variables
with selected values:
filename NHL 'C:\My Documents\XML\NHLShort.xml';
filename MAP 'C:\My Documents\XML\NHL1.map';
libname NHL xml xmlmap=MAP;
proc print data=NHL.TEAMS noobs;
run;
Here is the XMLMap that was used to import the XML document. Notations describe
the namespace elements.
<SXLEMAP name="Namespace" version="2.1">
<NAMESPACES count="3"> 1
<NS id="1" prefix="HOME">http://sample.url.org/home</NS> 2
<NS id="2" prefix="IP">http://sample.url.org/ip</NS>
<NS id="3" prefix="WORK">http://sample.url.org/work</NS>
</NAMESPACES>
<TABLE description="PERSON" name="PERSON"> 3
<TABLE-PATH syntax="XPath">/PEOPLE/PERSON</TABLE-PATH>
<COLUMN name="NAME"> 4
<PATH syntax="XPath">/PEOPLE/PERSON/NAME</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>13</LENGTH>
</COLUMN>
<COLUMN name="ADDRESS"> 4
<PATH syntax="XPathENR">/PEOPLE/PERSON/{1}ADDRESS</PATH> 5
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>16</LENGTH>
</COLUMN>
<COLUMN name="PHONE"> 4
<PATH syntax="XPathENR">/PEOPLE/PERSON/{1}PHONE</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>12</LENGTH>
</COLUMN>
<COLUMN name="ADDRESS1"> 4
<PATH syntax="XPathENR">/PEOPLE/PERSON/{3}ADDRESS</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>26</LENGTH>
</COLUMN>
<COLUMN name="PHONE1"> 4
<PATH syntax="XPathENR">/PEOPLE/PERSON/{3}PHONE</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>12</LENGTH>
</COLUMN>
<COLUMN name="ADDRESS2"> 4
<PATH syntax="XPathENR">/PEOPLE/PERSON/{2}ADDRESS</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>11</LENGTH>
</COLUMN>
</TABLE>
</SXLEMAP>
Display 5.15
abbrev="DAL" />
abbrev="LA" />
abbrev="ANA" />
abbrev="PHX" />
<TEAM name="Sharks" abbrev="SJ" />
</DIVISION>
</CONFERENCE>
</NHL>
1. The first FILENAME statement assigns the file reference NHL to the physical
location (complete pathname, filename, and file extension) of the XML document
named NHL.XML to be imported.
2. The second FILENAME statement assigns the file reference MAP to the physical
location of the XMLMap named NHLGenerate.MAP to be generated.
3. The LIBNAME statement includes the following arguments:
The LIBNAME statement assigns the library reference NHL, which matches the
file reference that is assigned in the first FILENAME statement. Because the
library reference and file reference match, the physical location of the XML
document to be imported does not have to be specified in the LIBNAME
statement.
The XMLMAP= option specifies the file reference MAP, which matches the file
reference that is assigned in the second FILENAME statement. The file reference
is associated with the physical location of the XMLMap to be generated.
4. PROC PRINT produces output, verifying that the import was successful.
Here is the generated NHLGenerate.MAP XMLMap:
<!-- ############################################################ -->
<TABLE description="DIVISION" name="DIVISION">
<TABLE-PATH syntax="XPath">/NHL/CONFERENCE/DIVISION</TABLE-PATH>
<COLUMN class="ORDINAL" name="CONFERENCE_ORDINAL">
<INCREMENT-PATH beginend="BEGIN" syntax="XPath">/NHL/CONFERENCE</INCREMENT-PATH>
<TYPE>numeric</TYPE>
<DATATYPE>integer</DATATYPE>
</COLUMN>
<COLUMN class="ORDINAL" name="DIVISION_ORDINAL">
<INCREMENT-PATH beginend="BEGIN" syntax="XPath">/NHL/CONFERENCE/DIVISION</INCREMENT-PATH>
<TYPE>numeric</TYPE>
<DATATYPE>integer</DATATYPE>
</COLUMN>
<COLUMN name="DIVISION">
<PATH syntax="XPath">/NHL/CONFERENCE/DIVISION</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>9</LENGTH>
</COLUMN>
</TABLE>
<!-- ############################################################ -->
<TABLE description="TEAM" name="TEAM">
<TABLE-PATH syntax="XPath">/NHL/CONFERENCE/DIVISION/TEAM</TABLE-PATH>
<COLUMN class="ORDINAL" name="DIVISION_ORDINAL">
<INCREMENT-PATH beginend="BEGIN" syntax="XPath">/NHL/CONFERENCE/DIVISION</INCREMENT-PATH>
<TYPE>numeric</TYPE>
<DATATYPE>integer</DATATYPE>
</COLUMN>
<COLUMN class="ORDINAL" name="TEAM_ORDINAL">
<INCREMENT-PATH beginend="BEGIN" syntax="XPath">/NHL/CONFERENCE/DIVISION/TEAM</INCREMENT-PATH>
<TYPE>numeric</TYPE>
<DATATYPE>integer</DATATYPE>
</COLUMN>
<COLUMN name="name">
<PATH syntax="XPath">/NHL/CONFERENCE/DIVISION/TEAM/@name</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>10</LENGTH>
</COLUMN>
<COLUMN name="abbrev">
<PATH syntax="XPath">/NHL/CONFERENCE/DIVISION/TEAM/@abbrev</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>3</LENGTH>
</COLUMN>
</TABLE>
</SXLEMAP>
Chapter 6
What Is a Tagset?
Creating Customized Tagsets
Exporting an XML Document Using a Customized Tagset
  Example Overview
  Define Customized Tagset Using TEMPLATE Procedure
  Export XML Document Using Customized Tagset
What Is a Tagset?
A tagset specifies instructions for generating a markup language from your SAS data set.
The resulting output contains embedded instructions defining layout and some content.
SAS provides tagsets for a variety of markup languages, including the XML markup
language.
not specify different tagsets. If you alter the tagset when exporting an XML
document, and then attempt to import the XML document generated by that altered
tagset, the XML engine might not be able to translate the XML markup back to SAS
proprietary format.
/* +------------------------------------------------+
|
|
+------------------------------------------------+ */
define event XMLversion;
put '<?xml version="1.0"';
putq ' encoding=' ENCODING;
put ' ?>' CR;
break;
end;
define event initialize;
set $LIBRARYNAME 'LIBRARY' ;
set $TABLENAME 'DATASET' ;
set $COLTAG 'column' ;
set $META 'FULL' ;
eval $is_engine 1;
eval $is_procprint 0;
eval $is_OUTBOARD 1;
end;
/* +------------------------------------------------+
|
|
+------------------------------------------------+ */
define event doc;
start:
trigger initialize;
trigger XMLversion;
break;
finish:
break;
end;
/* NOT ENGINE */
/* TABLE VIEWER */
/* +------------------------------------------------+
|
|
+------------------------------------------------+ */
define event colspec_entry;
start:
break / if ^$is_engine and $index eq 1 and cmp(name, "Obs");
eval $index_max $index_max+1;
set $col_names[] name;
set $col_types[] type;
set $col_width[] width;
break;
finish:
break;
end;
/* +------------------------------------------------+
|                                                |
| at this point, we just take over XML output.   |
| EmitRow() is triggered each time the data is   |
| loaded into the $col_values array.             |
|                                                |
| we can output anything we desire from here...  |
|                                                |
+------------------------------------------------+ */
define event EmitMeta; 1
start:
put '<' $LIBRARYNAME '>' CR ;
put '   <!-- ' CR ;
put '     List of available columns' CR ;
eval $index 1;
iterate $col_names ;
do /while _value_;
put '     ' $index ' ' _value_ CR ;
next $col_names;
eval $index $index+1;
done;
put '   -->' CR ;
break;
finish:
put '</' $LIBRARYNAME '>' ;
break;
end;
"Name"; trigger EmitCol ;
"Height"; trigger EmitCol ;
"Weight"; trigger EmitCol ;
xdent;
put "</STUDENT>" CR ;
xdent;
break;
end;
The EmitMeta event generates an XML comment that contains a list of the variables
from the SAS data set. The event contains an example of iteration for a list variable,
which processes all of the variables in the SAS data set. For more information about
iteration, see the ITERATE statement in the TEMPLATE procedure DEFINE
EVENT statement in SAS Output Delivery System: User's Guide.
The EmitRow event creates XML output from the three SAS data set observations.
The EmitRow event names specific variables to process, which are Name, Height,
and Weight.
The EmitCol event creates generic-looking XML for each processed variable.
data XMLout.class; 4
set work.class;
run;
The DATA step creates a data set named WORK.CLASS that consists of only three
observations.
The FILENAME statement assigns the fileref XMLOUT to the physical location of
the file that will store the exported XML document (complete pathname, filename,
and file extension).
The LIBNAME statement uses the fileref to reference the XML document and
specifies the XML engine. The TAGSET= option specifies the customized tagset
named Tagsets.Custom.
The DATA step reads the data set WORK.CLASS and writes its content to the
specified XML document in the format that is defined by the customized tagset.
Part 2
Chapter 7
By specifying the engine nickname XML, you access the SAS 9.1.3 XML engine
functionality.
By specifying the engine nickname XMLV2, you access XML engine functionality
with enhancements and changes after SAS 9.1.3. For example, the XMLV2 version
provides enhanced LIBNAME statement functionality, new XMLMap functionality,
and diagnostics of obsolete syntax.
XMLMap functionality for XMLV2 includes the ability to use an XMLMap for
exporting and support for XML namespaces.
XML documents that are imported with the XML version might not pass the more
strict parsing rules in the XMLV2 version. For example, like XML markup, the
XMLV2 version is case sensitive. Opening and closing tags must be written in the
same case, such as <BODY> ...</BODY> and <Message>...</Message>. For
the XMLV2 version, the tag <Letter> is different from the tag <letter>.
Attribute names are also case sensitive, and the attribute value must be enclosed in
quotation marks, such as <Note date="09/24/1975">.
XMLMap files that are accepted by the XML version might not work with the
XMLV2 version. The XMLV2 version requires that XMLMap files be XML
compliant, which means that the markup is case sensitive. In addition, the XMLMap
markup must follow the specific XMLMap rules. Tag names must be uppercase.
Element attributes must be lowercase. An example is <SXLEMAP
version="2.1">. In addition, the supported XPath syntax is case sensitive.
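The stricter parsing rules described above are the ordinary well-formedness rules of XML, so any conforming XML parser can be used to pre-check a document before it is imported with the XMLV2 nickname. The following Python sketch uses the standard xml.etree.ElementTree parser for that purpose. It is an assumption that a document rejected here would also be rejected by the XMLV2 version, and the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if a standard XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Opening and closing tags written in the same case
same_case = is_well_formed("<Message><BODY>text</BODY></Message>")
# <Letter> and <letter> are different tags, so the element is never closed
mixed_case = is_well_formed("<Letter>text</letter>")
# Attribute value not enclosed in quotation marks
unquoted = is_well_formed("<Note date=09/24/1975>text</Note>")

print(same_case, mixed_case, unquoted)
```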
XMLMap Files
The XML version supports all XMLMap files starting with XMLMap version 1.0. The
XMLV2 version supports XMLMap files starting with XMLMap version 1.2. The
documented XMLMap syntax version is 2.1. See XMLMap Syntax: Overview on page
113.
The ability to assign a libref to a SAS library, rather than assigning the libref to a
specific XML document.
Additional options. For a list of the LIBNAME statement options that are available
for the XML and XMLV2 nicknames, see LIBNAME Statement Options on page
97.
Using the XMLV2 nickname and the GENERIC markup type, you can export an
XML document from multiple SAS data sets. For example, if you have two SAS
data sets named Grades.Fred and Grades.Wilma, the following code exports an XML
document named Grades.xml that includes the grades from both SAS data sets:
libname stones xmlv2 'c:\Grades.xml';
data stones.fred;
set grades.fred;
run;
data stones.wilma;
Task    Option    XML    XMLV2
Automatically generate an XMLMap file to import an XML document    AUTOMAP= on page 100
FORMATACTIVE= on page 101
When importing or exporting a CDISC ODM XML document with the CDISCODM markup type:
Determine whether SAS formats are used    FORMATACTIVE= on page 101
Specify the libref to create a format catalog    FORMATLIBRARY= on page 102
Replace existing format entries in the format catalog    FORMATNOREPLACE= on page 102
ODSCHARSET= on page 102
ODSRECSEP= on page 103
ODSTRANTAB= on page 103
XMLCONCATENATE= on page 104
XMLDATAFORM= on page 104
XMLDOUBLE= on page 104
XMLENCODING= on page 105
XMLFILEREF= on page 106
Specify an XMLMap    XMLMAP= on page 106
XMLMETA= on page 107
XMLPROCESS= on page 107
XMLSCHEMA= on page 108
XMLTYPE= on page 108
Chapter 8
Dictionary
  LIBNAME Statement Syntax
Dictionary
LIBNAME Statement Syntax
Processes an XML document.
Valid in: Anywhere
Category: Data Access
Syntax
LIBNAME libref engine <'SAS-library | XML-document-path'> <options>;
Required Arguments
libref
is a valid SAS name that serves as a shortcut name to associate with the physical
location of the XML document. The name must conform to the rules for SAS names.
A libref cannot exceed eight characters.
engine
is the engine nickname for the SAS XML LIBNAME engine that imports and
exports an XML document.
XML
specifies the XML engine nickname that accesses the SAS 9.1.3 XML engine
functionality. The syntax for functionality that is available only for the XML
engine nickname is labeled with XML Only.
XMLV2
specifies the XML engine nickname that accesses the SAS 9.2 and 9.3 XML
engine functionality. The syntax for functionality that is available only for the
XMLV2 engine nickname is labeled with XMLV2 Only.
Alias: XML92
To specify a fileref for the XML document that does not match the libref, you
can use the XMLFILEREF= option on page 106. For example, the following
code reads the XML document Wilma.XML:
filename cartoon 'C:\XMLdata\wilma.xml';
libname bedrock xml xmlfileref=cartoon;
proc print data=bedrock.wilma;
run;
Optional Arguments
AUTOMAP=REPLACE | REUSE XMLV2 Only
specifies to automatically generate an XMLMap file to import an XML document.
The XMLMap file contains specific syntax that describes how to interpret the XML
markup into a SAS data set or data sets, variables (columns), and observations
(rows). XMLMap syntax is generated by analyzing the structure of the specified
XML document. To automatically generate the XMLMap file, you must specify an
existing XML document and the physical location for the output XMLMap file.
REPLACE
overwrites an existing XMLMap file. If an XMLMap file exists at the specified
physical location, the generated XMLMap file overwrites the existing one. If an
XMLMap file does not exist at the specified physical location, the generated
XMLMap file is written to the specified pathname and filename.
REUSE
does not overwrite an existing XMLMap file. If an XMLMap file exists at the
specified physical location, the existing XMLMap file is used. If an XMLMap
file does not exist at the specified physical location, the generated XMLMap file
is written to the specified pathname and filename.
Restriction: Use this option when importing only.
Requirements:
You must specify the physical location of an existing XML document with either
the complete pathname, filename, and file extension, or with a file reference that
is associated with the physical location for the DISK or TEMP device type only.
The XML document must exist on disk. The AUTOMAP= option does not
support accessing an XML document by using access methods such as FTP,
SFTP, URL, or WebDAV.
You must include the XMLMAP= option on page 106 to specify the physical
location of the generated XMLMap file with either the complete pathname,
filename, and file extension, or with a file reference that is associated with the
physical location for the DISK or TEMP device type only. The AUTOMAP=
option does not support accessing an XMLMap file by using access methods
such as FTP, SFTP, URL, or WebDAV.
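The difference between REPLACE and REUSE comes down to what happens when an XMLMap file already exists at the specified location. The following Python sketch restates that decision as a small function. It is a model of the documented behavior only: the function name automap_action and the returned strings are hypothetical and do not correspond to anything in SAS.

```python
def automap_action(mode, map_exists):
    """Describe what AUTOMAP= does for a given value and XMLMap file state."""
    mode = mode.upper()
    if mode not in ("REPLACE", "REUSE"):
        raise ValueError("AUTOMAP= accepts REPLACE or REUSE")
    if mode == "REUSE" and map_exists:
        return "use existing XMLMap"      # the existing file is left untouched
    return "write generated XMLMap"       # new file, or overwrite under REPLACE


for mode in ("REPLACE", "REUSE"):
    for map_exists in (True, False):
        print(mode, map_exists, "->", automap_action(mode, map_exists))
```

The two values differ only in the case where the XMLMap file already exists. When no file exists, both write the generated XMLMap.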
Tips:
XML Only
By default, the format catalog is created in the Work library. If you want to store
the catalog in a permanent library, use the FORMATLIBRARY= option on page
102.
When the format catalog is updated, the default behavior is that any new SAS
formats that are created by converting CDISC ODM CodeList elements will
overwrite any existing SAS formats that have the same name. To prevent existing
SAS formats from being overwritten, specify FORMATNOREPLACE=YES.
Example: Exporting an XML Document in CDISC ODM Markup on page 22
For the GENERIC markup type, specifies whether output values are affected by SAS
formats.
NO
writes the actual data value to the XML markup.
YES
causes the XML markup to contain the formatted data value.
Restriction: For the GENERIC markup type, if you export a SAS data set with
formatted data values, and then you try to import the XML document back
into the existing SAS data set, the import might fail. Exporting a SAS data set
with formatted data values can result in different variables or different
variable attributes.
Default: NO
Restriction: Use this option for the CDISCODM and GENERIC markup types only.
Requirement: Use this option with caution. If you are unfamiliar with character
sets, encoding methods, or translation tables, do not use this option without
proper technical advice.
Tip: The combination of the character set and translation table (encoding method)
results in the file's encoding.
See: ODSCHARSET= Option in SAS National Language Support (NLS): Reference
Guide.
ODSRECSEP= DEFAULT | NONE | YES XML Only
controls the generation of a record separator that marks the end of a line in the output
XML document.
DEFAULT
enables the XML engine to determine whether to generate a record separator
based on the operating environment where you run the SAS job.
The use of a record separator varies by operating environment.
Tip: If you do not transfer XML documents across environments, use the default
behavior.
NONE
specifies to not generate a record separator.
The XML engine uses the logical record length of the file that you are writing to
and writes one line of XML markup at a time to the output file.
Requirement: The logical record length of the file that you are writing to must
be at least as long as the longest line that is produced. If the logical record
length of the file is not long enough, then the markup might wrap to another
line at an inappropriate place.
Interaction: Transferring an XML document that does not contain a record
separator can be a problem. For example, FTP needs a record separator to
transfer data properly in ASCII (text) mode.
YES
specifies to generate a record separator.
Default: The XML engine determines whether to generate a record separator
based on the operating environment where you run the SAS job.
Restriction: Use this option when exporting an XML document only.
Interaction: Most transfer utilities interpret the record separator as a carriage
return sequence. For example, using FTP in ASCII (text) mode to transfer an
XML document that contains a record separator results in properly
constructed line breaks for the target environment.
ODSTRANTAB=table-name
specifies the translation table to use for the output file. The translation table
(encoding method) is a set of rules that are used to map characters in a character set
to numeric values. An example of a translation table is one that converts characters
from EBCDIC to ASCII-ISO. The table-name can be any translation table that SAS
provides or any user-defined translation table. The value must be the name of a SAS
catalog entry in either the SASUSER.PROFILE catalog or the SASHELP.HOST
catalog.
Restriction: Use this option when exporting an XML document only.
Requirement: Use this option with caution. If you are unfamiliar with character
sets, encoding methods, or translation tables, do not use this option without
proper technical advice.
Tip: The combination of the character set and translation table results in the file's
encoding.
See: ODSTRANTAB= Option in SAS National Language Support (NLS): Reference
Guide.
TAGSET=tagset-name
specifies the name of a tagset to override the default tagset that is used by the
markup type that is specified with XMLTYPE=.
To change the tags that are produced, you can create a customized tagset and specify
it with the TAGSET= option. For information about creating customized tagsets, see
the TEMPLATE procedure in the SAS Output Delivery System: User's Guide.
Restriction: Use this option when exporting an XML document only.
Requirement: Use this option with caution. If you are unfamiliar with XML
markup, do not use this option.
See: Understanding and Using Tagsets for the XML Engine on page 85
Example: Exporting an XML Document Using a Customized Tagset on page 86
CAUTION: If you alter the tagset when exporting an XML document and then
attempt to import the XML document generated by that altered tagset, the
XML engine might not be able to translate the XML markup back to a SAS
proprietary format.
XMLCONCATENATE=NO | YES
specifies whether the file to be imported contains multiple, concatenated XML
documents. Importing multiple, concatenated XML documents can be useful (for
example, if an application is producing a complete document per query or response
as in a Web form).
Alias: XMLCONCAT=
Default: NO
Restriction: Use this option when importing an XML document only.
Requirement: Use XMLCONCATENATE=YES cautiously. If an XML document
consists of concatenated XML documents, the content is not standard XML
construction. The option is provided for convenience, not to encourage invalid
XML markup.
Example: Importing Concatenated XML Documents on page 35
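As a rough illustration of what XMLCONCATENATE=YES is meant to accept, the sketch below shows a single file that holds two complete XML documents written back to back, each with its own XML declaration and root element. The element names and values are hypothetical and are not taken from the example on page 35.

```xml
<?xml version="1.0" ?>
<TABLE>
  <!-- hypothetical element names and values -->
  <QUERY>
    <ID> 1 </ID>
    <RESULT> accepted </RESULT>
  </QUERY>
</TABLE>
<?xml version="1.0" ?>
<TABLE>
  <QUERY>
    <ID> 2 </ID>
    <RESULT> rejected </RESULT>
  </QUERY>
</TABLE>
```

Because a second declaration and a second root element follow the end of the first document, the file as a whole is not standard XML construction, which is why the option has to be requested explicitly.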
XMLDATAFORM=ELEMENT | ATTRIBUTE
specifies whether the tag for the element to contain SAS variable information (name
and data) is in open element or enclosed attribute format. For example, if the variable
name is PRICE and the value of one observation is 1.98, the generated output for
ELEMENT is <PRICE> 1.98 </PRICE> and for ATTRIBUTE is <COLUMN
name="PRICE" value="1.98" />.
Default: ELEMENT
Restrictions:
The XML engine nickname uses an assigned format. The maximum value is
16 digits. For example, if a numeric variable has an assigned format width
that is 20 digits, such as BEST20., the engine truncates the exported value. If
there is not an assigned format, the engine displays the value using BEST10.
The XMLV2 engine nickname ignores any assigned format and displays the
value using BEST16.
When importing, the SAS XML LIBNAME engine retrieves PCDATA (parsed
character data) from the named element in the XML document and converts the
data into numeric variable content.
Alias: FORMAT
INTERNAL
When exporting, the SAS XML LIBNAME engine retrieves the stored value for
the numeric variable and writes the raw value to a generated attribute value pair
(of the form rawvalue="value"). SAS uses the base64 encoding of a portable
machine representation. (The base64 encoding method converts binary data into
ASCII text and vice versa and is similar to the MIME format.)
When importing, the SAS XML LIBNAME engine retrieves the stored value
from the rawvalue= attribute from the named element in the XML document. It
converts that value into numeric variable content. The PCDATA content of the
element is ignored. When importing, XMLDOUBLE=INTERNAL is not
supported for the XMLV2 engine nickname.
Alias: PRECISION
Tip: Typically, you use XMLDOUBLE=INTERNAL to import or export an
XML document when content is more important than readability.
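The difference between the two settings can be pictured as follows for a variable named PRICE with a stored value of 1.98. This is only a sketch: the rawvalue= text is a placeholder rather than an actual base64 encoding, and the exact text written for the displayed value depends on the format rules described above.

```xml
<!-- XMLDOUBLE=DISPLAY: the displayed value is carried as PCDATA -->
<PRICE> 1.98 </PRICE>

<!-- XMLDOUBLE=INTERNAL: the stored value is also written in rawvalue=
     (placeholder text shown here, not a real base64 string) -->
<PRICE rawvalue="base64-encoded-stored-value"> 1.98 </PRICE>
```

On import with XMLDOUBLE=INTERNAL, the value is taken from the rawvalue= attribute and the PCDATA (1.98 in this sketch) is ignored.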
Default: DISPLAY
Restriction: You can specify the XMLDOUBLE= option for the GENERIC markup
type only.
Examples:
XMLFILEREF=fileref
is the SAS name that is associated with the physical location of the XML document
to be exported or imported. To assign the fileref, use the FILENAME statement. The
XML engine can access any data referenced by a fileref. For example, the following
code reads the XML document wilma.xml and prints its contents:
filename cartoon 'C:\XMLdata\wilma.xml';
libname bedrock xml xmlfileref=cartoon;
proc print data=bedrock.wilma;
run;
Tip: When using the URL access method to reference a fileref that is assigned to an
Restrictions:
The XMLV2 engine nickname supports XMLMap syntax versions 1.2, 1.9, and
2.1. The XMLV2 engine nickname does not support XMLMap versions 1.0 or
1.1.
The XML engine nickname supports XMLMap syntax versions 1.0, 1.1, and 1.2.
The XML engine nickname does not support XMLMap syntax versions 1.9 or
2.1.
Requirement: If you specify an XMLMap, specify XMLTYPE=XMLMAP or do
not specify a markup type. If you explicitly specify a markup type other than
XMLMAP (such as XMLTYPE=GENERIC), an error occurs.
See: XMLMap Syntax: Overview on page 113
Example: Importing XML Documents Using an XMLMap on page 45
the data is written to the physical location of the XML document specified in the
LIBNAME statement. Separate metadata-related information is written to the
physical location specified with XMLSCHEMA=. If XMLSCHEMA= is not
specified, the metadata-related information is embedded with the data content in
the XML document.
Tip: Prior to SAS 9, the functionality for the XMLMETA= option used the keyword
XMLSCHEMA=. SAS 9 changed the option keyword XMLSCHEMA= to
XMLMETA=. SAS 9.1 added new functionality using the XMLSCHEMA=
option.
Examples:
page 27
XMLSCHEMA=fileref | 'external-file'
specifies an external file to contain metadata-related information.
fileref
is the SAS name that is associated with the physical location of the output file.
To assign a fileref, use the FILENAME statement.
'external-file'
is the physical location of the file to contain the metadata-related information.
Include the complete pathname and the filename. Enclose the physical name in
single or double quotation marks.
Restrictions:
exchange, and archiving of clinical trials data and metadata for medical and
biopharmaceutical product development.
Tip: Use the FORMATACTIVE=, FORMATNOREPLACE=, and
FORMATLIBRARY= options to specify how display data are read and
stored in the target environment.
Examples:
Part 3
Chapter 9
The first element in the XMLMap is the SXLEMAP element, which is the primary
(root) enclosing element that contains the definition for the generated output file. See
SXLEMAP Element on page 117.
The namespace elements define XML namespaces, which distinguish element and
attribute names by qualifying them with Uniform Resource Identifiers (URIs). See
Elements for Namespaces on page 118.
If you use an XMLMap for exporting, you must include the exporting elements. See
Elements for Exporting on page 119.
The table elements define the SAS data set. See Elements for Tables on page 120.
The column elements define the variables for the SAS data set. See Elements for
Columns on page 124.
CAUTION:
Like XML itself, the XMLMap markup is case sensitive. The tag names must be
uppercase, and the element attributes must be lowercase. For example, <SXLEMAP
version="2.1">. The supported XPath syntax is also case sensitive.
Table 9.1 XMLMap Syntax

Syntax               Page
SXLEMAP              117
NAMESPACES           118
NS                   119
OUTPUT               120
HEADING              120
ATTRIBUTE            120
TABLEREF             120
TABLE                121
TABLE-PATH           121
TABLE-END-PATH       122
TABLE-DESCRIPTION    124
COLUMN name=         124
COLUMN retain=       125
COLUMN class=        125
TYPE                 126
DATATYPE             126
DEFAULT              127
ENUM                 127
FORMAT               127
INFORMAT             128
DESCRIPTION          128
LENGTH               128
PATH                 129
INCREMENT-PATH       130
RESET-PATH           131
DECREMENT-PATH       132
Chapter 10
Dictionary                   117
SXLEMAP Element              117
Elements for Namespaces      118
Elements for Exporting       119
Elements for Tables          120
Elements for Columns         124
Dictionary
SXLEMAP Element
Is the primary (root) enclosing element that contains the definition for the generated output file. The
element provides the XML well-formed constraint for the definition.
Restriction: When importing an XML document, the definition can define more than one
output SAS data set. When exporting an XML document from a SAS data set, the
definition can define only one output XML document.
Requirement: The SXLEMAP element is required.
Syntax
SXLEMAP version="number" name="XMLMap" description="description"
Attributes
version="number"
specifies the version of the XMLMap syntax. The documented XMLMap syntax
version is 2.1 and must be specified to obtain full functionality.
Default: The default version is the first version of XMLMap syntax. It is retained
for compatibility with prior releases of the XMLMap syntax. It is recommended
that you update existing XMLMaps to version 2.1.
Restrictions:
The XMLV2 engine nickname supports XMLMap syntax versions 1.2, 1.9, and
2.1. The XMLV2 engine nickname does not support XMLMap versions 1.0 or
1.1.
The XML engine nickname supports XMLMap syntax versions 1.0, 1.1, and 1.2.
The XML engine nickname does not support XMLMap syntax versions 1.9 or
2.1.
Tip: To update an XMLMap to version 2.1, load the existing XMLMap into SAS
9.3 XML Mapper, and then save the XMLMap. For information about SAS XML
Mapper, see Using SAS XML Mapper to Generate and Update an XMLMap
on page 135.
name="XMLMap"
is an optional attribute that specifies the filename of the XMLMap.
description="description"
is an optional attribute that specifies a description of the XMLMap.
Details
In the example below, the SXLEMAP element specifies all three attributes and contains
two TABLE elements.
<?xml version="1.0" ?>
<SXLEMAP version="2.1" name="Myxmlmap" description="sample XMLMap">
   <TABLE name="test1">
   .
   .
   .
   </TABLE>
   <TABLE name="test2">
   .
   .
   .
   </TABLE>
</SXLEMAP>
Syntax
NAMESPACES count="number"
NS id="number" <prefix="name">
Elements
NAMESPACES count="number" XMLV2 Only
is an optional element that contains one or more NS elements for defining XML
namespaces. For example, <NAMESPACES count="2">.
XMLMap namespace elements enable you to import an XML document with like-named
elements that are qualified with XML namespaces. In addition, XMLMap
namespace elements maintain XML namespaces from the imported XML document
to export an XML document from the SAS data set.
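A minimal sketch of the namespace elements, based only on the syntax shown above. The id and prefix values and the URIs are hypothetical, and placing the URI in the content of each NS element is an assumption, because the NS element description is not reproduced in this section.

```xml
<SXLEMAP version="2.1" name="NamespaceSketch">
  <!-- count= is assumed to match the number of NS elements that follow -->
  <NAMESPACES count="2">
    <!-- hypothetical prefixes and URIs; URI-as-content is an assumption -->
    <NS id="1" prefix="freq">http://example.com/frequency</NS>
    <NS id="2" prefix="cat">http://example.com/category</NS>
  </NAMESPACES>
  <!-- TABLE and COLUMN definitions would follow here -->
</SXLEMAP>
```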
The engine supports exporting from one SAS data set only.
Syntax
OUTPUT
HEADING
ATTRIBUTE name="name" value="value"
TABLEREF name="name"
Elements
OUTPUT XMLV2 Only
is an optional element that contains one or more HEADING elements and one
TABLEREF element for exporting a SAS data set as an XML document.
Requirement: If you specify version 1.9 or 2.1 in an XMLMap to export a SAS data
set as an XML document, you must include the OUTPUT element in the
XMLMap.
Example: Using an XMLMap to Export an XML Document with a Hierarchical
Structure on page 41
HEADING XMLV2 Only
is an optional element that contains one or more ATTRIBUTE elements.
ATTRIBUTE name="name" value="value" XMLV2 Only
is an optional element that contains additional file attribute information for the
exported XML document, such as a schema reference or other general attributes. The
specified name-value pairs are added as attributes to the first generated element in
the exported XML document (for example, <NHL description="Teams of the
National Hockey League">).
name="name"
specifies a name for a file attribute, such as name="description".
value="value"
specifies a value for the attribute, such as value="Teams of the
National Hockey League".
TABLEREF name="name" XMLV2 Only
is an optional element that specifies the name of the table in the XMLMap to be
exported.
name="name"
specifies the name of the table in the XMLMap to be exported. The name must
be unique in the XMLMap definition, and the name must be a valid SAS name,
which can be up to 32 characters.
Restriction: You can specify one TABLEREF element only.
Requirement: The specified name must match a TABLE element name= attribute.
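The exporting elements fit together roughly as sketched below. The HEADING attribute values repeat the example above; the XMLMap name, the table name TEAMS, and the omitted table contents are hypothetical.

```xml
<SXLEMAP version="2.1" name="ExportSketch">
  <OUTPUT>
    <HEADING>
      <!-- added as an attribute of the first generated element -->
      <ATTRIBUTE name="description"
                 value="Teams of the National Hockey League" />
    </HEADING>
    <!-- must match the name= attribute of a TABLE element -->
    <TABLEREF name="TEAMS" />
  </OUTPUT>
  <TABLE name="TEAMS">
    <!-- TABLE-PATH and COLUMN definitions (hypothetical) go here -->
  </TABLE>
</SXLEMAP>
```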
Syntax
TABLE name="data-set-name"
TABLE-PATH syntax="type"
TABLE-END-PATH syntax="type" beginend="BEGIN | END"
TABLE-DESCRIPTION
Elements
TABLE name="data-set-name"
is an element that contains a data set definition. For example, <TABLE
name="channel">.
name="data-set-name"
specifies the name for the SAS data set. The name must be unique in the
XMLMap, and the name must be a valid SAS name, which can be up to 32
characters.
Requirement: The name= attribute is required.
Requirement: The TABLE element is required.
Interaction: The TABLE element can contain one or more of the following
1. The XML engine reads the XML markup until it encounters the <ITEM> start
tag.
2. The XML engine clears the input buffer, sets the contents to MISSING (by
default), and scans elements for variable names based on the COLUMN element
definitions. As values are encountered, they are read into the input buffer. (Note
that whether the XML engine resets to MISSING is determined by the
DEFAULT element as well as the COLUMN element retain= attribute.)
3. When the </ITEM> end tag is encountered, the XML engine writes the
completed input buffer to the SAS data set as a SAS observation.
4. The process is repeated for each <ITEM> start-tag and </ITEM> end-tag
sequence until the end-of-file is encountered in the input stream or until the
TABLE-END-PATH (if specified) is achieved, which results in six observations.
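The steps above correspond to a table definition along these lines. The sketch assumes the RSS document structure used in this chapter, in which each item element is a child of /rss/channel and the element names are lowercase; the data set name ITEMS, the column, and its length are hypothetical.

```xml
<TABLE name="ITEMS">
  <!-- each start tag that matches this path begins a new observation;
       the matching end tag writes the observation to the data set -->
  <TABLE-PATH syntax="XPath"> /rss/channel/item </TABLE-PATH>
  <COLUMN name="TITLE">
    <!-- hypothetical column read from each item -->
    <PATH syntax="XPath"> /rss/channel/item/title </PATH>
    <TYPE> character </TYPE>
    <DATATYPE> string </DATATYPE>
    <LENGTH> 200 </LENGTH>
  </COLUMN>
</TABLE>
```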
syntax="type"
is an optional attribute that specifies the type of syntax in the location path. The
syntax is valid XPath construction in compliance with the W3C specifications.
For example, syntax="XPath".
Default: XPath
Requirements:
determines which end tag causes the XML engine to write the completed input
buffer to the SAS data set. If you do not identify the appropriate end tag, the
result could be concatenated data instead of separate observations, or an
unexpected set of columns. For examples, see Determining the Observation
Boundary to Avoid Concatenated Data on page 64 and Determining the
Observation Boundary to Select the Best Columns on page 66.
Requirements:
encountered.
Therefore, with the two location path specifications, the XML engine processes only
the highlighted data in the RSS.XML document for the CHANNEL data set, rather
than the entire XML document:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="0.91">
  <channel>
    <title>WriteTheWeb</title>
    <link>http://writetheweb.com</link>
    <description>News for web users that write back
    </description>
    <language>en-us</language>
    <copyright>Copyright 2000, WriteTheWeb team.
    </copyright>
    <managingEditor>[email protected]
    </managingEditor>
    <webMaster>[email protected]</webMaster>
    <image>
      <title>WriteTheWeb</title>
      <url>http://writetheweb.com/images/mynetscape88.gif
      </url>
      <link>http://writetheweb.com</link>
      <width>88</width>
      <height>31</height>
      <description>News for web users that write back
      </description>
    </image>
    <item>
      <title>Giving the world a pluggable Gnutella</title>
      <link>http://writetheweb.com/read.php?item=24</link>
      <description>WorldOS is a framework on which to build programs
      that work like Freenet or Gnutella-allowing distributed
      applications using peer-to-peer routing.</description>
    </item>
    <item>
    .
    .
    .
  </channel>
</rss>
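A sketch of two location path specifications that would limit processing in this way. The paths are inferred from the document structure shown above rather than copied from the XMLMap itself, and the beginend= value is an assumption.

```xml
<TABLE name="CHANNEL">
  <!-- observations are built from the channel element -->
  <TABLE-PATH syntax="XPath"> /rss/channel </TABLE-PATH>
  <!-- assumed: stop processing when the first item start tag is read -->
  <TABLE-END-PATH syntax="XPath" beginend="BEGIN"> /rss/channel/item </TABLE-END-PATH>
  <!-- COLUMN definitions for the channel-level elements are omitted -->
</TABLE>
```

With paths like these, the engine would read the channel-level elements (title, link, description, and so on) and stop before the item elements.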
syntax="type"
is an optional attribute that specifies the type of syntax in the location path. The
syntax is valid XPath construction in compliance with the W3C specifications.
The XPath form supported by the XML engine allows elements and attributes to
be individually selected for exclusion in the generated SAS data set. For
example, syntax="XPath".
Default: XPath
Requirements:
Syntax
COLUMN name="name" retain="NO | YES" class="ORDINAL | FILENAME | FILEPATH"
TYPE
DATATYPE
DEFAULT
ENUM
FORMAT width="w" ndec="d"
INFORMAT width="w" ndec="d"
DESCRIPTION
LENGTH
PATH syntax="type"
INCREMENT-PATH syntax="type" beginend="BEGIN | END"
RESET-PATH syntax="type" beginend="BEGIN | END"
DECREMENT-PATH syntax="type" beginend="BEGIN | END"
Elements
COLUMN name="name" retain="NO | YES" class="ORDINAL | FILENAME |
FILEPATH"
is an element that contains a variable definition. For example, <COLUMN
name="Title">.
name="name"
specifies the name for the variable. The name must be a valid SAS name, which
can be up to 32 characters.
Requirement: The name= attribute is required.
retain="NO | YES"
is an optional attribute that determines the contents of the input buffer at the
beginning of each observation.
NO
sets the value for the beginning of each observation either to MISSING or to
the value of the DEFAULT element if specified.
YES
keeps the current value until it is replaced by a new, nonmissing value.
Specifying YES is much like the RETAIN statement in DATA step
processing. It forces the retention of processed values after an observation is
written to the output SAS data set.
Default: NO
Example: Importing Hierarchical Data as Related Data Sets on page 56
You must use the INCREMENT-PATH element or the DECREMENT-PATH element. The PATH element is not allowed.
The TYPE element must specify the SAS data type as numeric, and the
DATATYPE element must specify the type of data as integer.
Example: Including a Key Field with Generated Numeric Keys on page
60
FILENAME
generates a character variable that contains the filename and extension of the
input document. This functionality can be useful when you assign a libref for
the XML engine that is associated with a physical location of a SAS library
to determine which file contains a particular value.
Requirement: The TYPE element must specify the SAS data type as
character, and the DATATYPE element must specify the type of data as
string.
FILEPATH
generates a character variable that contains the pathname, filename, and
extension of the input document. This functionality can be useful when you
assign a libref for the XML engine that is associated with a physical location
of a SAS library to determine which file contains a particular observation.
Requirement: The TYPE element must specify the SAS data type as
character, and the DATATYPE element must specify the type of data as
string.
Requirement: At least one COLUMN element is required.
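The retain= and class= attributes might be used as in the following sketch. The element paths, data set name, column names, and length are hypothetical; the TYPE and DATATYPE values for the counter variable follow the requirements stated above, and the beginend= value is an assumption.

```xml
<TABLE name="PERSONS">
  <TABLE-PATH syntax="XPath"> /COMPANY/PHARMACY/PERSON </TABLE-PATH>
  <!-- retained: the pharmacy name carries over to each PERSON observation -->
  <COLUMN name="PHARMNAME" retain="YES">
    <PATH syntax="XPath"> /COMPANY/PHARMACY/NAME </PATH>
    <TYPE> character </TYPE>
    <DATATYPE> string </DATATYPE>
    <LENGTH> 40 </LENGTH>
  </COLUMN>
  <!-- counter variable: uses INCREMENT-PATH, and PATH is not allowed -->
  <COLUMN name="KEY" class="ORDINAL">
    <INCREMENT-PATH syntax="XPath" beginend="BEGIN"> /COMPANY/PHARMACY </INCREMENT-PATH>
    <TYPE> numeric </TYPE>
    <DATATYPE> integer </DATATYPE>
  </COLUMN>
</TABLE>
```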
To apply output formatting in SAS, use the FORMAT element.
To control data type conversion on input, use the INFORMAT element. For
example, <INFORMAT> datetime </INFORMAT>.
DATATYPE
specifies the type of data being read from the XML document for the variable. For
example, <DATATYPE> string </DATATYPE> specifies that the data contains
alphanumeric characters.
The type of data specification can be
string
specifies that the data contains alphanumeric characters and does not contain
numbers used for calculations.
integer
specifies that the data contains whole numbers used for calculations.
double
specifies that the data contains floating-point numbers.
datetime
specifies that the input represents a valid datetime value, which can be
in the form of the XML specification ISO 8601 format. The default form is:
yyyy-mm-ddThh:mm:ss.ffffff.
date
specifies that the input represents a valid date value, which can be
in the form of the XML specification ISO 8601 format. The default form is:
yyyy-mm-dd.
time
specifies that the input represents a valid time value, which can be
in the form of the XML specification ISO 8601 format. The default form is:
hh:mm:ss.ffffff.
Restriction: The values for previous versions of XMLMap syntax are not accepted
DEFAULT
is an optional element that specifies a default value for a missing value for the
variable. Use the DEFAULT element to assign a nonmissing value to missing data.
For example, <DEFAULT> single </DEFAULT> assigns the value single when
a missing value occurs.
Default: By default, the XML engine sets a missing value to MISSING.
Example: Determining the Observation Boundary to Select the Best Columns on
page 66
ENUM
is an optional element that contains a list of valid values for the variable. The ENUM
element can contain one or more VALUE elements to list the values. By using
ENUM, values in the XML document are verified against the list of values. If a value
is not valid, it is either set to MISSING (by default) or set to the value specified by
the DEFAULT element. Note that a value specified for DEFAULT must be one of
the ENUM values in order to be valid.
<COLUMN name="filing_status">
.
.
.
<DEFAULT> single </DEFAULT>
.
.
.
<ENUM>
<VALUE> single </VALUE>
<VALUE> married filing joint return </VALUE>
<VALUE> married filing separate return </VALUE>
<VALUE> head of household </VALUE>
<VALUE> qualifying widow(er) </VALUE>
</ENUM>
</COLUMN>
Example: Determining the Observation Boundary to Select the Best Columns on
page 66
FORMAT width="w" ndec="d"
is an optional element that specifies a SAS format for the variable. A format name
can be up to 31 characters for a character format and 32 characters for a numeric
format. A SAS format is an instruction that SAS uses to write values. You use
formats to control the written appearance of values. Do not include a period (.) as
part of the format name. Specify the width and decimal value as attributes
(width= and ndec=), not as part of the format name.
For a list of the SAS formats, including the ISO 8601 SAS formats, see SAS Formats
and Informats: Reference.
width="w"
is an optional attribute that specifies a format width, which for most formats is
the number of columns in the output data.
ndec="d"
is an optional attribute that specifies a decimal scaling factor for numeric
formats.
Here is an example:
<FORMAT> E8601DA </FORMAT>
<FORMAT width="8"> best </FORMAT>
<FORMAT width="8" ndec="2"> dollar </FORMAT>
Example: Determining the Observation Boundary to Select the Best Columns on
page 66
INFORMAT width="w" ndec="d"
is an optional element that specifies a SAS informat for the variable. An informat
name can be up to 30 characters for a character informat and 31 characters for a
numeric informat. A SAS informat is an instruction that SAS uses to read values into
a variable (that is, to store the values). Do not include a period (.) as part of the
informat name. Specify the width and decimal value as attributes (width= and
ndec=), not as part of the informat name.
For a list of the SAS informats, including the ISO 8601 SAS informats, see SAS
Formats and Informats: Reference.
Here is an example:
<INFORMAT> E8601DA </INFORMAT>
<INFORMAT width="8"> best </INFORMAT>
<INFORMAT width="8" ndec="2"> dollar </INFORMAT>
width="w"
is an optional attribute that specifies an informat width, which for most informats
is the number of columns in the input data.
ndec="d"
is an optional attribute that specifies a decimal scaling factor for numeric
informats. SAS divides the input data by 10 to the power of this value.
Example: Determining the Observation Boundary to Select the Best Columns on
page 66
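Putting the TYPE, DATATYPE, FORMAT, and INFORMAT elements together, a date column might be defined as sketched here. The column name, path, and width are hypothetical; E8601DA is the ISO 8601 format and informat name used in the examples above.

```xml
<COLUMN name="BIRTHDATE">
  <!-- hypothetical location path -->
  <PATH syntax="XPath"> /PEOPLE/PERSON/BIRTHDATE </PATH>
  <TYPE> numeric </TYPE>
  <DATATYPE> date </DATATYPE>
  <!-- no period in the name; the width is given as an attribute -->
  <FORMAT width="10"> E8601DA </FORMAT>
  <INFORMAT width="10"> E8601DA </INFORMAT>
</COLUMN>
```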
DESCRIPTION
is an optional element that specifies a description for the variable, which can be up to
256 characters. The following example shows that the description is assigned as the
variable label.
<DESCRIPTION> Story link </DESCRIPTION>
LENGTH
is the maximum field storage length from the XML data for a character variable. The
value refers to the number of bytes used to store each of the variable's values in the
SAS data set. The value can be 1 to 32,767. During the input process, a maximum
length of characters is read from the XML document and transferred to the
observation buffer. For example, <LENGTH> 200 </LENGTH>.
Restriction: LENGTH is not valid for numeric data.
Requirement: For data that is defined as a STRING data type, the LENGTH
element is required.
Tip: You can use LENGTH to truncate a long field.
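For a character variable, the DESCRIPTION and LENGTH elements might appear together as follows. The column name and path are hypothetical and assume an RSS-style document; the description text and length value repeat the examples above.

```xml
<COLUMN name="LINK">
  <!-- hypothetical location path -->
  <PATH syntax="XPath"> /rss/channel/item/link </PATH>
  <TYPE> character </TYPE>
  <DATATYPE> string </DATATYPE>
  <!-- assigned as the variable label -->
  <DESCRIPTION> Story link </DESCRIPTION>
  <!-- required for string data; at most 200 bytes are stored per value -->
  <LENGTH> 200 </LENGTH>
</COLUMN>
```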
PATH syntax="type"
specifies a location path that tells the XML engine where in the XML document to
locate and access a specific tag for the current variable. In addition, the location path
tells the XML engine to perform a function, which is determined by the location path
form, to retrieve the value for the variable. The XPath forms that are supported allow
elements and attributes to be individually included in the generated SAS data set.
syntax="type"
is an attribute that specifies the type of syntax used in the location path. The
syntax is valid XPath construction in compliance with the W3C specifications.
The XPath form supported by the XML engine allows elements and attributes to
be individually included in the generated SAS data set.
Default: XPath
Requirements:
use any other valid W3C form, the results will be unpredictable.
element-form
selects PCDATA (parsed character data) from a named element. The following
element forms enable you to select from a named element, conditionally select
from a named element based on a specific attribute value, or conditionally select
from a named element based on a specific occurrence of the element using the
position function:
<PATH> /LEVEL/ITEM </PATH>
<PATH> /LEVEL/ITEM[@attr="value"] </PATH>
<PATH> /LEVEL/ITEM[position()=n]|[n] </PATH>
The following examples illustrate the element forms. For more information about
the examples, see Specifying a Location Path on the PATH Element on page
74.
The following location path tells the XML engine to scan the XML markup
until it finds the CONFERENCE element. The XML engine retrieves the
value between the <CONFERENCE> start tag and the </CONFERENCE>
end tag.
<PATH> /NHL/CONFERENCE </PATH>
The following location path tells the XML engine to scan the XML markup
until it finds the TEAM element where the value of the founded= attribute is
1993. The XML engine retrieves the value between the <TEAM> start tag
and the </TEAM> end tag.
<PATH> /NHL/CONFERENCE/DIVISION/TEAM[@founded="1993"] </PATH>
The following location path uses the position function to tell the XML engine
to scan the XML markup until it finds the fifth occurrence of the TEAM
element. The XML engine retrieves the value between the <TEAM> start tag
and the </TEAM> end tag.
<PATH> /NHL/CONFERENCE/DIVISION/TEAM[position()=5] </PATH>
You can use the following shorter version for the position function:
<PATH> /NHL/CONFERENCE/DIVISION/TEAM[5] </PATH>
attribute-form
selects values from an attribute. The following attribute forms enable you to
select from a specific attribute or conditionally select from a specific attribute
based on the value of another attribute:
<PATH> /LEVEL/ITEM/@attr </PATH>
<PATH> /LEVEL/ITEM/@attr[@attr2="value"] </PATH>
The following examples illustrate the attribute forms. For more information
about the examples, see Specifying a Location Path on the PATH Element on
page 74.
The following location path tells the XML engine to scan the XML markup
until it finds the TEAM element. The XML engine retrieves the value from
the abbrev= attribute.
<PATH syntax="XPath"> /NHL/CONFERENCE/DIVISION/TEAM/@abbrev </PATH>
The following location path tells the XML engine to scan the XML markup
until it finds the TEAM element. The XML engine retrieves the value from
the founded= attribute where the value of the abbrev= attribute is ATL. The
two attributes must be for the same element.
<PATH> /NHL/CONFERENCE/DIVISION/TEAM/@founded[@abbrev="ATL"] </PATH>
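To make the element and attribute forms concrete, the fragment below shows a hypothetical input document with a comment for each location path. The team data is illustrative only and is not the document used in the example on page 74.

```xml
<NHL>
  <!-- /NHL/CONFERENCE selects the PCDATA of CONFERENCE (Eastern) -->
  <CONFERENCE> Eastern
    <DIVISION> Southeast
      <!-- /NHL/CONFERENCE/DIVISION/TEAM/@abbrev selects ATL, then FLA -->
      <!-- /NHL/CONFERENCE/DIVISION/TEAM/@founded[@abbrev="ATL"] selects 1999 -->
      <TEAM abbrev="ATL" founded="1999"> Thrashers </TEAM>
      <!-- /NHL/CONFERENCE/DIVISION/TEAM[@founded="1993"] selects Panthers -->
      <!-- /NHL/CONFERENCE/DIVISION/TEAM[2] also selects Panthers -->
      <TEAM abbrev="FLA" founded="1993"> Panthers </TEAM>
    </DIVISION>
  </CONFERENCE>
</NHL>
```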
Requirements:
engine where in the input data to increment the accumulated value for the counter
variable by 1.
syntax="type"
is an optional attribute that specifies the type of syntax in the location path. The
syntax is valid XPath construction in compliance with the W3C specifications.
The XPath form supported by the XML engine allows elements and attributes to
be individually included in the generated SAS data set. For example,
syntax="XPath".
Default: XPath
Requirements:
Chapter 11
Display 11.1
The latest version of SAS XML Mapper, which is SAS 9.3, can be downloaded and used
with SAS 9.3 or with versions of SAS prior to SAS 9.3. There are some features that can
be used only with SAS 9.3 XML Mapper, such as the 2.1 XMLMap version.
SAS XML Mapper has online Help attached, which includes usage examples. From the
menu bar, select Help, and then Help Topics.
For a quick tutorial of SAS XML Mapper, see the video How to Automatically Generate
XMLMap Files (video) on the Base SAS XML LIBNAME Engine Focus Area page at
http://support.sas.com/rnd/base/xmlengine. Look for the heading XML
Mapper and click the link to the video.
Part 4
Appendixes
Appendix 1
Example CDISC ODM Document   141
Appendix 1
Here is an example of an XML document that is in the CDISC ODM format. This
document is used in Importing a CDISC ODM Document on page 37 and in
Exporting an XML Document in CDISC ODM Markup on page 22.
- <!--
143
-->
- <ItemDef OID="ID.TAREA" SASFieldName="TAREA" Name="Therapeutic Area" DataType="text" Length="4">
<CodeListRef CodeListOID="CL.$TAREAF" />
</ItemDef>
<ItemDef OID="ID.PNO" SASFieldName="PNO" Name="Protocol Number" DataType="text" Length="15" />
- <ItemDef OID="ID.SCTRY" SASFieldName="SCTRY" Name="Country" DataType="text" Length="4">
<CodeListRef CodeListOID="CL.$SCTRYF" />
</ItemDef>
- <ItemDef OID="ID.F_STATUS" SASFieldName="F_STATUS" Name="Record status, 5 levels, internal use"
DataType="text" Length="1">
<CodeListRef CodeListOID="CL.$F_STATU" />
</ItemDef>
<ItemDef OID="ID.LINE_NO" SASFieldName="LINE_NO" Name="Line Number" DataType="integer" Length="2" />
<ItemDef OID="ID.AETERM" SASFieldName="AETERM" Name="Conmed Indication" DataType="text" Length="100" /
>
<ItemDef OID="ID.AESTMON" SASFieldName="AESTMON" Name="Start Month - Enter Two Digits 01-12"
DataType="integer" Length="2" />
<ItemDef OID="ID.AESTDAY" SASFieldName="AESTDAY" Name="Start Day - Enter Two Digits 01-31"
DataType="integer" Length="2" />
<ItemDef OID="ID.AESTYR" SASFieldName="AESTYR" Name="Start Year - Enter Four Digit Year"
DataType="integer" Length="4" />
<ItemDef OID="ID.AESTDT" SASFieldName="AESTDT" Name="Derived Start Date" DataType="date" />
<ItemDef OID="ID.AEENMON" SASFieldName="AEENMON" Name="Stop Month - Enter Two Digits 01-12"
DataType="integer" Length="2" />
<ItemDef OID="ID.AEENDAY" SASFieldName="AEENDAY" Name="Stop Day - Enter Two Digits 01-31"
DataType="integer" Length="2" />
<ItemDef OID="ID.AEENYR" SASFieldName="AEENYR" Name="Stop Year - Enter Four Digit Year"
DataType="integer" Length="4" />
<ItemDef OID="ID.AEENDT" SASFieldName="AEENDT" Name="Derived Stop Date" DataType="date" />
- <ItemDef OID="ID.AESEV" SASFieldName="AESEV" Name="Severity" DataType="text" Length="1">
<CodeListRef CodeListOID="CL.$AESEV" />
</ItemDef>
- <ItemDef OID="ID.AEREL" SASFieldName="AEREL" Name="Relationship to study drug" DataType="text"
Length="1">
<CodeListRef CodeListOID="CL.$AEREL" />
</ItemDef>
- <ItemDef OID="ID.AEOUT" SASFieldName="AEOUT" Name="Outcome" DataType="text" Length="1">
<CodeListRef CodeListOID="CL.$AEOUT" />
</ItemDef>
- <ItemDef OID="ID.AEACTTRT" SASFieldName="AEACTTRT" Name="Actions taken re study drug" DataType="text"
Length="1">
<CodeListRef CodeListOID="CL.$AEACTTR" />
</ItemDef>
- <ItemDef OID="ID.AECONTRT" SASFieldName="AECONTRT" Name="Actions taken, other" DataType="text"
Length="1">
<CodeListRef CodeListOID="CL.$AECONTR" />
</ItemDef>
- <!-- Translation to ODM markup for any PROC FORMAT style
user defined or SAS internal formatting specifications
applied to columns in the table
-->
- <CodeList OID="CL.$TAREAF" SASFormatName="$TAREAF" Name="$TAREAF" DataType="text">
- <CodeListItem CodedValue="ONC">
- <Decode>
<TranslatedText xml:lang="en">Oncology</TranslatedText>
</Decode>
</CodeListItem>
</CodeList>
- <CodeList OID="CL.$SCTRYF" SASFormatName="$SCTRYF" Name="$SCTRYF" DataType="text">
- <CodeListItem CodedValue="USA">
- <Decode>
<TranslatedText xml:lang="en">United States</TranslatedText>
</Decode>
</CodeListItem>
</CodeList>
- <CodeList OID="CL.$F_STATU" SASFormatName="$F_STATU" Name="$F_STATU" DataType="text">
- <CodeListItem CodedValue="S">
- <Decode>
<TranslatedText xml:lang="en">Source verified, not queried</TranslatedText>
</Decode>
</CodeListItem>
- <CodeListItem CodedValue="V">
- <Decode>
<TranslatedText xml:lang="en">Source verified, queried</TranslatedText>
</Decode>
</CodeListItem>
</CodeList>
- <CodeList OID="CL.$AESEV" SASFormatName="$AESEV" Name="$AESEV" DataType="text">
- <CodeListItem CodedValue="1">
- <Decode>
<TranslatedText xml:lang="en">Mild</TranslatedText>
</Decode>
</CodeListItem>
- <CodeListItem CodedValue="2">
- <Decode>
<TranslatedText xml:lang="en">Moderate</TranslatedText>
</Decode>
</CodeListItem>
- <CodeListItem CodedValue="3">
- <Decode>
<TranslatedText xml:lang="en">Severe</TranslatedText>
</Decode>
</CodeListItem>
- <CodeListItem CodedValue="4">
- <Decode>
<TranslatedText xml:lang="en">Life Threatening</TranslatedText>
</Decode>
</CodeListItem>
</CodeList>
- <CodeList OID="CL.$AEREL" SASFormatName="$AEREL" Name="$AEREL" DataType="text">
- <CodeListItem CodedValue="0">
- <Decode>
<TranslatedText xml:lang="en">None</TranslatedText>
</Decode>
</CodeListItem>
- <CodeListItem CodedValue="1">
- <Decode>
<TranslatedText xml:lang="en">Unlikely</TranslatedText>
</Decode>
</CodeListItem>
- <CodeListItem CodedValue="2">
- <Decode>
<TranslatedText xml:lang="en">Possible</TranslatedText>
</Decode>
</CodeListItem>
- <CodeListItem CodedValue="3">
- <Decode>
<TranslatedText xml:lang="en">Probable</TranslatedText>
</Decode>
</CodeListItem>
</CodeList>
- <!--
Administrative metadata
-->
<AdminData />
- <!-- Clinical Data: AE
Adverse Events
Some adverse events from this trial
-->
<ClinicalData StudyOID="STUDY.StudyOID" MetaDataVersionOID="v1.1.0">
<SubjectData SubjectKey="001">
<StudyEventData StudyEventOID="SE.VISIT1" StudyEventRepeatKey="1">
<FormData FormOID="FORM.AE" FormRepeatKey="1">
<ItemGroupData ItemGroupOID="IG.AE" ItemGroupRepeatKey="1">
<ItemData ItemOID="ID.TAREA" Value="ONC" />
<ItemData ItemOID="ID.PNO" Value="143-02" />
<ItemData ItemOID="ID.SCTRY" Value="USA" />
<ItemData ItemOID="ID.F_STATUS" Value="V" />
<ItemData ItemOID="ID.LINE_NO" Value="1" />
<ItemData ItemOID="ID.AETERM" Value="HEADACHE" />
<ItemData ItemOID="ID.AESTMON" Value="06" />
<ItemData ItemOID="ID.AESTDAY" Value="10" />
<ItemData ItemOID="ID.AESTYR" Value="1999" />
<ItemData ItemOID="ID.AESTDT" Value="1999-06-10" />
<ItemData ItemOID="ID.AEENMON" Value="06" />
<ItemData ItemOID="ID.AEENDAY" Value="14" />
<ItemData ItemOID="ID.AEENYR" Value="1999" />
<ItemData ItemOID="ID.AEENDT" Value="1999-06-14" />
<ItemData ItemOID="ID.AESEV" Value="1" />
<ItemData ItemOID="ID.AEREL" Value="0" />
<ItemData ItemOID="ID.AEOUT" Value="1" />
<ItemData ItemOID="ID.AEACTTRT" Value="0" />
<ItemData ItemOID="ID.AECONTRT" Value="1" />
</ItemGroupData>
- <ItemGroupData ItemGroupOID="IG.AE" ItemGroupRepeatKey="2">
<ItemData ItemOID="ID.TAREA" Value="ONC" />
<ItemData ItemOID="ID.PNO" Value="143-02" />
<ItemData ItemOID="ID.SCTRY" Value="USA" />
<ItemData ItemOID="ID.F_STATUS" Value="V" />
<ItemData ItemOID="ID.LINE_NO" Value="2" />
<ItemData ItemOID="ID.AETERM" Value="CONGESTION" />
<ItemData ItemOID="ID.AESTMON" Value="06" />
<ItemData ItemOID="ID.AESTDAY" Value="11" />
<ItemData ItemOID="ID.AESTYR" Value="1999" />
<ItemData ItemOID="ID.AESTDT" Value="1999-06-11" />
<ItemData ItemOID="ID.AEENMON" Value="" />
<ItemData ItemOID="ID.AEENDAY" Value="" />
<ItemData ItemOID="ID.AEENYR" Value="" />
<ItemData ItemOID="ID.AEENDT" Value="" />
<ItemData ItemOID="ID.AESEV" Value="1" />
<ItemData ItemOID="ID.AEREL" Value="0" />
<ItemData ItemOID="ID.AEOUT" Value="2" />
<ItemData ItemOID="ID.AEACTTRT" Value="0" />
<ItemData ItemOID="ID.AECONTRT" Value="1" />
</ItemGroupData>
</FormData>
</StudyEventData>
</SubjectData>
</ClinicalData>
</ODM>
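The listing above ties three kinds of elements together: each ItemData value points to an ItemDef through ItemOID, and an ItemDef can point to a CodeList through CodeListRef. The following Python sketch illustrates that linkage by turning each ItemGroupData element into one row and decoding coded values. It is not how the SAS XML LIBNAME engine is implemented. The function name read_rows and the trimmed, namespace-free fragment are assumptions made for illustration; a complete ODM file declares an XML namespace on its root element and contains many more elements.

```python
import xml.etree.ElementTree as ET

# Trimmed fragment modeled on the listing above (assumption: no XML namespace,
# only two items, and only two of the severity codes are included).
ODM_FRAGMENT = """
<ODM>
  <ItemDef OID="ID.AETERM" SASFieldName="AETERM" Name="Conmed Indication" DataType="text" Length="100" />
  <ItemDef OID="ID.AESEV" SASFieldName="AESEV" Name="Severity" DataType="text" Length="1">
    <CodeListRef CodeListOID="CL.$AESEV" />
  </ItemDef>
  <CodeList OID="CL.$AESEV" SASFormatName="$AESEV" Name="$AESEV" DataType="text">
    <CodeListItem CodedValue="1">
      <Decode><TranslatedText xml:lang="en">Mild</TranslatedText></Decode>
    </CodeListItem>
    <CodeListItem CodedValue="2">
      <Decode><TranslatedText xml:lang="en">Moderate</TranslatedText></Decode>
    </CodeListItem>
  </CodeList>
  <ClinicalData StudyOID="STUDY.StudyOID" MetaDataVersionOID="v1.1.0">
    <SubjectData SubjectKey="001">
      <StudyEventData StudyEventOID="SE.VISIT1" StudyEventRepeatKey="1">
        <FormData FormOID="FORM.AE" FormRepeatKey="1">
          <ItemGroupData ItemGroupOID="IG.AE" ItemGroupRepeatKey="1">
            <ItemData ItemOID="ID.AETERM" Value="HEADACHE" />
            <ItemData ItemOID="ID.AESEV" Value="1" />
          </ItemGroupData>
          <ItemGroupData ItemGroupOID="IG.AE" ItemGroupRepeatKey="2">
            <ItemData ItemOID="ID.AETERM" Value="CONGESTION" />
            <ItemData ItemOID="ID.AESEV" Value="1" />
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>
"""


def read_rows(odm_text):
    """Return one dict per ItemGroupData element, with coded values decoded."""
    root = ET.fromstring(odm_text)

    # CodeList OID -> {coded value: decoded text}
    code_lists = {
        code_list.get("OID"): {
            item.get("CodedValue"): item.findtext("Decode/TranslatedText")
            for item in code_list.iter("CodeListItem")
        }
        for code_list in root.iter("CodeList")
    }

    # ItemDef OID -> (column name, CodeList OID or None)
    columns = {}
    for item_def in root.iter("ItemDef"):
        ref = item_def.find("CodeListRef")
        columns[item_def.get("OID")] = (
            item_def.get("SASFieldName"),
            ref.get("CodeListOID") if ref is not None else None,
        )

    rows = []
    for group in root.iter("ItemGroupData"):
        row = {}
        for item in group.iter("ItemData"):
            name, code_list_oid = columns[item.get("ItemOID")]
            value = item.get("Value")
            # Fall back to the raw value when no code list applies.
            row[name] = code_lists.get(code_list_oid, {}).get(value, value)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in read_rows(ODM_FRAGMENT):
        print(row)
```

In this sketch the AESEV value "1" comes back as "Mild" because ID.AESEV references CL.$AESEV, while AETERM has no CodeListRef and keeps its raw value.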
Glossary
DTD
Document Type Definition. A file that specifies how the markup tags in a group of
SGML or XML documents should be interpreted by an application that displays,
prints, or otherwise processes the documents.
encoding
the result of mapping a coded character set to code values.
Extensible Markup Language
See XML.
file reference
See fileref.
File Transfer Protocol
a telecommunications protocol that is used for transferring files from one computer
to another over a network. Short form: FTP.
fileref
a name that is temporarily assigned to an external file or to an aggregate storage
location such as a directory or a folder. The fileref identifies the file or the storage
location to SAS.
format
See SAS format.
FTP
See File Transfer Protocol.
informat
See SAS informat.
key field
See sequence field.
library reference
See libref.
libref
a SAS name that is associated with the location of a SAS library. For example, in the
name MYLIB.MYFILE, MYLIB is the libref, and MYFILE is a file in the SAS
library.
markup language
a set of codes that are embedded in text in order to define layout and certain content.
metadata
descriptive data about data that is stored and managed in a database, in order to
facilitate access to captured and archived data for further use.
observation
a row in a SAS data set. All of the data values in an observation are associated with a
single entity such as a customer or a state. Each observation contains either one data
value or a missing-value indicator for each variable.
ODS template
a description of how output should appear when it is formatted. ODS templates are
stored as compiled entries in a template store (item store). Common template types
include STATGRAPH, STYLE, CROSSTABS, TAGSET, and TABLE.
SAS data file
a type of SAS data set that contains data values as well as descriptor information that
is associated with the data. The descriptor information includes information such as
the data types and lengths of the variables, as well as the name of the engine that was
used to create the data.
SAS data set
a file whose contents are in one of the native SAS file formats. There are two types
of SAS data sets: SAS data files and SAS data views. SAS data files contain data
values in addition to descriptor information that is associated with the data. SAS data
views contain only the descriptor information plus other information that is required
for retrieving data values from other SAS data sets or from files whose contents are
in other software vendors' file formats.
SAS data view
a type of SAS data set that retrieves data values from other files. A SAS data view
contains only descriptor information such as the data types and lengths of the
variables (columns) plus other information that is required for retrieving data values
from other SAS data sets or from files that are stored in other software vendors' file
formats. Short form: data view.
SAS format
a type of SAS language element that applies a pattern to or executes instructions for
a data value to be displayed or written as output. Types of formats correspond to the
data's type: numeric, character, date, time, or timestamp. The ability to create
user-defined formats is also supported. Examples of SAS formats are BINARY and
DATE. Short form: format.
SAS informat
a type of SAS language element that applies a pattern to or executes instructions for
a data value to be read as input. Types of informats correspond to the data's type:
numeric, character, date, time, or timestamp. The ability to create user-defined
informats is also supported. Examples of SAS informats are BINARY and DATE.
Short form: informat.
SAS library
one or more files that are defined, recognized, and accessible by SAS and that are
referenced and stored as a unit. Each file is a member of the library.
SAS variable
a column in a SAS data set or in a SAS data view. The data values for each variable
describe a single characteristic for all observations (rows).
SAS XML Mapper
a graphical interface that you can use to create and modify XMLMaps for use by the
SAS XML LIBNAME engine. The SAS XML Mapper analyzes the structure of an
XML document and generates basic XML markup for the XMLMap.
sequence field
a field that identifies and provides access to segments in a database. It contains the
record's key, which is located in the same position in each record of a key-sequenced
data set.
tagset
a template that defines how to create a type of markup language output from a SAS
format. Tagsets produce markup output such as Hypertext Markup Language
(HTML), Extensible Markup Language (XML), and LaTeX.
Uniform Resource Identifier
See URI.
Uniform Resource Locator
See URL.
URI
a string that identifies resources such as files, images, and services on the World
Wide Web. A URL is a type of URI. Short form: URI.
URL
a character string that is used by a Web browser or other software application to
access or identify a resource on the Internet or on an intranet. The resource could be
a Web page, an electronic image file, an audio file, a JavaServer page, or any other
type of electronic object. The full form of a URL specifies which communications
protocol to use for accessing the resource, as well as the directory path and filename
of the resource. Short form: URL.
variable
See SAS variable.
XML
a markup language that structures information by tagging it for content, meaning, or
use. Structured information contains both content (for example, words or numbers)
and an indication of what role the content plays. For example, content in a section
heading has a different meaning from content in a database table. Short form: XML.
XML engine
See XML LIBNAME engine.
XML LIBNAME engine
the SAS engine that processes XML documents. The engine exports an XML
document from a SAS data set by translating the proprietary SAS file format to XML
markup. The engine also imports an external XML document by translating XML
markup to a SAS data set.
XMLMap file
a file that contains XML tags that tell the SAS XML LIBNAME engine how to
interpret an XML document.
Index
A
Access documents
importing 29
ampersand
importing XML documents with 27, 107
apostrophe (')
importing XML documents with 27, 107
ATTRIBUTE element 120
AUTOMAP= option
LIBNAME statement 100
B
beginend= attribute
DECREMENT-PATH element 132
INCREMENT-PATH element 131
RESET-PATH element 132
TABLE-END-PATH element 123
C
CDISC ODM markup
CodeList elements 101
example document 141
exporting XML documents 22
importing XML documents 37
CDISCODM markup 108
character data
non-escaped 27, 107
character sets
specifying 102
class= attribute
COLUMN element 125
COLUMN element 124
column elements 124
columns
selecting best columns 66
concatenated data
avoiding 64
D
data investigation 50
data sets, exporting XML documents from
See exporting XML documents
See exporting XML documents with XMLMap
data sets, importing XML documents as
See importing XML documents
See importing XML documents with XMLMap
DATASETS procedure 4
DATATYPE element 126
date values
exporting XML documents containing 11
ISO 8601 informats and formats for importing 69
datetime values
exporting XML documents containing 11
DECREMENT-PATH element 132
DEFAULT element 127
DESCRIPTION element 128
description= attribute
SXLEMAP element 118
DOM application
XML engine as 6
double quotation marks
importing XML documents with 27, 107
E
enclosed attribute format 104
encoding 6, 105
engine nicknames 7, 95, 99
ENUM element 127
errors
when importing XML documents not created with SAS 7
exporting elements 119
exporting XML documents 5, 9
CDISC ODM markup 22
customized tagset for 86
date, time, and datetime values 11
for Oracle 9
metadata information in separate file 17, 107
numeric values 13
exporting XML documents with XMLMap 41
hierarchical structure 41
external files
for metadata-related information 108
F
filerefs 106
URL access method for referencing 74
format catalog
libref for 102
replacing format entries 102
FORMAT element 127
FORMATACTIVE= option
LIBNAME statement 101
FORMATLIBRARY= option
LIBNAME statement 102
FORMATNOREPLACE= option
LIBNAME statement 102
formats
importing dates 69
G
generated numeric keys
including key field with 60
generating XMLMap 80
GENERIC markup 96, 108
exporting XML documents containing date, time, and datetime values 11
importing XML documents 23
physical structure for importing XML documents 46
H
HEADING element 120
hierarchical data
I
id= attribute
NS element 119
importing XML documents 4, 23
CDISC ODM markup 37
concatenated documents 35
errors when not created with SAS 7
GENERIC markup 23
Microsoft Access documents 29
non-escaped character data 27, 107
numeric values 25
importing XML documents with XMLMap 45
as multiple data sets 52
as one data set 49
automatically generating XMLMap 80
avoiding concatenated data 64
columns, selecting best 66
importing hierarchical data as related data sets 56
ISO 8601 informats and formats for importing dates 69
ISO 8601 informats and formats for importing time values with time zone 71
key field with generated numeric keys 60
location path on PATH element 74
namespace elements 77
observation boundary 64, 66
physical structure for GENERIC markup 46
URL access method for referencing filerefs 74
INCREMENT-PATH element 130
INDENT= option
LIBNAME statement 102
INFORMAT element 128
informats
importing dates 69
input processing 6
installation
SAS XML Mapper 137
ISO 8601 informats and formats
importing dates 69
importing time values with a time zone 71
K
key field
L
LENGTH element 128
LIBNAME statement, XML 95
engine nicknames 95, 99
exporting XML documents from data sets 5
functionality enhancements for XMLV2 96
importing XML documents as data sets 4
options 97
required arguments 99
syntax 99
librefs 99
assigning 4
for format catalog 102
location path, specifying on PATH element 74
M
map files 96
markup languages
See tagsets
menu bar
SAS XML Mapper 137
metadata
exporting XML documents with separate metadata 17, 107
external file for 108
Microsoft Access documents
importing 29
MSACCESS markup 109
importing XML documents 29
N
name= attribute
ATTRIBUTE element 120
COLUMN element 124
SXLEMAP element 118
TABLE element 121
TABLEREF element 120
namespace elements
importing XML documents with XMLMap 77
syntax 118
NAMESPACES element 118
ndec= attribute
FORMAT element 128
INFORMAT element 128
nicknames for XML engine 7, 95
O
observation boundary
avoiding concatenated data 64
selecting best columns 66
ODS MARKUP destination
XML engine versus 7
ODSCHARSET= option
LIBNAME statement 102
ODSRECSEP= option
LIBNAME statement 103
ODSTRANTAB= option
LIBNAME statement 103
one-to-many relationship 56
open element format 104
Oracle
exporting XML documents for 9
ORACLE markup 109
OUTPUT element 120
output files
translation tables for 103
output processing 6
overriding tagsets 104
P
PATH element 129
specifying location path on 74
physical structure
importing XML documents with GENERIC markup 46
prefix= attribute
NS element 119
R
read processing 6
record separators
XML documents 103
RESET-PATH element 131
retain= attribute
COLUMN element 125
S
SAS processing
T
TABLE element 121
table elements 120
TABLE-DESCRIPTION element 124
TABLE-END-PATH element 122
TABLE-PATH element 121
TABLEREF element 120
TAGSET= option
LIBNAME statement 104
tagsets 85
customized 85
customized, defining with TEMPLATE procedure 86
customized, exporting XML documents 86
overriding 104
TEMPLATE procedure 85
defining customized tagsets 86
time values
exporting XML documents containing 11
importing with time zone 71
time zones
importing time values with 71
tool bar
SAS XML Mapper 137
transferring XML documents across environments 6
translation tables
U
update processing 6
updating XMLMap 135
URL access method
referencing filerefs 74
V
validating XML documents 6
value= attribute
ATTRIBUTE element 120
version= attribute
SXLEMAP element 117
versions, XML engine 95
W
W3C specifications 27, 96
width= attribute
FORMAT element 128
INFORMAT element 128
windows
SAS XML Mapper 136
X
XML documents 3
CDISC ODM format 141
concatenated 104
not in required physical structure 49
record separators 103
transferring across environments 6
validating 6
XML documents, exporting
See exporting XML documents
See exporting XML documents with XMLMap
XML documents, importing
See importing XML documents
See importing XML documents with XMLMap
XML engine 3
as DOM and SAX applications 6
as sequential access engine 6
how it works 4
nicknames 7
ODS MARKUP destination versus 7
SAS processing supported by 6
versions 95
XML engine version 95
XML map files 96
XML Mapper