Introduction To SAS Informats and Formats
SAS 9 lists other informat categories besides the three mentioned. Some of these are
for reading Asian characters and Hebrew characters. The reader is left to explore
these other categories.
SAS provides a large number of informats. The complete list is available in SAS Help
and Documentation. In this text, we will review some of the more common informats
and how to use them. Check SAS documentation for specifics on reading unusual
data.
Chapter 1: Introduction to SAS Informats and Formats 3
7 08/11/2003 12500.02
The following program reads the data into a SAS data set. Because each variable starts
in a fixed column, we can use formatted input with column pointer controls (@n) in the
INPUT statement.
data transact;
   infile transact;
   input @1  id        $6.         /* (1) */
         @10 tran_date mmddyy10.   /* (2) */
         @25 amount    8.2         /* (3) */
   ;
run;
Figure 1.1
4 The Power of PROC FORMAT
The ID variable is read as a character variable using the $6. informat in line (1).
The $w. informat tells SAS that the variable is character with a length of w. The $w.
informat also left-justifies the value (leading blanks are eliminated). Later in this
section we will compare results using the $CHARw. informat, which retains leading
blanks.
Line (2) instructs SAS to read the transaction date (Tran_Date) using the date
informat MMDDYYw. Since each date field occupies 10 columns, the w. qualifier is
set to 10.
Line (3) uses the numeric informat 8.2. The w.d informat tells SAS to read numeric
data with a total width of 8 columns and two digits to the right of the decimal point.
SAS applies the d value only if it does not encounter a decimal point in the specified
w columns. Because the Amount values in this file contain a decimal point, we could
have coded the informat as 8. or 8.2.
The PROC PRINT output is shown here. Note that the Tran_Date variable is now stored
as a SAS date value, which is the number of days since January 1, 1960. The
YEARCUTOFF option (for this run, yearcutoff=1920) only controls how two-digit years
are interpreted, so it has no effect on the four-digit years in this file.
Output 1.1
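The relationship between a stored SAS date value and its displayed form can be
checked with a small DATA _NULL_ step. This is only a minimal sketch: the variable
name d is hypothetical, and the date literal is taken from the sample record above.

```sas
data _null_;
   /* Convert the text date to a SAS date value (days since January 1, 1960) */
   d = input('08/11/2003', mmddyy10.);
   /* Write the unformatted stored value to the log: 15928 */
   put 'Stored value: ' d;
   /* Write the same value with a date format applied */
   put 'Formatted:    ' d mmddyy10.;
run;
```

The stored value is an ordinary number; only the format applied at output time makes
it look like a date.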
We can make this example a bit more complicated to illustrate some potential
problems that typically arise when reading from flat files. What if the Amount variable
contained embedded commas and dollar signs? How would we generate
the code to read in these records? Here is the modified data with the code that reads
the file using the correct informat instruction:
7 08/11/2003 $12,500.02
data transact;
   infile transact;
   input @1  id        $6.
         @10 tran_date mmddyy10.
         @25 amount    comma10.2   /* (1) */
   ;
run;
Line (1) uses the numeric informat named COMMAw.d to tell SAS to treat the
Amount variable as numeric and to strip out leading dollar signs and embedded
comma separators. The PROC PRINT output is shown here:
Output 1.2
Note that the output is identical to that of the previous run, in which the data
contained no commas or dollar signs. Also note that the width of the informat in the
code is now larger (10 rather than 8) to account for the extra columns taken up by the
comma and the dollar sign. What seemed like a programming headache was solved simply
by using the correct SAS informat. When you come across nonstandard data, always
check the documented informats that SAS provides.
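The effect of the COMMAw.d informat can also be seen in isolation by applying it to a
character constant with the INPUT function. The following is a minimal sketch; the
variable name amt is hypothetical, and the literal is the Amount value from the sample
record.

```sas
data _null_;
   /* COMMA10.2 strips the dollar sign and the comma, leaving a plain number */
   amt = input('$12,500.02', comma10.2);
   /* Writes amt=12500.02 to the log */
   put amt=;
run;
```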
Now compare what would happen if we changed the informat for the ID variable
from a $w. informat to a $CHARw. informat. Note that the $CHARw. informat will
store the variable with leading blanks.
data transact;
infile transact;
input @1 id $CHAR6.
@10 tran_date mmddyy10.
@25 amount comma10.2
;
run;
Output 1.3
Note that the ID variable now retains leading blanks and is right-justified in the
output.
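The difference between the two character informats can be shown side by side by
reading the same six columns twice. This is a sketch under assumed names: the
variables a and b and the in-stream data line are hypothetical, not part of the
transaction file.

```sas
data _null_;
   /* Read columns 1-6 twice: once with $w., once with $CHARw. */
   input @1 a $6.
         @1 b $char6.;
   /* Brackets make blanks visible: writes [7     ] [     7] to the log */
   put '[' a $char6. '] [' b $char6. ']';
datalines;
     7
;
run;
```

The $6. informat drops the five leading blanks, while $CHAR6. stores them as part of
the value.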
data transact2;
   set transact;
   id_num = input(id,6.);   /* (1) */
run;
The INPUT function in line (1) returns the numeric variable Id_Num. It reads the
six-column character variable ID with the numeric w.d informat 6. and assigns the
result to Id_Num. Note that when using the INPUT function, we do not have to specify
the d component if the character value contains an embedded decimal point. The output
of PROC PRINT is shown here. Note that Id_Num is right-justified, as numeric values
should be.
tran_
Obs id date amount id_num
Output 1.4
Also note that the type of the variable assigned using the INPUT function is
determined by the type of informat used in the argument. In the above example, since
6. is a numeric informat, the Id_Num variable will be numeric.
options yearcutoff=1920;
Note that the INPUTC function works like the INPUTN function but uses character
informats. Also note that dates are numeric, even though we use special date
informats to read the values.
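As a rough illustration of the two functions, the sketch below passes the informat as
a character string, which is what distinguishes INPUTN and INPUTC from INPUT. The
variable names and literal values are hypothetical.

```sas
data _null_;
   /* INPUTN applies a numeric informat supplied as a character string */
   amt = inputn('12,500.02', 'comma10.2');
   /* INPUTC applies a character informat supplied as a character string */
   id  = inputc('     7', '$char6.');
   put amt= id=;
run;
```

Because the informat is an ordinary character expression, it could also come from a
variable, so the informat can differ from one observation to the next.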
data transact;
infile transact;
attrib id informat=$6.
tran_date informat=mmddyy10.
amount informat=comma10.2
;
input @1 id
@10 tran_date
@25 amount
;
run;
This next example shows how we could instead use the INFORMAT statement to read in
the data. With SAS there is always more than one way to get the job done.
data transact;
infile transact;
informat id $6.
tran_date mmddyy10.
amount comma10.2
;
input @1 id
@10 tran_date
@25 amount
;
run;
Since formats are primarily used to format output, we will look at how we can use
existing SAS internal formats using the FORMAT statement in PROCs.
data transact;
infile transact;
input @1 id $6.
@10 tran_date mmddyy10.
@25 amount 8.2
;
run;
Output 1.5
Notice that we used a DOLLARw.d format to write out the Amount variable with a
dollar sign and comma separators. If we used a COMMAw.d format, the results
would be similar but without the dollar sign. We see that the COMMAw.d informat
used in Section 1.2.1 has a different function from the COMMAw.d format. The
informat ignores dollar signs and commas while the COMMAw.d format outputs
data with embedded commas without the dollar sign. Check SAS Help and
Documentation when using informats and formats since the same-named informat may
have a different functionality from the same-named format.
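A FORMAT statement inside a procedure applies a format for that step only. The
following is a minimal sketch of such a PROC PRINT step; the widths chosen here
(dollar10.2 and mmddyy10.) are assumptions sized to the sample record.

```sas
proc print data=transact;
   /* Display formats apply to this PROC only; stored values are unchanged */
   format tran_date mmddyy10.
          amount    dollar10.2;
run;
```

Replacing dollar10.2 with comma10.2 in this sketch would print the same digits and
comma separators without the dollar sign.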
options center;
filename transact 'C:\BBU FORMAT\DATA\TRANS1.DAT';
data transact;
infile transact;
input @1 id $6.
@10 tran_date mmddyy10.
@25 amount 8.2
;
run;
Run the following code to create a new flat file called transact_out.dat:
data _null_;                      /* (1) */
   set transact;                  /* (2) */
   file 'c:\transact_out.dat';    /* (3) */
   put @1  id        $char6.      /* (4) */
       @10 tran_date mmddyy10.
       @25 amount    8.2
   ;
run;
(1) The data set name _NULL_ is a special keyword. No data set is created or saved
when _NULL_ is used; the keyword turns off the automatic output that normally
occurs at the end of the DATA step. It is typically used for writing output to
reports or files.
(2) Use the SET statement to read the transact data into the DATA step.
(3) Specify the output flat file using the FILE statement. Review SAS documentation
for FILE statement options for specific considerations (e.g., specifying record
lengths for long files, file delimiters, and/or outputting to other platforms such
as spreadsheets).
(4) The $CHARw. format is specified, but because the ID variable was already
left-justified by the $w. informat, the output would be the same if a $w. format
had been used.
The data file created from the above code is shown here:
Output 1.6
If the user of the file requires the ID variable to be right-justified, the following
changes to the code can accommodate that request. In this code, a new numeric variable
called Id_Num is created by applying the INPUT function to the character ID variable.
data _null_;
set transact;
file 'c:\transact_out.dat';
id_num = input(id,6.);
put @1 id_num 6.
@10 tran_date mmddyy10.
@25 amount 8.2
;
run;
7 08/11/2003 12500.02
What if the user calls back requesting that the ID variable have leading zeros? This is
not a problem because SAS has a special numeric format to include leading zeros
called Zw.d. Here is the modified code and the output file:
data _null_;
set transact;
file 'c:\transact_out.dat';
id_num = input(id,6.);
put @1 id_num z6.
@10 tran_date mmddyy10.
@25 amount 8.2
;
run;
The above example is handy to have, especially if you read zip code data as numeric
and then want to output results with leading zeros in flat files, reports, or PROCs.
For example, what if we have a data set with a 13-digit numeric variable called
Accn_Id and we want to generate a character variable called Char_Accn_Id from the
numeric variable with leading zeros? The following PUT function can be applied in a
DATA step:
char_accn_id = put(accn_id,z13.);
Note that the PUT function always returns a character variable while the INPUT
function returns a type (numeric or character) dependent on the informat used in the
argument.
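The zip code case mentioned above follows the same pattern. This is a minimal sketch
with hypothetical variable names and a hypothetical zip code value.

```sas
data _null_;
   zip = 2138;                 /* hypothetical zip code stored as a number */
   /* PUT with the Zw. format returns a character value with leading zeros */
   zip_char = put(zip, z5.);
   /* Writes zip_char=02138 to the log */
   put zip_char=;
run;
```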
To make the concept clear, we’ll look at the problem of converting character to
numeric data that was introduced in Section 1.2.2. The INPUT function can be used
to convert character data to numeric. Here is another example and code:
data test;
   input @1 x $5. @7 y $5. @13 z 1.;   /* x and y are character in Test */
datalines;
117.7 1.746 1
06.61 97239 2
97123 0.126 3
;;
run;
data test2;
   set test;
   num_x = input(x,5.);   /* num_x and num_y are numeric */
   num_y = input(y,5.);   /* transformations of x and y in Test2 */
run;
Figure 1.2
When we look at the output of the run we get the following results, which at first look
strange:
Output 1.7
16 The Power of PROC FORMAT
It looks like the X variable translated correctly, but when we look at the Y variable we
notice that digits got rounded off in the Num_y variable. Before blaming the INPUT
function in creation of data set Test2, try to increase the width of the numeric display
using a BESTw. format. Here are the code and output:
Output 1.8
With the BESTw. format applied, we see that the character-to-numeric translation
was done correctly.
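One way to request the wider display is a FORMAT statement in PROC PRINT. The
following is a sketch only; the width of 10 is an assumption, and any width large
enough to show all the digits would serve.

```sas
proc print data=test2;
   /* Widen the display of num_y so no digits are rounded in the listing */
   format num_y best10.;
run;
```

The stored value of Num_y does not change; only its printed representation does.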
We can apply the BESTw. format to all numeric variables as shown in the following
change of PROC PRINT, which formats all numeric data in the data set with the
best10. format:
Output 1.9
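Applying one format to every numeric variable can be sketched with the _NUMERIC_
variable list, which stands for all numeric variables in the data set. The choice of
the Test2 data set here is an assumption based on the preceding example.

```sas
proc print data=test2;
   /* _NUMERIC_ refers to every numeric variable in the data set */
   format _numeric_ best10.;
run;
```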
As a review of this chapter, the following table shows the function and usage of
informats and formats:
More Information
More information about SAS informats and formats can be found in SAS Help and
Documentation.