SAS Notes
DATA steps create or modify SAS datasets, and PROC steps tell SAS what
analyses are to be conducted on the dataset. Some programs will not have
a PROC step, but almost all will have a DATA step.
A DATA or PROC section continues until all of the commands in that section are
completed. The end of a section is indicated when another DATA or PROC statement
appears or when SAS encounters a RUN statement.
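The boundary rule can be seen in a minimal sketch. The dataset name BOUNDARY and the variables X and Y are hypothetical, chosen only for illustration; the comments mark where each step ends.

```sas
/* DATA step: creates a one-observation dataset.              */
/* BOUNDARY, X and Y are hypothetical names for illustration. */
DATA BOUNDARY;
X = 1;
Y = 2;

/* The PROC statement below ends the DATA step and starts a PROC step. */
PROC PRINT DATA=BOUNDARY;

/* This second PROC statement ends the PROC PRINT step. */
PROC MEANS DATA=BOUNDARY;

/* No DATA or PROC statement follows, so RUN ends the last step. */
RUN;
```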
DATA Step
The DATA step creates a SAS dataset that contains the data along with a "data
dictionary." The data dictionary contains information on the variables and their
properties (whether they are numeric or character, the width of the values at input,
etc.).
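One way to look at the data dictionary is PROC CONTENTS, which reports each variable's name, type, and length. The following is a minimal sketch, assuming a hypothetical dataset named DICTDEMO that holds one character variable and one numeric variable.

```sas
/* DICTDEMO is a hypothetical dataset name used only for this sketch. */
DATA DICTDEMO;
NAME = 'Susan';   /* a quoted value makes NAME a character variable */
AGE = 18;         /* an unquoted number makes AGE a numeric variable */
RUN;

/* Print the data dictionary: variable names, types, and lengths. */
PROC CONTENTS DATA=DICTDEMO;
RUN;
```

The PROC CONTENTS listing describes the variables rather than printing the data values themselves.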
The following example creates a SAS data set from raw input:
DATA EXAMPLE1;
INPUT NAME $ SEX $ AGE INCOME;
CARDS;
Susan F 18 12000
Fred M 20 21586
Jane F 19 22232
(many observations omitted)
John M 19 14128
;
Notice that each statement ends with a semicolon; the lines of data themselves do not.
Also, the dollar sign after the variables NAME and SEX indicates that those variables
are character variables and not numbers. It is also good practice to put a semicolon on
a line by itself after the last line of data. This is not essential, but it does provide a
logical break point.
The above DATA step inputs the data but does nothing with it. To conduct an analysis,
we need a PROC statement.
PROC Step
The PROCedure step is used to perform some type of analysis on the data, including
PRINTing it. The following are examples of PROC statements.
PROC PRINT;
PROC MEANS;
VARIABLES AGE INCOME;
RUN;
According to the SAS documentation, the RUN statement is optional in some
versions of SAS. Our version of SAS, however, does seem to require it. Together, the
DATA and the PROC steps make up a SAS program.
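Putting the pieces together, a complete program might look like the sketch below. It reuses three of the observations listed earlier; the dataset name EXAMPLE2 is hypothetical, and the DATA= option and the VAR spelling (the form used in the SAS documentation for PROC MEANS) are choices made for this illustration rather than something these notes prescribe.

```sas
/* DATA step: read three of the observations shown earlier. */
/* EXAMPLE2 is a hypothetical dataset name.                  */
DATA EXAMPLE2;
INPUT NAME $ SEX $ AGE INCOME;
CARDS;
Susan F 18 12000
Fred M 20 21586
Jane F 19 22232
;

/* PROC step: list every observation. */
PROC PRINT DATA=EXAMPLE2;
RUN;

/* PROC step: summary statistics for the two numeric variables. */
PROC MEANS DATA=EXAMPLE2;
VAR AGE INCOME;
RUN;
```

Naming the dataset with DATA= makes each PROC step explicit about its input. When the option is omitted, a procedure uses the most recently created dataset.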
SAS Statements
SAS statements follow certain rules so that the program can understand what you
want. Specifically,
All SAS statements end with a semicolon (;). The semicolon is like a
period at the end of a sentence written in English.
SAS does not distinguish between upper and lower case letters in statements.
Consequently, "PROC" is the same as "Proc". Case does matter in data values,
however, so the comparisons SEX = 'F' and SEX = 'f' are not equivalent.
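The two rules can be illustrated with a minimal sketch. The dataset name CASEDEMO and the lower-case observation "jane f" are hypothetical, added only to show the contrast; keywords and variable names are deliberately written in mixed case.

```sas
/* Keywords and variable names may be typed in any case. */
/* CASEDEMO is a hypothetical dataset name; the "jane f" */
/* line is made-up data for illustration.                */
Data CaseDemo;
input Name $ Sex $;
cards;
Susan F
Fred M
jane f
;

/* Data values are case sensitive: only Susan has SEX equal to 'F'. */
proc print data=casedemo;
where sex = 'F';
run;

/* UPCASE converts the value before comparing, so Susan and jane both match. */
PROC PRINT DATA=CASEDEMO;
WHERE UPCASE(SEX) = 'F';
RUN;
```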