Lecture 23
Statistics 135
Autumn 2005
Your SAS programs can be written in any text editor, though you will often
want to use the built-in editor in SAS. Whichever you use, you will want to
create an ASCII text file (as you needed with LaTeX).
SAS is not case sensitive, so the following are all equivalent:
PROC UNIVARIATE;
proc univariate;
PrOc UnIvArIaTe;
• All lowercase
I also suggest that you comment your code. There are two formats for
comments:
* comments;
/* comments */ {C style}
Note that the first form is itself a statement and must end with a semicolon.
There is one important thing to remember when writing SAS commands.
All commands must end with a ‘;’ (as in C). This allows commands
to be split across lines, which can help with readability.
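As a small sketch of both points, the following program uses the two comment styles, splits one statement across two lines, and puts two statements on one line. The dataset name demo and its variables x and y are made up for illustration.

```sas
* A star-style comment runs up to the next semicolon;
/* A C-style comment runs up to the closing delimiter */
DATA demo;
    INPUT x
          y;          /* one INPUT statement split over two lines */
    DATALINES;
1 2
3 4
;
RUN;

PROC PRINT DATA=demo; RUN;   /* two statements on one line */
```

SAS only looks for the semicolons, so the layout is purely a matter of readability.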
Output Formatting
There are a number of OPTIONS that can be set in SAS which affect the
output formatting. These include
• DATE | NODATE: Should today’s date appear at the top of each page?
(Default = DATE)
• PAGESIZE = n: The maximum number of lines per page of output
(range = 15 to 32767). (Default can vary)
So to left justify your output and set the pagesize to 50, you can use the
command
OPTIONS NOCENTER PAGESIZE=50;
If you wish, you can split this into multiple OPTIONS commands
OPTIONS NOCENTER;
OPTIONS PAGESIZE=50;
Data Entry
SAS can read in data from a wide range of formats, including text files
(space, comma, or tab delimited), Excel, Access, DBF, Lotus 1-2-3, etc.
There are two common approaches to inputting data, a DATA step or PROC
IMPORT.
• DATA step:
This can be used for a wide range of text files. It is often used with
space delimited files (as in the example last class), but it can also be
used with fixed format files and files with other delimiters.
The data file being read in should not include the variable names (as
it can in S). Instead they are given in the INPUT part of the DATA
step. For example, the file from last time (now named margarine.dat)
looks like
1 167
1 171
1 178
1 175
1 184
1 176
and so on
DATA margarine;
INFILE 'margarine.dat';
INPUT brand time;
This will create a SAS dataset named margarine containing two numeric
variables brand and time.
Now suppose that the data set for the example last time (now in a file
margarine.dat) had brand in columns 1-3 and time in columns
4-8. This can be read in with
DATA margarine;
INFILE 'margarine.dat';
INPUT brand 1-3 time 4-8;
You can mix variables occurring in fixed columns with others delimited
by space as follows
DATA margarine;
INFILE 'margarine.dat';
INPUT brand 1-3 time;
This would get brand from the first 3 columns and would start looking
in column 4 to find time, but with no restriction on where it ends.
Being able to state which columns hold each variable is an artifact of the
time when storage was extremely limited or entry was done on punch cards,
which are limited to 80 columns. Removing the spaces between values lets
you fit more data into a limited space. Today you don’t want to do this, as
storage usually isn’t a problem and files written this way are hard to read.
To use a different delimiter, use the DLM option to INFILE. A couple of
possibilities are
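The examples for this slide did not survive in these notes, so the following is only a sketch of two common cases, a comma delimited file and a tab delimited file. The file names margarine.csv and margarine.tab are hypothetical.

```sas
/* Comma delimited version of the data (hypothetical file name) */
DATA margarine;
    INFILE 'margarine.csv' DLM=',';
    INPUT brand time;
RUN;

/* Tab delimited version (hypothetical file name).
   '09'x is the hexadecimal code for the tab character. */
DATA margarine;
    INFILE 'margarine.tab' DLM='09'x;
    INPUT brand time;
RUN;
```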
It is also possible to include your data as part of your SAS program. This
can be done with the DATALINES command in a DATA step as follows
DATA uspresidents;
INPUT president $ party $ number;
DATALINES;
Adams F 2
Lincoln R 16
Grant R 18
Kennedy D 35
;
RUN;
• PROC IMPORT:
This approach can be used for non-text files and text files with the
variable names in the first row.
To read in a version of the margarine data, but with the variable names
in the first row, the following approach can be used
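The original example is missing from these notes. A minimal sketch of what such a call might look like is given here, assuming a space delimited file with a header row. The file name margarine_names.dat is hypothetical.

```sas
PROC IMPORT DATAFILE='margarine_names.dat'  /* hypothetical file name */
            OUT=margarine                   /* SAS dataset to create */
            DBMS=dlm                        /* delimited text; a space is the default delimiter */
            REPLACE;                        /* overwrite margarine if it already exists */
    GETNAMES=YES;                           /* take variable names from the first row */
RUN;
```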
Other possibilities for the DBMS option include
– tab: tab delimited file
– csv: comma delimited file
– excel: Excel file
– dbf: dBase 5.0, IV, III+, and III files
This option is not needed if the data file name has the standard file
extension (e.g. *.xls for Excel, *.txt for tab delimited, or *.mdb for
Microsoft Access).
Data Export
There are two common approaches to exporting data, a DATA step or PROC
EXPORT.
• DATA step:
This is one approach to writing out your data as a text file. An example
is the following
DATA _NULL_;
/* loads in SAS datafile with desired variables */
SET finaldata;
/* give output file name */
FILE 'margarineout.dat';
/* variables to write to file */
PUT brand time invtime pred z nscores;
RUN;
This will write the variables to the file margarineout.dat with each
variable separated by a space. Other delimiters can be used with the
DLM option.
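For instance, a comma delimited version of the same output might be written as in the sketch below. The output file name margarineout.csv is hypothetical, and the variables are the ones from the example above.

```sas
DATA _NULL_;
    SET finaldata;
    /* DLM= on the FILE statement sets the delimiter used by list-style PUT */
    FILE 'margarineout.csv' DLM=',';   /* hypothetical file name */
    PUT brand time invtime pred z nscores;
RUN;
```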
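The DATA step can also be used to create SAS binary data files. For example,
the first step below writes finaldata to such a file and the second step
reads it back in as the dataset test2
DATA 'margarine2';
SET finaldata;
RUN;
DATA test2;
SET 'margarine2';
RUN;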
• PROC EXPORT:
This is effectively the opposite of PROC IMPORT, taking most of the same
options. The main differences are that OUTFILE is used instead of DATAFILE
and DATA is used instead of OUT.
One example is
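(The original example is missing from these notes. The call below is a sketch, and the output file name finaldata.txt is hypothetical.)

```sas
PROC EXPORT DATA=finaldata              /* SAS dataset to write out */
            OUTFILE='finaldata.txt'     /* hypothetical output file name */
            DBMS=tab                    /* tab delimited text */
            REPLACE;                    /* overwrite the file if it already exists */
RUN;
```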
This will create a tab delimited file containing all the variables in the
datafile finaldata.
PROC EXPORT will create all the file types that PROC IMPORT will read
in, such as Excel, Lotus 1-2-3, or JMP.
Manipulating Data
DATA temp;
INFILE 'p147.3';
INPUT brand time;
invtime = 1/time;
This created a new variable invtime and added it to the variables read in
from the data file p147.3.
There is a wide range of functions (over 400) that can be used to alter
datafiles. They are in the areas of
• Character
• Financial
• Macro
• Mathematical
• Probability
• Random numbers
• Sample statistics
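As a rough illustration of a few of these categories, the sketch below applies some standard functions inside a DATA step. The dataset name funcdemo, its variables, and the data values are all made up.

```sas
DATA funcdemo;
    INPUT name $ x1 x2 x3;
    upper  = UPCASE(name);        /* character: convert to uppercase */
    logx1  = LOG(x1);             /* mathematical: natural logarithm */
    rootx2 = SQRT(x2);            /* mathematical: square root */
    avg    = MEAN(x1, x2, x3);    /* sample statistics: mean of the nonmissing values */
    sd     = STD(x1, x2, x3);     /* sample statistics: standard deviation */
    p      = PROBNORM(1.96);      /* probability: standard normal CDF */
    u      = RANUNI(135);         /* random numbers: uniform(0,1) with seed 135 */
    DATALINES;
a 1 4 9
b 2 8 .
;
RUN;
```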
The following example gives a feeling of what can be done in a SAS DATA
step
DATA homegarden;
INPUT name $ tomato zucchini peas grapes;
/* Assignments must come before DATALINES, the last statement in the step */
zone = 14;
type = 'home';
zucchini = zucchini * 10;
total = tomato + zucchini + peas + grapes;
pertom = (tomato / total) * 100;
DATALINES;
Gregor 10 2 40 0
Molly 15 5 10 1000
Luther 50 10 15 50
Susan 20 0 . 20
;
RUN;
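One detail worth noticing in this example is that Susan's value for peas is missing (.), so any total computed with + is missing for her, and so is pertom. The following sketch, which is not part of the original example, shows one alternative using the SUM function. The dataset name homegarden2 is hypothetical, and whether skipping a missing value is appropriate depends on the data.

```sas
DATA homegarden2;
    SET homegarden;
    /* SUM skips missing arguments, unlike the + operator */
    total  = SUM(tomato, zucchini, peas, grapes);
    pertom = (tomato / total) * 100;
RUN;

PROC PRINT DATA=homegarden2;
RUN;
```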