SPSS Syntax for Data Files
Course Notes
Edition 1
January 2009
The first session of the course covers the SPSS commands used to define a simple
SPSS data file. Most of the data definition concepts have already been discussed in
the first SPSS course, SPSS 1: Getting Started with SPSS, course code 1301.
The second session covers matching and merging data and file commands, which
build on concepts introduced in SPSS 3: Changing Data in SPSS, course code 1303,
and SPSS 4: Changing SPSS data files, course code 1304.
The third session discusses creating cases in SPSS for more complex file types and
extracting cases from data using programming.
How to run SPSS commands or syntax was discussed in the previous course,
SPSS 5: Getting Started with SPSS Syntax, course code 1305. It is strongly
recommended as a prerequisite for this course.
Copyright © IS 2009
Contents
Data Definition Syntax
Defining Data Commands
Variable Names
RENAME VARIABLES
Variable Attribute Commands
VARIABLE LABELS
Variable Level
VALUE LABELS
MISSING VALUES
FORMATS
Type declaration
NUMERIC Type declaration
STRING Type declaration
The DOCUMENTS command
DISPLAY command
DISPLAY keywords
SYSFILE INFO command
Data reading commands
DATA LIST command
Try for Yourself
Variable Names
Can be up to 64 bytes long
No spaces
RENAME VARIABLES
Use RENAME VARIABLES to change the name of an existing variable or of a list of
variables.
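As a minimal sketch of both forms, the following renames one variable and then a list of variables. The variable names (v1 to v3, caseid, age, chd) and the inline data are invented for illustration only.

```spss
* Hypothetical variables v1 to v3, created only for this illustration.
DATA LIST FREE / v1 v2 v3.
BEGIN DATA
1 45 0
2 52 1
END DATA.

* Rename a single variable.
RENAME VARIABLES (v1 = caseid).

* Rename a list of variables in one step, old and new names are paired by position.
RENAME VARIABLES (v2 v3 = age chd).

DISPLAY VARIABLES.
```

The old and new lists must contain the same number of names.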
MISSING VALUES
FORMATS
PRINT, WRITE
TYPE DECLARATION
NUMERIC, STRING
VARIABLE LABELS
The VARIABLE LABELS command attaches a text label to a
variable name.
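A short sketch of the command, assuming two hypothetical variables (chd and age) and label text made up for this illustration:

```spss
* Hypothetical variables and labels, for illustration only.
DATA LIST FREE / chd age.
BEGIN DATA
0 45
1 52
END DATA.

* One label per variable, with a slash between variables.
VARIABLE LABELS
  chd 'Coronary heart disease status'
  / age 'Age in years'.

DISPLAY LABELS.
```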
VALUE LABELS
The VALUE LABELS command adds individual text labels to
variable values.
VALUE LABELS
CHD 0 'No Coronary heart disease'
1 'Has coronary heart disease'.
EXECUTE.
VALUE LABELS
More than one set of value labelling can be done at the same
time.
VALUE LABELS
CHD 0 'No Coronary heart disease'
1 'Has coronary heart disease'
/ famhist 'Y' 'Yes' 'N' 'No'
/ smokes 0 'non-smoker' 1 'Light' 2 'Heavy'.
EXECUTE.
MISSING VALUES
The MISSING VALUES command defines up to 3 distinct values, or
one value and a value range, as missing.
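The two forms might look like the sketch below. The variables (age, smokes, income) and the codes chosen as missing are assumptions made for this example, not values from the course data.

```spss
* Hypothetical variables and missing-value codes, for illustration only.
DATA LIST FREE / age smokes income.
BEGIN DATA
45 0 21000
999 9 -1
END DATA.

* Up to three distinct values.
MISSING VALUES age (997, 998, 999).

* One value range plus one distinct value.
MISSING VALUES income (LO THRU -1, 99999).

* A single value.
MISSING VALUES smokes (9).

DISPLAY DICTIONARY.
```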
FORMATS
FORMATS changes the print and write formats of existing numeric variables.
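For instance, a minimal sketch with two hypothetical numeric variables (salary and rate) read by DATA LIST FREE, which gives them the default F8.2 format:

```spss
* Hypothetical variables, for illustration only.
DATA LIST FREE / salary rate.
BEGIN DATA
16080 0.25
41400 12.5
END DATA.

* Show salary without decimals and rate with two decimals.
FORMATS salary (F8.0) / rate (F5.2).

LIST.
```

FORMATS only changes how values are displayed and written; the stored values are unchanged. PRINT FORMATS and WRITE FORMATS change one of the two formats on its own.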
DOCUMENTS text.
DOCUMENTS this is a test file.
Added Documents are saved with the data file.
DROP DOCUMENTS
DROP DOCUMENTS command clears all entries.
DISPLAY keyword.
for example
DISPLAY LABELS.
DISPLAY keywords.
DISPLAY LABELS.
Gives a list of variable names and their variable labels
DISPLAY VARIABLES.
Gives a list of variable names and their formats
DISPLAY DICTIONARY.
Gives all available information about the variables
DISPLAY ATTRIBUTES.
Displays attributes defined using the VARIABLE ATTRIBUTE and
DATAFILE ATTRIBUTE commands
DISPLAY DOCUMENTS.
For text defined by DOCUMENTS
DISPLAY MACRO.
For any macros
DISPLAY SCRATCH.
For any scratch variables
DISPLAY VECTOR.
For data vectors defined by VECTOR
DATA LIST
Reads data into SPSS from text files
M:/spsswork/cardiac.sps has an outline for the syntax file to start the definition
process.
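A minimal sketch of a fixed-format DATA LIST is shown here. The variable names and column positions are assumptions for illustration and are not taken from cardiac.sps. The data are read inline with BEGIN DATA so the example runs on its own; to read an external text file, a FILE= specification naming that file would be added to DATA LIST instead.

```spss
* Hypothetical layout: id in columns 1-3, age in 5-6, famhist (string) in 8, chd in 10.
DATA LIST FIXED / id 1-3 age 5-6 famhist 8 (A) chd 10.
BEGIN DATA
001 45 Y 0
002 52 N 1
END DATA.

LIST.
```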
Session summary
This session covered commands for defining and saving SPSS data files.
There is no menu equivalent of the UPDATE command or of the commands for
creating new data. In Session C we will look at these commands for creating new
data.
Ascending sort
SORT CASES BY varname (a).
Descending sort
SORT CASES BY varname (d).
Aggregating data
id   salbeg  sex  time  age    salnow  edlevel  work   jobcat  minority  sexrace
628    8400    0    81  28.5    16080       16   0.25       4         0        1
630   24000    0    73  40.33   41400       16  12.5        5         0        1
632   10200    0    83  31.08   21960       15   4.08       5         0        1
633    8700    0    93  31.17   19200       16   1.83       4         0        1
635   17400    0    83  41.92   28350       19  13          5         0        1
637   12996    0    80  29.5    27250       18   2.42       4         0        1
641    6900    0    79  28      16080       15   3.17       1         0        1
649    5400    0    67  28.75   14100       15   0.5        1         0        1
650    5040    0    96  27.42   12420       15   1.17       1         0        1
652    6300    0    77  52.92   12300       12  26.42       3         0        1
653    6300    0    84  33.5    15720       15   6          1         0        1
656    6000    0    88  54.33    8880       12  27          1         0        1
657   10500    0    93  32.33   22000       17   2.67       4         0        1
658   10800    0    98  41.17   22800       15  12          5         0        1
659   13200    0    64  31.92   19020       19   2.25       5         0        1
660    5640    0    94  46.25   12300       12  20          3         0        1
669   13500    0    81  30.75   22200       19   5.17       4         0        1
671    6900    0    72  32.67   10380       15   6.92       1         0        1
683    6300    0    70  58.5     8520       15  31          1         0        1
685   11004    0    89  34.17   27500       17   3.17       4         0        1
690    7200    0    79  46.58   11460       15  21.75       1         0        1
696   10992    0    83  35.17   20500       16   5.75       5         0        1
697   16992    0    85  43.25   27700       20  11.17       7         0        1
702    8700    0    65  28      28000       16   1.58       4         0        1
704   13992    0    65  39.75   22000       19  10.75       5         0        1
706   12804    0    78  30.08   27250       19   2.92       4         0        1
707   13992    0    83  30.17   27000       17   0.75       5         0        1
708    6600    0    70  44.5     9000       12  18          2         0        1
...     ...  ...   ...  ...       ...      ...  ...       ...       ...      ...

jobcat  salnow_1  salbeg_1
1       11134.82   5733.95
2       11136.41   5478.97
3       12375.56   6031.11
4       23901.07   9956.49
5       25595.63  13258.88
6       26100.00  12837.60
7       36691.67  19996.00
Aggregate specification
Break Variable(s) - to define the new cases
Break Variable(s)
The number of new cases, or break groups, is defined by the
number of unique combinations of break variable values.
Sort cases by break variables first
Aggregated data
The resulting aggregated data can be:
• saved in a new external SPSS save file
• saved in a new dataset in the data editor
• merged with the active data file in the data editor
• used to replace the active data file in the data editor
AGGREGATE command
AGGREGATE [OUTFILE={'savfile'|'dataset'}]
                   {*                  }
          [MODE={REPLACE     }] [OVERWRITE={NO }]
                {ADDVARIABLES}            {YES}
          [/MISSING=..] [/DOCUMENT] [/PRESORTED]
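As an illustrative sketch, the following aggregates by jobcat and merges the results back into the active data file. It uses four rows in the style of the salary data shown earlier. The MEAN function is an assumption: the notes do not say which summary function produced salnow_1 and salbeg_1.

```spss
* A few illustrative rows: id, starting salary, current salary, job category.
DATA LIST FREE / id salbeg salnow jobcat.
BEGIN DATA
641 6900 16080 1
649 5400 14100 1
708 6600 9000 2
628 8400 16080 4
END DATA.

* One summary value per jobcat group, added to every case in that group.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES
  /BREAK=jobcat
  /salnow_1 = MEAN(salnow)
  /salbeg_1 = MEAN(salbeg).

LIST.
```

Leaving out MODE=ADDVARIABLES would instead replace the active data file with one case per break group.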
Restructuring files
Casestovars example
ID to specify what creates a case
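A small sketch of CASESTOVARS, assuming hypothetical variables caseid, visit and reading. Each caseid has one row per visit in the long file and ends up as a single case in the wide file.

```spss
* Hypothetical long-format data: one row per case per visit.
DATA LIST FREE / caseid visit reading.
BEGIN DATA
1 1 5.2
1 2 5.9
2 1 4.8
2 2 5.1
END DATA.

* The data must be sorted by the ID variable(s) before restructuring.
SORT CASES BY caseid visit.

* ID names the variable that identifies a case in the new file.
* INDEX names the variable whose values label the new columns.
CASESTOVARS
  /ID = caseid
  /INDEX = visit.

LIST.
```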
Varstocases example
MAKE defines variables made from combining a list.
VARSTOCASES
/MAKE reading FROM reading1 reading2 reading3
/INDEX = newfact "new var label" (reading)
/KEEP = caseid age gender group
/NULL = KEEP.
Match Files with case matching
caseid var1 var3 var4 var5        caseid var2 var6
1      23   2    45   O           1      45   1
2      34   2    45   C           2      67   4
3      26   1    45   O           4      9    2
4      75   1    54   O           5      35   1
If you are matching using a key variable or variable list with the BY keyword,
as with the second example, then all the files need
to be sorted by the key variable(s) and saved using the SORT
CASES and SAVE commands.
In both examples, using a key variable or variable list with the BY keyword
requires all the files to be sorted by the key variable(s)
and saved using the SORT CASES and SAVE commands.
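The sketch below matches two small files by caseid, using the numeric columns of the two example files (the string variable var5 is left out to keep the data definition simple). As an assumption made so that the example is self-contained, it uses named datasets (file1, file2) in place of saved .sav files; with external files, each file would be sorted with SORT CASES and then saved with SAVE before MATCH FILES is run. The indicator names in1 and in2 are also invented.

```spss
* First file, sorted by the key variable.
DATA LIST FREE / caseid var1 var3 var4.
BEGIN DATA
1 23 2 45
2 34 2 45
3 26 1 45
4 75 1 54
END DATA.
DATASET NAME file1.
SORT CASES BY caseid.

* Second file, sorted by the same key variable.
DATA LIST FREE / caseid var2 var6.
BEGIN DATA
1 45 1
2 67 4
4 9 2
5 35 1
END DATA.
DATASET NAME file2.
SORT CASES BY caseid.

* Match on caseid; in1 and in2 flag which file each case came from.
MATCH FILES
  /FILE=file1 /IN=in1
  /FILE=file2 /IN=in2
  /BY caseid.
EXECUTE.

LIST.
```

Cases present in only one file (caseid 3 and 5 here) are kept, with system-missing values for the variables from the other file.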
Update example
update file="customer"
/file="saltil01"
/file="sales02"
/by custno.
• Use the ADD FILES command to add the data file customers_new.sav to
customers_model.sav in the C:/Programs Files/SPSS/sample_files/Tutorial
folder.
• Repeat the ADD FILES command, this time matching by customer_id and
using the SORT CASES and SAVE commands to sort by customer_ID and
save both data files. Does it work?
• Now try using the UPDATE command to merge the two customer files. How do
the commands differ?
• Use the ADD FILES command to combine match_response1.sav and
match_response2.sav.
• Use the MATCH FILES command to add the demographics variables from the
match_demographics.sav lookup table to the Match_response1.sav dataset,
matching by the ID variable. Try using the IN subcommand to create an indicator
variable, and remember to use SORT CASES by id beforehand.
For example, Data > Identify Duplicate Cases is used to identify duplicate
cases in a file. With the duplicates.sav dataset, try identifying duplicate cases
starting with the menus, using the ID_household and ID_person variables and
sorting by the interview date, then use the Paste button on the dialog box to copy the
underlying commands into a syntax window. Examine these commands to work out
what each command is doing - they are all commands you have come across.
Session summary
This session covers commands used with data files.
The next session, C, is about commands used to create cases and complex data file
structures.
The notes in this session consist of the contents of the slides presented at the
course. The section on Reading Complex Text Files in chapter 3 of the SPSS
Programming and Data Management manual covers the same material, so can also
be used for revision.
One of the building blocks for this programming is the DATA LIST command
detailed in session A. It is revised in the presentation but will not be duplicated in this
session.
MIXED
NESTED
Session C: Creating New Cases
Grouped data - city data - formatting commands, variable labels
Mixed Data
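As a rough sketch of how a mixed file can be read, the following uses FILE TYPE MIXED with two record types that hold the same variables in different columns. The layout, the variable names (age, score) and the data are invented for illustration and are not those of MIXED.DAT.

```spss
* The record type code is in column 1 of every line.
FILE TYPE MIXED RECORD=1.
* Type 1 records: age in columns 3-4, score in column 6.
RECORD TYPE 1.
DATA LIST FIXED / age 3-4 score 6.
* Type 2 records: the same variables, shifted one column to the right.
RECORD TYPE 2.
DATA LIST FIXED / age 4-5 score 7.
END FILE TYPE.

BEGIN DATA
1 27 3
2  31 4
END DATA.

LIST.
```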
Nested data
aggregate outfile=*/break=carid/
agecar=first(agecar)/
numinjur=sum(injured).
BREAK.
VECTOR.
END CASE.
END FILE.
REREAD.
Examples of LOOP
COMPUTE X = 0.
LOOP.
. COMPUTE X = X+1.
END LOOP.
LOOP #I = 1 TO 5.
. COMPUTE X = X+1.
END LOOP.
LOOP IF Y<70.
...
END LOOP.
LOOP #I = 1 TO M.
. LOOP #J = 1 TO M.
....
. END LOOP. /* #J
END LOOP. /* #I
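The fragments above can be put into a small runnable sketch. The variable y and its values are made up for illustration. The first loop repeats a fixed number of times; the second stops when a condition tested at END LOOP becomes true. A loop with no index and no condition stops only when the MXLOOPS limit is reached (the default is 40).

```spss
* Hypothetical data, for illustration only.
DATA LIST FREE / y.
BEGIN DATA
10 50 65
END DATA.

* Indexed loop: the body is executed five times for every case.
COMPUTE x = 0.
LOOP #i = 1 TO 5.
  COMPUTE x = x + 1.
END LOOP.

* Conditional exit: keep adding 10 until z reaches at least 70.
COMPUTE z = y.
LOOP.
  COMPUTE z = z + 10.
END LOOP IF (z >= 70).

LIST.
```

Because #i is a scratch variable, it is not saved as part of the data file.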
examination centre
sex
up to 14 sets of
- subject code
- subject grade (standard, higher, …)
- subject award band 1
- subject award band 2
subtotal check
INPUT PROGRAM
STRING SEX(A1)
VECTOR #S(56)
DATA LIST FILE=SEBDATA NOTABLE /
   #CENTRE, #SEX, #PRESTYP, #S1 TO #S56, #SUBTOTL
   (T2,P4.0,T43,A1,P1.0,T49,14(P3.0,P2.0,P2.0,4X,P2.0,6X),T321,P2.0)
LOOP #I=1 TO 53 BY 4
. DO IF MISSING(#S(#I))
. BREAK
. END IF
. COMPUTE CENTRE=#CENTRE
. COMPUTE SEX=#SEX /* SEX is 'M' or 'F'
. COMPUTE PRESTYPE=#PRESTYP
. COMPUTE SUBJNO=#S(#I)
. COMPUTE SUBJGR=#S(#I+1)
. COMPUTE SUBJRN=#S(#I+2)
. COMPUTE SUBJOR=#S(#I+3)
. END CASE
END LOOP
END INPUT PROGRAM
RECODE SUBJNO
   (0019=0001)(0030=0002)(0069=0003)(0070=0004)(0090=0005)
   < lines omitted >
   (7905=0185)(8000=0186)
   (ELSE=0) INTO SUBJNOA
RECODE SEX('M'=1)('F'=2)(ELSE=0) INTO SEXA
VALUE LABELS
   CENTREA 1 "Ainslie Park High" 2 "Armadale Academy"
   < lines omitted >
   50 "Wester Hailes EC" 51 "Whitburn Academy" /
   SEXA 1 'male' 2 'female' /
   SUBJGR 1 'ordinary' 2 'standard' ... 9 'short course' /
PRINT FORMATS CENTREA(F2.0)/SEXA(F1.0)/SUBJNOA(F3.0)
SAVE OUTFILE=SEBSAVE
   /DROP=CENTRE,SUBJNO,SEX /* drop un-recoded vars
All SPSS syntax (command) files have the .sps extension. Double-click on one of
them and SPSS should open it in a syntax window.
grouped data
• GROUPED.DAT with GROUPED.SPS
• patientgroup.txt with patientgroup.sps
• Cities.txt with Cities.sps
mixed data
• MIXED.DAT with mixed.sps
nested data
• Accident.txt with Accident.sps
input program
• children.dat with children.sps
• input1.sps (the data is incorporated in the syntax file and read into SPSS
using the BEGIN DATA and END DATA commands)
Check the file paths for any data files read in using syntax: the
biggest source of errors is SPSS not being able to find the file.
These examples use M:/spsswork/ in front of the file name. You
may need to change this if you are not using IS skills machines.
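One way to keep the path in a single place is sketched below, assuming the M:/spsswork folder exists on your machine. The handle name mixeddat is invented for this illustration.

```spss
* Make the course folder the working directory; change the path if needed.
CD 'M:/spsswork'.

* Confirm which folder SPSS is now using.
SHOW DIRECTORY.

* Alternatively, give a data file a short handle that later commands can use with FILE=.
FILE HANDLE mixeddat /NAME='M:/spsswork/MIXED.DAT'.
```

If the folder changes, only these lines need editing rather than every command that reads a file.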