SPSS Syntax for Data Files

Course Notes

Edition 1, January 2009

Document Number: 3638-2009
Preface
The course notes in this workbook are based on the slides presenting the syntax/
commands for the course SPSS 6: SPSS Syntax for Data Files. More details
about each command can be found in the SPSS syntax guide, which comes with
the SPSS installation in PDF format. It can be opened from the menus in SPSS by
choosing Help > Command Syntax Reference, or it can be downloaded from
the IS skills site as a link from the course notes at
http://www.ucs.ed.ac.uk/usd/cts/catalogue/ISCatData.html by following
the Statistical Analysis link and then choosing SPSS 14 manuals.

SPSS Programming and Data Management, 3rd edition, by Raynald Levesque
is also available in the SPSS 14 manuals section and is a comprehensive and very
useful follow-on from this course. More details are in the course information booklet.

The course covers, in the first session, SPSS commands used to define a simple
SPSS Data file. Most of the data definition concepts have already been discussed in
the first SPSS course SPSS 1: Getting Started with SPSS, course code 1301.

The second session covers matching and merging data and file commands, which
build on concepts introduced in SPSS 3: Changing Data in SPSS, course code 1303,
and SPSS 4: Changing SPSS Data Files, course code 1304.

The third session discusses creating cases in SPSS for more complex file types and
extracting cases from data using programming.

How to run SPSS commands or syntax has been discussed in the previous course
SPSS 5: Getting Started with SPSS Syntax, course code 1305. It is strongly
recommended as a prerequisite for this course.

If you require this document in an alternative format, such as larger
print, please contact Fiona Kneale on 0131 650 3350 or email
[email protected].

Copyright © IS 2009

Permission is granted to any individual or institution to use, copy or redistribute this
document whole or in part, so long as it is not sold for profit and provided that the
above copyright notice and this permission notice appear in all copies.

Where any part of this document is included in another document, due
acknowledgement is required.
SPSS 6: SPSS Syntax for Data Files
Contents
Data Definition Syntax
  Defining Data Commands
    Variable Names
    RENAME VARIABLES
  Variable Attribute Commands
    VARIABLE LABELS
    Variable Level
    VALUE LABELS
    MISSING VALUES
    FORMATS
  Type declaration
    NUMERIC Type declaration
    STRING Type declaration
  The DOCUMENTS command
  DISPLAY command
    DISPLAY keywords
    SYSFILE INFO command
  Data reading commands
    DATA LIST command
  Try for Yourself

Syntax for Merging Data
  Data Handling Commands
  Sort Cases command
  SPLIT FILE command
  Aggregating data
  Restructuring files
  Transforming cases to variables
  Transforming Variables to cases
  Transposing data files
  Merging SPSS data files
  ADD FILES command
  MATCH FILES command
  UPDATE
  Try for Yourself

Creating New Cases
  Commands for Creating Data
  Creating Cases using File types
  Grouped data
  Mixed Data
  Nested data
  Creating Cases using programming
  SEB data example
  Try for Yourself

Session A: Data Definition Syntax

Defining Data Commands
The commands or syntax in this session cover the data definition concepts
discussed in SPSS 1: Getting Started with SPSS, course code 1301.

Variable Names
• Can be up to 64 bytes long
    Up to 32 characters in double-byte languages such as Japanese,
    Chinese and Korean
    Up to 64 characters in single-byte languages
• Must start with a letter, @, # or $
• No spaces
• Full stop (.), underscore (_), @, # and $ can all be used in a variable
  name
• Avoid ending a name with a full stop (.) or underscore (_)
• System variables start with a $
• Temporary variables start with a #
• Variable names can be created with the commands
  DATA LIST, KEYED DATA LIST, MATRIX DATA, NUMERIC,
  STRING, COMPUTE, RECODE and COUNT
• Some keywords cannot be used as variable names:
  ALL AND BY EQ GE GT LE LT NE NOT OR TO WITH
• Names can mix upper and lower case letters
• Use RENAME VARIABLES to change a name
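
The naming rules above can be turned into a quick checker. The sketch below is Python, not SPSS syntax, and is only an illustration: the function name check_name is made up for this example, byte length is measured in UTF-8 as an assumption, and it covers only the rules listed in these notes.

```python
# Illustrative checker for the variable-name rules listed above (not an SPSS tool).
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}
EXTRA_CHARS = set("._@#$")


def check_name(name):
    """Return a list of problems found in a proposed variable name."""
    if not name:
        return ["name is empty"]
    problems = []
    # Assumption: length is counted in UTF-8 bytes.
    if len(name.encode("utf-8")) > 64:
        problems.append("longer than 64 bytes")
    if not (name[0].isalpha() or name[0] in "@#$"):
        problems.append("must start with a letter, @, # or $")
    if any(ch.isspace() for ch in name):
        problems.append("contains a space")
    if any(not (ch.isalnum() or ch in EXTRA_CHARS or ch.isspace()) for ch in name):
        problems.append("contains a character that is not allowed")
    if name.upper() in RESERVED:
        problems.append("is a reserved keyword")
    if name[-1] in "._":
        problems.append("ends with a full stop or underscore (best avoided)")
    return problems


for candidate in ["gender", "by", "1st", "my var", "score_"]:
    print(candidate, check_name(candidate))
```

An empty list means the name passes these particular checks; SPSS itself remains the final judge of whether a name is accepted.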

RENAME VARIABLES
Use RENAME VARIABLES to change the name of an existing variable or the names
of a list of variables.

RENAME VARIABLES oldname = newname.

RENAME VARIABLES oldvarlist = newvarlist.


RENAME VARIABLES sex = gender.
RENAME VARIABLES var00001 var00002 var00003
= caseid treatment scale .
EXECUTE.

Variable Attribute Commands

Variable attributes: VARIABLE LABELS, VARIABLE ALIGNMENT, VARIABLE LEVEL, VARIABLE WIDTH
Value labels: VALUE LABELS, ADD VALUE LABELS
Missing values: MISSING VALUES
Formats: FORMATS, PRINT FORMATS, WRITE FORMATS
Type declaration: NUMERIC, STRING

VARIABLE LABELS
The VARIABLE LABELS command attaches a text label to a
variable name.

VARIABLE LABELS varname 'text'.


VARIABLE LABELS score1 'First test score'.
EXECUTE.
Many variables can be labelled using a single VARIABLE
LABELS command.
VARIABLE LABELS
score1 'First test score'
score2 'Second test score'
score3 'Third test score'
score4 'Fourth test score'.
EXECUTE.
Variable labels can be up to 255 characters in length. Command
lines can now be 255 bytes in length, but label text can be split across
lines as follows:
VARIABLE LABELS
attitud1 'Q1: how happy are you with'+
' your present supplier of … '
attitud2 'Q2: how happy are you with'+
' the quality of…. '.
EXECUTE.

Variable Level
The VARIABLE LEVEL command changes the measurement
level associated with a variable name. Measurement levels are
scale, ordinal or nominal.

VARIABLE LEVEL varlist (level keyword).

VARIABLE LEVEL group (NOMINAL).
EXECUTE.
Different levels can be assigned to several variable lists in a single
VARIABLE LEVEL command.
VARIABLE LEVEL group (NOMINAL)
/height weight (SCALE)
/attitud1 to attitud6 (ORDINAL).
EXECUTE.

VALUE LABELS
The VALUE LABELS command adds individual text labels to
variable values.

VALUE LABELS varname value1 'Label 1' value2 'Label 2'
value3 'Label 3'.

VALUE LABELS
CHD 0 'No Coronary heart disease'
1 'Has coronary heart disease'.
EXECUTE.
More than one variable can be given value labels in the same
command.
VALUE LABELS
CHD 0 'No Coronary heart disease'
1 'Has coronary heart disease'
/ famhist 'Y' 'Yes' 'N' 'No'
/ smokes 0 'non-smoker' 1 'Light' 2
'Heavy'.
EXECUTE.

ADD VALUE LABELS
Running VALUE LABELS again for a variable clears all of its current value labels.

ADD VALUE LABELS changes or adds labels only for the specified values of a
variable, without losing previously defined labels for other values.
ADD VALUE LABELS
REGION 20345 'Dunedin' 20346 'Pluto'.
EXECUTE.
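
The difference between the two commands can be pictured with a dictionary of labels. This is a Python sketch rather than SPSS syntax; the function names and the extra labels ('Heavy smoker', 'Very heavy') are invented for illustration.

```python
# Labels for one variable, keyed by value (taken from the smokes example above).
labels = {"smokes": {0: "non-smoker", 1: "Light", 2: "Heavy"}}


def value_labels(dictionary, var, new):
    """Mimic VALUE LABELS: the new set replaces every existing label for var."""
    dictionary[var] = dict(new)


def add_value_labels(dictionary, var, new):
    """Mimic ADD VALUE LABELS: only the listed values are added or changed."""
    dictionary.setdefault(var, {}).update(new)


# Change the label for 2 and add one for 3; labels for 0 and 1 survive.
add_value_labels(labels, "smokes", {2: "Heavy smoker", 3: "Very heavy"})
after_add = dict(labels["smokes"])

# Re-running the replacing version wipes everything not listed.
value_labels(labels, "smokes", {0: "No", 1: "Yes"})
after_replace = dict(labels["smokes"])

print(after_add)
print(after_replace)
```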

MISSING VALUES
The MISSING VALUES command defines up to three distinct values, or
one value plus a range of values, as missing.

MISSING VALUES varlist (value).


MISSING VALUES attitud1 (8, 9).
MISSING VALUES income (lo thru -1) / kid1 to
kid10("xxx") .

FORMATS
FORMATS is used to reformat existing variables.

FORMATS varname (format).


FORMATS salbeg salnow (dollar15.2).
FORMATS bigvar (f20.5)
/nofrac (f8.0).

There are two other FORMATS commands.

PRINT FORMATS is used to change the format of variable values
in the output viewer.

WRITE FORMATS is used to change the format of variable
values written out to a data file.

The FORMATS command changes the data format both for PRINT
and WRITE at the same time.

Type declaration
Sometimes new variables need to be declared when working with
command language.

NUMERIC for all types of numeric variables


NUMERIC saldif (dollar15.2).
STRING for string variables
STRING name (A40).

NUMERIC Type declaration
NUMERIC is used to declare all types of numeric variables. Any
numeric format can be used, including date formats.

NUMERIC varlist (numeric format).

NUMERIC saldif (dollar15.2).
NUMERIC date1 to date10 (edate11) /perdif
(pct8.2).

STRING Type declaration


String declaration requires the new variable name and the length
of the new variable.

STRING varlist (An).


STRING name (A40).
STRING name kid1 to kid10(A40) /
address (A100).

The DOCUMENTS command
The DOCUMENTS command attaches text notes to the data file.
Added documents are saved with the data file.

DOCUMENTS text.
DOCUMENTS this is a test file.

The DOCUMENTS command can be run again and again:
• Each run adds a text entry to the active file.
• Each document entry is date stamped.
• It does not overwrite previous documents.

Existing documents can be viewed and modified through the
menus:
Utilities > Data File Comments…

DROP DOCUMENTS
DROP DOCUMENTS command clears all entries.

DISPLAY command
DISPLAY is used to show information about the variables in your
SPSS file.

DISPLAY keyword.

for example
DISPLAY LABELS.

DISPLAY keywords
DISPLAY LABELS.
  Gives a list of variable names and their variable labels.
DISPLAY VARIABLES.
  Gives a list of variable names and their formats.
DISPLAY DICTIONARY.
  Gives all available information about the variables.
DISPLAY ATTRIBUTES.
  Displays attributes defined using the VARIABLE ATTRIBUTE and
  DATAFILE ATTRIBUTE commands.
DISPLAY DOCUMENTS.
  Displays text defined by DOCUMENTS.
DISPLAY MACRO.
  Displays any macros.
DISPLAY SCRATCH.
  Displays any scratch variables.
DISPLAY VECTOR.
  Displays data vectors defined by VECTOR.

SYSFILE INFO command
SYSFILE INFO displays information about everything saved in a
named data file.
SYSFILE INFO file='otherfile.sav'.
The output includes the data dictionary and is similar to the output of DISPLAY
DICTIONARY.

Data reading commands

GET commands
Reads data into SPSS from an existing SPSS data file (.sav)1

DATA LIST
Reads data into SPSS from text files

DATA LIST command


DATA LIST is used to read raw data into the data editor window.
It provides the following information:
• Name of the file containing the raw data
• Format of data
• Number of records (lines of data) per case
• Names of variables
• Location (position on line) of each variable

DATA LIST example


DATA LIST / ID 1-2, SEX 5(A), WEIGHT 9-12, HEIGHT 16-18(2).
ID  SEX WEIGHT HEIGHT
01  M   62.5   175
02  F   59.3   152
03  F   51.0   167
04  M   60.8   171

DATA LIST example


DATA LIST / ID 1-2, SEX 3(A), WEIGHT 4-7, HEIGHT 8-10(2).
01M62.5175
02F59.3152
03F51.0167
04M60.8171
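
To see how the column locations in the second example pick values out of each line, here is a Python sketch (not SPSS syntax). The SPEC list and the read_fixed function are hypothetical names for this illustration. It assumes that (A) marks a string variable and that (2) means two implied decimal places, applied only when the field contains no explicit decimal point.

```python
# Column specs copied from the second DATA LIST example:
# (name, first column, last column, type).
# "A" = string, an integer = implied decimal places, None = plain number.
SPEC = [("ID", 1, 2, None), ("SEX", 3, 3, "A"),
        ("WEIGHT", 4, 7, None), ("HEIGHT", 8, 10, 2)]


def read_fixed(line, spec=SPEC):
    """Split one fixed-format record into a dictionary of variables."""
    case = {}
    for name, first, last, kind in spec:
        field = line[first - 1:last]  # columns are 1-based and inclusive
        if kind == "A":
            case[name] = field
        elif isinstance(kind, int) and "." not in field:
            case[name] = int(field) / 10 ** kind  # implied decimals
        else:
            case[name] = float(field)
    return case


rows = [read_fixed(line) for line in ["01M62.5175", "02F59.3152"]]
for row in rows:
    print(row)
```

Under these assumptions the HEIGHT field 175 is read as 1.75, while WEIGHT keeps the decimal point typed in the data.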

DATA LIST - general form

DATA LIST [FILE=name] [{FIXED}] [RECORDS=n]
                       {FREE } [{("delimiter",..,TAB)}]
                       {LIST }
          [{TABLE  }]
           {NOTABLE}
  /1 variable location(type), variable location(type),
  /2 variable location(type), …
  .
  /n variable location(type), etc.

1 Covered in SPSS 5 session B

Try for Yourself
Try using some of the data definition commands to create an SPSS data file from
the text file, cardiac.dat. Details of the variables in the cardiac data set can be
found in the course information booklet.

M:/spsswork/cardiac.sps has an outline for the syntax file to start the definition
process.

Outline for defining the cardiac set

TITLE 'EXERCISE FOR DATA INPUT AND DEFINITION'.
DATA LIST FILE 'M:/spsswork/cardiac.dat'
/EDUYR 11-12 .
VARIABLE LABELS EDUYR 'YEARS IN EDUCATION'
CIGARET 'NO OF CIGARETTES PER DAY IN 1958'
HEIGHT 'STATURE, 1958 - TO NEAREST 0.1 INCH'
WEIGHT 'BODY WEIGHT, 1958 - LBS'
DAYOFWK 'DAY OF DEATH'.
VALUE LABELS DAYOFWK 1 'SUNDAY' 2 'MONDAY'
3 'TUESDAY' 4 'WEDNESDAY'
/ FAMHIST 'N' 'NO' 'Y' 'YES'.
MISSING VALUES DAYOFWK (9).
SAVE OUTFILE='M:/spsswork/cardiacdef.sav'.

Session summary
This session covered commands for defining and saving SPSS data files.

• Data definition commands: VARIABLE NAMES, VARIABLE LABELS,
VARIABLE LEVEL, VALUE LABELS, MISSING VALUES, STRING, NUMERIC,
FORMATS, RENAME VARIABLES.
• Data file information commands: DOCUMENTS, DISPLAY and SYSFILE INFO.
• Data reading commands: GET and DATA LIST.

The next session, B, covers commands for merging data files.

Session B: Syntax for Merging Data

Data Handling Commands
The commands in this session are for the most part syntax versions and syntax
extensions of the procedures detailed in SPSS 4: Changing SPSS Data Files,
course code 1304.

Data Handling commands

Rearranging and restructuring data: Sort Cases, Split File, Aggregate, Casestovars, Varstocases, Transpose
Merging data: Add Files, Match Files, Update
Creating new data: Data List, File Type, Input Program

There is no menu equivalent of the UPDATE command or of the commands for
creating new data. Session C looks at the commands for creating new
data.

Sort Cases command

SORT CASES BY varlist.

The command is made up of:
• the SORT CASES command
• the BY keyword
• a variable name or variable list
• the EXECUTE command.

Ascending sort
SORT CASES BY varname (a).
Descending sort
SORT CASES BY varname (d).

Sort Cases examples

sort cases by jobcat.
execute.

sort cases by jobcat salnow.
execute.

sort cases by jobcat (d) salnow (a).
execute.
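
The effect of mixing (d) and (a) in one sort can be checked on a small example. This is a Python sketch, not SPSS syntax, and the four cases are invented for illustration.

```python
# Hypothetical cases with the two sort variables from the last example.
cases = [
    {"jobcat": 1, "salnow": 12000},
    {"jobcat": 2, "salnow": 9000},
    {"jobcat": 1, "salnow": 8000},
    {"jobcat": 2, "salnow": 15000},
]

# Equivalent of: sort cases by jobcat (d) salnow (a).
# Negating jobcat gives a descending sort on it; salnow stays ascending.
ordered = sorted(cases, key=lambda c: (-c["jobcat"], c["salnow"]))
result = [(c["jobcat"], c["salnow"]) for c in ordered]
print(result)
```

Cases are grouped by job category from highest to lowest, and within each category salaries run from lowest to highest.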

SPLIT FILE command
SPLIT FILE BY varlist.

Switches split file on. The command:
• is preceded by a SORT CASES command (above)
• starts with the SPLIT FILE command
• takes an optional LAYERED (assumed) or SEPARATE keyword
• continues with the BY keyword
• ends with a variable name or variable list.

SPLIT FILE OFF.

Switches split file off. The command is made up of:
• the SPLIT FILE command
• the OFF keyword.

Split File example

sort cases by jobcat.
split file by jobcat.
means salnow by sex by minority.
split file off.

Split File output

Employment        Sex      Minority       Mean     N  Std. Deviation
Clerical          Males    White      13057.65    75    3251.459
                           Nonwhite   11752.57    35    2561.801
                           Total      12642.40   110    3097.955
                  Females  White       9890.12    85    2839.324
                           Nonwhite    9258.75    32    1719.519
                           Total       9717.44   117    2589.957
                  Total    White      11374.90   160    3419.587
                           Nonwhite   10561.49    67    2518.889
                           Total      11134.82   227    3196.569
Office trainee    Males    White      13092.23    35    3839.482
                           Nonwhite   11080.00    12    1086.747
                           Total      12578.47    47    3459.044
                  Females  White      10501.78    81    1894.676
                           Nonwhite    9090.00     8     972.772
                           Total      10374.88    89    1871.799
                  Total    White      11283.38   116    2877.801
                           Nonwhite   10284.00    20    1425.772
                           Total      11136.41   136    2732.603
Security officer  Males    White      12471.43    14     663.497
                           Nonwhite   12272.31    13    1025.168
                           Total      12375.56    27     845.847
                  Total    White      12471.43    14     663.497
                           Nonwhite   12272.31    13    1025.168
                           Total      12375.56    27     845.847
College trainee   Males    White      24916.97    33    5321.938
                           Nonwhite   31400.00     1           .
                           Total      25107.65    34    5357.324
                  Females  White      18040.57     7    3171.266
                           Total      18040.57     7    3171.266
                  Total    White      23713.60    40    5638.122
                           Nonwhite   31400.00     1           .
                           Total      23901.07    41    5695.148
Exempt employee   Males    White      25570.71    28    7152.955
                           Nonwhite   31880.00     2   11483.414
                           Total      25991.33    30    7399.031
                  Females  White      19660.00     2    4299.209
                           Total      19660.00     2    4299.209
                  Total    White      25176.67    30    7107.904
                           Nonwhite   31880.00     2   11483.414
                           Total      25595.62    32    7364.404
MBA trainee       Males    White      26916.67     3    3003.470
...

Aggregating data
The aggregate command essentially summarises an existing data set. It can be
thought of as the data equivalent of a summary table.

Original data:

id   salbeg  sex  time  age    salnow  edlevel  work   jobcat  minority  sexrace
628  8400    0    81    28.5   16080   16       0.25   4       0         1
630  24000   0    73    40.33  41400   16       12.5   5       0         1
632  10200   0    83    31.08  21960   15       4.08   5       0         1
633  8700    0    93    31.17  19200   16       1.83   4       0         1
635  17400   0    83    41.92  28350   19       13     5       0         1
637  12996   0    80    29.5   27250   18       2.42   4       0         1
641  6900    0    79    28     16080   15       3.17   1       0         1
649  5400    0    67    28.75  14100   15       0.5    1       0         1
650  5040    0    96    27.42  12420   15       1.17   1       0         1
652  6300    0    77    52.92  12300   12       26.42  3       0         1
653  6300    0    84    33.5   15720   15       6      1       0         1
656  6000    0    88    54.33  8880    12       27     1       0         1
657  10500   0    93    32.33  22000   17       2.67   4       0         1
658  10800   0    98    41.17  22800   15       12     5       0         1
659  13200   0    64    31.92  19020   19       2.25   5       0         1
660  5640    0    94    46.25  12300   12       20     3       0         1
669  13500   0    81    30.75  22200   19       5.17   4       0         1
671  6900    0    72    32.67  10380   15       6.92   1       0         1
683  6300    0    70    58.5   8520    15       31     1       0         1
685  11004   0    89    34.17  27500   17       3.17   4       0         1
690  7200    0    79    46.58  11460   15       21.75  1       0         1
696  10992   0    83    35.17  20500   16       5.75   5       0         1
697  16992   0    85    43.25  27700   20       11.17  7       0         1
702  8700    0    65    28     28000   16       1.58   4       0         1
704  13992   0    65    39.75  22000   19       10.75  5       0         1
706  12804   0    78    30.08  27250   19       2.92   4       0         1
707  13992   0    83    30.17  27000   17       0.75   5       0         1
708  6600    0    70    44.5   9000    12       18     2       0         1
...  ...     ...  ...   ...    ...     ...      ...    ...     ...       ...

Aggregated data (one case per jobcat):

jobcat  salnow_1  salbeg_1
1       11134.82  5733.95
2       11136.41  5478.97
3       12375.56  6031.11
4       23901.07  9956.49
5       25595.63  13258.88
6       26100.00  12837.60
7       36691.67  19996.00

Aggregate specification
• Break variable(s): define the new cases
• Aggregate variables: summary statistics of the original variables for
  each break group
• Aggregated data: what happens to the new data

Break variable(s)
The number of new cases or break groups is defined by the
number of unique combinations of break variable values.
Sort cases by the break variables first.

Aggregate variables can be summaries based on the number of
cases, statistical summary functions, or percentages or fractions.

Aggregate simple summaries
• Simple summary statistics for the break group: MEAN, SD (standard
  deviation), MIN (minimum), MAX (maximum)
• FIRST (first value), LAST (last value)
• Number of cases in the break group: N (weighted), NU (unweighted),
  NMISS (weighted missing), NUMISS (unweighted missing)
• SUM (sum of values)

Aggregate range summaries
• Percentage of cases in the break group:
  PGT above a value, PLT below a value, PIN within a range of
  values and POUT outside a range of values
• Fraction of cases:
  FGT above a value, FLT below a value, FIN within a range of
  values and FOUT outside a range of values

Aggregated data
The resulting aggregated data can be:
• saved in a new external SPSS save file
• placed in a new dataset in the data editor
• merged with the active data file in the data editor
• used to replace the active data file in the data editor.

AGGREGATE command
AGGREGATE [OUTFILE={'savfile'|'dataset'}]
                   {*                  }
          [MODE={REPLACE     }] [OVERWRITE={NO }]
                {ADDVARIABLES}             {YES}
  [/MISSING=..] [/DOCUMENT] [/PRESORTED]
  /BREAK= varlist [({A})] [varlist…]
                   {D}
  /aggvar ['label'] aggvar ['label']... = function(arguments)
  [/aggvar ...]

AGGREGATE Command examples

AGGREGATE OUTFILE = "avclass.sav"
/BREAK = SCHOOL CLASS
/av_eng av_math =MEAN(english math)
/loweng lowmath =MIN(english math)
/topeng topmath =MAX(english math)
/p1eng "English %passed grade 1"
=PIN(english,1,1)
/nclass "class size" =n.
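
As a rough picture of what this AGGREGATE example computes, the Python sketch below (not SPSS syntax) groups a handful of invented pupils by school and class and builds some of the same summaries. The data values are hypothetical, and weighting and missing values are ignored.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pupils: (school, class, english grade, math grade).
pupils = [
    ("A", 1, 1, 3),
    ("A", 1, 2, 1),
    ("A", 1, 1, 2),
    ("B", 1, 3, 2),
    ("B", 1, 1, 4),
]

# /BREAK = SCHOOL CLASS: one group per unique (school, class) pair.
groups = defaultdict(list)
for school, klass, english, maths in pupils:
    groups[(school, klass)].append((english, maths))

aggregated = {}
for key in sorted(groups):
    eng = [e for e, _ in groups[key]]
    mth = [m for _, m in groups[key]]
    aggregated[key] = {
        "av_eng": mean(eng), "av_math": mean(mth),   # MEAN
        "loweng": min(eng), "topeng": max(eng),      # MIN and MAX
        # PIN(english,1,1): percentage of cases with 1 <= english <= 1
        "p1eng": 100 * sum(1 <= e <= 1 for e in eng) / len(eng),
        "nclass": len(eng),                          # N (unweighted here)
    }

for key, summary in aggregated.items():
    print(key, summary)
```

Each break group becomes one new case, which is why five pupils collapse into two rows here.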

Restructuring files
The Restructure Data Wizard equivalents are:
Casestovars   Converts “long” data files into “wide” data files
Varstocases   Converts “wide” data files into “long” data files
Transpose     Transposes a data set: variables become
              cases and cases become variables

[Figure: Visualization of Restructure procedures in the wizard]

Transforming cases to variables

Casestovars Command Syntax

CASESTOVARS
  [/ID = varlist]
  [/FIXED = varlist]
  [/AUTOFIX = {YES**}]
              {NO   }
  [/VIND [ROOT = rootname]]
  [/COUNT = new variable ["label"]]
  [/RENAME varname=rootname varname=rootname ...]
  [/SEPARATOR = {"."     }]
                {"string"}
  [/INDEX = varlist]
  [/GROUPBY = {VARIABLE**}]
              {INDEX     }
  [/DROP = varlist]

Casestovars example
• ID specifies what creates a case
• INDEX defines the new variables
• GROUPBY groups the new variables by INDEX or by original variable
• Sort by the ID and INDEX variables first.

SORT CASES BY caseid repeat.
CASESTOVARS /ID =caseid
/INDEX = repeat
/GROUPBY = INDEX
/VIND ROOT = "root".
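
The long-to-wide idea behind CASESTOVARS can be sketched in Python (this is not SPSS syntax). The score variable and its values are invented. The new names are built as original name, separator, index value, following the default "." separator shown in the general form; None stands in for a system-missing value.

```python
# Hypothetical long file: one row per (caseid, repeat) pair, already sorted.
long_rows = [
    {"caseid": 1, "repeat": 1, "score": 10},
    {"caseid": 1, "repeat": 2, "score": 12},
    {"caseid": 2, "repeat": 1, "score": 7},
]

wide = {}
for row in long_rows:
    # /ID = caseid: all rows sharing a caseid collapse into one case.
    case = wide.setdefault(row["caseid"], {"caseid": row["caseid"]})
    # /INDEX = repeat: the index value becomes part of the new variable name.
    case[f"score.{row['repeat']}"] = row["score"]

# Give every case the same set of columns, with None where nothing was recorded.
columns = sorted({k for c in wide.values() for k in c if k != "caseid"})
wide_rows = [{"caseid": c["caseid"], **{k: c.get(k) for k in columns}}
             for c in wide.values()]

for row in wide_rows:
    print(row)
```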

Transforming Variables to cases

Varstocases Command Syntax
VARSTOCASES
  /MAKE new variable ["label"] [FROM] varlist
  [/MAKE ...]
  [/INDEX = {new variable ["label"]                                   }]
            {new variable ["label"] (make variable name)              }
            {new variable ["label"] (n) new variable ["label"] (n) ...}
  [/ID = new variable ["label"]]
  [/NULL = {DROP**}]
           {KEEP  }
  [/COUNT=new variable ["label"]]
  [/KEEP={ALL**  }] [/DROP=varlist]
         {varlist}
**Default if the subcommand is omitted.

Varstocases example
• MAKE defines a new variable made by combining a list of variables.
• INDEX defines the new index variable.
• KEEP or DROP specifies which variables are to be seen in
  the new file.
• NULL specifies whether new cases containing only null values are
  dropped or kept.

VARSTOCASES
/MAKE reading FROM reading1 reading2
reading3
/INDEX = newfact "new var label" (reading)
/KEEP = caseid age gender group
/NULL = KEEP.
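
The wide-to-long step in this example can be mimicked in Python (not SPSS syntax). The data values are invented, vars_to_cases is a hypothetical helper, and the index variable is filled with the name of the source variable, which is my reading of the (reading) form of INDEX.

```python
# Hypothetical wide file: one row per case, three repeated readings.
wide_rows = [
    {"caseid": 1, "reading1": 5.0, "reading2": None, "reading3": 6.5},
    {"caseid": 2, "reading1": 4.2, "reading2": 4.8, "reading3": 5.1},
]
sources = ["reading1", "reading2", "reading3"]  # /MAKE reading FROM ...


def vars_to_cases(rows, keep_null=True):
    """Create one new case per source variable for every original case."""
    long_rows = []
    for row in rows:
        for name in sources:
            if row[name] is None and not keep_null:
                continue  # /NULL = DROP: skip cases with nothing recorded
            long_rows.append({"caseid": row["caseid"],
                              "newfact": name,        # index variable
                              "reading": row[name]})  # the MAKE variable
    return long_rows


kept = vars_to_cases(wide_rows)                      # /NULL = KEEP
dropped = vars_to_cases(wide_rows, keep_null=False)  # /NULL = DROP
print(len(kept), len(dropped))
```

With NULL = KEEP every case contributes three rows; with the default DROP the empty reading for case 1 disappears.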

Transposing data files

Flip Command Syntax


FLIP [[VARIABLES=] {ALL }]
{varlist}
[/NEWNAMES=variable]
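
A transposition of this kind can be pictured with a small Python sketch (not SPSS syntax). The variable names and values are invented. As far as I understand FLIP, the original variable names are kept in a new string variable in the flipped file, which is what the first element of each new row represents here.

```python
# Hypothetical file with three variables and two cases.
names = ["v1", "v2", "v3"]
data = [
    [1, 2, 3],   # case 1
    [4, 5, 6],   # case 2
]

# Each original variable (a column) becomes a case (a row) in the flipped file.
flipped = [[name, *column] for name, column in zip(names, zip(*data))]

for row in flipped:
    print(row)
```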

Merging SPSS data files
With command language, you can:
• combine consecutive files (different cases, same variables)
• combine parallel files (same cases, different variables)
• combine non-parallel files (roughly the same cases, possibly different variables)
• provide table look-up between files
• have one or more files update the contents of a source file.

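Add Files command

Equivalent to the menus Data > Merge Files > Add Cases.

[Figure: Add Files. A file with variables v1 v2 v3 and cases 1 to 6 is stacked on top of a file with the same variables and cases 7 to 9, giving a single file with variables v1 v2 v3 and cases 1 to 9.]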

Add Files command with different variables

First file:
caseid  var1  var3  var4  var5
1       23    2     45    O
2       34    2     45    C
3       26    1     45    O
4       75    1     54    O

Second file:
caseid  var2  var3  var4  var5
5       45    1     45    C
6       67    1     54    O
7       87    2     45    C
1       .     1     54    O

Result:
caseid  var1  var2  var3  var4  var5
1       23    .     2     45    O
2       34    .     2     45    C
3       26    .     1     45    O
4       75    .     1     54    O
5       .     45    1     45    C
6       .     67    1     54    O
7       .     87    2     45    C
1       .     .     1     54    O
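
The way missing values appear when files with different variables are stacked can be reproduced with a short Python sketch (not SPSS syntax). It uses a few values from the tables above and writes None where the result shows a full stop.

```python
# A few rows from the two files above (var4 and var5 left out for brevity).
file1 = [{"caseid": 1, "var1": 23, "var3": 2},
         {"caseid": 2, "var1": 34, "var3": 2}]
file2 = [{"caseid": 5, "var2": 45, "var3": 1}]

# The combined file carries every variable found in either input.
variables = []
for row in file1 + file2:
    for name in row:
        if name not in variables:
            variables.append(name)

# Cases are stacked; a variable a file did not have is missing (None).
combined = [{name: row.get(name) for name in variables}
            for row in file1 + file2]

for row in combined:
    print(row)
```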

Match Files command

Equivalent to the menus Data > Merge Files > Add Variables.

[Figure: Match Files. A file with variables v1 v2 v3 is placed side by side with a file with variables v4 v5 v6 v7, giving a single file with variables v1 to v7.]

Match Files with case matching

First file:
caseid  var1  var3  var4  var5
1       23    2     45    O
2       34    2     45    C
3       26    1     45    O
4       75    1     54    O

Second file:
caseid  var2  var6
1       45    1
2       67    4
4       9     2
5       35    1

Result:
caseid  var1  var2  var3  var4  var5  var6
1       23    45    2     45    O     1
2       34    67    2     45    C     4
3       26    .     1     45    O     .
4       75    9     1     54    O     2
5       .     35    .     .           1
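
The matching in the tables above keeps every caseid that occurs in either file. A Python sketch of that behaviour (not SPSS syntax) follows; match_files is a hypothetical helper, only var1 and var2 are carried to keep it short, and None stands for a missing value.

```python
# var1 from the first file and var2 from the second, keyed by caseid.
file1 = {1: {"var1": 23}, 2: {"var1": 34}, 3: {"var1": 26}, 4: {"var1": 75}}
file2 = {1: {"var2": 45}, 2: {"var2": 67}, 4: {"var2": 9}, 5: {"var2": 35}}


def match_files(a, b, a_vars, b_vars):
    """Join two keyed files side by side, keeping keys found in either one."""
    merged = []
    for key in sorted(set(a) | set(b)):  # /BY caseid, files sorted by key
        row = {"caseid": key}
        row.update({v: a.get(key, {}).get(v) for v in a_vars})
        row.update({v: b.get(key, {}).get(v) for v in b_vars})
        merged.append(row)
    return merged


result = match_files(file1, file2, ["var1"], ["var2"])
for row in result:
    print(row)
```

Case 3 has no var2 and case 5 has no var1, matching the full stops in the result table.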

Match Files via look-up table

Data file (variables city, state, V1, V2):
city         state
Albuquerque  NM
Atlanta      GA
Austin       TX
Baltimore    MD
Birmingham   AL
Boston       MA
…            …
Tucson       AZ
Washington   DC

Look-up table (variables state, V3):
state
AK
AL
AR
AZ
CA
…
WV
WY

Result (variables city, state, V1, V2, V3):
city         state
Albuquerque  NM
Atlanta      GA
Austin       TX
Baltimore    MD
Birmingham   AL
Boston       MA
…            …
Tucson       AZ
Washington   DC

ADD FILES command
Equivalent to the menus Data > Merge Files > Add Cases.

ADD FILES file="file1" / file="file2" .

Add Files Syntax

ADD FILES FILE={file}
               {*   }
  [/RENAME=(old varlist=new varlist)...]
  [/IN=varname]
  [/FILE=...]
  [/BY varlist]
  [/MAP]
  [/KEEP={ALL    }] [/DROP=varlist]
         {varlist}
  [/FIRST=varname] [/LAST=varname]

ADD FILES examples

Add files file="school1"
/file="school2"
/file="school3".

Add files file="register"
/file="newcases"
/in = newcase
/by caseid.

If you are matching using a key variable or variable list with the
BY keyword, as with the second example, then all the files need
to be sorted by the key variable(s) and saved using the SORT
CASES and SAVE commands.

MATCH FILES command

Equivalent to the menus Data > Merge Files > Add Variables.

MATCH FILES file= * /file="filename".

MATCH FILES command syntax

MATCH FILES {FILE } = {file}
            {TABLE}   {*   }
  [/RENAME=(old varlist=new varlist)...]
  [/IN=varname]
  [/{FILE }=...]
    {TABLE}
  [/BY varlist]
  [/MAP]
  [/KEEP={ALL    }] [/DROP=varlist]
         {varlist}
  [/FIRST=varname] [/LAST=varname]

MATCH FILES example

match files file="inter1"
/file="inter2"
/by respond
/map.

match files file="order"
/table="prices"
/by product.

In both examples, using a key variable or variable list with the BY
keyword requires all the files to be sorted by the key variable(s)
and saved using the SORT CASES and SAVE commands.
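
The second example, with TABLE, behaves like a look-up: values from the table file are copied onto every matching case, and table rows with no matching case add nothing. A Python sketch of that idea (not SPSS syntax), with invented products, quantities and prices:

```python
# Hypothetical "order" file: several cases may share the same product.
orders = [
    {"product": "pen", "qty": 3},
    {"product": "ink", "qty": 1},
    {"product": "pen", "qty": 2},
    {"product": "cap", "qty": 5},   # not in the price table
]

# Hypothetical "prices" table file: one row per key value.
prices = {"ink": 4.5, "pen": 1.2, "pad": 2.0}

# /TABLE="prices" /BY product: look the price up for each order.
matched = [{**order, "price": prices.get(order["product"])} for order in orders]

for row in matched:
    print(row)
```

The number of cases stays the same as in the order file: "pad" never appears in the result, and "cap" gets a missing price.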

UPDATE
UPDATE file="file1" /file="update" /by key.

UPDATE FILE={Master File}
            {*          }
  [/RENAME=(old varlist=new varlist)...]
  [/IN=varname]
  /FILE={Transaction File1}
        {*                }
  [/FILE=Transaction File2]
  /BY Key Variables
  [/KEEP={ALL    }]
         {varlist}
  [/DROP=varlist]
  [/MAP]

Update example
update file="customer"
/file="saltil01"
/file="sales02"
/by custno.

Using a key variable or key variable list with the BY keyword
requires all the files to be sorted by the key variable(s) and saved
using the SORT CASES and SAVE commands.
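
The Python sketch below (not SPSS syntax) illustrates how I understand UPDATE to treat a master file and a transaction file: non-missing transaction values overwrite the master, missing ones leave it alone, and keys found only in the transaction file become new cases. The customer data and the update helper are invented for illustration.

```python
# Hypothetical master file keyed by custno.
master = {
    101: {"city": "Leith", "balance": 50},
    102: {"city": "Perth", "balance": 20},
}

# Hypothetical transaction file.
transactions = [
    {"custno": 102, "city": None, "balance": 35},    # city missing: keep master value
    {"custno": 103, "city": "Oban", "balance": 10},  # new key: case is added
]


def update(master_file, transaction_rows, key="custno"):
    """Apply transaction rows to a copy of the master file."""
    result = {k: dict(v) for k, v in master_file.items()}
    for row in transaction_rows:
        target = result.setdefault(row[key], {})
        for name, value in row.items():
            if name != key and value is not None:  # missing never overwrites
                target[name] = value
    return result


updated = update(master, transactions)
for custno, values in sorted(updated.items()):
    print(custno, values)
```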

Try for Yourself
Try producing command versions of the exercises we used in SPSS 4. All the data
files should be in the spsswork folder unless otherwise specified. Remember
always to save the file under a different file name, using the SAVE command, at the
end to preserve the original data.

For restructuring files

• Use the AGGREGATE command to condense the casestovars.sav data file into a new
household level data file, using id_household as a break variable.
• Use the CASESTOVARS command with the casestovars.sav data file to create a
household level data file, this time with variables for each household member.
• Use the VARSTOCASES command with the varstocases.sav data file with two index
variables, V (2 levels) and time (3 levels).
• Use the VARSTOCASES command with the anxiety 2.sav data set to try to get back
to the original anxiety.sav data set.

For matching and merging files

Remember to sort the files by the key variables you are using when matching by key
variables and save the sorted files under a different name before matching.

• Use the ADD FILES command to add the data file customers_new.sav to
customers_model.sav in the C:/Programs Files/SPSS/sample_files/Tutorial
folder.
• Repeat the ADD FILES command, this time matching by customer_id and
using the SORT CASES and SAVE commands to sort by customer_ID and
save both data files. Does it work?
• Now try using the UPDATE command to merge the two customer files. How do
the commands differ?
• Use the ADD FILES command to combine match_response1.sav and
match_response2.sav.
• Use the MATCH FILES command to add the demographics variables from the
match_demographics.sav lookup table to the Match_response1.sav dataset,
matching by the ID variable. Try using the IN subcommand to create an indicator
variable, and remember to use SORT CASES by id beforehand.

Using menus to generate commands


Some utilities available from the SPSS menus involve a number of underlying SPSS
commands, so it can be useful to start off the command version by pasting the
commands using the Paste button in the equivalent dialog box.

For example, Data > Identify Duplicate Cases is used to identify duplicate
cases in a file. With the duplicates.sav dataset, try identifying duplicate cases
starting with the menus, using the ID_household and ID_person variables and
sorting by the interview date, then use the Paste button in the dialog box to copy
the underlying commands into a syntax window. Examine these commands to work out
what each one is doing; they are all commands you have come across.
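
The core of the pasted syntax is likely to resemble the simplified sketch below; the commands the dialog box actually pastes are longer. The interview date variable name int_date is an assumption, and PrimaryFirst and PrimaryLast are indicator names chosen for the example.

```spss
* Put duplicates next to each other, ordered by interview date.
SORT CASES BY ID_household ID_person int_date.

* Flag the first and last case within each group of matching keys.
MATCH FILES
  /FILE=*
  /BY ID_household ID_person
  /FIRST=PrimaryFirst
  /LAST=PrimaryLast.

* Cases with PrimaryLast equal to 0 are earlier duplicates.
FREQUENCIES VARIABLES=PrimaryLast.
```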


Session summary
This session covers commands used with data files.

• Transforming data files: Sort Cases, Aggregate, Casestovars,
Varstocases and Transpose
• Merging data files: Add Files, Match Files and Update

The next session, C, is about commands used to create cases and complex data file
structures.

Session C: Creating New Cases


Commands for Creating Data
The commands used in this session are more specialised: they deal with file
structures that are more complex than the simple rectangular case-by-variable
structure.

The notes in this session consist of the contents of the slides presented at the
course. The section on Reading Complex Text Files in chapter 3 of the SPSS
Programming and Data Management manual covers the same material, so it can also
be used for revision.

One of the building blocks for this programming is the DATA LIST command
detailed in session A. It is revised in the presentation but is not duplicated in
this session.

Creating Cases using File types

Types Of Non-rectangular File

SPSS has three non-rectangular file types:

• GROUPED
• MIXED
• NESTED

SPSS has separate tools for constructing cases from character
input, using INPUT PROGRAM.

Grouped data
Allowable differences from a standard rectangular file:

• different number of records for each case
• records out of order
• records with the wrong record identifier
• duplicate records
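
A small self-contained sketch of these differences is shown below. The data are invented and read inline, and the names #rec, id, name and score are made up for the example. Case 02 has its records out of order and case 03 has no type 2 record, so SPSS would be expected to issue warnings while still building one case per id.

```spss
* Record type is in column 4 and the case identifier in columns 1-2.
FILE TYPE GROUPED RECORD=#rec 4 CASE=id 1-2.
RECORD TYPE 1.
DATA LIST / name 6-12 (A).
RECORD TYPE 2.
DATA LIST / score 6-8.
END FILE TYPE.

* Case 02 arrives out of order and case 03 has no type 2 record.
BEGIN DATA
01 1 Ann
01 2  87
02 2  64
02 1 Bob
03 1 Cal
END DATA.
LIST.
```

Adding MISSING=NOWARN to the FILE TYPE command, as in the patients example, would suppress the warning about the missing record.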

Grouped Data - test data

Grouped data - test data commands


file type grouped file='A:\patients.txt' record=5
                  case=caseid 1-2 missing=nowarn.
record type 1.
data list / name 8-18(a) sex 19-24(a) age 25-27
            home 30-39(a) insurnce 40-50(a).
record type 2.
data list / testa 8-14(a) datea 15-22(adate)
            costa 23-31(2).
record type 3.
data list / testb 8-14(a) dateb 15-22(adate)
            costb 23-31(2).
record type 4.
data list / testc 8-14(a) datec 15-22(adate)
            costc 23-31(2).
end file type.

This generates the following data:
caseid name sex age home insurnce testa datea costa testb dateb costb testc datec costc
1 John Smith Male 45 Evanston Blue Cross X-Ray 03/04/97 456.34 Blood 03/05/97 67.40 . .
2 Mary Doe Female 33 Lockport Blue Cross X-Ray 05/23/97 435.12 . . . . .

Grouped data - city data - sample case

061Boston Suffolk Massachusetts 1822
062 64107116 56299420
063 2899101 8 2763357 10
064 1902000000 270.90 487590000 759297234 735216414

Grouped data - city data file

011Albuquerque Bernalillo New Mexico 1891


012 24450158 33176745
01 3 999999 999 999999 99
01 3 333266 102 454499 86
014 1318049290 22.02 113930000 86080200 91778547
021Atlanta De Kalb,Fulton Georgia 1847
022 49503927 42502229
023 1595517 18 2029618 16
02 5 3703819629 43.15 177903225 259718307 235052908
031Austin Travis Texas 1839
032 25353956 34549642
033 360463 93 536450 70
034 6680634190 5.70 132270000 98806248 102843197
041Baltimore Independent city Maryland 1797
042 905787 7 786775 9
043 2071016 13 2174023 14
044 2933326000 59.50 471078000 1944000 1731154967
051Birmingham Jefferson Alabama 1871
052 30091048 28441350
05 3 767230 44 847360 45
061Boston Suffolk Massachusetts 1822
062 64107116 56299420
063 2899101 8 2763357 10
064 1902000000 270.90 487590000 759297234 735216414
071Buffalo Erie New York 1832
072 46276828 35787039
07 4 1014912000 81.65 128308000 199214000 134929000
07 3 1349211 24 1242573 31

Grouped data - city data commands


file type grouped file='A:\cities.txt' record=3
                  case=caseid 1-2.
record type 1.
data list / city 4-16(a) county 17-52(a)
            state 53-72(a) incorp 73-76.
record type 2.
data list / pop70 4-10 rank70 11-12 pop80 13-20 rank80 21-22.
record type 3.
data list / mpop70 4-13 mrank70 14-18
            mpop80 19-27 mrank80 28-32.
record type 4.
data list / value 4-16 taxrate 17-23(2)
            debt 24-33 revenue 34-45 expend 46-56.
end file type.

Grouped data - city data - formatting commands

variable labels
    caseid 'case identification number'
    incorp 'year city was incorporated'
    pop70 '1970 population' rank70 'rank on 1970 population'
    pop80 '1980 population' rank80 'rank on 1980 population'
    mpop70 'metro pop in 1970' mrank70 'metro rank in 1970'
    mpop80 'metro pop in 1980' mrank80 'metro rank in 1980'
    value 'assessed value in dollars' taxrate 'city tax rate per 1000'
    debt '(usually bonded) in dollars' revenue 'revenue in dollars'
    expend 'expenditures in dollars'.

formats pop70 pop80 mpop70 mpop80 (comma9.0)/
        value (dollar18.2)/taxrate(dollar7.2)/
        debt(dollar14.0)/revenue, expend (dollar15.0).

descriptives var=pop70 pop80 mpop70 mpop80/
             stat=mean stddev min max.

Grouped Data - output from the data input commands


>Command line: 14  Current case: 1  Current splitfile group: 1 [Albuquerque]
>Duplicate Record ID: 3
>Case ID: 1
>Start of Record: 01 3 333266 102 454499 86

>Command line: 19  Current case: 2  Current splitfile group: 1 [Atlanta]
>Unknown Record ID: 5
>Start of Record: 02 5 3703819629 43.15 177903225 259718307 235052908

>Command line: 14  Current case: 2  Current splitfile group: 1 [Atlanta]
>Missing Record ID: 4
>Case ID: 2

>Command line: 14  Current case: 5  Current splitfile group: 1 [Birmingham]
>Missing Record ID: 4
>Case ID: 5

>Command line: 14  Current case: 7  Current splitfile group: 1 [Buffalo]
>Misordered Record ID: 3
>Case ID: 7
>Start of Record: 07 3 1349211 24 1242573 31

Grouped data - values read in

caseid city        county           state         incorp  pop70 rank70   pop80 rank80    mpop70
     1 Albuquerque Bernalillo       New Mexico      1891 244501     58 331,767     45   333,266
     2 Atlanta     De Kalb,Fulton   Georgia         1847 495039     27 425,022     29 1,595,517
     3 Austin      Travis           Texas           1839 253539     56 345,496     42   360,463
     4 Baltimore   Independent city Maryland        1797 905787      7 786,775      9 2,071,016
     5 Birmingham  Jefferson        Alabama         1871 300910     48 284,413     50   767,230
     6 Boston      Suffolk          Massachusetts   1822 641071     16 562,994     20 2,899,101
     7 Buffalo     Erie             New York        1832 462768     28 357,870     39 1,349,211

  mrank70    mpop80 mrank80          value taxrate       debt    revenue       expend
…     102   454,499      86  $1318049290.0  $22.02 $113930000  $86080200  $91,778,547
…      18 2,029,618      16              .       .          .          .            .
…      93   536,450      70  $6680634190.0   $5.70 $132270000  $98806248 $102,843,197
…      13 2,174,023      14  $2933326000.0  $59.50 $471078000 $1,944,000  $1731154967
…      44   847,360      45              .       .          .          .            .
…       8 2,763,357      10  $1902000000.0 $270.90 $487590000 $759297234 $735,216,414
…      24 1,242,573      31  $1014912000.0  $81.65 $128308000 $199214000 $134,929,000

Grouped data - sample results


Variable       Mean    Std Dev   Minimum    Maximum  N  Label
POP70     471945.00  240182.21   244,501    905,787  7  1970 population
POP80     442048.14  176622.69   284,413    786,775  7  1980 population
MPOP70    1339400.6  942657.87   333,266  2,899,101  7  metro pop in 1970
MPOP80    1435411.4  895858.99   454,499  2,763,357  7  metro pop in 1980

Grouped data - general form of commands


FILE TYPE GROUPED [FILE=file] RECORD=[varname] col loc
                  CASE=[varname] col loc
                  [WILD={WARN  }] [DUPLICATE={WARN  }]
                        {NOWARN}             {NOWARN}
                  [MISSING={WARN  }] [ORDERED={YES}]
                           {NOWARN}           {NO }
Always:
END FILE TYPE
and
RECORD TYPE {value list} [SKIP] [CASE=col loc]
            {OTHER     }
            [DUPLICATE={WARN  }] [MISSING={WARN  }]
                       {NOWARN}           {NOWARN}


Mixed Data

Mixed data - sample data

Mixed data - sample commands


FILE TYPE MIXED FILE='C:\TRAIN\MIXED.DAT'
                RECORD = SYSTEM 6.
RECORD TYPE 1.
DATA LIST / ID 2-4 AGE 8-9 SEX 11 SALARY 13-17 TENURE 19-20
JOBCODE 22 LOCATION 24 JOBRATE 26.
RECORD TYPE 2.
DATA LIST / ID 2-4 AGE 8-9 SEX 11 JOBRATE 13 LOCATION 15
JOBCODE 17 TENURE 19-20 SALARY 22-26.
RECORD TYPE 3.
DATA LIST / ID 2-4 AGE 8-9 SEX 11 SALARY 13-17
TENURE 19-20 LOCATION 22.
END FILE TYPE.

Mixed data - formatting commands


VARIABLE LABELS SYSTEM 'System of Record Keeping'
                JOBRATE 'Overall Job Performance'
                LOCATION 'Department'
                TENURE 'Months in current job'.
VALUE LABELS SYSTEM 1 'Latest version'
                    2 'Middle version'
                    3 'Earliest version' /
             SEX 0 'Male' 1 'Female' /
             JOBRATE 1 'Poor' 2 'Fair' 3 'Good' 4 'Excellent'.
FREQUENCIES VARIABLES=SYSTEM.

Mixed data - output

Mixed data - general form of commands


FILE TYPE MIXED [FILE=file] RECORD=[varname] col loc
                [WILD={NOWARN}]
                      {WARN  }
Always:
END FILE TYPE

and (as for FILE TYPE GROUPED):

RECORD TYPE {value list} [CASE={col loc }]
            {OTHER     }       {(format)}
            [DUPLICATE={WARN  }] [MISSING={WARN  }]
                       {NOWARN}           {NOWARN}
            [SKIP]
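
A minimal inline sketch of a mixed file is given below. The data and the variable names rectype, id and age are invented for illustration. Each record becomes one case, the layout used depends on the record type in column 1, and records of type 3 are not defined, so they would be expected to be skipped.

```spss
* The record type in column 1 decides which DATA LIST is applied.
FILE TYPE MIXED RECORD=rectype 1 WILD=NOWARN.
RECORD TYPE 1.
DATA LIST / id 3-5 age 7-8.
RECORD TYPE 2.
DATA LIST / id 3-5 age 10-11.
END FILE TYPE.

* The type 3 record has no RECORD TYPE definition.
BEGIN DATA
1 101 34
2 102    45
3 103 29
END DATA.
LIST.
```

Changing WILD=NOWARN to WILD=WARN should make SPSS report the undefined record type instead of skipping it silently.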

Nested data

Nested data - sample data

0001 1 03 01 00/04/26 Edinburgh


0001 2 0001 4 GB Ford 01
0001 3 M 1 Royal Sun Alliance
0001 2 0002 1 GB Rover 03
0001 3 M 1 Direct Line
0001 3 F 1 Direct Line
0001 3 M 1 Direct Line
0001 2 0003 4 F Citroen 01
0001 3 M 0 AXA
0002 1 01 01 00/04/26 East Lothian
0002 2 0004 2 GB Lotus 01
0002 3 F 2 CGU
0003 1 01 02 00/04/27 Edinburgh
0003 2 0005 5 GB Vauxhall 05
0003 3 F 1 Eagle Star
0003 3 M 3 Eagle Star
0003 3 F 2 Eagle Star
0003 3 M 0 Eagle Star
0003 3 F 2 Eagle Star

Nested data - sample commands


file type nested file='A:\accident.txt'
                 record=#recid 6 case=accid 1-4.
record type 1.                  /* accident record
. data list / ncar 8-9 weather 12-13 yr 20-21
              mo 23-24 da 26-27 county 29-40(a).
record type 2.                  /* vehicle record
. data list / carid 8-11 agecar 12-13
              country 15-16(a) make 20-29(a) nperson 40-41.
record type 3.                  /* victim record
                                /* defines the case
. data list / sex 8 (a) injury 16 insurer 20-39(a).
end file type.
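
The same idea can be tried on a smaller scale with inline data, as in the sketch below. The two-level family and child data and the names #rec, famid, town, child and age are invented for the example. One case should be built for each lowest-level (type 2) record, with the values from the type 1 record spread to every case in the same family.

```spss
* Type 1 records describe a family and type 2 records describe a child.
FILE TYPE NESTED RECORD=#rec 4 CASE=famid 1-2.
RECORD TYPE 1.
DATA LIST / town 6-14 (A).
RECORD TYPE 2.
DATA LIST / child 6-10 (A) age 12-13.
END FILE TYPE.

BEGIN DATA
01 1 Leith
01 2 Anna   7
01 2 Ben   10
02 1 Dunbar
02 2 Cara   4
END DATA.
LIST.
```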

Nested data - formatting commands

variable labels ncar 'number of vehicles'
    weather 'weather conditions at time of accident'
    agecar 'age of vehicle'
    nperson 'number of persons'
    insurer 'insurance company providing benefits'
    injury 'extent of injury'.
value labels
    weather 1 'clear' 2 'rain' 3 'sleet' 4 'light snow' /
    injury 0 'none' 1 'minor' 2 'major' 3 'doa' /.
tables / format Zero
    / tables = injury by weather > (statistics)
    / statistics cpct((pct5.0) 'Row %':injury)
                 count((f5.0)).

Nested data - values read in

accid ncar weather year month day district     carid agecar country make     nperson sex injury insurer
    1    3       1   00    04  26 Edinburgh        1      4 GB      Ford           1 M        1 Royal Sun Alliance
    1    3       1   00    04  26 Edinburgh        2      1 GB      Rover          3 M        1 Direct Line
    1    3       1   00    04  26 Edinburgh        2      1 GB      Rover          3 F        1 Direct Line
    1    3       1   00    04  26 Edinburgh        2      1 GB      Rover          3 M        1 Direct Line
    1    3       1   00    04  26 Edinburgh        3      4 F       Citroen        1 M        0 AXA
    2    1       1   00    04  26 East Lothian     4      2 GB      Lotus          1 F        2 CGU
    3    1       2   00    04  27 Edinburgh        5      5 GB      Vauxhall       5 F        1 Eagle Star
    3    1       2   00    04  27 Edinburgh        5      5 GB      Vauxhall       5 M        3 Eagle Star
    3    1       2   00    04  27 Edinburgh        5      5 GB      Vauxhall       5 F        2 Eagle Star
    3    1       2   00    04  27 Edinburgh        5      5 GB      Vauxhall       5 M        0 Eagle Star
    3    1       2   00    04  27 Edinburgh        5      5 GB      Vauxhall       5 F        2 Eagle Star

Nested files - sample output

Nested data - further commands


recode injury (2,3=1)(else=0) into injured.

comment aggregate up to car level.

aggregate outfile=*/break=carid/
agecar=first(agecar)/
numinjur=sum(injured).

formats numinjur(f1.0) agecar(f2.0).

crosstabs /tables=agecar by numinjur.

Further output

Creating Cases using programming

Input Program
Input programs provide a set of tools to read character data which
are not set out in the rectangular case/variable format needed for
analysis in SPSS.

Repeating data - sample data


house-  no.     no.
hold    adults  children

64C     2       2         JOHN M93   IRIS F91
66C     4       1         DAVID M93
68C     2       3         MARY F93   FAY F95   JOHN M00
70C     5       0
72C     1       1         PETER M98

Repeating data - cases for analysis


64C 2 2 JOHN M 93
64C 2 2 IRIS F 91
66C 4 1 DAVID M 93
68C 2 3 MARY F 93
68C 2 3 FAY F 95
68C 2 3 JOHN M 00
72C 1 1 PETER M 98

Repeating data - sample commands


INPUT PROGRAM.
DATA LIST FILE='children' / HSEHLD 1-3(A)
                            NADULTS 5-6 NCHILD 8-9.
REPEATING DATA OCCURS=NCHILD /STARTS=12 /
               DATA = NAME 1-9(A) SEX 10(A) BIRTHYR 11-12.
END INPUT PROGRAM.

Input Program - general commands


INPUT PROGRAM.

commands to create cases

END INPUT PROGRAM.

Input Program - commands to create cases


LOOP.
transformations
END LOOP.

BREAK.
VECTOR.
END CASE.
END FILE.
REREAD.
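
As a minimal sketch of how these commands fit together, the input program below builds five cases without reading any data file. The variable name id and the scratch variable #i are arbitrary choices for the example.

```spss
INPUT PROGRAM.
* Each pass through the loop writes one case to the active dataset.
LOOP #i = 1 TO 5.
COMPUTE id = #i.
END CASE.
END LOOP.
* END FILE tells SPSS that no more cases are to be built.
END FILE.
END INPUT PROGRAM.
LIST.
```

END CASE controls when a case is written, which is why it sits inside the loop. END FILE is needed here because there is no data file whose end would otherwise stop the input program.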

Reread - sample commands


INPUT PROGRAM.
DATA LIST FILE=CARUSE / TYPE 1-4(A).
DO IF (TYPE = 'WORK').
. REREAD.
. DATA LIST / ORIGIN 5-7, DEST 8-10.
. END CASE.
ELSE IF (TYPE = 'SOCL').
. REREAD.
. DATA LIST / ORIGIN 15-20, DEST 31-35.
. END CASE.
END IF.
END INPUT PROGRAM.

Trivial example of rearrangement


Input data are in the form:
2 1 1
3 5 4
For analysis they are required in the form:
2
1
1
3
5
4

Trivial example - commands

INPUT PROGRAM.
DATA LIST FILE='test' / #X1 TO #X3 1-6.
VECTOR V = #X1 TO #X3.
LOOP #I = 1 TO 3.
. COMPUTE X = V(#I).
. END CASE.
END LOOP.
END INPUT PROGRAM.

Examples of LOOP
COMPUTE X = 0.
LOOP.
. COMPUTE X = X+1.
END LOOP.

LOOP #I = 1 TO 5.
. COMPUTE X = X+1.
END LOOP.
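
The first loop above has no index and no IF clause, so nothing in the loop itself stops it. The sketch below illustrates what limits it, on the assumption that such a loop ends once the MXLOOPS setting is reached for a case. The one-case inline dataset, the variable id and the value 10 are chosen purely for illustration.

```spss
* A single case is enough to show the effect.
DATA LIST FREE / id.
BEGIN DATA
1
END DATA.

* Limit loops that have no indexing clause to 10 passes per case.
SET MXLOOPS=10.
COMPUTE x = 0.
LOOP.
COMPUTE x = x + 1.
END LOOP.
LIST.
```

The second loop above, with #I = 1 TO 5, is controlled by its index instead, so it runs five times regardless of this setting.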

More loop examples


LOOP.
...
END LOOP IF X=5.

LOOP IF Y<70.
...
END LOOP.

LOOP #I=1 TO N BY M IF NOT MISSING(Y).
...
END LOOP IF X > Z.

LOOP #I = 1 TO M.
. LOOP #J = 1 TO M.
....
. END LOOP. /* #J
END LOOP. /* #I

VECTOR AGES = AGE1 TO AGE10.
LOOP #I=1 TO NCHILD.
. COMPUTE #TEEN = RANGE(AGES(#I), 13, 19).
END LOOP IF #TEEN.

SEB data example
One case (one examination candidate) has:

• examination centre
• sex
• presentation type (S4, S5, S6, ..)
• up to 14 sets of
  - subject code
  - subject grade (standard, higher, …)
  - subject award band 1
  - subject award band 2
• subtotal check
• plus other data, such as the candidate's name, ignored for these purposes

SEB example - commands

INPUT PROGRAM
STRING SEX(A1)
VECTOR #S(56)
DATA LIST FILE=SEBDATA NOTABLE /
    #CENTRE, #SEX, #PRESTYP, #S1 TO #S56, #SUBTOTL
    (T2,P4.0,T43,A1,P1.0,T49,14(P3.0,P2.0,P2.0,4X,P2.0,6X),T321,P2.0)
LOOP #I=1 TO 53 BY 4
. DO IF MISSING(#S(#I))
. BREAK
. END IF
. COMPUTE CENTRE=#CENTRE
. COMPUTE SEX=#SEX /* SEX is 'M' or 'F'
. COMPUTE PRESTYPE=#PRESTYP
. COMPUTE SUBJNO=#S(#I)
. COMPUTE SUBJGR=#S(#I+1)
. COMPUTE SUBJRN=#S(#I+2)
. COMPUTE SUBJOR=#S(#I+3)
. END CASE
END LOOP
END INPUT PROGRAM

SEB example - continued


PRINT FORMATS CENTRE(F7.0)/PRESTYPE(F1.0)/
    SUBJNO TO SUBJOR(F4.0)
VARIABLE LABELS CENTRE 'Examination centre'
    PRESTYPE 'Presentation type'
    SUBJNO 'Subject code number'
    SUBJGR 'Subject grade (stan, O, H, short)'
    SUBJRN 'Subject award band (% for O)'
    SUBJOR 'Subject award band (O only)'
SELECT IF ANY(PRESTYPE,4,5,6) /* Omit unwanted data
    AND ANY(CENTRE, /* Omit unwanted data
    5502136,5502330,5502535,5502632,5509238,5509432,
    5509734,5509831,5509939,5510031,5510139,5510236,
    < 6 lines omitted >
    5556031,5556139,5556236,5599999)

RECODE CENTRE (5531330=1)(5509238=2)(5541638=3)(5509432=4)
    (5547938=5)(5509831=6)(5531438=7)(5531632=8)(5510031=9)
    < 6 lines omitted >
    (5533732=46)(5534534=47)(5534631=48)(5502330=49)(5533937=50)
    (5510430=51)(5599999=52) INTO CENTREA

RECODE SUBJNO
    (0019=0001)(0030=0002)(0069=0003)(0070=0004)(0090=0005)
    < lines omitted >
    (7905=0185)(8000=0186)
    (ELSE=0) INTO SUBJNOA
RECODE SEX('M'=1)('F'=2)(ELSE=0) INTO SEXA
VALUE LABELS
    CENTREA 1 "Ainslie Park High" 2 "Armadale Academy"
    < lines omitted >
    50 "Wester Hailes EC" 51 "Whitburn Academy" /
    SEXA 1 'male' 2 'female' /
    SUBJGR 1 'ordinary' 2 'standard' ... 9 'short course' /
PRINT FORMATS CENTREA(F2.0)/SEXA(F1.0)/SUBJNOA(F3.0)
SAVE OUTFILE=SEBSAVE
    /DROP=CENTRE,SUBJNO,SEX /* drop un-recoded vars

FREQUENCIES VAR=CENTREA(1,52),SEXA(0,2),PRESTYPE(4,6),
    SUBJNOA(0,186),SUBJGR(0,66)

Try for Yourself
A series of data files and syntax files, equivalent to some of the slides, have been
placed in the spsswork folder so that you can test out some of the commands.

All SPSS syntax (command) files have the .sps extension. Double-click on one of
them and SPSS should open it in a syntax window.

grouped data
• GROUPED.DAT with GROUPED.SPS
• patientgroup.txt with patientgroup.sps
• Cities.txt with Cities.sps

mixed data
• MIXED.DAT with mixed.sps

nested data
• Accident.txt with Accident.sps

input program
• children.dat with children.sps
• input1.sps (the data are incorporated in the syntax file and read into SPSS
using the BEGIN DATA and END DATA commands)

Check the file paths for any data files read in using syntax: the biggest
source of errors is SPSS not being able to find the file.
These examples use M:/spsswork/ in front of the file name, so you
may need to change this if you are not using IS Skills machines.

Session summary
This session covers creating cases using syntax:

• Using the File Type - End File Type commands
• File types: Grouped, Mixed or Nested
• Using the Input Program - End Input Program commands, sometimes with
Loop - End Loop
