Stat/Transfer User Manual
Version 12
Copyright 2013
All Rights Reserved
Circle Systems, Inc.
The Stat/Transfer program is licensed for use on a single computer system or network
node. Use by multiple users on more than one computer is prohibited. If in doubt,
please call and ask about our very economical site licenses.
Stat/Transfer is a trademark of Circle Systems, Inc.
This manual refers to numerous products by their trade names. In most, if not all, cases
these designations are claimed as trademarks or registered trademarks by their
respective companies.
CIRCLE SYSTEMS
SINGLE USER LICENSE AGREEMENT AND LIMITED WARRANTY FOR
STAT/TRANSFER
IMPORTANT: READ CAREFULLY BEFORE INSTALLING THE STAT/TRANSFER SOFTWARE. By clicking the
Next button, opening the sealed packet(s) containing the software, or using any portion of the software, you accept all of the
following Circle Systems License Agreement.
THIS IS A LEGAL AGREEMENT BETWEEN CIRCLE SYSTEMS, INC. AND YOU, THE END USER. CAREFULLY
READ THIS AGREEMENT BEFORE OPENING, INSTALLING, OR USING THE STAT/TRANSFER SOFTWARE (the
software). CIRCLE SYSTEMS WILL NOT ACCEPT ANY PURCHASE ORDER OR SELL YOU A LICENSE TO
INSTALL AND USE THE SOFTWARE UNLESS YOU AGREE TO ALL OF THE TERMS OF THIS LICENSE
AGREEMENT. IF YOU DO NOT AGREE TO THESE TERMS, DO NOT OPEN THE DISK PACKAGE, OR INSTALL OR
USE THE SOFTWARE ON YOUR COMPUTER; REMOVE ALL COPIES FROM YOUR COMPUTER AND RETURN
THE SOFTWARE AND ANY ACCOMPANYING MATERIALS WITHIN 30 DAYS OF PURCHASE, WITH PROOF OF
PURCHASE, FOR A FULL REFUND OF THE AMOUNT YOU ORIGINALLY PAID FOR THE SOFTWARE.
SOFTWARE LICENSE
Circle Systems grants you the right to load and use one copy of the software on a single computer (your Dedicated Computer).
You may transfer the software to another single Dedicated Computer provided you remove all copies of the software from the
first computer when you install it on the other computer. If one individual uses the Dedicated Computer more than 80% of the
time that it is in use, then that individual may also load and use the software on that individual's portable or home computer.
You may also make a copy of the software for backup or archival purposes. If you receive a copy of the software electronically
and on disk, you may use the disk copy for archival purposes only.
Copyright and other intellectual property laws and international treaty protect this software. Copyright law prohibits you
from making any other copy of the software and user manual without the permission of Circle Systems. You may not alter,
modify, or adapt the software or user manual, or create any derivative works based on them. Circle Systems distributes the
software in computer executable form only, and does not allow user access to the underlying source code and data. You may
not reverse engineer, decompile, or disassemble the software to gain access to such code and data, except to the extent
applicable law expressly permits such activity. Decompiling or disassembling the software may also violate the software's
copyright.
You may not sublicense, sell, rent, lend, lease, or give away the software to others. You may, however, with the
prior written permission of Circle Systems, transfer the software, written materials, and this license agreement as a package if
the other party registers with Circle Systems and agrees to accept this agreement. You may not transfer a license originally
sold in a volume or network license unless you transfer all the licenses at the site. You may not retain any copies of the
software yourself once you have transferred it.
Any unauthorized copying, distribution, or modification of the software will automatically cancel your license to use the
software and violate the software's copyright.
recovering software or data, the cost of substitute software, claims by third parties, or similar costs. In no event will the
liability of Circle Systems exceed the amount paid for the software.
GENERAL
This is the complete and exclusive statement of the agreement between you and Circle Systems. It supersedes any prior
agreement or understanding, oral or written, between you and Circle Systems, its agents and employees, with respect to this
subject. No Circle Systems distributor, dealer, or agent is authorized to make any modification, extension, or addition to this
Agreement and the limited warranty and limitation of liability. The laws of the State of Washington, USA govern this
agreement.
Table of Contents
Introduction
WHAT STAT/TRANSFER DOES
SUPPORTED FILE TYPES
WHAT'S NEW IN STAT/TRANSFER
GETTING STARTED
TRIAL MODE SOFTWARE
ACTIVATION
ACTIVATING ONLINE
ACTIVATING WITHOUT AN INTERNET CONNECTION
READ-ME FILE
DEMO FILES
WEB UPDATE
ONLINE DOCUMENTATION
REMOVING STAT/TRANSFER
TECHNICAL SUPPORT
THE STAT/TRANSFER USER INTERFACE
THE USER INTERFACE
STARTING STAT/TRANSFER
TRANSFER DIALOG BOX
SELECTING THE INPUT FILE FORMAT
SELECTING THE INPUT DATA FILE
DATA VIEWER
VARIABLE SELECTION INDICATOR
SELECTING THE OUTPUT FILE FORMAT
SELECTING THE OUTPUT FILE
RUNNING THE PROGRAM
VARIABLES DIALOG BOX
VARIABLE SELECTION
OUTPUT VARIABLE TYPES
AUTOMATIC OPTIMIZATION OF TARGET TYPES
MANUALLY CHANGING THE TYPES OF OUTPUT VARIABLES
OBSERVATIONS DIALOG BOX
SELECTING CASES FROM THE INPUT FILE
CASE-SELECTION EXPRESSIONS
SAMPLING FUNCTIONS
OPTIONS DIALOG BOX
GENERAL OPTIONS
USER MISSING VALUES
DATE/TIME FORMATS - READING
DATE/TIME FORMATS - WRITING
ENCODING OPTIONS
ODBC OPTIONS
ASCII/TEXT FILES - READ OPTIONS
ASCII/TEXT FILES - WRITE OPTIONS
READING SAS VALUE LABELS
WRITING SAS VALUE LABELS
WORKSHEETS
JMP OPTIONS
R AND S-PLUS OPTIONS
RATS OPTIONS - WRITING
OUTPUT OPTIONS (1)
OUTPUT OPTIONS (2)
USER INTERFACE OPTIONS
DEFAULT DIRECTORIES
DATA VIEWER OPTIONS
STAT/TRANSFER PROGRAM GENERATION
AUTOMATIC TRANSFER LOGGING
RESTORING AND SAVING OPTIONS
ADVANCED ODBC OPTIONS
RUN PROGRAM DIALOG BOX
CREATING A PROGRAM AUTOMATICALLY FROM THE USER INTERFACE
CREATING A PROGRAM MANUALLY
RUNNING A PROGRAM
LOG DIALOG BOX
STAT/TRANSFER LOG
LOG LEVEL
SAVE LOG FILE
CLEAR LOG
SEND ERROR REPORT
THE COMMAND PROCESSOR
STARTING THE COMMAND PROCESSOR
THE COPY COMMAND
TRANSFERS FROM THE COMMAND PROCESSOR
TRANSFERS FROM THE OPERATING SYSTEM PROMPT
COMBINING FILES
SPECIFYING THE FILE TYPE
FILE FORMAT SPECIFICATION
SPECIAL CASES WHEN SPECIFYING FILES
OPTIONS SET BY PARAMETERS AFTER COPY
OPTIONS FOR PAGES AND TABLES
OPTIONS FOR VARIABLES
OPTIONS FOR MESSAGES
SELECTING CASES
SELECTING VARIABLES
KEEP AND DROP COMMANDS
CHANGING OUTPUT VARIABLE TYPES
THE TYPES COMMAND
CHANGING TARGET OUTPUT TYPES
SETTING OPTIONS WITH THE SET COMMAND
AVAILABLE OPTIONS
OTHER AVAILABLE COMMAND PROCESSOR COMMANDS
OPERATING SYSTEM COMMANDS
QUIT
COMMAND PROCESSOR HELP
LOGGING STAT/TRANSFER SESSIONS
COMMAND FILES
CONSTRUCTING COMMAND FILES
COMMAND FILE NAME EXTENSIONS
EXECUTING COMMAND FILES
RUNNING PROGRAMS AND COMMANDS IN COMMAND FILES
ODBC DATA SOURCES
THE DBR AND DBW COMMANDS
RUNNING BATCH JOBS WITH ODBC
VARIABLE NAMING AND LIMITS
VARIABLE NAMES
LIMITATIONS ON THE NUMBER OF VARIABLES
STRINGS WITH VALUE LABELS
INTERNAL LIMITATIONS
RETURN TRANSFERS TO THE ORIGINAL FORMAT
Introduction
What Stat/Transfer Does
Stat/Transfer is designed to simplify the transfer of statistical data between different programs.
Data generated by one program is often needed in another context, either for analysis, for cleaning
and correction, or for presentation. However, not only must the data be transferred, but in addition,
the variables generally must be re-described for each program with additional information, such as
variable names, missing values and value and variable labels. This process is not only time-consuming,
it is error-prone. For those in possession of data sets with many variables, it represents a
serious impediment to the use of more than one program.
Stat/Transfer removes this barrier by providing an extremely fast, reliable and automatic way to
move data.
Stat/Transfer will automatically read statistical data in the internal format of one of the supported
programs and will then transfer as much of the information as is present and appropriate to the
internal format of another.
Stat/Transfer preserves all of the precision in your data by storing it internally in double precision
format. However, on output, it will, where possible, automatically minimize the size of your output
data set by intelligently choosing data storage types that are only as large as necessary to preserve the
input precision. Stat/Transfer also allows precise and easy manual control over the storage format of
your output variables, in case this is necessary.
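The idea of choosing the smallest storage type that still preserves the input precision can be illustrated with a short Python sketch. This is not Stat/Transfer's actual algorithm; the type names (int8, int16, int32, float32, float64), the helper names, and the use of None to mark a missing value are all assumptions made for illustration.

```python
import struct

# Candidate integer types, smallest first: (name, min, max).
# The names are illustrative, not Stat/Transfer's own type names.
INT_TYPES = [
    ("int8", -2**7, 2**7 - 1),
    ("int16", -2**15, 2**15 - 1),
    ("int32", -2**31, 2**31 - 1),
]

def fits_float32(x):
    """Return True if x survives a round trip through 4-byte float storage."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0] == x
    except OverflowError:  # magnitude too large for a 4-byte float
        return False

def smallest_type(values):
    """Pick the narrowest type that holds every non-missing value exactly."""
    present = [v for v in values if v is not None]  # None marks a missing value
    if all(float(v).is_integer() for v in present):
        lo, hi = min(present, default=0), max(present, default=0)
        for name, tmin, tmax in INT_TYPES:
            if tmin <= lo and hi <= tmax:
                return name
    if all(fits_float32(float(v)) for v in present):
        return "float32"
    return "float64"  # fall back to full double precision
```

For example, a column holding only the values 1, 2 and 100 would be stored in a single byte under this scheme, while a column containing 0.1 would stay in double precision because 0.1 cannot be represented exactly in four bytes.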
In addition to converting the formats of variables, Stat/Transfer also processes missing values
automatically.
Stat/Transfer can save hours and even days of manual labor, while at the same time eliminating error.
Furthermore, you gain this speed and accuracy without losing flexibility, since Stat/Transfer allows
you to select just the variables and cases you want to transfer.
In addition to the standard graphical user interface, a command processor allows you to run a transfer
in batch mode, using a command file. The user interface can automatically generate a command file
to exactly reproduce your data transfer operations. This makes it straightforward to set up fully
automatic batch procedures for repetitive tasks and allows you to precisely document the work that
you have done.
NLOGIT
ODBC
OpenDocument Spreadsheets
OSIRIS (read-only)
Paradox
Quattro Pro
R
RATS
SAS Data
SAS CPORT (read-only)
SAS Transport
S-PLUS
SPSS Data
SPSS Portable
Stata
Statistica
SYSTAT
Triple-S
In addition, for data archiving and exchange, Stat/Transfer will write ASCII data together with
programs to read the data back into SAS, SPSS, and Stata. It also supports its own, easy to use,
Schema format for reading in external data in ASCII format.
See the topics for specific file types for more information on supported versions of each type.
Excel 2013
FileMaker
gretl
Stata 13
SYSTAT 13
New Options
Command Processor
There is a new switch that allows you to read a file of options before transferring from the operating
system.
You can now install and uninstall silently, without any user intervention or prompting.
One copy of a lease license can now serve all licensed users on a network.
Getting Started
Activation
After installation is complete, you must activate your software, using a code found on your CD
envelope or emailed to you. It is easiest if you have an internet connection during the activation
process.
For detailed instructions, please see the Support/Activation section of our website,
www.stattransfer.com. If you have any trouble with the activation process, please contact
[email protected] for assistance.
Activating Online
First, go to the About tab and press the Activate Online button. On the next screen, enter your
activation code. Press Next and you will be asked to enter your name, organization and email address.
Press Next again to enter a password, which will be used if you re-activate your software on another
computer (see below). You should not use a valuable password. Remember to write down your
password in your software manual or another place where you can find it if you need it.
Finally, when you press Next again, your information will be sent to our server and, if your serial
number is valid, the activation information will be written to your computer. Once activation is
complete, you must restart Stat/Transfer.
Read-me File
The installation or Web update procedure may copy a file called read.me, which will be a supplement
to the on-line help or manual. There is a shortcut to the read.me file from the Windows Start menu,
in the Stat/Transfer group.
You should check to see if the read.me file exists or has been updated. If so, you can read it in any
editor or word-processor.
We make every effort to keep up with changes in the file formats of popular software and the read.me
file will contain the latest information on which versions of these programs are supported. The file will
also contain the latest information on other improvements to Stat/Transfer.
You can also get current information about Stat/Transfer by visiting our website at
www.stattransfer.com.
You can also reach our website from the Stat/Transfer About screen.
-o-
Demo Files
The distribution disk contains sample files in many of the supported formats, which you may find
useful in learning about Stat/Transfer's capabilities. The file name indicates which program format
each file corresponds to. In addition, there is a file, demo.xls, which illustrates the way Stat/Transfer
treats different kinds of variables. The installation program will copy these files to the same directory
chosen for the installation of Stat/Transfer.
Web Update
We periodically post maintenance releases of Stat/Transfer on our website to support new file formats,
add features, or to fix problems that have come to our attention. However, we have found that many
people are not taking advantage of these releases, so they are using software that is older than it should
be. To address this problem, Stat/Transfer will automatically check the web for updates.
By default, Stat/Transfer will check for new versions once every week. However, you can change this
option. You can check immediately, daily, weekly, monthly, quarterly, or (if you are running on a
computer that is not connected to the web) never. To change the interval at which the program checks
the web, click on the About tab and select one of the options.
Suppose you choose Every Week (the default). Each time you start Stat/Transfer, the program will
compare the current date to the date at which the version was last checked. If the difference is less
than seven days, nothing will happen. If it is seven days or more, the update program will ask you if
you would like to check the web.
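The date comparison described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the program's own code: the function name, the setting names, and the day counts assumed for the monthly and quarterly options are hypothetical.

```python
from datetime import date

# Days between checks for each setting. The setting names and the lengths
# used for "monthly" and "quarterly" are assumptions for illustration.
INTERVAL_DAYS = {"daily": 1, "weekly": 7, "monthly": 30, "quarterly": 90, "never": None}

def should_offer_check(last_checked, today, setting="weekly"):
    """Return True if enough days have passed to ask about checking the web."""
    interval = INTERVAL_DAYS[setting]
    if interval is None:  # "never": no automatic checks at all
        return False
    return (today - last_checked).days >= interval
```

With the weekly default, a last check on 1 January leads to no prompt on 7 January (six days later) but does lead to one on 8 January (seven days later).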
If you choose to do so, the program will check our website for the latest version. If it finds a version
that is newer than yours, it will ask you whether or not to download the latest release. If you choose to
do so, it will be installed on your computer. The read-me file will also be downloaded, so that you
can check to see what new features have been added.
If you wish, you can tell Stat/Transfer to do an immediate check for a newer version rather than wait
for an automatic check. To do so, go to the About tab and select Right Now. The update program
will then check our website for updates as described above.
-o-
Online Documentation
Manual
The document you are currently reading is available on our website in HTML and PDF format. The
PDF manual is installed with Stat/Transfer for Windows, and can be opened by clicking on the Start
button, then pointing to Programs. Point to the Stat/Transfer folder and when the contents appear,
click on Stat/Transfer PDF Manual.
If you are using Stat/Transfer on other platforms, you can download the PDF file from our website.
Online Help
The Stat/Transfer online help contains all of the information found in the manual. You can access the
online help by pressing the Help buttons or the ? buttons on the Stat/Transfer dialog boxes.
Removing Stat/Transfer
On Windows, if you would like to remove Stat/Transfer from your hard disk, simply select the
Uninstall option from the Start menu's Stat/Transfer folder.
On OS-X, select the Uninstall item in the Stattransfer12 folder of your Applications folder.
On Linux and Solaris, navigate to the Stat/Transfer program directory and execute the uninstall
program.
-o-
Technical Support
Before you seek support, please check the online help or look in the online manual and see if the
solution to your problem can be found there. Be sure to check the Frequently Asked Questions
section. You can also check to see if your problem is addressed in the Support section of our website.
If you have a problem that you cannot resolve by these methods, the best way to seek help is by using
our web form at www.stattransfer.com/support/help.html. We have found that the use of this form,
rather than a free-form email, allows us to provide faster and better support. If you use this form, we
are much more likely to get the information we need to address your problem on the first try. In
addition, your request will be routed to the right person and automatically entered into a tracking
system so that your problem will not fall through the cracks. To ensure the best service, please be sure
to follow the instructions on the form and enter all of the information that is called for.
You can also seek support by using the Log tab. This method is particularly helpful if you think you
have found a bug in Stat/Transfer, because it is possible to automatically send us a compressed and
encrypted copy of the input file that was causing you problems, as well as a complete description of
your environment and your own description of the problem. If you send a support request through this
method, it is a very good idea to also use the form on our website. This will ensure that your problem
is entered into our tracking system.
It is always good to make sure you are running the latest version of Stat/Transfer. You can go to the
About tab and look up the exact version of Stat/Transfer that you are using. You can also check
for updates from there.
Starting Stat/Transfer
On Windows, the installation procedure will install a folder for Stat/Transfer in the Programs menu
and a shortcut to Stat/Transfer. A shortcut to Stat/Transfer will also be installed on your desktop.
Click on either of the shortcuts to start the program.
On OS-X, click on Stat/Transfer in the StatTransfer12 folder of your Applications folder.
On Linux, a shortcut to Stat/Transfer will be installed on your desktop.
The ? Button
You can obtain information on a given file type by clicking on the '?' button.
The input data file is selected on the second line of the Transfer dialog box, the File Specification
line.
If your files are named using the standard file extensions for Stat/Transfer, given on the next page,
you will ordinarily use the Browse control to select a file.
When you click on Browse, a standard File Open dialog box will open. To select the input file, first
make sure that the path is the correct one for your input file. If not, change to the correct one.
Next, you need to select the correct file. Note that a wildcard file specification, *.ext, has been
created for the File Name entry, where .ext is the Stat/Transfer standard extension for the type of
input data file you have selected.
All of the files in the current directory with this extension will appear in a list box below the File
Name line. You can either use this list and click on the name of the file you wish to use, or type the
name on the File Name line.
The ? Button
If you click on the ? button beside the File Specification line, you can obtain information on the file
type currently displayed.
wk*
mdb
txt, csv
stsd (Schema file)
sts (Schema file)
dbf
xml
rec
xls
dbf
dat
htm*
jmp
lpj
mat
schema, sch
mtw
lpj
[none]
ods
dict, dct
db
wq?, wb?
rdata
rat
sd2, sas7bdat
ssd01, sas7bdat
stc
xpt, tpt
[none]
sav
por
sps
dta
do
sta
sys
xml
Data Viewer
You can preview your input data by pressing the View button in the Transfer dialog box. Your data
will appear in a scrollable grid. By default, the viewer is on.
The data can be sorted by any variable by clicking on the variable name. You can navigate to any
row by entering the row number in the Quick Navigation box and then pressing Go.
Columns can be moved by clicking and holding the column heading and then dragging the column to
the new location.
To set viewer options, go to the Options tab and click on Data Viewer Options.
To return to the Transfer screen, press Close Viewer.
Long String Viewer
The data viewer will display long strings, international character sets, variable characteristics and value
labels. The long string viewer allows you to see strings that are too long to be viewed at the current
column width.
By default, the viewer is on. To use it, simply left click on the cell you want to examine and a viewing
window will display the string. To disable it, uncheck the Show Long String Viewer option in Data
Viewer Options in the Options dialog box.
By default, the viewing window will close automatically when your cursor leaves the cell that contains
the string you are viewing. To turn off this behavior, uncheck the Hide Automatically option.
Variable Info Viewer
By default, if you click on the variable name at the top of the grid, a Variable Info Viewer box will
open that will show you the variable name, its type and, if available, its label. To disable it, uncheck
the Show Variable info Viewer option in Data Viewer Options in the Options dialog box.
By default, the viewing window will close automatically when your cursor leaves the cell that contains
the variable you are viewing. To turn off this behavior, uncheck the Hide Automatically option.
The list of output file formats will be the same as the list of input file formats, but with more choices
of version, and with the following exceptions:
HTML tables and Mplus files will appear on the output format list, since they can be written by
Stat/Transfer, although they cannot be used as input.
OSIRIS files will not appear, since they are only read by Stat/Transfer.
SAS CPORT files will not appear, since they are only read by Stat/Transfer.
When a worksheet has been chosen as input, then worksheets will not appear in the output format
list. These types of conversions, such as a Lotus 1-2-3 worksheet to an Excel worksheet, are not
supported since it is usually possible to do them within your spreadsheet program.
Conversions from one xBASE file type to another are not supported since the file formats of dBASE
and FoxPro are identical. Thus if a dBASE file is chosen as input, then FoxPro will not appear on
the output format list and vice versa.
Stata Output
The two types of Stata files, Stata (Standard) and Stata/SE appear in the list of output file formats.
After you select the Stata file type, the version to be used for output will be displayed next to the file
type. You can change the version to be output by selecting Output Options(1) from the Options tab.
SAS Output
SAS V6, SAS V7-8, and SAS V9 will appear in the list of output file types. You can specify the
platform you wish for the output by selecting Output Options(1) from the Options tab.
Delimited ASCII Choices
If you wish to write delimited ASCII files, you will see two choices in the list of output file types:
ASCII/text- Delimited
Automatic Logging
Stat/Transfer always writes a log that is displayed in the Log dialog box. You can view it and, if you
would like, manually save it to disk.
However, you can tell Stat/Transfer to automatically write a log file to disk every time you do a data
transfer. The log file will contain detailed information about what has occurred during your transfer.
To turn on this feature, check the option, Automatically write a log file in the Automatic Transfer
Logging section of the Options dialog box.
By default, the log file will go to the same directory as your output file and will be appended to an
existing log file. Options in the Automatic Transfer Logging section allow you to change the
name and whether or not an existing file is overwritten.
Variable Selection
Automatic Selection of All Variables in the Data Set
When the input file has been specified in the Transfer dialog box, by default Stat/Transfer selects all
of the variables for transfer. A message will appear in the Transfer dialog box below the input File
Specification line, telling you that all of the variables in the data set have been selected and giving
you the total number of variables.
If you wish to transfer all of the variables of the input data set, you need do nothing more to specify
them.
Manually Selecting Particular Variables
If you want to select only some of the variables in the input data set, click on the Variables tab at the
top of the Transfer dialog box. The Variables dialog box will appear with a list of all of the variables.
When you highlight a variable, the variable label will appear in the box at the upper right.
By default, all of the variables are selected. You can select or unselect variables one by one by going
to a particular variable and toggling selection on or off for that variable. To do so, click on the check
box next to the name or click on the variable name and press the SPACE key.
If you wish to select or unselect a group of variables, use the Quick Variable Selector.
Quick Variable Selector
The box in the upper right corner enables you to specify selection criteria for the variables displayed
in the list box at the left of the page. This is considerably less tedious for long lists of variables than
manually checking or unchecking them.
To select or unselect all of the variables, type a star, *, in the Quick Variable Selector box and
click either Keep or Drop.
Selection conditions can take the form of the wildcard characters * or ? or you can use variable
ranges. The question mark matches exactly one character, while the asterisk matches more than one.
Unlike standard wildcards, more than one asterisk can be included in a specification. For instance:
*inc* will match any variable with the string inc in any position. Ranges of contiguous variables
can be specified with a dash (without spaces) between two variable names. For instance distance-a9
will select (or drop) variables distance through a9, inclusive.
Space or comma delimited lists of conditions can be entered at one time. For example:
factor1,cluster,a2-a10,L1*
followed by a click on the Drop button, will uncheck the variables factor1, cluster, a2 through
a10, and any variable which starts with the string L1.
If needed, you can successively refine your selection by entering conditions and then clicking on
either the Drop or Keep buttons, or, alternatively, by manually checking or unchecking variables in
the list box.
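The selection rules above can be illustrated with a short Python sketch. It is not Stat/Transfer's own code: the function name select_variables is hypothetical, matching is assumed to be case-insensitive, and Python's fnmatch treats * as "zero or more characters", which may differ slightly from the behavior described above.

```python
import re
from fnmatch import fnmatchcase


def select_variables(variables, spec):
    """Return the variables matched by a space- or comma-delimited selector string.

    Hypothetical helper: supports * and ? wildcards and name1-name2 ranges,
    where a range covers the contiguous variables between the two names.
    """
    lowered = [name.lower() for name in variables]
    chosen = set()
    for cond in re.split(r"[,\s]+", spec.strip().lower()):
        if not cond:
            continue
        start, dash, end = cond.partition("-")
        if dash and start in lowered and end in lowered:
            # Range: every variable positioned between the two names, inclusive.
            first, last = sorted((lowered.index(start), lowered.index(end)))
            chosen.update(range(first, last + 1))
        else:
            # Wildcard or plain name.
            chosen.update(
                i for i, name in enumerate(lowered) if fnmatchcase(name, cond)
            )
    return [name for i, name in enumerate(variables) if i in chosen]


variables = ["id", "distance", "b1", "a9", "factor1", "cluster",
             "a2", "a3", "a10", "L1x", "income"]
print(select_variables(variables, "factor1,cluster,a2-a10,L1*"))
```

The returned list is what a click on Keep or Drop would then act on.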
Variable Selection Indicator
Select all of the variables you want to transfer. When you have finished, you can click on the
Transfer tab at the top of the dialog box and you will return to the Transfer dialog box, where you
will see a message telling you how many variables have been selected.
Value Labels Browser
If your input file has value labels, the option Value Label Browser allows you to display them for
each variable.
This option is set by clicking on the Options tab and then clicking on User Interface Options. You
can choose to have the value labels displayed in a vertical box, just to the right of the variable names,
or in a horizontal box below the variable names. The default is to display the value labels in a
horizontal box.
When reading numerical variables, Stat/Transfer selects a target output variable type based on the
information available to it. This target variable type is not used for internal storage during the
transfer, but is simply the preferred output type. If this type is not supported in the chosen output file
type, the best approximation will be chosen.
The various target output variable types used by Stat/Transfer are:
Stat/Transfer Target Output Variable Types
byte
int
long
float
double
date
time
date/time
string
Remember that the target type will not necessarily be the actual output type. If the target type
assigned to a variable by Stat/Transfer is available as one of the variable types of the output file
format, then that type will be used for the output. If the assigned target type is not one of the available
output types, then a format of the next larger size will be used.
Optimization, by default, takes place during a transfer. The Optimize button is used to optimize
manually before the transfer. This allows you to see the target types that Stat/Transfer has chosen for
your variables before they are transferred, and to change any output types that you wish. This is
discussed in the next section.
This feature is useful when you want to change the type of a number of variables, where the new type
is the same for all. The Quick Type Changer dialog box enables you to specify selection criteria for
the variables. This is considerably less tedious for long lists of variables than manually changing
each one.
Selection conditions are entered in the same way as in the Quick Variable Selector box, described
above. They can take the form of the wildcard characters * or ? or you can use variable ranges.
The question mark matches exactly one character, while the asterisk matches more than one. Unlike
standard wildcards, more than one asterisk can be included in a specification. For instance: *inc*
will match any variable with the string inc in any position. Ranges of contiguous variables can be
specified with a dash (without spaces) between two variable names. For instance distance-a9 will
select (or drop) variables distance through a9, inclusive.
Space or comma delimited lists of conditions can be entered at one time. For example:
factor1,cluster,a2-a10,L1*
will select the variables factor1, cluster, a2 through a10, and any variable which starts with the
string L1. The output type for these variables is specified from the drop down menu below the line
specifying the variables.
Automatic optimization will, of course, not take place during your transfer when you have used the
Optimize button.
Output variable types can be changed freely for ASCII files and worksheet files. For all other file
formats, you can change freely among the numeric types of byte, integer, long, float and
double and you can change among the time types. However, conversions between any of the numeric
types and dates or strings are not supported.
You should be careful not to choose a smaller type than that chosen by Stat/Transfer unless you are
sure you know more about your data than Stat/Transfer does.
Remember that you are selecting a target type. If the output data format does not support the
specific type you have selected, then Stat/Transfer will use the best match to the type you have
selected.
You can determine the output variable types supported for each output file type by consulting the
appropriate table in the Supported Programs section.
Handling Mixed Data
If you have mixed data in which some variables need doubles and others do not (for example, you
might have precisely measured dollar amounts, which should be in doubles, along with scales of
survey items, which should be in floats) you should press the Optimize button in order to designate
integers for the right variables and then designate floats and doubles to reflect the appropriate level of
measurement for each variable.
Automatic Dropping of Constants
You can tell Stat/Transfer to automatically drop variables that are constant or missing for a selected
subset of data. You select this option by checking the Drop Constants check box in the Variables
dialog box and then pressing the Optimize button.
This feature is useful when the part of a data set selected for transfer contains variables with values
that are either constant or missing (such as a pregnancy variable when only male subjects are selected
or variables in yearly surveys where the same questions do not appear for each year.)
This feature is not likely to be used often, but is extremely valuable when it is needed. If the data set
has a large number of variables, it can be exceedingly tedious to select only the meaningful ones
manually.
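As a rough illustration of what dropping constants means, the Python sketch below removes every variable that has at most one distinct non-missing value in the selected rows. The function name drop_constants and the use of None for missing values are assumptions made for this example, not a description of Stat/Transfer's internals.

```python
def drop_constants(rows, missing=None):
    """Drop variables that are constant or entirely missing in the selected rows.

    `rows` is a list of dicts sharing the same keys (hypothetical layout).
    """
    if not rows:
        return []
    keep = [
        name
        for name in rows[0]
        if len({row[name] for row in rows if row[name] is not missing}) > 1
    ]
    return [{name: row[name] for name in keep} for row in rows]


# Only male subjects were selected, so sex is constant and pregnant is missing.
subset = [
    {"id": 1, "sex": "M", "pregnant": None, "age": 30},
    {"id": 2, "sex": "M", "pregnant": None, "age": 41},
]
print(drop_constants(subset))
```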
Use Doubles Option
The Use Doubles option tells Stat/Transfer whether or not to put variables with fractional parts into
double or float on output. The default is to have Stat/Transfer use doubles.
If you do not change the default behavior, Stat/Transfer will evaluate each variable to see if it can be
represented as a float without a loss of information and will put only those variables that require it
into a double. This default behavior is the safest option.
However, most data are not measured with more than eight or nine digits of precision (survey data,
for example, never are). Therefore, if you need to save space on output and do not need more
precision, you can change the default behavior, so that only floats are used.
You can reset the Use Doubles option by clicking on the Options tab, selecting General Options, and
unchecking the Use Double box. Alternatively, for a single transfer, you can uncheck the Use Double
box at the bottom right of the Variables dialog box and then press the Optimize button.
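One plausible way to test whether a value "can be represented as a float without a loss of information" is to round-trip it through a 32-bit float and compare. The Python sketch below does that. It is an assumption about how such a check could work, not Stat/Transfer's actual algorithm, and the names fits_in_float and target_type are hypothetical.

```python
import struct


def fits_in_float(value):
    """True if `value` survives a round trip through a 32-bit float unchanged."""
    try:
        return struct.unpack("<f", struct.pack("<f", value))[0] == value
    except OverflowError:
        # Magnitude too large for a 32-bit float.
        return False


def target_type(values, use_doubles=True):
    """Pick float or double for a variable with fractional parts."""
    if not use_doubles:
        return "float"  # Use Doubles unchecked: only floats are used
    return "float" if all(fits_in_float(v) for v in values) else "double"


print(target_type([0.5, 2.25]))                 # exactly representable values
print(target_type([0.1]))                       # 0.1 is not exact in 32 bits
print(target_type([0.1], use_doubles=False))
```

Under this check, even a simple value such as 0.1 requires a double, which is why unchecking Use Doubles can save considerable space for data that do not need the extra precision.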
The scrolling text box in the upper left corner provides brief, on-screen documentation on how to
select particular data records based on conditions that you specify. The variables of the input data set
are listed in the box at the right of the screen.
At the bottom of the screen is the case-selection field in which you enter the case selection, or
WHERE, expression that will specify cases. This expression gives the conditions on the variables
that will define the subgroup of the data set that you wish to select.
Variable names can be entered in this field by selecting their names from the variable list box. When
you double-click on a variable name it will be copied to the case-selection box.
Case-Selection Expressions
The WHERE statement is used to give the conditions on the variables that will define the subgroup of
the data set that you wish to select.
The case-selection, or WHERE, expression, has the following form:
WHERE variable expression relational operator selection condition
Here, variable expression consists of a single variable or an expression involving several variables,
relational operator is one of the operators listed below, while selection condition gives specifications
for the variables to be selected.
Variable Expression
All of the usual arithmetic operators [+ - / * ( ) ] are available for use in this expression.
If variable names used in WHERE expressions contain embedded blanks or characters such as
relational or arithmetic operators like /, then they must be enclosed in single quotes.
Internal Variable
An internal variable, _rownum is available which allows specific rows or records of the data set to
be referenced.
Relational Operators
The following relational operators are available:
=     equals
!=    not equal
<     less than
>     greater than
<=    less than or equal
>=    greater than or equal
&     and
|     or
,     or (used in a series)
!     not
Selection Conditions
If variable values consist of strings, then when they contain blanks or characters such as /, they must
be enclosed in double quotes.
Examples
The comma operator , is used to list different values of the same variable name that will be used as
selection criteria. It allows you to bypass potentially lengthy OR expressions when selecting lists of
values. For example, the WHERE expression above can be more easily written:
where name = mc*,mac*
Other examples are:
where age = 21,31,41,51,61
which will select only the listed ages, and
where caseid != 22*,30??,4?00
which will select all cases except those ids starting with 22, or four character ids starting with 30,
or starting with 4 and ending with 00.
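The effect of the comma operator can be mimicked in Python as "matches any pattern in the list". This is only a sketch of the semantics described above, using Python's fnmatch wildcards. The helper name matches_any and the sample data are invented for illustration.

```python
from fnmatch import fnmatchcase


def matches_any(value, patterns):
    """True if `value` matches at least one comma-separated wildcard pattern."""
    text = str(value)
    return any(fnmatchcase(text, pattern) for pattern in patterns.split(","))


# where age = 21,31,41,51,61
ages = [18, 21, 30, 41, 61]
kept_ages = [age for age in ages if matches_any(age, "21,31,41,51,61")]

# where caseid != 22*,30??,4?00
case_ids = ["2210", "3015", "30150", "4100", "4101", "5000"]
kept_ids = [cid for cid in case_ids if not matches_any(cid, "22*,30??,4?00")]

print(kept_ages)
print(kept_ids)
```

Note that "30150" is kept: the pattern 30?? matches only four-character ids.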
Missing Values
You can test to see if the value of any variable is missing by comparing it to the special internal
variable _missing.
For example
where income != _missing & age != _missing
Sampling Functions
Three functions are available for sampling.
Random Samples
For example, for a random sample of one tenth of a data set, use:
where samp_rand(.1)
Random Samples of Fixed Size
Expressions are evaluated from left to right. You can thus sample from a subset of your cases by
subsetting them first and then sampling. For example, to take a random half of high school graduates,
use:
where schooling >= 12 & samp_rand(.5)
The random number generator that provides the basis of these sampling routines is rand_port() in
Jerry Dwyer, Quick and Portable Random Number Generators. C Users Journal, June, 1995, pp.
33-44. By default, it is seeded using a permutation of the time of day, and will yield a different
sample on each run.
If you need a reproducible sample, you can generate it by using the same seed each time. The seed is
entered in the General Options of the Options dialog box and should be a positive integer in the
range of 1 through 2,147,483,646.
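The interaction between subsetting, sampling and the seed can be sketched in Python. This example uses Python's own random module rather than rand_port(), so it will not reproduce Stat/Transfer's samples. The function name sample_where and the sample data are hypothetical.

```python
import random


def sample_where(rows, condition, fraction, seed=None):
    """Keep each row that passes `condition` with probability `fraction`.

    The condition is tested first (left to right), so random numbers are
    drawn only for rows in the subset. A fixed seed makes the sample
    reproducible; seed=None gives a different sample on each run.
    """
    rng = random.Random(seed)
    return [row for row in rows if condition(row) and rng.random() < fraction]


# Roughly: where schooling >= 12 & samp_rand(.5)
rows = [{"id": i, "schooling": 8 + i % 10} for i in range(200)]
graduates = lambda row: row["schooling"] >= 12

first = sample_where(rows, graduates, 0.5, seed=12345)
second = sample_where(rows, graduates, 0.5, seed=12345)
print(first == second)  # same seed, same sample
```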
General Options
Ask Permission before Overwriting Files
The option Ask Permission Before Overwriting Files is on by default. If a file or a database table
exists, you will be prompted for permission before it is overwritten.
If you wish to suppress these warning messages, click on the box to remove the check mark.
Write New, Numeric Variable Names (Vn)
When you go from one format to another, by default Stat/Transfer will create legal variable names for
you, based as much as possible on the original names. In particular, when you transfer from systems
such as Paradox or JMP, which allow long variable names with embedded spaces, to older systems
which restrict variable names to eight characters, by default Stat/Transfer will truncate these for you.
However, these truncated names often have little resemblance to the names you started with.
Stat/Transfer will use the variable names as variable labels, so that your original names are available.
If you check the option Write new, numeric variable names. (Vn), instead of the default variable
names, Stat/Transfer will create new variable names of the form V1...VN. This is chiefly useful when
dealing with truncated names. If your output system supports variable labels, it is sometimes better to
check this option and have Stat/Transfer simply create numeric names for your variables. You can
then use the variable labels for the description.
Because this option is likely to be useful only in special circumstances, it reverts to the default
between sessions.
Preserve Value Label Tags and Sets
Many software packages allow users to assign the same set of value labels to more than one variable.
(In SAS, the term for value labels is user-defined format). For example, a survey with a list of
questions with Yes and No responses could use the same set of value labels for the variables
associated with each of these questions.
Options Dialog Box
If the option Preserve value label tags and sets is checked, the mapping of value label sets to
multiple variables will be preserved on output. If tags are used in the input file to identify value labels
sets, these will be preserved. Otherwise, tags will be constructed by Stat/Transfer (LABA-LABZ and
so on).
If this option is not checked, each labeled variable will have a unique value label set and the tag used
to identify the set will be constructed from the name of the variable.
This option is off by default.
Use Doubles
The Use Doubles box can be checked here if you want Stat/Transfer to use doubles when it optimizes
your output data set. By default, this option is on.
Uncheck the Use Doubles option if you need to minimize the size of your output dataset (particularly
for Stata) and you know that the precision of measurement of all of your variables is less than eight
decimal digits.
Preserve String Widths if Possible
Normally, Stat/Transfer will calculate the minimum string width for each variable and use it in the
output encoding. This ensures that the output file will be as small as possible. This option allows you
to maintain the input string width. This is particularly useful when combining different files.
If this option is checked, Stat/Transfer will use the input width as the output width if it can
do so without losing data. More precisely, the variable width will be the minimum of the string width
and the input width. The output can be greater, but not less than the input width. For plain ASCII
data, it will be the same.
Seed for Sampling Functions
By default, the sampling functions in WHERE expressions will generate a starting seed randomly,
based on the clock time. This means that each time you run a transfer on a given file you will select a
different sample. If, in contrast, you need a reproducible sample, you can enter a seed for the random
sampling process. The seed should be a positive integer in the range of 1 through 2,147,483,646.
Variable Name Case Conversions
Stat/Transfer always follows the variable-naming rules of the output file type and will convert input
names so that they will conform to those rules. Some older packages require upper case variable
names. Other more modern packages allow mixed case variable names. Some packages, notably
Stata, S-PLUS and R allow mixed case variable names, but are case sensitive. In these case-sensitive
systems, if Stat/Transfer were to move data from an upper-case system and do no case-conversion, the
user of the data set would need to always hold down the shift key when typing in variable names.
Stat/Transfer allows you to specify your case conversion preferences for case-insensitive packages as
well as case-sensitive packages.
The available options are given in the drop-down menus for both case-sensitive and case-insensitive
programs and are:
Convert to lower
Preserve if mixed case
Preserve always
Convert to upper
The default for case-insensitive programs is Preserve always.
The default for case-sensitive programs is Preserve if mixed case. This choice is designed to handle
mixed case variable names such as FamilyIncome. If both upper and lower case letters are found in a
variable name, it will be left untouched. Otherwise it will be converted to lower case.
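The four case-conversion options can be expressed as a small Python function. This is a sketch of the rules as described above. The function name convert_case is hypothetical, and the option strings simply reuse the menu labels.

```python
def convert_case(name, option):
    """Apply one of the four variable-name case conversion options."""
    if option == "Convert to lower":
        return name.lower()
    if option == "Convert to upper":
        return name.upper()
    if option == "Preserve always":
        return name
    if option == "Preserve if mixed case":
        has_upper = any(ch.isupper() for ch in name)
        has_lower = any(ch.islower() for ch in name)
        # Mixed case is left untouched; anything else becomes lower case.
        return name if has_upper and has_lower else name.lower()
    raise ValueError(f"unknown option: {option}")


print(convert_case("FamilyIncome", "Preserve if mixed case"))
print(convert_case("INCOME", "Preserve if mixed case"))
```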
Formats which are considered case-sensitive on output are: Stata, S-PLUS, R, and Matlab.
Formats that require upper-case variable names are: SAS Version 6, SAS XPORT, SPSS Portable,
SYSTAT, Epi Info, and older xBASE versions such as Clipper and dBASE II.
All other formats are case-insensitive.
Since current versions of Stata and SAS support the labeling of missing values, this option is less
useful. If you have value labels, it is best to keep it unchecked and rely on the value labels to
differentiate your values.
Note that we believe these options are potentially dangerous. To avoid the chance of users checking
one of these options and then forgetting about it, Stat/Transfer does not save the settings when
options are automatically saved at the end of a session.
%M          input minute
%n          input milliseconds
%N          input milliseconds or tenths or hundredths of seconds.
            (If no field width is given and the input string has
            a field width of three, then input will be milliseconds.
            A field width of 1, either given explicitly or
            inferred from the input string, will cause input
            of 10ths of a second; a width of 2 will cause
            input of 100ths of a second.)
%p
%S          input seconds
%w          skip a whitespace delimited word (see also %c)
%y          input year.
            (If less than 100, the century changeover year is
            used to determine the actual year.)
%Y
%%,%[,%]
[...]
date or time part of the output format has the form %char. Leading zeros cause the value to print
with leading zeros. For example %0d will print the day of the month with a leading zero.
The characters below are used to create the output formats. Anything to be printed in the output
character string that is not in the list below, such as commas, spaces or other delimiters, must be given
explicitly in the output format.
%a      abbreviated weekday
%A      full name of weekday
%b      abbreviated name of month
%B      full name of month
%d      day of the month (1 - 31)
%D      day of the year (1 - 366)
%H      hour (24 hour clock) (0 - 23)
%I      hour (12 hour clock) (1 - 12)
%m      month as number (1 - 12)
%M      minutes (0 - 59)
%N      milliseconds (0 - 999)
%1N     tenths of seconds (0 - 10)
%2N     hundredths of seconds (0 - 99)
%p      am or pm
%S      seconds (0 - 59)
%y      year as two digits
%Y      year as four digits
%%      % character
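To make the output format characters concrete, the Python sketch below implements a small subset of them (%d, %m, %Y, %H, %M, %S and %%), including the optional leading zero. It is illustrative only: the function name format_datetime is hypothetical, and padding to two digits is an assumption about what "leading zeros" means here.

```python
import re
from datetime import datetime

# Subset of the output format characters listed above.
_FIELDS = {
    "d": lambda t: t.day,
    "m": lambda t: t.month,
    "Y": lambda t: t.year,
    "H": lambda t: t.hour,
    "M": lambda t: t.minute,
    "S": lambda t: t.second,
}


def format_datetime(value, fmt):
    """Render `value` using %char codes; %0char pads to two digits."""
    def render(match):
        zero, code = match.group(1), match.group(2)
        if code == "%":
            return "%"
        number = _FIELDS[code](value)
        return f"{number:02d}" if zero else str(number)

    # Text that is not a format code (slashes, colons, spaces) passes through.
    return re.sub(r"%(0?)([dmYHMS%])", render, fmt)


print(format_datetime(datetime(1990, 10, 1, 2, 20, 9), "%m/%d/%Y %0H:%0M:%0S"))
```

Delimiters are copied literally, which is why they must be given explicitly in the format string.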
The default formats for converting dates and times to strings are:
Date:        %m/%d/%Y                 (5/18/1945)
Time:        %0H:%0M:%0S              (14:05:48)
Date/Time:   %m/%d/%Y %0H:%0M:%0S     (10/1/1990 02:20:09)
Encoding Options
Your computer's operating system maintains a setting of the current working code page. In most
cases applications will use that information to encode characters in a consistent manner. However, if
someone in Japan, or Greece, or Russia sends you a file, you will need to tell Stat/Transfer the
encoding that was used to write their files.
The options on this page provide a way for the user to specify, when necessary, the encoding that is
present in an input file and the one that is desired in an output file.
Input Character Set
In general, unless you are sure of what you are doing, you should leave this option at the default
setting, Use current system default code page. If that is checked, you will be able to see which
code page is in use on your computer. That code page will then be used for files that do not have
known encoding information. If the encoding of an input file can be determined from the contents of
the file (or its format), that will, of course, override this setting.
If you choose to override the default behavior, you can choose a different encoding by first selecting
the Region and then the Character set. Because Stat/Transfer represents characters internally in
Unicode, any character set can, in principle, be converted on input. However, if you select an
incorrect character set, what you will get is likely to be nonsense. Therefore we strongly urge you to
look carefully at your data in the viewer to make sure that all is well and to make sure that the font
you have selected for the viewer is capable of displaying the characters you need.
Output Character Set
Unless you are sure of what you are doing, you should leave this option at the default setting, Use
current system default code page. This is especially true for the output character set because any
input character set can be converted to our internal representation, Unicode. However, you have to
get it just right when going from Unicode to another single or multi-byte character set or you will be
guaranteed to get total nonsense. For instance, if you have read a Japanese file and then want to
convert it to a multi-byte Chinese character set, Stat/Transfer will simply stop in its tracks since this
is impossible (unless the characters consist only of numbers, simple punctuation, and the letters A-Z.)
For file formats that are Unicode-aware (e.g. Excel and SAS 9+), Stat/Transfer will write Unicode,
regardless of how you set this option.
On Encoding Errors
Unicode can represent any character. Unfortunately the same cannot be said for other character sets.
When Stat/Transfer moves character data from its internal Unicode representation to a character set
that can represent fewer characters, there is some probability that some will not fit. For example,
Microsoft applications such as Excel store characters in their Unicode representation and in some
cases, although it looks like the contents of the data can be represented in a Western European
character set, there are some characters that cannot be encoded. These include the right and left
apostrophes and, often, the Euro sign.
The default behavior when this occurs is Substitute. With this option, Stat/Transfer will substitute
characters for those that cannot be converted. For example, it will substitute a single quote for right
and left apostrophes, and, if necessary, non-accented letters for those with accents. If no substitution
is possible, an underscore will be substituted. If you do not want any substitution performed, you
can check the option Stop, which directs Stat/Transfer to stop on the first conversion error
encountered.
A box allows you to control the substitution character and the option Error limit will limit the
number of permitted substitutions, which defaults to 100.
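The Substitute behavior can be approximated in Python as follows. This is a sketch of the idea, not Stat/Transfer's implementation: the function name encode_with_substitution is hypothetical, and stripping accents via Unicode NFKD decomposition is one assumed way of finding a non-accented replacement.

```python
import unicodedata

# Left and right apostrophes are replaced by a plain single quote.
_QUOTES = {"\u2018": "'", "\u2019": "'"}


def encode_with_substitution(text, encoding, substitute="_", error_limit=100):
    """Encode `text`, substituting characters the target set cannot hold."""
    out = []
    substitutions = 0
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
            continue
        except UnicodeEncodeError:
            pass
        substitutions += 1
        if substitutions > error_limit:
            raise ValueError("error limit exceeded")
        # Prefer a quote replacement, then the unaccented base letter.
        candidate = _QUOTES.get(ch) or unicodedata.normalize("NFKD", ch)[0]
        try:
            candidate.encode(encoding)
        except UnicodeEncodeError:
            candidate = substitute  # no usable substitute: fall back
        out.append(candidate)
    return "".join(out).encode(encoding)


print(encode_with_substitution("It\u2019s 5\u20ac", "latin-1"))
print(encode_with_substitution("It\u2019s 5\u20ac", "iso8859_15"))
```

In the second call the Euro sign survives because ISO-8859-15 includes it, while the right apostrophe is still replaced.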
Note, if your Western European data have Euro and other currency signs, a good choice for your
single byte output character set is ISO-8859-15, which is a more modern version of ISO
ODBC Options
The following options allow you to fine-tune your ODBC transfers. They are generally for advanced
users.
Use NULL instead of empty strings
A null string is a string for which a value has never been entered while an empty string has zero
length. Most databases support both NULL and empty strings (Oracle is the exception which
converts empty strings to NULL strings). If you check this box, Stat/Transfer will write NULLS
instead of empty strings into your database.
Prefer datetime over smalldatetime for MS SQL Server
The smalldatetime type in MS SQL Server represents dates from Jan 1, 1900 to June 6, 2079
with accuracy to a minute. If this is sufficient for your needs and you want to save space, uncheck
this option. The DateTime type, in contrast, represents time values to an accuracy of close to 3
milliseconds and dates back to 1753.
Rows to read when scanning datetime variables
In some databases (including Access) there is only one date type, which can hold dates, times, and
date/time values. Stat/Transfer will, by default, read ten rows of data to determine the type of such
variables. If ten rows are not enough for your data, you can set a higher number here.
Prefer (w)varchar over (w)char
On output, if your data are stored in char or its Unicode variant wchar, the length of the field will
be equal to the length of its widest member. On the other hand, varchar is stored in variable length
fields that save storage when the length of the string data varies between cases. In general, you will
want to leave this option at its default and write variable length string types when these are permitted.
Show the name of the table/view owner
A table or view in a database is identified by its owner and its name. If you are in an environment in
which tables with different owners have the same name, you will need to check this option to make
sure that you can select the proper table.
Append to Access and ODBC tables
This option (which is off by default), allows you to append your data to an existing database table.
Stat/Transfer will match as many variables as possible to those already in the table and add your
data to the matching columns. At least one column must match exactly and the table must be free of
constraints that would prohibit a simple append operation, such as those requiring unique keys.
The default for Combine adjacent blanks is off. If you turn this option on, you can select
Spaces in which case multiple blanks are treated as one blank, or you can select Spaces and tabs,
in which case multiple instances of tabs and blanks are converted to a single space.
Variable Names
By default, Stat/Transfer will sense whether the first line of your input data set contains field names
or data. You may, if you wish, explicitly override this default.
AutoSense: If this option is set to AutoSense, Stat/Transfer will look at the first and second rows of
data. If there is a change from a string to a number for one or more variables between these rows,
Stat/Transfer will use the first row as the field names. This will fail if your first row contains the field
names, and all of your variables are of the string type. In that case you should choose one of the
following two options:
First Row: When this option is set to First Row, Stat/Transfer uses the data found in your first row
as the field names.
Make Up: When this option is set to Make Up, Stat/Transfer treats the first row in your file as data
and assigns the field names col1 through coln.
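The AutoSense rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not Stat/Transfer's actual implementation. The function names looks_numeric, first_row_is_names and made_up_names are hypothetical, and treating anything that float() accepts as a number is an assumption.

```python
def looks_numeric(text):
    """Return True if the text parses as a number (assumption: float() decides)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def first_row_is_names(first_row, second_row):
    """AutoSense rule: at least one column changes from a string in row one
    to a number in row two."""
    return any(
        not looks_numeric(a) and looks_numeric(b)
        for a, b in zip(first_row, second_row)
    )


def made_up_names(n_columns):
    """Names assigned under Make Up: col1 through coln."""
    return [f"col{i}" for i in range(1, n_columns + 1)]


print(first_row_is_names(["id", "city"], ["1", "Boston"]))      # True
print(first_row_is_names(["name", "city"], ["Ann", "Boston"]))  # False
print(made_up_names(3))  # ['col1', 'col2', 'col3']
```

The second call shows the failure case mentioned above: when every variable is a string, the rule cannot tell names from data, so First Row or Make Up has to be chosen explicitly.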
Numeric Missing Value
It is possible to specify a string that will be interpreted as a missing value when Stat/Transfer reads
ASCII files. For example, your input data set may use the string NA to represent missing values or
it may use a period.
Enter the string that represents missing values in the input data in the Numeric Missing Value field.
If you wish to read extended missing values for either delimited or fixed ASCII files, use the option
below or enter the word extended.
Read Variable Labels from Second Row
If this option is checked, variable labels will be read from the row that follows the variable names
(ordinarily the second row).
Rows to Skip
Enter the number of rows you want to skip at the top of the file. This option is useful for skipping
headings and titles.
Convert extended (a-z) missing values
If this option is checked, the keyword extended will be entered into the Numeric Missing Value
field. When extended is entered, extended missing values (.a - .z, ., and ._) in either delimited
or fixed ASCII files will be read from the file. Note that reading missing values is case-insensitive
(that is, .a and .A, for example, are equivalent).
These extended missing values will be automatically written to the output file for output formats that
support them (SAS and Stata).
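As a rough illustration of which input strings count as extended missing values, the following Python sketch recognizes ., ._ and .a through .z without regard to case. The pattern and the function name classify_missing are hypothetical and only mirror the description above; how Stat/Transfer actually parses these values is not documented here.

```python
import re

# Matches ".", "._", and ".a" through ".z" in either case.
EXTENDED_MISSING = re.compile(r"\.(?:[a-z]|_)?", re.IGNORECASE)


def classify_missing(field):
    """Return the normalized missing-value code, or None for ordinary data."""
    text = field.strip()
    if EXTENDED_MISSING.fullmatch(text):
        return text.lower()
    return None


print(classify_missing(".A"))   # .a
print(classify_missing("._"))   # ._
print(classify_missing("1.5"))  # None
```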
String Quote Character
This is the character that is used to enclose string fields in the input data set. The default character is
set as double quotes. You can choose the appropriate character for the input data. However, if string
variables are not enclosed by any character, you can leave this option set at the default double quote.
Maximum Number of Lines to Examine
Stat/Transfer first reads your ASCII data to determine what type of variable is present in each
delimited position. By default it will read your entire data set. If your data are consistent, so that the
first few lines suffice to show each variable type, and your data have enough rows that it actually takes
more than a few seconds to examine them all, you might want to set this option to a numeric limit,
such as 50.
Decimal Point
If your data set uses a symbol other than the default period to indicate the decimal point in a number
(a comma, for example), enter the character on the Decimal Point line.
Thousands Separator
If your data set uses a symbol other than the default comma to mark thousands in a number (a period,
for example), enter the character on the Thousands Separator line.
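The effect of the Decimal Point and Thousands Separator settings can be pictured with a small Python sketch. The function parse_number and its parameter names are hypothetical; the sketch simply assumes that the thousands separator is dropped first and the decimal symbol is then read as a period.

```python
def parse_number(text, decimal_point=".", thousands_separator=","):
    """Parse a number written with the given decimal and thousands symbols."""
    # Remove the thousands separator first, so that a period used as a
    # separator is not mistaken for the decimal point afterwards.
    cleaned = text.strip().replace(thousands_separator, "")
    cleaned = cleaned.replace(decimal_point, ".")
    return float(cleaned)


print(parse_number("1,234.5"))                                             # 1234.5
print(parse_number("1.234,5", decimal_point=",", thousands_separator="."))  # 1234.5
```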
This option is used when generating programs to accompany fixed format ASCII files. When Write
complete paths is checked, the complete path specification will be written into the programs for the
output types SAS Program + Data File, SPSS Program + Data File and Stata Program + Data
File. This option is checked by default.
The option Write complete paths is useful if you are going to read in the program and
accompanying ASCII data on your own machine.
However, if you are going to store your data and program archivally or send it to another user, it is
best to leave this option unchecked, so that only the default directory . and the file name are written.
This allows the program to be more easily moved to another machine, since it can be executed by
setting the default directory rather than editing the program.
Shorten names and labels for older versions
If this option is left unchecked (the default), programs will be written for the latest version of SAS,
SPSS and Stata, using the longest possible variable names and labels.
On the other hand, if the option Shorten names and labels for older versions is checked, variable
names will be truncated to a width of eight characters, if necessary, and labels will be suitable for
older versions of the software. Use this option if you want to maximize compatibility.
Preserve Input Widths
By default, Stat/Transfer will optimize your data and write fixed format ASCII data into the
narrowest width that is possible. Stat/Transfer also formats ASCII data using a format in which the
decimal point is allowed to float.
In some cases, particularly for SPSS files that were originally created from fixed format ASCII data
or for fixed format data read with a Stat/Transfer Schema, the input file will contain enough
information to skip the optimization step and write out a file that has the same widths as the data that
were originally input to SPSS. If this is the case for your data and you want fixed decimals and the
original widths, check this option.
Note that for most other file formats, this option will result in incorrect output. You should check
your results and use this option at your own risk.
If this option is selected, the entry %ipath%/formats.sas7bcat will automatically appear in the Name
box below the SAS Value Labels - Reading box. This instructs Stat/Transfer to look for a file
named formats.sas7bcat in the same directory as your data file.
You can change the path if your file is in a different location.
Read from a catalog in a CPORT Library (.stc)
Select this option if you have a Windows CPORT library file that contains the formats for your data
set. (Your data will usually be read from a CPORT file as well, but need not be.)
If this option is selected, the entry %ipath%/%iname%.stc will automatically appear in the Name box
below the SAS Value Labels - Reading box. This instructs Stat/Transfer to look for a file with the
same name as your input file and the extension .stc, in the same directory as your data file.
You can change the path if your file is in a different location.
If the Use default catalog name box is checked, Stat/Transfer will look in the CPORT file for a
catalog named formats, which is the SAS default. If you would like to read your formats from a
different catalog, uncheck the box, and then click on the Read Library button to get a list of catalogs
in the CPORT file. You can then select the catalog that contains the formats that are appropriate for your
data.
Read a SAS datafile (.sas7bdat)
The selection, Read a SAS datafile, is the appropriate option if you are working on a SAS platform
other than Windows and wish to read user-defined formats. As described in the Supported Programs
section SAS Value Labels, for cases where you are not working in Windows, Stat/Transfer can read
user-defined formats that are produced in SAS data set form by PROC FORMAT using the cntlout
keyword.
When this option is selected, then by default, Stat/Transfer will look in the same directory as your
input data set for a file named sas_fmts.ext, where .ext is the extension of your input file. If you
would like to use a file located somewhere else or with a different name, you can change it in the
Name box below the SAS Value Labels - Reading box. You can type in a complete file
specification, or you can use the macros below as part of the file specification.
%ipath%
%iname%
%iext%
If this option is selected, the entry %ipath%/%iname%.tpt will automatically appear in the Name box
below the SAS Value Labels - Reading box. This instructs Stat/Transfer to look for a file with the
same name as your input file and the extension .tpt, in the same directory as your data file. You can
change the path if your file is in a different location.
If the box Use default member names is checked, Stat/Transfer will look in the Transport file for a
member named sas_fmts. If you would like to read your formats from a different member, uncheck
the box, and then click on the Read Library button to get a list of members in the Transport file.
You can then check the member that contains the formats that are appropriate for your data.
Continue if the Format File is not Found
If Stat/Transfer is told to look for a user-defined format file of some type and the file containing the
labels is not found when a transfer is initiated, by default the transfer will stop and an error message
will be generated.
If the box Continue if the format file is not found is checked, processing will continue in spite of
the error. Value labels will not be written to the output file.
Continue if there is an Error Processing Formats
This will instruct Stat/Transfer to continue processing if there is an error reading the format file or if
no matching formats are found in the file.
Worksheets
You have several options to specify what part of an input worksheet to read and how to read variable
names.
Data Range
You can choose different ranges to be read in input worksheets by using the Data Range drop-down menu.
AutoSense This option is the default selection. When it is selected, Stat/Transfer will read to
the first non-blank cell and use that as the upper left corner of the data range. By default, it will
then read data until it encounters an entirely blank line. This default behavior with regard to
blank rows can be changed using the Blank Rows options given below.
Specify Named Range You can change the default Autosense behavior by specifying a named
range. If you select the option Specify Named Range, then the Range line will become active
and you must enter the name of a named range in your worksheet.
Specify Explicit Range You can also change the default behavior by specifying an explicit range.
If you select this option, then the Range line will become active and you must enter explicit
coordinates that define the range (such as C3:F280).
Options Dialog Box
When a range has been specified by either one of these methods, Stat/Transfer's default treatment of
entirely blank rows will also be overridden. They will be returned in your output data and, in addition,
blank rows at the end, through the last row of the specified range, will also be returned. Note that
because this option generally only applies to specific worksheets and because Stat/Transfer's defaults
usually work fine, the setting is not saved between sessions.
Field Name Row
By default, Stat/Transfer will attempt to autosense whether or not the data in the first non-blank row
(or the first row of a specified range) are variable names or the first row of data. It does so by looking
for at least one column in which there is a string in the first row and a number in the second. If this
behavior is inappropriate for your worksheet (for example, if you have only string data), you can
override it. Note that the setting is not saved between sessions.
You can specify one of the following options from the Field Name Row drop-down menu:
AutoSense The default behavior.
First Non-Blank Row This option will take the values in the first non-blank row as the field
names even if there is no change to a numeric type in the second row.
No Names in Worksheet This option will treat the first non-blank row as data. Stat/Transfer will
assign variable names col1 through coln.
Specify Row Explicitly This option will take the field names from a specified row. If you
select Specify Row Explicitly from the drop-down menu, then the Row line will become
active and you must enter a row.
Blank Rows
This option menu is used when Autosense is chosen for the Data Range option.
Stop Reading This is the default behavior. After finding the beginning of the data by identifying
the first non-blank cell, Stat/Transfer will read data until it encounters an entirely blank line. It
will end the transfer at this point.
Skip Blank Rows If the option Skip Blank Rows is chosen, Stat/Transfer will read the entire
worksheet page, searching for and returning further non-blank rows. However, any blank rows that
it finds will not be written out.
Return Blank Rows If this option is selected, then all rows of the input worksheet page
including blank ones will be written to the output file. This option may return unexpected blank
rows from the end of a page that contains formatting but no data.
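The three Blank Rows behaviors can be compared side by side in a short Python sketch. It is an illustration of the descriptions above only; the function names and the "stop", "skip" and "return" keywords are hypothetical, and the rows are assumed to start at the first non-blank row of the page.

```python
def is_blank(row):
    """A row is blank when every cell is empty or whitespace."""
    return all(cell is None or str(cell).strip() == "" for cell in row)


def read_rows(rows, blank_rows="stop"):
    """Apply one of the Blank Rows choices: 'stop', 'skip' or 'return'."""
    result = []
    for row in rows:
        if is_blank(row):
            if blank_rows == "stop":
                break      # Stop Reading: end at the first blank row
            if blank_rows == "skip":
                continue   # Skip Blank Rows: drop it and keep reading
        result.append(row)  # Return Blank Rows keeps blank rows too
    return result


page = [["a", 1], ["", ""], ["b", 2]]
print(read_rows(page, "stop"))    # [['a', 1]]
print(read_rows(page, "skip"))    # [['a', 1], ['b', 2]]
print(read_rows(page, "return"))  # [['a', 1], ['', ''], ['b', 2]]
```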
Read Variable Labels from Second Row
If this option is checked, variable labels will be read from the row immediately following the variable
name row, typically the second row.
Write Variable Labels to the Second Row
This option will write variable labels, if present, to the second row of the worksheet. If there are no
variable labels present in the file, this option will have no effect.
Numeric Missing Value
If there is a string in your worksheet, such as NA, that you would like to have treated as a numeric
missing value, enter it here. Note that a single . is treated as missing by default.
Concatenate Worksheet Pages
The option Concatenate Worksheet Pages allows you to combine worksheet pages into a single
output file. This option is appropriate if your worksheet contains many sheets that are identical in
structure. These can then be combined into a single output file of any type.
For example, you may have a workbook that has 50 sheets, with one sheet for each state and the same
variables on each sheet. If you check this box, Stat/Transfer will effectively combine the sheets into
one large input file, dropping the field names, if necessary, on the second and higher sheets. You will
end up with the data from all of your worksheet pages in a single output file.
Output Fieldname Row
This check box controls whether or not your field names will be written as the first row of the output
worksheet. By default it is checked.
JMP Options
Write value labels to JMP files
If this option is checked, Stat/Transfer will write value labels to the output JMP file.
Use custom properties as variable labels
JMP allows users to assign custom text strings to columns. Check this box if you would like
Stat/Transfer to use that text as variable labels for your variables.
If there are no date variables in your file, or you do not want to use them to specify the output
frequency, check the Specify a start date option. You can then select a specific frequency from the
Frequency list and also enter a start date in the Select start date box.
You can, instead, choose to leave them as they are, or you can have Stat/Transfer issue a warning.
Output file name case
You can choose how you would like Stat/Transfer to handle case conversion in your file names.
The default is Smart case conversion, in which file names in mixed case are preserved and those
in all upper case are converted to lower case.
In addition to this option, you can choose to have Stat/Transfer preserve the case of your file names,
convert them to lower case, or convert them to upper case.
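The Smart case conversion rule can be stated compactly in Python. This is a minimal sketch of the rule as described, with a hypothetical function name; it assumes that digits and punctuation in a file name do not count as mixed case.

```python
def smart_case(file_name):
    """Lower-case names that are entirely upper case; keep all others as is."""
    return file_name.lower() if file_name.isupper() else file_name


print(smart_case("SURVEY1.XLS"))  # survey1.xls
print(smart_case("Survey1.xls"))  # Survey1.xls
```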
These options are not "sticky". They must be set before each file transfer. The input file must be
selected before they are set.
gretl Row Labels
Use these options to control which variable, if any, is used for the row label in gretl files.
Use first date or string variable uses the first date or string variable encountered in the file.
None does not write a row label at all.
Selected allows you to select a specific date or string variable from the drop down list below. This
list will not be available until you select your input file from the Transfer screen.
Stat/Transfer stores a history of the files that you have read for each Input File Type. You can control the
number of files that are stored with this option. The default is ten specifications for each file type.
Add Time and Date to log messages
If this is checked, the time and date will be written to each line of your log.
Clear Log before opening input file
By default, the messages that are written to the Log dialog box are cleared when a new file is
opened. You can override this behavior and save all messages by unchecking this box. Note that if
you are concerned with permanently documenting your transfer activities, you should use the
Automatic Transfer Logging option that will automatically write a log to disk for each transfer.
Stat/Transfer GUI font
This controls the font that is used in the Stat/Transfer user interface (GUI). It will be dependent on
your operating system. You can use the Choose Font button to call up a dialog box that will allow
you to choose a different font. After you make your selection, you will need to close and then restart
Stat/Transfer for your change to take effect. Note that you can get seriously ugly results by
choosing an inappropriate font. It is a good idea to remember the font that you started with.
Default Directories
These options allow you to control which directories will appear by default in the input and output
file selection controls on the Transfer Screen.
Input File Directory
Last Used opens the file selection control in the same directory as the last input file that was selected
and transferred.
Fixed Directory allows you to choose a directory. This will be useful if your input files almost
always come from the same directory.
Output File Directory
Same as Input constructs an output file specification from the input file directory, the input file name,
and the appropriate extension for the output file format. In addition, the drop down list under the file
specification will contain a list of file specifications constructed from a most recently used list of
output directories.
Last Used constructs the file specification from the last used output directory. The drop down list of
other previously used paths will still be available.
Fixed Directory will construct the output file specification from a constant directory of your own
choosing.
By default, the viewer is on. To use it, simply left click on the cell you want to examine and a
viewing window will display the string. To disable it, uncheck the Show Long String Viewer
option.
Hide Automatically
By default, the viewing window will close automatically when your cursor leaves the cell that
contains the string you are viewing. To turn off this behavior, uncheck the Hide Automatically
option.
Long String Viewer Width
The value entered here gives the maximum number of characters that will be displayed on a line of
the long string viewer. If more characters are needed, the string will be wrapped in a box of the
given width. If fewer are needed the size of the viewer window will be adjusted downward to fit.
Maximum Length Displayed
This is the maximum string width that will be displayed in the long string viewer. Note that the
maximum length of a string in Stat/Transfer is 32767 characters.
Variable Info Viewer
If you click on the variable name at the top of the grid, a Variable Info Viewer box will open up
that will show you the variable name, its type and, if available, its label.
Show Variable Info Viewer
The Variable Info Viewer box is on by default. You can turn off this feature by unchecking the box.
Hide Automatically
By default, the viewing window will close automatically when your cursor leaves the cell that
contains the variable you are viewing. To turn off this behavior, uncheck the Hide Automatically
option.
Autosize Column Names
The columns in the data viewer are sized to accommodate the width of the data. By checking this
box, you can have the columns sized to accommodate the variable names.
Data Viewer Font
You can use this option to control the font that is used in the data viewer. Note that if you are
displaying international character sets, you should choose a Unicode font, or at the very least a font
that is capable of displaying the characters that are present in your data.
The path, including the drive (on Windows), of the input data file
The name, without the extension, of the input data file
The path, including the drive (on Windows), of the output data file
The name, without the extension, of the output data file
automatically generated programs and those generated from the Save Program button on the
Transfer dialog box.
If this option is checked (the default), you will be asked for permission before an output data set is
overwritten when the program is run. If it is not checked, the -y option will be appended to the copy
command in the program so that your output data files will be overwritten without prompting.
Insert Quit at end of program
If this option is checked, programs run from the command processor prompt or the operating
system will cause the Stat/Transfer command processor to terminate at their conclusion. It will not
affect the behavior of programs run from the Run Program dialog box.
This option is checked by default because it allows you, for example, to click on a program in
Windows Explorer to have it run and then cleanly terminate. However, you may wish to keep the
command processor open to examine the results, or run other programs, or execute a series of
programs from another. In that case, you would not want quit to be inserted at the end of your
programs.
The path, including the drive (on Windows), of the input data file
The name, without the extension, of the input data file
The path, including the drive (on Windows), of the output data file
The name, without the extension, of the output data file
The Restore Saved button replaces the currently selected options in the dialog boxes with those
stored the last time the options were saved, either from an explicit save with the Save button or from
your last exit from Stat/Transfer.
Save
The Save button saves all of the current options, with the exception of the Write New, Numeric
Variable Name option, the User Missing Value selections, and the options for Input Worksheets:
Data Range and Input Worksheets: Field Name Row. This is the same behavior that occurs when
you quit a Stat/Transfer session.
SQL statement
These options (separately for reading and writing) allow you to enter an SQL statement that will be
executed before the select statement on read and before any data are written on write. Your statement
cannot return any data. For example, you can execute a Drop Table or set commands for your
database.
You can load, edit and run a Stat/Transfer command processor program directly from this tab in the
user interface. Typically, these will be programs that have been automatically generated by the user
interface, but you can run any Stat/Transfer program from this window.
Running a Program
Begin by pressing the Open button and selecting the program you wish to use. The program will
appear in the top window of the screen, where it can be viewed and, if necessary, edited. If you make
any changes to the program, the Save button will become active and a Save as dialog box will allow
you to save the program back to disk.
To run the program, click on the Run button. Output from the command processor will appear in the
Output window. If you want to stop a program in mid-run, click on the Stop button.
Log Dialog Box
This feature gives information on what is occurring during a transfer and allows better technical
support.
When you carry out a transfer with the user interface, Stat/Transfer will write status, progress and
error messages, which can be read as the transfer progresses and can be saved to a file. In case of
errors or problems, this file can be sent to Circle Systems technical support.
Stat/Transfer Log
The log messages for a single session of Stat/Transfer will appear in the Stat/Transfer Log window.
You can, if you wish, save these messages to a permanent file.
Log Level
This option is designed to determine the information that is written to the log window during a
transfer. Currently these settings control the error messages for SAS catalog reading only. Their
coverage will be expanded in the future.
The options are: Critical Errors Only, Information and Errors, Verbose.
Clear Log
You can clear the contents of the log by pressing the Clear Log button.
copy infilename.ex1 outfilename.ex2 -x1 -x2
where infilename.ex1 and outfilename.ex2 are the input and output files, respectively and .ex1 and
.ex2 are standard extensions used to determine the file types. (See Page 54 for standard extensions.)
The file names can be complete file specifications.
An alternative for specifying the file type is to use a file-type tag instead of a standard extension.
This is discussed on Page 54.
Parameters -xi following the file names allow some options to be selected, such as automatic
optimization of variables or suppression of warning messages. See Page 57.
You can give additional commands before you give the COPY command. These allow you to select
cases and variables, to manually change output variable types and to set a number of options. See
Pages 67 - 70. (Note that these commands can be given only at the Stat/Transfer prompt or in a
command file or in a start-up file. They cannot be used when running interactively from the
operating system prompt.)
If no commands for selecting cases or variables are given, then by default, all of the variables and
cases will be transferred.
For example, to copy all of the variables and all of the cases from an Excel file, indata.xls, to a SAS
file, outdata.sas7bdat, type:
copy indata.xls outdata.sas7bdat
where the file type is given by using standard extensions.
Instead of entering commands directly, you can store them in a command file and execute that file at
the Stat/Transfer prompt. See Page 75.
Wildcard Transfers
Wildcards can be used to copy multiple files in one transfer. For example:
copy in/*.dta out/*.sas7bdat
will convert all of the Stata files in the directory /in to SAS files in the directory /out.
Standard wildcards (? and *) work in the input file name. The output file name should always be
specified with an asterisk and will consist of the input file name with the new extension.
We recommend that you use wildcard specifications with an input directory that contains just the files
you wish to transfer. To avoid the possibility of overwriting existing files or of being prompted about
this possibility, the directory should be empty on the output side. You may also specify the -y
parameter, to suppress prompting, but you should do so with extreme caution.
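To make the naming rule for wildcard transfers concrete, the following Python sketch derives the output file name from an input file name and an output specification containing an asterisk. The function output_name is hypothetical and only mirrors the rule described above; it is not part of Stat/Transfer.

```python
from pathlib import PurePosixPath


def output_name(input_path, output_pattern):
    """Replace the asterisk in the output specification with the input
    file name, without its directory or extension."""
    stem = PurePosixPath(input_path).stem
    return output_pattern.replace("*", stem, 1)


print(output_name("in/survey.dta", "out/*.sas7bdat"))  # out/survey.sas7bdat
```

Previewing the names in this way before a transfer is one way to check that no existing file in the output directory would be overwritten.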
Combining Files
If you wish to combine multiple input files into a single output file, use the COMBINE command
instead of the COPY command:
COMBINE *.ex1 filename.ex2 -[f[+]filename]
where *.ex1 are the input files to be combined (all in the same directory), filename.ex2 is the output
file and .ex1 and .ex2 are standard extensions used to specify the file types. The parameter
-[f[+]filename] is described below. The input files will be concatenated into one output file.
The syntax for this command is similar to that of the COPY command using wildcards, except that
the output file specification does not contain a wildcard.
The files to be combined do not need to have identical variables and the variables do not need to be in
the same order.
By default, the first file in alphabetical order becomes the reference file and the variables in this file
and their characteristics determine the structure of the output file. The variables in succeeding files
are then matched to the first file. If a file does not have a variable with a matching name or the type
of the variable does not match, it will not be included in the output.
The switch -f allows you to specify which file will be used as the reference file. For example, if
you have file1, file2, and file3 as input and you want to use file3 instead of file1 as the reference, you
would specify -ffile3. Note that you use the file name without the extension.
The file name itself may contain some information about the data in the file. For example, you might
have input files with data for each state, with files named with the state name. In such cases, you may
wish to have an additional variable created from the input file names and written to the output file.
The parameter -f+ tells Stat/Transfer to create a new variable from the input file names. This can be
combined with the reference file specification. For example, -f+file3 tells Stat/Transfer to use file3
as the reference file and to create a new output variable from the input file names.
All of the options used with the COPY command can be used with the COMBINE command.
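The choice of reference file and the matching of variables can be sketched in Python as follows. This is an illustration under stated assumptions, not the COMBINE implementation: the function names and the dictionary layout are hypothetical, and the sketch reads the matching rule as meaning that a variable is kept only when the reference file has a variable with the same name and type.

```python
def choose_reference(files, reference=None):
    """Return the reference file: the -f choice if given, otherwise the
    first file in alphabetical order."""
    if reference is not None:
        return reference
    return sorted(files)[0]


def matched_variables(files, reference=None):
    """For each input file, list the variables that match the reference
    file in both name and type."""
    ref_vars = files[choose_reference(files, reference)]
    return {
        name: [var for var, kind in variables.items() if ref_vars.get(var) == kind]
        for name, variables in files.items()
    }


# Hypothetical inputs: file name (no extension) -> {variable name: type}
files = {
    "file2": {"id": "number", "state": "number", "extra": "number"},
    "file1": {"id": "number", "state": "string"},
}

print(choose_reference(files))                               # file1
print(matched_variables(files)["file2"])                     # ['id']
print(matched_variables(files, reference="file2")["file1"])  # ['id']
```

The last call corresponds to specifying -ffile2: the structure of the output then follows file2, so the string variable state in file1 no longer matches.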
In those cases where different versions of a given file type use the same standard extension, the
standard extension will not enable Stat/Transfer to distinguish between different versions on output.
To write a version other than the default version, the output file name must be preceded by a file-type
tag, which identifies the appropriate output version. Note that for JMP files, a file type tag may be
necessary on input as well.
For example, Stata and Stata/SE both have the standard extension .dta. By default, if the output file
name has the extension dta, a standard Stata file will be written. If you wish to write a Stata/SE file,
then the output file name must be preceded by the file-type tag stata/se.
Whenever the same standard extension appears in the table below for different versions of a given file
type, you must use a file-type tag. The default version is marked with an asterisk.
File Names without Standard Extensions
If your file does not have a standard extension, you must precede the file name by a file-type tag,
which indicates the file type.
For example, if you want to write a Windows SAS file that has a .dat extension instead of an
.sas7bdat extension, you can type
copy indata.xls sas outdata.dat
where the file-type tag sas indicates that the output file is to be a SAS Windows file, while the
standard extension .xls identifies the input file as an Excel worksheet.
File Type                           Standard Extension    File-Type Tag
1-2-3                               wk?                   123
Access                              mdb                   access
ASCII - Delimited                   txt, csv              delim
ASCII - Delimited with Schema       stsd (Schema file)    stdelim
ASCII - Fixed with Schema           sts (Schema file)     stfixed
dBASE and compatibles               dbf                   xbase
DDI Schemas                         xml                   ddi
Epi Info                            rec                   epi
Excel 2007                          xlsx                  excelx
Excel 97+                           xls                   excel
Excel Version 2                     xls                   excel2
FoxPro                              dbf                   xbase
Gauss 96                            dat                   gauss96
HTML                                html                  html
JMP Version *                       jmp                   jmp*
LIMDEP                              lpj                   lpj
Matlab 7                            mat                   matlab7
Matlab                              mat                   matlab
Mineset                             schema, sch           mineset
Minitab                             mtw                   minitab
Mplus                               inp                   mplus
NLOGIT                              lpj                   nlogit
ODBC                                [none]                odbc
OpenDocument Spreadsheet            ods                   od
OSIRIS                              dict, dct             osiris
Paradox                             db                    paradox
Quattro Pro                         wq?, wb?              quattro
R                                   rdata                 r
RATS                                rat                   rats
SAS V6 for Windows and OS/2         sd2                   sas2
SAS V6 for Mac, Unix-HP, Sun, IBM   ssd01                 sas1
SAS V7 and V8                       sas7bdat              sas
SAS V9                              sas7bdat              sas9
SAS CPORT                           stc                   cport
SAS Transport Files                 tpt, xpt              sasx
S-PLUS                              .                     splus
S-PLUS for HP, IBM, Sun Unix        .                     splus-hl
SPSS Data Files                     sav                   spss
SPSS Data for HP, Sun Unix, IBM     sav                   spss-hl
SPSS Portable Files                 por                   spssp
SPSS Program and ASCII data         sps                   spss-dat
Stata (Standard)                    dta                   stata
Stata/SE                            dta                   stata/se
Stata Program and ASCII Data        do                    stata-dat
Statistica Version 5 and 6          sta                   statistica
Statistica Version 7+               sta                   statistica7
SYSTAT                              sys                   systat
Triple-S                            xml                   sss
* Please see the discussion of JMP in the Supported Files section.
** Default version. The Default column marks seven entries with **; the rows to which these marks belong cannot be recovered here.
Sometimes, when both the input and the output file require a table or subfile specification, the -T
switch can be ambiguous. In that case, you can modify the switch with a > or <. The parameter -T<
means the input table specification and -T> means the output table specification.
For example, to transfer sheet 3 of an Excel file to an Access table named cities, you could write:
copy in.xls out.mdb -T<sheet3 -T>cities
Wildcard Operators
The wildcard operators * and ? allow you to move more than one input Access data source table
with a single command.
For example, suppose the Access data source company.mdb consists of two tables, sales and
marketing. Then
copy company.mdb dept.xls -t*
will move each table in company.mdb to a separate Excel worksheet. The output worksheet names
will be the root name of the output file (that given in the COPY command) with the table names
appended, dept_sales.xls and dept_marketing.xls.
Wildcards in the file name can be mixed with wildcards in the Table parameter. For example, to
transfer all tables in all of the Access files in the directory indata and write them out in Stata format,
you would use
copy indata/*.mdb outdata/*.dta -t*
To select a single worksheet page, use the parameter
-Tn
or
-Tname
where n is the page number and name is the page name. You must use quotes around the entire
parameter if name contains blanks.
Examples are:
copy dept.xls out.dta -t3
copy dept.xls out.dta "-tnov sales"
Wildcard Operators
The wildcard operators * and ? allow you to move more than one input worksheet page with a
single command.
For example, suppose the worksheet company.xls consists of two pages, sales and marketing. Then
copy company.xls dept.dbf -t*
will move each page to a separate dBASE file. The output file names will be the root name of the
output file (that given in the COPY command) with the page names appended, dept_sales.dbf and
dept_marketing.dbf.
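The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated (root name of the output file, an underscore, then the page name); the function name wildcard_output_names is hypothetical and is not part of Stat/Transfer, and any directory in the output file name is ignored here for simplicity.

```python
from pathlib import PurePosixPath


def wildcard_output_names(output_file, page_names):
    """Return one output file name per input page, as root_page.ext.

    Hypothetical helper: it mirrors the documented naming rule only.
    """
    out = PurePosixPath(output_file)
    # out.stem is the root name ("dept"), out.suffix the extension (".dbf").
    return [f"{out.stem}_{page}{out.suffix}" for page in page_names]


names = wildcard_output_names("dept.dbf", ["sales", "marketing"])
print(names)
```

For the example in the text, the sketch yields dept_sales.dbf and dept_marketing.dbf.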
Wildcards in the file name can be mixed with wildcards in the page parameter. For example, to
transfer all of the pages in all of the worksheet files in the directory indata and write them out in
Stata format, you would use
copy indata/*.xls outdata/*.dta -t*
Selecting Members of SAS CPORT or Transport Files
Whenever you select a SAS CPORT or Transport file as input, the first member will be the one used
as the input data set, unless you select another one.
If the data you wish to use are in a different member, you must use the parameter
-Tmembername
where membername is the member you wish to select.
For example, to transfer the data from the member part4 of the SAS Transport file indata.xpt to the
Gauss file outdata.dat, type:
copy indata.xpt outdata.dat -tpart4
Wildcard Operators
The wildcard operators * and ? allow you to move more than one member of a SAS Transport file
with a single command. The wildcard operators are used for SAS members in the same way that they
are used for worksheet pages, as discussed above.
Most data are not measured with more than eight or nine digits of precision (survey data, for
example, never are). If you are concerned with the size of your output dataset, you might want to
use the floats option.
To tell the Stat/Transfer command processor to force numbers with fractional parts into floats where
appropriate, use the following:
COPY infilename.ex1 outfilename.ex2 -of
When this parameter is given, Stat/Transfer will both optimize the variable types and use floats. The
floats option can only be used when output is optimized.
If you choose to set output variable types manually, the parameter -of should not be used with the
COPY command.
Automatic Dropping of Constants from Output File
You can tell Stat/Transfer to automatically drop variables that are constant or missing for a selected
subset of data.
In order to tell Stat/Transfer to automatically drop constant or missing variables, the parameter -oc
is used:
COPY infilename.ex1 outfilename.ex2 -oc
When this parameter is given, Stat/Transfer will both optimize the variable types and drop variables
where needed. Note that the drop-constant option can only be used when the output is optimized.
You can use the floats parameter and the drop-constants parameter simultaneously, using the
parameter -ocf:
COPY infilename.ex1 outfilename.ex2 -ocf
If you choose to set output variable types manually, the parameters -oc or -ocf should not be used
with the COPY command.
Selecting Cases
Cases to be transferred can be selected when you are using the COPY command from the
Stat/Transfer command processor or when you are using a command file.
When you are using a command file at the operating system command line, you can enter operators
for case selection. However, transfers that are typed in at the operating system command line will
transfer all of the input cases.
You can select cases from the input file with the Stat/Transfer command processor, using the same
WHERE statement that you would use with the user interface. A complete description of WHERE
statements is given in the Stat/Transfer User Interface section Selecting Cases, Page 23.
When using case selection with the command processor, the WHERE statement is entered before the
COPY command.
The accuracy of the WHERE statement will be checked only when the COPY command is executed.
For this reason, particular care should be taken when typing in variable names.
Selecting Variables
Variables to be transferred can be selected when you are using the COPY command from the
Stat/Transfer command processor or when you are using a command file.
When you are using a command file at the operating system command line, you can enter operators
for variable selection. However, transfers that are typed in at the operating system command line will
transfer all of the input variables.
Variables can be selected with one of the commands KEEP, DROP, or TYPES. Only one of these
commands can be given and it must be given prior to the COPY command.
variables will be written, one per line, unless the parameter -oc has been given. If variable labels are
available, these will also be written as comments in the file.
The parameter -oc, the drop-constants option, is used if you wish to drop constant or missing
variables from the variable list you are creating.
Once the file has been created, you can use your favorite editor to delete the variables that are not to
be used with the DROP or KEEP command.
Writing to the Screen
If, for some reason, you wish to have the input variables listed on the screen, use the VARS command
without specifying an output file:
VARS filename
Available Options
The following options are available with the SET command. Case does not matter when they are entered.
Most of these options have already been discussed in the sections on the various options set in the
Options dialog box of the user interface. Go to the sections given by the headings below.
The options marked with asterisks, **, are discussed after the list of available options.
SET Command Options
General Options
NUMERIC-NAMES          (Y/N)                  Write new, numeric variable names
PRESERVE-LABEL-SETS    (Y/N)                  Preserve variable label tags and grouping where possible
SAMP-SEED              (Auto/Number)          Seed for sampling functions
VAR-CASE-CI            (lower/preserve-mixed/preserve-always/upper)
                                              Variable case conversion rule for case-insensitive output file
VAR-CASE-CS            (lower/preserve-mixed/preserve-always/upper)
                                              Variable case conversion rule for case-sensitive output file
** WRITE-OLD-VER       (Y/N) or (Number/N)
** DROP-KEEP           (Clear/Save)
** BYTE-ORDER          (HL/LH)
(Format string)
(Format string)
DATETIME-FMT-WRITE     (Format string)
Encoding Options
IN-ENCODING            (System/Other)
OUT-ENCODING           (System/Other)
If you choose to override the default behavior, you can choose a different encoding by going to the user interface, clicking
on the Options tab and selecting Encoding Options. First select the Region and then the Character Set. The value in
parentheses after the name of the character set is the value you should enter for Other. Be sure to read Encoding
Options, page 32.
ODBC Options
ODBC-NULL-STRINGS         (Y/N)              Use nulls for missing strings
ODBC-USE-DATETIME         (Y/N)              When possible, use larger datetime type
ODBC-DATE-ROWS-TO-READ    (10/Number/All)    Number of rows to read to determine the type of date variable
ODBC-VARCHAR-OVER-CHAR    (Y/N)              Use varchar if it has the same length as char
DB-TABLE-APPEND           (Y/N)              Append to Access or ODBC database tables
ASCII Options-Reading
DELIMITER-RD              (Autosense/Comma/Tab/Space/Semicolon/Other)
COMBINE-DELIMITERS        (Off/Spaces only/Spaces&Tabs)
ASCII-RD-VNAMES           (Autosense/First-row/Make-up)
ASCII-VAR-LABS-2ND-ROW    (Y/N)
SKIP-ROWS                 (0/Number)
NUM-MISS-RD               (Missing value characters/Extended/None)
QUOTE-CHAR-RD             (Character/None)
MAX-LINES                 (Number/All)
DECIMAL-POINT             (Character)
THOUSANDS-SEP             (Character)
ASCII Options-Write
DELIMITER-WR              (Comma/Tab/Space/Semicolon/Other)
QUOTE-CHAR-WR             (Character/None)
NUM-MISS-WR               (Missing value characters/Extended/None)
LINE-END                  (Win/Unix)
WRITE-FIELD-NAMES         (Y/N)
CODE-OMIT-PATHS           (Y/N)
PROG-PRESERVE-WIDTHS      (Y/N)
(Y/N)
(Filespec)
(catalogname)
(datasetname)
(Y/N)
(Y/N)
** The various choices for reading SAS value labels are discussed in the section SAS Value Label
Options, on Page 37. The following SET command sequences correspond to these choices:
Do not read formats
SET READ-SAS-FMTS N
SET READ-SAS-FMTS Y
SET READ-FMT-NAME filename.sas7bcat (the extension indicates a catalog file)
SET READ-SAS-FMTS Y
SET READ-FMT-NAME filename.sas7bdat (the extension indicates a data file)
SET READ-SAS-FMTS Y
SET READ-FMT-NAME filename.tpt (the extension indicates a Transport file)
SET UDF-DAT-MEMBER datasetname
(Y/N)
(Filespec)
Worksheet Options
WKS-NAME-ROW              (Autosense/First-row/No-name/Number)
WKS-DATA-RANGE            (Autosense/Name/Coordinates)
WKS-VAR-LABS-2ND-ROW      (Y/N)              Read variable labels from the row following the variable names
WKS-BLANK-ROWS            (Stop/Skip/Use)    Treatment of worksheet null rows
CONCATENATE-PAGES         (Y/N)
WKS-NA-STRING             (string)
WKS-WRITE-VNAME           (Y/N)
JMP Options
JMP-LABELS
(Y/N)
JMP-CUST-PROPERTIES (Y/N)
(Y/N)
R-OLD-NA
(Y/N)
RATS Options
RATS-DATE-VAR
    The variable in the input file that specifies the series start and frequency in RATS output
RATS-DATE-VAR-NAME        (Variable name)
    If rats-date-var is set to "specified", the variable name should be set with this option
RATS-VAR-DATE-FREQ        (Autosense, Daily, Weekly, Monthly, Quarterly, Annual)
    This option will be applied if rats-date-var is set to "specified" or "first"
RATS-START-DATE           (Date in DD/MM/YYYY format)
    This will be used if rats-date-var is set to "none"
RATS-DATE-FREQ            (Daily, Weekly, Monthly, Quarterly, Annual)
    This is applied if rats-date-var is set to "none"
Output Options
AS-OUTREP                 (Alpha_True64/Alpha_Vms_32/Alpha_Vms_64/Alpha_Ia64/
                          Hp_Ux_32/Hp_Ux_64/Intel_Abi/Linux_32/Mips_Abi/Os2/
                          Rs_6000_Aix_32/Rs_6000_Aix_64/Solaris_32/Solaris_64/
                          Windows_32/Windows_64)
MLAB-DATETIME-AS-STRING   (Y/N)
STAT7-VLAB-DESCRIP        (Y/N)
DDI-AGENCY                (example.org)
WRITE-OLD-VER
(Y/N) or (Number/N)
By default, except for Stata, a particular output file type will be written using the latest version
supported by Stat/Transfer. For example, SAS Version 9 files will be written if SAS is specified.
If Stata is specified as the output file type, Stata Version 12 files will be written by default. If files in a
newer Stata format were written instead, users of older versions of Stata would see the error message
"File is not in Stata format" when Stata encountered the newer file type. The default Stata Version
12 files can be read by all Stata versions.
To write a file for a previous version of a particular file type, use either of the SET commands:
SET WRITE-OLD-VER Y|N
SET WRITE-OLD-VER versionnum
where versionnum is the version number of the output file you wish.
If you use the first form of the SET command, Stat/Transfer will use the next-to-the-last version of
the output file type you have selected. For example, if you have selected a SAS output file, then
set write-old-ver y
would tell Stat/Transfer to write a SAS Version 8 output file.
If you wished to create a Stata 5 output file, you would use
set write-old-ver 5
Note that you can use either form if you wish to use the next-to-the-last version.
Reset Variable Selection Statement
DROP-KEEP
(Clear/Save)
If data are transferred from more than one input file during a single session, then you need to specify
the variables that are to be transferred from each file, using the KEEP or DROP variable selection
command. The DROP-KEEP option of the SET command allows you to reuse or clear the variable
selection command.
Input Files Specified Separately
If input files are given with separate COPY commands (that is, without using wildcards), then the
default behavior is that you must give a KEEP or DROP variable selection statement before each
COPY command. This corresponds to the option Clear for DROP-KEEP.
However, if the same variables are to be transferred from each file, you can specify that the same
KEEP or DROP command apply to all input files that follow, until another KEEP or DROP command
is encountered. To do so, use the SET command
set drop-keep save
Input Files Specified with Wildcards
If the input files are specified with wildcards, the default behavior is that the KEEP or DROP variable
selection command you give before a COPY command applies to all of the files specified in that
command. Thus the default when files are specified with wildcards is Save.
If you set the DROP-KEEP option to Clear, then the variable selection statement will apply only to
the first file. All of the other files given by the wildcard specification will transfer all of their
variables.
Note that the default changes depending on whether or not the input file specification contains
wildcards. The default is Clear for no wildcards and Save for wildcards.
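The two defaults can be summarized with a small Python sketch. It models only the behavior described above and is not Stat/Transfer code; the function name selection_applies and its arguments are hypothetical. Given the input files that follow a single KEEP or DROP statement, it returns the files to which that statement applies.

```python
def selection_applies(files, has_wildcards, drop_keep=None):
    """Return the input files to which one KEEP/DROP statement applies.

    files         -- input files in the order they are transferred
    has_wildcards -- True if the input specification contains wildcards
    drop_keep     -- "clear", "save", or None to use the documented default
    """
    if drop_keep is None:
        # Documented defaults: Clear without wildcards, Save with wildcards.
        drop_keep = "save" if has_wildcards else "clear"
    setting = drop_keep.lower()
    if setting not in ("clear", "save"):
        raise ValueError("drop_keep must be 'clear' or 'save'")
    # Save: the statement stays in force for every file that follows.
    # Clear: the statement is used for the first file only.
    return list(files) if setting == "save" else list(files[:1])


print(selection_applies(["a.mdb", "b.mdb"], has_wildcards=True))
print(selection_applies(["a.mdb", "b.mdb"], has_wildcards=False))
```

In this model, files outside the returned list transfer all of their variables unless a new KEEP or DROP statement is given.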
Set Byte Order for S-PLUS and SPSS Output Files
BYTE-ORDER
(HL/LH)
By default, Stat/Transfer will write S-PLUS and SPSS output files in low-high byte order.
This is appropriate for Windows. However, Unix machines vary in their byte order: DEC Alpha and
Intel processors use low-high byte order, while other Unix machines use high-low. You can change the
default in order to write a file appropriate for a Unix machine that uses high-low byte order. For
example, if you wish to produce an S-PLUS file suitable for a Sun machine, you would use the SET
command
set byte-order hl
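If the terms low-high and high-low are unfamiliar, the following Python sketch shows the difference for a single 32-bit integer. It uses the standard struct module and illustrates byte order in general; it does not reproduce the layout of S-PLUS or SPSS files, and the function name pack_int32 is hypothetical.

```python
import struct


def pack_int32(value, byte_order):
    """Pack a 32-bit integer in low-high ("lh") or high-low ("hl") order."""
    # "<" is struct's little-endian (low byte first) prefix,
    # ">" is its big-endian (high byte first) prefix.
    prefix = {"lh": "<", "hl": ">"}[byte_order.lower()]
    return struct.pack(prefix + "i", value)


print(pack_int32(1, "lh"))  # low-order byte comes first
print(pack_int32(1, "hl"))  # high-order byte comes first
```

The same four bytes appear in both results, only in reverse order, which is why a file written for one kind of machine cannot be read unchanged on the other.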
Quit
You can terminate your Stat/Transfer session by typing quit or q.
Command Files
You can enter Stat/Transfer commands in a command file, as well as interactively at the Stat/Transfer
prompt or the operating system prompt.
When you enter commands at the prompt, the command is executed immediately. When you store
commands in a file, the commands are executed when you execute the file.
You can execute command files from the Stat/Transfer prompt or from the operating system prompt.
When you wish to run batch jobs, you must use command files.
For example, a command file, repeat.stcmd, will be executed if, at the operating system prompt, you
type:
st repeat.stcmd
Note that the extension, which is explicitly given, must be .stcmd. If the file is not in the
Stat/Transfer program directory, the complete path to it must be given.
If you do the same transfers repeatedly, the ability to execute command files in this way allows you to
set up a shortcut that will perform your transfers with a single click on a desktop icon.
Running and Editing Command Files from the Windows Explorer
When you double click on a Stat/Transfer command file in the Windows Explorer, the command
processor will be launched and the command file executed.
If the last command in the file is quit, you will be taken directly back to the Explorer. Otherwise,
the command quit at the command line will return you to the Explorer.
You can edit a command file by right-clicking then selecting edit.
During a Stat/Transfer session you can obtain the value of any parameter by entering DBR or DBW
and the parameter name without a value:
DBR | DBW parametername
If no value has been entered yet, you will be told that the value is blank.
Clearing Parameter Values
The DBR and DBW values are persistent across COPY commands during a session. Thus you will
need to change or clear them if you are executing multiple COPY commands using different data sources.
You can clear the value of any parameter by entering clear in place of a value. For example:
dbr connstr clear
clears the value of the parameter CONNSTR. Note that it is necessary to clear parameter values if
you want to be re-prompted by the ODBC driver manager for any values that you have previously entered.
or
dbw connstr
You will find that the connection strings are generally long and complicated, so that typing one into a
command file after retrieval is tedious and error prone. Furthermore, because the connection string
may contain your password, you might not want it written openly to a command file.
Stat/Transfer solves these problems with operators that allow you to write or retrieve the value of any
parameter and by allowing encryption.
The Write and Retrieve Operators
The operator > allows you to store the value of any parameter in a file, while the operator < allows
you to read back a stored value.
DBR | DBW parametername > | < storagefile
Thus, in order to write the connection string for an input ODBC data source to a file, connect.str*, type:
dbr connstr > connect.str
To retrieve the connection string and make it available to an input ODBC data source, type:
dbr connstr < connect.str
To use a connection string in a batch job, you can connect interactively to the ODBC data source, use
the > operator to store the connection string in a file and then use the < operator with a DBR or
DBW command in a command file.
Encrypting Connection Strings
Because the connection string may contain your password, you may not want it written to unprotected
files. You can encrypt and decrypt the stored value of any parameter by adding a pound sign (#) to the >
or < operators.
For example:
dbr connstr ># connect.str
writes the connection string in encrypted format and
dbr connstr <# connect.str
retrieves and decrypts the connection string.
Variable Names
Stat/Transfer will, by default, ensure that variable names are legal in the target data set and are
unique. It does this by adding special characters when needed, changing case where appropriate,
substituting underscores (or another legal character) for invalid characters and by truncating variable
names to an acceptable length. It then eliminates duplicates created by this process by appending a
number to the end of the second and higher instances of duplicate variable names.
The original variable names will, if possible, be retained as variable labels.
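A rough Python sketch of this kind of name cleaning is given below. It is an assumption-laden illustration, not Stat/Transfer's actual algorithm: the legal character set, the eight-character limit, the case-insensitive comparison, and numbering that starts at 2 are all choices made for the example, and the function name legalize_names is hypothetical. Case changes and the adding of special characters are not modeled.

```python
import re


def legalize_names(names, max_len=8):
    """Make names legal and unique for a hypothetical target format."""
    used = set()
    result = []
    for name in names:
        # Substitute underscores for invalid characters, then truncate.
        base = re.sub(r"[^A-Za-z0-9_]", "_", name)[:max_len]
        candidate, n = base, 1
        # Second and higher instances of a duplicate get a number appended.
        while candidate.lower() in used:
            n += 1
            suffix = str(n)
            candidate = base[: max_len - len(suffix)] + suffix
        used.add(candidate.lower())
        result.append(candidate)
    return result


print(legalize_names(["household income", "household size", "age"]))
```

With these assumptions the first two names are both truncated to househol, so the second becomes househo2. This shows why truncated names can bear little resemblance to the originals.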
If you check the option Write New, Numeric Variable Names (VN), in the General Options section
of the Options dialog box, Stat/Transfer will create new variable names of the form V1...VN, instead
of the default variable names. This is chiefly useful when dealing with truncated names, which often
have little resemblance to the names you started with. If your output system supports variable labels,
you may wish to use this option to assign numeric names for your variables. You can then use the
variable labels for description.
Because this option is likely to be useful only in special circumstances, it reverts to the default
between sessions of Stat/Transfer.
Internal Limitations
The maximum width of an alphanumeric variable is 32,000 characters.
Variable names are limited internally to a maximum length of 255 characters.
Data are generally processed one case at a time, so the number of cases is not limited. Exceptions to
this are the worksheets and file formats that must be transposed, such as S-Plus. These transfers are
limited by available virtual memory. This is unlikely, however, to be a problem.
Supported Programs
Specific information for each of the supported file formats is given in the sections that follow.
Read-me File
The installation or Web update procedure may copy a file called read.me, which will be a supplement
to the on-line help or manual. There is a shortcut to the read.me file from the Windows Start menu
in the Stat/Transfer group. For OS-X and Unix the read.me file can be found in the installation
directory.
You should check to see if the read.me file exists or has been updated. If so, you can read it in any
editor or word-processor. We make every effort to keep up with changes in the file formats of
popular software and the read.me file will contain the latest information on which versions of these
programs are supported. The file will also contain the latest information on other improvements to
Stat/Transfer.
You can also get current information about Stat/Transfer by visiting our website at
http://www.stattransfer.com.
You can reach our website from the Stat/Transfer About screen.
wk1
wk3
Worksheet database files are structured worksheets where each row is a single case and each column
contains a variable. Data can consist of numbers (including serial date numbers), labels, or formulas.
The first non-blank row of a worksheet database file usually has strings in each column that give the
names of the variables. The data then begins in the next row. However, variable names may be in
different rows or not present at all.
You have several options to specify what part of an input worksheet to read and how to read variable
names. An option allows you to read variable labels from the row after the variable names.
Data Range
You can choose different ranges to be read in input worksheets by using the drop-down menu for
Data Range in the Worksheets section of the Options dialog box or using the SET command,
WKS-DATA-RANGE.
If you use Autosense (the default), Stat/Transfer will read to the first non-blank cell and use that as
the upper left corner of the data range. It will then read data until it encounters an entirely blank line.
This is the default behavior. You can change the behavior with respect to blank rows by using the
Blank Rows option.
Rather than have Stat/Transfer automatically sense the number of rows to be read, you can use the
other options for Data Range to specify a range, either by giving a named range or by giving explicit
coordinates.
When a range has been specified, Stat/Transfer's treatment of entirely blank rows will also be
overridden. They will be returned in your output data and, in addition, blank rows at the end of the
worksheet, through the last row of the specified range, will also be returned.
Determining Variable Names
By default, Stat/Transfer will attempt to determine whether or not the data in the first non-blank row
(or the first row of a specified range) are variable names or the first row of data. It does so by
looking for at least one column in which there is a string in the first row and a number in the
second.
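The test described above can be sketched in Python as follows. This is an illustration of the stated rule only; rows are assumed to be lists of Python values (strings for labels, int or float for numbers), and the function name first_row_is_names is hypothetical.

```python
def first_row_is_names(first_row, second_row):
    """Guess whether the first row holds variable names.

    True if at least one column has a string in the first row
    and a number in the second row.
    """
    return any(
        isinstance(top, str)
        and isinstance(below, (int, float))
        and not isinstance(below, bool)
        for top, below in zip(first_row, second_row)
    )


print(first_row_is_names(["id", "name"], [1, "Ann"]))
print(first_row_is_names(["city", "state"], ["Boston", "MA"]))
```

The second call shows the weakness of the rule: with only string data there is no column that pairs a string with a number, so the first row is not recognized as names.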
If this behavior is inappropriate for your worksheet (for example, if you have only string data), you
can specify different options in the Field Name Row drop-down menu of the Worksheets section of
the Options dialog box. You can specify that variable names must be taken from the first non-blank row,
that they be taken from a specific row, or that the worksheet does not contain variable names, so that
Stat/Transfer should assign them (col1 through coln).
Determining the Data Types and Widths
After identifying the label row, Stat/Transfer will look, if necessary, at the entire column to find the
first non-blank cell in order to determine the data type of each column. If the first non-empty data
cell of a particular column is a number (or a label with a single period), Stat/Transfer will transfer the
column as a number. If the data cell contains a label, the variable will be transferred as a string.
The width of the column for each numeric variable and the format of the first non-blank data cell in
that column are used, where possible, to set the default target, or output, types for the numeric
variables. If the format of the first data row has any decimal places (for example, F(2)), the target
type will be float. On the other hand, if the cell format has no decimal places (for example, F(0))
the target types will be various flavors of integers which depend on the column width. If the column
width is less than three, the target type will be byte. If the column width is less than five, the target
type will be int. Otherwise the target type will be long. Any date format in the first data row will
set the target type to date.
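The rules in the preceding paragraph can be written as a short Python function. This is a sketch of the rules as stated; the function name worksheet_target_type and its parameters are hypothetical, and the inputs are assumed to come from the format of the first non-blank data cell and the width of its column.

```python
def worksheet_target_type(decimals, column_width, is_date=False):
    """Return the default target type for a numeric worksheet column."""
    if is_date:
        return "date"    # any date format sets the target type to date
    if decimals > 0:
        return "float"   # the format has decimal places, e.g. F(2)
    if column_width < 3:
        return "byte"
    if column_width < 5:
        return "int"
    return "long"


print(worksheet_target_type(decimals=2, column_width=9))
print(worksheet_target_type(decimals=0, column_width=4))
```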
The maximum width of character variables is determined by examining the widths of all of the strings
in a column.
Stat/Transfer is lenient in typing variables from worksheets. If it is expecting a character variable and
it encounters a number it will convert it to a string.
Combining Multiple Input Worksheets into a Single Output File
The option Concatenate Worksheet Pages, found in the Worksheets options in the Options dialog
box allows you to combine worksheet pages into a single output file. This option is appropriate if
your worksheet contains many sheets that are identical in structure. These can then be combined
into a single output file of any type.
For example, you may have a workbook that has 50 sheets, with one sheet for each state and the same
variables on each sheet. If you check this box, Stat/Transfer will effectively combine the sheets into
one large input file, dropping the field names, if necessary, on the second and higher sheets. You will
end up with the data from all of your worksheet pages in a single output file.
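In outline, the effect of this option resembles the following Python sketch. It is an illustration only: each sheet is assumed to be a list of rows, the field names are assumed to be in the first row of every sheet, and the function name concatenate_pages is hypothetical.

```python
def concatenate_pages(pages, has_names=True):
    """Combine identically structured sheets into one list of rows."""
    combined = []
    for index, rows in enumerate(pages):
        if has_names and index > 0:
            # Drop the field-name row on the second and higher sheets.
            rows = rows[1:]
        combined.extend(rows)
    return combined


alabama = [["county", "pop"], ["Autauga", 54571]]
alaska = [["county", "pop"], ["Anchorage", 291826]]
print(concatenate_pages([alabama, alaska]))
```

The combined result keeps one row of field names followed by the data rows of every sheet.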
Writing 1-2-3 Worksheet Files
On output, by default, Stat/Transfer will write variable labels in the first row of the worksheet. Data
values will be placed in the second and succeeding rows. You can change this behavior in the
Worksheets options in the Options dialog box.
Column widths and formats will be determined by the variable information available. Dates and
character variables are straightforward. For numerical data, information on the width and number of
decimal places of variables, where available, is used to set the column widths and formats.
Missing Data
On input, blank cells and cells containing labels consisting of a single dot are read as missing.
If there is a string in your worksheet, such as NA, that you would like to have treated as a numeric
missing value, you can specify it using the Numeric Missing Value option in the Worksheets
section of the Options dialog box.
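The input rules above amount to something like the following Python sketch. It is only a model of the behavior described: None stands for a missing value, surrounding blanks are ignored (an assumption), and the function name read_numeric_cell and the na_string argument are hypothetical stand-ins for the Numeric Missing Value option.

```python
def read_numeric_cell(cell, na_string=None):
    """Return a float, or None if the cell counts as missing."""
    if cell is None:
        return None
    if isinstance(cell, str):
        text = cell.strip()
        if text == "" or text == ".":
            return None   # blank cell or a label that is a single dot
        if na_string is not None and text == na_string:
            return None   # user-specified missing-value string, e.g. NA
        return float(text)
    return float(cell)


print(read_numeric_cell("."))
print(read_numeric_cell("NA", na_string="NA"))
print(read_numeric_cell("2.5"))
```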
When transferring data to worksheets from other formats, missing values will be written out to
worksheets as blank cells.
Output Type
Numeric cell
(formatted if information is
available)
date
time
date/time
Label
Access
Stat/Transfer will read and write Access databases on Windows.
Although the data are transferred through the Microsoft Access ODBC driver, Access files are treated
as a normal Stat/Transfer file type. You can thus choose the file from the normal Open or Save dialog
boxes and need not be concerned with the process of configuring an ODBC data source for each file.
In order to use Microsoft Access you must have the ODBC and Access components installed with the
Stat/Transfer program. If you did not do this when you installed Stat/Transfer, you can re-run the
installation program and install the components.
Standard extension: mdb
Reading Microsoft Access Databases
Stat/Transfer can either read single tables or multiple tables that are joined in an Access view.
Access has a single date type Date/Time, which is used to store date, time and date/time values.
The format of the column determines how these are displayed. There is no straightforward way
outside of Access to determine how a column is formatted. Stat/Transfer reads the first ten cases of
the file to determine if each Date/Time column contains date data, time data or both. This
information is used to set the type.
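One plausible way to make this decision is sketched below in Python. The manual does not say how Stat/Transfer classifies the sampled values, so the logic here is an assumption: a value counts as having a time part if it is not midnight, and as having a date part if its date differs from 30 December 1899, the date Access uses for time-only values. The function name classify_datetime_column is hypothetical.

```python
from datetime import date, datetime, time

ACCESS_ZERO_DATE = date(1899, 12, 30)  # date part of time-only values


def classify_datetime_column(values, rows_to_read=10):
    """Classify a Date/Time column as "date", "time" or "date/time"."""
    # Only the first rows_to_read cases are examined; nulls are skipped.
    sample = [v for v in values[:rows_to_read] if v is not None]
    has_date = any(v.date() != ACCESS_ZERO_DATE for v in sample)
    has_time = any(v.time() != time(0, 0) for v in sample)
    if has_date and has_time:
        return "date/time"
    if has_time:
        return "time"
    return "date"  # also the fallback when the sample is empty


print(classify_datetime_column([datetime(2013, 1, 5), datetime(2013, 2, 1)]))
print(classify_datetime_column([datetime(1899, 12, 30, 8, 30)]))
```

Because only the first rows are sampled, a column whose early values are all dates but which later contains times would be typed as date in this model.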
Variable labels are transferred when reading Access tables.
Writing Microsoft Access Databases
On output, tables can be created in a new file, new tables can be created in an existing file, or existing
tables can be overwritten with a new table.
In addition, new data can be appended to an existing database table. This option is off by default and
must be turned on by using the Append to Access and ODBC Tables option in the
ODBC/Access Options of the Options dialog box or the SET command DB-TABLE-APPEND in
the command processor.
Stat/Transfer will match as many variables as possible to those already in the table and add your data
to the matching columns. Obviously at least one column must match exactly and, in addition, the
table must be free of constraints, such as those requiring unique keys, that would prohibit a simple
append operation.
When writing time or date data to Access, there is no way for Stat/Transfer to set the format. Because
of this, these variable types are written as Date/Time values. You may, if you wish, change the
format with Access after you have transferred your data.
Missing Data
Access supports a null value, which is the same as Stat/Transfer's missing value.
Output Variable Types
The output variable type that results from each target variable type is given in the following table:
Target Type               Output Type
byte, int                 Short
long                      Long
float                     Single
double                    Double
date, time, date/time     Date/Time
string                    Character
What character is read as the delimiter. You can explicitly specify the character or you can
allow the program to sense it automatically.
Whether or not the first row is treated as variable names, or whether Stat/Transfer should
automatically sense it or whether Stat/Transfer should assign variable names.
The character that is used to enclose string fields in the input data.
The number of lines Stat/Transfer will read to determine each type of variable present. By
default the entire data set is read.
A scan format to determine whether a given field is a date, time or a date/time. The formats
for dates, times and date/times are then used to actually read the data. The default formats
will usually read these fields, but formats for unusual fields can be specified.
A century changeover year. If you are reading two-digit years, you can use this option to
control how they are read. The default for the option is 20.
All of the user interface options available for reading ASCII files are set in the Options dialog box, in
the sections Date/Time Formats - Reading and ASCII/Text File - Read Options and are discussed
on Page 30 and Page 34. The command processor options can be found on Page 67 in the section
Setting Options with the Set Command.
If you wish to have more flexibility in reading delimited files, you can describe the file with a
Schema file (see Page 93).
The character that will be used as the delimiter in each record: commas, tabs, spaces, or
semicolons, or some other character.
The character that will be used to enclose string variables on output. It is typically a
double quote.
All of the options available for writing ASCII files are set in the Options dialog box, in the sections
Date/Time Formats - Writing and ASCII/Text File - Write Options. These options are discussed in
detail on Page 31 and Page 36. The command processor options can be found in Setting Options
with the Set Command, Page 67.
Missing Values
By default, missing values are indicated on input and output by one delimiter immediately following
another. You may change this default behavior in the ASCII/Text File - Write Options section of
the Options dialog box or with the SET command.
Extended missing values are supported. If the missing values option is set to extended, then when
an ASCII file with extended missing values is transferred to a SAS or Stata file, the input missing
values will transfer to the equivalent SAS ones. When a SAS file is transferred to an ASCII file with
extended missing values specified, any missing values ._ in the input SAS file are written out as .
in the output.
Note that if a blank is used as the delimiter, missing values will be hard to determine.
Output Variable Types
The output variable type that results from each target variable type is given in the following table:
Output to Delimited ASCII from Stat/Transfer
Target Type                       Output Type
byte, int, long, float, double    Number (with a precision of up to 15 decimal places)
date, time, date/time             Character (written according to the ASCII format options currently in effect)
string                            Character
To use fixed format ASCII files as input with the Stat/Transfer user interface, select ASCII Fixed Format (S/T Schema) from the list of supported file types at the Input File Type line of the
Transfer dialog box.
In the File Specification line of the Transfer dialog box, rather than entering the name of the fixed
format ASCII file itself, enter the name of the Schema file that describes it. Schema files will
generally have the name of the input fixed format data file and the extension .sts. The Schema file
then points to the data file. Thus, the location of the data file must be accurately given in the Schema
file.
If Stat/Transfer wrote the Schema file originally, then the exact path that Stat/Transfer used to write
the ASCII file on your machine was entered as the file location in the Schema file.
The file location can be changed by opening up the Schema in any plain text editor and giving the
file location with the FILE command. Comments in the first lines of the Schema or program file will
guide you in changing the specified file location.
Using the Command Processor
To specify a fixed format ASCII file as input to Stat/Transfer when using the command processor,
give the name of the Schema file as the input file in the COPY command. The file should have the
extension .sts. If for some reason your file does not have the extension .sts, then you can use the
file-type tag stfixed.
The Schema file then points to the data file as described above.
Writing Fixed Format ASCII Data Files
On output, Stat/Transfer can write fixed format data files, and will, if directed, create a Stat/Transfer
Schema file. Alternatively, you can choose to have Stat/Transfer write programs to accompany
output fixed format ASCII files that will allow these files to be read into SAS, SPSS, or Stata.
Fixed format ASCII data files, along with the associated program files, are useful for passing your
data to colleagues. In addition, they are a prudent way to archive your data for future use, as they
are not dependent on changes in computer architecture or the fate of a particular software package.
You have various options when writing the program files. See Page 31 and Page 36.
Using the User Interface
When you click on the Output File Type control in the Transfer dialog box, you will see the
ASCII Files - Fixed Format
You can write fixed format ASCII files using the command processor.
The only change from the description given above for the user interface is that the type of output is
specified by using an appropriate extension for the output file name in the COPY command, rather
than making a selection from the user interface.
The file extensions used to specify each type of output are:

Output                                     Extension
Fixed format ASCII with a SCHEMA file      sts
Fixed format ASCII with a SAS program      sas-dat
Fixed format ASCII with an SPSS program    spss-dat
Fixed format ASCII with a Stata program    stat-dat
Missing Values
By default, missing values are indicated on input and output by a blank field. This behavior can be
changed with the Numeric Missing Value option to extended in the ASCII READ or ASCII
WRITE sections of the Options dialog box or with the SET command.
When an ASCII file with extended missing values is transferred to a SAS or Stata file, the input
missing values will transfer to the equivalent SAS ones. When a SAS file is transferred to an ASCII
file with extended missing values specified, any missing values ._ in the input SAS file are written
out as . in the output.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Target Type                      Output Type
byte, int, long, float, double   Number (with a precision of up to 15 decimal places)
string                           Character
date, time, date/time            Character
[Outline of the Schema file commands. Recoverable entries: Fixed or Delimited; VARIABLES, with "Variable name | variable list | variable range columns (variable type)" or "Variable name (variable type)"; value label; DATA. Several commands are marked "when required".]
Each of the major commands (FILE, FORMAT, FIRST LINE, VARIABLES, VARIABLE LABELS,
VALUE LABELS, and DATA) must begin in the first space on a line. All other elements must be
indented at least one space or tab. Commands in the Schema file are not case sensitive.
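The layout rule above (major commands start in the first column, everything else is indented, and commands are not case sensitive) can be sketched in Python as follows. This is a hypothetical illustration of the rule, not the parser Stat/Transfer uses; the function classify_line and its return values are assumptions made for the example.

```python
# Major command names taken from the manual text above.
COMMANDS = ("FILE", "FORMAT", "FIRST LINE", "VARIABLES",
            "VARIABLE LABELS", "VALUE LABELS", "DATA")


def classify_line(line):
    """Return ("command", NAME), ("element", text), ("blank", "") or ("unknown", text)."""
    if not line.strip():
        return ("blank", "")
    if line[0] in " \t":
        # Indented lines belong to the command that precedes them.
        return ("element", line.strip())
    words = line.upper().split()  # upper-casing makes the match case-insensitive
    for command in COMMANDS:
        parts = command.split()
        if words[:len(parts)] == parts:
            return ("command", command)
    return ("unknown", line.strip())


print(classify_line("first line 3"))
print(classify_line(" Age 6-7"))
```

Matching on whole words rather than on a prefix keeps a line such as "Variable Labels" from being mistaken for VARIABLES.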
For fixed format data, the most basic SCHEMA file consists of a list of variable names, followed by
their column locations and, for non-numeric variables, the variable type in parentheses.
For delimited data, the most basic Schema file consists of a list of variable names and the types, in
parentheses, of all variables.
Note that the SCHEMA syntax is similar to SPSS DATA LIST syntax, but is more convenient to use.
There is no MISSING VALUES command, because missing values can be defined along with the
variables. Similarly, variable labels can be defined alongside the variables. Value labels are attached
with tags in a manner that is closer to SAS and Stata than SPSS.
FILE Command
If your ASCII input file is not in the same directory as the SCHEMA file, if it has an extension other
than .dat, or has a name that is different from the SCHEMA file, you must use the FILE command.
This consists of the word FILE beginning in the first column, followed by the complete path and
name of your input data file. If there are embedded spaces in any of these elements, you must
enclose the file specification in single or double quotes.
For example (the path shown is illustrative):
FILE "C:\My Data\survey.dat"
FORMAT Command
The FORMAT command is optional for fixed format data, but it must be present in order to read
delimited data. It begins in the first column.
When reading fixed format data, the FORMAT command, if present, is followed by fixed. This
command is never required for fixed format files, but you may wish to use it for documentation
purposes.
When reading delimited data, the FORMAT command is followed by delimited, which is then
followed by the delimiter. The delimiter can be commas, tabs, spaces or semicolons, or you
can specify any other delimiter character by preceding it with a \. For instance, to use the pipe
character for a delimiter, type
format delimited \|
To use commas, type
format delimited commas
FIRST LINE Command
If you need to skip lines at the beginning of your data file, use the FIRST LINE command, which starts in the first
column. After the command, enter the number of the first line to be read.
For example, to start on the third line of your file, type
first line 3
VARIABLES Command
Variables
 ID        1-5
 Age       6-7
 Name      8-20 (A)
 Sex       21
 Birthdte  (%d-%m-%Y)
The type is not necessary when the variable type is numeric. String variables are designated by the
letter A, and for dates and times you should use a Stat/Transfer date input format, as documented in
the ASCII options given on Page 30. For dates in the format 10-MAY-1950, for example, you would
use (%d-%m-%Y), the example above. Note that different formats can be used for different date and
time variables.
Numeric variables can also be specified with a starting column location and a format giving the
variable width and an optional number of implied decimal places. In this case, the ending column
location is not necessary. The format is given in parentheses after the starting column location.
variablename begcol (fwidth.n)
If a decimal point is present in the input number, it is read and used. If a decimal point is not found
in the data, the number will be divided by 10 to the nth power, where n is the number of decimal
places given in the format.
For example,
 income 6 (F10.2)
will read the variable income beginning in column 6, with a width of 10. If the number found in the
field is 2.00, the result will be 2. If the data contain 200, the result will be 2 as well.
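The implied decimal rule can be expressed as a short Python sketch. It illustrates the arithmetic described above under the assumption that a blank field is missing; it is not Stat/Transfer's implementation, and the function name read_implied_decimal is made up for the example.

```python
def read_implied_decimal(field, n_decimals):
    """Interpret a fixed-width numeric field with n implied decimal places."""
    text = field.strip()
    if text == "":
        return None  # assumption: a blank field is treated as missing
    if "." in text:
        return float(text)  # an explicit decimal point is read and used
    return int(text) / 10 ** n_decimals  # otherwise divide by 10 to the nth power


with_point = read_implied_decimal("      2.00", 2)
without_point = read_implied_decimal("       200", 2)
print(with_point, without_point)
```

Both calls describe a ten-character field read with the format (F10.2), and both yield the same value.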
The input widths, whether given explicitly or with a format giving the variable width may be used to
specify the output width when writing program files with fixed format ASCII data. In order for these
widths to be used in the output file, you should check the option Preserve Input Widths in the
ASCII File - Write section of the Options dialog box or use the SET command PROG-PRESERVEWIDTHS Y in the command processor.
Variable Lists
If you have adjacent variables in a record with the same widths and of the same type,
you can use a list to specify them in the VARIABLES command.
The width specified by the given column range is divided by the number of variables in the list to
give the width of each variable. If the width is not evenly divisible by the number of variables,
an error is returned.
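The division of a column range among the variables in a list can be sketched in Python as follows. This is an illustration of the rule only; the function name split_column_range and the variable names q1 through q3 are hypothetical.

```python
def split_column_range(names, first_col, last_col):
    """Divide an inclusive column range evenly among the variables in a list."""
    total_width = last_col - first_col + 1
    if total_width % len(names) != 0:
        raise ValueError(
            f"width {total_width} is not divisible by {len(names)} variables")
    width = total_width // len(names)
    return {
        name: (first_col + i * width, first_col + (i + 1) * width - 1)
        for i, name in enumerate(names)
    }


columns = split_column_range(["q1", "q2", "q3"], 10, 15)
print(columns)
```

Here columns 10 through 15 give a total width of six, so each of the three variables is two columns wide. A range of 10 through 16 would raise an error, because seven columns cannot be shared equally among three variables.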
For example,
Variables
 ID        (A4)
 Age       (F)
 Name      (A16)
 Sex       (F2)
 Wage      (F10.2)
 Birthdte  (%Y-%m-%d)
Variable names must begin with a letter or an underscore. If they have embedded spaces (not
recommended) they must be enclosed by single or double quotes.
The variable type is always necessary for delimited data.
String variables are indicated by the letter A followed by a width. The width specification is
required.
Numeric data are indicated by the letter F. A width is not required for numeric data, but may be
given in order to specify the output width when writing program files with fixed format ASCII data.
You can give either a width such as F2 or a width and implied number of decimals, such as F4.2.
In order for these widths to be used in the output file, you should check the option Preserve Input
Widths in the ASCII File - Write section of the Options dialog box.
For dates and times you should use a Stat/Transfer date input format, as documented in the ASCII
options given on Page 30. For dates in the format 1950-MAY-10, for example, you would use
(%Y-%m-%d). The formats given here override those selected in the Options dialog box. Note that
different formats can be used for different date and time variables.
Optional Elements for both Fixed and Delimited ASCII Data
Missing Value Specifications
Missing values can be coded in your input data file as blank, or as one of the extended missing values
. and .a-.z. See Page 35 for specifying extended missing values.
However, it is probably better practice to code missing values numerically in your data and then assign
these numbers to missing as the data are read in. This allows the missing data to become a subject of
analysis.
Up to three missing value specifications can be entered for each variable (or variable list or variable
range, for fixed format files). These are entered on the same line as the variable name (list or range),
following the variable specification elements. The missing value specifications are entered in square
brackets and are separated by commas.
A number given by itself is an equal-to specification. That is, if an input value is equal to the
number, it will be considered missing. In addition, missing value specifications can be entered as
<= (less than or equal) plus a number and >= (greater than or equal) plus a number.
For a fixed format input file, for example, suppose the values 0, 98, and 99 for the variable age
are all used to indicate a missing value. Then
 Age 6-7 [0,98,99]
 Age 6-7 [0,>=98]
 Age 6-7 [<=1,98,99]
are all equivalent (assuming positive values for the variable age).
For a delimited input file, this would be specified as
 Age (F) [0,98,99]
and so on.
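How a missing value specification is applied can be sketched in Python. The sketch assumes the three forms described above (a bare number, <= plus a number, and >= plus a number) and the limit of three specifications per variable. It is an illustration, not Stat/Transfer's code, and the function names are hypothetical.

```python
def parse_missing_spec(spec):
    """Turn a specification such as '[0,>=98]' into (operator, number) rules."""
    rules = []
    for item in spec.strip().strip("[]").split(","):
        item = item.strip()
        if item.startswith("<="):
            rules.append(("<=", float(item[2:])))
        elif item.startswith(">="):
            rules.append((">=", float(item[2:])))
        else:
            rules.append(("==", float(item)))  # a bare number is an equal-to rule
    if len(rules) > 3:
        raise ValueError("at most three missing value specifications are allowed")
    return rules


def is_missing(value, rules):
    """Return True if the value satisfies any of the rules."""
    for operator, number in rules:
        if operator == "==" and value == number:
            return True
        if operator == "<=" and value <= number:
            return True
        if operator == ">=" and value >= number:
            return True
    return False


listed = parse_missing_spec("[0,98,99]")
ranged = parse_missing_spec("[0,>=98]")
print(is_missing(98, listed), is_missing(98, ranged), is_missing(45, ranged))
```

For a two-column field, whose values run from 0 to 99, the two specifications flag exactly the same values.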
Variable Labels
If you only have one variable on a line (not a list or a range), and wish to use variable labels, then it is
most convenient to put them on the same line as the variables.
The variable labels are enclosed in curly braces, as in
Age 6-7 [0,98,99] {Age of Respondent}
If you are using a range or a list with fixed format input to define several similar variables and you
wish to label them, you must use the VARIABLE LABELS command, described below.
VARIABLE LABELS Command
If you are using lists or range specifications (with fixed format input) in the VARIABLES command
and need to define labels for those variables, then you must use the VARIABLE LABELS command.
This command is followed by as many variable name / variable label pairs as needed.
For example:
Variable Labels
 Housesat "Satisfaction with Housing"
 Incsat "Satisfaction with Income"
will label the variable Housesat as Satisfaction with Housing and the variable Incsat as
Satisfaction with Income.
If the labels contain embedded blanks, they should be enclosed in single or double quotes.
VALUE LABELS Command
You can define sets of value labels with the VALUE LABELS command. The elements are a tag,
preceded by a backslash, and then value/label pairs. For instance:
Value Labels
 \Agelab
 0 Inapplicable
 98 "Not Ascertained"
 99 Refused
If the labels contain embedded blanks, they should be enclosed in single or double quotes.
DATA Command
The input data can be read from the same file as the SCHEMA commands, rather than read from a
separate file. The DATA command, which must follow all of the rest of the SCHEMA commands,
tells Stat/Transfer to treat everything that follows as data.
When the DATA command is used, the FILE command must be omitted.
SCHEMA File Comments
You can put comments in the SCHEMA file with two slashes. Everything that follows on the line
will be treated as a comment.
// this is a comment
Missing Data
dBASE does not directly support missing values. On input to Stat/Transfer, blanks in a dBASE file
are interpreted as missing values. If a data set is being transferred to a dBASE file format, missing
values in the input files are set to blank in the dBASE file. Blanks are interpreted as zero by dBASE.
Many other programs, including Stat/Transfer, interpret these blanks as missing.
Target Type                  Output Type
byte                         Number (width 4, 0 decimal places)
int                          Number (width 6, 0 decimal places)
long                         Number (width 11, 0 decimal places)
float, double (dBASE III)    Number (width 16, decimal places taken from input data. If decimal places unknown, set to 4.)
float, double (dBASE IV)     Float (width 16)
string                       Character
date                         Date
time, date/time              Character (written according to the ASCII format options currently in effect)
Writing DDI
Stat/Transfer writes delimited DDI. The specification requires that elements within the Schema be
identified by an "agency". This is typically a URL. By default it is "example.org", but you should
change it to something more appropriate in Output Options (1) of the Options dialog box.
Missing Values
On input the missing value is taken from the Schema. On output it is a blank.
Output Variable Types
The output variable type that results from each target variable type is given in the following table:
Target Type              Output Type
byte, int                Short
long                     Long
float                    Single
double                   Double
date, time, date/time    Date/Time
string                   Character
Epi Info
Stat/Transfer will read and write files for Epi Info, a free statistical program developed by the Centers
for Disease Control and Prevention (CDC). All versions through Version Six are supported.
Standard extension: rec
Reading Epi Info Files
The Epi Info .rec file contains both a dictionary with enough information for the program to construct a
data-entry screen for the file, and the data itself in ASCII format. Stat/Transfer will use the data-entry
description field as the variable label, if it is present.
Writing Epi Info Files
Because Epi Info files are basically formatted ASCII, they are not suitable for numbers which vary
widely in magnitude. Keeping in mind that the .rec file is also a data-entry template, Stat/Transfer will
make its best effort to make an attractive and functional one. Labels will be used, if available, for
variable descriptions.
Missing Data
Blank fields are used by Epi Info for missing values. Stat/Transfer recognizes this when reading Epi
Info files. Missing values will be written as blanks on output to Epi Info format.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Output to Epi Info from Stat/Transfer

Target Type        Output Type
byte               Number (width 4, 0 decimal places)
int                Number (width 6, 0 decimal places)
long               Number (width 11, 0 decimal places)
float, double      Number (width 16, decimal places taken from input data. If decimal places unknown, set to 4.)
string             Text
date               Date
time, date/time    Text (written according to the ASCII format options currently in effect)
Excel Worksheets
Stat/Transfer will read and write files from Excel. It will read all versions, and will write Version 2.1
files and files for Excel 97 and higher versions. Note that Excel 97 and higher versions support long
strings (up to 32K) and more (up to 64K) records in a worksheet.
Excel 2007 has vastly increased limits over earlier versions. The maximum number of columns has
been increased from 256 to 16,384 and the maximum number of rows has increased from 65,536 to
over one million. Note, however, that this does not mean that Excel is an appropriate tool for very large
datasets. You will find that loading very large files is agonizingly slow.
Excel 2013 has added a new Strict Open XML format. Stat/Transfer 12 will read this format.
Standard extension: xls
Worksheet database files are structured worksheets where each row is a single case and each column
contains a variable. Data can consist of numbers (including serial date numbers), labels, or formulas.
The first non-blank row of a worksheet database file usually has strings in each column that give the
names of the variables. The data then begins in the next row. However, variable names may be in
different rows or not present at all.
You have several options to specify what part of an input worksheet to read and how to read variable
names. An option allows you to read variable labels from the row after the variable names.
Data Range
You can choose different ranges to be read in input worksheets by using the drop-down menu for
Data Range in the Worksheets section of the Options dialog box or using the SET command,
WKS-DATA-RANGE.
If you use Autosense (the default), Stat/Transfer will read to the first non-blank cell and use that as
the upper left corner of the data range. It will then read data until it encounters an entirely blank row.
You can change the behavior with respect to blank rows by using the
Blank Rows option.
Rather than have Stat/Transfer automatically sense the number of rows to be read, you can use the
other options for Data Range to specify a range, either by giving a named range or by giving explicit
coordinates.
When a range has been specified, Stat/Transfer's treatment of entirely blank rows will also be
overridden. They will be returned in your output data and, in addition, blank rows at the end of the
worksheet, through the last row of the specified range, will also be returned.
By default, Stat/Transfer will attempt to determine whether or not the data in the first non-blank row
(or the first row of a specified range) are variable names or the first row of data. It does so by
looking for at least one column in which there is a string in the first row and a number in the
second.
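The test described above (at least one column with a string in the first row and a number in the second) can be sketched in Python. This is a hypothetical illustration of the heuristic as the manual states it, not Stat/Transfer's code; the function name first_row_has_names is an assumption.

```python
def _is_number(cell):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def first_row_has_names(first_row, second_row):
    """True if some column holds a string in row one and a number in row two."""
    return any(
        isinstance(top, str) and _is_number(below)
        for top, below in zip(first_row, second_row)
    )


mixed = first_row_has_names(["id", "name"], [1, "Ann"])
all_strings = first_row_has_names(["city", "state"], ["Austin", "TX"])
print(mixed, all_strings)
```

The second call shows why a worksheet containing only string data defeats the heuristic: no column has a number in its second row, so the first row is not recognized as variable names.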
If this behavior is inappropriate for your worksheet (for example, if you have only string data), you
can specify different options in the Field Name Row drop-down menu of the Worksheets section of
the Options dialog box. You can specify that variable names must be taken from the first non-blank
row, that they be taken from a specific row, or that the worksheet does not contain variable names, so
that Stat/Transfer should assign them (col1 through coln).
Determining the Data Types and Widths
After identifying the label row, Stat/Transfer will look, if necessary, at the entire column to find the
first non-blank cell in order to determine the data type of each column. If the first non-empty data
cell of a particular column is a number (or a label with a single period), Stat/Transfer will transfer the
column as a number. If the data cell contains a label, the variable will be transferred as a string.
The width of the column for each numeric variable and the format of the first non-blank data cell in
that column are used, where possible, to set the default target, or output, types for the numeric
variables. If the format of the first data row has any decimal places (for example, F(2)), the target
type will be float. On the other hand, if the cell format has no decimal places (for example, F(0))
the target types will be various flavors of integers which depend on the column width. If the column
width is less than three, the target type will be byte. If the column width is less than five, the target
type will be int. Otherwise the target type will be long. Any date format in the first data row will
set the target type to date.
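The rules above for choosing a default target type can be summarized in a short Python sketch. The function name default_target_type and its parameters are assumptions for illustration; the thresholds are those stated in the text.

```python
def default_target_type(width, decimals, is_date=False):
    """Pick a default target type from a column's width and cell format."""
    if is_date:
        return "date"   # any date format sets the target type to date
    if decimals > 0:
        return "float"  # a format with decimal places
    if width < 3:
        return "byte"
    if width < 5:
        return "int"
    return "long"


for width, decimals in [(2, 0), (4, 0), (9, 0), (9, 2)]:
    print(width, decimals, default_target_type(width, decimals))
```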
The maximum width of character variables is determined by examining the widths of all of the strings
in a column.
Stat/Transfer is lenient in typing variables from worksheets. If it is expecting a character variable and
it encounters a number it will convert it to a string.
Combining Multiple Input Worksheets into a Single Output File
The option Concatenate Worksheet Pages, found in the Worksheets options in the Options dialog
box, allows you to combine worksheet pages into a single output file. This option is appropriate if
your worksheet contains many sheets that are identical in structure. These can then be combined
into a single output file of any type.
For example, you may have a workbook that has 50 sheets, with one sheet for each state and the same
variables on each sheet. If you check this box, Stat/Transfer will effectively combine the sheets into
one large input file, dropping the field names, if necessary, on the second and higher sheets. You will
end up with the data from all of your worksheet pages in a single output file.
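The effect of concatenating worksheet pages can be sketched in Python with each sheet represented as a list of rows. This is an illustration only; the function name concatenate_sheets and the sample data are hypothetical.

```python
def concatenate_sheets(sheets, has_field_names=True):
    """Stack identically structured sheets, keeping field names from the first only."""
    combined = []
    for index, rows in enumerate(sheets):
        if has_field_names and index > 0:
            rows = rows[1:]  # drop the repeated field-name row
        combined.extend(rows)
    return combined


alabama = [["state", "pop"], ["AL", 5]]
alaska = [["state", "pop"], ["AK", 1]]
combined = concatenate_sheets([alabama, alaska])
print(combined)
```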
Writing Excel Worksheet Files
On output, by default, Stat/Transfer will write variable labels in the first row of the worksheet. Data
values will be placed in the second and succeeding rows. You can change this behavior in the
Worksheets options in the Options dialog box.
Column widths and formats will be determined by the variable information available. Dates and
character variables are straightforward. For numerical data, information on the width and number of
decimal places of variables, where available, is used to set the column widths and formats.
Missing Data
On input, blank cells and cells containing labels consisting of a single dot are read as missing.
If there is a string in your worksheet, such as NA, that you would like to have treated as a numeric
missing value, you can specify it using the Numeric Missing Value option in the Worksheets
section of the Options dialog box.
When transferring data to worksheets from other formats, missing values will be written out to
worksheets as blank cells.
Target Type                       Output Type
byte, int, long, float, double    Numeric cell (formatted if information is available)
string                            Label
date                              Serial date number (with date format)
time                              Time fraction (with date format if available)
date/time                         Date number + time fraction (with date/time format if available)
FoxPro Files
Stat/Transfer will read and write FoxPro files.
Standard extension: dbf
Reading FoxPro Files
FoxPro files can have indices for key fields, which are stored in separate files. Stat/Transfer ignores
these indices, and treats all files sequentially.
On input, FoxPro numeric data and character variables are converted in a straightforward manner.
Logical variables are converted to numbers (True becomes 1, False becomes 0). Memo Fields
cannot be converted and will not appear for variable selection. Deleted records are not transferred.
Writing FoxPro Files
Users should be aware that FoxPro files are limited to 128 variables.
FoxPro stores numeric data in fixed length character format. It is thus not very suitable for numbers
which vary widely in magnitude, or which are either very large or very small.
When Stat/Transfer is transferring data from a system in which the width and number of decimal
places are known, it uses that information to set the format of each field in the output FoxPro files.
For systems, such as SYSTAT, in which this information is not recorded in the file, Stat/Transfer
sets the formats based on the target type of the variable.
Missing Data
FoxPro does not directly support missing values. On input to Stat/Transfer, blanks in a FoxPro file are
interpreted as missing values. If a data set is being transferred to a FoxPro file format, missing
values in the input files are set to blank in the FoxPro file. Blanks are interpreted as zero by FoxPro.
Many other programs, including Stat/Transfer, interpret these blanks as missing.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Output to FoxPro from Stat/Transfer

Target Type        Output Type
byte               Number (width 4, 0 decimal places)
int                Number (width 6, 0 decimal places)
long               Number (width 11, 0 decimal places)
float, double      Float (width 16)
string             Character
date               Date
time, date/time    Character (written according to the ASCII format options currently in effect)
Gauss Files
Stat/Transfer will read and write Gauss data sets. There are two Gauss formats.
1. Gauss 89 was used on PC platforms, and consists of two files: a data file with a .dat extension and
a header, or dictionary file with a .dht extension.
2. Gauss 96 was first used on Unix platforms and now is used on all platforms. It is written to a single
file with a .dat extension.
Standard extension: dat
Reading Gauss Files
When you wish to transfer data from a Gauss data set, give Stat/Transfer the name of the data file (the
file with the .dat extension). If Stat/Transfer can find the .dht file in the same directory, it will read
the data file as a Gauss 89 file. If no .dht file is present, the data file will be read as a Gauss 96 file.
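The rule for telling the two Gauss formats apart can be sketched in Python. The sketch only checks for a .dht file next to the data file, as described above; the function name gauss_format and the file name survey.dat are hypothetical.

```python
import tempfile
from pathlib import Path


def gauss_format(data_file):
    """Report which Gauss format a .dat file would be read as."""
    header = Path(data_file).with_suffix(".dht")
    return "Gauss 89" if header.exists() else "Gauss 96"


with tempfile.TemporaryDirectory() as folder:
    data = Path(folder) / "survey.dat"
    data.write_bytes(b"")                         # data file alone
    before = gauss_format(data)
    data.with_suffix(".dht").write_bytes(b"")     # add a header file beside it
    after = gauss_format(data)

print(before, after)
```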
Stat/Transfer will read numeric data from integer, single precision and double precision Gauss data
sets. Character data will only be read from double precision data sets.
Writing Gauss Files
On output, you can choose whether to write a Gauss 89 or Gauss 96 file. If you choose to write a
Gauss 89 file, both of the Gauss files, the data file and the header file, will be written. Stat/Transfer
will show the data file name, with the .dat extension, in the output File Specification line. However,
the header file will be created as well, with a .dht extension.
Stat/Transfer writes double precision Gauss data sets.
Missing Data
Gauss supports missing values.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Target Type
Output Type
byte
int
long
float
double
string
date
MM/DD/YY format
(written as a character variable)
Character
(written according to the ASCII
format options currently in effect)
time
date/time
gretl
gretl (Gnu Regression, Econometrics and Time-series Library) is a free, open-source package for
econometric analysis. Stat/Transfer can read and write gretl cross-section files. It does not support
time series and panel data structures.
gretl supports numeric variables only (with the exception of a single, alphanumeric, or date row label).
Reading gretl
If necessary, Stat/Transfer will decompress gretl files automatically. The character encoding is read
from the file and will override any encoding set in the options. If the file contains row labels,
Stat/Transfer will attempt to convert them to dates using the scan format set in the Date/Time Formats -
Reading options. If the row labels cannot be converted to dates, they will be treated as an ordinary
string variable. The row label will be named obs.
Writing gretl
Stat/Transfer writes uncompressed gretl files with strings in Unicode (utf-8). You can control which
variable is used as the row label in Options/Output Options(2). Note that because gretl has only a
numeric data type, only one string variable can be written to the output file. All Date/Time variables,
except for one used for the row label (optionally), will be written to the file as Excel dates (the number
of days since December 30, 1899, with time as a decimal fraction of the day).
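The Excel date convention mentioned above can be sketched in Python. The function name to_excel_date is an assumption for illustration, and the sketch ignores fractions of a second.

```python
from datetime import datetime

EXCEL_EPOCH = datetime(1899, 12, 30)  # day zero of the Excel date scale


def to_excel_date(moment):
    """Days since December 30, 1899, with the time as a fraction of the day."""
    delta = moment - EXCEL_EPOCH
    return delta.days + delta.seconds / 86400  # 86400 seconds in a day


midnight = to_excel_date(datetime(1899, 12, 31))
morning = to_excel_date(datetime(1900, 1, 1, 6, 0))
print(midnight, morning)
```

December 31, 1899 is one day after the starting point, and 6:00 a.m. on January 1, 1900 is two and a quarter days after it.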
Target Type
Output Type
Table
Byte
Int
Long
Float
Double
Number
Date
Time
Date/time
String
HTML Tables
Stat/Transfer will write HTML tables for use in web pages.
Standard extension: htm
Writing HTML Tables
Field names will be written in bold characters as the first row of the table. Values of each field are
then written down the corresponding column.
The table generated by Stat/Transfer is bracketed by <HTML> tags. Thus it can be loaded directly
into your browser. However, most users will probably want to cut and paste the table into a larger
HTML page. To make this process easier, the output HTML tables have reasonably short lines and
continuation lines are indented.
Note that HTML output is in general only appropriate for fairly small tables. Stat/Transfer will
transfer large data sets to HTML format. However, if you use a large table in a web page, many
browsers will be brought to their knees when they try to read the table.
Missing Data
Missing values for any data type are written as a non-breaking space, &nbsp;. On any browser, this
will cause a blank table cell to be displayed.
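The general shape of such a table (field names in bold in the first row, a non-breaking space for missing values, and <HTML> tags around the whole) can be sketched in Python. This is a hypothetical illustration; the exact markup Stat/Transfer writes is not reproduced here, and the function name html_table is an assumption.

```python
from html import escape


def html_table(field_names, rows):
    """Build a small HTML table; None stands for a missing value."""
    lines = ["<HTML>", "<TABLE>"]
    header = "".join(f"<TD><B>{escape(name)}</B></TD>" for name in field_names)
    lines.append(f"  <TR>{header}</TR>")
    for row in rows:
        cells = "".join(
            "<TD>&nbsp;</TD>" if value is None
            else f"<TD>{escape(str(value))}</TD>"
            for value in row
        )
        lines.append(f"  <TR>{cells}</TR>")
    lines += ["</TABLE>", "</HTML>"]
    return "\n".join(lines)


page = html_table(["id", "age"], [[1, 34], [2, None]])
print(page)
```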
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Target Type                               Output Type
byte, int, long, float, double, string    Character
date, time, date/time                     Character (written according to the ASCII format options currently in effect)
JMP Files
JMP is a general statistics package from the SAS Institute that runs on both the Windows and
Macintosh platforms. Stat/Transfer reads and writes files that are usable on either platform.
All versions higher than 3 are supported.
Standard extension: jmp
Reading JMP Files
JMP allows (and encourages) variable names of up to 31 characters. These variable names can
contain embedded blanks and any other character. These names may be truncated or altered when
they are transferred to other statistical packages. Because of this, Stat/Transfer will use the JMP
variable name for a variable label if it is greater than eight characters and contains an embedded
blank, or if you have checked the option Write New, Numeric Variable Names in the General
Options section of the Options dialog box.
By default, the notes property for a variable will be used as a variable label if present. If you have
additional textual data in the custom properties of your variables, you can append this to the variable
labels by checking the option Use custom properties as variable labels in the JMP Options section
of the Options dialog box.
Reading JMP files with the Command Processor
It is necessary to use a file-type tag to distinguish different versions of JMP files when reading them
with the command processor.
Version 4+ files are read by default. That is, if the input is a Version 4+ JMP file that has the standard
extension .jmp, no file-type tag is needed. If you want to read a Version 3 JMP file, then the file
name must be preceded by the file-type tag jmp3.
Writing JMP Files
Variable labels, when available from the input file, will be written to JMP output files as variable
notes.
You can choose to have Stat/Transfer write value labels to output JMP files by checking the option
Write value labels to JMP files in the JMP Options section of the Options dialog box.
Writing JMP files with the Command Processor
Since the same standard extension is used for all versions of JMP files, it is necessary to use a file-type tag to distinguish them when writing JMP files with the command processor.
Version 6 is written by default. That is, if the output is a Version 6 JMP file that has the standard
extension .jmp, no file-type tag is needed. If you want to write a Version 3 JMP file, then the file
name must be preceded by the file-type tag jmp3. For Version 4 and 6 JMP files, the file name must
be preceded by the file-type tag jmp4.
Missing Values
JMP has a missing value for each numeric type. Stat/Transfer recognizes these on reading JMP files
and will write them to JMP output.
Target Type       Output Type
float, double     Numeric
byte              Integer1
int               Integer2
long              Integer4
string            Character
date              Date
time              Time
date/time         Date/time
LIMDEP Files
Stat/Transfer supports all versions of LIMDEP. LIMDEP is an econometric software program for
the analysis of limited dependent variables.
Standard extension: lpj
Reading LIMDEP Files
LIMDEP has only a single data type, consisting of double precision numbers.
Writing LIMDEP Files
Stat/Transfer enforces LIMDEP's 200-variable limit.
Character variables cannot be exported to LIMDEP, since the program does not support them.
Missing Values
LIMDEP recognizes missing values.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Output to LIMDEP from Stat/Transfer
Target Type                                 Output Type
string                                      Not exported
byte, int, long, float, double,             Number
date, time, datetime
Matlab Files
Matlab matrices through Version 5 are supported on the following platforms:
Macintosh
Windows & OS/2
Unix HP/Sun/IBM
Matlab datasets for version 7+ are also supported.
Standard extension: mat
Reading Matlab Files
Stat/Transfer will automatically recognize an input file's platform of origin.
Matlab does not write variable names into the file. Therefore Stat/Transfer makes up a variable name
for each column, coln, which consists of the string col plus the number of the column.
All values in a given matrix are of a single type, usually double precision, although particularly large
matrices may be written out by Matlab in integer format, when this is possible.
Strings are not supported in Matlab files.
Stat/Transfer does not read complex matrices.
Writing Matlab Files
On output, Stat/Transfer writes files that can be read by any version of Matlab.
Stat/Transfer always writes Matlab files in double precision.
Stat/Transfer does not write complex matrices.
By default, Stat/Transfer will write dates, times or datetime variables into Matlab as numbers. Dates
are serial date numbers with a base of Jan 1, 0000. Times are fractions of the day and date/time values
are the sum of these two components.
You will need to use internal Matlab functions to turn these numbers into something readable. On the
other hand, if you are not planning to do any computations on your dates, you can check the option
Write Matlab Date/Time as strings in the Output Options(1) section of the Options dialog box
and Stat/Transfer will write your date values as formatted strings in your Matlab file.
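If the numeric dates end up being processed outside Matlab, the serial numbers described above can be decoded with a few lines of code. The following Python sketch is illustrative only: it assumes the standard Matlab serial date convention (day 1 is January 1 of year 0000), and the function name and the offset constant are hypothetical names chosen for this example, not part of Stat/Transfer.

```python
from datetime import datetime, timedelta

# Assumption: Matlab serial date numbers exceed Python's proleptic Gregorian
# ordinals by 366 days (Matlab 730486 and Python ordinal 730120 are both 2000-01-01).
MATLAB_ORDINAL_OFFSET = 366


def matlab_datenum_to_datetime(datenum: float) -> datetime:
    """Convert a Matlab serial date number (date plus day fraction) to a datetime."""
    whole_days = int(datenum)            # the date component
    day_fraction = datenum - whole_days  # the time component, as a fraction of a day
    return (datetime.fromordinal(whole_days - MATLAB_ORDINAL_OFFSET)
            + timedelta(days=day_fraction))
```

For example, a value with a fractional part of 0.5 decodes to noon on the corresponding date. Inside Matlab itself, the built-in date functions remain the natural choice.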
Missing Data
Matlab supports missing values.
Target Type                                 Output Type
int, long, float, double,                   Double
date, time, date/time
string                                      Not written
Mineset Files
Mineset is a data visualization package from Silicon Graphics, Inc. The published and portable data
format is tab-delimited ASCII, stored in a file with a .DATA extension and described by a dictionary in
a file with a .SCHEMA extension.
Standard extensions: schema, data
Reading Mineset files
When using Mineset data as input for a transfer, give the file with the .schema extension (the
dictionary file) as the input file. Stat/Transfer will then look in the same directory for a file of the
same name, but with the .data extension and will read the input data from this file.
Some of the more exotic features of Mineset files such as arrays and enumerations are not supported.
Writing Mineset files
In the Output File Specification of the Transfer dialog box, specify a name for the .schema file.
The data file will then be written to the same directory with a .data extension.
Missing Values
Mineset uses ? for missing numeric values. Stat/Transfer recognizes this on input of Mineset files
and writes it on output to Mineset files.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Target Type         Output Type
byte, int, long     Integer
float               Float
double              Double
string              String
time                Time
date, date/time     Date
Minitab Worksheets
Minitab is a general statistics package from Minitab, Inc. Stat/Transfer will read Minitab worksheets
written by Versions 8 - 12 of Minitab and writes Version 11 worksheets.
Standard extension: mtw
Reading Minitab Worksheets
Stat/Transfer reads Minitab columns. It does not read the constants or matrices that may be stored in
Minitab worksheets.
Version 12 of Minitab introduced a new project file type that may contain several worksheets along
with other data. Stat/Transfer does not extract separate worksheets from these project files. You will
need to use Minitab to save the worksheets you wish to transfer as separate worksheet files.
Stat/Transfer reads Minitab Versions through 14.
Writing Minitab Worksheets
Stat/Transfer writes Minitab columns.
Missing Values
Minitab recognizes missing values.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Output to Minitab from Stat/Transfer
Target Type                         Output Type
byte, int, long, float, double      Number
string                              Text
date, time, date/time               Date/time
Mplus Files
Stat/Transfer supports exporting to Mplus Version 6-7 for Windows. Mplus is a latent variable
modeling program. See www.statmodel.com.
Standard extensions: inp, dat
Missing Values
Mplus recognizes missing values, which are written as '.'.
Target Type                         Output Type
string                              Not exported
byte, int, long, float, double,     Number (with a precision of up to 15 decimal places).
date, time, datetime                Dates are written as serial day numbers with a base of
                                    December 31, 1899. Times are written as fractions of a day.
NLOGIT Files
Stat/Transfer supports all versions of NLOGIT for Windows. NLOGIT is an econometric software
program for the analysis of discrete choice data.
Standard extension: lpj
Reading NLOGIT Files
NLOGIT has only a single data type, consisting of double precision numbers.
Writing NLOGIT Files
Stat/Transfer enforces NLOGIT's 200-variable limit.
Character variables cannot be exported to NLOGIT, since the program does not support them.
Missing Values
NLOGIT recognizes missing values.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Target Type                                 Output Type
string                                      Not exported
byte, int, long, float, double,             Number
date, time, datetime
For ODBC, the Stat/Transfer user interface will present a list of installed data sources, instead of the
Open or Save dialog boxes. For details on using ODBC data sources with the command processor,
see Page 77.
Stat/Transfer can read either single tables or multiple tables that are joined in a view.
Writing
On output, tables can be created in a new file, new tables can be created in an existing file, or existing
tables can be overwritten with a new table.
In addition, new data can be appended to an existing database table. This option is off by default and
must be turned on by using the Append to Access and ODBC Tables option in the ODBC/Access
Options section of the Options dialog box or the SET command DB-TABLE-APPEND in the
command processor.
Stat/Transfer will match as many variables as possible to those already in the table and add your data
to the matching columns. Obviously at least one column must match exactly and, in addition, the
table must be free of constraints, such as those requiring unique keys, that would prohibit a simple
append operation.
Missing Values
Support for missing values depends on the ODBC driver you are using. However, in most cases,
missing values are supported.
Target Type     Output Type
byte, int       Smallint
long            Integer
float           Real
double          Double
date            Date
time            Time
date/time       Timestamp
string          Character
OpenDocument Spreadsheets
The OpenDocument format for spreadsheets is an ISO standard XML format for spreadsheet data. It
is supported by numerous applications, including OpenOffice.org, LibreOffice, and Google Docs.
Standard extension: ods
Worksheet database files are structured worksheets where each row is a single case and each column
contains a variable. Data can consist of numbers (including serial date numbers), labels, or formulas.
The first non-blank row of a worksheet database file usually has strings in each column that give the
names of the variables. The data then begins in the next row. However, variable names may be in
different rows or not present at all.
You have several options to specify what part of an input worksheet to read and how to read variable
names. An option allows you to read variable labels from the row after the variable names.
Data Range
You can choose different ranges to be read in input worksheets by using the drop-down menu for
Data Range in the Worksheets section of the Options dialog box or using the SET command,
WKS-DATA-RANGE.
If you use Autosense (the default), Stat/Transfer will read to the first non-blank cell and use that as
the upper left corner of the data range. It will then read data until it encounters an entirely blank line.
This is the default behavior. You can change the behavior with respect to blank rows by using the
Blank Rows option.
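The row-handling part of the Autosense behavior described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: a worksheet is modeled as a list of rows, a blank cell is None or an empty string, and only rows (not the left-hand column of the range) are sensed. The function name is hypothetical and does not come from Stat/Transfer.

```python
def autosense_rows(rows):
    """Return the rows from the first non-blank row up to the first entirely blank row."""

    def is_blank(row):
        # Assumption: a blank cell is represented as None or an empty string.
        return all(cell is None or cell == "" for cell in row)

    # Skip leading blank rows to find the top of the data range.
    start = next((i for i, row in enumerate(rows) if not is_blank(row)), None)
    if start is None:
        return []

    selected = []
    for row in rows[start:]:
        if is_blank(row):
            break  # default behavior: stop at the first entirely blank row
        selected.append(row)
    return selected
```

In this sketch any data after the first blank row is ignored, which mirrors the default described above.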
Rather than have Stat/Transfer automatically sense the number of rows to be read, you can use the
other options for Data Range to specify a range, either by giving a named range or by giving explicit
coordinates.
When a range has been specified, Stat/Transfer's treatment of entirely blank rows will also be
overridden. They will be returned in your output data and, in addition, blank rows at the end of the
worksheet, through the last row of the specified range, will also be returned.
Determining Variable Names
By default, Stat/Transfer will attempt to determine whether or not the data in the first non-blank row
(or the first row of a specified range) are variable names or the first row of data. It does so by
looking for at least one column in which there is a string in the first row and a number in the
second.
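The heuristic just described can be sketched in Python as follows. This is only an illustration of the stated rule, not Stat/Transfer's actual code; the function name and the representation of rows as Python lists are assumptions made for the example.

```python
def first_row_holds_names(first_row, second_row):
    """Return True if at least one column has a string in the first row
    and a number in the second row."""
    return any(
        isinstance(top, str)
        and isinstance(below, (int, float))
        and not isinstance(below, bool)  # exclude booleans, which Python treats as ints
        for top, below in zip(first_row, second_row)
    )
```

With a worksheet that holds only string data, no column satisfies the test, so the first row would be taken as data. This is the case in which the default is inappropriate.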
If this behavior is inappropriate for your worksheet (for example, if you have only string data), you
can specify different options in the Field Name Row drop-down menu of the Worksheets section of
the Options dialog box. You can specify that variable names must be taken from the first non-blank
row, that they be taken from a specific row or that the worksheet does not contain variable names, so
that Stat/Transfer should assign them ('col1' through 'coln'.)
After identifying the label row, Stat/Transfer will look at the entire column. If any cell contains a
string, the entire column will be treated as string data (with numbers and dates converted to strings).
If the column is all numeric or blank, it will be treated as a numeric column.
The width of the column for each numeric variable and the format of the first non-blank data cell in
that column are used, where possible, to set the default target, or output, types for the numeric
variables. If the format of the first data row has any decimal places (for example, F(2)), the target
type will be 'float'. On the other hand, if the cell format has no decimal places (for example, F(0)) the
target types will be various flavors of integers which depend on the column width. If the column
width is less than three, the target type will be 'byte'. If the column width is less than five, the target
type will be 'int'. Otherwise the target type will be 'long'. Any date format in the first data row will
set the target type to 'date'.
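The rules above for choosing a default target type for a numeric column can be summarized in a small Python sketch. The function and parameter names are hypothetical, and the sketch assumes that the number of decimal places, the column width and the presence of a date format have already been read from the first non-blank data cell.

```python
def default_target_type(decimals, width, is_date=False):
    """Pick a default target type from cell format information.

    decimals: number of decimal places in the cell format
    width:    column width
    is_date:  True if the first data cell has a date format
    """
    if is_date:
        return "date"
    if decimals > 0:
        return "float"
    # No decimal places: choose an integer type from the column width.
    if width < 3:
        return "byte"
    if width < 5:
        return "int"
    return "long"
```

For instance, a column with format F(0) and a width of 4 would default to 'int', while any column with format F(2) would default to 'float'.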
The maximum width of character variables is determined by examining the widths of all of the strings
in a column.
Stat/Transfer is lenient in typing variables from worksheets. If it is expecting a character variable and
it encounters a number it will convert it to a string.
Combining Multiple Input Worksheets into a Single Output File
The option Concatenate Worksheet Pages, found in the Worksheets section of the Options dialog box,
allows you to combine worksheet pages into a single output file. This option is appropriate if your
worksheet contains many sheets that are identical in structure. These can then be combined into a
single output file of any type.
For example, you may have a workbook that has 50 sheets, with one sheet for each state and the same
variables on each sheet. If you check this box, Stat/Transfer will effectively combine the sheets into
one large input file, dropping the field names, if necessary, on the second and higher sheets. You will
end up with the data from all of your worksheet pages in a single output file.
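Conceptually, the concatenation described above works as in the following Python sketch. It assumes that every sheet is a list of rows and that each sheet starts with the same row of field names. The function name is hypothetical.

```python
def concatenate_sheets(sheets):
    """Combine identically structured sheets into one list of rows.

    The field-name row is kept from the first sheet and dropped from
    the second and higher sheets.
    """
    if not sheets:
        return []
    combined = list(sheets[0])       # names plus data from the first sheet
    for sheet in sheets[1:]:
        combined.extend(sheet[1:])   # data rows only
    return combined
```

In the 50-state example, the result would hold one row of field names followed by the data rows of all 50 sheets.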
Missing Data
On input, blank cells and cells containing labels consisting of a single dot are read as missing.
If there is a string in your worksheet, such as 'NA', that you would like to have treated as a numeric
missing value, you can specify it using the Numeric Missing Value option in the Worksheets
section of the Options dialog box.
When transferring data to worksheets from other formats, missing values will be written out to
worksheets as blank cells.
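The input rules for missing data can be expressed as a small Python predicate. This is a sketch under assumptions: blank cells are modeled as None or empty strings, surrounding whitespace is ignored (the text above does not say whether Stat/Transfer does this), and numeric_missing stands in for the string supplied through the Numeric Missing Value option.

```python
def is_missing(cell, numeric_missing=None):
    """Return True if a worksheet cell should be read as a missing value."""
    if cell is None:
        return True
    if isinstance(cell, str):
        text = cell.strip()  # assumption: ignore surrounding whitespace
        if text in ("", "."):
            return True      # blank cell or a label consisting of a single dot
        return numeric_missing is not None and text == numeric_missing
    return False
```

With numeric_missing set to 'NA', a cell containing 'NA' is treated as missing. Without the option, the same cell is an ordinary string.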
Target Type                         Output Type
byte, int, long, float, double      Numeric cell (formatted if information is available)
string                              Label
date                                Formatted Date
time                                Formatted Time
date/time                           Formatted Time Stamp
OSIRIS Files
OSIRIS is a general purpose statistical package written for use on IBM mainframes. It is no longer
actively supported. However an enormous store of survey data are available in OSIRIS format from
the Inter-University Consortium for Political and Social Research (ICPSR) at the University of
Michigan. For this reason, Stat/Transfer will read, but not write OSIRIS data.
An OSIRIS data set consists of a dictionary file and a data file.
Standard extensions: dict, data
Paradox Tables
Because Paradox stores numbers in binary rather than character representation and because it
explicitly supports missing values, it is a much more suitable file format for statistical data than the
dBASE format.
Stat/Transfer reads Versions 4-9 and writes a version that is compatible with 7-9.
Standard extension: db
Reading Paradox Files
Paradox variable names can be up to 25 characters in length.
Paradox's date format is supported on input.
Writing Paradox Files
Stat/Transfer stores numbers into Paradox's integer format if they will fit. If not, it uses double
precision representation. Paradox's date format is supported on output.
Missing Data
Paradox supports missing values for all data types.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Target Type             Output Type
byte, int               Short
long, float, double     Numeric
string                  Alphanumeric
date                    Date
time                    Time
date/time               Timestamp
Quattro Pro Worksheet Files
Worksheet database files are structured worksheets where each row is a single case and each column
contains a variable. Data can consist of numbers (including serial date numbers), labels, or formulas.
The first non-blank row of a worksheet database file usually has strings in each column that give the
names of the variables. The data then begins in the next row. However, variable names may be in
different rows or not present at all.
You have several options to specify what part of an input worksheet to read and how to read variable
names. An option allows you to read variable labels from the row after the variable names.
Data Range
You can choose different ranges to be read in input worksheets by using the drop-down menu for
Data Range in the Worksheets section of the Options dialog box or using the SET command,
WKS-DATA-RANGE.
If you use Autosense (the default), Stat/Transfer will read to the first non-blank cell and use that as
the upper left corner of the data range. It will then read data until it encounters an entirely blank line.
You can change the behavior with respect to blank rows by using the Blank Rows option.
Rather than have Stat/Transfer automatically sense the number of rows to be read, you can use the
other options for Data Range to specify a range, either by giving a named range or by giving explicit
coordinates.
When a range has been specified, Stat/Transfer's treatment of entirely blank rows will also be
overridden. They will be returned in your output data and, in addition, blank rows at the end of the
worksheet, through the last row of the specified range, will also be returned.
Determining Variable Names
By default, Stat/Transfer will attempt to determine whether or not the data in the first non-blank row
(or the first row of a specified range) are variable names or the first row of data. It does so by
looking for at least one column in which there is a string in the first row and a number in the
second.
If this behavior is inappropriate for your worksheet (for example, if you have only string data), you
can specify different options in the Field Name Row drop-down menu of the Worksheets section of
the Options dialog box. You can specify that variable names must be taken from the first non-blank
row, that they be taken from a specific row, or that the worksheet does not contain variable names, so
that Stat/Transfer should assign them ('col1' through 'coln').
Determining the Data Types and Widths
After identifying the label row, Stat/Transfer will look, if necessary, at the entire column to find the
first non-blank cell in order to determine the data type of each column. If the first non-empty data
cell of a particular column is a number (or a label with a single period), Stat/Transfer will transfer the
column as a number. If the data cell contains a label, the variable will be transferred as a string.
The width of the column for each numeric variable and the format of the first non-blank data cell in
that column are used, where possible, to set the default target, or output, types for the numeric
variables. If the format of the first data row has any decimal places (for example, F(2)), the target
type will be 'float'. On the other hand, if the cell format has no decimal places (for example, F(0)),
the target types will be various flavors of integers which depend on the column width. If the column
width is less than three, the target type will be 'byte'. If the column width is less than five, the target
type will be 'int'. Otherwise the target type will be 'long'. Any date format in the first data row will
set the target type to 'date'.
The maximum width of character variables is determined by examining the widths of all of the strings
in a column.
Stat/Transfer is lenient in typing variables from worksheets. If it is expecting a character variable and
it encounters a number it will convert it to a string.
Combining Multiple Input Worksheets into a Single Output File
The option Concatenate Worksheet Pages, found in the Worksheets options in the Options dialog
box, allows you to combine worksheet pages into a single output file. This option is appropriate if
your worksheet contains many sheets that are identical in structure. These can then be combined
into a single output file of any type.
For example, you may have a workbook that has 50 sheets, with one sheet for each state and the same
variables on each sheet. If you check this box, Stat/Transfer will effectively combine the sheets into
one large input file, dropping the field names, if necessary, on the second and higher sheets. You will
end up with the data from all of your worksheet pages in a single output file.
Writing Quattro Worksheet Files
On output, by default, Stat/Transfer will write variable labels in the first row of the worksheet. Data
values will be placed in the second and succeeding rows. You can change this behavior in the
Worksheets options in the Options dialog box.
Column widths and formats will be determined by the variable information available. Dates and
character variables are straightforward. For numerical data, information on the width and number of
decimal places of variables, where available, is used to set the column widths and formats.
Missing Data
On input, blank cells and cells containing labels consisting of a single dot are read as missing.
If there is a string in your worksheet, such as 'NA', that you would like to have treated as a numeric
missing value, you can specify it using the Numeric Missing Value option in the Worksheets section
of the Options dialog box.
When transferring data to worksheets from other formats, missing values will be written out to
worksheets as blank cells.
Target Type                         Output Type
byte, int, long, float, double      Numeric cell (formatted if information is available)
string                              Label
date                                Serial date number (with date format)
time                                Time fraction (with date format if available)
date/time                           Date number + time fraction (with date/time format if available)
R
R is a free, open-source environment for statistical computing and graphics. Stat/Transfer will read
and write workspace files for R versions 2 and 3.
Standard extension: rdata
Reading R files
The R file format is very unstructured and allows the user to write almost anything into it. Therefore
Stat/Transfer imposes a few restrictions on input files. Specifically, your data file should contain at
least one of the following kinds of objects.
two dimensional matrices
vectors
factors
dataframes
Stat/Transfer can read R files in either binary or the more common ASCII format. Compressed files
are recognized, but not supported. You should decompress these first with another program such as
gzip or Winzip.
Factors
Factors in R consist of a vector of zero-based numeric values and a vector of string labels that are
mapped onto the values. If the input file contains factors, you can choose to have these written to an
output file as the numeric values and their value labels or you can write them as strings. This option
is controlled in the R and S-Plus Options section of the Options dialog box. If you are going to a
package such as Stata or SPSS that supports value labels, the first option is more appropriate.
Writing R files
On output, Stat/Transfer writes an R dataframe. If your input data set does not have a variable named
rownames, Stat/Transfer will create an extra variable containing the case number, stored as an
integer variable and named rownames.
Stat/Transfer writes R data in ASCII format, which is compatible across platforms.
Missing Data
R supports missing values. On input, missing values are converted to the internal missing value in
Stat/Transfer. On output, missing values are converted to the value appropriate for each variable
type.
Version one string missing
If you are using Version One of R, check the option Write version one string missing and the
appropriate value will be written for null or missing strings. If you are using a later version of R,
leave this unchecked.
Output Variable Types
The output variable type that results from each target variable type is given in the following table:
Target Type         Output Type
byte, int, long     Integer
float               Real
double              Double
date                Date
time, date/time     POSIX timestamp (note the timezone is not set, so the time will be in GMT)
RATS Files
RATS (Regression Analysis of Timeseries) is a general-purpose econometric and time series analysis
package. Stat/Transfer can read and write RATS Version 7-8 files.
Standard extension: rat
Missing Values
Missing values are supported.
Target Type
Output Type
byte
int
long
float
double
date
time
date/time
Standard extensions:
Version 6
   PC/DOS               ssd
   Windows & OS/2       sd2
   Unix HP/Sun/IBM      ssd01
Versions 7-9            sas7bdat
When writing SAS data files, you should pick an output format that is appropriate for the version of
SAS that will be reading the file.
Value labels can be transferred according to the procedure on Page 135.
Missing Data
SAS supports the missing values A - Z, . and ._.
Input
On input, when a SAS file is transferred to a Stata file or an ASCII file with extended missing values
specified, the SAS input missing values will be transferred to the equivalent ones in the output file,
with the exception that any missing values '._' in the input SAS file will be written out as '.' in the
output.
For all other output formats, all SAS missing values are converted to a single internal missing value
in Stat/Transfer.
Output
When either an ASCII file or a Stata file with extended missing values is transferred to a SAS file, the
input missing values will transfer to the equivalent ones in the SAS output file.
For input files that support user missing values (SPSS and OSIRIS), the options User Missing Value
and Map to extended (a-z) missing in the User Missing Values section of the Options dialog box
can be used to map selected user missing values to extended missing values in the SAS output file.
For all other input file formats, on output to SAS, missing values are set to '.' (the SAS standard
missing value).
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Output to SAS Data Files from Stat/Transfer
Target Type
Output Type
byte
int
long
float
double
string
date
time
date/time
You can choose whether and how formats are to be read by using the Reading SAS Value Labels
options of the Options dialog box. See Page 37.
Do not read formats
Check this option, which is the default, if no formats will be read.
Read directly from a catalog file
Choose this option if you have a Windows SAS catalog file containing formats and wish to read
them. See Page 37.
If this option is checked, the entry %ipath%/catalog.sas7bcat will automatically appear on the SAS
catalog name line. This instructs Stat/Transfer to look for a file named formats.sas7bcat in the same
directory as your data file. You can change the path if your file is in a different location.
Read from a catalog in CPORT library
Check this option if you have your formats in a CPORT catalog.
If this option is checked, the entry %ipath%/%iname%.stc will automatically appear on the CPORT
library name line. This instructs Stat/Transfer to look for a file with the name (with extension .stc)
and directory of your input file, since you will most probably be reading your formats from the same
file as the data. If you are reading formats and data from separate files, then you can give the name
of the file on this line. You can type in a complete file specification or you can use the macros below
as part of the file specification.
%ipath% The path, including the directory, of the input file
%iname% The name, without the extension, of the input file
%iext% The extension, without the dot, of the input file
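To make the three macros concrete, the following Python sketch shows how a file specification containing them might expand for a given input file. It only illustrates the definitions above: the function name is hypothetical, forward-slash paths are assumed, and the input file name in the usage note is invented for the example.

```python
from pathlib import PurePosixPath


def expand_macros(template, input_file):
    """Replace %ipath%, %iname% and %iext% in a file specification."""
    path = PurePosixPath(input_file)
    values = {
        "%ipath%": str(path.parent),         # directory of the input file
        "%iname%": path.stem,                # name without the extension
        "%iext%": path.suffix.lstrip("."),   # extension without the dot
    }
    for macro, value in values.items():
        template = template.replace(macro, value)
    return template
```

Under these assumptions, the default entry %ipath%/%iname%.stc for an input file C:/data/survey.sas7bdat expands to C:/data/survey.stc.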
If your formats are in a member with the default name formats, you need not specify anything more.
If not, uncheck the Use default catalog name box and press the Read Library button. You will then
be able to select the member that contains your formats.
Read a SAS datafile
Choose this option if you wish to read formats from a SAS datafile produced by SAS using PROC
FORMAT. When this option is checked, you must have both the SAS input file and a separate file
created by SAS that contains the formats. By default, Stat/Transfer will look in the same directory as
the input file for a file named sas_fmts.ext, where .ext is the extension of your input file. This is the
name that the file will have if you have created it by the procedure below.
If you wish to have Stat/Transfer look for a file located somewhere else or with a different name, you
can change it in the SAS dataset name box. You can type in a complete file specification or you can
use the macros given above as part of the file specification.
Read from a dataset in a CPORT library
Choose this option if you have your formats in a dataset in a CPORT library produced by PROC
FORMAT.
If this option is checked, the entry %ipath%/%iname%.stc will automatically appear on the CPORT
library name line. This instructs Stat/Transfer to look for a file with the name (with extension .stc)
and directory of your input file, since you will most probably be reading your formats from the same
file as the data. If you are reading formats and data from separate files, then you can give the name
of the file on this line. You can type in a complete file specification or you can use the macros above
as part of the file specification.
If your formats are in a dataset with the default name sas_fmts, you need not specify anything more.
If not, uncheck the Use default catalog name box and press the Read Library button. You will then
be able to select the dataset that contains your formats.
Read from a SAS Transport file
Choose this option if your formats are in a dataset in a SAS Transport file produced by PROC FORMAT.
If this option is checked, the entry %ipath%/%iname%.tpt will automatically appear on the
Transport library name line. This instructs Stat/Transfer to look for a file with the name (with
extension .tpt) and directory of your input file, since you will most probably be reading your formats from the
same file as the data. If you are reading formats and data from separate files, then you can give the
name of the file on this line. You can type in a complete file specification or you can use the macros
above as part of the file specification.
If your formats are in a subfile with the default name sas_fmts, you need not specify anything more.
If not, uncheck the Use default member name box and press the Read Library button. You will
then be able to select the subfile that contains your formats.
Setting the Appropriate Options with the Command Processor Interface
SET commands are used to choose whether and how formats are to be read. See the section SAS
Value Labels-Reading in the Available Options section of the command processor discussion.
Creating a File with the PROC FORMAT Statement
To create the new file for Stat/Transfer to use for reading your SAS value labels, you will need to go
into SAS and run the following small program:
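/* Point the libref at the directory that holds the input data file */
libname mylib 'path' ;
/* Write the format definitions to the data set mylib.sas_fmts */
proc format library = mylib
     cntlout = mylib.sas_fmts ;
run ;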
where path is the directory that contains your input data file.
This procedure creates a SAS file in the directory path that has the format information for each SAS
data file. In this case, the file will have the name sas_fmts.ext, where .ext is the extension of the input
SAS file, and it will be found in the same directory as the input file.
To put your formats in a CPORT library, use PROC CPORT after the above procedure.
To create a format library in a Transport file, use the same procedure as above but reference the
XPORT engine in the output libname statement.
Transferring Value Labels with Data
When you carry out a transfer with a SAS data file as input, Stat/Transfer will check to see if you
have checked the option Read directly from a catalog file. If so, the formats will be transferred
automatically from the catalog file named in the SAS catalog name box.
If you check any of the other options, Stat/Transfer will look for the file that you have specified and
the formats will be transferred automatically.
Restrictions on Importing Value Labels
SAS catalog files not only support conventional value labels (the one-to-one mapping of a string to a
single number), but also the mapping of a range of numeric values to a single string (for example, zip
code mapped to state).
Because this latter form has no analog in any of the packages supported by Stat/Transfer, only
conventional one-to-one value labels are imported from SAS.
Options controlling how labels are processed
You can control how Stat/Transfer will behave if all does not go smoothly.
Continue if the format file is not found
By default, processing will stop if a file containing the formats is not found. Change this behavior
by checking this option.
Continue if there is an error processing formats
By default, processing will stop if there is an error reading the format file or if no tags in the dataset
are matched to formats in the file. Checking this option instructs Stat/Transfer to continue processing
even if there is an error processing the formats.
Writing SAS Value Labels
The separate catalog file in which SAS stores value labels, besides being undocumented, can be
shared among several SAS data sets. This makes it problematic to update a SAS catalog file using an
external program. When writing value labels for SAS data files, rather than update the catalog file
directly, Stat/Transfer creates a SAS program that you can run in order to have SAS update its catalog
file.
Setting the Appropriate Options
You will first need to check the appropriate box, Write a Proc Format program, on the Writing
SAS Value Labels section of the Options dialog box, to enable the writing of SAS labels.
The program file will by default be written to the same directory as your output data file, with the
same name as your output file, but with the extension .sas. You can change this default in the
Filename edit box. You can type in a complete file specification here, if you wish. See Writing SAS
Value Labels, Page 39 for details.
When you carry out a transfer with a SAS data file as output, Stat/Transfer will check to see if you
have checked the option Write a Proc Format program. During the transfer, it will then write a
SAS PROC program file with the name taken from the Filename line. In this case, assuming that
you have used the default name and directory, it will create a program file named outfilename.sas in
the same directory as the output SAS data file.
For example, if you write a SAS data file called out.sas7bdat, the SAS program will be in the same
directory as the output file and will be named out.sas.
The program file has the PROC FORMAT and MODIFY statements necessary to create the SAS
catalog file. Once the program file has been created, you can run it in SAS.
SAS CPORT
The SAS CPORT is primarily used for transporting data libraries between machines. Stat/Transfer
will read, but not write, Windows CPORT files for versions higher than SAS Seven.
Standard extension: stc
Missing Data
Stat/Transfer supports the SAS missing values A - Z, . and ._.
On input, when a SAS file is transferred to a Stata file or an ASCII file with extended missing values
specified, the SAS input missing values will be transferred to the equivalent ones in the output file,
with the exception that any missing values '._' in the input SAS file will be written out as '.' in the
output.
For all other output formats, SAS missing values are converted to a single internal missing value in
Stat/Transfer.
The resulting transport file can then be used for a Stat/Transfer data transfer.
If a transport file has been produced by Stat/Transfer, it can be read in SAS with the following:
/* read transport file trans - write system file new */
libname trans xport file-specification;
libname new file-specification;
proc copy in=trans out=new ;
run;
Note that you should not use PROC CPORT to write files that are to be read by Stat/Transfer. This
procedure creates files in an entirely different and incompatible format.
Reading SAS Transport Files
More than one data set may be stored in a single transport file. If Stat/Transfer finds more than one
data set in a file, it will allow you to select the one you want.
On input, when a SAS file is transferred to a Stata file or an ASCII file with extended missing values
specified, the SAS input missing values will be transferred to the equivalent ones in the output file,
with the exception that any missing values '._' in the input SAS file will be written out as '.' in the
output.
For all other output formats, all SAS missing values are converted to a single internal missing value
in Stat/Transfer.
Output
When either an ASCII file or a Stata file with extended missing values is transferred to a SAS file, the
input missing values will transfer to the equivalent ones in the SAS output file.
For input files that support user missing values (SPSS and OSIRIS), the options User Missing Value
and Map to extended (a-z) missing in the Options(1) dialog box can be used to map selected user
missing values to extended missing values in the SAS output file. For all other input file formats, on
output to SAS, missing values are set to ., the SAS standard missing value.
Output Type
byte
int
long
float
double
string
date
time
date/time
S-PLUS Files
Stat/Transfer will read and write S-PLUS data sets. Files written on 64 bit machines such as the DEC
Alpha are not supported.
Standard extension: [none]
Reading S-PLUS files
Because the S-PLUS file format is so unstructured that it allows the user to write almost anything,
including code, into it, Stat/Transfer imposes a few restrictions on input files. Specifically, your data
should be in one of the following formats:
vectors
factors
dataframes
Factors
Factors in S-PLUS consist of a vector of zero-based numeric values, and a vector of string labels that
are mapped onto the values. If the input file contains factors, you can choose to have these written to
an output file as the numeric values and their value labels or you can write them as strings. This
option is controlled in the R and S-Plus Options section of the Options dialog box. If you are
going to a package such as Stata or SPSS that supports value labels, the first option is more
appropriate.
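The two ways of writing a factor can be pictured with a small Python sketch. The sample data and variable names are invented for illustration, and the code only mimics the idea described above; it is not how Stat/Transfer itself is implemented.

```python
# Hypothetical factor: zero-based numeric codes plus a vector of string labels.
codes = [0, 2, 1, 0]
levels = ["low", "medium", "high"]

# Option 1: keep the numeric codes and carry the labels along as value labels.
value_labels = dict(enumerate(levels))

# Option 2: replace each code with its label and write a string variable.
as_strings = [levels[code] for code in codes]

print(codes, value_labels)
print(as_strings)
```

The first form suits packages with value labels, since the numeric codes survive; the second loses the codes but is readable in any package.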
Byte Order
S-PLUS writes out its data in the native format of the machine on which it is running. This means
that both the byte order and the width of numbers can vary between machines. On input,
Stat/Transfer will automatically sense the byte order of the machine that wrote the file.
Writing S-PLUS files
On output, Stat/Transfer writes an S-PLUS dataframe. If your input data set does not have a variable
named rownames, Stat/Transfer will create an extra variable containing the case number, stored as
an integer variable and named rownames.
You can choose whether you want to write out a file with low to high byte order, appropriate for such
processors as Intel or DEC, or a file with high to low byte order, for such processors as SPARC, HP,
or Motorola. If you are using the Windows version of S-PLUS, select Intel (low to high) byte order
on output.
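To see what the two byte orders mean in practice, the following Python sketch packs the same double-precision number both ways using the standard struct module. It only illustrates the concept; the value 1.5 is arbitrary, and nothing here reflects the actual layout of an S-PLUS file.

```python
import struct

value = 1.5

# "<" is low to high byte order (Intel); ">" is high to low (SPARC, Motorola).
low_to_high = struct.pack("<d", value)
high_to_low = struct.pack(">d", value)

# The same eight bytes appear in both, simply in reverse order.
print(low_to_high.hex())
print(high_to_low.hex())

# Reading with the matching byte order recovers the original number.
print(struct.unpack("<d", low_to_high)[0])
```

Reading a file with the wrong byte order does not raise an error; it silently yields a different number, which is why choosing the order that matches the target machine matters.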
Missing Data
S-PLUS supports missing values. On input, missing values are converted to the internal missing
value in Stat/Transfer. On output, missing values are converted to the value appropriate for each
variable type.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Target Type
Output Type
byte
int
long
float
double
date
Integer
time
date/time
Character
(written according to the ASCII
format options currently in effect)
Real
Double
Date
Target Type
Output Type
byte
int
long
float
double
date
time
date/time
string
Number
Target Type
byte
int
long
float
double
string
Output Type
Number
date
time
date/time
Date
Character
(written according to the ASCII
format options currently in effect)
String
Stata Files
Stat/Transfer will read and write data for any version of Stata including versions running on Unix and
the Macintosh.
Standard extension: dta
When a Stata file with extended missing values is transferred to a SAS or ASCII file, the input
missing values will transfer to the equivalent SAS or ASCII ones. When a SAS file or ASCII file is
transferred to a Stata file with extended missing values specified, missing values will transfer to
equivalent ones, except that ._ in input SAS files is written out as . in the output.
For input files that support user missing values (SPSS and OSIRIS), the options User Missing Value
and Map to extended (a-z) missing in the Options dialog box can be used to map selected user
missing values to extended missing values in the Stata output file.
Target Type     Output Type
byte            Byte
int             Int
long            Long
float           Float
double          Double
date            Stata Date
time            Float (fractional part of a day)
date/time       Double (Stata date and fractional part of a day)
string          Character
strl            Character or strl
Statistica Files
Stat/Transfer supports Statistica Versions 5 and 7 - 9.
Standard extension: sta
Reading Statistica
Stat/Transfer will read and use Statistica variable and value labels. Each column has a single missing
value, which will be applied.
Writing Statistica
Statistica does not have a string type, so character variables cannot be exported.
Missing Values
Statistica has one missing value for each variable. Stat/Transfer uses this when it reads a Statistica
file. When writing a Statistica file, Stat/Transfer will use a value of -9999 for missing.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Statistica Files
Output Type
Number
Date
Time
Date/time
Not exported
SYSTAT Files
Stat/Transfer writes double precision SYSTAT files. It will read either double or single precision
SYSTAT files.
Standard extension: sys
Reading SYSTAT Files
When Stat/Transfer reads SYSTAT data sets, it processes the variable names by 1) dropping the dollar
signs on character variables and 2) removing the parentheses before and after subscripts. For example,
SCALE(1) becomes SCALE1.
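The renaming rule just described can be approximated in a few lines of Python. This is a sketch of the stated rule only; the function name systat_to_plain_name is hypothetical, and Stat/Transfer may handle edge cases differently.

```python
import re

def systat_to_plain_name(name: str) -> str:
    """Drop dollar signs and the parentheses around numeric subscripts."""
    name = name.replace("$", "")               # character variables carry a "$"
    return re.sub(r"\((\d+)\)", r"\1", name)   # SCALE(1) -> SCALE1

print(systat_to_plain_name("SCALE(1)"))
print(systat_to_plain_name("NAME$"))
```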
Writing SYSTAT Files
Any variable name in the source data set which contains a left parenthesis followed by a number will
be transferred into a SYSTAT subscripted variable.
Users should note that the SYSTAT error message, You are trying to read an empty file, will occur
when SYSTAT cannot find a data file. Your SYSTAT files should be in the default drive or directory.
Missing Data
SYSTAT supports missing values.
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Output to SYSTAT from Stat/Transfer
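Target Type     Output Type
byte            Number
int             Number
long            Number
float           Number
double          Number
string          Character
date            Date
time            Character (written according to the ASCII format options currently in effect)
date/time       Character (written according to the ASCII format options currently in effect)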
Triple-S
Triple-S is an open standard for the transfer of survey data and its meta-data between software packages.
For information on the standard and on other packages that support Triple-S see www.triple-s.org.
Stat/Transfer supports Triple-S XML Version 2. The data are contained in two files. The dictionary
information is stored in a file with the extension .xml and the data are stored in a separate file with
extension .asc.
Standard extension: xml, asc
Reading Triple-S
Stat/Transfer will read and use Triple-S variable and value labels. Version 2 of the standard supports
both fixed format and comma-separated data files. Stat/Transfer will read either. Triple-S supports
alternative sets of text for variable and value labels. These are called modes. For example, you can
have one mode for analysis and the other for interviewing. If more than one mode is found, you will
be allowed to select it by using the Table selector or the -t switch in the command processor.
The data file should be in the same directory as the dictionary file (.xml) and should have the extension
.asc.
Writing Triple-S
Stat/Transfer writes delimited Triple-S files.
Missing Values
Triple-S simply represents missing with a blank field. Numeric fields that cannot be converted to
numbers are treated as missing. On output, missing values are written as blanks.
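The missing-value convention above can be sketched in Python as follows. The helper names read_numeric_field and write_field are hypothetical, None stands in for a missing value, and the exact set of strings that Stat/Transfer accepts as numbers is not specified here, so Python's own float parsing is used as a stand-in.

```python
def read_numeric_field(field: str):
    """Return a float, or None (missing) for blank or non-numeric text."""
    text = field.strip()
    if not text:
        return None            # a blank field is missing
    try:
        return float(text)     # assumption: Python's parsing rules
    except ValueError:
        return None            # text that cannot be converted is missing

def write_field(value) -> str:
    """Write missing values as blanks, everything else as text."""
    return "" if value is None else str(value)

print([read_numeric_field(f) for f in ["12", "  ", "n/a", "3.5"]])
print([write_field(v) for v in [12.0, None]])
```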
Output Variable Types
The output variable types that result from each of the target variable types are given in the following
table.
Output to Triple-S from Stat/Transfer
Target Type     Output Type
byte            single
int             single
long            single
float           quantity
double          quantity
date            Date
time            Time
date/time       Character
string          Character
Q.
How do the date formatting functions know how to write out the proper weekday and month
names for languages other than English?
A.
Stat/Transfer retrieves the localized day and month names from the Windows registry. If they are
correct for your locality, they will be correct in Stat/Transfer.
A. Use the Stat/Transfer Command Processor. It is documented in your manual (and in a separate
chapter in the online help). It will let you do some extremely powerful things such as extract all of the
tables from an Access database in a single command or copy a whole directory full of Excel
spreadsheets to Stata files.
Q.
It's the middle of the night before a crucial deadline and my transfer won't work. What should I
do?
A.
It's like airplane travel. If you can't get a direct flight between Chicago and Los Angeles, try to
get one that stops in Dallas. Consider what formats your destination program will read and the
formats Stat/Transfer will write. Or, if you are having trouble reading a file from another program,
consider any different file formats that it is capable of saving. There is usually more than one route
between your source and your destination.
Q.
A.
For use with general purpose software, probably delimited ASCII is the best. It is the closest
thing to a lingua franca of data transport. Stat/Transfer writes delimited data in accordance with the
standard set by Excel, and that is followed by most software packages. Worksheet files, such as
Excel 97, are widely supported as well and have the advantage of storing numbers in double
precision.
As a general purpose transport format between statistical packages, SPSS binary .sav files and Stata
files will maintain your value and variable labels and missing values. They are also platform independent.
Q.
I want to save my data for use in the future. What should I do?
A.
If you are saving your statistical data for use in the indefinite future, the best thing to do is pick one of
the ASCII files + Programs options. Even if the particular program is no longer available in future
years, you can be assured that some statistical package will be able to read your plain ASCII data and
you will have the information that is necessary to re-construct your dataset. The worst thing to do is
to store your data in a binary format. Also, pick your storage media carefully. Those who stored data
on nine track tapes and decks of cards can no longer read them. We recommend ISO-standard
compact disks for archival storage.
Q.
I have a file in which numeric variables are stored as strings. How can I get Stat/Transfer to
convert these variables to numbers in my output file?
A.
Stat/Transfer will let you change strings to numbers when reading worksheets and ASCII files,
but it will generally not let you do so when reading other file types. You can work around this
limitation by first setting the ASCII File Write option String Quote Character to blank. Then
transfer your file to delimited ASCII. When you read the ASCII file and transfer it to your final
destination format, your numeric variables, which were formerly stored as strings, will be numeric.
Q.
I have a file in which I want some numbers to be transferred as strings. How can I do that?
A.
Write it to an ASCII file with a Stat/Transfer Schema. Then edit the Schema and change the
variables you want to convert to a string type. Then read the file back in.
Q.
A.
If your question is of the form "How do I do X?", please look first in the manual. If that does not
solve your problem, send an email, with your serial number and the exact version you are using, to
[email protected].
On the other hand, if you think you have found a bug in Stat/Transfer, please use the tools on the Log
tab of the user interface. That will allow you to send us a complete description of your problem, your
computing environment and the files that are necessary to reproduce the problem.
Character Encoding
Q. What are Encoding Errors?
A.
Stat/Transfer stores strings internally in Unicode, which is capable of storing all of the characters in
all languages, plus many, many other symbols. Most older character sets are of much more limited
scope. For instance, the most common encoding, ASCII, is only capable of storing a handful of
symbols, letters and numbers, since it has only 128 locations for characters and control codes. Other
single-byte character sets double the amount of storage and allow accented characters and other useful
symbols. There are a number of such single-byte character sets, for instance one is suitable for the
Cyrillic alphabet and another for modern Greek.
When Stat/Transfer reads data, it converts it to Unicode either based on the settings for character sets
in the encoding options, or information written in the input file. If your file does not have information
on the encoding, and is in a character set that is not the default encoding used on your computer, you
must tell Stat/Transfer which encoding to use. For instance, if a Greek colleague sends you a Stata
dataset, you may need to select a Greek character set in order to properly read it and translate it to a
Unicode based system such as Excel. If the dataset contains non-ASCII strings and you do not set the
encoding properly, you will get nonsense on output.
Because all single byte characters can be mapped to Unicode, there are seldom errors on input.
However, you might encounter them if you are reading multi-byte characters such as those for
Japanese.
The most common problems occur on output, when sometimes a character that was read on input has
no mapping to the output character set. For instance, if you read your Greek data set and attempted to
write it to SAS, using your Western European machine default, there would be many encoding errors
because Greek characters in Unicode cannot be mapped to a character set such as latin1.
Some problems are more surprising because it looks as if you are dealing with ASCII, but your file
has some characters that cannot be represented in the output. For instance all Microsoft applications
use Unicode and characters such as the left apostrophe cannot be mapped to common non-Unicode
character sets. The same is true for the Euro sign, which is not present in ISO-8859-1, but is present
in its more modern replacement, ISO-8859-15. If any of these characters are present, you may well
get an encoding error.
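The cases mentioned in this answer can be reproduced with Python's built-in codecs. This is only an illustration of why such characters fail to map; the helper name can_encode is hypothetical, and Stat/Transfer's own conversion and error reporting are not shown here.

```python
def can_encode(text: str, charset: str) -> bool:
    """Report whether every character in text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

euro = "\u20ac"                # Euro sign
left_quote = "\u2018"          # left single quotation mark
greek = "\u03b1\u03b2\u03b3"   # alpha, beta, gamma

print(can_encode(euro, "iso-8859-1"))        # not in latin1
print(can_encode(euro, "iso-8859-15"))       # present in its replacement
print(can_encode(left_quote, "iso-8859-1"))  # no latin1 mapping
print(can_encode(greek, "iso-8859-1"))       # Greek cannot go to latin1
print(can_encode(greek, "iso-8859-7"))       # but fits a Greek charset
print(can_encode("AGE 42", "iso-8859-1"))    # plain ASCII is always safe
```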
Q.
A.
Most statistical data are numbers and ASCII letters. These are properly and unambiguously
represented in all character sets. If that is what is in your data, you will not encounter encoding
errors.
Also, if you are reading from or writing to a Unicode based format, the problem disappears because
every character can be properly represented and there is no ambiguity. Excel and Access are in this
class as well as versions of SPSS higher than 17. SAS Versions 9.1 and above store information
about the encoding in the file and Stat/Transfer can use that to set the encoding. Anytime that
encoding information is stored in the file, this information takes precedence over your option
settings.
character encodings?
See:
http://en.wikipedia.org/wiki/Character_sets
Also, a very lucid discussion of Unicode can be found here:
http://www.joelonsoftware.com/articles/Unicode.html
Command Processor
Q. When I go to the Start menu in Windows and click on Stat/Transfer, I see something called
Stat/Transfer Command Processor. What is it?
A.
The Stat/Transfer Command Processor is a separate program that lets you transfer files without
using the user interface, but rather through simple commands. It can be invaluable if you have a large
number of repetitive transfers or if you wish to do batch transfers. It can be reached from any platform
by typing st at the operating system prompt.
The command processor has been integrated into the user interface, so that you can generate
command files automatically and edit and run command files from the Run Program tab. Check the
Command Processor section of the manual for complete details.
Q.
I have an Access database with over one hundred tables. I want to convert these all to Stata files.
What is the easiest way to do this?
A.
First create a directory for your output files (out, for example). Then enter the Command
Processor (see Question 1). If your input file is in c:\data\in.mdb (assuming a Windows machine)
and your new output directory is c:\data\out, you could use the command
copy c:\data\in.mdb c:\data\out\*.dta -t*
The -t* modifier in this command tells Stat/Transfer to copy all of the tables from in.mdb to the
destination out. The output files will be automatically named with the name of the table and the
extension .dta.
Q. I have fifty dBASE files I would like to move to SAS, what is the easiest way to do this?
A.
Assuming your dBASE files are in data/dbase and you would like your output in data/sas, you
can do this in a single command:
copy data/dbase/*.dbf data/sas/*.sas7bdat
A.
You can do anything in the command processor that you can do in the user interface. The
simplest way to learn to use the command processor is to run a transfer with the user interface and,
when you are done, press the Save Program button to generate a command file for the command
processor. Then examine the program from the Run Program tab. All of the options and commands
are thoroughly documented in the online help or in this manual.
Q. How do I set my options permanently so that I don't have to enter SET commands every time I
start up the command processor?
A. Put your SET commands in a file called profile.stcmd, located in the same directory as
Stat/Transfer.
Licenses
Q.
A.
Our Single User or Workgroup licenses do not allow installation on a server. If you want to share
the use of Stat/Transfer on a network, please consider a multi-user lease.
Q.
A.
Only if the use occurs on a single machine. Please encourage the future development and support
of Stat/Transfer by complying with our license agreement.
Q.
A.
It is simplest to download the latest release from our website, install it, and then re-activate it
with the code you obtained from your disk envelope or original email from our sales department. If
you cannot find your activation code, please send us an e-mail giving us as much detail as possible
about your version, and how, where, and when you purchased Stat/Transfer and we will see if we can
send you an activation code.
Q.
A.
Simply re-install the software, either from your disk or from our website. Then use your activation
code to activate the program. See the Activation section, Page 6.
Q.
A.
The problem is that it takes so little time to do so much work with Stat/Transfer. If we allowed
simultaneous use, one copy could cover a very large workgroup and we would not make enough to
develop the next version. To figure out how many users you have in your workgroup, count the
number of people who are likely to use Stat/Transfer in a one year period and buy enough licenses to
cover that number. On Unix, for those who do not want to bother with this, we have single
machine licenses that will cover an unlimited number of users on a single machine.
Q.
A.
Yes, that is explicitly permitted by our license. If you are the primary user of the software you
can install it on multiple machines. However more than one user cannot use the software on more
than one machine.
Linux
Q.
A. If you have a multi-user license and you are installing the software for others to use, it is a good
idea to use the sudo command or to log in as root. In this case Stat/Transfer will be installed in
/usr/local/stattransfer10 by default. You can, of course, select another location if you wish.
However, if you are installing Stat/Transfer for your own use and have a personal license (a sixteen
character activation key) we recommend that you install it using your customary login id.
Stat/Transfer is built so that all of its shared libraries are in known locations in subdirectories under
the executable's. It thus does not require root privileges for an install and once installed, can be
copied freely on your disk.
Q.
Q.
I can see my data source on Stat/Transfer's list, but I encounter errors when I try to connect to it.
A.
There are many possible causes for this problem. Your ODBC driver may not be properly
installed or configured for your network. Your database server and/or network may be down. You
might not have proper access rights or the proper password for your database.
We suggest you try to connect to your database with another tool, preferably one that is supplied by
your database vendor, or a general tool such as Microsoft Excel. If you still encounter difficulties,
you should first seek support from your local database administrator and/or the vendor who supplied
your database or driver.
Q.
A.
Many database systems allow you to define a view that will appear to Stat/Transfer as a single
table. If your database allows this, it is the simplest and most robust way of joining tables for
Stat/Transfer. If this option is not available, the Stat/Transfer command processor allows you to
submit an SQL select statement. It is simply passed through to your ODBC driver, so it must be legal
SQL for your particular database driver.
datasets using Proc Format with the 'cntlout' option. See the SAS Value Labels section, page 135.
Q.
A.
First, we don't support every platform. Check the sections on SAS Data, CPORT, and Transport
files. If your SAS data are not in one of these formats, we cannot read it, and you should use SAS to
create a SAS Transport file.
Further, we cannot read data that have been encrypted. If your data have been written with encryption,
you must use SAS to copy the file to an unencrypted format.
Finally, if you are moving the file from another platform, make sure that you use a binary, error-correcting, file transfer protocol.
If your file is in the proper format and you still cannot read it, please report the error using the
mechanism on the Log tab of the user interface. The SAS file format has not been publicly
documented and there may be aspects of it that we are not supporting properly. Please let us know
about any problems you are having so that we can fix them for you and others.
Q.
SAS refuses to read the SAS data file created by Stat/Transfer. What should I do?
A.
First, make sure that you are writing the proper kind of file for the flavor of SAS you are using
and that you are transporting the file properly. Then check our website to see if there is a problem that
has already been fixed. If you think you have discovered a new problem, please let us know about it
so we can fix it.
In the meantime, you can always use the Stat/Transfer output option SAS Program + ASCII Data
File and then use the generated SAS program to read the ASCII data file into SAS.
What is the
matter?
A.
Most commonly, particularly when the file is received from others, the problem is that the file is
not really a transport file, but, rather, is another kind of system file. You should examine the first part
of the file, either in an editor or by simply typing it to the screen. The text HEADER
RECORD*********LIBRARY HEADER RECORD should appear at the beginning of the file. If it
does not, it is not a transport file. You should refer to the Stat/Transfer manual or to SAS
documentation to find out how to create a Transport file in SAS.
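If you would rather check a file programmatically than view it in an editor, the test described above can be sketched in Python. The function names are hypothetical, and the sketch assumes that the header text falls within the first 80 bytes of the file. It matches on the two key phrases rather than an exact count of asterisks.

```python
def is_transport_header(first_bytes: bytes) -> bool:
    """Check the leading bytes for the transport-file header text."""
    return (first_bytes.startswith(b"HEADER RECORD*")
            and b"LIBRARY HEADER RECORD" in first_bytes)

def looks_like_sas_transport(path: str) -> bool:
    """Read the start of the file in binary mode and test its header."""
    with open(path, "rb") as handle:
        return is_transport_header(handle.read(80))  # assumed record length

# Example with in-memory bytes rather than a real file:
print(is_transport_header(b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"))
print(is_transport_header(b"not a transport file"))
```

A file that fails this check may still be a valid SAS data file or CPORT file; it simply is not a Transport file.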
S-PLUS Files
Q. Stat/Transfer will not read my S-PLUS file. What is wrong?
A. Remember, some S-PLUS files have very little structure and parts of the data are only meaningful
to S-PLUS. Make sure that your data are in the form of a two-dimensional matrix, a list or a
dataframe.
Stata Files
Q. Why do all of my labels and variable names come out in lower case when I transfer a file to Stata?
A. That's not a bug; it's a feature. We respect the style of the package to which we are
transferring and packages such as S-Plus and Stata favor lower case letters. If you would like to
maintain the case of your variable names and labels, check the Variable Name Case Conversion
options under General Options in the Options dialog box.
Worksheet Files
Q. I have some blank rows in my worksheet.
A.
The reason Stat/Transfer behaves that way is that sometimes users like to put comments or notes
at the bottom of their data block. If they put at least one blank line between the data and the
comments, then by default, Stat/Transfer will read their data and skip the comments with no special
actions on their part.
However, you can change the behavior in one of two ways. In the Options dialog box, you can
either set the Blank Rows option to control reading of blank lines, or you can explicitly set a data
range by using the Data Range option. In the latter case, Stat/Transfer will return all of the rows in
the range you specify. In other words, it assumes that you know what you are doing and will return
blank rows if that is what you want.
Q.
When I read my Excel spreadsheet, sometimes a whole column of numbers gets transferred as a
string variable, even though it contains lots of numbers.
A.
Stat/Transfer examines all of the cells in a column to determine the type. If there are any strings,
the column will be transferred into a string variable and the numbers and dates that are in the column
will be converted to their string representation. This scheme is, in general, what people expect,
particularly for columns of mixed numeric and string identifiers, where the alternative strategy would
make the strings into missing values.
If you have a column that you want to force to numeric, you can check it to make sure that there are
not any strings or numbers formatted with a text format. Alternatively, you can force a type
conversion by using the controls on the Variables dialog box.
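The column-typing rule described in this answer can be sketched in Python. This is a simplified illustration, not Stat/Transfer's code: the function names are hypothetical, cells are modeled as Python values, and treating None and empty strings as blanks that do not affect the type is an assumption.

```python
def infer_column_type(cells):
    """Return 'string' if any non-blank cell is a string, else 'numeric'."""
    non_blank = [c for c in cells if c is not None and c != ""]
    if any(isinstance(c, str) for c in non_blank):
        return "string"
    return "numeric"

def convert_column(cells):
    """In a string column, numbers become their string representation."""
    if infer_column_type(cells) == "string":
        return [None if c is None or c == "" else str(c) for c in cells]
    return list(cells)

mixed_ids = [101, 102, "A17", 104]
print(infer_column_type(mixed_ids))
print(convert_column(mixed_ids))
print(infer_column_type([1, 2.5, None]))
```

A single stray text cell, such as a number typed with a text format, is enough to turn the whole column into a string variable, which is why checking the column for such cells is the first remedy.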
Q.
A.
In general, you should just leave missing cells blank. You can also represent numeric missing
data with a period. Missing data for strings can be represented by an empty string (entered with a
single quote). However, blanks work just as well as any of these alternatives.
If your worksheets already use a specific string to represent missing, you can tell Stat/Transfer to use
that string as a missing value by setting the Numeric Missing Value string in Worksheet options.
Q.
I have variable names in the first row of my worksheet, but Stat/Transfer doesn't use them. It
makes up names like col1 and col2. How can I solve this problem?
A. Stat/Transfer tries to automatically sense whether you have variable names in your worksheet by
looking for a change from a string type to a numeric type between the first and second rows of your
worksheet. If all of your variables are string variables, this condition will not be met. To solve this
problem, simply go to the Field Name Row Worksheet option and change AutoSense to First Non-Blank Row.
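A rough Python sketch shows why the automatic check fails on all-string worksheets. The function name is hypothetical, and treating a single column that changes from string to numeric as sufficient is an assumption; the exact test Stat/Transfer applies is not documented here.

```python
def first_row_looks_like_names(row1, row2):
    """True if some column holds a string in row 1 and a number in row 2."""
    return any(
        isinstance(top, str) and isinstance(below, (int, float))
        for top, below in zip(row1, row2)
    )

# Mixed worksheet: the header row is detected.
print(first_row_looks_like_names(["id", "age"], [1, 34]))

# All-string worksheet: no string-to-numeric change, so no header is sensed.
print(first_row_looks_like_names(["id", "name"], ["a1", "Bob"]))
```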
If you
are having trouble with Excel, please do two things. First, use the "Save As" menu option in Excel
to save your data into Lotus 1-2-3 or a version of Excel older than Excel '97. These files will be
read by a different module within Stat/Transfer. Second, make us aware of your problem, so that we
can correct Stat/Transfer.
Index
Access databases ....................................................86
appending tables ................................................34
converting to Stata files ....................................158
output table names ............................................15
selecting tables ...................................................12
Specifying tables .................................................57
Specifying tables with wildcards .........................58
Activation ..................................................................5
without an internet connection............................5
Alphanumeric variables, width .................................80
Arithmetic operations in WHERE ............................24
ASCII files
date/time format options .............................30-32
fixed format ..................................................90-91
programs for .......................................................90
Schema file....................................................93-99
write options.................................................36-37
ASCII Files - Delimited .......................................88-89
ASCII input
SAS, SPSS, Stata...................................................90
Schema file....................................................93-99
ASCII output ............................................................14
ASCII/Text Files - Read Options.........................34-36
Automatic Logging ..................................................17
Automatic Program Generation ....................... 16, 48
Batch jobs
command files ..............................................76-77
using generated programs ..................................16
Blank rows in worksheets .......................................40
Byte order for output..............................................73
'Byte' target type ....................................................20
Canceling a transfer ................................................16
Case conversions, variables ....................................28
Case selection
command processor ...........................................63
observations dialog box ................................24-26
Case selection expressions ...............................24-26
missing values .....................................................25
relational operators in ........................................24
strings in..............................................................24
wildcards in .........................................................25
Case-Selection expressions .....................................24
Catalog files, SAS .....................................................38
Century changeover year........................................31
Change output variable types .................................66
Changing output variable types ..............................22
Character encoding - FAQ .....................................156
Character encoding options..............................32-33
Combining input files ..............................................54
Command files ..................................................76-77
online ....................................................................7
technical support ..................................................8
HTML tables ..........................................................110
Input data viewer....................................................13
Input file format, selecting .....................................55
Input widths, preserving .........................................37
Installing on a second computer ..............................6
'int' target type .......................................................20
Internal limitations .................................................80
International character sets....................................32
JMP files ................................................................111
JMP options ...................................................... 41, 71
KEEP command .......................................................64
options for multiple sessions ..............................72
LIMDEP files ..........................................................113
Limitations
internal................................................................80
on changing output variable types .....................22
on the number of variables ................................80
Linux
FAQ ...................................................................160
Log dialog box .........................................................49
Logging, command processor .................................75
'long' target type.....................................................20
Lotus 1-2-3 worksheet files.....................................83
Matlab files ...........................................................114
Matlab output options............................................42
Members, SAS, CPORT, and Transport files ...... 12, 59
Messages
adding date and time to .....................................43
Command processor ...........................................61
Microsoft Access .....................................................86
Mineset files .........................................................116
Minitab worksheets ..............................................117
Missing values ................................................... 25, 29
ASCII .............................................................. 35, 36
map to extended ................................................29
set command ......................................................69
Mixed data ..............................................................22
Mplus files.............................................................118
New features ............................................................3
NLOGIT files ..........................................................119
Numeric variable names .........................................27
ODBC
options ................................................................69
Observations dialog box .........................................23
ODBC
null vs. empty strings ..........................................34
options ................................................................34
ODBC data sources ...............................................120
Online documentation ..............................................7
Online help................................................................7
OpenDocument spreadsheets ........................122-24
Options
New in Version 12 .................................................3