Sas Functions Pocketref
Chris Fehily
Chapter 1. Using Functions
Introduction
Like William Goldman’s The Princess Bride, this book is the “good
parts” version of the official SAS 9.1 function reference. I’ve
stripped that creature to its essentials: Every function (even the
obscure ones) is covered and each entry gives the function’s syntax,
purpose, and required and optional arguments. In most cases
you’ll also find examples, cross-references to related functions,
and other odds and ends.
I expect that you’ll use this book like I do: Pick it up while you’re
programming, flip to whatever you need reminding of, and then
toss it aside without breaking stride. To this end, I give parameter
names that are self-descriptive (in binomial functions the probability
of success is prob_success, not p) and consistent (str always
is a string, char a character, x a floating-point number, n an integer,
and so on).
Typographic Conventions
Italic type introduces new terms or represents replaceable variables
in regular text.
Monospace type denotes SAS code in examples and regular text.
It also shows filenames, directory (folder) names, and the output
from programs or commands.
Italic monospace type denotes a variable in SAS code that you
must replace with a value. You’d replace filename with the name
of a real file, for example.
Bold monospace type highlights SAS code fragments and results
that are explained in the accompanying text.
About Functions
A function is a named routine that performs a computation on its
arguments and returns a value. Functions typically are used in
DATA-step statements but they also can appear in SQL, WHERE
expressions, PROC REPORT, and some statistical procedures.
You can use %sysfunc or %qsysfunc to invoke a function in a
macro statement.
The syntax of a function is
function(arg1,arg2,...)
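As a minimal sketch of this syntax (the variable names x and y are illustrative, not from the reference), the following calls abs first in a DATA-step assignment and then from macro code through %sysfunc:

```sas
/* Call a function in a DATA-step assignment */
data _null_;
  x = -6;
  y = abs(x);      /* the function returns a value; x is unchanged */
  put x= y=;       /* writes x=-6 y=6 to the log */
run;

/* Invoke the same function from macro code */
%let y = %sysfunc(abs(-6));
%put y=&y;         /* writes y=6 to the log */
```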
Characters Meaning
| A vertical bar or pipe separates alternative arguments.
You can choose exactly one of the given items. (Don’t
type the vertical bar.) A|B|C is read “A or B or C.” Don’t
confuse the pipe symbol with the double-pipe symbol,
||, which is SAS’s string-concatenation operator.
Errors
If any argument’s value is invalid, SAS logs an error message and
the function returns a missing value (numeric or character). The
usual reasons for a bad argument are a value that’s out of range (a
probability that’s less than zero or greater than one, for example),
a wrong type (a number instead of a string, or vice versa), or a
missing value where a nonmissing one is needed.
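A small sketch of this behavior, assuming an out-of-range argument to sqrt (the variable x is illustrative): the call doesn't stop the step, but the result is a missing value that you can test for.

```sas
data _null_;
  x = sqrt(-1);    /* out-of-range argument: SAS writes a message to the log */
  put x=;          /* x=. (numeric missing value) */
  if missing(x) then put 'sqrt returned a missing value';
run;
```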
Call Routines
I use the term function to refer to both functions and call routines
unless the distinction is important. A call routine is almost the
same as a function except that, instead of returning a result, it
changes the values of some or all of the variables used as its arguments.
(If you’ve programmed in other languages, you know this
behavior as call-by-reference. Functions, which never alter their
arguments, are call-by-value.) The syntax of a call routine is
call routine(arg1,arg2,...);
where routine is the call routine’s name. The argument(s) that the
routine puts its results in must be variables of appropriate type
and length, not constants or complex expressions. Note the terminating
semicolon: a call routine is a standalone statement that
can’t be nested or used in an assignment statement. You can use
%syscall to invoke a call routine in a macro statement.
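The following sketch contrasts a call routine with its function counterpart, using call cats and cats. The variable names and the length of 20 are assumptions chosen for illustration.

```sas
data _null_;
  length result $20;                /* the receiving variable must be long enough */
  call cats(result, ' def ', 'ghi');  /* standalone statement; result is modified */
  put result=;                      /* result=defghi */

  same = cats(' def ', 'ghi');      /* the function form returns the value instead */
  put same=;                        /* same=defghi */
run;
```

Note that result must be a variable: passing a constant as the first argument of call cats gives the routine nowhere to store its output.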
Variable Lists
In functions or call routines that take variables as arguments (most
descriptive-statistics functions, for example) you can use OF, -,
and other keywords and operators to specify a shorthand list of
variables rather than typing each variable explicitly. The following
table shows some examples of function calls.
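As a hedged illustration of variable lists (the array name ar and its values are made up for this sketch), OF can introduce a numbered range or every element of an array:

```sas
data _null_;
  array ar[3] x1-x3 (10 20 30);
  total = sum(of x1-x3);    /* numbered range list: x1, x2, x3 */
  avg   = mean(of ar[*]);   /* all elements of the array ar */
  put total= avg=;          /* total=60 avg=20 */
run;
```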
Category Functions
External files dclose, dcreate, dinfo, dnum, dopen, doptname,
doptnum, dread, dropnote, fappend, fclose, fcol,
fdelete, fexist, fget, fileexist, filename, fileref, finfo,
fnote, fopen, foptname, foptnum, fpoint, fpos, fput,
fread, frewind, frlen, fsep, fwrite, mopen, pathname,
sysmsg, sysrc
External routines call module, call modulei, modulec, moduleic,
modulein, modulen
Financial compound, convx, convxp, daccdb, daccdbsl, daccsl,
daccsyd, dacctab, depdb, depdbsl, depsl, depsyd,
deptab, dur, durp, intrr, irr, mort, netpv, npv, pvp,
saving, yieldp
Hyperbolic cosh, sinh, tanh
Macro call execute, call symput, call symputx, resolve,
symexist, symget, symglobl, symlocal
Mathematical abs, airy, beta, call allperm, call logistic, call softmax,
call stdize, call tanh, cnonct, coalesce, comb, constant,
dairy, deviance, digamma, erf, erfc, exp, fact, fnonct,
gamma, ibessel, jbessel, lgamma, log, log10, log2,
logbeta, mod, modz, perm, sign, sqrt, tnonct,
trigamma
Probability cdf, logcdf, logpdf, logsdf, pdf, poisson, probbeta,
probbnml, probbnrm, probchi, probf, probgam,
probhypr, probmc, probnegb, probnorm, probt, sdf
Quantile betainv, cinv, finv, gaminv, probit, quantile, tinv
Random number call ranbin, call rancau, call ranexp, call rangam,
call rannor, call ranperk, call ranperm, call ranpoi,
call rantbl, call rantri, call ranuni, call streaminit,
normal, ranbin, rancau, rand, ranexp, rangam, rannor,
ranpoi, rantbl, rantri, ranuni, uniform
SAS file I/O attrc, attrn, cexist, close, curobs, dropnote, dsname,
exist, fetch, fetchobs, getvarc, getvarn, iorcmsg,
libname, libref, note, open, pathname, point, rewind,
sysmsg, sysrc, var<attr>, varnum
Chapter 2. List of Functions
This chapter lists all SAS functions and call routines alphabetically.
Those introduced in SAS 9.0 or later are marked 9+.
abs(x)
Returns the absolute value of x; that is, the nonnegative num-
ber that has the same magnitude as x. Examples: abs(2.2) →
2.2. abs(0/6) → 0. abs(-6) → 6. See also: sign (p. 146).
addr(var)
Returns the integer memory address of a variable on 32-bit
systems. Favor addrlong (p. 10) over addr because addrlong
works on both 32-bit and 64-bit systems. See also: call poke
(p. 22), peek (p. 114), peekc (p. 114).
addrlong(var)
Returns the memory address of a variable as a binary value on
32-bit or 64-bit systems.
Example:
Prints the hex-formatted memory address of the variable x
data _null_;
x=1234;
addr=addrlong(x);
put addr= $hex16.;
run;
See also: addr (p. 10), call pokelong (p. 22), peekclong
(p. 114), peeklong (p. 115), ptrlongadd (p. 125).
airy(x)
Returns the value of the Airy function Ai(x), which is a solu-
tion to the differential equation y″ – xy = 0 and defined for
real values of x by the integral
$$ \mathrm{Ai}(x) = \frac{1}{\pi} \int_0^{\infty} \cos\left( \frac{t^3}{3} + xt \right) dt . $$
anydigit(s) → 5.
anydigit(s,0) → 0.
anyupper(s,4) → 12.
anyupper(s,-4) → 3.
anyalpha(s,999) → 0.
anyalpha(s,-999) → 12.
anycntrl(s) → 0.
attr_name Returns
charset Character-set sort order:
Empty string—Dataset isn’t sorted
ascii—ASCII character set
ebcdic—EBCDIC character set
ansi—OS/2 ANSI standard ASCII character set
oem—OS/2 OEM code format
sortlvl Dataset sort type:
Empty string—Dataset isn’t sorted
weak—Dataset is user-sorted (usually by the
SORTEDBY dataset option); SAS can’t validate cor-
rectness and so depends on the order of observations
strong—Dataset is sorted by SAS software (usually
by PROC SORT or the OUT= option of PROC
CONTENTS)
sortseq An empty string if the dataset is sorted on the native
machine or if the sort collating sequence is the
default for the OS; otherwise the result is the name of
the alternate collating sequence used to sort the file
type SAS dataset type
attr_name Returns
alterpw 1 if a password is needed to alter the dataset; 0
otherwise
anobs 1 if the engine knows the number of observa-
tions; 0 otherwise
any|varobs -1 if the dataset has no observations or vari-
ables; 0 if observations don’t exist but variables
do; or 1 if both observations and variables exist
arand|random 1 if the engine supports random access; 0 other-
wise
arwu 1 if the engine can write (create and update)
SAS files; 0 if the engine is read-only
audit 1 if logging to an audit file is turned on; 0 other-
wise
audit_data 1 if after-update record images are stored; 0 oth-
erwise
audit_before 1 if before-update record images are stored; 0
otherwise
audit_error 1 if unsuccessful after-update record images are
stored; 0 otherwise
crdte Dataset creation date, as a SAS datetime (use
the DATETIME. format to display this value)
iconst 0 if no dataset integrity constraints exist; 1 if
one or more general integrity constraints exist;
2 if one or more referential integrity constraints
exist; or 3 if both one or more general integrity
constraints and one or more referential integrity
constraints exist
index 1 if the dataset supports indexing; 0 otherwise
isindex 1 if at least one dataset index exists; 0 otherwise
issubset 1 if the dataset is a subset (that is, at least one
WHERE clause is active); 0 otherwise
lrecl Logical record length
lrid Length of the record ID
maxgen Maximum number of generations
maxrc 1 if an application checks return codes; 0 other-
wise
modte Dataset modification date, as a SAS datetime
(use the DATETIME. format to display this val-
ue)
ndel Number of dataset observations marked for
deletion
nextgen Next generation number to generate
nlobs Number of logical observations (those not
marked for deletion); -1 if unknown. An active
WHERE clause doesn’t affect this number.
nlobsf Number of logical observations (those not
marked for deletion) after applying the OBS=
and FIRSTOBS= system options and WHERE
clauses. This attribute forces SAS to scan the
entire dataset.
nobs Number of physical observations (including
those marked for deletion); -1 if unknown. An
active WHERE clause doesn’t affect this num-
ber.
nvars Number of variables in the dataset
pw 1 if a password is needed to access the dataset; 0
otherwise
radix 1 if access by observation number (radix
addressability) is allowed; 0 otherwise
readpw 1 if a password is needed to read the dataset; 0
otherwise
tape 1 if the dataset is a sequential file; 0 otherwise
whstmt 0 if no WHERE clause is active; 1 if a perma-
nent WHERE clause is active; 2 if a temporary
WHERE clause is active; or 3 if both permanent
and temporary WHERE clauses are active
writepw 1 if a password is needed to write to the dataset;
0 otherwise
band(n,m)
Returns the bitwise AND of two integers (0 ≤ n ≤ 2^32, 0 ≤ m ≤
2^32). Example: band(10,7) → 2 (1010 & 0111 → 0010). See
also: bnot (p. 18), bor (p. 18), bxor (p. 18).
beta(a,b) 9+
Returns the value of the beta function with shape parameters
a (> 0) and b (> 0). The beta function is defined for positive
values of a and b by the integral
$$ B(a, b) = \int_0^1 x^{a-1} (1 - x)^{b-1} \, dx . $$
Examples:
beta(4,2) → 0.05.
beta(2,1) → 0.5.
beta(1,2) → 0.5.
beta(5,2) → 0.0333333333.
See also: betainv (p. 17), logbeta (p. 100), probbeta (p. 116).
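One way to sanity-check beta is through the standard identity B(a,b) = Γ(a)Γ(b)/Γ(a+b). The sketch below assumes that identity and the variable names shown; it is not taken from the reference.

```sas
data _null_;
  a = 4; b = 2;
  direct    = beta(a, b);
  via_gamma = gamma(a) * gamma(b) / gamma(a + b);  /* 6 * 1 / 120 */
  via_log   = exp(logbeta(a, b));                  /* useful when beta would underflow */
  put direct= via_gamma= via_log=;                 /* all three should be about 0.05 */
run;
```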
betainv(prob,a,b)
Returns the prob-th quantile from the beta distribution with
shape parameters a (> 0) and b (> 0). The probability that an
bor(n,m)
Returns the bitwise OR of two integers (0 ≤ n ≤ 2^32, 0 ≤ m ≤
2^32). Example: bor(10,7) → 15 (1010 | 0111 → 1111). See also:
band (p. 17), bnot (p. 18), bxor (p. 18).
brshift(n,shift)
Right-shifts the integer n by shift bits (0 ≤ n ≤ 2^32; 0 ≤ shift ≤
31). Example: brshift(7,3) → 0 (0111 >> 3 → 0000). See also:
blshift (p. 18).
bxor(n,m)
Returns the bitwise EXCLUSIVE OR of two integers (0 ≤ n ≤
2^32, 0 ≤ m ≤ 2^32). Example: bxor(10,7) → 13 (1010 ^ 0111 →
1101). See also: band (p. 17), bnot (p. 18), bor (p. 18).
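The bitwise examples above can be reproduced in one DATA step. This is a sketch; the variable names a, o, x, and r are illustrative.

```sas
data _null_;
  a = band(10, 7);     /* 1010 AND 0111 = 0010 */
  o = bor(10, 7);      /* 1010 OR  0111 = 1111 */
  x = bxor(10, 7);     /* 1010 XOR 0111 = 1101 */
  r = brshift(7, 3);   /* 0111 shifted right 3 bits = 0000 */
  put a= o= x= r=;     /* a=2 o=15 x=13 r=0 */
  put a= binary4. o= binary4. x= binary4.;  /* same values shown as bit patterns */
run;
```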
byte(n)
Returns the n-th character (0 ≤ n ≤ 255) in the ASCII or
EBCDIC collating sequence. The result’s default length is one.
ASCII characters 0–127 are the standard set; characters 128–
255 vary depending on your OS. Examples: byte(80) → ‘P’
(ASCII). byte(80) → ‘&’ (EBCDIC). See also: collate (p. 45),
rank (p. 131).
Example:
Generate and print all permutations of the elements of ar
data _null_;
array ar [4] $1 ('a' 'b' 'c' 'd');
n=dim(ar);
nfact=fact(n);
do i=1 to nfact;
call allperm(i,of ar[*]);
put i 5. +2 ar[*];
end;
run;
See also: call ranperk (p. 28), call ranperm (p. 28), fact
(p. 66), perm (p. 115).
call cats(result,str1 [,str2,...]); 9+
Concatenates strings, removing leading and trailing spaces,
returning the result in the variable result. If result is too small
to contain the concatenated string, SAS
• logs a warning that the result was truncated
• logs a note showing the function call’s location and the
argument that caused the truncation, except when called in
SQL or in a WHERE clause
• sets _ERROR_ to 1 in the DATA step, except in a WHERE
clause.
See also: cat (p. 38), cats (p. 40).
call catt(result,str1 [,str2,...]); 9+
Concatenates strings, removing trailing spaces, returning the
result in the variable result. See call cats (p. 19) for error
conditions. See also: cat (p. 38), catt (p. 40).
call catx(sep,result,str1 [,str2,...]); 9+
Concatenates strings, removing leading and trailing spaces
and inserting separators between strings, returning the result
in the variable result. sep usually is a space or a comma but can
be any string. See call cats (p. 19) for error conditions. See
also: cat (p. 38), catx (p. 40).
call compcost(op1=,cost1 [,op2=,cost2,...]); 9+
Sets the costs of operations for later use by compged (p. 47).
Each op is followed by its integer cost (-32767 to 32767). If an
call execute(str);
Resolves a string and executes it. str is a quoted string, an
unquoted DATA-step character variable, or a character
expression that resolves to a macro text expression or SAS
statement. If str is a macro invocation, DATA-step execution
pauses and the macro executes immediately. If str is a SAS
statement or if macro-execution generates SAS statements,
the statement(s) execute after the calling DATA-step ends. For
details see Execute in SAS Macro Language: Reference.
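A minimal sketch of the deferred-execution case, assuming the sample dataset sashelp.class is installed (any existing dataset would do): the generated PROC PRINT step is queued and runs only after the calling DATA step finishes.

```sas
data _null_;
  /* str is a SAS statement, so it executes after this DATA step ends */
  call execute('proc print data=sashelp.class(obs=3); run;');
run;
```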
call label(var,char_var);
Assigns a variable’s label to a character variable. var is any SAS
variable. If var has no label, var’s name is assigned to char_var.
char_var is any SAS character variable. Labels can be up to
256 characters long; make the length of char_var long enough
to avoid truncation. See also: v<attr> (p. 161).
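The following sketch shows one way to retrieve a label. The variable names and the label text are made up for illustration.

```sas
data _null_;
  length lbl $256;               /* long enough for the longest possible label */
  x = 1;
  label x = 'Patient weight (kg)';
  call label(x, lbl);            /* copies the label of x into lbl */
  put lbl=;                      /* lbl=Patient weight (kg) */
run;
```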
call logistic(var1 [,var2,...]); 9+
Replaces each numeric var by its logistic value, defined by the
equation e^x/(1+e^x). Use only variables as arguments, not con-
ctrl_str Means
i Print hex representations of all call module arguments.
Use i for debugging. Using i implies e.
e Print detailed error messages. Without e (or i) call
module generates only “Invalid argument to function”.
Use e for production environments.
h Provide help about call module syntax, the attribute
file format, and suggested formats and informats
Example:
Call the xyz routine
routine xyz minarg=2 maxarg=2;
arg 1 input num byvalue format=ib4.;
arg 2 output char format=$char10.;
data _null_;
call module('xyz',1,x);
put x=;
run;
See also: call modulei (p. 22), modulec (p. 104), moduleic
(p. 104), modulein (p. 104), modulen (p. 104).
call modulei([ctrl_str,]module_name [,arg1,arg2,...]);
Calls an external routine without any return code (in IML
environment only). For details see call module (p. 21). See
also: moduleic (p. 104), modulein (p. 104).
call poke(source,pointer [,len]);
Same as call pokelong (p. 22) but for 32-bit systems only.
See also: addr (p. 10), peek (p. 114), peekc (p. 114).
call pokelong(source,pointer [,len]);
Writes a value directly into memory on 32-bit or 64-bit sys-
tems. source is a string that contains a value to write. pointer is
a string that contains the virtual address of the data to over-
write. len is the number of bytes to write from source to the
pointer address. Omit len to copy the entire value of source. See
also: addrlong (p. 10), call poke (p. 22), peekclong (p. 114),
peeklong (p. 115).
call prxnext(prx_id,start_pos,stop_pos,str,
pos,len); 9+
Returns the position and length of a substring that matches a
pattern and iterates over multiple matches within a string.
prx_id is the pattern identifier that prxparse (p. 121) returns.
start_pos is the position at which to start the pattern matching
in str. If the match succeeds, call prxnext changes start_pos
to pos+max(1,len); otherwise start_pos isn’t changed.
stop_pos is the last character to use in str. If stop_pos is -1, then
the last character is the last nonblank character in str.
str is the character expression to search.
pos returns the position in str at which the pattern begins. If
no match is found, pos is zero (0).
len returns the length of the string that is matched by the pat-
tern. If no match is found, len is zero (0).
Example:
Find three-letter words starting with ‘c’, ‘r’, or ‘b’ and
ending with ‘at’
data _null_;
prx_id=prxparse('/[crb]at/');
str='The woods have a bat, cat, and a rat.';
start_pos=1;
stop_pos=length(str);
call prxnext(prx_id,start_pos,stop_pos,str,
pos,len);
do while (pos>0);
found=substr(str,pos,len);
put found= pos= len= start_pos=;
call prxnext(prx_id,start_pos,stop_pos,str,
pos,len);
end;
run;
put hour= minute= ampm=;
end;
run;
call ranperk(seed,k,var1 [,var2,...]); 9+
Generates a random permutation of k values from the vars,
which must be all the same type and, if character, same length.
seed is an integer random-number seed (< 2^31-1). If seed ≤ 0,
the time of day is used to initialize the seed stream.
Example:
Generate random permutations of 3 of the values x1-x5
data _null_;
array ar x1-x5 (1 2 3 4 5);
seed=98765;
do n=1 to 10;
call ranperk(seed,3,of x1-x5);
put seed= @20 x1-x3;
end;
run;
See also: call allperm (p. 18), call ranperm (p. 28), fact
(p. 66), perm (p. 115).
call ranperm(seed,var1 [,var2,...]); 9+
Generates a random permutation of all values from the vars,
which must be all the same type and, if character, same length.
seed is an integer random-number seed (< 2^31-1). If seed ≤ 0,
the time of day is used to initialize the seed stream.
Example:
Generate random permutations of the values x1-x4
data _null_;
array ar x1-x4 (1 2 3 4);
seed=98765;
do n=1 to 10;
call ranperm(seed,of x1-x4);
put seed= @20 x1-x4;
end;
run;
See also: call allperm (p. 18), call ranperk (p. 28), fact
(p. 66), perm (p. 115).
call ranpoi(seed,lambda,x);
Returns a random variate from a Poisson distribution with
mean lambda (≥ 0).
call rxfree(rx_id );
Frees memory allocated to a SAS regular expression. rx_id is
the pattern identifier that rxparse (p. 135) returns. Example:
See rxparse.
call rxsubstr(rx_id,str,start_pos [,len [,score]]);
Returns the position, length, and score of a substring that
matches a pattern, or zero (0) if no match is found. rx_id is
the pattern identifier that rxparse (p. 135) returns. str is the
string to search. start_pos is the position in str where the
matched substring begins. len is the length of the matched
substring. score is an integer based on the number of matches
for a particular pattern in a substring. When a pattern match-
es more than one substring beginning at start_pos, the longest
substring is selected. Example: See rxparse. See also: call
rxchange (p. 30), call rxfree (p. 30), rxmatch (p. 134).
See also: call scanq (p. 31), scan (p. 145), scanq (p. 145).
call scanq(str,n,pos,len [,delims]); 9+
Returns the position and length of the n-th word in a string,
ignoring quote-enclosed delimiters. If n < 0, call scanq counts
words right to left. If n = 0 or |n| > the number of words in str,
call scanq returns zero in pos and len. Unmatched quotes in
str make left-to-right and right-to-left scans return different
words. delims specifies character(s) that separate words. The
default delimiters are whitespace characters: blank, horizontal
and vertical tab, carriage return, line feed, and form feed. You
can’t use single or double quotes as delimiters. Contiguous
delimiters are treated as one. Leading and trailing delimiters
are ignored. To extract the desired word after calling call
scanq, use substrn (p. 152).
Examples:
call scanq('12 "a c" xyz',2,pos,len) → pos=4, len=5.
See also: call scan (p. 30), scan (p. 145), scanq (p. 145).
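As a sketch of the extraction step described above, the position and length returned by call scanq can be passed straight to substrn. The variable names are illustrative.

```sas
data _null_;
  str = '12 "a c" xyz';
  pos = 0; len = 0;               /* numeric variables that receive the results */
  call scanq(str, 2, pos, len);   /* second word; the quoted space isn't a delimiter */
  word = substrn(str, pos, len);
  put pos= len= word=;            /* pos=4 len=5 word="a c" */
run;
```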
call set(dataset_id );
Links SAS dataset variables to the DATA-step or macro vari-
ables in your program that have the same name and data type.
dataset_id is the dataset identifier that open (p. 111) returns.
Example:
Get values for the first 10 observations in sasuser.houses and
store them in mydata
data mydata;
length style $8 sqfeet bedrooms baths 8
street $16 price 8;
drop rc dataset_id;
dataset_id=open("sasuser.houses","i");
call set(dataset_id);
do i=1 to 10;
rc=fetchobs(dataset_id,i);
output;
end;
run;
$$ \text{result} = \text{add} + \text{mult} \times \left( \frac{\text{original} - \text{location}}{\text{scale}} \right) $$
where result is the final output value returned for each vari-
able, add is the constant to add (add=), mult is the constant to
multiply by (mult=), original is the original input value,
location is the location measure, and scale is the scale mea-
sure. You can replace missing values by any constant; if you
don’t specify missing= or replace, variables that have miss-
ing values aren’t changed. The initial estimation method for
abw=, ahuber=, and awave= is mad. Percentiles are computed
by using definition 5; see pctl (p. 113).
Each opt is a case-insensitive string; leading and trailing spac-
es are ignored. Use a separate argument for each opt. An opt
that ends with an equal sign is followed by its corresponding
value: another argument that is a numeric constant, variable,
or expression. For details about the options see PROC
STDIZE in SAS/STAT User’s Guide. The default opts are std
and df. The following table lists location and scale options.
opt Option
abw= Tuning constant
agk= Proportion of pairs to be included in the estimation
of the within-cluster variances
ahuber= Tuning constant
awave= Tuning constant
euclen Euclidean length
iqr Interquartile range
l= Power (≥ 1) to which differences are to be raised in
computing an L(p) or Minkowski metric
mad Median absolute deviation from the median
maxabs Maximum absolute values
mean Arithmetic mean (average)
median Middle number in a set of data that is ordered
according to rank
midrange Midpoint of the range
range Range of values
spacing= Proportion of data to be contained in the spacing
std Standard deviation (the default)
sum Result obtained when numbers are added
ustd Unstandardizes variables
opt Option
df Degrees of freedom (the default)
n Number of observations
opt Option
add= Number to add to each value after standardizing and
multiplying by the mult= value (default 0)
fuzz= Relative fuzz factor
missing= Value to be assigned to variables that have a missing
value
mult= Number to multiply each value by after standardizing
(default 1)
opt Option
norm Normalize the scale estimator to be consistent for the
standard deviation of a normal distribution (affects
only agk=, iqr, mad, and spacing=)
pstat Log the values of the location and scale measures
replace Replace each missing value with 0 in the standard-
ized data—this value corresponds to the location
measure before standardizing (to replace missing val-
ues by other values, use missing=)
snorm Normalize the scale estimator to have an expectation
of approximately 1 for a standard normal distribu-
tion (affects only spacing=)
Examples:
w=.; x=1; y=2; z=3;
call streaminit(seed ); 9+
Sets a seed value that rand (p. 128) uses to generate random
numbers. seed is an integer (< 2^31-1). To create reproducible
random-number streams, use call streaminit before calling
rand. Calling rand before call streaminit (or using call
streaminit with seed ≤ 0) makes rand use the system clock
to seed itself.
Example:
Create a reproducible stream of random numbers
data random;
call streaminit(12345);
do i=1 to 10;
x=rand('normal');
output;
end;
run;
call symput(macro_var_name,char_value);
Assigns a value to a macro variable. macro_var_name is a
character expression that identifies the macro variable that’s
assigned a value. If the macro variable doesn’t exist, it’s creat-
ed. char_value is a character expression that contains the
DATA-step information to assign. For more information see
Symput in SAS Macro Language: Reference. See also: call
symputx (p. 36), symget (p. 153).
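A short sketch of passing DATA-step values to macro variables; the macro variable names greeting and n_obs are hypothetical. The macro variables become available once the DATA step has run.

```sas
data _null_;
  call symput('greeting', 'Hello');   /* character value */
  call symputx('n_obs', 42);          /* numeric value, converted and trimmed */
run;

%put &greeting &n_obs;                /* writes Hello 42 to the log */
```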
run;
data attributes;
set test;
by x;
input a b $ c;
length varname $32 vartype $3;
varname=' ';
varlen=1;
do i=1 to 99 until(varname=' ');
call vnext(varname,vartype,varlen);
put i= varname @20 vartype= varlen=;
end;
myvar=0;
datalines;
1 a 2
;
Examples:
Compare cat, cats, catt, and catx
data _null_;
str1=' abc';
str2='123';
str3=' xy ';
str4='Z';
cat=cat(of str1-str4);
cats=cats(of str1-str4);
catt=catt(of str1-str4);
catx=catx('**',of str1-str4);
put cat= cats= catt= catx= $char.;
run;
Variable Result
cat ' abc123 xy Z'
cats 'abc123xyZ'
catt ' abc123 xyZ'
catx 'abc**123**xy**Z'
See also: call cats (p. 19), call catt (p. 19), call catx
(p. 19).
cats(str1 [,str2,...]) 9+
Concatenates strings, removing leading and trailing spaces.
See cat (p. 38) for usage. Examples: See cat. See also: call
cats (p. 19).
catt(str1 [,str2,...]) 9+
Concatenates strings, removing trailing spaces. See cat (p. 38)
for usage. Examples: See cat. See also: call catt (p. 19).
catx(sep,str1 [,str2,...]) 9+
Concatenates strings, removing leading and trailing spaces
and inserting separators between strings. sep usually is a space
or a comma but can be any string. See cat (p. 38) for usage.
Examples: See cat. See also: call catx (p. 19).
cdf(dist,quantile [,param1, param2,...])
Returns the left cumulative distribution function for the con-
tinuous and discrete distributions listed in the following table.
The CDF is the function F, for a random variable X, defined
for all real values of x by
F(x) = Prob(X ≤ x) .
Examples:
cdf('bern',0,0.5) → 0.5.
cdf('beta',0.2,2,1) → 0.04.
cdf('binom',6,0.5,10) → 0.828125.
cdf('chisq',5.5,10) → 0.144621493.
cdf('expo',1) → 0.632120558.
cdf('f',3.5,4,5) → 0.899032601.
cdf('lognormal',0.5,1,2) → 0.198616419.
cdf('normal',1.96) → 0.9750021049.
cdf('poisson',2,1) → 0.9196986029.
cdf('t',1.0,20) → 0.835371711.
cdf('uniform',3.0,2,4) → 0.5.
See also: logcdf (p. 100), pdf (p. 114), quantile (p. 127), sdf
(p. 146).
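The relationship among cdf, sdf, and quantile can be sketched as follows, assuming the standard normal distribution and illustrative variable names: quantile inverts cdf, and cdf and sdf sum to one.

```sas
data _null_;
  p = cdf('normal', 1.96);      /* Prob(X <= 1.96), about 0.975 */
  q = quantile('normal', p);    /* inverts the CDF, about 1.96 */
  s = sdf('normal', 1.96);      /* survival function, Prob(X > 1.96) */
  total = p + s;                /* should be 1 */
  put p= q= s= total=;
run;
```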
ceil(x)
Returns the smallest integer ≥ x. If x is within 10^-12 of an integer,
ceil returns that integer. Examples: The following table
shows examples of ceil, floor (p. 74), int (p. 87), and round
(p. 133). See also: ceilz (p. 43).
choosen(n,num1 [,num2,...]) 9+
Returns the n-th number. If n is negative, choosen counts back-
ward. Examples: choosen(99,1,2,3,4) → . (missing—bad n).
choosen(-2,11,12*2,13) → 24. See also: choosec (p. 43).
cinv(prob,df [,nc])
Returns the prob-th quantile from the chi-square distribution
with degrees of freedom df (> 0, noninteger allowed) and
noncentrality parameter nc (≥ 0). The probability that an
observation from a chi-square distribution is less than or
equal to the returned quantile is prob (0 ≤ prob < 1). If nc is omitted
or is zero, cinv uses the central chi-square distribution.
Large nc values can cause cinv to return a missing value. cinv
is the inverse of probchi (p. 117). Example: cinv(0.95,3) →
7.8147279033. See also: cnonct (p. 44).
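Because cinv is the inverse of probchi, a round trip should recover the original probability. A minimal sketch with illustrative variable names:

```sas
data _null_;
  q = cinv(0.95, 3);     /* 95th percentile, 3 degrees of freedom */
  p = probchi(q, 3);     /* feeding the quantile back gives about 0.95 */
  put q= p=;
run;
```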
close(dataset_id )
Closes a SAS dataset, returning zero (0) if successful; nonzero
otherwise. dataset_id is the dataset identifier that open
(p. 111) returns. To free memory you should close open
datasets when they’re no longer needed. Note that SAS auto-
cnonct(2,4,probchi(2,4,1.5)) → 1.5.
See also: cinv (p. 43), fnonct (p. 75), probchi (p. 117), tnonct
(p. 156).
coalesce(x1 [,x2,...]) 9+
Returns the first nonmissing value from a list of numeric
arguments. The list can contain numeric values, missing
numeric values, and names of numeric variables. If only one
value is listed, coalesce returns that value. If all the values
are missing, coalesce returns a missing value. Examples:
coalesce(.,22,44) → 22. coalesce(.,.,.) → . (missing).
coalesce(7,8,9) → 7. See also: coalescec (p. 44).
coalescec(str1 [,str2,...]) 9+
Returns the first nonmissing value from a list of character
arguments. The list can contain character values, missing
character values, and names of character variables. If only one
value is listed, coalescec returns that value. If all the values
are missing, coalescec returns a missing value. Examples:
coalescec('','a','z') → ‘a’. coalescec('','','') → ‘’
(missing). coalescec('c1','c2') → ‘c1’. See also: coalesce
(p. 44).
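A common use of these two functions is supplying fallback values. The sketch below uses made-up variable names and values.

```sas
data _null_;
  home_phone = ' ';                 /* missing character value */
  work_phone = '555-0100';
  phone = coalescec(home_phone, work_phone, 'unknown');

  bonus = .;                        /* missing numeric value */
  base  = 100;
  pay = coalesce(bonus, base, 0);

  put phone= pay=;                  /* phone=555-0100 pay=100 */
run;
```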
Examples:
ASCII
collate(48,,10) or collate(48,57) → ‘0123456789’.
EBCDIC
collate(240,,10) or collate(240,249) → ‘0123456789’.
comb(n,r)
Returns the number of combinations of n elements taken r at
a time (0 ≤ r ≤ n). Use comb to determine the total possible
number of groups for a given number of items. A combina-
tion is any set or subset of items, regardless of their order.
Combinations are distinct from permutations, for which the
order is significant. comb(n,r) also is known as the binomial
coefficient and is read “n choose r”. The number of combina-
tions is n!/(r!(n–r)!) where n and r are integers and the symbol
! denotes a factorial. comb(n,r) is the same as
fact(n)/(fact(r)*fact(n-r)). See also: fact (p. 66), perm (p. 115).
Examples:
comb(8,0) → 1.
comb(8,1) → 8.
comb(8,2) → 28.
comb(8,6) → 28.
comb(8,8) → 1.
comb(int(8.5),4) → 70.
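The factorial identity above can be checked directly. This sketch uses illustrative variable names and also shows perm for contrast, since permutations count ordered selections.

```sas
data _null_;
  n = 8; r = 2;
  c = comb(n, r);                          /* unordered selections */
  f = fact(n) / (fact(r) * fact(n - r));   /* same value from the formula */
  p = perm(n, r);                          /* ordered selections: n!/(n-r)! */
  put c= f= p=;                            /* c=28 f=28 p=56 */
run;
```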
compare(str1,str2 [,modifiers]) 9+
Compares two strings and returns the position of the leftmost
character by which they differ, or zero (0) if they match. The
result is negative if str1 precedes str2 in a sort sequence; posi-
tive otherwise. modifiers is one or more of the values listed in
the following table.
modifiers Means
i Ignore the case of str1 and str2
l Remove leading spaces from str1 and str2
n Remove quotes from any argument that is an n-literal
and ignore the case of str1 and str2
: (colon) Truncate the longer of str1 or str2 to the length of the
shorter string or to length one, whichever is greater; if
you omit this modifier, the shorter string is padded
with spaces to equal the length of the longer string
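The modifiers can be sketched as follows; the strings and variable names are illustrative and not taken from the reference.

```sas
data _null_;
  r1 = compare('abc', 'abd');          /* differ at position 3; 'abc' sorts first */
  r2 = compare('ABC', 'abc', 'i');     /* case ignored */
  r3 = compare('  abc', 'abc', 'l');   /* leading spaces removed */
  r4 = compare('abc', 'abcdef', ':');  /* longer string truncated to 3 characters */
  put r1= r2= r3= r4=;                 /* r1=-3 r2=0 r3=0 r4=0 */
run;
```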
compbl(str)
Replaces runs of two or more consecutive spaces with a single
space. The result’s default length is the length of str.
Examples:
compbl('New  York,   NY  10014') →
‘New York, NY 10014’.
compbl('   ') → ‘ ’.
compress(str [,chars][,modifiers])
Removes or keeps specified characters in a string. chars is the
list of literal characters to keep or remove in str. By default, the
characters in chars are removed from str. If you specify k in
modifiers, only the characters in chars are kept in str. To add to
the chars list, set modifiers to one or more of the values listed
in the following table. If chars and modifiers are omitted, only
spaces are removed from str. The result’s default length is the
length of str.
modifiers Means
a Add letters (A–Z, a–z) to chars
c Add control characters to chars
d Add digits (0–9) to chars
f Add the underscore (_) and letters (A–Z, a–z) to chars
g Add graphic characters to chars
i Ignore the case of characters to be kept or removed
k Keep the characters in chars instead of removing them
n Add the underscore (_), letters (A–Z, a–z), and digits
(0–9) to chars
l Add lowercase letters (a–z) to chars
o Process chars and modifiers only once instead of every
time compress is called. Using this modifier in the
DATA step (excluding WHERE clauses) or PROC SQL
can make compress run much faster in a loop where
chars and modifiers don’t change.
p Add punctuation marks to chars
s Add whitespace characters to chars (space, horizontal
tab, vertical tab, carriage return, line feed, and form
feed)
t Trim trailing spaces from str and chars
u Add uppercase letters (A–Z) to chars
w Add printable characters to chars
x Add hexadecimal characters to chars
Examples:
s='ABC 123 abc';
compress(s) → ‘ABC123abc’.
compress(s,'A','ki') → ‘Aa’.
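Modifiers combine freely, and you can leave chars empty and let the modifiers build the list. Here's a minimal sketch of that idea; the variable names and the sample value are mine, chosen only for illustration.

```sas
data _null_;
   length phone digits no_punct letters $30;
   phone='(212) 555-0143 ext. 7';
   /* k + d: keep only the digits */
   digits=compress(phone,,'kd');
   /* p + s: remove punctuation marks and whitespace */
   no_punct=compress(phone,,'ps');
   /* k + a: keep only the letters */
   letters=compress(phone,,'ka');
   put digits= no_punct= letters=;
run;
```

With this input the log should show digits=21255501437, no_punct=2125550143ext7, and letters=ext.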
Examples of constant (on a 32-bit Windows machine):
constant('e') → 2.7182818285.
constant('pi') → 3.1415926536.
constant('exactint') → 9.0071993E15.
constant('big') → 1.797693E308.
constant('sqrtbig') → 1.340781E154.
constant('small') → 2.22507E-308.
constant('maceps') → 2.220446E-16.
convx(yld_to_mat,freq,cf0,cf1,...,cfn)
Returns the convexity of cash flows. For a description of the
parameters see dur (p. 63).
Example: convx(0.06,2,10,20,30,40) → 3.51168.
See also: convxp (p. 51), durp (p. 63).
convxp(par,cpn_rate,cpn_freq,remaining_cpns,
time_to_next_cpn,yld_to_mat)
Returns the convexity of a bond. For a description of the
parameters see pvp (p. 126).
Example: convxp(1000,0.01,4,14,0.33/2,0.1) → 11.6023.
See also: convx (p. 51), durp (p. 63).
cos(angle)
Returns the cosine of an angle. angle is expressed in radians.
To convert degrees to radians, multiply degrees by π/180.
Note that cos(-x) = cos(x) and 1/cos(x) is the secant of x.
Examples: cos(0) → 1. cos(1.0472) → 0.5 (1.0472 radians ≈
60 degrees). See also: arcos (p. 12), sin (p. 146), tan (p. 155).
cosh(x)
Returns the hyperbolic cosine of x, defined by (e^x + e^(–x))/2.
Examples: cosh(0) → 1. cosh(exp(1)) → 7.6101. See also:
sinh (p. 147), tanh (p. 155).
count(str,substr [,modifiers]) 9+
Returns the number of times that a substring appears within a
string, or zero (0) if the substring isn’t found. See find (p. 71)
for modifiers. If two occurrences of substr overlap in str, count
returns inconsistent results.
Examples:
str='This is a thistle? Yes, this is a thistle';
count(str,'this') → 3.
count(str,'this','i') → 4.
count(str,'is') → 6.
count('gogogo','gogo') → 1 or 2 (inconsistent).
See also: countc (p. 52), index (p. 85), rxmatch (p. 134).
countc(str,chars [,modifiers]) 9+
Returns the number of specific characters that either appear
or don’t appear in a string, or zero (0) if none are found. See
findc (p. 72) for modifiers.
Examples:
str='Baboons Eat Bananas   ';  /* three trailing spaces */
countc(str,'a') → 5.
countc(str,'b') → 1.
countc(str,'b','i') → 3.
countc(str,'ab') → 6.
countc(str,'ab','i') → 8.
countc(str,'ab','v') → 16.
countc(str,'ab','vit') → 11.
countc(str,' ') → 5.
countc(str,' ','t') → 0.
See also: count (p. 52), indexc (p. 85), verify (p. 162).
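A common use of these two functions is sizing up a delimited list or tallying a word. The sketch below is only an illustration; the variable names and sample strings are made up.

```sas
data _null_;
   list='red,green,blue';
   /* A comma-separated list has one more item than it has commas */
   n_items=countc(list,',')+1;
   /* Case-insensitive count of a whole substring */
   n_the=count('The cat and the hat','the','i');
   put n_items= n_the=;
run;
```

Here n_items is 3 and n_the is 2.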
curobs(dataset_id )
Returns the observation number of the current observation.
dataset_id is the dataset identifier that open (p. 111) returns.
Use curobs only with an uncompressed SAS dataset that is
accessed by using a native library engine. curobs returns a
missing value if the engine doesn’t support observation num-
bers. For SAS views, curobs returns the number of the obser-
vation within the view (not the observation number of any
underlying dataset). See also: fetchobs (p. 68).
cv(x1,x2 [,x3,...])
Returns the coefficient of variation of nonmissing arguments.
At least two nonmissing arguments are required or the result
is a missing value. Example: x1=3; x2=4; x3=5; y1=.;
cv(2,of x1-x3,6,y1) → 39.528470752.
daccdb(age,init_value,lifetime,rate)
Returns the accumulated declining balance depreciation. age
is the period of calculation. For a noninteger age, the depreci-
ation is prorated between the two consecutive time periods
that precede and follow the fractional period. init_value is the
asset’s depreciable initial value. lifetime (> 0) is the asset’s lifetime.
rate (> 0) is the rate of depreciation, expressed as a fraction.
Example: daccdb(10,1000,15,2) → 760.93228394.
See also: depdb (p. 56).
daccdbsl(age,init_value,lifetime,rate)
Returns the accumulated declining balance with conversion
to a straight-line depreciation. See daccdb (p. 53) for a descrip-
tion of the parameters (except that age must be an integer).
Example: daccdbsl(10,1000,15,2) → 772.65327087.
See also: depdbsl (p. 56).
daccsl(age,init_value,lifetime)
Returns the accumulated straight-line depreciation. See daccdb
(p. 53) for a description of the parameters.
Example: daccsl(10,1000,15) → 666.66666667.
See also: depsl (p. 57).
daccsyd(age,init_value,lifetime)
Returns the accumulated sum-of-years-digits depreciation. See
daccdb (p. 53) for a description of the parameters.
Examples of datdif:
datdif('1feb2007'd,'31mar2007'd,'30/360') → 60.
datdif('1feb2007'd,'31mar2007'd,'act/act') → 58.
datdif('29dec2005'd,'1jan2007'd,'360') → 362.
datdif('29dec2005'd,'1jan2007'd,'actual') → 368.
day(sas_date)
Extracts the day of the month (1–31) from a SAS date.
Example: day(today()) → 2 (today is 2-Sep-2005).
See also: month (p. 104), qtr (p. 126), week (p. 163), weekday
(p. 163), year (p. 163).
dclose(dir_id )
Closes a directory, returning zero (0) if successful; nonzero
otherwise. dir_id is the directory identifier that dopen (p. 62)
returns. dclose also closes any open members. To free mem-
ory you should close open directories and members when
they’re no longer needed. Note that SAS automatically closes
all directories or members opened within a DATA step when
the step ends. See also: close (p. 43), fclose (p. 67).
dcreate(dir_name [,parent_dir]) 9+
Creates an external directory, returning the complete path-
name of the new directory if successful, or an empty string
otherwise. dir_name is the name of the directory to create
(this value can’t include a pathname). parent_dir is the com-
plete pathname of the directory in which to create the new
directory. If parent_dir is omitted, the current directory is
used. Example: (Windows) dcreate('newdir','c:\mydir\')
→ Creates (and returns) the directory c:\mydir\newdir.
depdb(age,init_value,lifetime,rate)
Returns the declining balance depreciation. See daccdb (p. 53)
for a description of the parameters.
Example: depdb(10,1000,15,2) → 36.7796.
depdbsl(age,init_value,lifetime,rate)
Returns the declining balance with conversion to a straight-
line depreciation. See daccdb (p. 53) for a description of the
parameters (except that age must be an integer).
Example: depdbsl(10,1000,15,2) → 45.4693.
See also: daccdbsl (p. 53).
deviance(dist,rand_var,shape_params [,eps])
Computes the deviance for the distributions listed in the fol-
lowing table. eps is an optional small bounding value. Example:
deviance('normal',5,2) → 9.
dhms(sas_date,hour,minute,second )
Creates a SAS datetime from a SAS date, hour, minute, and
second. The result is the number of seconds before (negative)
or after (positive) midnight, 1-Jan-1960.
Examples:
dhms('10may1936'd,00,00,00) → -746150400 (10-May-
1936 midnight).
dhms('1jan1960'd,00,00,00) → 0 (1-Jan-1960 midnight).
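Because the result is just a count of seconds, you can also hand dhms a complete SAS time value as the seconds argument. A minimal sketch, with variable names and values of my own choosing:

```sas
data _null_;
   d='02sep2005'd;
   t='14:30:00't;
   /* Build a datetime from its separate parts */
   dt1=dhms(d,14,30,0);
   /* Same result: pass the whole time value as the seconds argument */
   dt2=dhms(d,0,0,t);
   same=(dt1=dt2);
   put dt1= datetime20. dt2= datetime20. same=;
run;
```

Both datetimes should print as 02SEP2005:14:30:00, and same is 1.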
input x @@;
d1=dif(x);
d2=dif2(x);
l1=lag(x);
l2=lag2(x);
datalines;
1 2 4 3 8 6 -3
;
run;
 x   d1   d2   l1   l2
 1    .    .    .    .
 2    1    .    1    .
 4    2    3    2    1
 3   -1    1    4    2
 8    5    4    3    4
 6   -2    3    8    3
-3   -9  -11    6    8
digamma(x)
Returns the value of the digamma function Γ´(x)/Γ(x), where
Γ() and Γ´() are the functions gamma (p. 80) and its derivative,
respectively, and x is a nonzero real number that isn’t a nega-
tive integer. If x > 0 then digamma is the derivative of lgamma
(p. 98). Examples: digamma(6) → 1.7061176684. digamma(1)
→ -0.577215665. See also: trigamma (p. 157).
dim(array [,n])
dimn(array)
Returns the size (number of elements) of array, or the size of
dimension n (≥ 1) of a multidimensional array. If omitted, n
defaults to 1. dimn(array) and dim(array,n) are equivalent
calls.
Examples:
array a{2:3,10:12,5} x1-x30;
dim(a) or dim(a,1) → 2.
dim2(a) or dim(a,2) → 3.
dim3(a) or dim(a,3) → 5.
dnum(dir_id )
Returns the number of members (files) in a directory. dir_id
is the directory identifier that dopen (p. 62) returns. Use dnum
to determine the highest number that you can pass to dread
(p. 62).
dopen(fileref )
Opens a directory (or subdirectory, MACLIB, or partitioned
dataset, depending on your OS) and returns a unique numer-
ic directory identifier if successful; zero otherwise. Use this ID
to identify the open directory to other directory functions.
fileref is a fileref assigned to the directory by the FILENAME
statement or the function filename (p. 69). See also: dclose
(p. 56), fopen (p. 75), mopen (p. 104), open (p. 111).
doptname(dir_id,n)
Returns the name of the n-th information item for a directory.
dir_id is the directory identifier that dopen (p. 62) returns. n is
the sequence number of the item. The returned item names
vary by OS. Use doptnum (p. 62) to get the number of items
and dinfo (p. 61) to get item values. Example: See dinfo. See
also: foptname (p. 77).
doptnum(dir_id )
Returns the number of information items that are available
for a directory. dir_id is the directory identifier that dopen
(p. 62) returns. The returned number varies by OS. Use
doptname (p. 62) to get item names and dinfo (p. 61) to get
item values. Example: See dinfo. See also: foptnum (p. 77).
dread(dir_id,n)
Returns the name of a directory member, or a blank if unsuc-
cessful. dir_id is the directory identifier that dopen (p. 62)
returns. n is the sequence number of the member (file) in the
directory. Use dnum (p. 62) to determine the maximum n.
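The directory functions are usually used together: open, count, read, close. The following sketch lists a directory's members in a dataset. The names members, mydir, and name are my own, and 'physical_dirname' is a placeholder that you'd replace with a real path.

```sas
data members;
   length name $256;
   keep name;
   /* Assign a fileref to the directory, then open it */
   rc=filename('mydir','physical_dirname');
   dir_id=dopen('mydir');
   if dir_id>0 then do;
      /* dnum gives the highest n that dread accepts */
      do i=1 to dnum(dir_id);
         name=dread(dir_id,i);
         output;
      end;
      rc=dclose(dir_id);
   end;
   /* Unassign the fileref */
   rc=filename('mydir');
run;
```

If dopen fails (returns zero), the step creates an empty dataset rather than stopping with an error.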
dropnote(dataset_id|file_id,note_id )
Deletes a note created by note (p. 110) or fnote (p. 75) from a
SAS dataset or an external file, returning zero (0) if successful;
nonzero otherwise. dataset_id is the dataset identifier that
open (p. 111) returns. file_id is the file identifier that fopen
(p. 75) or mopen (p. 104) returns. note_id is the note identifier
that note or fnote returns. See also: fpoint (p. 77), point
(p. 116).
dsname(dataset_id )
Returns the name of a SAS dataset. dataset_id is the dataset
identifier that open (p. 111) returns. dsname returns an empty
string if dataset_id is invalid.
dur(yld_to_mat,freq,cf0,cf1,...,cfn)
Returns the modified duration of cash flows. yld_to_mat (0 <
yld_to_mat < 1) is the effective per-period yield-to-maturity,
expressed as a fraction. freq (> 0) is the integer frequency of
cash flows per period. cf0, cf1,..., cfn is a list of cash flows.
Example: dur(0.06,2,10,20,30,40) → 1.40123.
See also: convx (p. 51).
durp(par,cpn_rate,cpn_freq,remaining_cpns,
time_to_next_cpn,yld_to_mat)
Returns the modified duration of a bond. For a description of
the parameters see pvp (p. 126).
Example: durp(1000,0.01,4,14,0.33/2,0.1) → 3.2649.
See also: convxp (p. 51).
erf(x)
Returns the value of the normal error function, given by
\[ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-z^2}\,dz. \]
eurocurr(amount,from_ccy,to_ccy)
Converts amount units of currency from_ccy to to_ccy. Use
the currency codes listed in the following table.
ccy Currency
ats Austrian schilling
bef Belgian franc
chf Swiss franc
czk Czech koruna
dem Deutsche mark
dkk Danish krone
esp Spanish peseta
eur or blank Euro
fim Finnish markka
frf French franc
gbp British pound sterling
grd Greek drachma
huf Hungarian forint
iep Irish pound
itl Italian lira
luf Luxembourg franc
nlg Dutch guilder
nok Norwegian krone
plz Polish zloty
pte Portuguese escudo
rol Romanian leu
rur Russian ruble
sek Swedish krona
sit Slovenian tolar
trl Turkish lira
yud Yugoslavian dinar
Examples:
Euros to Deutsche marks
eurocurr(1,'eur','dem') → 1.95583.
exp(-1) → 0.3678794412.
exp(0) → 1.
exp(1) → 2.7182818285.
exp(2) → 7.3890560989.
exp(log(2)) or log(exp(2)) → 2.
exp(.) → . (missing).
fact(n)
Returns the factorial of n (≥ 0), given by n! = 1 × 2 × ⋅⋅⋅ × n,
where n is an integer. The special case 0! is defined to be equal
to 1. Note that fact(n) equals gamma(n+1).
Examples:
fact(0) → 1.
fact(1) → 1.
fact(8) → 40320.
fact(int(8.5)) → 40320.
fact(20) → 2.432902E18.
See also: comb (p. 45), gamma (p. 80), perm (p. 115).
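The relationships among fact, gamma, comb, and perm are easy to check for yourself. A small sketch, with values picked arbitrarily:

```sas
data _null_;
   n=5; r=2;
   f=fact(n);
   g=gamma(n+1);                      /* equals fact(n) */
   c1=comb(n,r);
   c2=fact(n)/(fact(r)*fact(n-r));    /* combinations from factorials */
   p=perm(n,r);
   put f= g= c1= c2= p=;
run;
```

For these values, f and g should both come out as 120, c1 and c2 as 10, and p as 20.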
fappend(file_id [,cc])
Appends the record that is currently in the File Data Buffer
(FDB) to the end of an external file, returning zero (0) if suc-
cessful; nonzero otherwise. file_id is the file identifier that
fopen (p. 75) or mopen (p. 104) returns. cc is one of the car-
riage-control characters listed in the following table. See also:
fread (p. 78), fwrite (p. 79).
cc Means
blank Start the record on a new line
0 Skip one blank line before a new line
- Skip two blank lines before a new line
1 Start the line on a new page
+ Overstrike the line on the previous line
P Interpret the line as a terminal prompt
= Interpret the line as carriage-control information
All else Start the record on a new line
fcol(file_id )
Returns the current column position in the File Data Buffer
(FDB). file_id is the file identifier that fopen (p. 75) or mopen
(p. 104) returns. Use fcol with fpos (p. 77) to manipulate
data in the FDB. See also: fget (p. 69), fput (p. 78).
fdelete(fileref|dir)
Deletes an external file or an empty directory, returning zero
(0) if successful; nonzero otherwise. fileref is an unconcatenat-
ed fileref assigned to the external file by the FILENAME state-
ment or the function filename (p. 69). dir is an empty
directory that you have permission to delete.
Example:
Assign a fileref, delete a file, and then unassign the fileref
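data _null_;
   fileref="tempfile";
   /* Assign the fileref to the physical file */
   rc=filename(fileref,"physical_filename");
   /* Delete only if the assignment worked and the file exists */
   if rc=0 and fexist(fileref) then
      rc=fdelete(fileref);
   msg=sysmsg();
   if msg ne ' ' then put msg=;
   /* Unassign the fileref */
   rc=filename(fileref);
run;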
See also: fetchobs (p. 68), getvarc (p. 81), getvarn (p. 81).
fetchobs(dataset_id,n [,options])
Reads the n-th observation from a SAS dataset into the
Dataset Data Vector (DDV), returning zero (0) if successful,
nonzero if unsuccessful, or -1 for end-of-dataset. Use sysmsg
(p. 153) to get the error message for a nonzero result.
dataset_id is the dataset identifier that open (p. 111) returns.
n is the number of the observation to read. fetchobs treats n
as a relative observation number (unless you use the 'abs'
option); n may or may not coincide with the physical observa-
tion number on disk because fetchobs skips observations
marked for deletion. When a WHERE clause is active,
fetchobs counts only observations that meet the WHERE
condition.
options is one or both of the following space-separated
options: 'abs' means that n is absolute (that is, deleted obser-
vations are counted) and 'noset' prevents the automatic
passing of dataset variable values to DATA-step or macro
variables even if call set (p. 31) has been called.
See also: fetch (p. 67), getvarc (p. 81), getvarn (p. 81).
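Here's how fetchobs fits with open, varnum, getvarc, getvarn, and close. This is a sketch only: it assumes the sample dataset sashelp.class that ships with SAS, and the variable names are mine.

```sas
data _null_;
   length name $8;
   dataset_id=open('sashelp.class');
   if dataset_id>0 then do;
      /* Read the third observation into the DDV */
      rc=fetchobs(dataset_id,3);
      if rc=0 then do;
         /* Look up each variable's position, then read its value */
         name=getvarc(dataset_id,varnum(dataset_id,'name'));
         age=getvarn(dataset_id,varnum(dataset_id,'age'));
         put name= age=;
      end;
      else do;
         msg=sysmsg();
         put msg=;
      end;
      rc=close(dataset_id);
   end;
run;
```

Calling varnum rather than hard-coding n keeps the step working if the dataset's variable order changes.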
fexist(fileref )
Determines whether the external file associated with a fileref
exists, returning one (1) if it exists; zero otherwise. fileref is a
fileref assigned to the external file by the FILENAME state-
ment or the function filename (p. 69). Example: See fdelete
(p. 67). See also: cexist (p. 43), exist (p. 65), fileexist
(p. 69).
filename(fileref,filename [,device_type
[,host_options[,dir_ref ]]])
Assigns or unassigns a fileref to an external file, directory, or
output device, returning zero (0) if successful; nonzero other-
wise. The association between a fileref and a physical file lasts
only for the duration of the current SAS session or until you
change or unassign the association by using the FILENAME
statement or the filename function again.
fileref is, in a DATA step, the fileref to assign to the external
file. In a macro (in the %sysfunc function, for example), fileref
is the name of a macro variable (without an ampersand)
device_type Device
disk|base Disk drive (when you assign a fileref to an on-disk
file, you don’t have to use disk)
dummy The output to the file is discarded (use for testing)
gterm Graphics device that will be receiving graphics data
pipe Unnamed pipe (some OSes don’t support pipes)
plotter Unbuffered graphics output device
printer Printer or printer spool file
tape Tape drive
temp Temporary file that exists only as long as the file-
name is assigned. The file is accessed through only
the logical name and is available only while the logi-
cal name exists. Don’t specify a physical pathname.
Files manipulated by the temp device can have the
same attributes and behave identically to disk files
terminal User’s terminal
uprinter Universal Printing printer definition name
fileref(fileref )
Determines whether a fileref has been assigned to a physical
file for the current SAS session, returning a positive number if
fileref isn’t assigned; a negative number if fileref exists but its
associated file doesn’t; or zero if fileref and the associated file
both exist. You can assign fileref to an external file by using
the FILENAME statement or the function filename (p. 69).
find(str,substr [,modifiers][,start_pos]) 9+
find(str,substr [,start_pos][,modifiers]) 9+
Searches a string for a substring and returns the first position
at which it’s found, or zero (0) if it’s not found. For a descrip-
tion of start_pos see any<chartype> (p. 10). modifiers is one
or more of the values listed in the following table.
modifiers Means
i Ignore character case during the search. If this modifier
is omitted, the search is case-sensitive.
t Trim trailing spaces from str and substr.
Examples:
str='She sells seashells? Yes, she does.';
find(str,'she ','i') → 1.
find(str,'she ','it') → 1.
See also: count (p. 52), findc (p. 72), index (p. 85), rxmatch
(p. 134).
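start_pos lets you step from one match to the next. A minimal sketch using the string from the examples; the variable names are my own.

```sas
data _null_;
   str='She sells seashells? Yes, she does.';
   /* Case-sensitive search skips the capitalized 'She' */
   first=find(str,'she');
   /* Case-insensitive search finds it at the start */
   first_i=find(str,'she','i');
   /* Resume the search one position past the previous match */
   second_i=find(str,'she','i',first_i+1);
   put first= first_i= second_i=;
run;
```

Here first is 14 (inside ‘seashells’), first_i is 1, and second_i is 14.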
findc(str,chars [,modifiers][,start_pos]) 9+
findc(str,chars [,start_pos][,modifiers]) 9+
Searches a string for any of the specified characters (or, with the v
modifier, for any character that isn’t among them) and returns the first
position at which one is found, or zero (0) if none is found. For a
description of start_pos see any<chartype> (p. 10). modifiers
is one or more of the values listed in the following table.
modifiers Means
i Ignore character case during the search. If this modifier
is omitted, the search is case-sensitive.
o Process chars and modifiers only once at the first call to
findc. Changes to chars or modifiers in subsequent
calls are ignored.
t Trim trailing spaces from str and chars.
v Search for (findc) or count (countc) only the characters that
don’t appear in chars.
Examples:
str='Hi there, Ian!';
findc(str,'Hi') → 1.
findc(str,'hi') → 2.
findc(str,'Hi','v') → 3.
findc(str,'Hi ','vi') → 4.
findc(str,'Hi','i',4) → 5.
See also: countc (p. 52), find (p. 71), indexc (p. 85), verify
(p. 162).
finfo(file_id,info_item)
Returns information about an external file. file_id is the file
identifier that fopen (p. 75) or mopen (p. 104) returns.
info_item is a string that identifies the item to retrieve. The
available info_items vary by OS. Use foptname (p. 77) to get
item names and foptnum (p. 77) to get the number of available
items. finfo returns a blank if info_item is invalid.
Example:
Create a dataset with information-item names and values
data finfo;
   length info_item item_value $60;
   drop rc file_id num_items i close;
   /* Assign a fileref and open the file (input mode by default) */
   rc=filename('myfile','physical_filename');
   file_id=fopen('myfile');
   /* Write one observation per information item */
   num_items=foptnum(file_id);
   do i=1 to num_items;
      info_item=foptname(file_id,i);
      item_value=finfo(file_id,info_item);
      output;
   end;
   close=fclose(file_id);
run;
See also: stfips (p. 150), stname (p. 150), zipfips (p. 165),
zipname (p. 166).
fipnamel(fips_code)
Converts a FIPS code to its state or U.S. territory name in
mixed case (≤ 20 characters). fips_code is a U.S. Federal Infor-
mation Processing Standards numeric code. Examples: See
fipname (p. 73). See also: fipstate (p. 74), stfips (p. 150),
stnamel (p. 150), zipfips (p. 165), zipnamel (p. 166).
fipstate(fips_code)
Converts a FIPS code to its two-character state or U.S. territo-
ry postal code or GSA geographic code in uppercase. fips_code
is a U.S. Federal Information Processing Standards numeric
code. Examples: See fipname (p. 73). See also: fipnamel (p. 74),
stfips (p. 150), zipfips (p. 165), zipstate (p. 166).
floor(x)
Returns the largest integer ≤ x. If x is within 1E-12 of an integer,
floor returns that integer. Examples: See ceil (p. 42). See
also: floorz (p. 75), int (p. 87), round (p. 133).
fnonct(2,4,5,probf(2,4,5,1.5)) → 1.5.
See also: cnonct (p. 44), finv (p. 73), probf (p. 117), tnonct
(p. 156).
fnote(file_id )
Returns a unique numeric record ID for the last record that
was read in an external file. file_id is the file identifier that
fopen (p. 75) or mopen (p. 104) returns. Use the returned ID
with fpoint (p. 77) to later return to the marked record. To
delete the ID, use dropnote (p. 62). See also: frewind (p. 78),
note (p. 110).
fopen
open_mode Mode
a Append mode allows writing new records after the
current end-of-file
i Input mode allows reading only (the default)
o Output mode defaults to the open mode specified
in the OS option in the FILENAME statement or
filename (p. 69) function; if no OS option is spec-
ified, this option allows writing new records at the
beginning of the file (output mode overwrites the
file’s current contents without warning)
s Sequential input mode is used for pipes and other
sequential devices such as hardware ports
u Update mode allows both reading and writing
fput(file_id,fdata)
Moves data to the File Data Buffer (FDB) of an external file
starting at the FDB’s current column position, returning zero
(0) if successful; nonzero otherwise. file_id is the file identifier
that fopen (p. 75) or mopen (p. 104) returns. fdata is the file
data. In a DATA step, fdata is a quoted string or DATA-step
variable; in a macro, fdata is a macro variable. The variable’s
length determines the number of bytes moved to the FDB.
The value of the column pointer is increased to one position
past the end of the new text. See also: fcol (p. 67), fget (p. 69),
fpos (p. 77).
fread(file_id )
Reads a record from an external file into the File Data Buffer
(FDB), returning zero (0) if successful; nonzero otherwise.
file_id is the file identifier that fopen (p. 75) or mopen (p. 104)
returns. The file-pointer position updates automatically after
a read so that successive freads read successive records. To
position a file pointer explicitly, use fpoint (p. 77) or frewind
(p. 78). See also: frlen (p. 78), fwrite (p. 79).
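A typical read loop pairs fread with fget: fread loads a record into the FDB and fget copies data out of it. The sketch below echoes a file to the log. The names myfile and line are my own, the 200-byte length is an arbitrary choice, and 'physical_filename' is a placeholder.

```sas
data _null_;
   length line $200;
   rc=filename('myfile','physical_filename');
   file_id=fopen('myfile','i');
   if file_id>0 then do;
      /* fread returns nonzero at end-of-file, which ends the loop */
      do while(fread(file_id)=0);
         /* Copy up to 200 bytes from the FDB into line */
         rc=fget(file_id,line,200);
         put line;
      end;
      rc=fclose(file_id);
   end;
   rc=filename('myfile');
run;
```

Passing a length to fget reads that many bytes from the buffer instead of stopping at the first delimiter.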
frewind(file_id )
Moves the file pointer to the start of a file, returning zero (0) if
successful; nonzero otherwise. file_id is the file identifier that
fopen (p. 75) or mopen (p. 104) returns. frewind doesn’t affect
a file opened with sequential access. See also: fpoint (p. 77),
fread (p. 78), rewind (p. 133).
frlen(file_id )
Returns the length of the last record read or, if the file is
opened for output, returns the length of the current record.
file_id is the file identifier that fopen (p. 75) or mopen (p. 104)
returns. See also: fread (p. 78).
fsep(file_id,delims)
Sets the token delimiters for fget (p. 69), returning zero (0) if
successful; nonzero otherwise. file_id is the file identifier that
fopen (p. 75) or mopen (p. 104) returns. delims (default blank)
fwrite(file_id [,cc])
Writes the next record to a file, returning zero (0) if success-
ful; nonzero otherwise. file_id is the file identifier that fopen
(p. 75) or mopen (p. 104) returns. cc is one of the carriage-con-
trol characters listed in the following table.
cc Means
blank Start the record on a new line
0 Skip one blank line before a new line
- Skip two blank lines before a new line
1 Start the line on a new page
+ Overstrike the line on the previous line
P Interpret the line as a terminal prompt
= Interpret the line as carriage-control information
All else Start the record on a new line
fwrite moves text from the File Data Buffer (FDB) to the
external file. To use the carriage-control characters, you must
open the file with the record format of 'p' (print format) in
fopen. See also: fappend (p. 66), fread (p. 78).
gaminv(prob,a)
Returns the prob-th quantile from the gamma distribution
with shape parameter a (> 0). The probability that an obser-
vation from a gamma distribution is less than or equal to the
reporting_opts Returns
in Graphic units in inches
cm Graphic units in centimeters
keyword Option values in KEYWORD= format suitable
for use in OPTIONS or GOPTIONS statements
Examples:
getoption('center') → NOCENTER.
getoption('yearcutoff') → 1920.
getoption('PAGESIZE','keyword') → PAGESIZE=55.
getvarc(dataset_id,n)
Returns the character value of the n-th variable of the current
observation in a SAS dataset. dataset_id is the dataset identifi-
er that open (p. 111) returns. Use varnum (p. 161) or PROC
CONTENTS to get n. See also: fetch (p. 67), fetchobs (p. 68),
getvarn (p. 81), var<attr> (p. 160).
getvarn(dataset_id,n)
Returns the numeric value of the n-th variable of the current
observation in a SAS dataset. dataset_id is the dataset identifi-
er that open (p. 111) returns. Use varnum (p. 161) or PROC
CONTENTS to get n. See also: fetch (p. 67), fetchobs (p. 68),
getvarc (p. 81), var<attr> (p. 160).
harmean(x1 [,x2,...]) 9+
Returns the harmonic mean of nonmissing arguments. If all
the arguments are missing values, the result is a missing value.
The harmonic mean is defined by
\[ \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} \]
harmeanz(x1 [,x2,...]) 9+
Same as harmean (p. 81) but doesn’t fuzz close-to-zero x’s.
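To see the harmonic-mean definition at work, compare the built-in function with the formula written out by hand. A sketch with arbitrary values:

```sas
data _null_;
   x1=1; x2=2; x3=4;
   built_in=harmean(of x1-x3);
   /* n divided by the sum of the reciprocals */
   by_hand=3/(1/x1+1/x2+1/x3);
   put built_in= by_hand=;
run;
```

Both values should be about 1.714 (that is, 3/1.75).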
hbound(array [,n])
hboundn(array)
Returns the upper bound of array, or the upper bound of
dimension n (≥ 1) of a multidimensional array. If omitted, n
defaults to 1. hboundn(array) and hbound(array,n) are
equivalent calls.
Examples:
array a{2:3,10:12,5} x1-x30;
hbound(a) or hbound(a,1) → 3.
hbound3(a) or hbound(a,3) → 5.
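When an array's lower bounds aren't 1, loop from lbound to hbound rather than from 1 to dim. The sketch below walks the array from the examples; the counter names are mine.

```sas
data _null_;
   array a{2:3,10:12,5} x1-x30;
   n=0;
   /* Visit every element by using each dimension's own bounds */
   do i=lbound(a,1) to hbound(a,1);
      do j=lbound(a,2) to hbound(a,2);
         do k=lbound(a,3) to hbound(a,3);
            n=n+1;
            a{i,j,k}=n;
         end;
      end;
   end;
   /* The product of the dimension sizes is the element count */
   total=dim(a,1)*dim(a,2)*dim(a,3);
   put n= total=;
run;
```

Both n and total are 30, matching the 30 variables x1-x30.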
See also: datetime (p. 55), dhms (p. 59), mdy (p. 101), time
(p. 155), today (p. 156).
gt > >
Examples:
htmlencode("I'm sure x<y") → ‘I'm sure x&lt;y’.
is equivalent to
is equivalent to
if sales>500 then bonus=10000; else bonus=0;
are delims, the beginning of str, and the end of str. If both str
and word are empty strings or contain only spaces, the result
is 1. If word is an empty string or contains only spaces, and str
contains a character or numeric value, the result is 0.
Examples:
indexw('asdf adog dog','dog') → 11.
indexw('asdf@adog%dog','adog','%@') → 6.
See also: inputc (p. 86), inputn (p. 87), put (p. 125), putc
(p. 126), putn (p. 126), INPUT statement.
inputc(str,char_informat.[,w])
Uses a character informat to read a string at run time. Specify-
ing w overrides any width specification in char_informat. The
result’s default length is the length of str.
intck(interval,from,to)
Counts the number of intervals between from and to, which
are two SAS dates, two SAS times, or two SAS datetimes. The
general form of interval is unit [mult][.start].
unit is the interval unit; use one of the values listed in the fol-
lowing table.
interval Interval
day3 Three-day intervals starting on Sunday
week.7 Weekly with Saturday as the first day of the
week
week6.13 Six-week intervals starting on second Fridays
week2 Biweekly intervals starting on first Sundays
interval Interval
week1.1 Same as week
week.2 Weekly intervals starting on Mondays
week6.3 Six-week intervals starting on first Tuesdays
week6.11 Six-week intervals starting on second Wednes-
days
week4 Four-week intervals starting on first Sundays
weekday1w Six-day week with Sunday as a weekend day
weekday35w Five-day week with Tuesday and Thursday as
weekend days (the trailing W means that day 3
and day 5 are weekend days)
weekday17w Same as weekday
weekday67w Five-day week with Friday and Saturday as
weekend days
weekday3.2 Three-weekday intervals with Saturday and
Sunday as weekend days (with the cycle three-
weekday intervals aligned to Monday 4 Jan
1960)
tenday4.2 Four ten-day periods starting at the second
tenday period
interval Interval
year4.35 Four-year intervals starting in November of
even years between leap years (U.S. midterm
elections)
dtmonth13 Thirteen-month intervals starting at midnight
of 1-Jan-1960, such as 1-Nov-1957, 1-Dec-1958,
1-Jan-1960, 1-Feb-1961, and 1-Mar-1962
hour8.7 Eight-hour intervals starting at 6 AM, 2 PM,
and 10 PM
Examples:
intck('qtr','10jan2005'd,'01jul2005'd) → 2.
intck('week2.2','01jan2005'd,'31mar2005'd) → 6.
intck('year','01jan2005'd,'31dec2005'd) → 0.
intnx(interval,start,increment [,alignment])
Increments a date, time, or datetime by a specified interval or
intervals. The starting point start is a SAS date, time, or date-
time. increment is the integer number of intervals to shift
start. The general form of interval, unit [mult][.start], is
described in intck (p. 87). alignment sets the position of the
result; use one of the values listed in the following table.
Examples:
intnx('year','05feb2005'd,3) → 17532 (1-Jan-2008).
intnx('dtday','8jun2005:08:25:30'dt,0) →
1433808000 (8-Jun-2005 00:00:00).
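It helps to remember that intck counts interval boundaries crossed, not elapsed time, and that intnx returns the start of the target interval unless you ask otherwise. A minimal sketch; the variable names are mine, and 'e' is the SAS alignment value for the end of the interval.

```sas
data _null_;
   /* One day apart, but a year boundary is crossed */
   one_day=intck('year','31dec2005'd,'01jan2006'd);
   /* Almost a full year apart, but no boundary is crossed */
   almost_year=intck('year','01jan2005'd,'31dec2005'd);
   /* Default alignment: the beginning of the next month */
   next_month=intnx('month','15feb2005'd,1);
   /* 'e' alignment: the end of the next month */
   month_end=intnx('month','15feb2005'd,1,'e');
   put one_day= almost_year=;
   put next_month= date9. month_end= date9.;
run;
```

Here one_day is 1, almost_year is 0, next_month is 01MAR2005, and month_end is 31MAR2005.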
iorcmsg()
Returns the error-message text of the current value of the
automatic variable _IORC_. The result’s default length is 200.
_IORC_ is created when you use the MODIFY statement or
the SET statement with the KEY= option. The _IORC_ value
is internal and meant to be read in conjunction with the
%sysrc autocall macro. Don’t set _IORC_ yourself.
iqr(x1 [,x2,...]) 9+
Returns the interquartile range (third quartile minus the first
quartile) of nonmissing arguments. If all the arguments are
missing values, the result is a missing value. Example: x1=3;
x2=4; x3=5; iqr(-10,2,of x1-x3,.,123) → 3. See also:
mad (p. 100), pctl (p. 113).
irr(freq,cf0,cf1,...,cfn)
Returns the internal rate of return. irr is the same as intrr
(p. 91), except that the result is a percentage, not a fraction.
jbessel(nu,x)
Returns the value of the Bessel function of order nu (≥ 0)
evaluated at x (≥ 0). Example: jbessel(1,2) → 0.5767248078.
See also: ibessel (p. 84).
juldate(sas_date)
Converts a SAS date to a Julian date. If sas_date falls within
the 100-year span defined by the YEARCUTOFF= system
option, the result is yyddd; otherwise the result is yyyyddd (1
≤ ddd ≤ 366). For example, if YEARCUTOFF=1920, juldate
returns 97001 for January 1, 1997 and 1878365 for December
31, 1878. Use juldate7 (p. 92) to avoid two-digit years.
Example: juldate(today()) → 05245 (today is 2-Sep-2005).
See also: datejul (p. 55).
juldate7(sas_date)
Converts a SAS date to a Julian date. juldate7 is the same as
juldate (p. 92), except that its result always has seven digits
yyyyddd (1 ≤ ddd ≤ 366).
Example: juldate7(today()) → 2005245 (today is 2-Sep-
2005).
See also: datejul (p. 55).
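Both functions return numbers, so a five-digit result loses its leading zero unless you apply a format. The sketch below uses a fixed date and assumes YEARCUTOFF=1920; the variable names are my own.

```sas
data _null_;
   d='02sep2005'd;
   j5=juldate(d);      /* yyddd while d is inside the YEARCUTOFF span */
   j7=juldate7(d);     /* always yyyyddd */
   back=datejul(j7);   /* convert the Julian date back to a SAS date */
   put j5= z5. j7= back= date9.;
run;
```

With that setting, j5 should print as 05245, j7 as 2005245, and back as 02SEP2005.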
kcompare(str1,[start_pos,[len,]]str2)
(DBCS) Compares two strings and returns a negative number
if str1 < str2, zero if str1 = str2, or a positive number if str1 >
str2. start_pos is the starting position in str1 to begin the com-
parison. If start_pos is omitted, all of str1 is compared. If
start_pos < 0, str1 is assumed to be extended DBCS data that
contains no SO/SI characters. len is the number of bytes to
compare. If len is omitted, all of str1 that follows start_pos is
compared (trailing spaces are ignored).
kcompress(str [,chars])
(DBCS) Removes characters from a string. chars lists the
character(s) to remove. If chars is omitted, all single- and dou-
ble-byte spaces are removed. See also: kleft (p. 93), kright
(p. 94), ktrim (p. 95).
options Means
nososi| No shift code or Hankaku characters
noshift
kindex(str,substr)
(DBCS) Scans a string from left to right and returns the first
position at which a substring appears within the string, or
zero (0) if the substring isn’t found. See also: kindexc (p. 93).
kindexc(str,substr1 [,substr2,...])
(DBCS) Scans a string from left to right and returns the first
position in the string at which any of the characters in the
substrings appear, or zero (0) if no characters in any of the
substrings are found. See also: kindex (p. 93).
kleft(str)
(DBCS) Left-aligns a string by removing leading DBCS spaces
and SO/SI. See also: kcompress (p. 92), kright (p. 94), ktrim
(p. 95).
klength(expression)
(DBCS) Returns the length of its argument. The length is the
position of the rightmost nonspace character in expression. If
kscan(str,n [,delims])
(DBCS) Returns the n-th word in a string. See scan (p. 145)
for a description of the parameters.
kstrcat(str1,str2 [,str3,...])
(DBCS) Concatenates two or more single-byte or double-byte
strings, removing SO/SI pairs between the strings.
ksubstr(str,start_pos [,len])
(DBCS) Extracts a substring from a string. start_pos (≥ 1) is
the position of the first character in the substring. len is the
length of the substring to extract. If len is zero, negative, or
greater than the length of str after start_pos, ksubstr extracts
the remainder of str and logs an invalid-length note. If len is
omitted, ksubstr extracts the remainder of str. The result’s
default length is the length of str.
Example: ksubstr('kidnap',4,3) → ‘nap’.
See also: ksubstrb (p. 94).
ksubstrb(str,start_pos [,len])
(DBCS) Same as ksubstr (p. 94), except that start_pos and len
are expressed in byte units. If len is greater than the length (in
byte units) of str after start_pos, ksubstrb extracts the
remainder of str.
kupdate('abcde',3,'xx') → ‘abxx’.
kupdate('abcde',3,2,'xx') → ‘abxxe’.
kurtosis(x1,x2,x3,x4 [,x5,...])
Returns the kurtosis (peakedness) of nonmissing arguments.
At least four nonmissing arguments are required or the result
is a missing value. Example: x1=-3; x2=0.5; x3=50; y1=.;
kurtosis(2,of x1-x3,6,y1) → 4.5733146626.
kverify(str,substr1 [,substr2,...])
(DBCS) Returns the position of the first character in a string
that’s not in any substring, or zero if every character in str is in
at least one substr.
lag(value)
lagn(value)
Returns the queued value supplied in the n-th preceding lag
function call. n ranges from 1 (the default) to 100. lag1 is
equivalent to lag. value is numeric or character. Each occur-
rence of a lagn in a program generates its own queue of val-
ues. The lagn queue is initialized with n missing values.
When lagn executes, its queue’s front value is removed and
returned; the remaining values are shifted frontward; and the
new value is placed at the back of the queue. Missing values
are returned for the first n executions of lagn, after which the
lagged values of value start to appear. When value is character,
the result’s default length is 200. Example: See dif (p. 59).
largest(n,x1 [,x2,...]) 9+
Returns the n-th largest of nonmissing values. If n is missing,
a noninteger, less than one, or greater than the number of
nonmissing values, then largest returns a missing value.
Example: x1=-3.5; x2=-4; x3=5; largest(2,2,of x1-
x3,6,.) → 5. See also: max (p. 101), min (p. 101), ordinal
(p. 112), pctl (p. 113), smallest (p. 147).
lbound(array [,n])
lboundn(array)
Returns the lower bound of array, or the lower bound of
dimension n (≥ 1) of a multidimensional array. If omitted, n
defaults to 1. lboundn(array) and lbound(array,n) are
equivalent calls.
Examples:
array a{2:3,10:12,5} x1-x30;
lbound(a) or lbound(a,1) → 2.
lbound3(a) or lbound(a,3) → 1.
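A common use of lbound is to loop over an array whose subscripts don't start at 1. The sketch below pairs it with hbound; the array and its initial values are hypothetical.

```sas
data _null_;
  array t{5:7} t5-t7 (50 60 70);   /* subscripts run from 5 to 7 */
  do i = lbound(t) to hbound(t);
    value = t{i};
    put i= value=;
  end;
run;
```

This should print i=5 value=50, i=6 value=60, and i=7 value=70.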
length(str)
Returns the length of a string, not counting trailing spaces,
returning 1 if str is an empty string or a string of one or more
spaces. If str is a number, length returns 12 and logs a note
that numbers have been converted to characters. Examples:
lengthc(str) 9+
Returns the length of a string, counting trailing spaces. If str is a
number, lengthc returns 12 and logs a note that numbers have
been converted to characters. Examples: See length (p. 97). See
also: lengthm (p. 98), lengthn (p. 98).
lengthm(str) 9+
Returns the number of bytes allocated to a string. If str is a
number, lengthm returns 12 and logs a note that numbers
have been converted to characters. Examples: See length
(p. 97). See also: lengthc (p. 98), lengthn (p. 98).
lengthn(str) 9+
Returns the length of a string, not counting trailing spaces,
returning 0 if str is an empty string or a string of one or more
spaces. If str is a number, lengthn returns 12 and logs a note
that numbers have been converted to characters. Examples: See
length (p. 97). See also: lengthc (p. 98), lengthm (p. 98).
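The four length functions differ mainly in how they treat trailing spaces and blank strings. A minimal sketch, assuming a hypothetical character variable s declared with a length of 10:

```sas
data _null_;
  length s $ 10;

  s = 'abc';
  len  = length(s);    /* 3: trailing spaces not counted */
  lenc = lengthc(s);   /* 10: trailing spaces counted */
  lenm = lengthm(s);   /* 10: bytes allocated to s */
  lenn = lengthn(s);   /* 3: same as length for a nonblank string */
  put len= lenc= lenm= lenn=;

  s = ' ';
  len  = length(s);    /* 1: blank string still reports 1 */
  lenc = lengthc(s);   /* 10 */
  lenm = lengthm(s);   /* 10 */
  lenn = lengthn(s);   /* 0: blank string reports 0 */
  put len= lenc= lenm= lenn=;
run;
```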
lgamma(x)
Returns the natural log of the function gamma (p. 80), defined
for x > 0. lgamma(x) is the same as log(gamma(x)). Examples:
lgamma(6) → 4.7874917428. lgamma(1) → 0. lgamma(0) → .
(missing). See also: digamma (p. 60), log (p. 99), trigamma
(p. 157).
libname(libref [,data_lib [,engine [,options]]])
Assigns or unassigns a libref for a SAS data library, returning
zero (0) if successful; or nonzero on an error (failure), warn-
log(x)
Returns the natural (base e) logarithm of x (> 0). log is the
inverse of exp (p. 65). See also: log10 (p. 99), log2 (p. 99).
Examples:
log(0.5) → -0.693147181.
log(1.0) → 0.
log(constant('e')) → 1.
log(exp(2)) or exp(log(2)) → 2.
log10(x)
Returns the base-10 logarithm of x (> 0). Examples: log10(.01)
→ -2. log10(1000) → 3. See also: log (p. 99), log2 (p. 99).
log2(x)
Returns the base-2 logarithm of x (> 0). Examples: log2(0.5)
→ -1. log2(256) → 8. See also: log (p. 99), log10 (p. 99).
logbeta(a,b) 9+
Returns the natural logarithm of the function beta (p. 17).
See beta for a description of the parameters. logbeta is the
same as log(beta()). See also: gamma (p. 80), log (p. 99).
logcdf(dist,quantile [,param1,param2,...]) 9+
Returns the natural logarithm of the left cumulative distribu-
tion function cdf (p. 40). See cdf for a description of the
parameters. logcdf is the same as log(cdf()). See also: log
(p. 99).
logpdf(dist,quantile [,param1,param2,...])
Returns the natural logarithm of the probability density
(mass) function pdf (p. 114). See cdf (p. 40) for a description
of the parameters. logpdf is the same as log(pdf()). Alias:
logpmf. See also: log (p. 99).
logsdf(dist,quantile [,param1,param2,...])
Returns the natural logarithm of the survival function sdf
(p. 146). See cdf (p. 40) for a description of the parameters.
logsdf is the same as log(sdf()). See also: log (p. 99).
lowcase(str)
Converts all uppercase letters in a string to lowercase. The
result’s default length is the length of str.
Examples:
lowcase('Place de l''Étoile') → ‘place de l'étoile’.
mdy(1,1,1960) → 0 (1-Jan-1960).
See also: datetime (p. 55), dhms (p. 59), hms (p. 82), time
(p. 155), today (p. 156).
mean(x1 [,x2,...])
Returns the arithmetic mean (average) of nonmissing argu-
ments. If all the arguments are missing values, the result is a
missing value. Example: x1=3; x2=4; x3=5; mean(2,of x1-
x3,6,.) → 4. See also: median (p. 101), n (p. 106), sum (p. 153).
median(x1 [,x2,...]) 9+
Returns the median of nonmissing arguments. If all the argu-
ments are missing values, the result is a missing value. Exam-
ple: x1=3; x2=4; x3=5; median(-10,2,of x1-x3,.,123)
→ 3.5. See also: mad (p. 100), mean (p. 101), n (p. 106).
min(x1,x2 [,x3,...])
Returns the smallest of nonmissing arguments. If all the argu-
ments are missing values, the result is a missing value. Exam-
ple: x1=-3; x2=-4; x3=5; min(2,of x1-x3,6,.) → -4. See
also: largest (p. 96), max (p. 101), smallest (p. 147).
missing('abc') → 0.
missing(.) → 1.
missing(.a) → 1.
missing(._) → 1.
missing('') → 1.
missing(' ') → 1.
missing(' ') → 1.
See also: call missing (p. 21), n (p. 106), nmiss (p. 109).
mod(x,divisor)
Returns the remainder when x is divided by divisor, returning
zero if the result is within 10^-12 of zero or divisor to avoid most
unexpected floating-point results. Nonzero results have the
same sign as x; the sign of divisor is ignored.
Examples:
The following table shows examples of mod and modz (p. 104)
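A few representative calls, as a sketch of the sign rule only (the variable names are hypothetical):

```sas
data _null_;
  r1 = mod(10, 3);     /* 1 */
  r2 = mod(-10, 3);    /* -1: the result takes the sign of x */
  r3 = mod(10, -3);    /* 1: the sign of divisor is ignored */
  r4 = mod(6, 3);      /* 0 */
  put r1= r2= r3= r4=;
run;
```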
open_mode Mode
a Append mode allows writing new records after the
current end-of-file
i Input mode allows reading only (the default)
o Output mode defaults to the OPEN mode specified
in the OS option in the FILENAME statement or
the function filename (p. 69); if no OS option is
specified, this option allows writing new records at
the beginning of the file (output mode overwrites
the file’s current contents without warning)
s Sequential input mode is used for pipes and other
sequential devices such as hardware ports
u Update mode allows both reading and writing
w Sequential update mode is used for pipes and other
sequential devices such as ports
n(x1 [,x2,...])
Counts nonmissing arguments.
Examples:
n(10,11) → 2.
n(10,.,11) → 2.
n(.,.,.) → 0.
Examples:
nldate('24feb2003'd,'%B-%d.log') → ‘February-24.log’.
If OPTIONS LOCALE=German_Germany
nldate('24feb2003'd,'%A') → Montag.
See also: compare (p. 46), dequote (p. 57), nvalid (p. 110).
Examples:
If OPTIONS LOCALE=English_unitedstates
nltime('12:39:43't,'%I%p') → 00 PM.
If OPTIONS LOCALE=German_Germany
nltime('12:39:43't,'%I%p') → 00 nachm.
nmiss(10,.,11) → 1.
nmiss(.,.,.) → 3.
notdigit(s) → 1.
notdigit(s,0) → 0.
notupper(s,1) → 4.
notupper(s,-4) → 4.
notalpha(s,999) → 0.
notalpha(s,-999) → 13.
notcntrl(s) → 1.
note(dataset_id)
Returns a unique numeric observation ID for the current
observation of a SAS dataset. dataset_id is the dataset identifi-
er that open (p. 111) returns. Use the returned ID with point
(p. 116) to later return to the marked observation. To delete
the ID, use dropnote (p. 62). See also: fnote (p. 75), rewind
(p. 133).
npv(int_rate,freq,cf0,cf1,...,cfn)
Returns the net present value. npv is the same as netpv
(p. 106), except that int_rate is expressed as a percentage, not
a fraction.
nvalid(str [,validvarname]) 9+
Determines whether a string is valid for use as a SAS variable
name in a SAS statement. nvalid returns one (1) if str is valid
or zero (0) otherwise (trailing spaces are ignored).
validvarname is one of the validity rules listed in the following
table. If validvarname is omitted, nvalid uses the value of the
See also: compare (p. 46), dequote (p. 57), nliteral (p. 108).
open([dataset_name [,open_mode]])
Opens a SAS dataset and returns a unique numeric dataset
identifier if successful; zero otherwise. Use this ID to identify
the open dataset to other dataset functions.
dataset_name is the name of the SAS dataset or SQL view to
open. It takes the form
[libref.]member_name [(dataset_options)]
and defaults to _LAST_. libref is the libref assigned to a SAS
library by the LIBNAME statement or the function libname
(p. 98). Use any dataset_options except OBS= and FIRSTOBS=
to control how the dataset is read.
open_mode is the type of access to the dataset. Omit
open_mode to use random-access mode ('i' mode), or use
one of the values listed in the following table.
open_mode Mode
i Opens the dataset in random-access input mode (the
default); values can be read but not modified (if the
engine doesn’t support random access, open defaults
to in mode automatically and logs a warning)
in Opens the dataset in input mode; observations are
read sequentially and can be revisited
is Opens the dataset in input mode; observations are
read sequentially but can’t be revisited
You should close (p. 43) an open dataset when it’s no longer
needed. If you open a dataset within a DATA step, SAS closes
it automatically when the step ends.
Example: open('mydata.dsn','i') → 1.
See also: attrc (p. 13), attrn (p. 15), dopen (p. 62), fopen
(p. 75), mopen (p. 104).
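A typical open/close sequence might look like the sketch below. It assumes that the sample dataset sashelp.class is available (substitute your own dataset) and uses attrn only to show that the returned identifier is passed on to other dataset functions.

```sas
data _null_;
  /* sashelp.class is an assumption; any existing dataset will do */
  dsid = open('sashelp.class', 'i');
  if dsid = 0 then do;
    msg = sysmsg();                 /* text of the last error or warning */
    put 'Open failed: ' msg;
  end;
  else do;
    nobs = attrn(dsid, 'nobs');     /* number of observations */
    put dsid= nobs=;
    rc = close(dsid);               /* zero if the close succeeds */
  end;
run;
```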
ordinal(n,x1,x2 [,x3,...])
Sorts values in ascending order and returns the n-th value in
the sorted list. Missing values are sorted as the lowest values.
If n is missing, a noninteger, less than one, or greater than the
number of values, ordinal returns a missing value. ordinal
differs from smallest (p. 147) in that smallest ignores miss-
ing values whereas ordinal counts them.
Examples:
x1=-3.5; x2=-4; x3=5;
ordinal(5,8,6,of x1-x3,6,.) → 6.
ordinal(6,8,6,of x1-x3,6,.) → 6.
ordinal(7,8,6,of x1-x3,6,.) → 8.
See also: largest (p. 96), max (p. 101), min (p. 101), pctl
(p. 113).
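The difference between ordinal and smallest shows up as soon as an argument is missing. A minimal sketch with hypothetical values:

```sas
data _null_;
  o1 = ordinal(1, ., 3, 5);     /* missing: the missing value sorts lowest */
  s1 = smallest(1, ., 3, 5);    /* 3: the missing value is ignored */
  o2 = ordinal(2, ., 3, 5);     /* 3 */
  put o1= s1= o2=;
run;
```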
See also: fexist (p. 68), fileexist (p. 69), fileref (p. 71).
pctl[def](percentage,x1 [,x2,...]) 9+
Returns a percentile of nonmissing arguments. If all the argu-
ments are missing values, the result is a missing value.
def is a digit from 1 to 5 (default 5) that specifies the definition
of the percentile to compute. pctl() and pctl5() are equiva-
lent function calls. In general, differences among percentile
definitions are pronounced only for small samples or samples
that contain many ties. For information about how pctl cal-
culates percentiles see PROC UNIVARIATE in Base SAS Pro-
cedures Guide.
percentage (0 ≤ percentage ≤ 100) is the percentile to compute.
Examples:
pctl(25,2,4,1,3) → 1.5 (lower quartile).
See also: largest (p. 96), max (p. 101), min (p. 101), ordinal
(p. 112), smallest (p. 147).
pdf(dist,quantile [,param1,param2,...])
Returns the probability density (mass) function. See cdf
(p. 40) for a description of the parameters. Alias: pmf. Example:
pdf('normal',1.96) → 0.058441. See also: logpdf (p. 100),
quantile (p. 127), sdf (p. 146).
peek(addr [,len])
Same as peeklong (p. 115) but for 32-bit systems only. See
also: addr (p. 10), call poke (p. 22), peekc (p. 114).
peekc(addr [,len])
Same as peekclong (p. 114) but for 32-bit systems only. See
also: addr (p. 10), call poke (p. 22), peek (p. 114).
peekclong(addr [,len])
Returns the character contents stored at an address in memo-
ry on 32-bit or 64-bit systems. addr is the memory address.
len (1 ≤ len ≤ 32767, default 8) is the length of the result.
Example:
Prints contents=‘ab’ (32-bit system)
x='abcde';
addr=addrlong(x);
contents=peekclong(addr,2);
put contents=;
See also: addrlong (p. 10), call pokelong (p. 22), peeklong
(p. 115), ptrlongadd (p. 125).
peeklong(addr [,len])
Returns the numeric contents stored at an address in memory
on 32-bit or 64-bit systems. addr is the memory address. len
(32-bit: 1 ≤ len ≤ 4, default 4. 64-bit: 1 ≤ len ≤ 8, default 8) is
the length of the result.
Example:
Prints contents=1 (32-bit system)
length x $4;
x=put(1,ib4.);
addr=addrlong(x);
contents=peeklong(addr,4);
put contents=;
See also: addrlong (p. 10), call pokelong (p. 22), peekclong
(p. 114).
perm(n [,r])
Returns the number of permutations of n elements taken r at
a time (0 ≤ r ≤ n). If r is omitted, perm(n) returns the factorial
of n. A permutation is any set or subset of items where order
is significant. Permutations are distinct from combinations,
for which order doesn’t matter. The number of permutations
is n!/(n–r)! where n and r are integers and the symbol !
denotes the factorial. perm(n,r) is the same as fact(n)/
fact(n-r). See also: comb (p. 45), fact (p. 66).
Examples:
perm(8,0) → 1.
perm(8,1) → 8.
perm(8,2) → 56.
perm(8,6) → 20160.
perm(8,8) → 40320.
perm(8) → 40320.
perm(int(8.5),4) → 1680.
See also: call allperm (p. 18), call ranperk (p. 28), call
ranperm (p. 28).
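The relationship between perm, fact, and comb can be checked directly. This sketch uses n=8 and r=2 from the examples above; the variable names are hypothetical.

```sas
data _null_;
  n = 8;
  r = 2;
  p1 = perm(n, r);               /* 56 */
  p2 = fact(n) / fact(n - r);    /* 56: n!/(n-r)! */
  p3 = comb(n, r) * fact(r);     /* 56: each combination has r! orderings */
  put p1= p2= p3=;
run;
```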
point(dataset_id,note_id)
Moves to the observation marked previously by note (p. 110),
returning zero (0) if successful; nonzero otherwise. dataset_id
is the dataset identifier that open (p. 111) returns. note_id is
the note identifier that note returns. point prepares the pro-
gram to read from the dataset. The Dataset Data Vector
(DDV) isn’t updated until a read is done by using fetch (p. 67)
or fetchobs (p. 68). See also: dropnote (p. 62), fpoint (p. 77),
rewind (p. 133).
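The note/point sequence might be used as in the following sketch, which marks observation 3, moves elsewhere, and then returns. It assumes that sashelp.class is available and has a character variable called name; both are assumptions made for illustration.

```sas
data _null_;
  length name $ 8;
  dsid = open('sashelp.class', 'i');   /* assumed sample dataset */
  if dsid = 0 then stop;

  namepos = varnum(dsid, 'name');      /* position of the assumed variable */
  rc = fetchobs(dsid, 3);              /* read observation 3 */
  note_id = note(dsid);                /* mark it */
  rc = fetchobs(dsid, 10);             /* move somewhere else */

  rc = point(dsid, note_id);           /* reposition; the DDV isn't updated yet */
  rc = fetch(dsid);                    /* the read updates the DDV */
  name = getvarc(dsid, namepos);
  put name=;

  rc = dropnote(dsid, note_id);        /* delete the note ID */
  rc = close(dsid);
run;
```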
poisson(lambda,n)
Returns the probability that an observation from a Poisson
distribution with mean lambda (≥ 0) is less than or equal to n
(≥ 0). The probability that an observation is equal to a given
value of n is the difference of two probabilities from the Pois-
son distribution for n and n–1. Example: poisson(2.5,2) →
0.5438131159. See also: cdf (p. 40).
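The point probability described above can be sketched as a difference of two cumulative values and compared with pdf. The parameter values are the ones from the example; the variable names are hypothetical.

```sas
data _null_;
  lambda = 2.5;
  n = 2;
  p_le  = poisson(lambda, n);                          /* P(X <= 2) */
  p_eq  = poisson(lambda, n) - poisson(lambda, n - 1); /* P(X = 2) */
  p_pdf = pdf('poisson', n, lambda);                   /* should agree with p_eq */
  put p_le= p_eq= p_pdf=;
run;
```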
probbeta(x,a,b)
Returns the probability that an observation from a beta distri-
bution with shape parameters a (> 0) and b (> 0) is less than
or equal to x (0 ≤ x ≤ 1). probbeta is the inverse of betainv
(p. 17). Example: probbeta(0.2,2,1) → 0.04. See also: beta
(p. 17), cdf (p. 40).
probbnml(prob_success,num_trials,num_successes)
Returns the probability that an observation from a binomial
distribution with probability of success prob_success (0 ≤
prob_success ≤ 1) and number of trials num_trials (≥ 1) is less
than or equal to the number of successes num_successes (0 ≤
num_successes ≤ num_trials). The probability that an observa-
tion equals a given value of num_successes is the difference of
two probabilities from the binomial distribution for
probbnrm(x,y,corr)
Returns the probability that an observation (X,Y) from a stan-
dard bivariate normal distribution with mean 0, variance 1,
and correlation coefficient corr (-1 ≤ corr ≤ 1), is less than or
equal to (x, y) (that is, the probability that X ≤ x and Y ≤ y).
The probability is
$$\frac{1}{2\pi\sqrt{1-r^2}}\int_{-\infty}^{x}\int_{-\infty}^{y}\exp\left(-\frac{u^2-2ruv+v^2}{2(1-r^2)}\right)\,dv\,du$$
where r is corr.
probf(x,ndf,ddf [,nc])
Returns the probability that an observation from an F distri-
bution with numerator degrees of freedom ndf (> 0, noninte-
ger allowed), denominator degrees of freedom ddf (> 0,
noninteger allowed), and noncentrality parameter nc (≥ 0) is
less than or equal to x (≥ 0). If nc is omitted or is zero, probf
uses the central F distribution. probf is the inverse of finv
(p. 73). Example: probf(3.5,4,5) → 0.899032601. See also:
cdf (p. 40), fnonct (p. 75).
probgam(x,a)
Returns the probability that an observation from a gamma
distribution with shape parameter a (> 0) is less than or equal
to x (≥ 0). probgam is the inverse of gaminv (p. 79). Example:
probit(prob)
Returns the prob-th quantile from the standard normal distri-
bution. The probability that an observation from the standard
normal distribution is less than or equal to the returned quantile
is prob (0 < prob < 1). Extreme results may be truncated to fall
between -8.222 and 7.941. probit is the inverse of probnorm
(p. 119). Example: probit(0.025) → -1.95996398. See also: cdf
(p. 40).
probmc(dist,quantile,prob,df,num_treatments
[,sample_size_params])
Returns the quantile or the probability from various distribu-
tions for multiple comparisons of means.
dist is one of the values listed in the following table. quantile is
the quantile of dist. prob is the left probability of dist.
dist Distribution
dunnett1 One-sided Dunnett
dunnett2 Two-sided Dunnett
maxmod Maximum modulus
range Studentized range
williams Williams
probnorm(x)
Returns the probability that an observation from the standard
normal distribution is less than or equal to x. probnorm is the
inverse of probit (p. 118). Example: probnorm(1.96) →
≈0.975. See also: cdf (p. 40).
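Because probnorm and probit are inverses, a round trip should return the starting value up to floating-point error. A minimal sketch with hypothetical variable names:

```sas
data _null_;
  p = probnorm(1.96);    /* about 0.975 */
  x = probit(p);         /* about 1.96 again */
  put p= x=;
run;
```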
probt(x,df [,nc])
Returns the probability that an observation from a Student’s t
distribution with degrees of freedom df (> 0, noninteger
allowed) and noncentrality parameter nc is less than or equal
to x. If nc is omitted or is zero, probt uses the central t distri-
bution. probt is the inverse of tinv (p. 156). The significance
level of a two-tailed t test is (1-probt(abs(x),df))*2. Example:
probt(0.95,10) → 0.817746627. See also: cdf (p. 40),
tnonct (p. 156).
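The two-tailed significance formula can be sketched as follows. The test statistic and degrees of freedom are hypothetical; 2.228 is chosen because it is the familiar 0.975 quantile for 10 degrees of freedom, so the result should be close to 0.05.

```sas
data _null_;
  t  = 2.228;                             /* hypothetical test statistic */
  df = 10;
  p_two = (1 - probt(abs(t), df)) * 2;    /* two-tailed significance level */
  put p_two=;
run;
```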
pos=prxmatch(prx_id,str1);
if pos then paren1=prxparen(prx_id);
pos=prxmatch(prx_id,str2);
if pos then paren2=prxparen(prx_id);
pos=prxmatch(prx_id,str3);
if pos then paren3=prxparen(prx_id);
prxparse(perl_regex) 9+
Parses and compiles a Perl regular expression (pattern) and
returns a unique numeric pattern identifier that other PRX
functions can use. prxparse returns a missing value if a pars-
ing error occurs. The corresponding function for SAS regular
expressions is rxparse (p. 135).
perl_regex is a Perl regular expression. If perl_regex is a con-
stant or if it uses the /o option, perl_regex is compiled once.
Successive prxparse calls don’t recompile but return the pat-
tern ID from the preceding compile. This behavior means
that you don’t need to use an initialization block (if _N_=1) to
initialize Perl regular expressions. This compile-once behavior
occurs only in a DATA step; for all other uses, perl_regex is
recompiled on each prxparse call. You can use metacharac-
Metacharacter Means
\ Mark the next character as either a special charac-
ter, a literal, a back reference, or an octal escape.
"n" matches the character "n"; "\n" matches a new-
line character; "\\" matches "\"; "\(" matches "(".
| Use an OR condition when comparing alphanu-
meric strings.
^ Match the position at the beginning of the string.
$ Match the position at the end of the string.
* Match the preceding subexpression zero or more
times (equivalent to {0,}). "zo*" matches "z" and
"zoo".
+ Match the preceding subexpression one or more
times (equivalent to {1,}). "zo+" matches "zo" and
"zoo" but not "z".
? Match the preceding subexpression zero or one
time (equivalent to {0,1}). "do(es)?" matches the
"do" in "do" or "does".
{n} n is a non-negative integer that matches exactly n
times. "o{2}" matches the two o’s in "food" but not
the "o" in "Bob".
{n,} n is a non-negative integer that matches n or more
times (o{1,} is equivalent to "o+"; o{0,} is equiva-
lent to "o*"). "o{2,}" matches all the o’s in
"foooood" but not the "o" in "Bob".
{n,m} m and n (n ≤ m) are non-negative integers match-
ing at least n but at most m times (o{0,1} is equiva-
lent to "o?"). "o{1,3}" matches the first three o’s in
"fooooood". Don’t put a space on either side of the
comma.
dot (.) Match any single character except a newline. To
match any character, including a newline, use a
pattern like "(.|\n)".
(pattern) Match a pattern and create a capture buffer for the
match. To retrieve the position and length of the
captured match, use call prxposn (p. 25). To
retrieve the value of the capture buffer, use
prxposn (p. 124). To match a parenthesis charac-
ter, use "\(" or "\)".
x|y Match either x or y. "z|food" matches "z" or "food";
"(z|f)ood" matches "zood" or "food".
[xyz] A character set that matches any one of the
enclosed characters. "[abc]" matches the "a" in
"plain".
[^xyz] A character set that matches any character that
isn’t enclosed. "[^abc]" matches the "p" in "plain".
[a-z] A range of characters that matches any character
in the range. "[a-z]" matches any lowercase alpha-
betic character in the range "a" through "z".
[^a-z] A range of characters that matches any character
that isn’t in the range. "[^a-z]" matches any char-
acter that isn’t in the range lowercase "a" through
"z".
\b Match a word boundary (the position between a
word and a space). "er\b" matches the "er" in
"never" but not the "er" in "verb".
\B Match a non-word boundary. "er\B" matches the
"er" in "verb" but not the "er" in "never".
\d Match a digit character (equivalent to [0-9]).
\D Match a nondigit character (equivalent to [^0-9]).
\s Match a whitespace character: space, tab, form
feed, and so on (equivalent to [ \f\n\r\t\v]).
\S Match a nonwhitespace character (equivalent to
[^ \f\n\r\t\v]).
\t Match a tab character (equivalent to "\x09").
\w Match a word character including the underscore
(equivalent to [A-Za-z0-9_]).
\W Match a non-word character (equivalent to
[^A-Za-z0-9_]).
\n Match n, where n is a positive integer that refers
back to captured matches. "(.)\1" matches two
consecutive identical characters.
Example:
Validate a list of telephone numbers that have the form
(xxx) xxx-xxxx or xxx-xxx-xxxx
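data _null_;
  /* Compile the pattern once, on the first iteration */
  if _N_=1 then do;
    paren="\([2-9]\d\d\) ?[2-9]\d\d-\d\d\d\d";
    dash="[2-9]\d\d-[2-9]\d\d-\d\d\d\d";
    regexp="/(" ||paren|| ")|(" ||dash|| ")/";
    retain re;
    re=prxparse(regexp);
    if missing(re) then do;
      put "Bad regexp: " regexp;
      stop;
    end;
  end;
  length first last home business $ 16;
  input first last home business;
  /* Report any number that matches neither form */
  if ^prxmatch(re,home) then
    put "Bad home phone: " first last home;
  if ^prxmatch(re,business) then
    put "Bad bus phone: " first last business;
  /* Hypothetical sample rows; replace with real data */
  datalines;
Ann Lee (919)555-0100 919-555-0101
Bob Ray 555-0100 (019)555-0101
;
run;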
See also: call prxchange (p. 22), call prxdebug (p. 23), call
prxfree (p. 23), call prxnext (p. 24), call prxposn (p. 25),
call prxsubstr (p. 26), prxchange (p. 120), prxmatch
(p. 120), prxparen (p. 121), prxposn (p. 124).
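Because a constant pattern is compiled only once in a DATA step, the initialization block can be dropped when the pattern is a literal. The sketch below relies on that behavior and also retrieves two capture buffers with prxposn. The pattern, input rows, and variable names are hypothetical.

```sas
data _null_;
  length str $ 40 exch line $ 4;
  input str $char40.;
  /* Constant pattern: compiled once, no if _N_=1 block needed */
  re = prxparse('/(\d{3})-(\d{4})/');
  if prxmatch(re, str) then do;
    exch = prxposn(re, 1, str);    /* text of capture buffer 1 */
    line = prxposn(re, 2, str);    /* text of capture buffer 2 */
    put str= exch= line=;
  end;
  else put 'No match: ' str;
  datalines;
call 555-1234 now
no number here
;
run;
```

For the first row the log should show exch=555 and line=1234; the second row should be reported as a nonmatch.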
prxposn(prx_id,buffer,str) 9+
Returns the value for a capture buffer. A capture buffer is part
of a match, enclosed in parentheses, in a regular expression.
prxposn returns the capture-buffer text directly, without the
need for substr (p. 151). To return a capture buffer, use the
put('1234',F4.2) → ‘12.34’.
See also: input (p. 86), inputc (p. 86), inputn (p. 87), putc
(p. 126), putn (p. 126), PUT statement.
quantile('chisq',0.5,10) → 9.341817765.
quantile('expo',0.95) → 2.995732273.
quantile('normal',0.95) → 1.64485362.
quantile('normal',0.975) → 1.959963984.
quantile('poisson',0.9,2) → 4.
quantile('t',0.95,20) → ≈1.7247.
quantile('uniform',0.5,5,10) → 7.5.
str quote()
'x"y' "x""y"
'x''y' "x'y"
'Don''t' "Don't"
Example:
Generate a Cauchy variate x with location parameter alpha
and scale parameter beta
x=alpha+beta*rancau(seed);
rand(dist [,param1,param2,...])
Generates a random number from the continuous and dis-
crete distributions listed in the following table. The location,
scale, shape, degrees-of-freedom, and other params vary by
distribution. You can truncate dist to its first four characters.
Use call streaminit (p. 35) to set the seed for rand.
Normal rand('normal'[,mean,stdev])
Poisson rand('poisson',lambda)
t rand('t',df )
Tabled rand('table',prob1,prob2,...)
Triangular rand('triangle',height)
Uniform rand('uniform')
Weibull rand('weibull',a,b)
Examples:
rand('bern',0.5) → 0.
rand('normal') → -1.24987951.
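To get a repeatable stream from rand, the seed is set once with call streaminit before the first rand call. A minimal sketch; the seed value and the Poisson mean are arbitrary choices made for illustration.

```sas
data _null_;
  call streaminit(12345);        /* arbitrary seed, set before the first rand call */
  do i = 1 to 3;
    z = rand('normal');          /* standard normal */
    u = rand('uniform');         /* uniform on (0,1) */
    k = rand('poisson', 4);      /* Poisson with mean 4 */
    put i= z= u= k=;
  end;
run;
```

Rerunning the step with the same seed should reproduce the same values.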
ranexp(seed)
Returns a random variate from an exponential distribution
with parameter 1. seed is an integer (< 2^31-1). If seed ≤ 0, the
time of day is used to initialize the seed stream. To change
seed during execution, use call ranexp (p. 27) instead of
ranexp.
Examples:
Generate an exponential variate x with parameter lambda
x=ranexp(seed)/lambda;
rangam(seed,a)
Returns a random variate from a gamma distribution with
shape parameter a (> 0). seed is an integer (< 2^31-1). If seed ≤
0, the time of day is used to initialize the seed stream. To
change seed during execution, use call rangam (p. 27) instead
of rangam.
Examples:
Generate a gamma variate x with shape parameter alpha and
scale beta
x=beta*rangam(seed,alpha);
range(x1 [,x2,...])
Returns the range of nonmissing arguments. The range is the
absolute difference between the largest and the smallest val-
ues. If all the arguments are missing values, the result is a
missing value.
Examples:
range(-2,1,6) → 8.
range(-2,1,-6) → 7.
range(10) → 0.
range(-10,-10) → 0.
Examples:
Generate a normal variate x with mean m2 and variance s2
x=m2+sqrt(s2)*rannor(seed);
ranpoi(seed,lambda)
Returns a random variate from a Poisson distribution with
mean lambda (≥ 0). seed is an integer (< 2^31-1). If seed ≤ 0, the
time of day is used to initialize the seed stream. To change
seed during execution, use call ranpoi (p. 28) instead of
ranpoi.
rantbl(seed,prob1,...,probn)
Returns a random variate (positive integer) from the proba-
bility mass function defined by prob1 through probn.
rantbl returns 1 with probability prob1, 2 with prob2,..., n
with probn, where 0 ≤ probi ≤ 1 for i = 1, 2,..., n. If the probs
sum to less than one, rantbl can return n+1. If the probs
sum to greater than one, rantbl ignores the extra probs.
seed is an integer (< 2^31-1). If seed ≤ 0, the time of day is used
to initialize the seed stream. To change seed during execution,
use call rantbl (p. 29) instead of rantbl.
Example:
Assign to x one of the values m1, m2, or m3 with probabilities
of occurrence p1, p2, and p3, respectively (p1+p2+p3=1)
array m{3} m1-m3;
x=m{rantbl(seed,of p1-p3)};
rantri(seed,height)
Returns a random variate from a triangular distribution on
the interval (0,1) with mode height (0 < height < 1). seed is an
integer (< 2^31-1). If seed ≤ 0, the time of day is used to initialize
the seed stream. To change seed during execution, use
call rantri (p. 29) instead of rantri.
Example:
Generate a triangular variate x on [a,b] with mode c (a ≤ c ≤ b)
x=(b-a)*rantri(seed,(c-a)/(b-a))+a;
ranuni(seed)
Returns a random variate from a uniform distribution on the
interval (0,1). seed is an integer (< 2^31-1). If seed ≤ 0, the time
of day is used to initialize the seed stream. To change seed
during execution, use call ranuni (p. 29) instead of ranuni.
ranuni is the same as uniform (p. 158).
Example:
Generate a uniform variate x on the interval (b,a+b)
x=a*ranuni(seed)+b;
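The scaling formula can be tried with concrete numbers. In this sketch a=10 and b=5, so every x should fall between 5 and 15; the positive seed is an arbitrary choice that makes the stream repeatable.

```sas
data _null_;
  seed = 2718;                    /* arbitrary positive seed */
  a = 10;
  b = 5;
  do i = 1 to 5;
    x = a*ranuni(seed) + b;       /* uniform on (5,15) */
    put i= x=;
  end;
run;
```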
repeat(str,n)
Returns a string repeated n+1 times (n ≥ 0). The result’s
default length is 200. Example: repeat('Go',2) → ‘GoGoGo’.
resolve(macro_expr)
Resolves a macro expression and returns it. The result’s
default length is 200. For details see Resolve in SAS Macro
Language: Reference. See also: symget (p. 153).
reverse(str)
Reverses the order of characters in a string. The result’s
default length is the length of str. Example: reverse('Never
odd or even') → ‘neve ro ddo reveN’.
right(str)
Right-aligns a string by moving trailing spaces to the start. The
length of str doesn’t change. The result’s default length is the
length of str. Examples: See left (p. 97). See also: strip (p. 150),
trim (p. 158), trimn (p. 158).
rms(x1 [,x2,...]) 9+
Returns the root mean square of nonmissing arguments. If all
the arguments are missing values, the result is a missing value.
The root mean square is defined by
$$\sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}$$
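As a quick check of the definition, rms can be compared with the same quantity built from uss (uncorrected sum of squares) and n. The argument values are hypothetical; the missing value is there to show that it is ignored.

```sas
data _null_;
  r1 = rms(3, 4, .);                        /* missing value is ignored */
  r2 = sqrt(uss(3, 4, .) / n(3, 4, .));     /* sqrt((9+16)/2), about 3.5355 */
  put r1= r2=;
run;
```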
round(x [,unit]) 9+
Rounds x to the nearest multiple of unit (> 0) or to the nearest
integer if unit is omitted. When unit < 1 and not the reciprocal
of an integer, round—unlike roundz (p. 134)—fuzzes to try to
make the result agree with decimal arithmetic.
Examples:
round(18,4) → 20.
round(386.667) → 387.
round(386.667,5) → 385.
round(386.667,10) → 390.
round(386.667,100) → 400.
round(386.667,0.1) → 386.7.
round(386.667,0.111) → 386.613.
round(386.667,0.25) → 386.75.
round(386.667,10.5) → 388.5.
rxmatch(rx_id,str)
Returns the position at which a pattern-match is found in a
string, or zero (0) if it’s not found. rx_id is the pattern identifier
that rxparse (p. 135) returns. Example: See rxparse. See also:
call rxchange (p. 30), call rxfree (p. 30), call rxsubstr
(p. 30).
Element Means
"string" Match a substring consisting of the charac-
ters in string.
letter Match the uppercase or lowercase letter in a
substring.
digit Match the digit in a substring.
dot (.) Match a dot (.) in a substring.
underscore (_) Match an underscore (_) in a substring.
? Match any one character in a substring.
colon (:) Match any sequence of zero or more charac-
ters in a substring.
$'pattern' or $"pattern"
Match any one character in a substring. Use a hyphen (-) to
specify a range of alphanumeric variables (rxparse("$'a-z'")
matches any lowercase letter, for example). See “Character
Classes” on page 137.
~'char_class' or ^'char_class' or ~"char_class" or ^"char_class"
Match any one character that is not matched by the
corresponding character class. Use a hyphen (-) to specify a
range of alphanumeric variables (rxparse("^'a-d'") excludes
the letters a–d from a match, for example). See “Character
Class Complements” on page 138.
pattern1 pattern2 or pattern1||pattern2
Select any substring that pattern1 matches followed
immediately by any substring that pattern2 matches (with no
intervening spaces).
pattern1|pattern2 Select any substring that pattern1 matches or
any substring pattern2 matches. You can use
an exclamation point (!) instead of a vertical
bar (|).
(pattern) Match a substring that contains pattern. You
can use parentheses to force the order of
operations.
[pattern] or {pattern} Match a substring that contains pattern or
an empty string.
pattern* Match zero or more consecutive strings that
pattern matches.
pattern+ Match one or more consecutive strings that
pattern matches.
@pos Match the position of a variable if the next
character is located in the column specified
by the integer pos. @0 matches end-of-line.
If pos is negative, it matches –pos positions
from end-of-line.
reuse_char_class Reuse a character class that you defined pre-
viously. See “Reusing Character Classes” on
page 139.
pattern_abbrev Specify ways to shorten pattern representa-
tion. See “Pattern Abbreviations” on
page 140 and “Default Character Classes” on
page 137.
balanced_symbols Specify the number of nested parentheses,
brackets, braces, or less-than/greater-than
symbols in a mathematical expression. See
“Matching Balanced Symbols” on page 140.
special_symbol Specify a position in a string, or a score val-
ue. See “Special Symbols” on page 141.
score_value Select the pattern with the highest score val-
ue. See “Scores” on page 142.
<pattern> Retrieve a matched substring for use in a
change expression. See “Tag Expression” on
page 142.
Element Means
change_expr Specify a pattern change operation that
replaces a string containing a matched sub-
string by concatenating values to the
replacement string. See “Change Expres-
sions” on page 142.
change_item Specify items used for string manipulation.
See “Change Items” on page 144.
Character Classes
Using a character-class element is a shorthand way to specify
a range of values for matching. You can use default classes
(p. 137), define your own classes (p. 138), use class comple-
ments (p. 138), or reuse classes (p. 139).
Default Character Classes
Specify a default character class with a dollar sign ($) followed
by a single uppercase or lowercase letter, as listed in the fol-
lowing table. A hyphen at the beginning or end of a character
class is treated as a member of the class, not as a range symbol.
See also “Character Class Complements” on page 138.
Example:
Prints pos1=8 match1=x pos2=1 match2=A
data _null_;
str='ABC123 xyz';
rx_id1=rxparse("$L");
rx_id2=rxparse("$U");
pos1=rxmatch(rx_id1,str);
pos2=rxmatch(rx_id2,str);
match1=substr(str,pos1,1);
match2=substr(str,pos2,1);
put pos1= match1= pos2= match2=;
call rxfree(rx_id1);
call rxfree(rx_id2);
run;
is equivalent to
rxparse("$'AB' $'AB' $'XYZ' $'XYZ' $'AB'").
$1, $2, and $-2 are replaced by AB, XYZ, and AB, respectively.
~n or ^n reuses the complement of the n-th character class,
where n is a nonzero integer. For example,
rxparse($'Al' $1 $'Jo' $2 $'Li' $3 ~2)
is equivalent to
rxparse($'Al' $'Al' $'Jo' $'Jo'
$'Li' $'Li' $'Al' $'Li').
Pattern Abbreviations
You can use the elements listed in the following table in your
pattern.
Pattern Matches a
$f or $F Floating-point number
$n or $N SAS name
$p or $P Prefix
$q or $Q Quoted string
$s or $S Suffix
Example:
Prints pos1=1 len1=4 pos2=20 len2=4
data _null_;
str='woodchucks eat firewood';
rx_id1=rxparse("$p 'wood'");
rx_id2=rxparse("'wood' $s");
pos1=rxmatch(rx_id1,str); /* 'wood' in woodchucks */
pos2=rxmatch(rx_id2,str); /* 'wood' in firewood */
call rxsubstr(rx_id1,str,pos1,len1);
call rxsubstr(rx_id2,str,pos2,len2);
put pos1= len1= pos2= len2=;
call rxfree(rx_id1);
call rxfree(rx_id2);
run;
Special Symbols
You can use the special symbols listed in the following table in
the pattern.
Symbol Means
\ Set the beginning of a match to the current position.
/ Set the end of a match to the current position. If you
use a backslash (\) in one alternative of a union (|), you
must use a forward slash (/) in all alternatives of the
union, or in a position preceding or following the
union.
$# Request the match with the highest score, regardless of
the starting position. The position of this symbol with-
in the pattern isn’t significant.
$- Scan a string from right to left. The position of this
symbol within the pattern isn’t significant. Don’t con-
fuse a hyphen (-) used to scan a string with a hyphen
used in arithmetic operations.
$@ Require the match to begin where the scan of the text
begins. The position of this symbol within the pattern
isn’t significant.
Example:
Prints pos=6 match=ow
data _null_;
str='How now brown cow?';
rx_id=rxparse("@3:\ow");
pos=rxmatch(rx_id,str); /* Matches 'ow' in now */
match=substr(str,pos,2);
put pos= match=;
call rxfree(rx_id);
run;
Scores
When a pattern is matched by more than one substring begin-
ning at a specific position, the longest substring is selected. To
change this selection criterion, assign a score value to each
substring by using the # symbol followed by an integer.
The score for any substring begins at zero. When #n appears
in the pattern, n is added to the score. If two or more match-
ing substrings begin at the same leftmost position, SAS selects
the one with the highest score; if both have the same score,
SAS selects the longer one. The following table lists score rep-
resentations.
Item Means
#n Add n to the score
#*n Multiply the score by nonnegative n
#/n Divide the score by positive n
#=n Assign the value of n to the score
#>n Find a match if the current score exceeds n
Tag Expression
To assign a substring of the searched string to a character var-
iable, use the expression name=<pattern>. The substring that
matches this expression is assigned to the variable name.
If you specify <pattern> (without the name=), SAS automati-
cally assigns the first occurrence of the pattern to the variable
_1, the second occurrence to _2, and so on. This assignment is
called tagging. SAS tags the corresponding substring of the
matched string.
Change Expressions
If you find a substring that matches a pattern, you can change
that substring by specifying, in the rxparse argument, the
pattern expression, the TO keyword, and the change expres-
Example:
Search for a semicolon (;) and replace it with a space
data mydata (drop=rx_id);
set mydata;
rx_id=rxparse("$';' to ' '");
call rxchange(rx_id,999,old_str);
run;
Change Items
You can use the items listed in the following table to manipu-
late the replacement string. Each item positions the cursor
without affecting the replacement string.
Item Means
@n Move the pointer to column n where the next string added
to the replacement string will start.
@= Move the pointer one column past the end of the matched
substring.
>n Move the pointer to the right to column n. If the pointer is
already to the right of column n, the pointer isn’t moved.
>= Move the pointer to the right, one column past the end of
the matched substring.
<n Move the pointer to the left to column n. If the pointer is
already to the left of column n, the pointer isn’t moved.
<= Move the pointer to the left, one column past the end of
the matched substring.
+n Move the pointer n columns to the right.
-n Move the pointer n columns to the left.
-L Left-align the result of the previous item or expression in
parentheses.
-R Right-align the result of the previous item or expression in
parentheses.
-C Center the result of the previous item or expression in
parentheses.
*n Repeat the result of the previous item or expression in
parentheses n–1 times, producing a total of n copies.
See also: call rxchange (p. 30), call rxfree (p. 30), call
rxsubstr (p. 30), rxmatch (p. 134).
saving(fut_amt,pmt,int_rate,num_periods)
Returns a periodic-savings value. fut_amt (≥ 0) is the future
amount at the end of num_periods. pmt (≥ 0) is the fixed peri-
odic payment. int_rate (≥ 0) is the periodic interest rate,
scan(str,n [,delims])
Returns the n-th word in a string. If n < 0, scan counts words
right to left. If |n| > the number of words in str, scan returns
an empty string. delims specifies character(s) that separate
words. The default ASCII delimiters are: space . < ( + & ! $ * ) ;
^ - / , % |. The default EBCDIC delimiters are: space . < ( + | &
! $ * ) ; ¬ - / , % | ¢. Contiguous delimiters are treated as one.
Leading and trailing delimiters are ignored. The result’s
default length is 200.
Examples:
scan('123 abc xyz',2) → ‘abc’.
scan('123.abc(xyz)',3) → ‘xyz’.
See also: call scan (p. 30), call scanq (p. 31), scanq (p. 145).
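The word-counting rules above can be sketched in a short DATA step. This is a minimal illustration, not part of the official reference: the variable names and the sample string are made up, and the delimiter list ',;' is supplied explicitly rather than relying on the defaults.

```sas
data _null_;
  length word $ 20;
  str='alpha,beta;;gamma';
  /* Walk the words left to right; scan returns an empty
     string once n exceeds the number of words. */
  n=1;
  word=scan(str,n,',;');
  do while(word ne ' ');
    put n= word=;
    n+1;
    word=scan(str,n,',;');
  end;
  /* A negative n counts words from the right. */
  last=scan(str,-1,',;');
  put last=;
run;
```

The two adjacent semicolons count as a single delimiter, so the loop sees three words and last is gamma.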
scanq(str,n [,delims]) 9+
Returns the n-th word in a string, ignoring quote-enclosed
delimiters. If n < 0, scanq counts words right to left. If n = 0
or |n| > the number of words in str, scanq returns an empty
string. Unmatched quotes in str make left-to-right and right-
to-left scans return different words. delims specifies charac-
ter(s) that separate words. The default delimiters are white-
space characters: space, horizontal and vertical tab, carriage
return, line feed, and form feed. You can’t use single or double
See also: call scan (p. 30), call scanq (p. 31), scan (p. 145).
sdf(dist,quantile [,param1,param2,...])
Returns the survival function. See cdf (p. 40) for a description
of the parameters. Example: sdf('normal',1.96) → ≈0.025.
See also: logsdf (p. 100), pdf (p. 114), quantile (p. 127).
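Because the survival function is the upper-tail probability, it should complement cdf for the same distribution and quantile. A minimal sketch, with illustrative variable names:

```sas
data _null_;
  q=1.96;
  upper=sdf('normal',q);   /* P(X > q)  */
  lower=cdf('normal',q);   /* P(X <= q) */
  total=upper+lower;       /* expected to be 1, up to rounding */
  put upper= lower= total=;
run;
```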
second(sas_time|sas_datetime)
Extracts the seconds (0–59.99...) from a SAS time or SAS
datetime.
Example: second(time()) → 49.4400 (the time is 10:19:49.44
PM).
See also: hour (p. 83), minute (p. 102).
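A minimal sketch that splits a SAS time constant into its parts with hour, minute, and second. The time literal matches the example above; the variable names are illustrative.

```sas
data _null_;
  t='22:19:49.44't;   /* SAS time constant */
  h=hour(t);          /* hour of the day */
  m=minute(t);        /* minute of the hour */
  s=second(t);        /* seconds, including the fractional part */
  put h= m= s=;
run;
```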
sign(x)
Determines the sign of x, returning 1 if x is positive, zero (0) if
x is 0, or -1 if x is negative. Examples: sign(-0.000006) → -1.
sign(10.2) → 1. sign(0/5) → 0. See also: abs (p. 10).
sin(angle)
Returns the sine of an angle. angle is expressed in radians. To
convert degrees to radians, multiply degrees by π/180. Note
that sin(-x) = -sin(x) and 1/sin(x) is the cosecant of x.
Examples:
sin(0) → 0.
sin(constant('pi')/2) → 1.
See also: arsin (p. 12), cos (p. 51), tan (p. 155).
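The degrees-to-radians conversion described above can be sketched as follows. The angles chosen and the variable names are illustrative only.

```sas
data _null_;
  pi=constant('pi');
  do degrees=0, 30, 90;
    radians=degrees*pi/180;   /* convert degrees to radians */
    sine=sin(radians);
    put degrees= sine= 8.4;   /* 8.4 format applies to sine */
  end;
run;
```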
sinh(x)
Returns the hyperbolic sine of x, defined by (e^x – e^(–x))/2.
Examples: sinh(1) → 1.1752. sinh(-1) → -1.1752. See also: cosh
(p. 51), tanh (p. 155).
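As a quick check of the definition, the following sketch compares sinh with the exponential formula computed by hand. The variable names are illustrative; the two values should agree up to floating-point rounding.

```sas
data _null_;
  x=1;
  builtin=sinh(x);
  by_hand=(exp(x)-exp(-x))/2;   /* (e^x - e^(-x))/2 */
  put builtin= by_hand=;
run;
```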
skewness(x1,x2,x3 [,x4,...])
Returns the skewness (asymmetry) of nonmissing arguments.
At least three nonmissing arguments are required or the result
is a missing value. Example: x1=-3; x2=0.5; x3=50; y1=.;
skewness(2,of x1-x3,6,y1) → 2.1170147309.
sleep(n [,unit])
Suspends program execution for a specified amount of time
(< 46 days), returning the time slept. n (≥ 0) is the number of
units of time. unit (defaults to 1.0 on Windows; 0.001 other-
wise) is the unit of time, as a power of 10, that’s applied to n. 1
is a second and 0.001 is a millisecond, for example.
Examples:
Suspend execution for one minute
time_slept=sleep(6000,0.01);
Operation Cost
match (no change) 0
singlet (delete one of a double letter) 25
doublet (double a letter) 50
swap (reverse the order of two consecutive letters) 50
truncate (delete a letter from the end) 50
append (add a letter to the end) 35
delete (delete a letter from the middle) 50
insert (insert a letter in the middle) 100
replace (replace a letter in the middle) 100
firstdel (delete the first letter) 100
firstins (insert a letter at the beginning) 200
firstrep (replace the first letter) 200
sqrt(x)
Returns the positive square root of x (≥ 0). Examples: sqrt(16)
→ 4. sqrt(constant('pi')) → 1.7724538509.
std(x1,x2 [,x3,...])
Returns the standard deviation of nonmissing arguments. At
least two nonmissing arguments are required or the result is a
missing value. Example: x1=3; x2=4; x3=5; y1=.; std(2,of
x1-x3,6,y1) → 1.5811388301. See also: var (p. 160).
stderr(x1,x2 [,x3,...])
Returns the standard error of the mean of nonmissing argu-
ments. At least two nonmissing arguments are required or the
stfips(postal_code)
Converts a two-character state or U.S. territory postal code or
GSA geographic code to its numeric FIPS (Federal Informa-
tion Processing Standards) code. postal_code is case-insensi-
tive with no leading spaces; trailing spaces are ignored. Examples:
The following table shows examples of stfips, stname
(p. 150), and stnamel (p. 150). See also: fipstate (p. 74),
zipfips (p. 165).
stname(postal_code)
Converts a two-character state or U.S. territory postal code or
GSA geographic code to its name in uppercase (≤ 20 charac-
ters). postal_code is case-insensitive and can have trailing, but
not leading, spaces. Examples: See stfips (p. 150). See also:
fipname (p. 73), stnamel (p. 150), zipname (p. 166).
stnamel(postal_code)
Converts a two-character state or U.S. territory postal code or
GSA geographic code to its name in mixed case (≤ 20 charac-
ters). postal_code is case-insensitive and can have trailing, but
not leading, spaces. Examples: See stfips (p. 150). See also:
fipnamel (p. 74), stname (p. 150), zipnamel (p. 166).
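A minimal sketch that runs the three postal-code functions side by side. The codes in the list are arbitrary picks, and the lowercase 'ca' is there to exercise the case-insensitivity noted above.

```sas
data _null_;
  length code $ 2;
  do code='NC','ca','PR';
    fips=stfips(code);      /* numeric FIPS code */
    upper=stname(code);     /* name in uppercase */
    mixed=stnamel(code);    /* name in mixed case */
    put code= fips= upper= mixed=;
  end;
run;
```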
strip(str) 9+
Removes leading and trailing spaces from a string, returning
an empty string (‘’) if str is blank. Assigning the result to a var-
iable doesn’t affect the variable’s length; if necessary, strip
pads the result with new trailing spaces to match the target
variable’s length. The result’s default length is the length of str.
strip is useful for removing trailing spaces during concate-
subpad('abcde',2,3) → ‘bcd’.
substrn('abcde',-2) → ‘abcde’.
substrn('abcde',2,3) → ‘bcd’.
substrn('abcde',2,99) → ‘bcde’.
symglobl(macro_var_name) 9+
Returns one (1) if a macro variable is in a global scope to the
calling DATA step; zero otherwise. For details see Symglobl in
SAS Macro Language: Reference. See also: symlocal (p. 153).
symlocal(macro_var_name) 9+
Returns one (1) if a macro variable is in a local scope to the
calling DATA step; zero otherwise. For details see Symlocal in
SAS Macro Language: Reference. See also: symglobl (p. 153).
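A minimal sketch of both functions, assuming the code runs in open code (outside any macro) so that the %let creates a global macro variable. The macro variable name region is hypothetical.

```sas
%let region=west;   /* hypothetical global macro variable */
data _null_;
  is_global=symglobl('region');
  is_local=symlocal('region');
  put is_global= is_local=;
run;
```

Run in open code, this should log is_global=1 is_local=0.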
sysget(env_var)
Returns the value of an OS environment variable. env_var is
the case-sensitive name of the environment variable. Trailing
spaces are significant; use trim (p. 158) to remove them. The
result’s default length is 200. sysget logs a warning if it trun-
cates the result, or returns a missing value if env_var is unde-
fined. Example: sysget('username') → ‘judy’ (Windows).
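The missing-value behavior can be handled as in the sketch below. PATH is used only because it is normally defined on both Windows and Unix; substitute whatever environment variable you need. The explicit length guards against truncation of long values.

```sas
data _null_;
  length path $ 2000;
  path=sysget('PATH');   /* assumption: PATH exists on this host */
  if missing(path) then put 'PATH is not defined';
  else put path=;
run;
```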
sysmsg()
Returns the text of error messages or warning messages that
are generated when a dataset or external-file access function
encounters an error condition. If no error message is avail-
%let id=&sysprocessid;
%let name=%sysfunc(sysprocessname(&id));
%put &name; (logs ‘DMS Process’)
sysprod(product_name)
Determines whether a SAS product is licensed, returning 1 if
product_name is a licensed SAS product, 0 if it’s unlicensed
(inaccessible), or -1 if it’s not a SAS product. A product is
licensed if its license expiration date hasn’t passed. A SAS
product can exist on your system but no longer be licensed.
product_name takes the case-insensitive values access, base
(always returns true), connect, ets, graph, share, stat, and
so on. You can prefix a product name with ‘SAS/’. Example:
sysprod('stat') → 1. See also: PROC SETINIT.
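A minimal sketch that checks several products in one pass. The product list is an arbitrary selection from the values named above, and the results depend on your site's license.

```sas
data _null_;
  length product $ 8;
  do product='base','stat','graph';
    /* 1 = licensed, 0 = unlicensed, -1 = not a SAS product */
    status=sysprod(product);
    put product= status=;
  end;
run;
```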
tanh(x)
Returns the hyperbolic tangent of x, defined by (e^x – e^(–x))/(e^x + e^(–x))
or sinh(x)/cosh(x). Examples: tanh(0) → 0. tanh(0.5) →
0.4621. See also: call tanh (p. 37), cosh (p. 51), sinh (p. 147).
time()
Returns the current time of day as a SAS time (the number of
seconds after midnight, 0 ≤ time() < 86400).
Example: time() → 73580.56 (the time is 20:26:21).
See also: datetime (p. 55), dhms (p. 59), hms (p. 82), mdy
(p. 101), today (p. 156).
timepart(sas_datetime)
Extracts the time, as a SAS time, from a SAS datetime.
Example: timepart(datetime()) → 80389.44 (now is 02-Sep-
2005 22:19:49).
See also: datepart (p. 55), datetime (p. 55).
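A minimal sketch that splits the current datetime into its date and time parts and prints each with a matching format. The variable names and format widths are illustrative choices.

```sas
data _null_;
  now=datetime();
  d=datepart(now);   /* SAS date: days since 1-Jan-1960 */
  t=timepart(now);   /* SAS time: seconds after midnight */
  put now= datetime22.2 d= date9. t= time11.2;
run;
```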
tnonct(2,4,probt(2,4,1.5)) → 1.499999999.
See also: cnonct (p. 44), fnonct (p. 75), probt (p. 119), tinv
(p. 156).
today()
Returns the current date as a SAS date (the number of days
since 1-Jan-1960). today is the same as date (p. 55).
Example: today() → 16681 (today is 2-Sep-2005).
See also: datetime (p. 55), dhms (p. 59), hms (p. 82), mdy
(p. 101), time (p. 155).
translate(str,to1,from1 [,to2,from2,...])
Replaces all occurrences of from characters in str with to char-
acters. to and from values correspond on a character-by-char-
acter basis; translate changes the first character of from to
the first character of to, and so on. If to has fewer characters
than from, translate changes the extra from characters to
spaces. If to has more characters than from, translate
ignores the extra to characters. The maximum number of to-
Example:
Transcode from Latin2 to uppercase Latin2
trantab('polyalloy','lat2_ucs') → ‘POLYALLOY’.
tranwrd(str,substr,replace_with)
Replaces or removes all occurrences of a substring within a
string. Trailing spaces in str and replace_with are significant.
The result’s default length is 200.
Examples:
tranwrd('Miss J. Kim','Miss','Ms.') → ‘Ms. J. Kim’.
trunc(1,8) → 1.00000000.
trunc(1.3,3) → 1.29980469.
trunc(1.3,4) → 1.29999924.
trunc(1.3,8) → 1.30000000.
trunc(1.5,4) → 1.50000000.
Examples:
upcase('Place de l''Étoile') → ‘PLACE DE L'ÉTOILE’.
uuidgen([max_warnings [,binary_result]]) 9+
Generates a universal unique identifier (UUID), returning a
36-character string by default. uuidgen logs no more than
max_warnings (default 1) warnings. Set binary_result to non-
Example:
Create a dataset of attributes of the variables in mydata
data vars;
length name label infmt fmt $ 32 type $ 1;
drop dataset_id nvars i rc;
dataset_id=open('mydata','i');
nvars=attrn(dataset_id,'nvars');
do i=1 to nvars;
name=varname(dataset_id,i);
label=varlabel(dataset_id,i);
infmt=varinfmt(dataset_id,i);
fmt=varfmt(dataset_id,i);
type=vartype(dataset_id,i);
output;
end;
rc=close(dataset_id);
run;
See also: call vnext (p. 37), getvarc (p. 81), getvarn (p. 81),
v<attr> (p. 161).
varnum(dataset_id,var_name)
Returns the number of a named variable’s position in a SAS
dataset (or zero if the variable isn’t in the dataset). dataset_id
is the dataset identifier that open (p. 111) returns. var_name is
the variable’s name. PROC CONTENTS produces the same
variable numbers.
Example: varnum(open('mydata.dsn','i'),'x4') → 4.
See also: getvarc (p. 81), getvarn (p. 81), var<attr> (p. 160).
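A minimal sketch of the open/varnum/close sequence. It assumes the sashelp.class sample dataset is installed (it is not mentioned in this entry), and it checks the identifier before use because open returns zero on failure.

```sas
data _null_;
  dsid=open('sashelp.class','i');   /* assumption: sample dataset exists */
  if dsid>0 then do;
    pos=varnum(dsid,'age');         /* zero if the variable isn't found */
    put pos=;
    rc=close(dsid);                 /* release the dataset */
  end;
  else put 'Could not open sashelp.class';
run;
```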
v<attr>(var)
v<attr>x(str)
Group of functions that returns attributes of a specified vari-
able in a SAS dataset. In the v<attr> functions, var is a vari-
able name or an array reference (not an expression or quoted
character constant). In the v<attr>x functions, str is a charac-
ter expression that evaluates to a variable name (but can’t be
an array reference). For example, use vformatd(myvar) or
vformatdx('myvar'). Replace attr with one of the values list-
ed in the following table.
Example:
Prints n=x t=N fn=COMMA w=10 d=2 len=8 v=1,234.00
data _null_;
x=1234;
format x comma10.2;
n=vname(x);
t=vtype(x);
fn=vformatn(x);
w=vformatw(x);
d=vformatd(x);
len=vlength(x);
v=vvalue(x);
put n= t= fn= w= d= len= v=;
run;
year(sas_date)
Extracts the four-digit year from a SAS date.
Example: year(today()) → 2005 (today is 2-Sep-2005).
See also: day (p. 56), month (p. 104), qtr (p. 126), week (p. 163),
weekday (p. 163).
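A minimal sketch that takes a date constant apart with year and the related functions listed above. The date matches the example in this entry; the variable names are illustrative.

```sas
data _null_;
  d='02sep2005'd;
  y=year(d);
  q=qtr(d);
  m=month(d);
  dd=day(d);
  wd=weekday(d);   /* 1=Sunday ... 7=Saturday */
  put y= q= m= dd= wd=;
run;
```

This prints y=2005 q=3 m=9 dd=2 wd=6 (2-Sep-2005 was a Friday).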
Examples:
sdate='1feb2004'd; edate='1may2007'd;
yrdif(sdate,edate,'30/360') → 3.25.
yrdif(sdate,edate,'act/act') → 3.2440676697.
yrdif(sdate,edate,'act/360') → 3.2916666667.
yrdif(sdate,edate,'act/365') → 3.2465753425.
zipcity(zip) 9+
Converts a zip code to a city name, comma, space, and two-
character state postal code (≤ 20 characters). zip is numeric or
character, with or without leading zeros (trailing zeros are
required). zipcity uses the sashelp.zipcode built-in data-
set. Examples: See the following table. See also: zipfips (p. 165),
zipname (p. 166), zipnamel (p. 166), zipstate (p. 166).
zip        zipcity()
‘94123’    ‘San Francisco, CA’
80303      ‘Boulder, CO’
‘04609’    ‘Bar Harbor, ME’
4609       ‘Bar Harbor, ME’
‘637’      ‘Sabana Grande, PR’
‘xxx’      ‘ ’ (missing; unknown zip code)
zipfips(zip)
Converts a zip code to a numeric FIPS (Federal Information
Processing Standards) code. zip is numeric or character, with
or without leading zeros (trailing zeros are required). Exam-
ples: The following table shows examples of zipfips, zipname
(p. 166), zipnamel (p. 166), and zipstate (p. 166).
See also: fipname (p. 73), fipnamel (p. 74), fipstate (p. 74),
stfips (p. 150), zipcity (p. 165).
zipname(zip)
Converts a zip code to its state or U.S. territory name in upper-
case (≤ 20 characters). zip is numeric or character, with or
without leading zeros (trailing zeros are required). Examples:
See zipfips (p. 165). See also: fipname (p. 73), stname (p. 150),
zipcity (p. 165), zipnamel (p. 166), zipstate (p. 166).
zipnamel(zip)
Converts a zip code to its state or U.S. territory name in mixed
case (≤ 20 characters). zip is numeric or character, with or
without leading zeros (trailing zeros are required). Examples:
See zipfips (p. 165). See also: fipnamel (p. 74), stnamel
(p. 150), zipcity (p. 165), zipname (p. 166), zipstate (p. 166).
zipstate(zip)
Converts a zip code to its two-character state or U.S. territory
postal code or GSA geographic code in uppercase. zip is
numeric or character, with or without leading zeros (trailing
zeros are required). Examples: See zipfips (p. 165). See also:
fipstate (p. 74), zipcity (p. 165), zipname (p. 166),
zipnamel (p. 166).
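A minimal sketch that applies the zip-code functions to the same input. The zip codes are borrowed from the zipcity table; the variable names are illustrative.

```sas
data _null_;
  length zip $ 5;
  do zip='94123','80303','04609';
    fips=zipfips(zip);      /* numeric FIPS code */
    state=zipstate(zip);    /* two-character postal code */
    upper=zipname(zip);     /* state name in uppercase */
    mixed=zipnamel(zip);    /* state name in mixed case */
    put zip= fips= state= upper= mixed=;
  end;
run;
```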