SAS Interview Questions

The document provides answers to common SAS interview questions covering topics such as reading in data, using SAS statements and functions, debugging programs, handling missing values, and using procedures. It includes explanations of SAS concepts like informats and formats as well as code examples for sorting, merging, and restricting output. The questions range from very basic to more advanced topics involving internal SAS processing.

Uploaded by

Dilshad Alam
Copyright
© © All Rights Reserved

More SAS Interview Questions

Very Basic
 What SAS statements would you code to read an external raw data file to a
DATA step?
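Ans:- the INFILE statement (to identify the file) and the INPUT statement (to read its fields). PROC IMPORT can also read some external files, but it is a procedure, not a DATA step statement.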
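As a minimal sketch of reading a raw file with INFILE and INPUT: the file path, the comma delimiter, the header row and the variable names below are all assumptions made for illustration.

```sas
/* Hypothetical file path and layout: adjust both to the real file */
data work.patients;
    infile '/path/to/patients.txt' dlm=',' dsd firstobs=2 truncover;
    input id name :$20. visit_date :date9. weight;
    format visit_date date9.;
run;
```

INFILE names the file and sets the reading options; INPUT lists the fields to read, with informats where the default is not enough.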

 How do you read in the variables that you need?

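Ans:- when reading raw data, list only the needed fields on the INPUT statement; when reading a SAS data set, use the KEEP= data set option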

 What is the difference between an informat and a format? Name three


informats or formats.
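Informat – tells SAS how to read data values into a variable
Format – tells SAS how to write (display) data values
Three examples: DATE9., MMDDYY10. and COMMA10.2 (each exists as both an informat and a format)
eg. data _null_;
format date date9.;
date = Today();
put date;
run;

data _null_;
format date mmddyy10.;
date = Today();
put date;
run;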

 Name and describe three SAS functions that you have used, if any?
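data nunu;
set july.export;
Substr_p = Substr(Name,1,3); /* first three characters of Name */
mean_p = Mean(Height,Weight); /* mean of the nonmissing arguments */
sum_p = Sum(Height,Weight); /* sum of the nonmissing arguments */
Left_p = Left(Name); /* left-aligns the value */
UPcase_p = Upcase(Name); /* converts to uppercase */
Catx_p = Catx(',',Name,Age); /* joins the trimmed values with a comma */
Index_p = Index(Name,'a'); /* position of the first 'a', 0 if not found */
run;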

 How would you code the criteria to restrict the output to be produced?

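Ans – Use a WHERE statement (or a subsetting IF in a DATA step) to restrict which observations are processed, and the NOPRINT option to suppress printed output from a procedure
Eg
proc means sum data = July.Export NoPrint;
Class sex;
Var Height Weight;
output out = Work.New (Drop=_:) sum=;
run;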

 If reading an external file to produce an external file, what is the shortcut to


write that record without coding every single variable on the record?
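One common answer is the _INFILE_ specification on the PUT statement, which writes the current input record as it was read. A minimal sketch follows; both file paths are placeholders.

```sas
/* Hypothetical file paths: replace with real locations */
data _null_;
    infile '/path/to/input.txt';
    file '/path/to/output.txt';
    input;          /* load the record into the input buffer without naming variables */
    put _infile_;   /* write the input buffer contents unchanged */
run;
```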

 If you're not wanting any SAS output from a data step, how would you code
the data statement to prevent SAS from producing a set?

Ans – Data _null_;

 If you have a data set that contains 100 variables, but you need only five of
those, what is the code to force SAS to use only those variable?

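Ans: Use the KEEP= data set option. It can go on the DATA statement or the SET statement, but the SET statement is advisable because the unwanted variables are then never read into the program data vector.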
data New_h;
set July.Export (Keep = Age);
run;

 Code a PROC SORT on a data set containing State, District and County as
the primary variables, along with several numeric variables.

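Ans: list the key variables on the BY statement; for the data set in the question that is by State District County; and with the sample data used here:
proc sort data= July.Export;
by Sex Name _numeric_; /* or _character_ in place of _numeric_ */
run;

proc sort data= July.Export;
by _all_;
run;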

 How would you delete duplicate observations?


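Use the NODUPRECS option on the PROC SORT statement. It removes an observation only when it is identical to the one before it after sorting, so sort by enough variables (for example by _all_) to bring duplicates together.
Proc sort data= July.Export out= work.New noduprecs;
By Name;
Run;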

 How would you delete observations with duplicate keys?


Use the NODUPKEY option on the PROC SORT statement. It keeps the first observation for each combination of BY values and deletes the rest.
Proc sort data= July.Export out= work.New nodupKey;
By Name;
Run;

 How would you code a merge that will keep only the observations that have
matches from both sets.

Ans – create IN= variables on the MERGE statement (for example in=a and in=b) and keep the matches with a subsetting IF: if a & b;
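A minimal sketch of a match-only merge. The data sets work.left and work.right and the key variable id are hypothetical names chosen for illustration.

```sas
/* Hypothetical data sets work.left and work.right share the key variable id */
proc sort data=work.left;  by id; run;
proc sort data=work.right; by id; run;

data work.both;
    merge work.left (in=a) work.right (in=b);
    by id;
    if a & b;   /* keep only ids present in both inputs */
run;
```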

 How would you code a merge that will write the matches of both to one data set,
the non-matches from the left-most data set to a second data set, and the non-matches of
the right-most data set to a third data set.

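Ans – name three data sets on the DATA statement, create IN= variables on the MERGE statement, and route each observation with OUTPUT: if a & b to the first data set, if a & not b to the second, and if b & not a to the third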
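A sketch of the three-way split, assuming two hypothetical inputs, work.left and work.right, that are already sorted by a shared key id:

```sas
/* work.left and work.right are hypothetical and assumed sorted by id */
data work.matched work.left_only work.right_only;
    merge work.left (in=a) work.right (in=b);
    by id;
    if a and b then output work.matched;      /* present in both */
    else if a then output work.left_only;     /* only in the left-most data set */
    else output work.right_only;              /* only in the right-most data set */
run;
```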

Internals
 Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.
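Ans= Compile. When you submit a DATA step, SAS first checks the syntax of the
statements and translates them into machine code; only then does it execute the step.

 At compile time when a SAS data set is read, what items are created?
Ans= The program data vector (PDV), including the automatic variables _N_ and _ERROR_, and the descriptor portion of the output data set. An input buffer is created only when raw data is read.

 Identify statements whose placement in the DATA step is critical.

Ans= INFILE must come before the INPUT statement that reads the file; declarative statements such as LENGTH and ARRAY must come before the first use of the variables they define; RUN closes the step

 Name statements that function at both compile and execution time.

Ans= DATA is one; INPUT, SET and MERGE also qualify, because they define variables at compile time and read data at execution time

 Name statements that are for execution only.

Ans= DELETE, REPLACE and SELECT are executable statements; others include OUTPUT, STOP and PUT

 In the flow of DATA step processing, what is the first action in a typical
DATA Step?
Ans= Execution begins at the DATA statement: on the first iteration _N_ is set to 1, _ERROR_ is set to 0, and the variables in the program data vector are initialized to missing

 What is _n_?
Ans= An automatic variable that counts the iterations of the DATA step's automatic loop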

Base SAS
 What is the effect of the OPTIONS statement ERRORS=1?
Ans= ERRORS= sets the maximum number of observations for which SAS prints complete
data error messages in the log. With ERRORS=1, full details are printed only for the first
observation that has a data error. The option does not stop the session.

 What's the difference between VAR A1 - A4 and VAR A1 - - A4?

Ans: A1 - A4 is a numbered range list and takes the variables A1 A2 A3 A4, whereas A1 - - A4 is a name range list and takes all variables positioned from A1 to A4 in the data set, whatever their names

 What do the SAS log messages "numeric values have been converted to
character" mean? What are the implications?

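Ans – A numeric value was used where a character value was expected, so SAS converted it automatically using the BEST12. format. The result is right-aligned with leading blanks, and a missing value becomes a period, which can break later comparisons or concatenations. It is safer to convert explicitly with the PUT function.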

 How do you control the number of observations and variables read or


written?

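Ans: KEEP= / DROP= for variables, and the FIRSTOBS= and OBS= options (or a WHERE condition) for observations. _N_ can also be tested in a subsetting IF inside a DATA step.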

 Approximately what date is represented by the SAS date value of 730?


data _null_;
a= 730;
b= put (a,date9.);
put b;
run;
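The code above prints 31DEC1961. SAS date values count days from January 1, 1960, so 730 is about two years later.

 How would you remove a format that has been permanently associated with
a variable?
Ans: Use PROC DATASETS with a MODIFY statement and a FORMAT statement that names the variable but no format. PROC FORMAT creates formats; it does not remove them from variables.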
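A short sketch of removing a stored format with PROC DATASETS. The data set name mydata and the variable name visit_date are hypothetical.

```sas
proc datasets library=work nolist;
    modify mydata;       /* hypothetical data set name */
    format visit_date;   /* a variable listed with no format has its format removed */
run;
quit;
```

This changes only the descriptor portion of the data set, so the data values themselves are not rewritten.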

 What does the RUN statement do?


Ans – it marks the end of the step: SAS stops compiling the step and starts executing it

 What areas of SAS are you most interested in?


Data mining

 What versions of SAS have you used (on which platforms)?


Sas 9.1 on windows and unix

 What are some good SAS programming practices for processing very large
data sets?
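- use KEEP= and DROP= to select only the desired variables in DATA steps, and apply
date/time filters (for example with WHERE) as early as possible

- use a small subset of the data (for example with the OBS= option) to check a program, and use DATA _NULL_ when no output data set is needed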

 What are some problems you might encounter in processing missing values?
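*In Data steps? Arithmetic? Comparisons? Functions? Classifying data?
Arithmetic operators propagate missing values, while functions such as SUM ignore them; in comparisons a missing value is smaller than any number.
data show;
set test;
sum_fun = sum(a,b); /* ignores missing arguments */
arit_fun = a+b; /* missing if a or b is missing */
run;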

 How would you create a data set with 1 observation and 30 variables from a
data set with 30 observations and 1 variable?
Ans : Proc transpose
proc transpose data = new out = trans (drop = _:) prefix = new_ ;
var cid;
run;

proc transpose data = piyush.export out = trans_new (drop = _:) PREFIX = newage_ ;
by name height weight; /* values not to be transposed */
var age; /* value to be transposed */
id sex; /* values that name the new variables */
run;

 What is the different between functions and PROCs that calculate the same
simple descriptive statistics?

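Functions compute a statistic across the variables (arguments) within one observation; procedures such as PROC MEANS compute it across the observations of a data set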

 If you were told to create many records from one record, show how you
would do this using arrays and with PROC TRANSPOSE?

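The array example below loops over every numeric variable and replaces missing values with 0. To create many records from one, put an OUTPUT statement inside the same kind of DO loop.
data new_array;
set exp1;
array test(*) _numeric_;
do i = 1 to dim(test);
if test(i) = . then test(i) = 0;
end;
drop i;
run;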
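A sketch of both approaches to turning one record into many. The data set work.wide and the variables id and q1-q4 are invented for illustration.

```sas
/* Hypothetical wide data set: one record per id with q1-q4 */
data work.wide;
    input id q1-q4;
    datalines;
1 10 20 30 40
2 15 25 35 45
;
run;

/* Array approach: one output record per array element */
data work.long_array (keep=id quarter value);
    set work.wide;
    array q(4) q1-q4;
    do quarter = 1 to dim(q);
        value = q(quarter);
        output;
    end;
run;

/* PROC TRANSPOSE approach: one output record per transposed variable */
proc transpose data=work.wide out=work.long_tr (rename=(col1=value)) name=source;
    by id;
    var q1-q4;
run;
```

Both steps should produce four records per id. The array version gives full control over the new variable names; the PROC TRANSPOSE version needs less code.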

 What are _numeric_ and _character_ and what do they do?


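Ans= _NUMERIC_ and _CHARACTER_ are special SAS name lists, not arrays: _NUMERIC_ refers to all
numeric variables and _CHARACTER_ to all character variables currently defined. They can be used wherever a variable list is allowed, including in an ARRAY statement.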

 How would you create multiple observations from a single observation?


PROC TRANSPOSE, or a DATA step with an OUTPUT statement inside a DO loop

 For what purpose would you use the RETAIN statement?


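Ans – it keeps the value of a variable from one iteration of the DATA step to the next instead of resetting it to missing, for example to build a running total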
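A minimal sketch of RETAIN used for a running total, assuming the SASHELP.CLASS sample data set that ships with SAS is available:

```sas
data work.running;
    set sashelp.class;           /* sample data set shipped with SAS */
    retain total_weight 0;       /* initialized once, then carried forward */
    total_weight = total_weight + weight;
run;
```

Without the RETAIN statement, total_weight would be reset to missing at the top of every iteration and the sum would never accumulate.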

 What is a method for assigning first.VAR and last.VAR to the BY group


variable on unsorted data?
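SAS creates FIRST.var and LAST.var when a BY statement is used. A plain BY statement gives an error if the data is not sorted, so either sort first (as below) or, when the data is grouped but not in sorted order, add the NOTSORTED option to the BY statement.

With sorted data, if first.age selects the first observation in each BY group.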

proc sort data= piyush.export;


by age;
run;

data m;
set piyush.export;
by age;
if first.age
then output;
run;

 What is the order of evaluation of the comparison operators: + - * / ** ( ) ?


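These are arithmetic operators, and the order follows BODMAS: expressions in parentheses first, then ** (exponentiation), then * and /, then + and -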

Testing, debugging
 How could you generate test data with no input data?

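A DATA step does not need input data: assign values directly as below, or use a DO loop with an OUTPUT statement to create many observations.
data _null_;
a= 730;
b= put (a,date9.);
put b;
run;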

 How do you debug and test your SAS programs?


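Ans - Check the log carefully after every run, and write intermediate values to it with PUT statements

 What can you learn from the SAS log when debugging?
Ans – errors, warnings and notes, including how many observations and variables each step read and wrote

 What is the purpose of _error_?

- It is an automatic variable that flags data errors: it is 0 by default and is set to 1 when an error (such as invalid input data) occurs during the current iteration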

 What other SAS features do you use for error trapping and data validation?

- Ans: For validation, use PROC FREQ to get a frequency distribution of the data and spot
unexpected values

Missing values

 How many missing values are available? When might you use them?

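Character has one missing value (blank). Numeric has 28: the period (.), the underscore (._) and the special missing values .A to .Z. The special values can record why a value is missing.

 How do you test for missing values?

In a DATA step, use the MISSING function, or compare with . (numeric) or ' ' (character). PROC FREQ tells us the number of missing values.

 How are numeric and character missing values represented internally?

Numeric as . and character as blank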

Functions

 What do the PUT and INPUT functions do?


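Put – converts a value to character using a format (typically numeric to character)
Input – converts a character value using an informat (typically character to numeric)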
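A small sketch of the two conversions. The variable names are illustrative only.

```sas
data _null_;
    num = 1234.5;
    chr = '31DEC1961';
    num_as_char = put(num, 8.1);       /* numeric -> character, using a format */
    chr_as_date = input(chr, date9.);  /* character -> numeric SAS date, using an informat */
    put num_as_char= chr_as_date=;
run;
```

PUT always returns a character value, and INPUT returns numeric or character depending on the informat.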

 Which date function advances a date, time or date/time value by a given


interval?
- intnx

 What do the MOD and INT function do?

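MOD returns the remainder when the first argument is divided by the second
INT returns the integer part of a number, dropping any fractional part

 How might you use MOD and INT on numerics to mimic SUBSTR on
character strings?

Divide by a power of 10 and apply INT to keep the leading digits, or apply MOD with a power of 10 to keep the trailing digits. For example, INT(20240315/10000) is 2024 and MOD(20240315,100) is 15.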
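A sketch of the digit-extraction idea, using a made-up numeric value in yyyymmdd layout:

```sas
data _null_;
    yyyymmdd = 20240315;                      /* illustrative value */
    year  = int(yyyymmdd / 10000);            /* leading 4 digits */
    month = mod(int(yyyymmdd / 100), 100);    /* middle 2 digits */
    day   = mod(yyyymmdd, 100);               /* trailing 2 digits */
    put year= month= day=;
run;
```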

 How would you determine the number of missing or nonmissing values in


computations?

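In a DATA step, the N and NMISS functions count the nonmissing and missing arguments.
Across observations, use PROC FREQ,
or PROC MEANS with the N and NMISS statistics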

 There is a field containing a date. It needs to be displayed in the format


"ddmonyy" if it's before 1975, "dd mon ccyy" if it's after 1985, and as 'Disco Years'
if it's between 1975 and 1985. How would you accomplish this in data step code?
Using only PROC FORMAT.
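One possible PROC FORMAT answer uses nested formats in square brackets. This is a sketch with assumptions: the format name discofmt is invented, WORDDATX12. is assumed to give the "dd mon ccyy" layout, and the boundary years 1975 and 1985 are treated as part of the 'Disco Years' range.

```sas
proc format;
    value discofmt (default=12)
        low -< '01JAN1975'd          = [date7.]          /* ddmonyy before 1975 */
        '01JAN1975'd - '31DEC1985'd  = 'Disco Years'
        '31DEC1985'd <- high         = [worddatx12.];    /* assumed dd mon ccyy layout */
run;

/* Try the format on one date from each range */
data _null_;
    do d = '15JUN1970'd, '15JUN1980'd, '15JUN1990'd;
        put d discofmt.;
    end;
run;
```

Because the logic lives in the format, any step can apply it with a FORMAT statement and no extra DATA step code.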

 In the following DATA step, what is needed for 'fraction' to print to the log?
data _null_; x=1/3; if x=.3333 then put 'fraction'; run;
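The comparison fails because 1/3 is not exactly .3333. One way to make 'fraction' print is to round x to four decimal places before comparing, as in this sketch:

```sas
data _null_;
    x = 1/3;
    if round(x, .0001) = .3333 then put 'fraction';   /* compare the rounded value */
run;
```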

 What is the difference between calculating the 'mean' using the mean
function and PROC MEANS?
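The MEAN function averages across the variables in one observation; PROC MEANS averages across the observations for each variable.

PROCs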
 Have you ever used "Proc Merge"? NO
(be prepared for surprising answers..)

 If you were given several SAS data sets you were unfamiliar with, how would
you find out the variable names and formats of each dataset?

PROC CONTENTS,
with the VARNUM option to list the variables in the order they are stored in the data set

 What SAS PROCs have you used and consider yourself proficient in using?

 How would you keep SAS from overlaying the a SAS set with its sorted
version?
Use the OUT= option on the PROC SORT statement to write the sorted data to a different data set

 In PROC PRINT, can you print only variables that begin with the letter
"A"?
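Yes. A WHERE condition such as where Name like 'A%' restricts observations, not variables. To print only the variables whose names begin with A, use a name prefix list on the VAR statement: var A:;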

 What are some differences between PROC SUMMARY and PROC MEANS?
PROC MEANS with the NOPRINT option behaves like PROC SUMMARY: MEANS prints its results by default, SUMMARY does not

 PROC FREQ:
*Code the tables statement for a single-level (most common) frequency.
*Code the tables statement to produce a multi-level frequency.
*Name the option to produce a frequency line items rather that a table.
*Produce output from a frequency. Restrict the printing of the table.
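The four PROC FREQ items above could be sketched as follows, assuming the SASHELP.CLASS sample data set is available. The output data set name work.sex_counts is arbitrary.

```sas
proc freq data=sashelp.class;
    tables sex;                                  /* single-level (one-way) frequency */
    tables sex*age;                              /* multi-level (two-way) crosstabulation */
    tables sex*age / list;                       /* line items instead of a table */
    tables sex / noprint out=work.sex_counts;    /* output data set, no printed table */
run;
```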

 PROC MEANS:

*Code a PROC MEANS that shows both summed and averaged output of the data.
*Code the option that will allow MEANS to include missing numeric data to be
included in the report.
*Code the MEANS to produce output to be used later.
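A sketch covering the three PROC MEANS items above, again assuming SASHELP.CLASS. The MISSING option shown here makes missing values of the CLASS variable count as a valid group, which is the usual reading of that question.

```sas
proc means data=sashelp.class sum mean missing;
    class sex;
    var height weight;
    output out=work.class_stats sum= mean= / autoname;   /* saved for later use */
run;
```

AUTONAME gives each output statistic a distinct variable name, such as Height_Sum and Height_Mean.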
 Do you use PROC REPORT or PROC TABULATE? Which do you prefer?

Explain.
Merging/Updating
 What happens in a one-on-one merge? When would you use one?
 How would you combine 3 or more tables with different structures?
 What is a problem with merging two data sets that have variables with the
same name but different data?
 When would you choose to MERGE two data sets together and when would
you SET two data sets?
 Which data set is the controlling data set in the MERGE statement?
 How do the IN= variables improve the capability of a MERGE?
 Explain the message 'MERGE HAS ONE OR MORE DATASETS WITH
REPEATS OF BY VARIABLES'.

Simple statistics
 How would you generate 1000 observations from a normal distribution with
a mean of 50 and standard deviation of 20. How would you use PROC CHART to
look at the distribution? Describe the shape of the distribution.
 How do you generate random samples?
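A minimal sketch for the normal-distribution question. The seed value and the data set name are arbitrary choices.

```sas
data work.normal;
    call streaminit(12345);          /* arbitrary seed for repeatability */
    do i = 1 to 1000;
        x = rand('normal', 50, 20);  /* mean 50, standard deviation 20 */
        output;
    end;
run;

proc chart data=work.normal;
    vbar x;                          /* vertical bar chart of the distribution */
run;
```

The chart should look roughly bell-shaped and symmetric around 50.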

Customized Report Writing


 What is the purpose of the statement DATA _NULL_ ;?
 What is the pound sign used for in the DATA _NULL_?
 What would you use the trailing @ sign for?
 For what purpose(s) would you use the RETURN statement?
 How would you determine how far down on a page you have printed in order
to print out footnotes?
 What is the purpose of using the N=PS option?

Macro
 What system options would you use to help debug a macro?
 Describe how you would create a macro variable.
 How do you identify a macro variable?
 How do you define the end of a macro?
 How do you assign a macro variable to a SAS variable?
 For what purposes have you used SAS macros?
 If you use a SYMPUT in a DATA step, when and where can you use the
macro variable?
 What do you code to create a macro?
 Describe how you would pass data to a macro.
 You have five data sets that need to be processed identically; how would you
simplify that processing with a macro?
 How would you code a macro statement to produce information on the SAS
log? This statement can be coded anywhere.
 How do you add a number to a macro variable?
 If you need the value of a variable rather than the variable itself, what would
you use to load the value to a macro variable?
 Can you execute a macro within a macro? Describe.
 Can you define a macro within another macro? If so, how would SAS know where
the current macro ended and the new one began?
 How are parameters passed to a macro?
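Several of the macro questions above can be illustrated in one short sketch. The macro name summarize and its parameters are hypothetical, and the SASHELP sample data sets are assumed to be available.

```sas
options mprint symbolgen;            /* show generated code and resolved values in the log */

/* %MACRO starts the definition, %MEND ends it; dsn= and var= are keyword parameters */
%macro summarize(dsn=, var=);
    %put NOTE: summarizing &dsn;     /* write a message to the log */
    proc means data=&dsn n mean;
        var &var;
    run;
%mend summarize;

/* The same processing applied to more than one data set */
%summarize(dsn=sashelp.class, var=height)
%summarize(dsn=sashelp.cars,  var=msrp)

/* Load a data value into a macro variable from a DATA step */
data _null_;
    set sashelp.class (obs=1);
    call symputx('first_name', name);
run;
%put First name read: &first_name;   /* usable only after the DATA step has run */
```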

Pharmaceutical Industry
 Describe the types of SAS programming tasks that you performed: Tables?
Listings? Graphics? Ad hoc reports? Other?
 Have you been involved in editing the data or writing data queries?
 What techniques and/or PROCs do you use for tables?
 Do you prefer PROC REPORT or PROC TABULATE? Why?
 Are you involved in writing the inferential analysis plan? Tables
specifications?
 What do you feel about hardcoding?
 How experienced are you with customized reporting and use of DATA
_NULL_ features?
 How do you write a test plan?
 What is the difference between verification and validation?

Intangibles
 What was the last computer book you purchased? Why?
 What is your favorite all time computer book? Why?
 For contractors:
*Will it bother you if the guy at the next desk times the frequency and duration of
your bathroom/coffee breaks on the grounds that 'you are getting paid twice as
much as he is'?

*How will you react when, while consulting a SAS documentation manual to get
an answer to a problem, someone says: 'hey, I thought you were supposed to know
all that stuff already, and not have to look it up in a book!'

*Can you continue to write code while the rest of the people on the floor where
you work have a noisy party to which you were not invited?
Non-Technical
 Can you start on Monday?
 Do you think professionally?
*How do you put a giraffe into the refrigerator? Correct answer: Open the
refrigerator door, put the giraffe in, and close the door. This question tests whether
or not the candidate is doing simple things in a complicated way.

*How do you put an elephant in the refrigerator? Incorrect answer: Open the
refrigerator door, put in the elephant, and close the door. Correct answer: Open the
refrigerator door, take out the giraffe, put in the elephant, and close the door. This
question tests your foresight.

*The Lion King is hosting an animal conference. All the animals in the world
attend except one. Which animal does not attend? Correct answer: The elephant.
The elephant is in the refrigerator, remember? This tests if you are capable of
comprehensive thinking.

*There is a river notoriously known for its large crocodile population. With
ease, how do you safely cross it? Correct answer: Simply swim across. All of the
crocodiles are attending the Lion King's animal conference. This tests your
reasoning ability.

Open-ended questions
 Describe a time when you were really stuck on a problem and how you solved
it.
 Describe the function and utility of the most difficult SAS macro that you
have written.
 Give me an example of ..
 Tell me how you dealt with ...
 How do you handle working under pressure?
 Of all your work, where have you been the most successful?
 What are the best/worst aspects of your current job?
 If you could design your ideal job, what would it look like?
 How necessary is it to be creative in your work?
 If money were no object, what would you like to do?
 What would you change about your job?
 What happens in the following code, if you type 8 instead of *? proc sql noprint; create table abc as select 8 from lib.abc; quit;
 What are the difficulties you faced while doing a vital signs table or dataset?
 We have a string like this: "kannafromsalembut". From this I want to get only "fromsal", with one condition: without using the substring function. We cannot use SCAN here because the given string has no delimiter. So give an answer without using substring.
 How to get part of a string from the source string without using the substring function in SAS?
 How to read a character value without using the SUBSTR function in SAS?
 proc means? proc sort? proc append? proc freq? proc print? proc content? (Oracle)
 What is the order of evaluation of the comparison, logical and relational operators?
 PROC SQL always ends with a QUIT statement. Why can't you use RUN in PROC SQL? (HP)
 What is a shift table? Have you ever created a shift table? (Accenture)
 What are the rows present in a protocol violation table? (Accenture)
 What are all the problems you faced while validating tables and reports? (Accenture)
 What are TEAEs? (Accenture)
 How do you validate tables and reports? (Accenture)
 What procedure did you use to calculate the p-value? (Accenture)
 What are the efficacy variables in your study?

Key concepts
A SAS technical interview typically starts with a few of the key concepts that are essential in SAS programming. These questions are intended to separate those who have actual substantive experience with SAS from those who have used it only in a very limited or superficial way. If you have spent more than a hundred hours reading and writing SAS programs, it is safe to assume that you are familiar with topics such as these:

SORT procedure

Data step logic

KEEP=, DROP= dataset options

Missing values

Reset to missing, or the RETAIN statement

Log

Data types

FORMAT procedure for creating value formats

IN= dataset option

Tricky Stuff

After the interviewer is satisfied that you have used SAS to do a variety of things, you are likely to get some more substantial questions about SAS processing. These questions typically focus on some of the trickier aspects of the way SAS works, not because the interviewer is trying to trick you, but to give you a chance to demonstrate your knowledge of the details of SAS processing. At the same time, you can show how you approach technical questions and issues, and that is ultimately more important than your knowledge of any specific feature in SAS.

STOP statement

The processing of the STOP statement itself is ludicrously simple. However, when you explain the how and why of a STOP statement, you show that you understand:

How a SAS program is divided into steps, and the difference between a data step and a proc step

The automatic loop in the data step

Conditions that cause the automatic loop to terminate, or to fail to terminate
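A classic case where STOP matters is direct access with POINT=. The sketch below assumes the SASHELP.CLASS sample data set, which has more than ten observations.

```sas
data work.sample;
    do i = 1 to 10 by 3;
        set sashelp.class point=i;   /* direct access never reaches end-of-file */
        output;
    end;
    stop;                            /* without STOP the automatic loop would never end */
run;
```

The automatic loop normally ends when a SET or INPUT statement hits end-of-file. With POINT= that never happens, so STOP has to end the step explicitly.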

RUN statement placement

The output of a program may be different based on whether a RUN statement comes before or after a global statement such as an OPTIONS or TITLE statement. If you are aware of this issue, it shows that you have written SAS programs that have more than the simplest of objectives. At the same time, your comments on this subject can also show that you know:

The distinction between data step statements, proc step statements, and global statements

How SAS finds step boundaries


The importance of programming style
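The effect of RUN placement can be seen with a TITLE statement between two steps. This is a sketch that assumes the SASHELP.CLASS sample data set.

```sas
title 'First report';
proc print data=sashelp.class (obs=3);
run;                      /* step boundary: this PROC PRINT executes here, under 'First report' */

title 'Second report';
proc print data=sashelp.class (obs=3);
run;
```

If the first RUN statement is removed, SAS does not find the step boundary until it reaches the second PROC statement. By then the global statement title 'Second report' has already taken effect, so both listings carry the second title.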

SUM or +

Adding numbers with the SUM function provides the same result that you get with the + numeric operator. For example, SUM(8, 4, 3) provides the same result as 8 + 4 + 3. Sometimes, though, you prefer to use the SUM function, and at other times, the + operator. As you explain this distinction, you can show that you understand:

Missing values

Propagation of missing values

Treatment of missing values in statistical calculations in SAS

Why it matters to handle missing values correctly in analytic processing

The use of 0 as an argument in the SUM function to ensure that the result is not a missing value

The performance differences between functions and operators

Essential ideas of data cleaning
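The points about missing values can be sketched in a few lines. The variable names are illustrative only.

```sas
data _null_;
    a = 8; b = .; c = 3;
    plus_total  = a + b + c;       /* missing, because b is missing */
    sum_total   = sum(a, b, c);    /* 11: missing arguments are ignored */
    all_missing = sum(., .);       /* missing: there are no nonmissing arguments */
    safe_total  = sum(0, ., .);    /* 0: the 0 argument guarantees a nonmissing result */
    put plus_total= sum_total= all_missing= safe_total=;
run;
```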

Statistics: functions vs. proc steps

Computing a statistic with a function, such as the MEAN function, is not exactly the same as computing the same statistic with a procedure, such as the UNIVARIATE procedure. As you explain this distinction, you show that you understand:

The difference between summarizing across variables and summarizing across observations

The statistical concept of degrees of freedom as it relates to the difference between sample statistics and population statistics, and the way this is implemented in some SAS procedures with the VARDEF= option

REPLACE= option

Many SAS programmers never have occasion to use the REPLACE= dataset option or system option, but if you are familiar with it, then you have to be aware of:

The distinction between the input dataset and the output dataset in a step that makes changes in a set of data

The general concept of name conflicts in programming theory

Issues of programming style related to name conflicts

How the system option compares to the corresponding dataset option

A question on this topic may also give you the opportunity to mention syntax check mode and issues of debugging SAS programs.

WHERE vs. IF
Sometimes, it makes no difference whether you use a WHERE statement or a subsetting IF statement. Sometimes it makes a big difference. In explaining this distinction, you have the opportunity to discuss:

The distinction between data steps and proc steps

The difference between declaration (declarative) statements and executable (action) statements

The significance of the sequence of executable statements in a data step

Some of the finer points of merging SAS datasets

A few points of efficiency theory (although tests do not seem to bear the theory out in this case)

The origin of the WHERE clause in SQL (of course, bring this up only if you’re good at SQL)

WHERE operators that are not available in the IF statement or other data step statements
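A short sketch of the contrast, assuming the SASHELP.CLASS sample data set (heights in inches, weights in pounds). The computed variable bmi and the cutoff of 18 are illustrative only.

```sas
/* WHERE filters observations before they enter the program data vector */
data work.where_subset;
    set sashelp.class;
    where name like 'J%';             /* LIKE is available in WHERE, not in IF */
run;

/* A subsetting IF is an executable statement, so it can test computed values */
data work.if_subset;
    set sashelp.class;
    bmi = 703 * weight / height**2;   /* computed in this step */
    if bmi > 18;
run;
```

A WHERE statement could not test bmi here, because WHERE sees only variables that already exist in the input data set.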

Compression

Compressing a SAS dataset is easy to do, so questions about it have more to do with determining when it is a good idea. You can weigh efficient use of storage space against efficient use of processing power, for example. Explain how you use representative data and performance measurements from SAS to test efficiency techniques, and you establish yourself as a SAS programmer who is ready to deal with large volumes of data. If you can explain why compression is effective in SAS datasets and observations larger than a certain minimum size and why binary compression works better than character compression for some kinds of data, then it shows you take software engineering seriously.

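A sketch of the two ways to request compression, using the small SASHELP.CLASS sample data set only to show the syntax. On data this small, compression may well increase the file size.

```sas
/* Data set option: compress one output data set */
data work.class_char (compress=char);      /* character (run-length) compression */
    set sashelp.class;
run;

data work.class_binary (compress=binary);  /* binary compression */
    set sashelp.class;
run;

/* System option: compress every data set created afterwards in the session */
options compress=yes;
```

After each step the log should report how much the data set size changed, which is the measurement to compare across representative data.
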
Macro processing

Almost the only reason interviewers ask about macros is to determine whether you appreciate the distinction between preprocessing and processing. Most SAS programmers are somewhat fuzzy about this, so if you have it perfectly clear in your mind, that makes you a cut above the rest, and if not, at least you should know that this is a topic you have to be careful about. There are endless technical issues with SAS macros, such as the system options that determine how much shows up in the log; your experience with this is especially important if the job involves maintaining SAS code written with macros.

SAS macro language is somewhat controversial, so be careful what you say of your opinion of it. To some managers, macro use is what distinguishes real SAS programmers from the pretenders, but to others, relying on macros all the time is a sure sign of a lazy, fuzzy-headed programmer. If you are pressed on this, it is probably safe to say that you are happy to work with macros or without them, depending on what the situation calls for.

Procedure vs. macro

The question, “What is the difference between a procedure and a macro?” can catch you off guard if it has never occurred to you to think of them as having anything in common. It can mystify you in a completely different way if you have thought of procedures and macros as interchangeable parts. You might mention:

The difference between generating SAS code, as a macro usually does, and taking action directly on SAS data, as a procedure usually does

What it means, in terms of efficiency, for a procedure to be a compiled program

The drastic differences in syntax between a proc step and a macro call

The IMPORT and EXPORT procedures, which with some options generate SAS statements much like a macro

The %SYSFUNC macro function and %SYSCALL macro statement that allow a macro to take action directly on SAS data, much like a procedure

Scope of macro variables

If the interviewer asks a question about the scope of macro variables or the significance of the difference between local and global macro variables, the programming concept of scope is being used to see how you handle the new ways of thinking that programming requires. The possibility that the same name could be used for different things at different times is one of the more basic philosophical conundrums in computer programming. If you can appreciate the difference between a name and the object that the name refers to, then you can probably handle all the other philosophical challenges of programming.
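A small sketch of the same name referring to two different macro variables. The macro name show_scope and the variable name level are hypothetical.

```sas
%let level = global value;           /* created in the global symbol table */

%macro show_scope;
    %local level;                    /* a separate, local variable with the same name */
    %let level = local value;
    %put Inside the macro: &level;
%mend show_scope;

%show_scope
%put Outside the macro: &level;
```

Inside the macro the name resolves to the local variable; outside, the global value is unchanged.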

Run groups

Run-group procedures are not a big part of base SAS, so a question about run-group processing and the difference between the RUN and QUIT statements probably has more to do with:

What a procedure is

What a step is

All the work SAS has to go through as it alternately acquires a part of the SAS program from the execution queue, then executes that part of the program

Connecting the program and the log messages

SAS date values

Questions about SAS date values have less to do with whether you have memorized the reference point of January 1, 1960, than with whether you understand the implications of time data treated as numeric values, such as:

Using a date format to display the date variable in a meaningful way

Computing a length of time by subtracting SAS date values
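Both points can be sketched with two date constants. The variable names are illustrative.

```sas
data _null_;
    start  = '01JAN1960'd;           /* SAS date value 0 */
    finish = '31DEC1961'd;           /* SAS date value 730 */
    days_between = finish - start;   /* plain subtraction gives a length of time in days */
    put start= date9. finish= date9. days_between=;
run;
```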

Efficiency techniques

With today’s bigger, faster computers, efficiency is a major concern only for the very largest SAS projects. If you get a series of technical questions about efficiency, it could mean one of the following:

The employer is undertaking a project with an especially large volume of data


The designated computer is not one of today’s bigger, faster computers

The project is weighed down with horrendously inefficient code, and they are hoping you will be able to clean it all up

On the other hand, the interviewer may just be trying to gauge how well you understand the way SAS statements correspond to the actions the computer takes or how seriously you take the testing process for a program you write.

Debugger

Most SAS programmers never use the data step debugger, so questions about it are probably intended to determine how you feel about debugging: does the debugging process bug you, or is debugging one of the most essential things you do as a programmer?

Informats vs. formats

If you appreciate the distinction between informats and formats, it shows that:

You can focus on details

It doesn’t confuse you that two routines have the same name

You have some idea of what is going on when a SAS program runs

TRANSPOSE procedure

The TRANSPOSE procedure has a few important uses, but questions about it usually don’t have that much to do with the procedure itself. The intriguing characteristic of the TRANSPOSE procedure is that input data values determine the names of output variables. The implication of this is that if the data values are incorrect, the program could end up with the wrong output variables. In what other ways does a program depend on having valid or correct data values as a starting point? What does it take to write a program that will run no matter what input data values are supplied?

_N_

Questions about the automatic variable _N_ (this might be pronounced “underscore N underscore” or just “N”) are meant to get at your understanding of the automatic actions of the data step, especially the automatic data step loop, also known as the observation loop.

A possible follow-up question asks how you can store the value of _N_ in the output SAS dataset. If you can answer this, it may show that you know the properties of automatic variables and know how to create a variable in the data step.
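A minimal sketch of storing _N_, assuming the SASHELP.CLASS sample data set. The variable name obs_number is arbitrary.

```sas
data work.numbered;
    set sashelp.class;
    obs_number = _n_;    /* automatic variables are not written out, so copy the value */
run;
```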

PUT function

A question about the PUT function might seem to be a trick question, but it is not meant to be. Beyond showing that you aren’t confused by two things as different as a statement and a function having the same name, your discussion of the PUT function can show:

An understanding of what formats are

Your experience in creating variables in data step statements

A few of the finer points of SQL query optimization

Important SAS trivia

Some SAS trivia may be important to know in a technical interview, even though it may never come up in your actual SAS programming work.

MERGE is a data step statement only. There is no MERGE procedure. “PROC MERGE” is a mythical construction created years ago by Rhena Seidman, and if you are asked about it in a job interview, it is meant as a trick question.

It is possible to use the MERGE statement without a BY statement, but this usually occurs by mistake.

SAS does not provide an easy way to create a procedure in a SAS program. However, it is easy to define informats and formats and use them in the same program. Beginning with SAS 9.2, the same is true of functions.

The MEANS and SUMMARY procedures are identical except for the defaults for the PRINT option and VAR statement.

Much of the syntax of the TABULATE procedure is essentially the same as that of the SUMMARY procedure.

CARDS is another name for DATALINES (or vice versa).

“DATA _NULL_” is commonly used as a code word to refer to data step programming that creates print output or text data files.

The program data vector (PDV) is a logical block of data that contains the variables used in a data step or proc step. Variables are added to the program data vector in order of appearance, and this is what determines their position (or variable number).

1. What SAS statements would you code to read an external raw data file to a DATA step?

2. How do you read in the variables that you need?

3. Are you familiar with special input delimiters? How are they used?
4. If reading a variable length file with fixed input, how would you prevent SAS from reading the

next record if the last variable didn’t have a value?

5. What is the diference between an informat and a format? Name three informats or formats.

6. Name and describe three SAS functions that you have used, if any?

7. How would you code the criteria to restrict the output to be produced?

8. What is the purpose of the trailing @? The @@? How would you use them?

9. Under what circumstances would you code a SELECT construct instead of IF statements?

10. What statement do you code to tell SAS that it is to write to an external file? What statement do

you code to write the record to the file?


11. If reading an external file to produce an external file, what is the shortcut to write that record

without coding every single variable on the record?

12. If you do not want a SAS data set as output from a DATA step, how would you code the DATA statement to prevent SAS from producing one?

13. What is the one statement to set the criteria of data that can be coded in any step?

14. Have you ever linked SAS code? If so, describe the link and any required statements used to

either process the code or the step itself.

15. How would you include common or reuse code to be processed along with your statements?

16. When looking for data contained in a character string of 150 bytes, which function is the best to

locate that data: scan, index, or indexc?

17. If you have a data set that contains 100 variables, but you need only five of those, what is the

code to force SAS to use only those variables?


18. Code a PROC SORT on a data set containing State, District and County as the primary variables,

along with several numeric variables.

19. How would you delete duplicate observations?

20. How would you delete observations with duplicate keys?

21. How would you code a merge that will keep only the observations that have matches from both sets?

22. How would you code a merge that will write the matches of both to one data set, the non-matches from the left-most data set to a second data set, and the non-matches of the right-most data set to a third data set?

23. What is the Program Data Vector (PDV)? What are its functions?

24. Does SAS ‘Translate’ (compile) or does it ‘Interpret’? Explain.


25. At compile time when a SAS data set is read, what items are created?

26. Name statements that are recognized at compile time only?

27. Identify statements whose placement in the DATA step is critical.

28. Name statements that function at both compile and execution time.

29. Name statements that are execution only.

30. In the flow of DATA step processing, what is the first action in a typical DATA Step?

31. What is _n_?

What has been your most common programming mistake?


What is your favorite programming language and why?

What is your favorite operating system? Why?

Do you observe any coding standards? What is your opinion of them?

What percent of your program code is usually original and what percent copied and modified?

Have you ever had to follow SOPs or programming guidelines?

Which is worse: not testing your programs or not commenting your programs?

Name several ways to achieve efficiency in your program. Explain trade-offs.

What other SAS products have you used and consider yourself proficient in using?

How do you make use of functions?


When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc?

What is the significance of the ‘OF’ in X=SUM(OF a1-a4, a6, a9);?

What do the PUT and INPUT functions do?

Which date function advances a date, time or date/time value by a given interval?

What do the MOD and INT functions do?

How might you use MOD and INT on numerics to mimic SUBSTR on character strings?

In ARRAY processing, what does the DIM function do?

How would you determine the number of missing or nonmissing values in computations?
What is the difference between: x=a+b+c+d; and x=SUM(a,b,c,d);?

There is a field containing a date. It needs to be displayed in the format “ddmonyy” if it’s before 1975, “dd mon ccyy” if it’s after 1985, and as “Disco Years” if it’s between 1975 and 1985. How would you accomplish this in DATA step code? Using only PROC FORMAT?

In the following DATA step, what is needed for 'fraction' to print to the log? data _null_; x=1/3; if x=.3333 then put 'fraction'; run;

What is the difference between calculating the ‘mean’ using the mean function and PROC MEANS?

Have you ever used “Proc Merge”? (Be prepared for surprising answers.)

If you were given several SAS data sets you were unfamiliar with, how would you find out the

variable names and formats of each dataset?

What SAS PROCs have you used and consider yourself proficient in using?
How would you keep SAS from overlaying a SAS data set with its sorted version?

In PROC PRINT, can you print only variables that begin with the letter “A”?

What are some differences between PROC SUMMARY and PROC MEANS?

Code the tables statement for a single-level (most common) frequency.

Code the tables statement to produce a multi-level frequency.

Name the option to produce frequency line items rather than a table.

Produce output from a frequency. Restrict the printing of the table.

Code a PROC MEANS that shows both summed and averaged output of the data.
Code the option that allows MEANS to include missing numeric data in the report.

Code the MEANS to produce output to be used later.

Do you use PROC REPORT or PROC TABULATE? Which do you prefer? Explain.

What happens in a one-on-one merge? When would you use one?

How would you combine 3 or more tables with different structures?

What is a problem with merging two data sets that have variables with the same name but different data?

When would you choose to MERGE two data sets together and when would you SET two data sets?

Which data set is the controlling data set in the MERGE statement?
How do the IN= variables improve the capability of a MERGE?

Explain the message “MERGE HAS ONE OR MORE DATASETS WITH REPEATS OF BY VARIABLES”.

How would you generate 1000 observations from a normal distribution with a mean of 50 and standard deviation of 20? How would you use PROC CHART to look at the distribution? Describe the shape of the distribution.

How do you generate random samples?

What is the purpose of the statement DATA _NULL_ ;?

What is the pound sign used for in a DATA _NULL_ step?

What would you use the trailing @ sign for?

For what purpose(s) would you use the RETURN statement?


How would you determine how far down on a page you have printed in order to print out footnotes?

What is the purpose of using the N=PS option?

What system options would you use to help debug a macro?

Describe how you would create a macro variable.

How do you identify a macro variable?

How do you define the end of a macro?

How do you assign a macro variable to a SAS variable?

For what purposes have you used SAS macros?


What is the difference between %LOCAL and %GLOBAL?

How long can a macro variable be? A token?

If you use a SYMPUT in a DATA step, when and where can you use the macro variable?

What do you code to create a macro? End one?

Describe how you would pass data to a macro.

You have five data sets that need to be processed identically; how would you simplify that processing

with a macro?

How would you code a macro statement to produce information on the SAS log? This statement can

be coded anywhere.

How do you add a number to a macro variable?


If you need the value of a variable rather than the variable itself, what would you use to load the

value to a macro variable?

Can you execute a macro within a macro? Describe.

Can you define a macro within another macro? If so, how would SAS know where the current macro ended and the new one began?

How are parameters passed to a macro?

1. What is the effect of the OPTIONS statement ERRORS=1?


2. What’s the difference between VAR A1 - A4 and VAR A1 -- A4?

3. What do the SAS log messages "numeric values have been converted to character" mean? What are the
implications?

4. Why is a STOP statement needed for the POINT= option on a SET statement?

5. How do you control the number of observations and/or variables read or written?

6. Approximately what date is represented by the SAS date value of 730?

7. How would you remove a format that has been permanently associated with a variable?

8. What does the RUN statement do?

9. Why is SAS considered self-documenting?


10. What areas of SAS are you most interested in?

11. Briefly describe 5 ways to do a "table lookup" in SAS.

12. What versions of SAS have you used (on which platforms)?

13. What are some good SAS programming practices for processing very large data sets?

14. What are some problems you might encounter in processing missing values? In Data steps? Arithmetic?
Comparisons? Functions? Classifying data?

15. How would you create a data set with 1 observation and 30 variables from a data set with 30 observations
and 1 variable?

16. What is the difference between functions and PROCs that calculate the same simple descriptive statistics?

17. If you were told to create many records from one record, show how you would do this using arrays and
with PROC TRANSPOSE?

18. What are _numeric_ and _character_ and what do they do?

19. How would you create multiple observations from a single observation?

20. For what purpose would you use the RETAIN statement?

21. What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?

22. What is the order of application for output data set options, input data set options and SAS statements?

23. What is the order of evaluation of the arithmetic operators: + - * / ** ( ) ?

24. How could you generate test data with no input data?

25. How do you debug and test your SAS programs?

26. What can you learn from the SAS log when debugging?

27. What is the purpose of _error_?

28. How can you put a "trace" in your program?

29. Are you sensitive to code walk-throughs, peer review, or QC review?

30. Have you ever used the SAS Debugger?

31. What other SAS features do you use for error trapping and data validation?

32. How does SAS handle missing values in: assignment statements, functions, a merge, an update, sort order,
formats, PROCs?
33. How many missing values are available? When might you use them?

34. How do you test for missing values?

35. How are numeric and character missing values represented internally?
