Top 100 SAS Interview Questions and Answers For 2022
listendata.com/2013/09/sas-interview-questions.html
This article covers the most frequently asked SAS interview questions to help you
approach a SAS interview with confidence. It spans basic, intermediate and advanced
SAS concepts: reading data into SAS, data manipulation, reporting, SQL queries and
SAS macros. The questions range from simple theoretical concepts to tricky problems
that are commonly put to both freshers and experienced SAS programmers.
Note: The variable name, followed by $ (dollar sign), identifies the variable type as
character. In the example shown above, ID and SEX are numeric variables and Name is a
character variable.
1
22
333
4444
This DATA step uses the numeric informat 4. to read a single field in each record of raw
data and to assign values to the variable ID.
data readin;
infile 'external-file' missover;
input ID 4.;
run;
proc print data=readin;
run;
Obs ID
1 .
2 .
3 .
4 4444
Truncover
data readin;
infile 'external-file' truncover;
input ID 4.;
run;
proc print data=readin;
run;
Obs ID
1 1
2 22
3 333
4 4444
DATA Readin;
Input Name $ Score @@;
cards;
Sam 25 David 30 Ram 35
Deeps 20 Daniel 47 Pars 84
;
RUN;
The DROP statement specifies the names of the variables that you want to remove from
the data set.
data readin1;
set readin;
drop score;
run;
The KEEP statement specifies the names of the variables that you want to retain from the
data set.
data readin1;
set readin;
keep name;
run;
The main difference between the DROP/KEEP statements and the DROP=/KEEP= data set
options is that you cannot use the DROP/KEEP statements in procedures.
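As a minimal sketch of that difference, the KEEP= and DROP= data set options can be attached to a data set wherever it is named, including inside a procedure. The example assumes the SASHELP.CLASS sample data set is available; the output data set name SUBSET is hypothetical.

```sas
/* KEEP= as a data set option inside a procedure */
proc print data=sashelp.class(keep=name age);
run;

/* DROP= as a data set option on the SET statement:
   HEIGHT and WEIGHT are never read into the DATA step */
data subset;
  set sashelp.class(drop=height weight);
run;
```

Applied on the input data set, the options also limit which variables are read, which a DROP or KEEP statement inside the DATA step does not do.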
8. Name and describe functions that you have used for data
cleaning?
The MEAN function is an average of the value of several variables in one observation.
The average that is calculated using PROC MEANS is the sum of all of the values of a
variable divided by the number of observations in the variable.
In other words, the MEAN function works across a row, whereas PROC MEANS works
down a column.
MEAN Function
PROC MEANS
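The contrast can be sketched with a small made-up data set. SCORES and its variables are hypothetical names used only for illustration.

```sas
data scores;
  input id test1 test2 test3;
  /* MEAN function: average across the row, ignoring missing values */
  row_mean = mean(test1, test2, test3);
  cards;
1 80 90 .
2 70 60 50
;
run;

/* PROC MEANS: average of TEST1 down the column, over all observations */
proc means data=scores mean;
  var test1;
run;
```

For the first observation ROW_MEAN is the average of the two non-missing scores, while PROC MEANS reports a single mean of TEST1 across both observations.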
13. What is the difference between '+' operator and SUM function?
SUM function returns the sum of non-missing arguments whereas “+” operator returns a
missing value if any of the arguments are missing.
Suppose we have a data set containing three variables - X, Y and Z. They all have
missing values. We wish to compute sum of all the variables.
data mydata2;
set mydata;
a=sum(x,y,z);
p=x+y+z;
run;
In the output, the value of p is missing for the 4th, 5th and 6th observations.
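A self-contained sketch of the same comparison follows. The three rows of MYDATA here are made up for illustration and are not the data behind the output described above.

```sas
data mydata;
  input x y z;
  cards;
1 2 3
4 . 6
. . .
;
run;

data mydata2;
  set mydata;
  a = sum(x, y, z);   /* ignores missing arguments */
  p = x + y + z;      /* missing if any operand is missing */
run;

proc print data=mydata2;
run;
```

In the second row, a is 10 while p is missing. In the third row both are missing, because SUM returns a missing value only when every argument is missing.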
ID Name Score
1 David 45
1 David 74
2 Sam 45
2 Ram 54
3 Bane 87
3 Mary 92
3 Bane 87
4 Dane 23
5 Jenny 87
5 Ken 87
6 Simran 63
8 Priya 72
data readin;
input ID Name $ Score;
cards;
1 David 45
1 David 74
2 Sam 45
2 Ram 54
3 Bane 87
3 Mary 92
3 Bane 87
4 Dane 23
5 Jenny 87
5 Ken 87
6 Simran 63
8 Priya 72
;
run;
There are several ways to identify and remove unique and duplicate values:
PROC SORT
In PROC SORT, there are two options by which we can remove duplicates.
The NODUPKEY option removes observations that repeat a value of the variables listed
in the BY statement, while the NODUP option removes observations in which the values
of all the variables are repeated (identical observations).
When sorting by ID, NODUPKEY deletes 5 observations with duplicate values whereas
NODUP does not delete any observations. NODUP compares each observation only with the
previous one, and after sorting by ID the two identical "3 Bane 87" records are not
adjacent.
To fix this issue, sort on all the variables in the data set READIN.
To sort by all the variables without having to list them all in the program, you can use the
keyword _ALL_ in the BY statement.
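A minimal sketch of both options, assuming the READIN data set created above. The output data set names are hypothetical.

```sas
/* NODUPKEY: keep only the first observation for each value of ID */
proc sort data=readin out=nodupkey_out nodupkey;
  by ID;
run;

/* NODUPRECS (alias NODUP): sorting by _ALL_ places identical records
   next to each other, so the repeated record is removed */
proc sort data=readin out=nodup_out noduprecs;
  by _all_;
run;
```

Writing to a separate OUT= data set leaves READIN unchanged, which makes it easier to compare the results of the two options.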
16. What are _numeric_ and _character_ and what do they do?
1. _NUMERIC_ specifies all numeric variables that are currently defined in the
DATA step.
2. _CHARACTER_ specifies all character variables that are currently defined in the
DATA step.
3. _ALL_ specifies all variables that are currently defined in the DATA step.
proc means;
var _numeric_;
run;
Example :
SELECT (str);
WHEN ('Sun') wage=wage*1.5;
WHEN ('Sat') wage=wage*1.3;
OTHERWISE DO;
wage=wage+1;
bonus=0;
END;
END;
data _null_ ;
phone='(312) 555-1212' ;
area_cd=substr(phone, 2, 3) ;
put area_cd=;
run;
MERGE
data readin;
merge file1(in=infile1) file2(in=infile2);
by id;
if infile1=infile2;
run;
data readin;
merge file1(in=infile1) file2(in=infile2);
by id;
if infile1 ne infile2;
run;
data readin;
do i=1 to 100;
temp=0 + rannor(1) * 1;
output;
end;
run;
proc means data=readin mean stddev;
var temp;
run;
proc format;
value score 0 - 100='100-'
101 - 200='101+'
other='others'
;
run;
proc freq data=readin;
tables outdata;
format outdata score.;
run;
data readin;
set outdata;
array Q(20) Q1-Q20;
do i=1 to 20;
if Q(i)=6 then Q(i)=.;
end;
run;
Note: DIM returns the number of elements in the array Q.
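As a sketch of how DIM fits into the loop above, the upper bound can be taken from the array itself rather than hard-coded. This assumes, as in the step above, that OUTDATA contains the variables Q1 to Q20.

```sas
data readin;
  set outdata;
  array Q(*) Q1-Q20;
  /* DIM(Q) returns 20 here, so the loop follows the array definition */
  do i = 1 to dim(Q);
    if Q(i) = 6 then Q(i) = .;
  end;
  drop i;
run;
```

If variables are later added to or removed from the array, the loop bound adjusts without further edits.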
proc format;
value $missfmt ' '='Missing' other='Not Missing';
value missfmt .='Missing' other='Not Missing';
run;
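One way these formats might be used is to count missing and non-missing values for every variable with PROC FREQ. This is a sketch that assumes the two formats above have been created and that the SASHELP.HEART sample data set is available.

```sas
proc freq data=sashelp.heart;
  /* Group every value into 'Missing' or 'Not Missing' */
  format _character_ $missfmt. _numeric_ missfmt.;
  /* MISSING makes missing values appear as a category in the tables */
  tables _all_ / missing nocum nopercent;
run;
```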
39. Describe the ways in which you can create macro variables
There are 5 ways to create macro variables:
1. %LET statement
2. Iterative %DO statement
3. CALL SYMPUT
4. PROC SQL INTO clause
5. Macro parameters
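The five methods listed above can be sketched as follows. All macro and macro variable names are hypothetical, and the example assumes the SASHELP.CLASS sample data set and SAS 9.3 or later (for TRIMMED and the &= syntax in %PUT).

```sas
/* 1. %LET */
%let cutoff = 50;

/* 2 and 5. The iterative %DO creates the index macro variable I;
   the macro parameter N is a macro variable local to the macro */
%macro show(n);
  %do i = 1 %to &n;
    %put Iteration &i of &n;
  %end;
%mend show;
%show(3)

/* 3. CALL SYMPUT assigns a DATA step value to a macro variable */
data _null_;
  call symput('today_txt', put(today(), date9.));
run;

/* 4. PROC SQL INTO clause */
proc sql noprint;
  select count(*) into :n_rows trimmed
  from sashelp.class;
quit;

%put &=cutoff &=today_txt &=n_rows;
```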
43. How to count the number of intervals between two given SAS
dates?
INTCK(interval,start-of-period,end-of-period) is an interval function that counts the
number of intervals between two given SAS date, time or datetime values.
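A short sketch of INTCK with two made-up dates follows. By default the function counts interval boundaries crossed, not complete intervals elapsed.

```sas
data _null_;
  start  = '15JAN2022'd;
  finish = '01MAR2022'd;
  /* Two month boundaries are crossed (1 Feb and 1 Mar), so months=2 */
  months = intck('month', start, finish);
  /* Number of day boundaries between the two dates: days=45 */
  days   = intck('day', start, finish);
  put months= days=;
run;
```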
Data strings;
Text1="MICKEY MOUSE & DONALD DUCK";
Text=scan(text1,2,'&');
Run;
46. For what purpose would you use the RETAIN statement?
A RETAIN statement tells SAS not to set variables to missing when going from the
current iteration of the DATA step to the next. Instead, SAS retains the values.
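A common use is a running total. The sketch below assumes the SASHELP.CLASS sample data set; the names RUNNING and TOTAL_WEIGHT are hypothetical.

```sas
data running;
  set sashelp.class;
  /* TOTAL_WEIGHT starts at 0 and keeps its value across iterations */
  retain total_weight 0;
  total_weight = total_weight + weight;
run;
```

Without the RETAIN statement, TOTAL_WEIGHT would be reset to missing at the start of each iteration and the addition would return a missing value.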
PROC SQL;
SELECT WEIGHT,
CASE
WHEN WEIGHT BETWEEN 0 AND 50 THEN 'LOW'
WHEN WEIGHT BETWEEN 51 AND 70 THEN 'MEDIUM'
WHEN WEIGHT BETWEEN 71 AND 100 THEN 'HIGH'
ELSE 'VERY HIGH'
END AS NEWWEIGHT FROM HEALTH;
QUIT;
56. How Data Step Merge and PROC SQL handle many-to-many
relationship?
The DATA step MERGE does not create a Cartesian product in the case of a many-to-many
relationship, whereas PROC SQL does.
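The difference can be sketched with two small made-up data sets that each contain the same ID twice. All data set and variable names are hypothetical.

```sas
data a;
  input id x $;
  cards;
1 a1
1 a2
;
run;

data b;
  input id y $;
  cards;
1 b1
1 b2
;
run;

/* DATA step MERGE pairs rows in order within the BY group: 2 observations */
data merged;
  merge a b;
  by id;
run;

/* PROC SQL matches every row of A with every row of B for the ID: 4 rows */
proc sql;
  create table joined as
  select a.id, a.x, b.y
  from a inner join b
  on a.id = b.id;
quit;
```

The DATA step also writes a note to the log that more than one data set has repeats of BY values, which is a useful warning sign of an unintended many-to-many merge.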
If you need to connect directly to a database and pull tables from there, then use PROC
SQL.