SAS
A statement can start and end anywhere on a line and may span several lines. A semicolon marks the
end of the statement.
The DATA statement marks the creation of a new SAS data set. The rules for naming the data set are as
follows.
A single word after the DATA keyword names a temporary data set, which is erased at the end of the
session.
Prefixing the data set name with a library name makes it a permanent data set, which persists after
the session is over.
DATA TempData;
DATA abc;
DATA newdat;
DATA LIBRARY1.DATA1;
DATA MYLIB.newdat;
----
*.sas − The SAS code file, which can be edited using the SAS Editor or any text
editor.
*.log − The SAS log file, which contains information such as errors, warnings, and
data set details for a submitted SAS program.
*.sas7bdat − The SAS data file, which contains a SAS data set including variable
names, labels, and the results of calculations.
----
* This is a comment ;
/*message*/
1. Numeric Variables ::
INPUT ID SALARY COMM_PERCENT;
2. Character Variables
INPUT VAR1 $ VAR2 $ VAR3 $;
3. Date Variables
String Variables ::
data string_examples;
LENGTH string1 $ 6 String2 $ 5;
/*String variables of length 6 and 5 */
String1 = 'Hello';
String2 = 'World';
Joined_strings = String1 ||String2 ;
run;
proc print data = string_examples noobs;
run;
Example
The below code shows how the three types of variables are declared and used in a SAS Program
DATA TEMP;
DATALINES;
RUN;
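As a minimal sketch of the three variable types in one DATA step, the following may help. The data set name TEMP_SKETCH, the variable names (ID, NAME, SALARY, DOJ), and the two data rows are all made up for illustration and are not from the original example.

```sas
DATA TEMP_SKETCH;
   /* ID and SALARY are numeric, NAME is character ($),          */
   /* DOJ is a date read with the DATE9. informat (the colon     */
   /* lets it be read as a blank-delimited value).               */
   INPUT ID NAME $ SALARY DOJ : DATE9.;
   FORMAT DOJ DATE9.;   /* print the stored date in readable form */
DATALINES;
1 Rick 623.3 02APR2001
2 Dan 515.2 11JUL2012
;
RUN;

PROC PRINT DATA = TEMP_SKETCH;
RUN;
```

Without the FORMAT statement, DOJ would print as a plain number, because SAS stores dates as a count of days.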
----
SUBSTRN('stringval',p1,p2)
TRIMN('stringval')
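A short sketch of how these two functions could be used, assuming a hypothetical data set name string_functions and a made-up padded variable:

```sas
DATA string_functions;
   LENGTH padded $ 12;
   padded = 'stringval';               /* stored with trailing blanks up to length 12 */
   sub_str = SUBSTRN(padded, 1, 6);    /* 6 characters from position 1: 'string' */
   joined = TRIMN(padded) || '!';      /* trailing blanks removed first: 'stringval!' */
RUN;

PROC PRINT DATA = string_functions NOOBS;
RUN;
```

In SUBSTRN the second argument is the start position and the third is the number of characters to extract. The effect of TRIMN is easiest to see inside a concatenation, since a value assigned to a fixed-length variable is padded with blanks again.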
Array ::
ARRAY COUNTRIES(0:8) A B C D E F G H I;
* Declare an array of length 5 named QUESTS which contains character values
  (the element names Q1-Q5 are illustrative) ;
ARRAY QUESTS(1:5) $ Q1-Q5;
In the variable example above, character variables are declared with a trailing $ sign and date
variables are declared with a trailing date format.
We can produce summary statistics for some of these variables using the Tasks options in SAS
Studio. Go to Tasks -> Statistics -> Summary Statistics and double-click it to open the task window.
DATA MYDATA1;
INPUT COL1 COL2;
Add_result = COL1+COL2;
Sub_result = COL1-COL2;
Mult_result = COL1*COL2;
Div_result = COL1/COL2;
Expo_result = COL1**COL2;
datalines;
11.21 5.3
3.11 11
;
RUN;
^= Not equal to
DATA MYDATA1;
SUM = 0;
/* Alternative loop headers: DO WHILE(VAR<6); or DO UNTIL(VAR>5); */
/* Both need VAR to be initialized and incremented by hand. */
DO VAR = 1 to 5;
SUM = SUM+VAR;
END;
RUN;
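The DO WHILE and DO UNTIL forms do not manage the counter themselves. A minimal sketch of both, assuming hypothetical data set names loop_while and loop_until:

```sas
DATA loop_while;
   SUM = 0;
   VAR = 1;                 /* counter must be initialized by hand */
   DO WHILE(VAR < 6);       /* condition is tested before each iteration */
      SUM = SUM + VAR;
      VAR = VAR + 1;        /* counter must be incremented by hand */
   END;
RUN;

DATA loop_until;
   SUM = 0;
   VAR = 1;
   DO UNTIL(VAR > 5);       /* condition is tested after each iteration */
      SUM = SUM + VAR;
      VAR = VAR + 1;
   END;
RUN;
```

Both steps add the values 1 through 5, so SUM ends at 15. The difference is that a DO UNTIL loop always runs its body at least once, while a DO WHILE loop may not run at all.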
DATA EMPDAT;
DATALINES;
DATA EMPDAT1;
SET EMPDAT;
/* Use one of the following conditional statements. */
IF SALARY > 650 THEN SALRANGE = "HIGH";
/* or */
IF SALARY < 600 THEN SALRANGE = "LOW";
/* or */
IF SALARY > 700 THEN DELETE;
run;
-----
max_val = MAX(v1,v2,v3,v4,v5);
rand_val = RANUNI(0);
SR_val= SQRT(sum(v1,v2,v3,v4,v5));
Read File ::
data TEMP;
infile '/folders/myfolders/sasuser.v94/TutorialsPoint/emp.csv' dlm=",";
run;
Export file ::
/* With no DATA= option, PROC EXPORT writes the most recently created data set. */
PROC EXPORT
outfile = '/folders/myfolders/sasuser.v94/TutorialsPoint/car_data.txt'
dbms = dlm;
run;
Concatenate dataset ::
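DATA ITDEPT;
INPUT empid name $ salary;
DATALINES;
1 Rick 623.3
3 Mike 611.5
6 Tusar 578.6
;
RUN;
DATA NON_ITDEPT;
INPUT empid name $ salary;
DATALINES;
2 Dan 515.2
4 Ryan 729.1
5 Gary 843.25
7 Pranab 632.8
8 Rasmi 722.5
;
RUN;
DATA All_Dept;
SET ITDEPT NON_ITDEPT;
RUN;
PROC PRINT DATA = All_Dept;
RUN;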
Merging ::
DATA SALARY;
INPUT empid name $ salary;
DATALINES;
1 Rick 623.3
2 Dan 515.2
3 Mike 611.5
4 Ryan 729.1
5 Gary 843.25
6 Tusar 578.6
7 Pranab 632.8
8 Rasmi 722.5
;
RUN;
DATA DEPT;
INPUT empid dept $;
DATALINES;
1 IT
2 OPS
3 IT
4 HR
5 FIN
6 IT
7 OPS
8 FIN
;
RUN;
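/* Merge all observations from both data sets by empid. */
DATA All_details;
MERGE SALARY DEPT;
BY empid;
RUN;
PROC PRINT DATA = All_details;
RUN;
/* Keep only the observations whose empid is present in both data sets. */
DATA All_details;
MERGE SALARY(IN = a) DEPT(IN = b);
BY empid;
IF a = 1 and b = 1;
RUN;
PROC PRINT DATA = All_details;
RUN;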
Sorting data ::
DATA Employee;
INPUT empid name $ salary dept $;
DATALINES;
1 Rick 623.3 IT
3 Mike 611.5 IT
4 Ryan 729.1 HR
6 Tusar 578.6 IT
;
RUN;
PROC SORT DATA = Employee;
BY salary;
RUN ;
PROC PRINT DATA = Employee;
RUN ;
retain Department;
input Type $ @;
input Department $;
else do;
output;
end;
run;
RUN;
PROC EXPORT
It is a built-in SAS procedure that exports SAS data sets by writing the data to files of
different formats.
DATA OnlyDept;
SET Employee;
DROP salary;
RUN;
PROC PRINT DATA = OnlyDept;
RUN;
SAS SQL :::
PROC SQL;
SELECT Columns
FROM TABLE
WHERE Columns
GROUP BY Columns;
QUIT;
DATA TEMP;
INPUT ID NAME $ SALARY DEPARTMENT $;
DATALINES;
1 Rick 623.3 IT
3 Michelle 611 IT
4 Ryan 729 HR
6 Nina 578 IT
;
RUN;
PROC SQL;
CREATE TABLE EMPLOYEES AS
SELECT * FROM TEMP;
QUIT;
PROC PRINT data = EMPLOYEES;
RUN;
PROC SQL;
SELECT make,model,type,invoice,horsepower
FROM
SASHELP.CARS;
QUIT;
DATA TEMP;
INPUT ID NAME $ SALARY DEPARTMENT $;
DATALINES;
1 Rick 623.3 IT
3 Michelle 611 IT
4 Ryan 729 HR
6 Nina 578 IT
;
RUN;
PROC SQL;
SELECT ID as EMPID,
Name as EMPNAME ,
SALARY as SALARY,
DEPARTMENT as DEPT,
SALARY*0.23 as COMMISION
FROM TEMP;
QUIT;
PROC SQL;
UPDATE EMPLOYEES2
QUIT;
RUN;
The delete operation in SQL involves removing certain values from the table using the SQL DELETE
statement. We continue to use the data from the above example and delete the rows from the table
in which the salary of the employees is greater than 900.
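PROC SQL;
/* Table name assumed to be EMPLOYEES2, as in the UPDATE example above. */
DELETE FROM EMPLOYEES2
WHERE SALARY > 900;
QUIT;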
For mean ::
RUN;
For class
var horsepower;
RUN;
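As a hedged sketch of how these fragments might fit together, the following uses PROC MEANS on SASHELP.CARS to compute the mean of horsepower, first overall and then per group. The choice of make as the CLASS variable and the MAXDEC= setting are assumptions, not taken from the original code.

```sas
/* Overall mean of horsepower */
PROC MEANS DATA = SASHELP.CARS MEAN MAXDEC = 2;
   VAR horsepower;
RUN;

/* Mean of horsepower for each value of make (assumed grouping variable) */
PROC MEANS DATA = SASHELP.CARS MEAN MAXDEC = 2;
   CLASS make;
   VAR horsepower;
RUN;
```

The CLASS statement splits the analysis into one row per group, while VAR restricts it to the listed numeric variables.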
For standard deviation ::
PROC SQL;
FROM
SASHELP.CARS
RUN;
run;
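One way the standard deviation example could look, sketched under assumptions: the selected columns and the table name CARS_SUBSET are hypothetical, chosen only to illustrate a PROC SQL subset followed by PROC MEANS with the STD keyword.

```sas
/* Build a working table from SASHELP.CARS (column choice is an assumption) */
PROC SQL;
   CREATE TABLE CARS_SUBSET AS
   SELECT make, type, invoice, horsepower
   FROM SASHELP.CARS;
QUIT;

/* Standard deviation of the numeric columns */
PROC MEANS DATA = CARS_SUBSET STD MAXDEC = 2;
   VAR invoice horsepower;
RUN;
```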
Regression ::
PROC SQL;
FROM
SASHELP.CARS
RUN;
run;
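A possible shape for the regression example, as a sketch only: it assumes PROC REG is the intended procedure, and the table name CARS_REG and the model (horsepower explained by weight) are illustrative choices rather than the original ones.

```sas
/* Build a working table from SASHELP.CARS (column choice is an assumption) */
PROC SQL;
   CREATE TABLE CARS_REG AS
   SELECT invoice, horsepower, weight
   FROM SASHELP.CARS;
QUIT;

/* Simple linear regression: horsepower as a function of weight */
PROC REG DATA = CARS_REG;
   MODEL horsepower = weight;
RUN;
QUIT;
```

PROC REG is an interactive procedure, so the trailing QUIT statement is what ends it.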
Target: identify the value to be changed.
Define macro variable: %let firnum = 773;
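A minimal sketch of defining the macro variable and then referencing it with an ampersand. The data set name macro_sketch and the variable target are hypothetical.

```sas
%LET firnum = 773;          /* define the macro variable */

DATA macro_sketch;          /* hypothetical data set name */
   target = &firnum;        /* resolves to: target = 773; */
RUN;

%PUT firnum resolves to &firnum;   /* writes the resolved value to the log */
```

Changing the value in the %LET statement changes every place that references &firnum, which is the point of identifying the target value first.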