(Data Pre-Processing & Visualization) : (Mid-Term Assignment)
SHIVAM MISHRA – 21PGDM074
Use the mtcars dataset (from R environment) to answer the
mtcars[1:20,]? [2 marks]
ANSWER – 1.
i.) mtcars[mtcars$cyl = 4, ]
ANSWER - To fix this error, use the comparison operator == instead of =. We can
either use the subset function, subset(mtcars, cyl == 4), or bracket indexing,
mtcars[mtcars$cyl == 4, ]. Either form can be used.
ANSWER – To fix this error we remove the - sign. Inside the brackets, the indices
are the positions of the rows and columns. A negative index drops a position, and
negative indices cannot be combined with positive ones in the same index. The
correct code will be mtcars[1:4]
iii.) mtcars[mtcars$cyl <= 5]
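ANSWER – The purpose of this code is to return all the rows with cyl <= 5. The
comma after the condition is missing, so R tries to use the condition to select
columns and fails. We can either add the comma, mtcars[mtcars$cyl <= 5, ], or
write a subset function, subset(mtcars, cyl <= 5), and run the code.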
iv) mtcars[mtcars$cyl == 4 | 6, ]
6,]. Thus, the output contains only the rows whose cyl column is 4 or 6.
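The corrected subsetting forms discussed in this answer can be checked with a short sketch along the following lines. It assumes only the built-in mtcars data; the variable names (four_cyl_a, low_cyl and so on) are illustrative and not part of the assignment.

```r
# mtcars is available in every R session through the datasets package.

# i) A comparison needs ==, not =.
four_cyl_a <- subset(mtcars, cyl == 4)
four_cyl_b <- mtcars[mtcars$cyl == 4, ]

# iii) The comma after the condition makes the logical vector select rows.
low_cyl <- mtcars[mtcars$cyl <= 5, ]

# iv) Each side of | must be a full comparison. A bare 6 is coerced to TRUE,
#     so mtcars$cyl == 4 | 6 would keep every row.
four_or_six <- mtcars[mtcars$cyl == 4 | mtcars$cyl == 6, ]
four_or_six_in <- mtcars[mtcars$cyl %in% c(4, 6), ]  # equivalent shorthand

print(nrow(four_cyl_a))
print(sort(unique(four_or_six$cyl)))
```

The %in% form is often easier to read once more than two values have to be matched.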
QUESTION – 3 Rename the cars in mtcars file that have names as
ANSWER - We run the following code so that the row names are changed from Merc
to Mercedes:
rownames(mtcars)
x <- rownames(mtcars)
x <- gsub("Merc", "Mercedes", x)
print(x)
rownames(mtcars) <- x
First, we save the row names in a variable x. Then we run the gsub function to
replace Merc with Mercedes in those names. Finally, we assign the changed names
back to the table as its row names.
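As a variation on the steps above, the sketch below works on a copy of mtcars (the name cars_renamed is an assumption made for illustration) and anchors the pattern with ^ so that only names beginning with "Merc " are changed. It uses sub, which replaces the first match in each name.

```r
cars_renamed <- mtcars                 # work on a copy; mtcars itself is unchanged
old_names <- rownames(cars_renamed)

# ^ anchors the match at the start of each name.
new_names <- sub("^Merc ", "Mercedes ", old_names)
rownames(cars_renamed) <- new_names

# Show only the names that were changed.
print(new_names[old_names != new_names])
```

Anchoring the pattern is a defensive choice: it avoids touching a name that merely contains the letters Merc somewhere in the middle.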
QUESTION – 4 Extract car records from mtcars with cylinder
greater than 4.00 and weighs less than mean weight. (File) [4 marks]
summary(mtcars)
subset(mtcars, cyl > 4 & wt < mean(wt))
Running the above code, we extract the records with wt less than the mean weight
and cylinder count greater than 4.
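Both conditions from the question can be combined in a single filter. The sketch below is one way to do it; the helper names mean_wt and light_multi_cyl are introduced here for illustration only.

```r
# Mean weight over all cars in mtcars.
mean_wt <- mean(mtcars$wt)

# Keep cars with more than 4 cylinders AND weight below the mean.
light_multi_cyl <- subset(mtcars, cyl > 4 & wt < mean_wt)

# The same filter written with bracket indexing.
light_multi_cyl_b <- mtcars[mtcars$cyl > 4 & mtcars$wt < mean_wt, ]

print(light_multi_cyl[, c("cyl", "wt")])
```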
Use the Building_Permits.csv file (from Google Classroom) to answer
variable/column to convert.
ii. What are the oldest and newest permit date records for the
buildings? [2 marks]
ANSWER - Import the data file in RStudio. Then convert the Permit Creation
Date column from character to date using the code: Building_Permits$`Permit
%d/%Y")
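The conversion and the oldest/newest lookup can be sketched as follows. The format string "%m/%d/%Y" is an assumption about how the dates are stored, and the small permits data frame with made-up dates is a hypothetical stand-in for the imported Building_Permits.csv.

```r
# Hypothetical stand-in for the imported file; the dates are invented.
permits <- data.frame(
  `Permit Creation Date` = c("03/15/2012", "06/01/2014", "09/30/2016"),
  check.names = FALSE,        # keep the spaces in the column name
  stringsAsFactors = FALSE
)

# Convert from character to Date (assumed month/day/year layout).
permits$`Permit Creation Date` <- as.Date(permits$`Permit Creation Date`,
                                          format = "%m/%d/%Y")

# Oldest and newest permit dates.
oldest <- min(permits$`Permit Creation Date`, na.rm = TRUE)
newest <- max(permits$`Permit Creation Date`, na.rm = TRUE)

# Records with a permit date before 1 January 2013.
before_2013 <- permits[permits$`Permit Creation Date` < as.Date("2013-01-01"), ,
                       drop = FALSE]

print(oldest)
print(newest)
print(nrow(before_2013))
```

If the format string does not match the stored text, as.Date returns NA for those rows, so it is worth checking for NA values after converting.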
i) - The building records with permit date before 1 January 2013 can be
28”.
02-23”.
QUESTION – 6 Convert the Block column in the original dataset from
variable/column to convert.
ii. What are the oldest and newest permit date records for the
[2 marks]
variable/column to convert.
iii. Now extract the building records with permit date after 1
[4 marks]
ANSWER - Convert the Block column to numeric using the code:
Building_Permits$Block <- as.numeric(Building_Permits$Block)
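A minimal sketch of this conversion is shown below on a hypothetical vector of block values (the values are invented, not taken from the file). It also illustrates that as.numeric turns any value that is not purely numeric into NA and raises a warning, which is worth checking after converting.

```r
# Hypothetical block values stored as text; "123A" is not purely numeric.
block_chr <- c("0326", "0077", "123A")

# as.numeric drops leading zeros and yields NA (with a warning) for "123A".
block_num <- suppressWarnings(as.numeric(block_chr))

print(block_num)
print(sum(is.na(block_num)))   # how many values could not be converted
```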
ANSWER – i) To get the building records with permit date before 1 January
2015, we use:
the dates in accordance with the block 326. Hence, first we make a data
Then we use this data table in the min and max functions to get the dates.
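One way to carry out the two steps described above (first a table for block 326, then min and max on its dates) is sketched here. The permits data frame and its dates are hypothetical, and the column names Block and Permit Creation Date are assumed to match the file.

```r
# Hypothetical data; Block is already numeric and the dates are already Date.
permits <- data.frame(
  Block = c(326, 326, 410, 326),
  `Permit Creation Date` = as.Date(c("2013-05-10", "2016-08-21",
                                     "2014-01-02", "2011-12-30")),
  check.names = FALSE
)

# Step 1: keep only the records for block 326.
block_326 <- permits[permits$Block == 326, ]

# Step 2: oldest and newest permit dates within that block.
oldest_326 <- min(block_326$`Permit Creation Date`, na.rm = TRUE)
newest_326 <- max(block_326$`Permit Creation Date`, na.rm = TRUE)

print(oldest_326)
print(newest_326)
```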
To extract the building records with Permit Creation Date after 1 January
2015 and Completed Date after 1 January 2018, we use the code:
you can see there are 6 columns with the same 3.130 rows.
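A filter on both dates might look like the sketch below. It runs on a small hypothetical data frame with invented dates, assumes both columns have already been converted to Date, and uses which() so that records with a missing Completed Date are dropped rather than returned as NA rows.

```r
# Hypothetical data; the last record has no completion date.
permits <- data.frame(
  `Permit Creation Date` = as.Date(c("2014-06-01", "2015-03-15",
                                     "2016-09-30", "2017-02-11")),
  `Completed Date` = as.Date(c("2018-05-01", "2017-12-31",
                               "2018-07-04", NA)),
  check.names = FALSE
)

created_cutoff <- as.Date("2015-01-01")
completed_cutoff <- as.Date("2018-01-01")

# TRUE only when both dates fall after their cutoffs; NA when a date is missing.
keep <- permits$`Permit Creation Date` > created_cutoff &
  permits$`Completed Date` > completed_cutoff

# which() keeps the TRUE positions and drops the NA ones.
recent <- permits[which(keep), ]

print(dim(recent))   # rows and columns of the extracted table
```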