bda file
27514
CSE-A
Index
Sr.No.  Title  Remarks
1   Perform basic Mathematical Operations using R.
2   Write an R script to find basic descriptive statistics using summary, str, quantile functions on mtcars & cars datasets.
3   Write an R script to find subset of dataset by using subset(), aggregate() functions on iris dataset.
4   Reading different types of data sets (.txt, .csv) from web and disk and writing in file in specific disk location.
5   Reading Excel data sheet in R.
6   Reading XML dataset in R.
7   Find the data distributions using box and scatter plot.
8   Find the outliers using the previous plot.
9   Plot a histogram using the given sample data.
10  Plot a bar chart using the given sample data.
11  Plot the bar chart using the given sample data.
12  Find a Correlation matrix and plot the correlation on iris data set.
13  Plot the correlation plot on the dataset and visualize, giving an overview of relationships among data on the iris data.
14  Analysis of covariance for the iris dataset with categorical variables.
15  Plot the given cluster data using R visualizations.
1. Perform basic Mathematical
Operations using R.
> A = 1563123
> B = 65132334
> C = A+B
> C
[1] 66695457
> D = B-A
> D
[1] 63569211
> E = A*B
> E
[1] 1.018098e+14
> F = A/B
> F
[1] 0.02399919
> class(A)
[1] "numeric"
> class(B)
[1] "numeric"
> class(C)
[1] "numeric"
> class(D)
[1] "numeric"
> class(E)
[1] "numeric"
> class(F)
[1] "numeric"
> E<F
[1] FALSE
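The session above covers addition, subtraction, multiplication and division. As a minimal sketch of the remaining arithmetic operators, the script below reuses A and B; the names quotient, remainder and power are illustrative and do not appear in the original record.

```r
# Same operands as in the session above.
A <- 1563123
B <- 65132334

quotient  <- B %/% A   # integer division
remainder <- B %% A    # modulo (remainder after integer division)
power     <- A^2       # exponentiation

# Integer division and modulo together reconstruct the dividend.
print(quotient * A + remainder == B)

# Whole numbers typed without an L suffix are stored as doubles ("numeric").
print(class(A))
print(class(5L))
```

Note that F is also R's built-in shorthand for FALSE, so assigning to F as the session does masks that shorthand for the rest of the session; a name such as ratio would avoid this.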
Screenshots:
2. Write an R script to find basic
descriptive statistics using summary, str,
quantile function on mtcars & cars datasets.
> mtcars
> summary(mtcars)
drat wt qsec vs
am gear carb
> str(mtcars)
$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
$ vs : num 0 0 1 1 0 1 0 1 1 1 ...
$ am : num 1 1 1 0 0 0 0 0 0 0 ...
> quantile(mtcars$mpg)
> cars
speed dist
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10
7 10 18
8 10 26
9 10 34
10 11 17
11 11 28
12 12 14
13 12 20
14 12 24
15 12 28
16 13 26
17 13 34
18 13 34
19 13 46
20 14 26
21 14 36
22 14 60
23 14 80
24 15 20
25 15 26
26 15 54
27 16 32
28 16 40
29 17 32
30 17 40
31 17 50
32 18 42
33 18 56
34 18 76
35 18 84
36 19 36
37 19 46
38 19 68
39 20 32
40 20 48
41 20 52
42 20 56
43 20 64
44 22 66
45 23 54
46 24 70
47 24 92
48 24 93
49 24 120
50 25 85
> summary(cars)
speed dist
> class(cars)
[1] "data.frame"
> dim(cars)
[1] 50 2
> str(cars)
> quantile(cars$speed)
  0%  25%  50%  75% 100%
   4   12   15   19   25
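The calls in this exercise can be collected into one short script. This is a sketch rather than the original script: the variable names speed_q and mpg_deciles and the choice of probs are assumptions made for illustration.

```r
# mtcars and cars ship with base R (package "datasets"), so no import is needed.
print(summary(mtcars$mpg))   # min, quartiles, mean and max for one column
str(cars)                    # structure: 50 observations of 2 variables

# quantile() returns the 0%, 25%, 50%, 75% and 100% points by default.
speed_q <- quantile(cars$speed)
print(speed_q)

# Other cut points can be requested through the probs argument.
mpg_deciles <- quantile(mtcars$mpg, probs = c(0.1, 0.5, 0.9))
print(mpg_deciles)

# IQR() is the distance between the 75% and 25% quantiles.
print(IQR(cars$speed))
```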
Screenshots:
3. Write an R script to find subset of
dataset by using subset ( ), aggregate ( )
functions on iris dataset.
> iris
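The record shows only the iris printout for this exercise, so the following is a hedged sketch of how subset() and aggregate() could be applied to it. The filter condition, the selected columns and the names long_setosa, species_means and max_petal are assumptions chosen for illustration.

```r
# Rows: setosa flowers with sepals longer than 5 cm; columns: two of the five.
long_setosa <- subset(iris,
                      Species == "setosa" & Sepal.Length > 5,
                      select = c(Sepal.Length, Species))
print(head(long_setosa))

# Mean of every numeric column, grouped by Species.
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)

# Same idea for a single measurement, with a different summary function.
max_petal <- aggregate(Petal.Length ~ Species, data = iris, FUN = max)
print(max_petal)
```

subset() filters rows and columns without changing the values, while aggregate() collapses each group to one summary row, so the second and third results have exactly one row per species.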
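4. Reading different types of data sets
(.txt, .csv) from web and disk and writing
in file in specific disk Location.
> data <- read.csv("input.csv")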
> data
> print(is.data.frame(data))
[1] TRUE
> print(ncol(data))
[1] 5
> print(nrow(data))
[1] 8
> sal
[1] 843.25
> # Getting the details of the person with the max salary.
> details
> # Getting the details of all the employees working in the IT department.
> it_details
> join_details
> newdata
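The commands that produced sal, details and it_details did not survive in the transcript. The sketch below shows one way to obtain them, assuming input.csv holds the same eight employee records as the XML listing in this file (the column count and row count match). The temporary paths are stand-ins for a real disk location.

```r
# Records copied from the employee listing in this file.
employees <- data.frame(
  id = 1:8,
  name = c("Rick", "Dan", "Michelle", "Ryan", "Gary", "Nina", "Simon", "Guru"),
  salary = c(623.3, 515.2, 611, 729, 843.25, 578, 632.8, 722.5),
  start_date = c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                 "2015-03-27", "2013-05-21", "2013-07-30", "2014-06-17"),
  dept = c("IT", "Operations", "IT", "HR", "Finance", "IT", "Operations", "Finance")
)

# A temporary path stands in for the real location of input.csv.
csv_path <- file.path(tempdir(), "input.csv")
write.csv(employees, csv_path, row.names = FALSE)

data <- read.csv(csv_path)

sal <- max(data$salary)                          # highest salary
details <- subset(data, salary == max(salary))   # person with the max salary
it_details <- subset(data, dept == "IT")         # everyone in the IT department

# Write the filtered rows to a specific location on disk and read them back.
out_path <- file.path(tempdir(), "output.csv")
write.csv(it_details, out_path, row.names = FALSE)
print(read.csv(out_path))
```

read.csv() also accepts a URL in place of a file path, and read.table() reads delimited .txt files in the same way.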
5. Reading Excel data sheet in R.
--- Please select a CRAN mirror for use in this session ---
downloaded 1.2 MB
downloaded 9.0 MB
downloaded 366 KB
C:\Users\Avi\AppData\Local\Temp\RtmpyWUuTJ\downloaded_packages
> library("xlsx")
> getwd()
[1] "C:/Users/Avi/Documents"
> data
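The call that filled data is missing from the transcript. A minimal sketch of reading a worksheet with the xlsx package follows; the file name input.xlsx, the sheet position and the two sample rows are assumptions, and the block does nothing when the package (which needs Java) is not installed.

```r
# Runs only when the xlsx package is available.
excel_data <- NULL
if (requireNamespace("xlsx", quietly = TRUE)) {
  sample_df <- data.frame(id = 1:2, name = c("Rick", "Dan"))
  xlsx_path <- file.path(tempdir(), "input.xlsx")   # hypothetical file name

  # Create a small workbook so the example is self-contained.
  xlsx::write.xlsx(sample_df, xlsx_path, sheetName = "Sheet1", row.names = FALSE)

  # sheetIndex = 1 reads the first worksheet into a data frame.
  excel_data <- xlsx::read.xlsx(xlsx_path, sheetIndex = 1)
  print(excel_data)
}
```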
6. Reading XML dataset in R.
> library("XML")
> library("methods")
> result<-xmlParse(file="input.xml")
> result
<?xml version="1.0"?>
<?xml-stylesheet href="sheet.css"?>
<dataset>
<data>
<id>"1"</id>
<name>"Rick"</name>
<salary>"623.3"</salary>
<start_date>"2012-01-01"</start_date>
<dept>"IT"</dept>
</data>
<data>
<id>"2"</id>
<name>"Dan"</name>
<salary>"515.2"</salary>
<start_date>"2013-09-23"</start_date>
<dept>"Operations"</dept>
</data>
<data>
<id>"3"</id>
<name>"Michelle"</name>
<salary>"611"</salary>
<start_date>"2014-11-15"</start_date>
<dept>"IT"</dept>
</data>
<data>
<id>"4"</id>
<name>"Ryan"</name>
<salary>"729"</salary>
<start_date>"2014-05-11"</start_date>
<dept>"HR"</dept>
</data>
<data>
<id>"5"</id>
<name>"Gary"</name>
<salary>"843.25"</salary>
<start_date>"2015-03-27"</start_date>
<dept>"Finance"</dept>
</data>
<data>
<id>"6"</id>
<name>"Nina"</name>
<salary>"578"</salary>
<start_date>"2013-05-21"</start_date>
<dept>"IT"</dept>
</data>
<data>
<id>"7"</id>
<name>"Simon"</name>
<salary>"632.8"</salary>
<start_date>"2013-07-30"</start_date>
<dept>"Operations"</dept>
</data>
<data>
<id>"8"</id>
<name>"Guru"</name>
<salary>"722.5"</salary>
<start_date>"2014-06-17"</start_date>
<dept>"Finance"</dept>
</data>
</dataset>
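Once the document is parsed, it is usually more convenient to work with it as a data frame. The sketch below assumes the XML package is installed and uses a two-record sample in the same shape as the listing above; the sample omits the literal quote characters that surround each value in the original file, and the path is a temporary stand-in for input.xml.

```r
xml_data <- NULL
if (requireNamespace("XML", quietly = TRUE)) {
  # Two records in the same layout as the dataset above.
  xml_text <- '<dataset>
  <data><id>1</id><name>Rick</name><salary>623.3</salary><dept>IT</dept></data>
  <data><id>2</id><name>Dan</name><salary>515.2</salary><dept>Operations</dept></data>
</dataset>'
  xml_path <- file.path(tempdir(), "input.xml")
  writeLines(xml_text, xml_path)

  result <- XML::xmlParse(file = xml_path)
  root <- XML::xmlRoot(result)
  print(XML::xmlSize(root))   # number of <data> records under the root

  # One row per <data> node, one column per child element.
  xml_data <- XML::xmlToDataFrame(result, stringsAsFactors = FALSE)
  xml_data$salary <- as.numeric(xml_data$salary)   # text to number
  print(xml_data)
}
```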
Screenshots:
7. Find the data distributions using box
and scatter plot.
> library(ggplot2)
> input
mpg cyl
Valiant 18.1 6
> boxplot(v)
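The plotting calls for this exercise are mostly lost, so the following is a sketch of a box plot and a scatter plot on mtcars, followed by the outlier check that exercise 8 asks for. The axis labels, the choice of wt for the scatter plot, and the vector v_with_outlier (the sample data with a made-up value of 150 appended) are assumptions for illustration.

```r
pdf(NULL)  # draw to a null device so the script also runs without a display

# Box plot: distribution of mpg within each cylinder count.
bp <- boxplot(mpg ~ cyl, data = mtcars,
              xlab = "Number of cylinders", ylab = "Miles per gallon",
              main = "Mileage data")
print(bp$n)   # number of observations behind each box

# Scatter plot: relationship between two numeric columns.
plot(x = mtcars$wt, y = mtcars$mpg,
     xlab = "Weight", ylab = "Mileage", main = "Weight vs mileage")

# Outliers: points more than 1.5 * IQR beyond the hinges,
# the same rule boxplot() uses to draw individual points.
v_with_outlier <- c(9, 13, 21, 8, 36, 22, 12, 41, 31, 33, 19, 150)  # 150 is made up
outliers <- boxplot.stats(v_with_outlier)$out
print(outliers)

invisible(dev.off())
```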
Screenshots:
9. Plot a histogram using the given
sample data.
Histogram:
> library(graphics)
> v <- c(9, 13, 21, 8, 36, 22, 12, 41, 31, 33, 19)
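The hist() call itself is missing from the record. A minimal sketch using the sample vector above follows; the labels, colours and the breaks value are assumptions.

```r
pdf(NULL)  # null device, so no display is needed

v <- c(9, 13, 21, 8, 36, 22, 12, 41, 31, 33, 19)

# hist() draws the plot and returns the bin edges and counts invisibly.
h <- hist(v, xlab = "Value", col = "yellow", border = "blue",
          main = "Histogram of v")
print(h$breaks)
print(h$counts)

# breaks suggests the number of bins; xlim and ylim fix the axis ranges.
h5 <- hist(v, breaks = 5, xlim = c(0, 50), ylim = c(0, 5),
           xlab = "Value", main = "Histogram of v, about 5 bins")

invisible(dev.off())
```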
12. Find a Correlation matrix and plot
the correlation on iris data set.
> cor(d)
x1 x2 x3
> library(corrplot)
> COR
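The definition of d and COR is not shown in the transcript. As a sketch for the iris data named in the exercise title, the script below assumes COR is the correlation matrix of the four numeric iris columns; iris_num is an illustrative name, and the plot is drawn only when the corrplot package is installed.

```r
# Species is a factor, so only the four numeric columns enter cor().
iris_num <- iris[, 1:4]
COR <- cor(iris_num)
print(round(COR, 2))

# corrplot() draws one glyph per pair of variables, sized and coloured by r.
if (requireNamespace("corrplot", quietly = TRUE)) {
  pdf(NULL)  # null device, so no display is needed
  corrplot::corrplot(COR, method = "circle")
  invisible(dev.off())
}
```

The matrix is symmetric with ones on the diagonal, so only the upper or lower triangle carries information.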
Screenshots:
13. Plot the correlation plot on the dataset
and visualize, giving an overview of
relationships among data on the iris data.
> Y <- seq(dim(y)[2])
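One way to get an overview of the relationships in iris is a scatter plot matrix coloured by species, together with per-species correlations. This is a sketch using base graphics only, not the method from the lost screenshots; by_species and the chosen pair of columns are assumptions.

```r
pdf(NULL)  # null device, so no display is needed

# Every numeric column against every other, one colour per species.
pairs(iris[, 1:4], col = as.integer(iris$Species), pch = 19,
      main = "iris: pairwise relationships")

invisible(dev.off())

# Correlation of petal length and width within each species.
by_species <- sapply(split(iris[, 1:4], iris$Species),
                     function(df) cor(df$Petal.Length, df$Petal.Width))
print(round(by_species, 3))

# Pooled correlation across all 150 flowers, for comparison.
pooled <- cor(iris$Petal.Length, iris$Petal.Width)
print(round(pooled, 3))
```

Comparing the pooled value with the per-species values shows how much of the overall correlation comes from differences between species rather than from variation within one species.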
15. Plot the given cluster data using R
visualizations.
> str(iris)
$ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
$ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
$ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
$ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
> set.seed(20)
> irisCluster
Cluster means:
Petal.Length Petal.Width
1 4.269231 1.342308
2 5.595833 2.037500
3 1.462000 0.246000
Clustering vector:
[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
31111111111111111111111111112111112111
[88] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2
222122222222222
Available components:
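The call that created irisCluster is not shown. The output above (three clusters on Petal.Length and Petal.Width after set.seed(20)) is consistent with a k-means fit, so the sketch below assumes kmeans() with centers = 3 and nstart = 20; those two arguments and the plotting details are assumptions.

```r
set.seed(20)

# Cluster on the two petal measurements (columns 3 and 4 of iris).
irisCluster <- kmeans(iris[, 3:4], centers = 3, nstart = 20)
print(irisCluster$size)

# Cross-tabulate cluster labels against the known species.
print(table(irisCluster$cluster, iris$Species))

pdf(NULL)  # null device, so no display is needed

# Points coloured by assigned cluster, with cluster centres marked as stars.
plot(iris$Petal.Length, iris$Petal.Width,
     col = irisCluster$cluster, pch = 19,
     xlab = "Petal.Length", ylab = "Petal.Width",
     main = "k-means clusters on iris petals")
points(irisCluster$centers, pch = 8, cex = 2)

invisible(dev.off())
```

Cluster numbers are arbitrary labels, so a rerun may number the groups differently even when the grouping itself is the same.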