bda file

The document is a practical file for a Big Data Analytics course, detailing various tasks and R scripts related to data analysis. It includes operations such as basic mathematical calculations, descriptive statistics, data reading from different formats, and visualizations like histograms and correlation plots. The file is submitted by a student to their instructor and contains an index of tasks performed using R programming.

Uploaded by

kajalc4499

Practical File

Big Data Analytics

PCC-CSE-404G

Submitted by:
Kajal
27514
CSE-A

Submitted to:
Dr. Chhavi Rana

Index
Sr.No. Title Remarks
1 Perform basic Mathematical Operations using R.
2 Write an R script to find basic descriptive statistics using summary, str, quantile function on mtcars & cars datasets.
3 Write an R script to find subset of dataset by using subset(), aggregate() functions on iris dataset.
4 Reading different types of data sets (.txt, .csv) from web and disk and writing in file in specific disk Location.
5 Reading Excel data sheet in R.
6 Reading XML dataset in R.
7 Find the data distributions using box and scatter plot.
8 Find the outliers using the previous plot.
9 Plot a histogram using the given sample data.
10 Plot a bar chart using the given sample data.
11 Plot a pie chart using the given sample data.
12 Find a Correlation matrix and plot the correlation on iris data set.
13 Plot the correlation plot on the dataset and visualize, giving an overview of relationships among data on the iris data.
14 Analysis of covariance for the iris dataset with categorical variables.
15 Plot the given cluster data using R visualizations.

1. Perform basic Mathematical Operations using R.

> A = 1563123

> B = 65132334

> C = A+B

> C

[1] 66695457

> D = B-A

> D

[1] 63569211

> E = A*B

> E

[1] 1.018098e+14

> F = A/B

> F

[1] 0.02399919

> class(A)

[1] "numeric"

> class(B)

[1] "numeric"

> class(C)

[1] "numeric"

> class(D)

[1] "numeric"
> class(E)

[1] "numeric"

> class(F)

[1] "numeric"

> E<F

[1] FALSE
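
Beyond the four operators shown above, R also has exponentiation, modulo, and integer division. The lines below are a minimal sketch; the operands `a` and `b` are hypothetical values chosen only for illustration and are not part of the original file.

```r
# Hypothetical small operands, chosen only for illustration
a <- 17
b <- 5

power     <- a ^ b     # exponentiation
remainder <- a %% b    # modulo (remainder after division)
int_div   <- a %/% b   # integer division
root      <- sqrt(a)   # square root

print(c(power = power, remainder = remainder, int_div = int_div, root = root))
```

All four results have class "numeric", like the values in the transcript.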
Screenshots:
2. Write an R script to find basic descriptive statistics using summary, str, quantile function on mtcars & cars datasets.
> mtcars

mpg cyl disp hp drat wt qsec vs am gear carb

Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4

Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4

Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1

Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1

Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2

Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1

Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4

Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2

Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2

Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4

Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4

Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3

Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3

Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3

Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4

Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4

Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4

Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1

Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2

Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1

Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1

Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2

AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2

Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4

Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2

Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1

Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2

Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2

Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4

Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6

Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8

Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2

> summary(mtcars)

mpg cyl disp hp

Min. :10.40 Min. :4.000 Min. : 71.1 Min. : 52.0

1st Qu.:15.43 1st Qu.:4.000 1st Qu.:120.8 1st Qu.: 96.5

Median :19.20 Median :6.000 Median :196.3 Median :123.0

Mean :20.09 Mean :6.188 Mean :230.7 Mean :146.7

3rd Qu.:22.80 3rd Qu.:8.000 3rd Qu.:326.0 3rd Qu.:180.0

Max. :33.90 Max. :8.000 Max. :472.0 Max. :335.0

drat wt qsec vs

Min. :2.760 Min. :1.513 Min. :14.50 Min. :0.0000

1st Qu.:3.080 1st Qu.:2.581 1st Qu.:16.89 1st Qu.:0.0000

Median :3.695 Median :3.325 Median :17.71 Median :0.0000

Mean :3.597 Mean :3.217 Mean :17.85 Mean :0.4375

3rd Qu.:3.920 3rd Qu.:3.610 3rd Qu.:18.90 3rd Qu.:1.0000

Max. :4.930 Max. :5.424 Max. :22.90 Max. :1.0000

am gear carb

Min. :0.0000 Min. :3.000 Min. :1.000

1st Qu.:0.0000 1st Qu.:3.000 1st Qu.:2.000

Median :0.0000 Median :4.000 Median :2.000

Mean :0.4062 Mean :3.688 Mean :2.812

3rd Qu.:1.0000 3rd Qu.:4.000 3rd Qu.:4.000

Max. :1.0000 Max. :5.000 Max. :8.000

> str(mtcars)

'data.frame': 32 obs. of 11 variables:

$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...

$ cyl : num 6 6 4 6 8 6 8 4 4 6 ...

$ disp: num 160 160 108 258 360 ...

$ hp : num 110 110 93 110 175 105 245 62 95 123 ...

$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...

$ wt : num 2.62 2.88 2.32 3.21 3.44 ...

$ qsec: num 16.5 17 18.6 19.4 17 ...

$ vs : num 0 0 1 1 0 1 0 1 1 1 ...

$ am : num 1 1 1 0 0 0 0 0 0 0 ...

$ gear: num 4 4 4 3 3 3 3 4 4 4 ...

$ carb: num 4 4 1 1 2 1 4 2 2 4 ...

> quantile(mtcars$mpg)

0% 25% 50% 75% 100%

10.400 15.425 19.200 22.800 33.900

> cars
speed dist

1 4 2

2 4 10

3 7 4

4 7 22

5 8 16

6 9 10

7 10 18

8 10 26

9 10 34

10 11 17

11 11 28

12 12 14

13 12 20

14 12 24

15 12 28

16 13 26

17 13 34

18 13 34

19 13 46

20 14 26

21 14 36

22 14 60

23 14 80

24 15 20

25 15 26
26 15 54

27 16 32

28 16 40

29 17 32

30 17 40

31 17 50

32 18 42

33 18 56

34 18 76

35 18 84

36 19 36

37 19 46

38 19 68

39 20 32

40 20 48

41 20 52

42 20 56

43 20 64

44 22 66

45 23 54

46 24 70

47 24 92

48 24 93

49 24 120

50 25 85

> summary(cars)
speed dist

Min. : 4.0 Min. : 2.00

1st Qu.:12.0 1st Qu.: 26.00

Median :15.0 Median : 36.00

Mean :15.4 Mean : 42.98

3rd Qu.:19.0 3rd Qu.: 56.00

Max. :25.0 Max. :120.00

> class(cars)

[1] "data.frame"

> dim(cars)

[1] 50 2

> str(cars)

'data.frame': 50 obs. of 2 variables:

$ speed: num 4 4 7 7 8 9 10 10 10 11 ...

$ dist : num 2 10 4 22 16 10 18 26 34 17 ...

> quantile(cars$speed)

0% 25% 50% 75% 100%

4 12 15 19 25
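
By default quantile() returns the five quartile cut points. As a small sketch on the same mtcars column, the probs argument selects other cut points and IQR() gives the spread between the 25% and 75% quantiles; the 10% and 90% levels here are an illustrative choice, not something the original file specifies.

```r
# Quartiles of mpg, matching the quantile(mtcars$mpg) call in the transcript
mpg_quartiles <- quantile(mtcars$mpg)

# Custom cut points at 10% and 90% (an illustrative choice)
mpg_tails <- quantile(mtcars$mpg, probs = c(0.10, 0.90))

# Interquartile range: the 75% quantile minus the 25% quantile
mpg_iqr <- IQR(mtcars$mpg)

print(mpg_quartiles)
print(mpg_tails)
print(mpg_iqr)
```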
Screenshots:
3. Write an R script to find subset of dataset by using subset(), aggregate() functions on iris dataset.
> iris

Sepal.Length Sepal.Width Petal.Length Petal.Width Species

1 5.1 3.5 1.4 0.2 setosa

2 4.9 3.0 1.4 0.2 setosa

3 4.7 3.2 1.3 0.2 setosa

4 4.6 3.1 1.5 0.2 setosa

5 5.0 3.6 1.4 0.2 setosa

6 5.4 3.9 1.7 0.4 setosa

7 4.6 3.4 1.4 0.3 setosa

8 5.0 3.4 1.5 0.2 setosa

9 4.4 2.9 1.4 0.2 setosa

10 4.9 3.1 1.5 0.1 setosa

11 5.4 3.7 1.5 0.2 setosa

12 4.8 3.4 1.6 0.2 setosa

13 4.8 3.0 1.4 0.1 setosa

14 4.3 3.0 1.1 0.1 setosa

15 5.8 4.0 1.2 0.2 setosa

16 5.7 4.4 1.5 0.4 setosa

17 5.4 3.9 1.3 0.4 setosa

18 5.1 3.5 1.4 0.3 setosa

19 5.7 3.8 1.7 0.3 setosa

20 5.1 3.8 1.5 0.3 setosa

21 5.4 3.4 1.7 0.2 setosa

22 5.1 3.7 1.5 0.4 setosa

23 4.6 3.6 1.0 0.2 setosa

24 5.1 3.3 1.7 0.5 setosa

25 4.8 3.4 1.9 0.2 setosa

26 5.0 3.0 1.6 0.2 setosa

27 5.0 3.4 1.6 0.4 setosa

28 5.2 3.5 1.5 0.2 setosa

29 5.2 3.4 1.4 0.2 setosa

30 4.7 3.2 1.6 0.2 setosa

31 4.8 3.1 1.6 0.2 setosa

32 5.4 3.4 1.5 0.4 setosa

33 5.2 4.1 1.5 0.1 setosa

34 5.5 4.2 1.4 0.2 setosa

35 4.9 3.1 1.5 0.2 setosa

36 5.0 3.2 1.2 0.2 setosa

37 5.5 3.5 1.3 0.2 setosa

38 4.9 3.6 1.4 0.1 setosa

39 4.4 3.0 1.3 0.2 setosa

40 5.1 3.4 1.5 0.2 setosa

41 5.0 3.5 1.3 0.3 setosa

42 4.5 2.3 1.3 0.3 setosa

43 4.4 3.2 1.3 0.2 setosa

44 5.0 3.5 1.6 0.6 setosa

45 5.1 3.8 1.9 0.4 setosa

46 4.8 3.0 1.4 0.3 setosa

47 5.1 3.8 1.6 0.2 setosa

48 4.6 3.2 1.4 0.2 setosa

49 5.3 3.7 1.5 0.2 setosa

50 5.0 3.3 1.4 0.2 setosa

51 7.0 3.2 4.7 1.4 versicolor

52 6.4 3.2 4.5 1.5 versicolor

53 6.9 3.1 4.9 1.5 versicolor

54 5.5 2.3 4.0 1.3 versicolor

55 6.5 2.8 4.6 1.5 versicolor

56 5.7 2.8 4.5 1.3 versicolor

57 6.3 3.3 4.7 1.6 versicolor

58 4.9 2.4 3.3 1.0 versicolor

59 6.6 2.9 4.6 1.3 versicolor

60 5.2 2.7 3.9 1.4 versicolor

61 5.0 2.0 3.5 1.0 versicolor

62 5.9 3.0 4.2 1.5 versicolor

63 6.0 2.2 4.0 1.0 versicolor

64 6.1 2.9 4.7 1.4 versicolor

65 5.6 2.9 3.6 1.3 versicolor

66 6.7 3.1 4.4 1.4 versicolor

67 5.6 3.0 4.5 1.5 versicolor

68 5.8 2.7 4.1 1.0 versicolor

69 6.2 2.2 4.5 1.5 versicolor

70 5.6 2.5 3.9 1.1 versicolor

71 5.9 3.2 4.8 1.8 versicolor

72 6.1 2.8 4.0 1.3 versicolor

73 6.3 2.5 4.9 1.5 versicolor

74 6.1 2.8 4.7 1.2 versicolor

75 6.4 2.9 4.3 1.3 versicolor

76 6.6 3.0 4.4 1.4 versicolor

77 6.8 2.8 4.8 1.4 versicolor

78 6.7 3.0 5.0 1.7 versicolor

79 6.0 2.9 4.5 1.5 versicolor

80 5.7 2.6 3.5 1.0 versicolor

81 5.5 2.4 3.8 1.1 versicolor

82 5.5 2.4 3.7 1.0 versicolor

83 5.8 2.7 3.9 1.2 versicolor

84 6.0 2.7 5.1 1.6 versicolor

85 5.4 3.0 4.5 1.5 versicolor

86 6.0 3.4 4.5 1.6 versicolor

87 6.7 3.1 4.7 1.5 versicolor

88 6.3 2.3 4.4 1.3 versicolor

89 5.6 3.0 4.1 1.3 versicolor

90 5.5 2.5 4.0 1.3 versicolor

91 5.5 2.6 4.4 1.2 versicolor

92 6.1 3.0 4.6 1.4 versicolor

93 5.8 2.6 4.0 1.2 versicolor

94 5.0 2.3 3.3 1.0 versicolor

95 5.6 2.7 4.2 1.3 versicolor

96 5.7 3.0 4.2 1.2 versicolor

97 5.7 2.9 4.2 1.3 versicolor

98 6.2 2.9 4.3 1.3 versicolor

99 5.1 2.5 3.0 1.1 versicolor

100 5.7 2.8 4.1 1.3 versicolor

101 6.3 3.3 6.0 2.5 virginica

102 5.8 2.7 5.1 1.9 virginica

103 7.1 3.0 5.9 2.1 virginica

104 6.3 2.9 5.6 1.8 virginica

105 6.5 3.0 5.8 2.2 virginica

106 7.6 3.0 6.6 2.1 virginica

107 4.9 2.5 4.5 1.7 virginica

108 7.3 2.9 6.3 1.8 virginica

109 6.7 2.5 5.8 1.8 virginica

110 7.2 3.6 6.1 2.5 virginica

111 6.5 3.2 5.1 2.0 virginica

112 6.4 2.7 5.3 1.9 virginica

113 6.8 3.0 5.5 2.1 virginica

114 5.7 2.5 5.0 2.0 virginica

115 5.8 2.8 5.1 2.4 virginica

116 6.4 3.2 5.3 2.3 virginica

117 6.5 3.0 5.5 1.8 virginica

118 7.7 3.8 6.7 2.2 virginica

119 7.7 2.6 6.9 2.3 virginica

120 6.0 2.2 5.0 1.5 virginica

121 6.9 3.2 5.7 2.3 virginica

122 5.6 2.8 4.9 2.0 virginica

123 7.7 2.8 6.7 2.0 virginica

124 6.3 2.7 4.9 1.8 virginica

125 6.7 3.3 5.7 2.1 virginica

126 7.2 3.2 6.0 1.8 virginica

127 6.2 2.8 4.8 1.8 virginica

128 6.1 3.0 4.9 1.8 virginica

129 6.4 2.8 5.6 2.1 virginica

130 7.2 3.0 5.8 1.6 virginica

131 7.4 2.8 6.1 1.9 virginica

132 7.9 3.8 6.4 2.0 virginica

133 6.4 2.8 5.6 2.2 virginica

134 6.3 2.8 5.1 1.5 virginica

135 6.1 2.6 5.6 1.4 virginica

136 7.7 3.0 6.1 2.3 virginica

137 6.3 3.4 5.6 2.4 virginica

138 6.4 3.1 5.5 1.8 virginica

139 6.0 3.0 4.8 1.8 virginica

140 6.9 3.1 5.4 2.1 virginica

141 6.7 3.1 5.6 2.4 virginica

142 6.9 3.1 5.1 2.3 virginica

143 5.8 2.7 5.1 1.9 virginica

144 6.8 3.2 5.9 2.3 virginica

145 6.7 3.3 5.7 2.5 virginica

146 6.7 3.0 5.2 2.3 virginica

147 6.3 2.5 5.0 1.9 virginica

148 6.5 3.0 5.2 2.0 virginica

149 6.2 3.4 5.4 2.3 virginica

150 5.9 3.0 5.1 1.8 virginica

> aggregate(. ~Species, data=iris, mean)

Species Sepal.Length Sepal.Width Petal.Length Petal.Width

1 setosa 5.006 3.428 1.462 0.246

2 versicolor 5.936 2.770 4.260 1.326

3 virginica 6.588 2.974 5.552 2.026

> subset(iris, Sepal.Length==5.0)

Sepal.Length Sepal.Width Petal.Length Petal.Width Species

5 5 3.6 1.4 0.2 setosa

8 5 3.4 1.5 0.2 setosa

26 5 3.0 1.6 0.2 setosa

27 5 3.4 1.6 0.4 setosa

36 5 3.2 1.2 0.2 setosa

41 5 3.5 1.3 0.3 setosa

44 5 3.5 1.6 0.6 setosa

50 5 3.3 1.4 0.2 setosa

61 5 2.0 3.5 1.0 versicolor

94 5 2.3 3.3 1.0 versicolor
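
subset() can also combine several conditions and keep only chosen columns, and aggregate() can summarise a single column with any function. The sketch below uses the built-in iris data; the 5.5 threshold and the choice of max are assumptions made for illustration only.

```r
# Rows of one species above a length threshold, keeping only two columns
long_setosa <- subset(iris,
                      Species == "setosa" & Sepal.Length > 5.5,
                      select = c(Sepal.Length, Species))

# Per-species maximum of a single column via the formula interface
max_petal <- aggregate(Petal.Length ~ Species, data = iris, FUN = max)

print(long_setosa)
print(max_petal)
```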

Screenshots:
4. Reading different types of data sets (.txt, .csv) from web and disk and writing in file in specific disk Location.
> library(utils)

> data<-read.csv("input.csv")

> data

id name salary start_date dept

1 1 Rick 623.30 2012-01-01 IT

2 2 Dan 515.20 2013-09-23 Operations

3 3 Michelle 611.00 2014-11-15 IT

4 4 Ryan 729.00 2014-05-11 HR

5 5 Gary 843.25 2015-03-27 Finance

6 6 Nina 578.00 2013-05-21 IT

7 7 Simon 632.80 2013-07-30 Operations

8 8 Guru 722.50 2014-06-17 Finance

> print(is.data.frame(data))

[1] TRUE

> print(ncol(data))

[1] 5

> print(nrow(data))

[1] 8

> # Getting the max salary.

> sal <- max(data$salary)

> sal
[1] 843.25

> # Getting the details of the person with the max salary.

> details <-subset(data, salary==sal)

> details

id name salary start_date dept

5 5 Gary 843.25 2015-03-27 Finance

> # Getting the details of all the employees working in the IT department.

> it_details<-subset(data, dept=="IT")

> it_details

id name salary start_date dept

1 1 Rick 623.3 2012-01-01 IT

3 3 Michelle 611.0 2014-11-15 IT

6 6 Nina 578.0 2013-05-21 IT

> # Getting the details of the employees employed after 2014-01-01

> join_details <- subset(data, as.Date(start_date)>as.Date("2014-01-01"))

> join_details

id name salary start_date dept

3 3 Michelle 611.00 2014-11-15 IT

4 4 Ryan 729.00 2014-05-11 HR

5 5 Gary 843.25 2015-03-27 Finance

8 8 Guru 722.50 2014-06-17 Finance

> # Writing the join_details into a new file.

> write.csv(join_details, "output.csv")

> newdata <- read.csv("output.csv")

> newdata

X id name salary start_date dept

1 3 3 Michelle 611.00 2014-11-15 IT

2 4 4 Ryan 729.00 2014-05-11 HR

3 5 5 Gary 843.25 2015-03-27 Finance

4 8 8 Guru 722.50 2014-06-17 Finance
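
The extra X column in newdata appears because write.csv() stores the row names by default. The following sketch shows one way to avoid it and also covers the .txt case and a specific output folder. The data frame reuses the first three rows of the table above; the folder name, the file names, and the URL in the comment are hypothetical, and tempdir() merely stands in for a real disk location.

```r
# Stand-in data: first three rows of the employee table shown above
emp <- data.frame(id = 1:3,
                  name = c("Rick", "Dan", "Michelle"),
                  salary = c(623.30, 515.20, 611.00))

# A specific output location; tempdir() stands in for a real folder path
out_dir <- file.path(tempdir(), "bda_output")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
csv_path <- file.path(out_dir, "employees.csv")
txt_path <- file.path(out_dir, "employees.txt")

# row.names = FALSE avoids the extra "X" column when the file is read back
write.csv(emp, csv_path, row.names = FALSE)

# Tab-separated text file
write.table(emp, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

csv_back <- read.csv(csv_path)
txt_back <- read.table(txt_path, header = TRUE, sep = "\t")

# read.csv() also accepts a URL in place of a file name, for example:
# web_data <- read.csv("https://example.com/input.csv")  # hypothetical URL

print(csv_back)
print(txt_back)
```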

Screenshots:
5. Reading Excel data sheet in R.
> install.packages("xlsx")

Installing package into ‘C:/Users/Avi/AppData/Local/R/win-library/4.2’

(as ‘lib’ is unspecified)

--- Please select a CRAN mirror for use in this session ---

also installing the dependencies ‘rJava’, ‘xlsxjars’

trying URL 'https://rweb.crmda.ku.edu/cran/bin/windows/contrib/4.2/rJava_1.0-6.zip'

Content type 'application/zip' length 1245703 bytes (1.2 MB)

downloaded 1.2 MB

trying URL 'https://rweb.crmda.ku.edu/cran/bin/windows/contrib/4.2/xlsxjars_0.6.1.zip'

Content type 'application/zip' length 9485708 bytes (9.0 MB)

downloaded 9.0 MB

trying URL 'https://rweb.crmda.ku.edu/cran/bin/windows/contrib/4.2/xlsx_0.6.5.zip'

Content type 'application/zip' length 374907 bytes (366 KB)

downloaded 366 KB

package ‘rJava’ successfully unpacked and MD5 sums checked

package ‘xlsxjars’ successfully unpacked and MD5 sums checked

package ‘xlsx’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in

C:\Users\Avi\AppData\Local\Temp\RtmpyWUuTJ\downloaded_packages
> library("xlsx")

> getwd()

[1] "C:/Users/Avi/Documents"

> setwd("G:/Avi/8th Sem/BDA/LabWork")

> getwd()

[1] "G:/Avi/8th Sem/BDA/LabWork"

> library("xlsx")

> data<-read.xlsx("input.xlsx", sheetIndex=1)

> data

id name salary start_date dept

1 1 Rick 623.30 2012-01-01 IT

2 2 Dan 515.20 2013-09-23 Operations

3 3 Michelle 611.00 2014-11-15 IT

4 4 Ryan 729.00 2014-05-11 HR

5 5 Gary 843.25 2015-03-27 Finance

6 6 Nina 578.00 2013-05-21 IT

7 7 Simon 632.80 2013-07-30 Operations

8 8 Guru 722.50 2014-06-17 Finance

Screenshots:
6. Reading XML dataset in R.
> library("XML")

> library("methods")

> result<-xmlParse(file="input.xml")

> result

<?xml version="1.0"?>

<?xml-stylesheet href="sheet.css"?>

<dataset>

<data>

<start_date>"2012-01-01"</start_date>

</data>

<data>

<start_date>"2013-09-23"</start_date>

<dept>"Operations"</dept>

</data>

<data>

<name>"Michelle"</name>
<salary>"611"</salary>

<start_date>"2014-11-15"</start_date>

</data>

<data>

<start_date>"2014-05-11"</start_date>

</data>

<data>

<start_date>"2015-03-27"</start_date>

<dept>"Finance"</dept>

</data>

<data>

<start_date>"2013-05-21"</start_date>

</data>

<data>

<name>"Simon"</name>

<start_date>"2013-07-30"</start_date>

<dept>"Operations"</dept>

</data>

<data>

<start_date>"2014-06-17"</start_date>

<dept>"Finance"</dept>

</data>

</dataset>
Screenshots:
7. Find the data distributions using box and scatter plot.
> library(ggplot2)

> input <- mtcars[,c('mpg','cyl')]

> input

mpg cyl

Mazda RX4 21.0 6

Mazda RX4 Wag 21.0 6

Datsun 710 22.8 4

Hornet 4 Drive 21.4 6

Hornet Sportabout 18.7 8

Valiant 18.1 6

Duster 360 14.3 8

Merc 240D 24.4 4

Merc 230 22.8 4

Merc 280 19.2 6

Merc 280C 17.8 6

Merc 450SE 16.4 8

Merc 450SL 17.3 8

Merc 450SLC 15.2 8

Cadillac Fleetwood 10.4 8

Lincoln Continental 10.4 8

Chrysler Imperial 14.7 8

Fiat 128 32.4 4

Honda Civic 30.4 4

Toyota Corolla 33.9 4

Toyota Corona 21.5 4

Dodge Challenger 15.5 8

AMC Javelin 15.2 8

Camaro Z28 13.3 8

Pontiac Firebird 19.2 8

Fiat X1-9 27.3 4

Porsche 914-2 26.0 4

Lotus Europa 30.4 4

Ford Pantera L 15.8 8

Ferrari Dino 19.7 6

Maserati Bora 15.0 8

Volvo 142E 21.4 4

> boxplot(mpg~cyl, data=mtcars,xlab="Number of Cylinders", ylab="Miles per Gallon",

main="Mileage Data")
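
The task also asks for a scatter plot, which the transcript does not show. A minimal base-graphics sketch on the same mtcars data follows; pairing wt with mpg is an assumption made for illustration, since the original does not say which variables to plot.

```r
# Scatter plot of weight against mileage from the built-in mtcars data
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per Gallon",
     main = "Weight vs Mileage", pch = 19)

# Correlation summarising the direction of the relationship
wt_mpg_cor <- cor(mtcars$wt, mtcars$mpg)
print(wt_mpg_cor)
```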
Screenshots:
8. Find the outliers using the previous plot.
> v=c(50, 75, 100, 125, 150, 175, 200)

> boxplot(v)
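
The vector v above has no points beyond the whiskers, so its box plot shows no outliers. As a sketch, boxplot.stats() applies the same 1.5 x IQR rule that boxplot() uses and returns the flagged values directly. The vector v_out is hypothetical: it is v with one extreme value (900) appended for illustration.

```r
# Original sample from the transcript: no points beyond the whiskers
v <- c(50, 75, 100, 125, 150, 175, 200)
print(boxplot.stats(v)$out)

# Hypothetical sample: the same values plus one extreme value
v_out <- c(v, 900)
bs <- boxplot.stats(v_out)   # same rule boxplot() uses to draw outlier points
print(bs$out)

# The outlier appears as a separate point above the upper whisker
boxplot(v_out, main = "Sample with one outlier")
```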
Screenshots:
9. Plot a histogram using the given sample data.
Histogram:
> library(graphics)

> v <- c(9, 13, 21, 8, 36, 22, 12, 41, 31, 33, 19)

> hist(v, xlab="Weight", col="green", border="green")

Screenshots:
10. Plot a bar chart using the given sample data.
Bar Chart:
> H <- c(7, 12, 28, 3, 41)

> M <- c("Jan", "Feb", "Mar", "Apr", "May")

> barplot(H, names.arg=M, xlab="Month", ylab="Revenue", col="blue", main="Revenue Chart",

border="blue")
Screenshots:
11. Plot a pie chart using the given sample data.
Pie Chart:
> library(graphics)

> x <- c(21, 62, 10, 53)

> labels <- c("London", "NewYork", "Singapore", "Mumbai")

> pie(x, labels)

Screenshots:
12. Find a Correlation matrix and plot the correlation on iris data set.
> d <- data.frame(x1=rnorm(10), x2=rnorm(10), x3=rnorm(10))

> cor(d)

x1 x2 x3

x1 1.0000000 0.47514914 -0.21575367

x2 0.4751491 1.00000000 0.09190779

x3 -0.2157537 0.09190779 1.00000000

> m <- cor(d)

> library(corrplot)

corrplot 0.92 loaded

> corrplot(m, method="square")

> x <- matrix(rnorm(20), nrow=5, ncol=4)

> y <- matrix(rnorm(15), nrow=5, ncol=3)

> COR <- cor(x, y)

> COR
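
The transcript computes correlations on random data, while the task names the iris data set. A short sketch of the same idea on iris follows, using only base R. Restricting to columns 1 to 4 is needed because Species is a factor; the image() heat map is an assumed base-graphics stand-in for the corrplot() call used above.

```r
# Correlation matrix of the four numeric iris columns (Species is a factor)
iris_cor <- cor(iris[, 1:4])
print(round(iris_cor, 2))

# Base-graphics heat map of the matrix; corrplot(iris_cor) would be the
# package-based alternative, as used in the transcript
idx <- seq_len(ncol(iris_cor))
image(idx, idx, iris_cor, axes = FALSE, xlab = "", ylab = "",
      main = "iris correlation matrix")
axis(1, at = idx, labels = colnames(iris_cor), cex.axis = 0.7)
axis(2, at = idx, labels = rownames(iris_cor), cex.axis = 0.7)
```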
Screenshots:
13. Plot the correlation plot on the dataset and visualize, giving an overview of relationships among data on the iris data.
> X <- seq(dim(x)[2])

> Y <- seq(dim(y)[2])

> Z <- COR

> image(X, Y, Z, xlab="X Column", ylab="Y Column")

Screenshots:
14. Analysis of covariance for the iris dataset with categorical variables.
> data(iris)

> str(iris)

'data.frame': 150 obs. of 5 variables:

$ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...

$ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...

$ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...

$ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...

$ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

> ggplot(data=iris, aes(x=Sepal.Length, y=Petal.Length)) +
geom_point(size=2, colour="black") +
geom_point(size=1, colour="white") +
geom_smooth(colour="black", method="lm") +
ggtitle("Sepal.Length vs Petal.Length") +
xlab("Sepal.Length") + ylab("Petal.Length") +
theme(legend.position="none")
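
The plot above only visualises the relationship; the analysis of covariance itself can be sketched with aov(). The model choice here is an assumption for illustration: Petal.Length as the response, Sepal.Length as the continuous covariate, and Species as the categorical factor. Comparing the additive model with the interaction model tests whether the slope differs between species.

```r
# ANCOVA sketch: continuous covariate plus a categorical factor
# (the choice of response and covariate is an assumption for illustration)
fit_additive    <- aov(Petal.Length ~ Sepal.Length + Species, data = iris)
fit_interaction <- aov(Petal.Length ~ Sepal.Length * Species, data = iris)

# Sequential ANOVA table for the additive model
print(summary(fit_additive))

# F-test comparing the two models: does the slope differ by species?
print(anova(fit_additive, fit_interaction))
```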
Screenshots:
15. Plot the given cluster data using R visualizations.
> library(cluster)

> set.seed(20)

> irisCluster <- kmeans(iris[, 3:4], 3, nstart=20)

> irisCluster

K-means clustering with 3 clusters of sizes 52, 48, 50

Cluster means:

Petal.Length Petal.Width

1 4.269231 1.342308

2 5.595833 2.037500

3 1.462000 0.246000

Clustering vector:

[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1

[88] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2

Within cluster sum of squares by cluster:

[1] 13.05769 16.29167 2.02200

(between_SS / total_SS = 94.3 %)

Available components:

[1] "cluster" "centers" "totss" "withinss" "tot.withinss" "betweenss" "size"

[8] "iter" "ifault"
Screenshots:
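
The transcript fits the k-means model but does not include the plotting code. A minimal base-graphics sketch follows, reusing the same seed and call. Colouring points by cluster label and marking the centres with stars are presentation choices assumed here, not taken from the original file.

```r
set.seed(20)
irisCluster <- kmeans(iris[, 3:4], centers = 3, nstart = 20)

# Cross-tabulate cluster labels against the known species
print(table(irisCluster$cluster, iris$Species))

# Scatter plot of the two clustering variables, coloured by cluster,
# with the cluster centres marked as stars
plot(iris$Petal.Length, iris$Petal.Width,
     col = irisCluster$cluster, pch = 19,
     xlab = "Petal.Length", ylab = "Petal.Width",
     main = "k-means clusters (k = 3)")
points(irisCluster$centers, pch = 8, cex = 2)
```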
