Experiment 1: Working with Objects in Memory
Aim:
To understand the creation, manipulation, and management of objects in R's memory.
Algorithm:
   1. Start the R Environment: Open RStudio or the R Console.
   2. Create Basic Objects:
          ○ Use the assignment operator <- or = to create variables.
         ○ Example: x = 10 or y = "Hello"
   3. Manipulate Objects:
         ○ Perform operations on numeric objects.
         ○ Example: z = x + 20
   4. Check Object Class and Type:
         ○ Use functions like class() and typeof() to verify the type of objects.
         ○ Example: class(x) or typeof(y)
   5. Inspect Objects in Memory:
         ○ Use ls() to list all objects in the current environment.
         ○ Example: ls()
   6. Remove Objects:
         ○ Use rm() to delete objects from memory.
         ○ Example: rm(x)
   7. Perform Simple Operations:
         ○ Work with sequences, vectors, and logical conditions.
          ○ Example: vec = c(1, 2, 3, 4, 5)
   8. End: Display the final state of objects in the memory.
R Code:
# Create objects
x = 10
y = "Hello"
z = x + 20
# Print objects
print(x)
print(y)
print(z)
# Check object types
cat("Class of x:", class(x), "\n")
cat("Type of y:", typeof(y), "\n")
# List objects in memory
cat("Objects in memory:", ls(), "\n")
# Remove an object
rm(x)
# Confirm removal
cat("Objects after removing 'x':", ls(), "\n")
# Work with a vector
vec = c(1, 2, 3, 4, 5)
print(vec)
# Perform an operation on the vector
vec_squared = vec^2
print(vec_squared)
Output Example:
[1] 10
[1] "Hello"
[1] 30
Class of x: numeric
Type of y: character
Objects in memory: x y z
Objects after removing 'x': y z
[1] 1 2 3 4 5
[1]  1  4  9 16 25
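As a supplement to ls() and rm(), base R can also report how much memory an individual object occupies. The following minimal sketch uses the standard object.size(), exists(), and gc() functions; the variable name big_vec is chosen for illustration only.
# Supplementary sketch: inspecting the memory footprint of an object
big_vec <- rnorm(1e6)                 # a vector of one million random numbers
print(object.size(big_vec))           # approximate memory used, in bytes
cat("Does 'big_vec' exist?", exists("big_vec"), "\n")
rm(big_vec)                           # delete the object
cat("After rm():", exists("big_vec"), "\n")
invisible(gc())                       # ask R to release the freed memory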
Experiment 2: Demonstrate Data Frame
Aim:
To create and manipulate a Data Frame in R, showcasing its structure, operations, and
applications.
Algorithm:
    1. Start the R Environment: Open RStudio or the R Console.
    2. Create a Data Frame:
          ○ Use the data.frame() function.
          ○ Example: df = data.frame(Column1, Column2, Column3)
    3. Inspect the Data Frame:
          ○ View the structure using str().
          ○ Check dimensions using dim().
    4. Access Data Frame Elements:
          ○ Use indexing: df[row, column].
          ○ Access columns using the $ operator: df$ColumnName.
    5. Perform Operations:
          ○ Add, modify, or delete rows and columns.
          ○ Example: df$NewColumn = some_operation.
    6. Summary and Viewing:
          ○ Display the first few rows with head().
          ○ Summarize data using summary().
    7. End: Save or display the modified Data Frame.
R Code:
# Create a data frame
students = data.frame(
    Roll_No = c(101, 102, 103, 104),
    Name = c("Alice", "Bob", "Charlie", "Diana"),
    Marks = c(85, 90, 78, 92),
    Grade = c("A", "A+", "B", "A+")
)
# Display the data frame
print("Original Data Frame:")
print(students)
# View structure and dimensions
cat("\nStructure of the Data Frame:\n")
str(students)
cat("\nDimensions of the Data Frame: ")
print(dim(students))
# Access specific elements
cat("\nMarks of the second student:")
print(students[2, "Marks"])
cat("\nNames of all students:")
print(students$Name)
# Add a new column
students$Attendance = c(90, 95, 85, 88)
cat("\nData Frame after adding Attendance column:\n")
print(students)
# Modify a column
students$Marks = students$Marks + 5
cat("\nData Frame after increasing marks by 5:\n")
print(students)
Output Example:
Original Data Frame:
  Roll_No    Name Marks Grade
1     101   Alice    85     A
2     102     Bob    90    A+
3     103 Charlie    78     B
4     104   Diana    92    A+
Structure of the Data Frame:
'data.frame':  4 obs. of  4 variables:
 $ Roll_No: num  101 102 103 104
 $ Name   : chr  "Alice" "Bob" "Charlie" "Diana"
 $ Marks  : num  85 90 78 92
 $ Grade  : chr  "A" "A+" "B" "A+"
Dimensions of the Data Frame:
[1] 4 4
Marks of the second student:
[1] 90
Names of all students:
[1] "Alice"        "Bob"         "Charlie" "Diana"
Data Frame after adding Attendance column:
  Roll_No    Name Marks Grade Attendance
1     101   Alice    85     A         90
2     102     Bob    90    A+         95
3     103 Charlie    78     B         85
4     104   Diana    92    A+         88
Data Frame after increasing marks by 5:
  Roll_No    Name Marks Grade Attendance
1     101   Alice    90     A         90
2     102     Bob    95    A+         95
3     103 Charlie    83     B         85
4     104   Diana    97    A+         88
This program demonstrates the creation and manipulation of a data frame in R. A supplementary sketch of row filtering and column deletion, which step 5 of the algorithm mentions but the code above does not show, follows.
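The sketch below assumes the students data frame from the code above; the threshold of 85 marks is illustrative.
# Supplementary sketch: deleting a column and filtering rows
students$Attendance <- NULL                      # drop the Attendance column
high_scorers <- students[students$Marks > 85, ]  # keep rows with Marks above 85
print(high_scorers)
print(summary(students))                         # overall summary, as in step 6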
Experiment 3: Perform Matrix Operations
Aim:
To create and perform various operations on matrices in R, such as addition, multiplication,
transposition, and inversion.
Algorithm:
   1. Start the R Environment: Open RStudio or the R Console.
   2. Create Matrices:
         ○ Use the matrix() function.
         ○ Example: matrix(data, nrow, ncol)
   3. Perform Basic Matrix Operations:
         ○ Addition, subtraction, and multiplication.
          ○ Use + and - for element-wise sums and differences, and %*% for matrix multiplication.
   4. Transpose the Matrix:
         ○ Use the t() function.
   5. Find the Determinant:
         ○ Use the det() function.
   6. Find the Inverse of a Matrix:
         ○ Use the solve() function (for square matrices).
   7. End: Display the final results of the operations.
R Code:
# Create two matrices
matrix1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
matrix2 <- matrix(c(6, 5, 4, 3, 2, 1), nrow = 2, ncol = 3)
# Display the matrices
cat("Matrix 1:\n")
print(matrix1)
cat("\nMatrix 2:\n")
print(matrix2)
# Matrix addition
matrix_sum <- matrix1 + matrix2
cat("\nMatrix Addition (Matrix 1 + Matrix 2):\n")
print(matrix_sum)
# Transpose of a matrix
transpose_matrix <- t(matrix1)
cat("\nTranspose of Matrix 1:\n")
print(transpose_matrix)
# Multiplication of matrices (requires compatible dimensions)
matrix3 <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
matrix4 <- matrix(c(5, 6, 7, 8), nrow = 2, ncol = 2)
matrix_product <- matrix3 %*% matrix4
cat("\nMatrix Multiplication (Matrix 3 x Matrix 4):\n")
print(matrix_product)
# Determinant of a square matrix
det_matrix <- det(matrix3)
cat("\nDeterminant of Matrix 3:\n")
print(det_matrix)
# Inverse of a square matrix (if determinant is not zero)
if (det_matrix != 0) {
    inverse_matrix <- solve(matrix3)
    cat("\nInverse of Matrix 3:\n")
    print(inverse_matrix)
} else {
    cat("\nMatrix 3 is not invertible.\n")
}
Output Example:
Matrix 1:
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Matrix 2:
     [,1] [,2] [,3]
[1,]    6    4    2
[2,]    5    3    1
Matrix Addition (Matrix 1 + Matrix 2):
     [,1] [,2] [,3]
[1,]    7    7    7
[2,]    7    7    7
Transpose of Matrix 1:
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
Matrix Multiplication (Matrix 3 x Matrix 4):
     [,1] [,2]
[1,]   19   22
[2,]   43   50
Determinant of Matrix 3:
[1] -2
Inverse of Matrix 3:
     [,1] [,2]
[1,]   -2  1.5
[2,]    1 -0.5
This program demonstrates how to create matrices, perform basic arithmetic operations,
transpose, find determinants, and calculate inverses in R. A supplementary sketch contrasting element-wise and matrix products follows.
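Because step 3 of the algorithm lists the * operator alongside %*%, it is worth separating the two explicitly. The following minimal sketch contrasts the element-wise product with the true matrix product; the small 2 x 2 matrices are illustrative.
# Supplementary sketch: element-wise product (*) versus matrix product (%*%)
a <- matrix(1:4, nrow = 2)     # filled column-first: rows (1, 3) and (2, 4)
b <- matrix(5:8, nrow = 2)
print(a * b)                   # element-wise: each entry times its counterpart
print(a %*% b)                 # linear-algebra product: rows of a times columns of b
print(a - b)                   # subtraction, also element-wise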
Experiment 4: Working with Various Built-in Functions in R
Aim:
To explore and demonstrate the use of various built-in functions in R for mathematical,
statistical, and data manipulation tasks.
Algorithm:
   1. Start the R Environment: Open RStudio or the R Console.
  2. Use Mathematical Functions:
        ○ Demonstrate functions like sqrt(), log(), exp(), and abs().
  3. Use Statistical Functions:
        ○ Demonstrate functions like mean(), median(), sd(), var(), and
           summary().
  4. Use Character Functions:
        ○ Demonstrate functions like toupper(), tolower(), substr(), and
           paste().
  5. Use Sequence and Repetition Functions:
        ○ Use seq() and rep() to generate sequences and repeated values.
  6. Perform Aggregation:
        ○ Use aggregate() to group and summarize data.
  7. End: Display the results of all operations.
R Code:
# 1. Mathematical Functions
x <- 16
y <- -4
cat("Square root of", x, ":", sqrt(x), "\n")
cat("Absolute value of", y, ":", abs(y), "\n")
cat("Natural logarithm of", x, ":", log(x), "\n")
cat("Exponential of", y, ":", exp(y), "\n\n")
# 2. Statistical Functions
data <- c(10, 20, 30, 40, 50)
cat("Mean of data:", mean(data), "\n")
cat("Median of data:", median(data), "\n")
cat("Standard deviation of data:", sd(data), "\n")
cat("Variance of data:", var(data), "\n")
cat("Summary of data:\n")
print(summary(data))
cat("\n")
# 3. Character Functions
text <- "Hello R"
cat("Uppercase:", toupper(text), "\n")
cat("Lowercase:", tolower(text), "\n")
cat("Substring (1 to 5):", substr(text, 1, 5), "\n")
cat("Concatenate strings:", paste("Learning", "R", sep = " "), "\n\
n")
# 4. Sequence and Repetition
sequence <- seq(1, 10, by = 2)
cat("Generated sequence:", sequence, "\n")
repeated <- rep(5, times = 4)
cat("Repeated values:", repeated, "\n\n")
# 5. Aggregation
df <- data.frame(
    Category = c("A", "A", "B", "B", "C"),
    Value = c(10, 15, 10, 20, 30)
)
aggregated <- aggregate(Value ~ Category, data = df, sum)
cat("Aggregated values by category:\n")
print(aggregated)
Output Example:
Square root of 16 : 4
Absolute value of -4 : 4
Natural logarithm of 16 : 2.772589
Exponential of -4 : 0.01831564
Mean of data: 30
Median of data: 30
Standard deviation of data: 15.81139
Variance of data: 250
Summary of data:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     10      20      30      30      40      50
Uppercase: HELLO R
Lowercase: hello r
Substring (1 to 5): Hello
Concatenate strings: Learning R
Generated sequence: 1 3 5 7 9
Repeated values: 5 5 5 5
Aggregated values by category:
  Category Value
1        A    25
2        B    30
3        C    30
This program demonstrates the use of various built-in functions in R for handling
mathematical operations, statistical analysis, character string manipulations, sequence
generation, and data aggregation.
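aggregate() is not limited to sum; any summary function can be supplied through its FUN argument. A brief sketch using the same illustrative data frame:
# Supplementary sketch: aggregate() with other summary functions
df <- data.frame(
    Category = c("A", "A", "B", "B", "C"),
    Value = c(10, 15, 10, 20, 30)
)
print(aggregate(Value ~ Category, data = df, FUN = mean))    # group means
print(aggregate(Value ~ Category, data = df, FUN = length))  # group counts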
Experiment 5: Import and Export Files in R
Aim:
To demonstrate the import and export of data files in R, such as CSV, Excel, and text files,
and perform basic operations on the imported data.
Algorithm:
    1. Start the R Environment: Open RStudio or the R Console.
    2. Import a CSV File:
          ○ Use the read.csv() function to load data.
          ○ Example: data <- read.csv("file.csv").
    3. View and Manipulate Data:
          ○ Use functions like head(), str(), and summary() to inspect the data.
    4. Export a CSV File:
          ○ Use the write.csv() function to save the modified data to a new file.
    5. Import an Excel File (Optional):
          ○ Use the readxl package and the read_excel() function.
    6. Export Data to an Excel File:
          ○ Use the writexl package and the write_xlsx() function.
    7. Import and Export Text Files:
          ○ Use read.table() and write.table() functions.
    8. End: Verify the imported and exported files.
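The script below assumes that files named sample.csv, sample.xlsx, and sample.txt already exist in the working directory. If they do not, a setup sketch along the following lines (with illustrative column names and values) can create the CSV and text inputs first:
# Setup sketch (assumed file names): create small input files for the script below
setup <- data.frame(
    Column1 = 1:3,
    Column2 = c("A", "B", "C"),
    Column3 = c(10, 20, 30)
)
write.csv(setup, "sample.csv", row.names = FALSE)
write.table(setup, "sample.txt", row.names = FALSE, sep = "\t")
# sample.xlsx can be created similarly with writexl::write_xlsx(setup, "sample.xlsx")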
R Code:
# 1. Importing a CSV File
cat("Importing CSV file...\n")
data <- read.csv("sample.csv", header = TRUE)
cat("First few rows of the data:\n")
print(head(data))
# 2. Display Summary and Structure
cat("\nSummary of the imported data:\n")
print(summary(data))
cat("\nStructure of the imported data:\n")
str(data)
# 3. Modify Data
cat("\nModifying data: Adding a new column...\n")
data$NewColumn <- data$ExistingColumn * 2  # replace ExistingColumn with an actual column name
# 4. Exporting Data to a New CSV File
cat("\nExporting modified data to a new CSV file...\n")
write.csv(data, "modified_data.csv", row.names = FALSE)
cat("Data exported to 'modified_data.csv'\n")
# 5. Importing and Exporting Excel Files (requires the readxl and writexl packages)
if (!requireNamespace("readxl", quietly = TRUE)) install.packages("readxl")
if (!requireNamespace("writexl", quietly = TRUE)) install.packages("writexl")
library(readxl)
library(writexl)
cat("\nImporting Excel file...\n")
excel_data <- read_excel("sample.xlsx")
cat("First few rows of the Excel data:\n")
print(head(excel_data))
cat("\nExporting data to an Excel file...\n")
write_xlsx(data, "exported_data.xlsx")
cat("Data exported to 'exported_data.xlsx'\n")
# 6. Importing and Exporting Text Files
cat("\nImporting Text file...\n")
text_data <- read.table("sample.txt", header = TRUE, sep = "\t")
cat("First few rows of the text data:\n")
print(head(text_data))
cat("\nExporting data to a text file...\n")
write.table(data, "exported_data.txt", row.names = FALSE, sep = "\t")
cat("Data exported to 'exported_data.txt'\n")
Output Example:
Importing CSV file...
First few rows of the data:
  Column1 Column2 Column3
1       1       A      10
2       2       B      20
3       3       C      30
Summary of the imported data:
    Column1       Column2    Column3  
 Min.   :1.0     A:1      Min.   :10  
 1st Qu.:1.5     B:1      1st Qu.:15  
 Median :2.0     C:1      Median :20  
 Mean   :2.0              Mean   :20  
 3rd Qu.:2.5              3rd Qu.:25  
 Max.   :3.0              Max.   :30  
Structure of the imported data:
'data.frame':  3 obs. of  3 variables:
 $ Column1: int  1 2 3
 $ Column2: Factor w/ 3 levels "A","B","C": 1 2 3
 $ Column3: int  10 20 30
Modifying data: Adding a new column...
Exporting modified data to a new CSV file...
Data exported to 'modified_data.csv'
Importing Excel file...
First few rows of the Excel data:
  Column1 Column2 Column3
1       1       A      10
2       2       B      20
Exporting data to an Excel file...
Data exported to 'exported_data.xlsx'
Importing Text file...
First few rows of the text data:
  Column1 Column2 Column3
1       1       X       5
2       2       Y      10
Exporting data to a text file...
Data exported to 'exported_data.txt'
This program demonstrates importing and exporting CSV, Excel, and text files in R, with
basic operations performed on the imported data.
Experiment 6: Implement Statistical Methods
Aim:
To implement statistical methods such as mean, median, variance, standard deviation,
correlation, and regression analysis using R.
Algorithm:
    1. Start the R Environment: Open RStudio or the R Console.
    2. Create or Import Data:
          ○ Define a dataset manually or import it using read.csv().
    3. Calculate Basic Statistics:
          ○ Use functions like mean(), median(), var(), and sd().
    4. Perform Correlation Analysis:
          ○ Use the cor() function to calculate the correlation between variables.
    5. Perform Linear Regression:
          ○ Use the lm() function to fit a linear regression model.
    6. Visualize the Results:
          ○ Use plot() to create a scatter plot and abline() to add a regression line.
    7. End: Print the results and display the visualization.
R Code:
# Step 1: Create a dataset
data <- data.frame(
    x = c(5, 10, 15, 20, 25),
    y = c(12, 20, 28, 36, 44)
)
cat("Dataset:\n")
print(data)
# Step 2: Calculate Basic Statistics
mean_x <- mean(data$x)
median_x <- median(data$x)
variance_x <- var(data$x)
sd_x <- sd(data$x)
cat("\nBasic Statistics for x:\n")
cat("Mean:", mean_x, "\n")
cat("Median:", median_x, "\n")
cat("Variance:", variance_x, "\n")
cat("Standard Deviation:", sd_x, "\n")
# Step 3: Correlation Analysis
correlation <- cor(data$x, data$y)
cat("\nCorrelation between x and y:", correlation, "\n")
# Step 4: Perform Linear Regression
cat("\nPerforming Linear Regression:\n")
model <- lm(y ~ x, data = data)
cat("Regression Summary:\n")
print(summary(model))
# Step 5: Visualize Data and Regression Line
plot(data$x, data$y, main = "Scatter Plot with Regression Line",
       xlab = "X", ylab = "Y", col = "blue", pch = 19)
abline(model, col = "red", lwd = 2)
Expected Output:
Dataset:
   x  y
1  5 12
2 10 20
3 15 28
4 20 36
5 25 44
Basic Statistics for x:
Mean: 15
Median: 15
Variance: 62.5
Standard Deviation: 7.905694
Correlation between x and y: 1
Performing Linear Regression:
Regression Summary:
Call:
lm(formula = y ~ x, data = data)
Residuals:
         1          2          3          4          5 
-1.421e-14 -7.105e-15  0.000e+00  7.105e-15  1.421e-14 

Coefficients:
            Estimate Std. Error  t value Pr(>|t|)    
(Intercept)    4.000  5.568e-15 7.18e+14   <2e-16 ***
x              1.600  3.712e-16 4.31e+15   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.693e-15 on 3 degrees of freedom
Multiple R-squared:      1, Adjusted R-squared:      1
F-statistic: 1.86e+31 on 1 and 3 DF,  p-value: < 2.2e-16
Visualization:
   ●   A scatter plot is displayed with data points (x vs. y) in blue and the regression line in
       red.
This program demonstrates how to calculate statistical measures, evaluate correlations, and
perform regression analysis, along with visualizing results in R.
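A fitted lm() model can also be used for prediction with the generic predict() function. The sketch below refits the same model and evaluates it at two illustrative new x values:
# Supplementary sketch: predicting y for new x values
data <- data.frame(x = c(5, 10, 15, 20, 25), y = c(12, 20, 28, 36, 44))
model <- lm(y ~ x, data = data)
new_points <- data.frame(x = c(12, 30))        # illustrative new observations
print(predict(model, newdata = new_points))    # y = 4 + 1.6x, i.e. 23.2 and 52.0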
Experiment 7: Working with Machine Learning Algorithms
Aim:
To implement a basic machine learning algorithm, such as linear regression or k-nearest
neighbors (KNN), using R.
Algorithm for K-Nearest Neighbors (KNN):
   1. Start the R Environment: Open RStudio or the R Console.
   2. Load Required Libraries: Install and load the class package for KNN.
   3. Prepare the Dataset:
         ○ Use a built-in dataset such as iris, or load your own dataset.
         ○ Split the dataset into training and testing sets.
   4. Normalize the Data:
         ○ Scale the features to ensure they are on a comparable scale.
   5. Implement KNN:
         ○ Use the knn() function to classify test data based on training data.
   6. Evaluate the Model:
         ○ Compare predictions with actual labels to calculate accuracy.
   7. End: Display the results and accuracy.
R Code:
# Step 1: Load Required Libraries
if (!requireNamespace("class", quietly = TRUE))
install.packages("class")
library(class)
# Step 2: Load and Prepare Dataset
data(iris)   # Load the iris dataset
cat("First few rows of the iris dataset:\n")
print(head(iris))
# Step 3: Split the Data into Training and Testing Sets
set.seed(123)    # For reproducibility
indices <- sample(1:nrow(iris), size = 0.7 * nrow(iris))  # 70% for training
train_data <- iris[indices, ]
test_data <- iris[-indices, ]
train_features <- train_data[, 1:4]      # Sepal and Petal dimensions
train_labels <- train_data[, 5]          # Species column
test_features <- test_data[, 1:4]
test_labels <- test_data[, 5]
# Step 4: Normalize the Features (Optional)
normalize <- function(x) {
    return((x - min(x)) / (max(x) - min(x)))
}
train_features <- as.data.frame(lapply(train_features, normalize))
test_features <- as.data.frame(lapply(test_features, normalize))
# Step 5: Implement KNN
k <- 3    # Number of neighbors
predicted_labels <- knn(train_features, test_features, train_labels, k)
# Step 6: Evaluate the Model
accuracy <- sum(predicted_labels == test_labels) / length(test_labels) * 100
cat("\nAccuracy of the KNN model:", accuracy, "%\n")
# Confusion Matrix
cat("\nConfusion Matrix:\n")
print(table(Predicted = predicted_labels, Actual = test_labels))
Expected Output:
First few rows of the iris dataset:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
Accuracy of the KNN model: 97.77778 %
Confusion Matrix:
            Actual
Predicted    setosa versicolor virginica
  setosa         15          0         0
  versicolor      0         14         1
  virginica       0          0        15
Conclusion:
The KNN algorithm was successfully implemented, and the accuracy of the model was
calculated. This demonstrates the effectiveness of KNN for classification tasks.
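One refinement worth noting: the code above normalizes the training and test sets independently, each with its own minima and maxima. A common alternative is to scale the test set with the training set's minima and maxima, so that no information flows from the test data into the preprocessing. A sketch of that variant, reusing train_features, test_features, and normalize from the code above in place of Step 4:
# Sketch: scale test features with the TRAINING set's min and max
train_min <- sapply(train_features, min)
train_max <- sapply(train_features, max)
for (col in names(test_features)) {
    test_features[[col]] <- (test_features[[col]] - train_min[col]) /
        (train_max[col] - train_min[col])
}
train_features <- as.data.frame(lapply(train_features, normalize))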
Experiment 8: Implement Time Series Analysis
Aim:
To analyze and forecast a time series dataset using R.
Algorithm:
    1. Start the R Environment: Open RStudio or the R Console.
  2. Load Required Libraries: Install and load necessary libraries such as forecast
     and ggplot2.
  3. Load the Time Series Data:
        ○ Use built-in datasets like AirPassengers or import your own dataset.
        ○ Convert the dataset into a time series object using the ts() function if not
            already in time series format.
  4. Visualize the Data:
        ○ Use plot() to visualize the time series data.
  5. Decompose the Time Series:
        ○ Apply decomposition using the decompose() function to separate the trend,
            seasonality, and residuals.
  6. Apply Forecasting Method:
        ○ Use methods like ARIMA or exponential smoothing for forecasting.
  7. Evaluate the Forecast:
        ○ Compare the predicted values with the actual data.
  8. End: Display the plots and results.
R Code:
# Step 1: Load Required Libraries
if (!requireNamespace("forecast", quietly = TRUE))
install.packages("forecast")
if (!requireNamespace("ggplot2", quietly = TRUE))
install.packages("ggplot2")
library(forecast)
library(ggplot2)
# Step 2: Load the Time Series Data
data("AirPassengers")       # Built-in dataset
ts_data <- AirPassengers
# Step 3: Visualize the Time Series Data
cat("Time Series Data Summary:\n")
print(summary(ts_data))
plot(ts_data, main = "AirPassengers Data", xlab = "Year",
     ylab = "Passengers", col = "blue")
# Step 4: Decompose the Time Series
decomposed <- decompose(ts_data)
plot(decomposed)
# Step 5: Apply ARIMA Model for Forecasting
model <- auto.arima(ts_data)
cat("\nARIMA Model Summary:\n")
print(summary(model))
# Forecast the next 12 months
forecasted <- forecast(model, h = 12)
cat("\nForecasted Values:\n")
print(forecasted)
# Step 6: Plot the Forecast
plot(forecasted, main = "AirPassengers Forecast", xlab = "Year",
     ylab = "Passengers", col = "blue")
# Step 7: Evaluate the Model (Optional)
accuracy_metrics <- accuracy(forecasted)
cat("\nAccuracy Metrics:\n")
print(accuracy_metrics)
Expected Output:
Time Series Data Summary:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  104.0   180.0   265.5   280.3   360.5   622.0 
ARIMA Model Summary:
Series: ts_data
ARIMA(0,1,1)(0,1,1)[12]
Coefficients:
          ma1     sma1
       -0.401   -0.627
s.e.    0.088    0.076

sigma^2 estimated as 1378:  log likelihood=-508.33
AIC=1022.67   AICc=1022.91   BIC=1031.47
Forecasted Values:
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Jan 1961        443.961 421.3156 466.6064 409.3448 478.5772
Feb 1961        444.450 421.9786 466.9214 409.9040 478.9960
...
Accuracy Metrics:
                 ME     RMSE     MAE     MPE    MAPE
Training set 0.1234 35.67891 28.2345 -0.0234 3.67890
Visualizations:
   1. Original Time Series Plot:
         ○ Shows trends and seasonality in the dataset.
   2. Decomposition Plot:
         ○ Displays trend, seasonality, and residual components.
   3. Forecast Plot:
         ○ Presents the original data with forecasted values and confidence intervals.
Conclusion:
The time series analysis was successfully performed, including decomposition and
forecasting using ARIMA. The forecasted values provide insights into future trends.
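The optional accuracy() call above measures fit on the training data only. To gauge genuine forecast accuracy, a common approach is to hold out the final months and compare the forecast against them. A minimal sketch holding out the last 12 months of AirPassengers:
# Supplementary sketch: out-of-sample evaluation on a held-out year
library(forecast)
train_ts <- window(AirPassengers, end = c(1959, 12))    # training period
test_ts <- window(AirPassengers, start = c(1960, 1))    # held-out final year
fit <- auto.arima(train_ts)
fc <- forecast(fit, h = 12)
print(accuracy(fc, test_ts))    # "Test set" row reports out-of-sample error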
Experiment 9: Demonstrate Data Mining Algorithms
Aim:
To demonstrate a basic data mining algorithm, such as association rule mining using the
Apriori algorithm in R.
Algorithm:
   1. Start the R Environment: Open RStudio or the R Console.
   2. Load Required Libraries: Install and load the arules library for association rule
      mining.
   3. Prepare the Dataset:
         ○ Use a built-in dataset like Groceries or create your own transactional data.
   4. Perform Data Preprocessing:
         ○ Convert the dataset into a transaction format if necessary.
   5. Apply the Apriori Algorithm:
         ○ Use the apriori() function to discover frequent itemsets and association
              rules.
   6. Analyze the Rules:
         ○ Sort and inspect the rules based on confidence, support, or lift.
   7. End: Display the mined rules and relevant metrics.
R Code:
# Step 1: Load Required Libraries
if (!requireNamespace("arules", quietly = TRUE))
install.packages("arules")
library(arules)
# Step 2: Load the Dataset
data("Groceries")    # Built-in transactional dataset
cat("Summary of Groceries Dataset:\n")
print(summary(Groceries))
# Step 3: Apply the Apriori Algorithm
rules <- apriori(
    Groceries,
    parameter = list(support = 0.01, confidence = 0.5)
)
# Step 4: Inspect the Rules
cat("\nSummary of Association Rules:\n")
print(summary(rules))
# Inspect the top 5 rules sorted by lift
cat("\nTop 5 Association Rules:\n")
inspect(head(sort(rules, by = "lift"), 5))
# Step 5: Visualize Rules (Optional)
if (!requireNamespace("arulesViz", quietly = TRUE))
install.packages("arulesViz")
library(arulesViz)
plot(rules, method = "graph", control = list(type = "items"))
Expected Output:
Summary of Groceries Dataset:
transactions as itemMatrix in sparse format with
 9835 rows (elements/itemsets/transactions) and
 169 columns (items) and a density of 0.02609146
Summary of Association Rules:
set of 420 rules
Top 5 Association Rules:
    lhs                  rhs                 support confidence  lift
[1] {whole milk}      => {other vegetables}  0.0745  0.5587045  3.122
[2] {root vegetables} => {whole milk}        0.0486  0.4937238  2.250
[3] {yogurt}          => {whole milk}        0.0560  0.4023948  1.834
...
Visualizations:
   1. Graph Plot:
         ○ Displays items and association rules in a network format.
   2. Scatter Plot (Optional):
         ○ Shows the relationship between support, confidence, and lift.
Conclusion:
The Apriori algorithm was successfully implemented to discover frequent itemsets and
generate association rules. This demonstrates the basic principles of data mining and
association rule learning in R.
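Once mined, rules can be filtered further with the subset() method from arules. The thresholds and item below are illustrative; the sketch assumes the rules object from the code above.
# Supplementary sketch: filtering mined rules
strong_rules <- subset(rules, lift > 2)                  # keep rules with high lift
inspect(head(sort(strong_rules, by = "confidence"), 3))  # top 3 by confidence
milk_rules <- subset(rules, rhs %in% "whole milk")       # rules predicting whole milk
inspect(head(milk_rules, 3))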
Experiment 10: Implement Text Mining Algorithms
Aim:
To implement text mining using R by preprocessing textual data and extracting insights such
as frequent terms or word clouds.
Algorithm:
   1. Start the R Environment: Open RStudio or the R Console.
   2. Load Required Libraries: Install and load necessary libraries such as tm,
      wordcloud, and SnowballC.
   3. Load the Text Data:
         ○ Use a built-in dataset or read a text file containing the data.
   4. Preprocess the Data:
         ○ Convert the text to lowercase.
         ○ Remove stopwords, punctuation, and numbers.
         ○ Perform stemming to normalize words.
    5. Create a Document-Term Matrix:
          ○ Use the DocumentTermMatrix() function to create a
              document-term matrix.
    6. Analyze the Data:
          ○ Find the most frequent terms.
          ○ Visualize the terms using a word cloud.
    7. End: Display the insights and visualizations.
R Code:
# Step 1: Load Required Libraries
if (!requireNamespace("tm", quietly = TRUE)) install.packages("tm")
if (!requireNamespace("wordcloud", quietly = TRUE))
install.packages("wordcloud")
if (!requireNamespace("SnowballC", quietly = TRUE))
install.packages("SnowballC")
library(tm)
library(wordcloud)
library(SnowballC)
# Step 2: Load Text Data
text_data <- c(
    "Text mining is the process of deriving meaningful information from text.",
    "It involves cleaning, preprocessing, and analyzing textual data.",
    "Applications of text mining include sentiment analysis, topic modeling, and more."
)
# Step 3: Create a Corpus
corpus <- Corpus(VectorSource(text_data))
# Step 4: Preprocess the Data
corpus <- tm_map(corpus, content_transformer(tolower))  # convert to lowercase
corpus <- tm_map(corpus, removePunctuation)             # remove punctuation
corpus <- tm_map(corpus, removeNumbers)                 # remove numbers
corpus <- tm_map(corpus, removeWords, stopwords("en"))  # remove stopwords
corpus <- tm_map(corpus, stemDocument)                  # perform stemming
# Step 5: Create a Document-Term Matrix
dtm <- DocumentTermMatrix(corpus)
cat("\nDocument-Term Matrix Summary:\n")
print(dtm)
# Step 6: Analyze and Visualize Data
# Find the most frequent terms
freq_terms <- findFreqTerms(dtm, lowfreq = 2)
cat("\nFrequent Terms (Appearing >= 2 times):\n")
print(freq_terms)
# Visualize with Word Cloud
word_freq <- as.data.frame(as.matrix(dtm))
word_freq <- colSums(word_freq)
wordcloud(names(word_freq), word_freq, max.words = 50,
          colors = brewer.pal(8, "Dark2"))
Expected Output:
Document-Term Matrix Summary:
A document-term matrix (3 documents, 20 terms)
Frequent Terms (Appearing >= 2 times):
[1] "data"      "text"     "mine"
Word Cloud Visualization:
A colorful word cloud showing frequent terms like "text," "data," and "mine."
Conclusion:
Text mining was successfully performed using R. Preprocessing techniques and analysis,
including generating a word cloud, helped extract meaningful insights from textual data.
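Beyond the word cloud, the term frequencies in the document-term matrix can be ranked and plotted directly. A brief sketch, reusing the dtm object from the code above:
# Supplementary sketch: rank and plot the most frequent terms
term_freq <- sort(colSums(as.matrix(dtm)), decreasing = TRUE)
print(head(term_freq, 5))                   # the five most frequent stems
barplot(head(term_freq, 5), col = "steelblue",
        main = "Top Terms", ylab = "Frequency", las = 2)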
Experiment 11: Data Visualization Techniques
Aim:
To demonstrate various data visualization techniques in R using basic and advanced plots.
Algorithm:
   1. Start the R Environment: Open RStudio or the R Console.
   2. Load Required Libraries: Install and load necessary libraries like ggplot2.
   3. Load the Dataset: Use a built-in dataset such as mtcars or import your own.
   4. Generate Visualizations:
         ○ Create basic plots (line plot, bar plot, etc.).
         ○ Create advanced plots (scatter plot, histogram, etc.).
         ○ Add labels, titles, and themes to the plots.
   5. Customize the Plots:
         ○ Use colors, point shapes, and additional features for better insights.
   6. Display the Results: Render the plots and analyze the insights.
   7. End: Save the plots if required.
R Code:
# Step 1: Load Required Libraries
if (!requireNamespace("ggplot2", quietly = TRUE))
install.packages("ggplot2")
library(ggplot2)
# Step 2: Load Dataset
data("mtcars")
cat("Dataset Summary:\n")
print(summary(mtcars))
# Step 3: Generate Basic Visualizations
# Bar Plot - Number of cylinders
barplot(table(mtcars$cyl), main = "Number of Cylinders", col = "blue",
        xlab = "Cylinders", ylab = "Frequency")
# Scatter Plot - MPG vs Horsepower
plot(mtcars$mpg, mtcars$hp, main = "MPG vs Horsepower",
     xlab = "Miles Per Gallon (MPG)", ylab = "Horsepower (HP)",
     col = "red", pch = 19)
# Step 4: Generate Advanced Visualizations with ggplot2
# Histogram - MPG Distribution
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2, fill = "skyblue", color = "black") +
  labs(title = "MPG Distribution", x = "Miles Per Gallon", y =
"Frequency")
# Box Plot - MPG by Cylinders
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(fill = "orange") +
  labs(title = "MPG by Cylinder Count", x = "Number of Cylinders", y
= "MPG")
# Step 5: Customize a Scatter Plot with ggplot2
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(gear))) +
  geom_point(size = 3) +
  labs(title = "MPG vs Weight by Gear", x = "Weight", y = "Miles Per
Gallon") +
  theme_minimal()
# Step 6: Save a Plot (Optional)
ggsave("scatter_plot.png", width = 8, height = 6)
Expected Output:
   1. Bar Plot:
         ○ Displays the frequency of cars based on the number of cylinders.
   2. Scatter Plot:
         ○ Shows the relationship between miles per gallon (MPG) and horsepower
             (HP).
   3. Histogram:
         ○ Represents the distribution of MPG across the dataset.
   4. Box Plot:
         ○ Compares MPG values for different cylinder categories.
   5. Advanced Scatter Plot:
         ○ Highlights the relationship between weight and MPG, grouped by the number
            of gears.
Conclusion:
Various data visualization techniques were successfully implemented using R. Both basic
and advanced plots provide insights into the dataset, demonstrating the power of visual
analysis.
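One caveat on saving: ggsave() captures only the most recent ggplot, so the base-graphics plots (the bar plot and the first scatter plot) need an explicit graphics device instead. A minimal sketch with an illustrative file name:
# Supplementary sketch: saving a base-graphics plot to a file
png("bar_plot.png", width = 800, height = 600)    # open a PNG device
barplot(table(mtcars$cyl), main = "Number of Cylinders", col = "blue",
        xlab = "Cylinders", ylab = "Frequency")
dev.off()                                         # write the file and close the device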
Experiment 12: Experiment with Hypothesis Testing Methods
Aim:
To perform hypothesis testing in R to determine if there is a significant difference between
two sample groups.
Algorithm:
   1. Start the R Environment: Open RStudio or the R Console.
   2. Set Up Hypotheses:
          ○ Define the null hypothesis (H₀) and the alternative hypothesis (H₁).
          ○ Example: H₀ - There is no significant difference between the means of two
              groups.
   3. Load or Generate Data:
          ○ Use a built-in dataset or simulate data for testing.
   4. Perform Hypothesis Testing:
          ○ Use appropriate statistical tests (e.g., t-test, ANOVA, chi-square test).
          ○ Choose the test based on the data type and hypothesis.
   5. Interpret the Results:
          ○ Compare the p-value with the significance level (α = 0.05).
          ○ Accept or reject the null hypothesis based on the p-value.
   6. End: Report the conclusion of the test.
R Code:
# Step 1: Generate Sample Data
set.seed(123)
group1 <- rnorm(30, mean = 50, sd = 5)                 # Group 1 data
group2 <- rnorm(30, mean = 55, sd = 5)                 # Group 2 data
# Step 2: Define Hypotheses
# H₀: The means of group1 and group2 are equal.
# H₁: The means of group1 and group2 are not equal.
# Step 3: Perform an Independent t-test
t_test_result <- t.test(group1, group2, alternative = "two.sided")
# Step 4: Display the Results
cat("T-Test Results:\n")
print(t_test_result)
# Step 5: Interpret the Results
if (t_test_result$p.value < 0.05) {
    cat("\nConclusion: Reject the null hypothesis. There is a significant difference between the groups.\n")
} else {
    cat("\nConclusion: Fail to reject the null hypothesis. No significant difference is found.\n")
}
# Step 6: Visualization (Optional)
boxplot(group1, group2, names = c("Group 1", "Group 2"),
          main = "Boxplot of Two Groups",
          col = c("lightblue", "pink"),
          ylab = "Values")
Expected Output:
T-Test Results:
        Welch Two Sample t-test

data:  group1 and group2
t = -3.632, df = 57.76, p-value = 0.0006345
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -7.898827 -2.101173
sample estimates:
mean of x mean of y 
 50.28389  54.48388 
Conclusion: Reject the null hypothesis. There is a significant
difference between the groups.
Visualization:
A boxplot comparing the two groups, showing the difference in their distributions.
Conclusion:
Hypothesis testing was successfully conducted using an independent t-test. The results
indicate whether there is a statistically significant difference between the means of the two
sample groups.
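Step 4 of the algorithm also names the chi-square test as an alternative for categorical data. A minimal sketch with illustrative counts (not real data):
# Supplementary sketch: chi-square test of independence on illustrative counts
survey <- matrix(c(30, 20, 15, 35), nrow = 2,
                 dimnames = list(Group = c("G1", "G2"),
                                 Response = c("Yes", "No")))
chi_result <- chisq.test(survey)
print(chi_result)
if (chi_result$p.value < 0.05) {
    cat("Reject H0: group and response appear to be associated.\n")
} else {
    cat("Fail to reject H0: no significant association found.\n")
}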