0% found this document useful (0 votes)
13 views

VBA EXAM

The document outlines a process for loading and merging stock data for IBM from two sources: a CSV file and Yahoo Finance using the quantmod package in R. It includes steps for data cleaning, checking for missing values, and saving the final merged dataset to a new CSV file. The final output is a cleaned dataset with correctly formatted date and numerical columns.

Uploaded by

Mariam mushtaq
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
13 views

VBA EXAM

The document outlines a process for loading and merging stock data for IBM from two sources: a CSV file and Yahoo Finance using the quantmod package in R. It includes steps for data cleaning, checking for missing values, and saving the final merged dataset to a new CSV file. The final output is a cleaned dataset with correctly formatted date and numerical columns.

Uploaded by

Mariam mushtaq
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
You are on page 1/ 2

##---- PART 1 ----

##SET WORKING DIRECTORY


setwd("C:/Users/HP/Desktop")

##LOADING CSV INTO R


data <- read.csv("IBM Bloomberg Data.csv", header = TRUE, sep = ",")

##INSTALLING QUANTMOD
install.packages("quantmod")
library(quantmod)

# Define the stock symbol, start date, and end date


symbol <- "IBM"
start_date <- as.Date("2020-01-01")
end_date <- as.Date("2024-12-31")

# Download stock data using quantmod


getSymbols(symbol, src = "yahoo", from = start_date, to = end_date)

# Convert to data frame


ibm_data <- data.frame(Date = index(IBM), coredata(IBM))

# Save the data as a CSV file


write.csv(ibm_data, file = "IBM_Quantmod_Data.csv", row.names = FALSE)

##---- PART 2 ----

# Load necessary packages


install.packages("tidyverse")
install.packages("dplyr")
library(tidyverse)
library(dplyr)

# Read the CSV files


bloomberg_data <- read.csv("IBM Bloomberg Data.csv")
quantmod_data <- read.csv("IBM_Quantmod_Data.csv")

# Convert the 'Date' columns to Date type


bloomberg_data$Date <- as.Date(bloomberg_data$Date, format = "%d/%m/%Y")
quantmod_data$Date <- as.Date(quantmod_data$Date, format = "%Y-%m-%d")

# Extract the year from the 'Date' columns


bloomberg_data$Year <- format(bloomberg_data$Date, "%Y")
quantmod_data$Year <- format(quantmod_data$Date, "%Y")

# Merge the datasets by 'Year'


merged_data <- merge(bloomberg_data, quantmod_data, by = "Year", all = FALSE)

# Optionally, save the merged dataset to a new CSV file


write.csv(merged_data, "Merged_Data.csv", row.names = FALSE)

# Check for missing values in the entire dataset


missing_summary <- merged_data %>%
summarise(across(everything(), ~ sum(is.na(.))))

# Display columns with missing values, if any


missing_summary[, missing_summary > 0]
# Check if there are any missing values in the entire dataset
any(is.na(data)) # Returns TRUE if there are missing values, otherwise FALSE

# Count total missing values in the entire dataset


sum(is.na(data)) # Returns 0 if there are no missing values

## NO MISSING VALUES WERE FOUND

## ENSURING CORRECT FORMAT

# Read the CSV file


data <- read.csv("Merged_Data.csv", stringsAsFactors = FALSE)

# Check the first few rows and column names


head(data)
str(data)

# Convert date columns to Date format if they exist


if("Date.x" %in% colnames(data)) {
data$Date.x <- as.Date(data$Date.x, format = "%d/%m/%Y")
}
if("Date.y" %in% colnames(data)) {
data$Date.y <- as.Date(data$Date.y, format = "%d/%m/%Y")
}

# Convert numerical columns to numeric if they exist


num_cols <- c("PE_RATIO", "RETURN_ON_ASSET", "ESG_SCORE", "EQY_RAW_BETA",
"Closing_Price", "Volume")
num_cols <- intersect(num_cols, colnames(data)) # Ensure columns exist before
converting
data[num_cols] <- lapply(data[num_cols], as.numeric)

write.csv(merged_data, "Merged_Clean_Data.csv", row.names = FALSE)

You might also like