Week1 1 Rev

This document provides an overview of an introductory course on R programming for business analytics. The course will introduce students to R and its uses and benefits for data analysis. Students will learn key R skills like data manipulation, visualization, and reporting. Example applications for business like analyzing online dating data are described. The course will cover importing, cleaning, summarizing, and visualizing different types of data files in R. Assessments will include homework, midterm, final exam, and participation.


ISOM 3390 Business Programming in R

Week 1 Introduction to R programming


5 September 2022

Instructor: Hyungsoo Lim



Plan for today

• What is R?
• Why do we need to learn R programming for business analytics?
• What will you learn from this course?
• Summary of key class details



What is R?

• It’s open-source
• It’s free: no license fees are required
• It’s platform-independent: R runs on all major operating systems (Windows, macOS, Linux)
• It has more than 10,000 packages
• It’s useful for visualizing datasets
• It offers the latest cutting-edge techniques (e.g., machine learning, natural language processing)
• It’s a practical tool that could help you get your future job



The Popularity of Data Science Software

▪ Number of Google Scholar hits

• Source: The Popularity of Data Science Software | r4stats.com



Who uses R?

▪ List of companies that use R

• Source: careerkarma.com



Popularity of programming language index

• PYPL PopularitY of Programming Language index



Business analytics with R



What will you learn from this course?



Learning outcomes

• Develop proficiency in R programming
• Understand data structures and data manipulation
• Import data in various formats into R using RStudio
• Summarize data using descriptive statistics and statistical plots
• Apply effective techniques for data visualization and communication
• Analyze text-based data
• Use R Markdown to write reports



Work experience with R

▪ List of projects with R

• Online dating platform (published in Information Systems Research)
• Multi-country and multi-generation diffusion of SIM cards (published in Technological Forecasting and Social Change)
• Hotel booking behavior (working paper)
• Social media contagion (working paper)
✓ Text analytics
• Lexicographic preference (working paper)
✓ Survey data
• Emergent text messages (working paper)
• Multiple collaborations with industrial companies

R can do many things!



Course overview with an example

▪ Real example
• Collaboration with an online dating platform in South Korea

✓ Need to suggest a new strategy to maximize profit

✓ Open to any suggestions

✓ Received datasets from multiple sources

• User dataset: user_id, gender, attraction score, job, religion, tenure, etc. (200 MB)
• Notification dataset: time instant of notification, user_id_sent, user_id_received, etc. (14 GB)
• Matching information dataset: user_id, user_id_matched, action, time, etc. (13 GB)
• In-app currency dataset: user_id, purchase amount, etc. (2 GB)

What would be your first step?



Course overview with an example

▪ Real example
• Check the types of variables in the datasets
✓ The datasets contain many types of variables
✓ Numeric variables (e.g., age, attraction score)
✓ String variables (e.g., occupation)
✓ List variables (e.g., religion, preferred age)
• Example:
• Religion – 1: Catholic, 2: Buddhism, etc.
• Preferred age – 1: 4 years older than the user, 2: 2 and 3 years older than the user, etc.

This will be covered in weeks 1 and 2
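
A minimal sketch of checking variable types in base R. The data frame `users`, its column names, and all values are hypothetical stand-ins for the real user dataset. Storing the coded religion variable as a labelled factor is an assumption about how such codes might be handled, not necessarily the approach taken in the course.

```r
# Hypothetical miniature of the user dataset; all values are made up.
users <- data.frame(
  user_id          = c(101, 102, 103),
  age              = c(27, 31, 24),                      # numeric
  attraction_score = c(3.8, 4.2, 2.9),                   # numeric
  occupation       = c("teacher", "engineer", "nurse"),  # string (character)
  religion         = c(1, 2, 1),                         # numeric codes
  stringsAsFactors = FALSE
)

# Inspect the type of every column at once.
str(users)
sapply(users, class)

# Assumption: attach the labels from the codebook to the numeric codes.
users$religion <- factor(users$religion,
                         levels = c(1, 2),
                         labels = c("Catholic", "Buddhism"))

# Count how many users fall into each category.
table(users$religion)
```

`str()` gives a compact overview of a whole data frame, which is often a sensible first step before any cleaning or analysis.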



Course overview with an example

▪ Real example
• Need to clean and organize the datasets
✓ R provides several packages for organizing datasets

• In week 3, you will learn
✓ Basic logic of R
✓ How to use packages
✓ How to import data

• In week 6, you will learn
✓ How to deal with messy datasets (i.e., data wrangling)
✓ How to merge multiple datasets
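
A small base-R sketch of cleaning and merging two datasets. The data frames `users` and `purchases`, their columns, and all values are hypothetical. The course may well use add-on packages for this step, so treat the functions below as one possible approach.

```r
# Hypothetical extracts of two of the datasets; all values are made up.
users <- data.frame(
  user_id = c(1, 2, 3),
  gender  = c("F", "M", "F"),
  stringsAsFactors = FALSE
)
purchases <- data.frame(
  user_id         = c(1, 1, 3, 3, 3),
  purchase_amount = c(10, 5, 20, NA, 8)
)

# Cleaning step: drop rows with a missing purchase amount.
purchases_clean <- purchases[!is.na(purchases$purchase_amount), ]

# Total spend per user.
spend <- aggregate(purchase_amount ~ user_id, data = purchases_clean, FUN = sum)

# Left join: keep every user, even those without any purchase.
merged <- merge(users, spend, by = "user_id", all.x = TRUE)

# Users without purchases get NA after the join; treat that as zero spend.
merged$purchase_amount[is.na(merged$purchase_amount)] <- 0
merged
```

Setting `all.x = TRUE` keeps users who never purchased anything. Without it, `merge()` would silently drop them.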



Course overview with an example

▪ Real example
• Check statistics and distributions
✓ To understand the data
✓ To get ideas for the analysis

• In weeks 4 and 5, you will learn
✓ How to compute summary statistics
✓ How to generate statistical graphs
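
A short base-R sketch of summary statistics and statistical graphs. The vectors `score` and `gender` and their values are hypothetical.

```r
# Hypothetical attraction scores and genders; values are made up.
score  <- c(2.5, 3.0, 3.5, 4.0, 4.5, 3.0, 5.0)
gender <- c("F", "M", "F", "F", "M", "M", "F")

# Summary statistics
mean(score)
median(score)
sd(score)
summary(score)   # minimum, quartiles, mean, and maximum in one call

# Frequency table of a categorical variable
table(gender)

# Statistical graphs, drawn to the active graphics device
hist(score, main = "Distribution of attraction score", xlab = "Score")
boxplot(score ~ gender, ylab = "Score")
```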



Examples of charts



Course overview with an example

▪ Real example
• Provide charts and graphs
✓ To persuade the online dating platform

• In week 9, you will learn
✓ How to visualize datasets



Importance of data visualization

• Source: Andrew Heiss, “Data Visualization with R”



Wrong example of visualization



Wrong example of visualization



Web scraping



Wordcloud



R Markdown

• R Markdown provides a productive notebook interface and a unified authoring framework that weave together narrative text and code to produce elegantly formatted output
• R Markdown documents are fully reproducible and support dozens of output formats, such as PDFs, Word files, slideshows, and more
• An R Markdown file is a plain-text file with the extension .Rmd



R Markdown

▪ Example
• Source: rmarkdown-cheatsheet (rstudio.com)
• We have data about 53,940 diamonds
• Only 126 are larger than 2.5 carats
• The distribution is shown below



Class details

▪ Course schedule



Class details

▪ Contacting us
• Email
✓ Instructor: [email protected]
✓ TA: [email protected]
✓ Begin the subject line with [ISOM3390]…

• Office hours: by appointment only

▪ Course website
• http://canvas.ust.hk



Class evaluations

▪ Participation (10%)
• In-class participation (10%)
✓ Students are expected to attend the lectures (5%)
✓ Students are also expected to attend the lab sessions (5%)

• Absences can be excused ONLY with
✓ a doctor’s note for an illness
✓ a note from a university authority documenting participation in a university-sponsored activity
✓ a quarantine order from DH



Class evaluations

▪ Homework (20%)
• There will be hands-on homework assignments (using the Canvas website)

• Assignments will be graded and returned promptly

• The due date of each homework assignment will be announced upon its release on Canvas



Class evaluations

▪ Mid-term exam (25%) and final exam (25%)

• The exams will be based on the topics and related concepts taught in class
✓ The mid-term exam will cover material from the first half of the course
✓ The final exam will cover material from the second half of the course
• All examinations will be closed-book and closed-notes, with no devices allowed
• Do not miss the exams: as a rule, there will be NO make-up for either the mid-term or the final examination
✓ If you have to miss the mid-term exam due to extraordinary circumstances such as unexpected hospitalization or loss of a family member, please let me (cc the TA) know as soon as you can and see me with a doctor’s note and/or other verifiable and valid evidence
✓ Only under such extraordinary circumstances will a make-up mid-term exam be arranged for you, and it will include additional essay questions and/or an oral examination
✓ There is NO make-up for the final examination
• In all other cases, there will be no make-up exam if you miss an exam, and you will automatically receive 0 points for that exam
✓ Time conflicts with job interviews, other tests, travel plans, or social obligations, or any other domestic, social, financial, religious, or geopolitical situation, etc. will NOT be considered. There will be NO exceptions to this rule

Class evaluations

▪ Group projects (20%)

• You will be assigned to small groups (at most 3 students) to work on a final group project
• You will select a project topic from those provided by the instructor
• Each group will cooperate on writing code, documenting it, writing a report, and presenting the project at the end of the semester
• One component (10% out of 20%) of your final project grade will be based on your teammates’ assessment of your contribution to the project
✓ Typically, all members of a group receive the same grade for the group project
✓ However, I will moderate individual students’ group project grades based on peer evaluations
✓ Students who perform exceedingly well in their peer evaluations could receive higher group project grades than their group mates
✓ Conversely, students who do badly in their peer evaluations will receive lower group project grades



Class evaluations

▪ Late policy
• A 20% penalty will be applied for each day, or part of a day, that an assignment is late
• For instance,
✓ if you are 1 day late → you receive 80% of your points for the submission
✓ if you are 2 days late → you receive 60% (a reduction of 2 × 20%) of your points for the submission
✓ if you are 5 days late → the penalty reaches 100% and the submission earns no points, so you are better off NOT submitting the deliverable
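
The late-policy arithmetic can be sketched as a small R function. The function name `late_multiplier` is hypothetical, and the sketch assumes that the penalty is subtracted from the points earned and can never go below zero, which matches the 80% and 60% examples above.

```r
# Fraction of the earned points that is kept after a late submission.
# days_late may be fractional: any part of a day counts as a full day.
late_multiplier <- function(days_late) {
  days_charged <- ceiling(pmax(days_late, 0))  # round partial days up
  pmax(1 - 0.20 * days_charged, 0)             # 20% per day, floor at zero
}

# On time, 1 day, 1.5 days, 2 days, and 5 days late
late_multiplier(c(0, 1, 1.5, 2, 5))
# 1.0 0.8 0.6 0.6 0.0
```

For example, a submission worth 90 points that arrives 1.5 days late would be charged two days and keep `90 * late_multiplier(1.5)`, that is, 54 points.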



Installing R and the R Console

• We can download R for free from https://cran.r-project.org/
• Interactive data analysis usually takes place in the R console
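
As an illustration of interactive work, the lines below could be typed into the R console one at a time. The variable `views` and its values are hypothetical.

```r
# Use R as a calculator
1 + 2 * 3

# Store values in a variable and reuse them
views <- c(12, 7, 17)
sum(views)
mean(views)

# Check which version of R is running
R.version.string

# Open the help page for a function (uncomment to try it)
# ?mean
```

The console evaluates each line as soon as it is entered and prints the result, which makes it convenient for quick experiments.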



Installing RStudio

▪ An Integrated Development Environment for R

• RStudio includes an editor with many R-specific features, a console to execute your code, and other useful panes
• You can download RStudio at https://www.rstudio.com/products/rstudio/download/
• Once RStudio is installed, we can simply start RStudio rather than R, since RStudio automatically starts R
