
Big Data and Analytics - B4E4540101 - 2024


Course Title Big Data and Analytics (SPRING) B4E4540101

Instructor Nora Sharkasi

Course Description This course prepares students to develop skills in visualizing and analyzing big data using datasets from real
business scenarios. The course comprises two main modules: (1) data visualization with Tableau software and (2) prediction and
regression with Python. For each module, students will complete a project in class using one or more datasets.

Objectives Students are expected to explain their understanding of:
- visualization best practices
- data dashboards and presenting findings with Tableau data stories
- the use of different data visualization and analytical tools like Tableau, spreadsheets, and Python
- exploratory analysis and data wrangling for regression-based prediction techniques

Method of Instruction Lectures and lab work

Class Preparation Each week requires an average of 2-3 hours of preparation.

Textbook

Reference Books

Course Outline (individual classes)

1 Course Introduction - lecture
2 Module I: Intro to Tableau and data visualization – lecture & lab work
3 Level I: Foundation for building visualization
4 Components of Tableau: Charts and graphs – lecture and lab work
5 Graphical Visualizations – lecture and lab work
6 Level II: Filtering: discrete, continuous, dates and others
7 Dashboard interface & managing data source
8 Dashboard interactivity – lecture and lab work
9 Comparing values and advanced charts
10 Advanced visualizations – distributions
11 Level III: Venturing with Calculations
12 Table Crosstabs and Highlight
13 Level IV: Storytelling with dashboards
14 Adding actions and interactivity in stories
15 Quiz I; Module II: Prediction analysis
16 Module II: Introduction to Python – lecture & lab
17 Getting Started with Python – lecture and lab work
18 Data encoding and dummy variables – lecture & lab
19 Missing data – lecture & lab
20 Quiz II; Introduction to outliers – lecture
21 Outliers' detection and treatment options – lecture & lab
22 Linear regression – lecture
23 Boston dataset preparation – lab work
24 Linear regression assumptions – lab work
25 Quiz III; Stepwise regression and modeling – lab work
26 Split-test technique in machine learning – lab work
27 Model fitting and validation
   Quiz IV: Friday Jul 14
28 Module III: Optimization with Spreadsheets
29 Setting up optimization problem
30 Making sense of results and assignment submission

Evaluation Ratio

Class Evaluation  Exam: 80%  Report: (blank)  Others: 20%

(Others Evaluation) Details
- Four quizzes: 4 × 20% = 80%
- Assignments = 20%

Assignment solutions will be shared on Moodle with no feedback on an individual basis.

Particular Note Bringing your personal PC is required for all sessions

The information in the course syllabus may be subject to change with notice, as deemed appropriate by the instructor.

Professor Contact Hours Office Hour: Wednesdays from 1:30 - 2:30 on https://us06web.zoom.us/j/87934312910 , Meeting ID: 879 3431 2910
Feel free to send an email to [email protected] to schedule a meeting at a time convenient for you.
Detailed Course Outline
Note: only major assignments are listed
Note: PC required in all classes (no tablets)

Teaching Assistant:

# Class Files Description Preparations before class


1  Tue: Apr 2, Course Introduction – lecture
   Files: 1 BD Course Introduction.pptx
   Description: Course Introduction; Installing Tableau; Importance of data visualization
   Preparations before class: Reading: Syllabus and pptx. Tasks: explore (i) Tableau E-learning, (ii) Tableau Students Program
2  Fri: Apr 5, Module I: Intro to Tableau and data visualization – lecture & lab
   Files: 1 Intro to Tableau.pptx; Superstore.csv; Chapter 1 Starter.twbx
   Description: Level – I: Introduction to Tableau. The Cycle of Analytics; Connecting to Tableau; Managing data source; The interface
   Preparations before class: Reading: pptx (p. 1 – 18). Task – optional: Free Training; Explore Data Literacy; Connecting to a Data File; Database Design Basics
3  Tue: Apr 9, Level – I: Foundation for building visualizations – lecture & lab
   Files: 1 Intro to Tableau.pptx; Superstore.csv; Chapter 1 Starter.twbx
   Description: Level – I: Foundations for building visualizations. Measures and dimensions; Discrete and continuous; Discrete and continuous Dates
   Preparations before class: Reading: pptx (p. 19 – 38)
4  Fri: Apr 12, Components of Tableau: Charts and graphs – lecture & lab
   Files: 1 Intro to Tableau.pptx; Superstore.csv; Chapter 1 Starter.twbx
   Description: Level – I: Charts and graphs. Level of details (LOD); Bar charts – two levels and stacked chart; Line charts – overlapping lines; *Introduction to Python and Anaconda Environment
   Preparations before class: Reading: pptx (p. 39 – 55)
5  Tue: Apr 16, Graphical Visualizations – lecture & lab
   Files: 1 Intro to Tableau.pptx; Superstore.csv; Chapter 1 Starter.twbx
   Description: Level – I: Graphic visualizations. Filled maps; Symbol maps; Density maps
   Preparations before class: Reading: pptx (p. 56 – 70)

6  Fri: Apr 19, Level II: Filtering: discrete, continuous, dates and others – lecture & lab
   Files: 1 Groups Hierarchies and Filtering.pptx; Superstore.csv; Chapter 1 Starter.twbx
   Description: Level – II: Groups, hierarchies. Groups in Tableau; Hierarchies
   Preparations before class: Reading: pptx (p. 1 – 43)
7  Tue: Apr 23, Comparing values and advanced charts – lecture & lab
   Files: 3 Advanced Visualizations Ch 3.pptx; 1 Hospital Visits.xlsx; 2 Hospital Goals.xlsx; Process Times.xlsx
   Description: Level – II: Filtering. Understanding filtering with Tableau; *Getting started with Jupyter Notebook and coding with Python
   Preparations before class: Reading: pptx (p. 1 – 34). I do not think we will have time for the Gantt chart.
8  Fri: Apr 26, Advanced visualizations: distributions – lecture & lab
   Files: 3 Advanced Visualizations Ch 3.pptx; 1 Hospital Visits.xlsx; 2 Hospital Goals.xlsx; Process Times.xlsx
   Description: Level – II: Advanced visualizations – data comparisons. Comparing values (bar-in-bar charts and bullet charts); Dates and time (date parts, date values, and exact dates); Heat map
   Description (continued): Level – II: Advanced visualizations – distributions. Parts of whole; Stacked bars; Tree maps; Area charts; Pie charts
   Preparations before class: Reading: pptx (p. 35 – 50)
9  Tue: Apr 30, Dashboard interface & managing data source – lecture & lab
   Files: 2 Dashboards - with Tableau.pptx; Superstore.csv
   Description: Level – II: Introduction to dashboards. Purpose of a dashboard; Planning a successful dashboard
   Preparations before class: Reading: pptx (p. 1 – 28)
   Fri: May 3 (GW Holiday)

10 Tue: May 7, Dashboard interactivity – lecture & lab
   Files: 2 Dashboards - with Tableau.pptx; 2 Filtering and Dashboard Ch 1 Ch 2.pptx; Superstore.csv
   Description: Level – II: Interactivity and dashboards. Adding interactivity (highlights and filters)
   Preparations before class: Reading: 2 Dashboards - with Tableau.pptx (p. 29 – 30); 2 Filtering and Dashboard Ch 1 Ch 2.pptx (all)
11 Fri: May 10, Level III: Venturing with Calculations – lecture & lab
   Files: 4 Venturing with Calculations Ch 4.pptx; 2 Table Calculations Ch5 - Part I.pptx; Vacation Rentals.xlsx
   Description: Level – III: Introduction to Tableau calculations. Types of calculations; Level of Detail (LoD) calculations
   Preparations before class: Reading: pptx (p. 1 – 33)
12 Tue: May 14, Table Crosstabs and Highlight – lecture & lab
   Files: 4 Venturing with Calculations Ch 4.pptx; 4 Using Calculations.pptx; Vacation Rentals.xlsx
   Description: Level – III: Table Crosstabs and Highlight. Parameters; Ad-hoc calculations; Using Date-Specific Calculations; Working with Aggregations; Using Quick Table Calculations to Analyze Data
   Preparations before class: Reading: pptx (p. 34 – 48)
13 Fri: May 17, Level IV: Storytelling with dashboards – lecture & lab
   Files: Telling Story w Dashboards Ch7 - Part I.pptx; Superstore.csv
   Description: Level – IV: Introduction to Tableau stories. Building the views; Implementing actions; Context filtering; Filter and highlight actions
   Preparations before class: Reading: pptx (p. 1 – 39)
14 Tue: May 21, Adding actions and interactivity in stories – lecture & lab
   Files: Telling Story w Dashboards Ch7 - Part II.pptx; Superstore.csv
   Description: Level – IV: Dashboard example – regional scorecard. Interactivity with the KPI target parameter; Using interactive snapshots of dashboards and views
   Preparations before class: Reading: pptx (p. 1 – 14)

15 Fri: May 24 ***, Quiz I; Prediction analysis
   Files: 1B Intro to data analytics.pptx; A EDA Data Types.pptx; Superstore.csv
   Description: Quiz I; Introduction to prediction science and data types
   Preparations before class: Quiz I – topics from 1 – 10
16 Tue: May 28, Module II: Introduction to Python – lecture & lab
   Files: Python Libraries Introduction.pptx; 2 A Getting Started with Python.pptx
   Description: Module II – Introduction to Python. Pandas library; print( ) function; Advanced Variables Types; Conditional Statements
   Preparations before class: Reading: pptx (p. 1 – 19). Task: create a Google Colab account. If you have not taken a Python course before, please try Codecademy.
17 Fri: May 31, Getting started with Python – lecture & lab
   Files: 2 A Getting Started with Python.pptx
   Description: Module II – Getting started with Python. For and while loops; Functions
   Preparations before class: Reading: pptx (p. 20 – 25)
18 Tue: Jun 4, Data encoding and dummy variables – lecture & lab
   Files: B Dummy and Categorical Variables.pptx; B Example 1 - Data Encoding.xlsx
   Description: Module II (A) – Exploratory data analysis. Dependent and Independent Variables Definition; Modules in Python; Importing libraries and reading a CSV file with Python; Get-dummies method; Concat method; Drop method to finalize; Export the data frame to an xlsx Excel file
   Preparations before class: Reading: pptx (p. 1 – 17)
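
For class 18 (data encoding and dummy variables), the listed steps (read a CSV, get dummies, concat, drop, export) might look like the minimal sketch below. It assumes pandas; the column names, the inline data, and the output file name are made up for illustration and are not taken from the class file.

```python
import io

import pandas as pd

# Hypothetical inline data standing in for a CSV file on disk.
csv_text = """city,size_sqm,price
Tokyo,50,300
Osaka,60,250
Tokyo,70,420
Nagoya,55,200
"""

# Read the CSV into a data frame.
df = pd.read_csv(io.StringIO(csv_text))

# get_dummies: one 0/1 indicator column per category. drop_first=True drops
# the first category (alphabetically) so the indicators are not collinear.
dummies = pd.get_dummies(df["city"], prefix="city", drop_first=True, dtype=int)

# concat: attach the indicator columns next to the original columns.
encoded = pd.concat([df, dummies], axis=1)

# drop: remove the original categorical column to finalize the frame.
encoded = encoded.drop(columns=["city"])

print(encoded)


def export_to_excel(frame: pd.DataFrame, path: str = "encoded.xlsx") -> None:
    """Write the frame to an xlsx file (requires an Excel writer such as openpyxl)."""
    frame.to_excel(path, index=False)
```

Calling `export_to_excel(encoded)` would write the result to disk; it is left uncalled here so the sketch runs without an Excel writer installed.
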
19 Fri: Jun 7, Missing data – lecture & lab
   Files: C D EDA Missing Data.pptx; C Example 2 - Missing Data.xlsx; D Example 3 - Advanced Missing Data.xlsx
   Description: Module II (A) – Missing data. Missing Value Definition; Dealing with Missing Data; Advanced Approaches Dealing with Missing Data
   Preparations before class: Reading: pptx (p. 1 – 21)
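
For class 19 (missing data), two common ways of dealing with missing values are dropping incomplete rows and imputing a typical value. The sketch below assumes pandas and uses invented data; which approaches the slides actually cover is not stated in this syllabus.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column.
df = pd.DataFrame({
    "age": [25, np.nan, 40, 31],
    "income": [300.0, 420.0, np.nan, 380.0],
    "region": ["east", None, "west", "east"],
})

# Count missing values per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Option 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Option 2: impute. Median or mean for numeric columns, mode for categorical.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["income"] = filled["income"].fillna(filled["income"].mean())
filled["region"] = filled["region"].fillna(filled["region"].mode()[0])
print(filled)
```

Dropping is simplest but discards data (here half of the rows); imputation keeps every row at the cost of inserting values that were never observed.
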
20 Tue: Jun 11 ***, Quiz II; Introduction to outliers – lecture
   Files: E Outliers and Visualization.pptx; 1_House_Sales_Data.xlsx
   Description: Quiz II; Introduction to outliers. What is an outlier?
   Preparations before class: Quiz II – topics from 11 – 14. Reading: pptx (p. 1 – 6)

21 Fri: Jun 14, Outliers' detection and treatment options – lecture & lab
   Files: E Outliers and Visualization.pptx; 1_House_Sales_Data.xlsx
   Description: Module II (B) – Outliers. The five number summary; Lower, Upper, and Inner Quartiles
   Preparations before class: Reading: E Outliers and Visualization.pptx (p. 7 – 16)
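
For class 21 (outlier detection), the five-number summary and the quartiles can be computed as in the sketch below. The flagging rule used here is the common 1.5 × IQR fence, which is an assumption: the syllabus does not say which rule the slides use. The prices are invented.

```python
import pandas as pd

# Hypothetical prices with one obviously extreme value.
prices = pd.Series([200, 210, 220, 230, 240, 250, 260, 900], name="price")

# Five-number summary: minimum, lower quartile, median, upper quartile, maximum.
q1, median, q3 = prices.quantile([0.25, 0.5, 0.75])
five_number = {
    "min": prices.min(),
    "q1": q1,
    "median": median,
    "q3": q3,
    "max": prices.max(),
}
print(five_number)

# Interquartile range and the 1.5 * IQR fences (assumed rule).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Flag values outside the fences; one treatment option is to remove them.
is_outlier = (prices < lower_fence) | (prices > upper_fence)
outliers = prices[is_outlier]
cleaned = prices[~is_outlier]
print(outliers)
```
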
22 Tue: Jun 18, Linear regression – lecture
   Files: Predicting House Prices.pptx; 1_House_Sales_Data.xlsx
   Description: Module II (C) – Introduction to prediction. Introduction to Predictions; Outliers' detection and variable transformation in the dataset; Removing Outliers in the dataset
   Preparations before class: Reading: pptx (p. 1 – 35). Task: perform outliers' detection and remove outliers before class
23 Fri: Jun 21, Boston dataset preparation – lab work
   Files: Datacleaning.ipynb; 1_House_Sales_Data.xlsx
   Description: Module II (C) – Dataset preparation. Other ways of cleaning and scaling variables
   Preparations before class: Reading: Datacleaning.ipynb
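
For class 23 (dataset preparation), two widely used ways of scaling variables are standardization and min-max scaling. The sketch below assumes pandas and invented columns; it is not taken from Datacleaning.ipynb, which may use different methods.

```python
import pandas as pd

# Hypothetical numeric features on very different scales.
df = pd.DataFrame({
    "size_sqm": [50.0, 60.0, 70.0, 80.0],
    "rooms": [1.0, 2.0, 3.0, 4.0],
})

# Standardization (z-score): each column gets mean 0 and standard deviation 1.
standardized = (df - df.mean()) / df.std(ddof=0)

# Min-max scaling: each column is mapped onto the range [0, 1].
min_max = (df - df.min()) / (df.max() - df.min())

print(standardized)
print(min_max)
```
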
24 Tue: Jun 25, Linear regression assumptions – lab work
   Files: RegressionAssumptions.ipynb; 1_House_Sales_Data.xlsx
   Description: Module II (C) – Linear Regression Assumptions
   Preparations before class: Reading: RegressionAssumptions.ipynb
25 Fri: Jun 28 ***, Quiz III; Stepwise regression and modeling – lab work
   Files: StepwiseRegression.ipynb; 1_House_Sales_Data.xlsx
   Description: Quiz III; Module II (D) – Stepwise variable selection
   Preparations before class: Quiz III – topics from 15 – 21. Reading: StepwiseRegression.ipynb
26 Tue: Jul 2, Split-test technique in machine learning – lab work
   Files: Split-testMethod.ipynb; 1_House_Sales_Data.xlsx
   Description: Module II (D) – Splitting the dataset for Machine Learning. Split-test technique; Other: cross-validation with k-folds
   Preparations before class: Reading: Split-testMethod.ipynb
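
For class 26 (split-test technique), the idea of a hold-out split and of k-fold cross-validation can be sketched with NumPy index arrays, as below. The 80/20 ratio, k = 5, and the dataset size are assumptions chosen for illustration; Split-testMethod.ipynb may rely on a library helper instead.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed so the split is reproducible
n_rows = 20                           # hypothetical dataset size
indices = rng.permutation(n_rows)     # shuffled row positions

# Hold-out split: 20% of the rows for testing, the rest for training.
n_test = round(0.2 * n_rows)
test_idx = indices[:n_test]
train_idx = indices[n_test:]
print(f"train rows: {len(train_idx)}, test rows: {len(test_idx)}")

# k-fold cross-validation: every row is used for validation exactly once.
k = 5
folds = np.array_split(indices, k)
for i, validation_idx in enumerate(folds):
    fitting_idx = np.concatenate([fold for j, fold in enumerate(folds) if j != i])
    print(f"fold {i}: fit on {len(fitting_idx)} rows, validate on {len(validation_idx)} rows")
```
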
27 Fri: Jul 5, Model fitting and validation
   Files: Modelfitting.ipynb; 1_House_Sales_Data.xlsx
   Description: Module II (D) – Fitting the model and validations. Ordinary Least Squares (OLS) method; Mean squared prediction error (MSE) and root MSE (RMSE)
   Preparations before class: Reading: Modelfitting.ipynb
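
For class 27 (model fitting and validation), an OLS fit followed by MSE and RMSE on held-out rows can be sketched as below. The data are synthetic and the variable names are hypothetical; the fit uses NumPy's least-squares solver to keep the example dependency-light, whereas Modelfitting.ipynb may use a different library.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Synthetic data: price depends linearly on floor area plus noise.
n = 100
area = rng.uniform(50, 150, size=n)
price = 20.0 + 3.0 * area + rng.normal(0, 5, size=n)

# The rows are already in random order, so a simple slice gives an 80/20 split.
train, test = slice(0, 80), slice(80, n)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), area])

# OLS: choose coefficients that minimize the sum of squared residuals.
beta, *_ = np.linalg.lstsq(X[train], price[train], rcond=None)

# Validate on the held-out rows.
predicted = X[test] @ beta
mse = np.mean((price[test] - predicted) ** 2)
rmse = np.sqrt(mse)

print(f"intercept={beta[0]:.2f}, slope={beta[1]:.2f}")
print(f"MSE={mse:.2f}, RMSE={rmse:.2f}")
```

RMSE is in the same units as the target (here, price), which usually makes it easier to interpret than MSE.
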

Quiz IV is scheduled during the open week

Best of Luck
