Big Data and Analytics - B4E4540101 - 2024
Course Description
This course prepares students to develop skills in visualizing and analyzing big data using datasets from real business scenarios. The course encompasses two main modules: (1) data visualization with Tableau software and (2) prediction and regression with Python. For each module, students will complete a project in class using one or more datasets.
Textbook
Reference Books
1 Course Introduction – lecture
3 Level I: Foundation for building visualization
5 Graphical Visualizations – lecture and lab work
6 Level II: Filtering: discrete, continuous, dates and others
8 Dashboard interactivity – lecture and lab work
9 Comparing values and advanced charts
10 Advanced visualizations – distributions
11 Level III: Venturing with Calculations
13 Level IV: Storytelling with dashboards
15 Quiz I; Module II: Prediction analysis
16 Module II: Introduction to Python – lecture & lab
18 Data encoding and dummy variables – lecture & lab
20 Quiz II; Introduction to outliers – lecture
21 Outliers’ detection and treatment options – lecture & lab
23 Boston dataset preparation – lab work
24 Linear regression assumptions – lab work
25 Quiz III; Stepwise regression and modeling – lab work
26 Split-test technique in machine learning – lab work
28 Module III: Optimization with Spreadsheets
30 Making sense of results and assignment submission
Evaluation Ratio
The information in the course syllabus may be subject to change with notice, as deemed appropriate by the instructor.
Professor Contact Hours
Office Hour: Wednesdays from 1:30 - 2:30 on https://us06web.zoom.us/j/87934312910 , Meeting ID: 879 3431 2910
Feel free to send an email to schedule a meeting at a time convenient for you at [email protected]
Detailed Course Outline
Note: only major assignments are listed
Note: PC required in all classes (no tablets)
Teaching Assistant:
Session 6, Fri: Apr 19
Topic: Level II: Filtering: discrete, continuous, dates and others – lecture & lab
Materials: 1 Groups Hierarchies and Filtering.pptx; Superstore.csv; Chapter 1 Starter.twbx
Content: Level – II: Groups, hierarchies; Groups in Tableau; Hierarchies
Reading: pptx (p. 1 – 43)
Session 7, Tue: Apr 23
Topic: Comparing values and advanced charts
Materials: 3 Advanced Visualizations Ch 3.pptx; 1 Hospital Visits.xlsx
Content: Level – II: Filtering; Understanding filtering with Tableau; *Getting started with Jupyter Notebook
Reading: pptx (p. 1 – 34). I do not think we will have time for the Gantt chart.
Session 10, Tue: May 7
Topic: Dashboard interactivity – lecture & lab
Materials: 2 Dashboards - with Tableau.pptx; 2 Filtering and Dashboard Ch 1 Ch 2.pptx; Superstore.csv
Content: Level – II: Interactivity and dashboards; Adding interactivity (highlights and filters)
Reading: 2 Dashboards - with Tableau.pptx (p. 29 – 30); 2 Filtering and Dashboard Ch 1 Ch 2.pptx (all)
Session 11, Fri: May 10
Topic: Level III: Venturing with Calculations – lecture & lab
Materials: 4 Venturing with Calculations Ch 4.pptx; 2 Table Calculations Ch5 - Part I.pptx; Vacation Rentals.xlsx
Content: Level – III: Introduction to Tableau calculations; Types of calculations; Level of Detail (LoD) calculations
Reading: pptx (p. 1 – 33)
Session 12, Tue: May 14
Topic: Table Crosstabs and Highlight – lecture & lab
Materials: 4 Venturing with Calculations Ch 4.pptx; 4 Using Calculations.pptx; Vacation Rentals.xlsx
Content: Level – III: Table Crosstabs and Highlight; Parameters; Ad-hoc calculations; Using Date-Specific Calculations; Working with Aggregations; Using Quick Table Calculations to Analyze Data
Reading: pptx (p. 34 – 48)
Session 13, Fri: May 17
Topic: Level IV: Storytelling with dashboards – lecture & lab
Materials: Telling Story w Dashboards Ch7 - Part I.pptx; Superstore.csv
Content: Level – IV: Introduction to Tableau stories; Building the views; Implementing actions; Context filtering; Filter and highlight actions
Reading: pptx (p. 1 – 39)
Session 14, Tue: May 21
Topic: Adding actions and interactivity in stories – lecture & lab
Materials: Telling Story w Dashboards Ch7 - Part II.pptx; Superstore.csv
Content: Level – IV: Dashboard example – regional scorecard; Interactivity with the KPI target parameter; Using interactive snapshots of dashboards and views
Reading: pptx (p. 1 – 14)
Session 21, Fri: Jun 14
Topic: Outliers’ detection and treatment options – lecture & lab
Materials: E Outliers and Visualization.pptx; 1_House_Sales_Data.xlsx
Content: Module II (B) – Outliers; The five number summary; Lower, Upper, and Inner Quartiles
Reading: E Outliers and Visualization.pptx (p. 7 – 16)
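For orientation, the five number summary and a quartile-based outlier check from this session can be sketched in a few lines of Python. This is a minimal sketch and not the course's own notebook code: it uses only the standard library, the prices are hypothetical values rather than rows from 1_House_Sales_Data.xlsx, and the 1.5 x IQR fence is a common convention assumed here rather than a rule stated in the syllabus.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # method="inclusive" treats the data as the whole population of interest.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical house prices in thousands (illustrative only).
prices = [210, 225, 240, 250, 265, 280, 300, 950]
summary = five_number_summary(prices)
outliers = iqr_outliers(prices)
print(summary)   # (210, 236.25, 257.5, 285.0, 950)
print(outliers)  # [950]
```

Whether a flagged value should be removed, capped, or kept is a treatment decision that depends on the dataset and the analysis goal.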
Session 22, Tue: Jun 18
Topic: Linear regression – lecture
Materials: Predicting House Prices.pptx; 1_House_Sales_Data.xlsx
Content: Module II (C) – Introduction to prediction; Introduction to Predictions; Outliers’ detection and variable transformation in the dataset; Removing Outliers in the dataset
Reading: pptx (p. 1 – 35)
Task: perform outliers’ detection and remove outliers before class
Session 23, Fri: Jun 21
Topic: Boston dataset preparation – lab work
Materials: Datacleaning.ipynb; 1_House_Sales_Data.xlsx
Content: Module II (C) – Dataset preparation; Other ways of cleaning and scaling variables
Reading: Datacleaning.ipynb
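As a rough illustration of what scaling a variable means, the sketch below shows two common approaches, min-max scaling and standardization (z-scores). It is an assumption-laden example rather than the content of Datacleaning.ipynb: the function names and the sample living-area values are hypothetical, and only the Python standard library is used.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Convert values to z-scores using the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical living areas in square feet (illustrative only).
areas = [1000, 1500, 2000]
scaled = min_max_scale(areas)
z_scores = standardize(areas)
print(scaled)  # [0.0, 0.5, 1.0]
print(z_scores)
```

Min-max scaling keeps values in a fixed range, while standardization centers them at zero with unit spread; which one suits a regression model depends on the variables involved.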
Session 24, Tue: Jun 25
Topic: Linear regression assumptions – lab work
Materials: RegressionAssumptions.ipynb; 1_House_Sales_Data.xlsx
Content: Module II (C) – Linear Regression Assumptions
Reading: RegressionAssumptions.ipynb
Session 25, Fri: Jun 28
Topic: Quiz III; Stepwise regression and modeling – lab work
Materials: StepwiseRegression.ipynb; 1_House_Sales_Data.xlsx
Content: *** Quiz III – topics from 15 – 21; Module II (D) – Stepwise variable selection
Reading: StepwiseRegression.ipynb
Session 26, Tue: Jul 2
Topic: Split-test technique in machine learning – lab work
Materials: Split-testMethod.ipynb; 1_House_Sales_Data.xlsx
Content: Module II (D) – Splitting the dataset for Machine Learning; Split-test technique; Other: cross-validation with k-folds
Reading: Split-testMethod.ipynb
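The two ideas in this session, a single train/test split and k-fold cross-validation, can be sketched as follows. This is a standard-library illustration under stated assumptions, not the code in Split-testMethod.ipynb: the helper names `train_test_split` and `k_fold_indices` are defined here for the example, and the 80/20 split and k = 5 are common defaults assumed for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train_rows, test_rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, test_idx) pairs; every row is tested exactly once."""
    indices = list(range(n_rows))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Ten hypothetical row identifiers stand in for dataset rows.
rows = list(range(10))
train_rows, test_rows = train_test_split(rows, test_fraction=0.2, seed=42)
folds = list(k_fold_indices(len(rows), k=5))
print(len(train_rows), len(test_rows))  # 8 2
print(len(folds))                       # 5
```

A single split gives one estimate of out-of-sample error; k-fold cross-validation repeats the exercise k times so that each row serves as test data once, at the cost of fitting the model k times.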
Session 27, Fri: Jul 5
Topic: Model fitting and validation
Materials: Modelfitting.ipynb; 1_House_Sales_Data.xlsx
Content: Module II (D) – Fitting the model and validations; Ordinary Least Squares (OLS) method; Mean squared prediction error (MSE) and root MSE (RMSE)
Reading: Modelfitting.ipynb
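To make the OLS and MSE/RMSE items concrete, here is a minimal sketch for the one-predictor case, where the least-squares slope and intercept have a closed form. It is an illustration only and not the content of Modelfitting.ipynb: the data points are hypothetical, the helper names are defined here for the example, and a model with several predictors would normally be fitted with a numerical library instead.

```python
import math


def fit_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mse(y_true, y_pred):
    """Mean squared prediction error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical training data lying exactly on y = 1 + 2x.
x_train, y_train = [1, 2, 3, 4], [3, 5, 7, 9]
intercept, slope = fit_ols(x_train, y_train)

# Hypothetical held-out data used to measure prediction error.
x_test, y_test = [5, 6], [11.5, 12.5]
predictions = [intercept + slope * xi for xi in x_test]
test_mse = mse(y_test, predictions)
test_rmse = math.sqrt(test_mse)
print(intercept, slope)      # 1.0 2.0
print(test_mse, test_rmse)   # 0.25 0.5
```

RMSE is expressed in the same units as the target variable, which makes it easier to interpret than MSE when reporting prediction error.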
Best of Luck