W07- Intro Basic ML

INTRODUCTION TO MACHINE LEARNING (ML)

By: SEK SOCHEAT
Lecturer, Artificial Intelligence
MSIT – AEU, 2023 – 2024
Mobile: 017 879 967
Email: [email protected]
TABLE OF CONTENTS

1. Introduction to Machine Learning
2. Machine Learning Core Concepts
3. Practical Examples and Demonstrations
4. Solutions
1. INTRODUCTION TO MACHINE LEARNING

What is Machine Learning?

Machine Learning (ML) is a subset of artificial intelligence that enables computers to learn from and make predictions or decisions based on data. Instead of being explicitly programmed to perform a task, ML algorithms use statistical techniques to identify patterns in data and improve their performance over time.
1. INTRODUCTION TO MACHINE LEARNING

Difference between AI, ML, and Deep Learning

• Artificial Intelligence: AI is the broad field of creating machines capable of performing tasks that typically require human intelligence, such as reasoning, learning, and problem-solving.
• Machine Learning: ML is a subset of AI that focuses on developing algorithms that allow computers to learn from and make predictions or decisions based on data. Instead of being explicitly programmed, ML models improve their performance as they are exposed to more data.
• Deep Learning: DL is a subset of ML that uses neural networks with many layers (hence "deep") to analyze complex patterns in large amounts of data. It is particularly effective in tasks such as image and speech recognition.
2. MACHINE LEARNING CORE CONCEPTS

Types of Machine Learning

There are primarily four types of machine learning:

1. Supervised Learning
2. Unsupervised Learning
3. Semi-supervised Learning
4. Reinforcement Learning
2. MACHINE LEARNING CORE CONCEPTS

Types of Machine Learning: Supervised Learning

1. Supervised Learning: using labeled data to predict outcomes (e.g., house price prediction).

Supervised Learning is a type of machine learning where the model is trained using labeled data: each training example is paired with an output label. The goal is for the model to learn a mapping from inputs to outputs so that it can predict the label for new, unseen data.

Key Concepts:
• Labeled Data: data that comes as input-output pairs.
• Training Phase: the model learns from the training data.
• Prediction Phase: the model predicts labels for new data.
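The labeled-data, training, and prediction phases above can be sketched with a tiny pure-Python linear regression; the house sizes and prices below are invented for illustration:

```python
# Labeled data: each input (house size in m^2) is paired with an output label
# (price in $1000s).
sizes  = [50, 60, 80, 100, 120]
prices = [150, 180, 240, 300, 360]

# Training phase: fit y = a*x + b by ordinary least squares.
n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) \
    / sum((x - mean_x) ** 2 for x in sizes)
b = mean_y - a * mean_x

# Prediction phase: estimate the price of a new, unseen house.
def predict(x):
    return a * x + b

print(round(predict(90)))  # prints 270
```

Because the toy data are exactly linear, the fitted line recovers them perfectly; real data would leave residual error, which is what metrics like MAE measure later in the deck.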
2. MACHINE LEARNING CORE CONCEPTS

Types of Machine Learning: Unsupervised Learning

2. Unsupervised Learning: finding hidden patterns in data (e.g., customer segmentation).

Unsupervised Learning is a type of machine learning where the model is trained on unlabeled data. The goal is to identify patterns or structures within the data without predefined labels. Common tasks include clustering and dimensionality reduction.

Key Concepts:
• No Labeled Data: works with data without labels or predefined outcomes.
• Clustering: groups similar data points (e.g., K-means, hierarchical clustering).
• Dimensionality Reduction: reduces the number of variables (e.g., PCA, t-SNE).
• Anomaly Detection: identifies rare or suspicious data points.
• Association: finds rules describing large portions of the data (e.g., market basket analysis).
• Self-Organizing Maps (SOMs): use neural networks to create a low-dimensional representation of the input data.
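Clustering, the most common unsupervised task, can be sketched with a minimal K-means loop in plain Python; the 2D points below are made up so that they fall into two obvious groups:

```python
# No labels are given; K-means discovers the two groups on its own.
points = [(1.0, 1.0), (1.5, 2.0), (1.2, 0.8),    # one natural cluster
          (8.0, 8.0), (8.5, 9.0), (7.8, 8.2)]    # another natural cluster

def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [tuple(sum(xs) / len(xs) for xs in zip(*c)) if c
                     else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)  # one centroid near (1.2, 1.3), one near (8.1, 8.4)
```

The alternating assignment/update steps are the whole algorithm; libraries such as scikit-learn add smarter initialization and convergence checks on top of the same idea.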
2. MACHINE LEARNING CORE CONCEPTS

Types of Machine Learning: Semi-supervised Learning

3. Semi-supervised Learning: uses both labeled and unlabeled data (e.g., classifying a large corpus from a handful of labeled examples).

Semi-supervised learning trains a model on a small amount of labeled data together with a large amount of unlabeled data. It is often applied to semi-structured data such as HTML, JSON, and XML files, where tags or markers provide partial structure.

Key Concepts:
• Mixed Data: combines labeled and unlabeled data.
• Flexibility: handles data with some structure but no rigid format.
• Iterative Refinement: improves the model iteratively using both data types.
• Graph-Based Models: use graphs to represent and learn from relationships in the data.
• Semi-Supervised Algorithms: include methods such as self-training and co-training.
• Applications: useful in NLP, information retrieval, and bioinformatics.
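The "iterative refinement" idea can be sketched as a tiny self-training loop: start with a few labeled points, then repeatedly pseudo-label the unlabeled point we are most confident about. The 1D feature values and labels below are invented, and nearest-labeled-neighbor distance stands in for model confidence:

```python
# Tiny labeled set plus a larger unlabeled set (values are made up).
labeled = {1.0: "A", 9.0: "B"}
unlabeled = [2.0, 3.0, 8.0, 8.5]

while unlabeled:
    # Confidence proxy: distance to the nearest already-labeled point.
    best = min(unlabeled, key=lambda u: min(abs(u - x) for x in labeled))
    nearest = min(labeled, key=lambda x: abs(best - x))
    labeled[best] = labeled[nearest]   # adopt the nearest neighbor's label
    unlabeled.remove(best)

print(sorted(labeled.items()))  # points near 1.0 -> "A", near 9.0 -> "B"
```

Real self-training replaces the distance heuristic with a trained classifier's confidence scores, but the loop structure is the same.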
2. MACHINE LEARNING CORE CONCEPTS

Types of Machine Learning: Reinforcement Learning

4. Reinforcement Learning: learning through trial and error (e.g., game-playing AI).

Reinforcement Learning (RL) is a type of machine learning where an agent learns to make decisions by performing actions in an environment to maximize cumulative reward. The agent receives feedback in the form of rewards or punishments, which it uses to improve its decision-making over time.

Key Concepts:
• Agent: the learner or decision-maker.
• Environment: the world with which the agent interacts.
• State: a representation of the agent's current situation.
• Action: what the agent can do.
• Reward: feedback from the environment based on the agent's action.
• Policy: the strategy the agent uses to choose its actions.
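The agent/environment/state/action/reward/policy loop fits in a few lines of Python. This sketch uses an invented 5-cell corridor where the agent earns a reward for reaching the rightmost cell, learned with tabular Q-learning and an epsilon-greedy policy:

```python
import random

# Environment: a 5-cell corridor; the agent starts at cell 0,
# and reaching cell 4 (the terminal state) yields a reward of 1.
random.seed(0)
N_STATES, ACTIONS = 5, (-1, +1)           # move left / move right
alpha, gamma, epsilon = 0.5, 0.9, 0.1     # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        # Epsilon-greedy policy: mostly exploit, occasionally explore.
        a = (random.randrange(2) if random.random() < epsilon
             else Q[state].index(max(Q[state])))
        nxt = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if nxt == N_STATES - 1 else 0.0
        # Q-learning update toward reward + discounted best future value.
        Q[state][a] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][a])
        state = nxt

policy = ["left" if q[0] > q[1] else "right" for q in Q]
print(policy[:4])  # learned policy for the non-terminal cells
```

After training, the agent's policy is simply "pick the action with the highest Q-value", and the Q-values for moving right approach 1, 0.9, 0.81, ... as the goal gets farther away.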
3. PRACTICAL EXAMPLES AND DEMONSTRATIONS

Supervised Learning: House Price Prediction Using Machine Learning

Tasks:

1. Expand the Dataset: add five more house sizes and their corresponding prices to the dataset. Ensure the new data points make logical sense in the context of the existing data.
2. Calculate Model Accuracy: add code to calculate and print the Mean Absolute Error (MAE) of the model's predictions on the test set. Provide a brief explanation of what MAE indicates about the model's performance.
3. Modify Train-Test Split: change the test size to 30% and rerun the model training and evaluation. Observe how this change affects the model's performance and MAE.
4. Customize the Plot: change the color of the data points to green and the regression line to black. Add grid lines to the plot to enhance readability.
5. Implement Polynomial Regression: modify the code to fit a polynomial regression model of degree 2 instead of a linear regression model. Visualize the new regression line, compare it with the linear regression line, and discuss the differences observed in the predictions.
3. PRACTICAL EXAMPLES AND DEMONSTRATIONS

Unsupervised Learning: Exploring Unsupervised Learning with Social Network Data

Tasks:

1. Data Preparation: a. Generate a synthetic dataset simulating social network data with features such as the number of friends, likes, and posts. b. Print the first few rows to understand its structure.
2. K-Means Clustering: a. Perform K-Means clustering on the dataset. b. Use the Elbow Method to determine the optimal number of clusters. c. Visualize the clusters with a scatter plot, coloring data points by cluster label.
3. Dimensionality Reduction with PCA: a. Apply Principal Component Analysis (PCA) to reduce the dataset to 2 components. b. Visualize the dataset in 2D, coloring data points by K-Means cluster label.
4. Hierarchical Clustering: a. Perform hierarchical clustering on the dataset. b. Visualize the dendrogram and identify an appropriate number of clusters. c. Compare the clusters from hierarchical clustering with those from K-Means.
5. Anomaly Detection: a. Use a GaussianMixture model to detect anomalies in the dataset. b. Identify and visualize the anomalies using a scatter plot.
3. PRACTICAL EXAMPLES AND DEMONSTRATIONS

Semi-supervised Learning: Exploring Semi-supervised Learning with an Email Dataset

Tasks:

1. Data Preparation: a. Generate a synthetic email dataset with features such as number of words, attachments, keywords, and sender's domain. b. Print the first few rows to understand its structure.
2. Label Propagation: a. Apply Label Propagation using partially labeled data (e.g., spam or not spam). b. Train the model, predict labels for the unlabeled data, and evaluate accuracy.
3. Graph-Based Semi-Supervised Learning: a. Construct a graph where nodes represent emails and edges represent similarities. b. Apply a graph-based semi-supervised algorithm to classify the emails. c. Visualize the graph with nodes colored by predicted label.
4. Dimensionality Reduction with t-SNE: a. Apply t-SNE to reduce the dataset to 2 components. b. Visualize the dataset in 2D, coloring data points by predicted label.
5. Anomaly Detection: a. Use Isolation Forest to detect anomalies (e.g., potential spam). b. Identify and visualize the anomalies using a scatter plot.
3. PRACTICAL EXAMPLES AND DEMONSTRATIONS

Reinforcement Learning: Exploring Reinforcement Learning with a Grid World Environment

Tasks:

1. Environment Setup: a. Create a simple Grid World environment where an agent can move up, down, left, or right. b. Define the grid size, start position, goal position, and obstacles.
2. Q-Learning Implementation: a. Implement the Q-Learning algorithm for the agent to learn the optimal policy to reach the goal. b. Define the reward structure, learning rate, and discount factor. c. Train the agent by letting it interact with the environment for a specified number of episodes.
3. Policy Visualization: a. Visualize the learned policy by showing the optimal path from the start to the goal on the grid. b. Display the Q-values for each state-action pair on the grid.
4. Performance Evaluation: a. Plot the total reward per episode to observe the learning progress. b. Evaluate performance by calculating the average reward over multiple test episodes.
5. Exploration vs. Exploitation: a. Implement an epsilon-greedy strategy for the agent to balance exploration and exploitation. b. Experiment with different values of epsilon and observe the impact on the learning process.
4. SOLUTIONS
4. SOLUTIONS: SUPERVISED LEARNING

House Price Prediction Using Machine Learning
[The solution code and its output appear as images on the original slides.]
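Since the slides show the solution only as screenshots, here is a hedged, dependency-light sketch of tasks 2, 3, and 5 (MAE, a 70/30 split, and a degree-2 polynomial fit), using NumPy's `polyfit` in place of scikit-learn; the dataset below is invented, as the original one is not recoverable from the images:

```python
import numpy as np

# Hypothetical house data with a mildly curved size-price trend.
sizes  = np.array([50, 60, 70, 80, 90, 100, 110, 120, 130, 140], dtype=float)
prices = 2.0 * sizes + 0.01 * sizes**2 + 30.0

# Task 3: a simple 70/30 split, holding out the last 30% as a test set.
split = int(len(sizes) * 0.7)
x_train, x_test = sizes[:split], sizes[split:]
y_train, y_test = prices[:split], prices[split:]

# Task 5: linear fit vs. degree-2 polynomial fit.
lin  = np.polyfit(x_train, y_train, 1)
quad = np.polyfit(x_train, y_train, 2)

# Task 2: Mean Absolute Error -- the average absolute gap between
# predicted and true prices; lower means more accurate predictions.
mae_lin  = np.mean(np.abs(np.polyval(lin,  x_test) - y_test))
mae_quad = np.mean(np.abs(np.polyval(quad, x_test) - y_test))
print(f"MAE linear: {mae_lin:.2f}, MAE degree-2: {mae_quad:.2f}")
```

Because the invented data are genuinely quadratic, the degree-2 model extrapolates far better than the straight line, which is exactly the comparison task 5 asks you to discuss.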
4. SOLUTIONS: UNSUPERVISED LEARNING

Exploring Unsupervised Learning with Social Network Data
[The solution code and its output appear as images on the original slides.]
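The solution code itself exists only as images, so here is a sketch of the PCA step (task 3) done directly with NumPy's SVD; the "friends, likes, posts" features are synthetic stand-ins generated for illustration:

```python
import numpy as np

# Synthetic social-network features: two invented user groups
# (casual users vs. power users) over (friends, likes, posts).
rng = np.random.default_rng(42)
casual = rng.normal([50, 100, 10],   [5, 10, 2],   size=(20, 3))
power  = rng.normal([300, 900, 120], [20, 50, 10], size=(20, 3))
X = np.vstack([casual, power])

# PCA via SVD: center the data, then project onto the top-2 right
# singular vectors (the directions of greatest variance).
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ Vt[:2].T          # each user reduced from 3 features to 2

# Fraction of total variance retained by the 2 kept components.
ratio = (S[:2] ** 2).sum() / (S ** 2).sum()
print(X2.shape, round(float(ratio), 3))
```

Plotting `X2` with cluster colors (e.g., from K-Means labels) gives the 2D visualization the task asks for; with two well-separated groups, the first component alone captures most of the variance.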
4. SOLUTIONS: SEMI-SUPERVISED LEARNING

Exploring Semi-supervised Learning with an Email Dataset
[The solution code and its output appear as images on the original slides.]
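With the original code unavailable, here is a small NumPy sketch of the graph-based label propagation idea behind tasks 2 and 3: emails become nodes, similarity becomes edge weight, and labels spread from the few labeled nodes to the rest. The single "spam score" feature and the labels are invented for illustration:

```python
import numpy as np

# Six "emails" described by one made-up feature; only emails 0 and 5
# start labeled (0 = not spam, 1 = spam, -1 = unlabeled).
x = np.array([0.1, 0.2, 0.3, 0.8, 0.9, 1.0])
labels = np.array([0, -1, -1, -1, -1, 1])

# Edge weights from similarity: closer feature values -> stronger edges.
W = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.05)
np.fill_diagonal(W, 0.0)

# Propagate: each unlabeled node repeatedly takes the weighted average
# of its neighbors' scores, while labeled nodes stay clamped.
f = np.where(labels == -1, 0.5, labels).astype(float)
for _ in range(100):
    f_new = W @ f / W.sum(axis=1)
    f_new[labels != -1] = labels[labels != -1]   # clamp known labels
    f = f_new

predicted = (f > 0.5).astype(int)
print(predicted)  # emails near 0.1 -> class 0, near 1.0 -> class 1
```

scikit-learn's `LabelPropagation` implements the same clamping-and-averaging scheme over a kernel-derived graph; this sketch just makes the iteration explicit.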
4. SOLUTIONS: REINFORCEMENT LEARNING

Exploring Reinforcement Learning with a Grid World Environment
[The solution code and its output appear as images on the original slides.]
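As the solution is only shown as screenshots, here is a sketch of tasks 1, 2, and 5 under assumed parameters: a 4x4 grid with start (0,0), goal (3,3), and one obstacle at (1,1) (the actual layout in the slides is unknown), trained with tabular Q-learning and an epsilon-greedy policy:

```python
import random

# Task 1: environment setup -- grid size, start, goal, and an obstacle.
random.seed(1)
SIZE, START, GOAL, OBSTACLE = 4, (0, 0), (3, 3), (1, 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # up, down, left, right
alpha, gamma, epsilon = 0.5, 0.9, 0.2                 # assumed hyperparameters
Q = {(r, c): [0.0] * 4 for r in range(SIZE) for c in range(SIZE)}

def step(state, a):
    r, c = state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1]
    nxt = (r, c)
    if not (0 <= r < SIZE and 0 <= c < SIZE) or nxt == OBSTACLE:
        nxt = state                                   # blocked: stay put
    return nxt, (10.0 if nxt == GOAL else -1.0)       # step cost, goal bonus

# Task 2: Q-learning over many episodes, with epsilon-greedy action choice
# (task 5) to balance exploration and exploitation.
for episode in range(500):
    state = START
    while state != GOAL:
        a = (random.randrange(4) if random.random() < epsilon
             else Q[state].index(max(Q[state])))
        nxt, reward = step(state, a)
        Q[state][a] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][a])
        state = nxt

# Greedy rollout of the learned policy from start to goal (task 3's path).
state, path = START, [START]
while state != GOAL and len(path) < 20:
    state, _ = step(state, Q[state].index(max(Q[state])))
    path.append(state)
print(path)
```

The -1 step cost makes shorter paths more valuable, so the greedy rollout traces a shortest obstacle-avoiding route; sweeping epsilon (task 5) trades faster convergence against the risk of settling on a suboptimal path.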
Thank You!
If you have any questions, please reach out!