sdl unit 1

Ml

Uploaded by

rohan.babbar
Copyright
© All Rights Reserved

Machine Learning Overview

Here’s an in-depth explanation of each topic you mentioned, with detailed insights and
examples to help you thoroughly prepare for your end-semester exams. Each section has
been expanded to cover key concepts, formulas, diagrams (descriptive), and applications.

1. Introduction to Machine Learning

What is Machine Learning (ML)?

ML is a branch of artificial intelligence (AI) that focuses on building systems that can
automatically learn and improve from experience.

Instead of being explicitly programmed for every task, ML models identify patterns in
data and make decisions or predictions.

Characteristics of ML:

1. Data-Driven: ML relies on large datasets to extract meaningful patterns.

2. Iterative Process: Models are refined over repeated rounds of training and typically improve as they are exposed to more data.

3. Generalization: The ability to perform well on unseen data (not just training data).

Types of Problems Solved by ML:

1. Prediction: Forecasting future trends based on historical data (e.g., stock prices).

2. Classification: Categorizing data into predefined labels (e.g., email spam detection).

3. Clustering: Grouping similar data points (e.g., customer segmentation).

Steps in a Typical ML Workflow:

1. Data Collection: Collect relevant data (structured or unstructured).

2. Data Preprocessing: Clean and prepare the data (handle missing values, scaling, etc.).

3. Model Training: Use an algorithm to learn patterns from training data.

4. Model Evaluation: Test performance on validation/testing data.

5. Prediction/Deployment: Use the trained model to make predictions or deploy it in real-world applications.
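
The five steps above can be sketched end to end in a few lines of plain Python. This is only a toy illustration: the dataset (hours studied vs. pass/fail), the nearest-centroid "model", and all helper names are assumptions made up for this example, not something the notes prescribe.

```python
# Toy end-to-end workflow. All data and helper names are hypothetical.

# 1. Data collection: (hours_studied, passed) pairs; None marks a missing value.
raw = [(1.0, 0), (2.0, 0), (None, 0), (3.0, 0),
       (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]

# 2. Preprocessing: fill missing values with the mean, then min-max scale to [0, 1].
#    (For brevity the statistics use all rows; in practice compute them on the
#    training split only, so that no information leaks from the test data.)
known = [x for x, _ in raw if x is not None]
mean_x = sum(known) / len(known)
filled = [(mean_x if x is None else x, y) for x, y in raw]
lo = min(x for x, _ in filled)
hi = max(x for x, _ in filled)
data = [((x - lo) / (hi - lo), y) for x, y in filled]

# Hold out every fourth example for evaluation.
test = data[::4]
train = [row for i, row in enumerate(data) if i % 4 != 0]

# 3. Training: a nearest-centroid model stores the mean feature value per class.
def fit(rows):
    centroids = {}
    for label in {y for _, y in rows}:
        xs = [x for x, y in rows if y == label]
        centroids[label] = sum(xs) / len(xs)
    return centroids

def predict(centroids, x):
    # Return the label whose centroid is closest to x.
    return min(centroids, key=lambda label: abs(x - centroids[label]))

model = fit(train)

# 4. Evaluation: fraction of held-out examples classified correctly.
accuracy = sum(predict(model, x) == y for x, y in test) / len(test)

# 5. Prediction: score a new, unseen student who studied 8.5 hours.
new_x = (8.5 - lo) / (hi - lo)
prediction = predict(model, new_x)
```

With real data each step would be far more involved, but the order of operations stays the same.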

Applications of ML:

Healthcare: Disease diagnosis, drug discovery.

Finance: Fraud detection, stock price prediction.

Retail: Recommendation systems (e.g., Amazon).

Autonomous Vehicles: Self-driving cars use ML for object detection and decision-making.

2. Types of Machine Learning


Machine Learning can be broadly classified into three types:

a) Supervised Learning:

Definition: The model is trained on a labeled dataset, where both inputs (features) and
corresponding outputs (labels) are provided.

Goal: Learn a mapping from input to output.

Key Algorithms:

Linear Regression, Logistic Regression.

Decision Trees, Random Forests, Support Vector Machines (SVMs).

Examples:

Predicting house prices based on square footage (Regression).

Classifying emails as spam or not spam (Classification).

b) Unsupervised Learning:

Definition: The model is trained on an unlabeled dataset and is tasked with finding
hidden patterns or structures in the data.

Goal: Group similar data points or reduce the dimensionality of the data.

Key Algorithms:

K-Means Clustering, Hierarchical Clustering.

Principal Component Analysis (PCA), t-SNE.

Examples:

Segmenting customers based on purchase behavior.

Identifying topics in a collection of articles.
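
To make the clustering idea concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) on one-dimensional data. The spending figures, the starting centroids, and the function name `k_means_1d` are assumptions chosen for illustration; no labels are used anywhere, which is what makes this unsupervised.

```python
def k_means_1d(points, centroids, iterations=10):
    """Lloyd's algorithm for 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical monthly spend per customer; two obvious groups.
spend = [10, 12, 11, 95, 100, 105]
centroids, clusters = k_means_1d(spend, centroids=[10, 100])
```

Here the low spenders and high spenders end up in separate clusters. Real K-Means works the same way in more dimensions, with Euclidean distance in place of the absolute difference.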

c) Reinforcement Learning:

Definition: The model learns by interacting with an environment and receiving rewards
or penalties based on its actions.

Goal: Maximize cumulative reward over time.

Key Concepts:

Agent: The learning entity (e.g., robot).

Environment: The external system the agent interacts with.

Reward Signal: Feedback for the agent's actions.

Examples:

Training AI to play games (e.g., AlphaGo).

Robotics for navigation and control.
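
The agent/environment/reward loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the notes above do not name a specific algorithm, so this choice is an assumption). The five-cell corridor environment, the reward of 1 at the last cell, and the hyperparameters are all invented for this example.

```python
import random

N_STATES, GOAL = 5, 4      # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)         # move left or move right

def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done

def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Agent: learn action values q[state][action_index] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)   # explore uniformly at random (Q-learning is off-policy)
            next_state, reward, done = step(state, ACTIONS[a])
            # Reward signal plus discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:GOAL]]
```

After training, the greedy policy should move right (+1) in every cell, because that is the shortest route to the reward and therefore maximizes the discounted cumulative reward.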

3. Supervised Learning Basics

What is Supervised Learning?

A learning paradigm where the model learns a mapping function from input features (X) to output labels (Y).

Key Steps:

1. Dataset Preparation:

Features (X): Independent variables.

Labels (Y): Dependent variables (targets).

Example: In house price prediction:

Features (X): Square footage, number of rooms.

Labels (Y): House price.

2. Training:

The model uses labeled data to identify relationships between X and Y.

3. Testing:

Evaluate the model's performance on unseen data to ensure it generalizes well.
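
The separation between training and testing data is usually created with a hold-out split. Below is a minimal sketch, assuming a hypothetical helper `train_test_split` written here from scratch and made-up house data; the 25% test fraction is an arbitrary choice.

```python
import random

def train_test_split(features, labels, test_fraction=0.25, seed=0):
    """Shuffle example indices and hold out a fraction of them for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx, seq):
        return [seq[i] for i in idx]

    return (pick(train_idx, features), pick(test_idx, features),
            pick(train_idx, labels), pick(test_idx, labels))

# Hypothetical house data: X = (square footage, rooms), Y = price.
X = [(850, 2), (900, 2), (1200, 3), (1500, 3),
     (1800, 4), (2100, 4), (2400, 5), (3000, 5)]
Y = [120_000, 130_000, 170_000, 210_000,
     250_000, 290_000, 330_000, 400_000]

X_train, X_test, Y_train, Y_test = train_test_split(X, Y)
```

The model is fitted only on `X_train`/`Y_train`; `X_test`/`Y_test` stay untouched until evaluation, so the score reflects performance on unseen data.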

Examples:

Regression: Predicting continuous outcomes (e.g., temperature).

Classification: Assigning discrete labels (e.g., cat vs. dog).

4. Regression and Classification

Regression

Predicts a continuous value.

Example: Predicting the price of a stock.

Types of Regression:

1. Linear Regression:

The relationship between the dependent variable (y) and the independent variable (x) is linear.

Equation: $y = \beta_0 + \beta_1 x + \epsilon$.

2. Polynomial Regression:

Captures non-linear relationships by adding polynomial terms ($x^2, x^3, \dots$).
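
Written in the same notation as the linear model, a polynomial regression of degree $d$ looks like this (the symbol $d$ for the degree is introduced here for illustration):

```latex
y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_d x^d + \epsilon
```

Although the curve is non-linear in $x$, the model is still linear in the coefficients $\beta_0, \dots, \beta_d$, so it can be fitted with the same least-squares machinery as linear regression by treating $x^2, \dots, x^d$ as extra features.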

Applications:

Forecasting sales.

Estimating growth trends.

Classification

Predicts discrete classes.

Example: Identifying spam emails.

Types of Classification:

1. Binary Classification:

Two possible outcomes (e.g., spam or not spam).

2. Multi-Class Classification:

More than two classes (e.g., classifying images into cats, dogs, and birds).

Applications:

Sentiment analysis.

Medical diagnosis (disease detection).

5. Linear Regression

Key Concepts:

1. Dependent Variable (y): Target variable.

2. Independent Variable (x): Feature affecting y.

3. Intercept ($\beta_0$): Predicted y when x = 0.

4. Slope ($\beta_1$): Change in y for a one-unit change in x.
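
As a rough sketch of how the intercept and slope are obtained, the function below fits a single-feature line by ordinary least squares using the standard closed-form estimates. The function name `fit_line` and the data points are assumptions made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: forces the line through the point (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical noisy data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = fit_line(xs, ys)
prediction_at_6 = b0 + b1 * 6.0
```

For this data the slope comes out close to 2 (each extra unit of x adds about 2 to y) and the intercept close to 0, matching the interpretation of the two coefficients given above.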

Model Assumptions:

1. A linear relationship exists between x and y.

2. Errors are normally distributed.

3. No multicollinearity among predictors (relevant when the model has more than one independent variable).

4. Homoscedasticity: Constant variance of residuals.

Evaluation Metrics:

Mean Squared Error (MSE): Average of the squared errors; squaring penalizes large errors more heavily than small ones.

R-Squared ($R^2$): Proportion of the variance in y explained by the model.

6. Logistic Regression

Why Logistic Regression?

Used when the dependent variable is categorical (e.g., binary outcomes like "Yes/No").

Key Formula:

Sigmoid Function:

$$P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$

Produces probabilities between 0 and 1.
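
Rearranging the sigmoid shows why this is still called a regression: the log-odds of the positive class are a linear function of x. In the short derivation below, $p$ is shorthand (introduced here) for $P(y = 1 \mid x)$.

```latex
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\;\Longrightarrow\;
\frac{p}{1 - p} = e^{\beta_0 + \beta_1 x}
\;\Longrightarrow\;
\ln\!\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x
```

So a one-unit increase in x changes the log-odds by $\beta_1$, which is the logistic counterpart of the slope interpretation in linear regression.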

Decision Boundary:

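A probability threshold converts the predicted probability into a class label (e.g., if P > 0.5, classify as 1). With a threshold of 0.5, the boundary lies where $\beta_0 + \beta_1 x = 0$, since the sigmoid equals 0.5 exactly at that point.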
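
A minimal sketch of the sigmoid and the thresholding step follows. The coefficients are hypothetical values picked for illustration (they are not fitted to any data), and the function names are made up for this example.

```python
import math

def predict_proba(x, b0, b1):
    """P(y = 1 | x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def classify(x, b0, b1, threshold=0.5):
    """Turn the probability into a class label using the threshold."""
    return 1 if predict_proba(x, b0, b1) > threshold else 0

# Hypothetical coefficients: the boundary b0 + b1 * x = 0 sits at x = 4.
B0, B1 = -4.0, 1.0

probs = {x: predict_proba(x, B0, B1) for x in (2.0, 4.0, 6.0)}
labels = {x: classify(x, B0, B1) for x in (2.0, 4.0, 6.0)}
```

Inputs below x = 4 get a probability under 0.5 and are labeled 0, inputs above it are labeled 1, and x = 4 itself sits exactly on the boundary with probability 0.5. Raising or lowering `threshold` trades false positives against false negatives.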

Applications:

Predicting customer churn.

Credit risk assessment.

7. Model Evaluation Metrics

For Regression:

1. Mean Absolute Error (MAE):

Average magnitude of errors.


$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

2. Mean Squared Error (MSE):

Penalizes larger errors.


$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

3. R-Squared ($R^2$):

Measures how well the model explains the variance.


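$$R^2 = 1 - \frac{SS_{\text{residual}}}{SS_{\text{total}}}$$

where $SS_{\text{residual}} = \sum_i (y_i - \hat{y}_i)^2$ and $SS_{\text{total}} = \sum_i (y_i - \bar{y})^2$, with $\bar{y}$ the mean of the observed values.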
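
The three regression metrics translate almost directly into code. The sketch below uses plain Python and made-up values for the true and predicted targets; the function names are chosen for this example.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Squared Error: average of the squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 - SS_residual / SS_total."""
    mean_y = sum(y_true) / len(y_true)
    ss_residual = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_total = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_residual / ss_total

# Hypothetical targets and predictions; the errors are 0.5, 0, -1, -0.5.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 8.0, 9.5]

print(mae(y_true, y_pred))        # 0.5
print(mse(y_true, y_pred))        # 0.375
print(r_squared(y_true, y_pred))  # approximately 0.925
```

Note how the single error of size 1 contributes 1.0 of the 1.5 total squared error but only half of the total absolute error; this is the sense in which MSE penalizes larger errors.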

For Classification:

1. Accuracy:

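Ratio of correct predictions to total predictions, where TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$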

2. Precision:

Correctness of positive predictions.


$$\text{Precision} = \frac{TP}{TP + FP}$$

3. Recall (Sensitivity):

Ability to identify all positives.


$$\text{Recall} = \frac{TP}{TP + FN}$$

4. F1-Score:

Balance between precision and recall.


$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

5. ROC-AUC:

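Area under the ROC curve, which plots the true positive rate against the false positive rate across all classification thresholds. It measures how well the model separates classes: 0.5 corresponds to random guessing and 1.0 to perfect separation.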
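
The classification metrics above can be computed from the four confusion-matrix counts, as in the sketch below. The labels and scores are made-up values, and the helper names are assumptions for this example. ROC-AUC is computed through its pairwise interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels 0/1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical true labels and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s > 0.5 else 0 for s in scores]   # threshold at 0.5

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
auc = roc_auc(y_true, scores)
```

Accuracy, precision, recall, and F1 all depend on the chosen threshold (0.5 here), whereas ROC-AUC is computed from the raw scores and summarizes performance over every possible threshold.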

Preparation Tips:
Understand the Theory: Ensure you grasp the concepts and can explain them in your
own words.

Practice Questions: Solve problems on datasets to solidify your understanding.

Learn Assumptions: For regression and logistic regression, focus on the model's
assumptions and why they matter.

Summarize: Create a one-page cheat sheet for formulas and key points.

Let me know if you want further elaboration on any sub-topic or examples!
