
lec 2

Uploaded by

voicemint311
Copyright
© © All Rights Reserved

Online vs. Offline Machine Learning

Understanding Key Differences and Use Cases


How does machine learning work?
Machine learning uses two main types of techniques: supervised
learning and unsupervised learning.

Supervised learning builds a model that makes predictions based on
evidence in the presence of uncertainty. The model is trained on
known input and output data so that it can reasonably predict future
outputs.

In contrast, unsupervised learning looks for patterns and structures in
datasets. The machine learns to draw inferences from datasets
without labeled responses.
Simply put, a machine learning model learns through repetition and is
fed information repeatedly until it gets better at carrying out its
intended function, such as correctly identifying objects or making
accurate predictions about the weather.

This can be achieved through either online machine learning or
offline (batch) machine learning.
What is Offline Machine Learning?
• Simply put, offline or batch learning refers to learning
over all the observations in a dataset at once. In other
words, models in offline learning learn from a static
dataset: we collect the data first and then train a
machine learning model on it.

• Take the example of learning weather patterns. For offline
learning, we collect the weather readings for six months
and then train a model over this collected data.

• Additionally, in offline learning, the parameters of the
machine learning model are updated once learning has
been completed over the entire dataset.
Characteristics:
• Batch Processing: All data is processed in one go.
• Data Collection: The model learns from historical data that is
gathered and fixed prior to training.
• Training Phase: Once the model is trained, it is tested and
evaluated before deployment.
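
The batch idea above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed method: the function name `train_batch`, the toy "weather" readings, and the learning rate and epoch count are all assumptions made for the example. The point to notice is that each parameter update uses a gradient computed over the whole fixed dataset.

```python
# Offline (batch) learning sketch: fit y = w*x + b on a fixed dataset.
# The data and hyperparameters are made up for illustration.

def train_batch(xs, ys, lr=0.01, epochs=2000):
    """Full-batch gradient descent for 1-D linear regression."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Accumulate the mean-squared-error gradient over ALL observations...
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        # ...and only then update the parameters, once per full pass.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical collected readings: day index -> temperature (y = 2x + 1).
days = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
temps = [2.0 * d + 1.0 for d in days]

w, b = train_batch(days, temps)
print(round(w, 2), round(b, 2))
```

If new readings arrive later, this sketch has no way to absorb them other than calling `train_batch` again on the enlarged dataset, which is the retraining cost discussed under the disadvantages.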
Advantages of Offline Machine Learning:
• Accuracy: Models can be highly accurate as they
utilize the entire dataset, allowing for thorough
learning of patterns and relationships.
• Simplicity: Easier to develop and evaluate;
straightforward training process as it doesn’t require
continuous monitoring.
• Robustness: Can effectively capture complex
relationships when provided with comprehensive
data.
Disadvantages of Offline Machine Learning

• Adaptability: Limited adaptability to new data; any
changes in data distribution require retraining the
entire model.
• Resource Intensive: Requires significant
computational resources and memory, making it
potentially slow and costly for large datasets.
• Latency: Typically involves a longer wait time from
data collection to deployment due to the batch
processing nature.
Use Cases for Offline Machine Learning

Examples:
Image Recognition: Training models on large image
datasets for tasks like facial recognition or object detection.
Natural Language Processing (NLP): Developing
models for sentiment analysis or language translation
based on static corpora of text.
Medical Diagnosis: Creating models to predict diseases
using comprehensive patient data.
What is Online Machine Learning?
• Online machine learning is a method of machine
learning where the model continuously updates and
evolves as it is exposed to new data. Unlike
traditional machine learning methods that require
retraining on the entire dataset, online machine
learning models adapt and learn from each new data
point they receive. This makes them particularly
useful in situations where data is continuously
generated and the model needs to adapt to changing
patterns in real-time.
What is Online Machine Learning?
• When we talk of online learning, we refer to instances
where learning occurs as the data becomes available.
Equivalently, the model learns by considering one
observation at a time, and its parameters are updated
each time it receives a new observation.

• In online learning, we train the model on an
observation, update the parameters, and repeat these
steps until we obtain a model that can be used for the
task at hand.
• This process of constantly learning through updating
the parameters makes online machine learning
adaptable to different types of data.
What is Online Machine Learning?
Online machine learning refers to a method where the
model is trained incrementally, processing data in
real-time or in small batches as it becomes available.

Characteristics:
• Continuous Learning: The model updates its parameters
with each new data point.
• Immediate Predictions: Capable of making predictions
and updates instantly as new data arrives.
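
The "one observation at a time" loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the class `OnlineLinearModel`, its `learn_one` method, the simulated stream, and the learning rate are all hypothetical names and values chosen for the example. Each observation is used for a prediction, then for a single parameter update, and is then discarded.

```python
import random

class OnlineLinearModel:
    """1-D linear model updated one observation at a time (SGD)."""

    def __init__(self, lr=0.05):
        self.w = 0.0
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return self.w * x + self.b

    def learn_one(self, x, y):
        # Gradient step on the squared error of this single observation.
        error = self.predict(x) - y
        self.w -= self.lr * error * x
        self.b -= self.lr * error

# Hypothetical data stream: each reading is seen once and never stored.
random.seed(0)
model = OnlineLinearModel()
for _ in range(5000):
    x = random.uniform(0.0, 1.0)
    y = 3.0 * x + 0.5               # underlying pattern, assumed noise-free
    prediction = model.predict(x)   # predict first...
    model.learn_one(x, y)           # ...then update with the observed value

print(round(model.w, 2), round(model.b, 2))
```

Because the model never holds more than one observation, memory use stays constant however long the stream runs.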
Advantages of Online Machine Learning

Adaptability: Quickly adapts to changes in data
distribution, making it suitable for dynamic
environments.
Resource Efficiency: Generally requires less
memory and computational resources since it
doesn't need to load the entire dataset at once.
Real-Time Processing: Enables real-time
predictions and model updates, essential for
applications that demand immediacy.
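
To illustrate the resource-efficiency point, here is a small sketch of an incremental computation that never loads the full dataset. It is not a learning model in itself; it uses Welford's algorithm for a running mean and variance, which follows the same update-per-observation pattern. The class name `RunningStats` and the `sensor_stream` generator are hypothetical, and the readings are made up.

```python
class RunningStats:
    """Welford's algorithm: running mean and variance in O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance of everything seen so far.
        return self._m2 / self.n if self.n else 0.0

def sensor_stream():
    # Hypothetical stand-in for a live data source.
    for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield value

stats = RunningStats()
for reading in sensor_stream():
    stats.update(reading)   # each reading is used once, then dropped

print(stats.mean, stats.variance)
```

Only three numbers are kept in memory, regardless of how many readings pass through.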
Disadvantages of Online Machine Learning
Accuracy Concerns: May lead to less accurate models if the
incoming data is noisy or unrepresentative.
Complexity: More complicated implementation due to the
need for managing model states and memory over time.
Potential for Overfitting: Risk of adapting too quickly to
recent data, leading to overfitting.
Use Cases for Online Machine Learning
Examples:
Stock Price Prediction: Continuously
updating models based on real-time market
data for better forecasting.
Recommendation Systems: Adjusting
recommendations based on user interactions
and preferences as they occur.
Fraud Detection: Real-time monitoring of
transactions to identify fraudulent activities
as they happen.
Key differences between online and offline machine learning
Challenges of Machine Learning
Machine learning, a subset of artificial intelligence,
enables computers to learn from data, uncover
patterns, and make predictions or decisions without
being explicitly programmed. It has found
applications in autonomous vehicles, healthcare
diagnostics, recommendation systems, and more.
But beneath its promising facade lies a maze of
challenges and limitations. Let’s begin by
dissecting the key challenges of machine learning.
1. Data Quality and Quantity
Challenge: Machine learning algorithms are voracious
consumers of data, but they demand high-quality
data. Garbage in, garbage out — the adage rings
true in the world of ML. The data used for training
machine learning models should be clean,
accurate, and representative of the problem. Data
preprocessing, cleaning, and augmentation are
often required to ensure data quality. Additionally,
having a sufficient quantity of data is crucial, as
models need diverse examples to learn effectively.

Real-life Example: Healthcare providers rely on patient
records to train diagnostic models. Incomplete or
inaccurate data can lead to erroneous predictions,
risking patient health.
2. Overfitting and Underfitting
Challenge: Machine learning models can overfit
(become overly complex) or underfit (be too
simplistic). Striking the right balance is critical for
model performance. Overfitting occurs when a
model fits the training data too closely, capturing
noise instead of useful patterns. Underfitting, on
the other hand, results from overly simplistic
models that can’t capture complex relationships in
the data. Addressing these issues often involves
hyperparameter tuning and cross-validation.
Real-life Example: In stock market prediction, an
overfit model may perform exceptionally well on
historical data but fail to generalize to new market
conditions, leading to poor investment decisions.
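
One way to see overfitting and underfitting side by side is to compare training error with cross-validated error. The sketch below is illustrative only: the three toy models (a constant mean, a least-squares line, and a nearest-neighbor "memorizer"), the synthetic data, and the helper names are assumptions made for this example, not something taken from the slides.

```python
import random

def fit_mean(xs, ys):
    # Deliberately too simple: always predicts the training mean.
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    # Ordinary least-squares line y = w*x + b.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = my - w * mx
    return lambda x: w * x + b

def fit_nearest(xs, ys):
    # Memorizes the training set: returns the label of the closest x.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_val_mse(fit, xs, ys, k=5):
    # k-fold cross-validation with interleaved folds.
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(xs), k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        test_x = [xs[i] for i in sorted(test_idx)]
        test_y = [ys[i] for i in sorted(test_idx)]
        scores.append(mse(fit(train_x, train_y), test_x, test_y))
    return sum(scores) / k

# Synthetic data: a linear trend plus Gaussian noise (made up).
random.seed(1)
xs = [float(i) for i in range(30)]
ys = [2.0 * x + random.gauss(0.0, 3.0) for x in xs]

results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("nearest", fit_nearest)]:
    train_error = mse(fit(xs, ys), xs, ys)
    cv_error = cross_val_mse(fit, xs, ys)
    results[name] = (train_error, cv_error)
    print(f"{name:8s} train MSE={train_error:8.2f}  CV MSE={cv_error:8.2f}")
```

The memorizing model scores a training error of zero because it reproduces the noise exactly, yet its error on held-out folds is not zero; the constant model is poor on both. The gap between the two columns is the signal to look for.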
3. Generalization
Challenge: A successful machine learning model
should perform well on new, unseen data.
Achieving this generalization is often tricky.
Ensuring that a model generalizes effectively is a
core challenge. Overfit models may perform well on
training data but fail to make accurate predictions
on new, unseen data. Techniques such as cross-
validation and regularization are employed to
improve generalization.
Real-life Example: A spam email classifier may
excel in identifying common spam, but it could
falter when new, sophisticated spam techniques
emerge.
4. Bias and Fairness
Challenge: Machine learning models can
inadvertently inherit biases present in their training
data, leading to unfair or discriminatory outcomes.
Biases in training data can result from historical
prejudices, unequal representation, or data
collection methods. Addressing bias and ensuring
fairness in models is crucial, particularly in
applications like hiring and lending, where
discrimination can have serious consequences.
Real-life Example: Hiring algorithms can
unintentionally favor candidates from specific
demographic groups, contributing to discrimination
in the recruitment process.
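
A simple first check for this kind of bias is to compare how often each group receives a positive decision. The sketch below computes per-group selection rates and the gap between them (often called the demographic parity difference). It is one diagnostic among many, not a complete fairness audit; the function names and the shortlisting data are invented for illustration.

```python
def selection_rates(decisions, groups):
    """Fraction of positive decisions (1s) within each group."""
    totals, selected = {}, {}
    for decision, group in zip(decisions, groups):
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + decision
    return {g: selected[g] / totals[g] for g in totals}

def demographic_parity_gap(decisions, groups):
    """Difference between the highest and lowest group selection rate."""
    rates = selection_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())

# Made-up model outputs: 1 = shortlisted, 0 = rejected.
decisions = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

rates = selection_rates(decisions, groups)
gap = demographic_parity_gap(decisions, groups)
print(rates, round(gap, 2))
```

A gap near zero means the groups are selected at similar rates; a large gap, as in this toy data, is a prompt to investigate the training data and features rather than proof of discrimination on its own.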
5. Model Selection
Challenge: Selecting the appropriate machine
learning algorithm or model architecture for a
specific problem can be perplexing. Choosing the
wrong one may result in suboptimal performance.
The choice of model depends on the data type,
problem type (classification, regression, clustering),
and the desired output. It also involves deciding
between traditional machine learning algorithms
and deep learning methods.
Real-life Example: Image recognition tasks benefit
from convolutional neural networks (CNNs), while
natural language processing tasks have traditionally
used recurrent neural networks (RNNs) and, more
recently, Transformer-based models.
Conclusion

Choosing the Right Approach:
The choice between online and offline machine
learning depends on factors like data
characteristics, application needs, and resource
availability.
Understanding these differences helps
organizations leverage the appropriate model to
achieve their goals effectively.
Both methods have unique strengths, and in some
cases, a hybrid approach may be beneficial.
