Machine Learning

ml

Uploaded by

lovishh03.ssll
Machine Learning

Machine Learning
• “Learning is any process by which a system
improves performance from experience.”
--------------- Herbert Simon
Definition by Tom Mitchell (1998):
Machine Learning is the study of algorithms that
• improve their performance P
• at some task T
• with experience E.
A well-defined learning task is given by <P, T, E>.
• Machine learning uses a variety of algorithms
that iteratively learn from data to improve,
describe data, and predict outcomes.
• As the algorithms ingest training data, it is
then possible to produce more precise models
based on that data.
• A machine learning model is the output
generated when you train your machine
learning algorithm with data.
• After training, when you provide a model with
an input, it returns an output.
• Some machine learning models are online and
continuously adapt as new data is ingested.
• On the other hand, other models, called
offline machine learning models, are derived
from machine learning algorithms but, once
deployed, do not change.
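The difference between online and offline models can be sketched with a deliberately tiny "model" that only predicts the average of what it has seen. The class names and the numbers are assumptions made for illustration; real online learners update far richer parameters in the same incremental way.

```python
class OfflineMeanModel:
    """Trained once on historical data; never changes after deployment."""

    def __init__(self, training_values):
        self.mean = sum(training_values) / len(training_values)

    def predict(self):
        return self.mean


class OnlineMeanModel:
    """Adapts its estimate each time a new observation is ingested."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: no need to store or revisit old data.
        self.count += 1
        self.mean += (value - self.mean) / self.count

    def predict(self):
        return self.mean


history = [10.0, 12.0, 14.0]        # hypothetical training data
offline = OfflineMeanModel(history)

online = OnlineMeanModel()
for value in history:
    online.update(value)

# New data arrives after deployment: only the online model adapts.
for value in [20.0, 24.0]:
    online.update(value)

print(offline.predict(), online.predict())
```

The offline model keeps predicting the average of the original three values, while the online model has drifted toward the newer observations.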
Traditional Programming:
Data + Program → Output

Machine Learning:
Data + Output → Program
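A minimal sketch of this contrast, assuming an invented task (deciding whether a temperature counts as "hot"). All function names and data values are hypothetical. In the first case a person writes the rule; in the second the rule is derived from data together with the desired outputs.

```python
# Traditional programming: the programmer supplies the rule (the program).
def is_hot_handwritten(temp_c):
    return temp_c >= 30.0  # threshold chosen by hand


# Machine learning: data + desired outputs go in, a rule comes out.
def learn_threshold(temps, labels):
    """Return the candidate threshold with the fewest training errors."""
    best_threshold, best_errors = None, len(temps) + 1
    for candidate in sorted(temps):
        errors = sum((t >= candidate) != label for t, label in zip(temps, labels))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


# Hypothetical training data: temperatures and whether people called them hot.
temps = [18.0, 22.0, 25.0, 27.0, 29.0, 31.0, 35.0]
labels = [False, False, False, True, True, True, True]

threshold = learn_threshold(temps, labels)


def is_hot_learned(temp_c):
    return temp_c >= threshold


print(threshold, is_hot_handwritten(28.0), is_hot_learned(28.0))
```

Here the learned rule disagrees with the handwritten one at 28 degrees because the labels, not the programmer, determined the threshold.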
Related Disciplines
• AI
• Data mining
• Probability & statistics
• Optimization
• Decision theory
• Game theory
• Control theory
• Information theory
• Computational complexity theory
• Statistical mechanics
• Biological evolution
• Neurophysiology
• Psychology
• Philosophy
When Do We Use Machine Learning?
• ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech
recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)

Learning isn’t always useful:
• There is no need to “learn” to calculate payroll
Some more examples of tasks that are best solved by
using a learning algorithm

Recognizing patterns:
• Facial identities or facial expressions
• Handwritten or spoken words
• Medical images
Generating patterns:
• Generating images or motion sequences
Recognizing anomalies:
• Unusual credit card transactions
• Unusual patterns of sensor readings in a nuclear power plant
Prediction:
• Future stock prices or currency exchange rates
Sample Applications
• Web search
• Computational biology
• Finance
• E-commerce
• Space exploration
• Robotics
• Information extraction
• Social networks
• Debugging software
Structured data sources
• Structured data is typically stored in traditional relational databases and refers to
data that has a defined length and format.

» Sensor data: Examples include radio frequency ID (RFID) tags, smart meters,
medical devices, and Global Positioning System (GPS) data.
» Weblog data: When servers, applications, networks, and so on operate, they
capture all kinds of data about their activity.
» Point-of-sale data: When the cashier swipes the bar code of any product that you
purchase, all that data associated with the product is generated.
» Financial data: Many financial systems are now programmatic; they operate based
on predefined rules that automate processes.
» Weather data: Sensors to collect weather data are being deployed across towns,
cities, and regions to collect data on things like temperature, wind, barometric
pressure, and precipitation. This data can help meteorologists create hyperlocal
forecasts.
» Click-stream data: Data is generated every time you click a link on a website. This
data can be analyzed to determine customer behavior and buying patterns.
Unstructured data sources
• Although unstructured data has some implicit structure, it doesn’t follow
a specified format.
» Text internal to your company: Think of all of the text within documents,
logs, survey results, and emails. Enterprise information actually
represents a large percent of the text information in the world today.
» Social media data: This data is generated from the social media platforms,
such as YouTube, Facebook, Twitter, LinkedIn, and Flickr.
» Mobile data: This includes text messages, notes, calendar inputs, pictures,
videos, and data entered into third-party mobile applications.
» Satellite images: This includes weather data or the data that the
government captures in its satellite surveillance imagery.
» Photographs and video: This includes security, surveillance, and traffic data.
» Radar or sonar data: This includes vehicular, meteorological, and
oceanographic data.
What is Data?
• Collection of data objects and their attributes
• An attribute is a property or characteristic of an object
– Examples: eye color of a person, temperature, etc.
– Attribute is also known as variable, field, characteristic,
dimension, or feature
• A collection of attributes describes an object
– Object is also known as record, point, case, sample, entity, or
instance

Attributes (columns) and objects (rows):

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
Attribute Values
• Attribute values are numbers or symbols
assigned to an attribute for a particular object

• Distinction between attributes and attribute values
– Same attribute can be mapped to different attribute values
• Example: height can be measured in feet or meters
– Different attributes can be mapped to the same set of values
• Example: Attribute values for ID and age are integers
• But properties of attribute values can be different
Types of Attributes
• There are different types of attributes
– Nominal
• Examples: ID numbers, eye color, zip codes
– Ordinal
• Examples: rankings (e.g., taste of potato chips on a scale
from 1-10), grades, height {tall, medium, short}
– Interval
• Examples: calendar dates, temperatures in Celsius or
Fahrenheit.
– Ratio
• Examples: temperature in Kelvin, length, time, counts
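The four attribute types differ in which operations give meaningful results. The sketch below illustrates this with invented values; the variable names and the `ORDINAL_HEIGHT` ordering are assumptions for illustration only.

```python
# Nominal: only equality comparisons are meaningful.
same_eye_color = ("brown" == "brown")

# Ordinal: values can be ordered, but distances between them are undefined.
ORDINAL_HEIGHT = ["short", "medium", "tall"]


def ordinal_less_than(a, b):
    return ORDINAL_HEIGHT.index(a) < ORDINAL_HEIGHT.index(b)


# Ratio: Kelvin has a true zero, so ratios are meaningful.
kelvin_a, kelvin_b = 150.0, 300.0
ratio_kelvin = kelvin_b / kelvin_a          # b is twice as hot as a

# Interval: the same temperatures in Celsius. Differences are meaningful,
# but the ratio is not, because zero degrees Celsius is an arbitrary point.
celsius_a = kelvin_a - 273.15
celsius_b = kelvin_b - 273.15
diff_celsius = celsius_b - celsius_a        # same size as the Kelvin difference
ratio_celsius = celsius_b / celsius_a       # a number with no physical meaning

print(ratio_kelvin, diff_celsius, ratio_celsius)
```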
Discrete and Continuous Attributes
• Discrete Attribute
– Has only a finite or countably infinite set of values
– Examples: zip codes, counts, or the set of words in a collection of
documents
– Often represented as integer variables.
– Note: binary attributes are a special case of discrete attributes
• Continuous Attribute
– Has real numbers as attribute values
– Examples: temperature, height, or weight.
– Practically, real values can only be measured and represented
using a finite number of digits.
– Continuous attributes are typically represented as floating-point
variables.
Types of data sets
• Record
– Data Matrix
– Document Data
– Transaction Data
• Graph
– World Wide Web
– Molecular Structures
• Ordered
– Spatial Data
– Temporal Data
– Sequential Data
– Genetic Sequence Data
Important Characteristics of Data

– Dimensionality (number of attributes)


• High dimensional data brings a number of challenges

– Sparsity
• Only presence counts

– Resolution
• Patterns depend on the scale

– Size
• Type of analysis may depend on size of data
Record Data
• Data that consists of a collection of records, each
of which consists of a fixed set of attributes

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
Data Matrix
• If data objects have the same fixed set of numeric attributes,
then the data objects can be thought of as points in a
multi-dimensional space, where each dimension represents a
distinct attribute
• Such a data set can be represented by an m by n matrix, where
there are m rows, one for each object, and n columns, one for
each attribute

Projection of x load  Projection of y load  Distance  Load  Thickness
10.23                 5.27                  15.22     2.7   1.2
12.65                 6.25                  16.22     2.2   1.1
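As a small illustration, the two objects from the slide can be stored as a 2 by 5 matrix (a list of rows). Treating each row as a point in 5-dimensional space makes it possible to measure how far apart the objects are. The Euclidean distance used here is one common choice, not something the slide prescribes.

```python
import math

column_names = [
    "Projection of x load",
    "Projection of y load",
    "Distance",
    "Load",
    "Thickness",
]

# m rows (objects) by n columns (attributes), values taken from the slide.
data_matrix = [
    [10.23, 5.27, 15.22, 2.7, 1.2],
    [12.65, 6.25, 16.22, 2.2, 1.1],
]

m = len(data_matrix)        # number of objects
n = len(data_matrix[0])     # number of attributes

# Each row is a point in n-dimensional space.
distance = math.dist(data_matrix[0], data_matrix[1])

print(m, n, round(distance, 3))
```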
Document Data
• Each document becomes a ‘term’ vector
– Each term is a component (attribute) of the vector
– The value of each component is the number of
times the corresponding term occurs in the
document.

            team  coach  play  ball  score  game  win  lost  timeout  season
Document 1  3     0      5     0     2      6     0    2     0        2
Document 2  0     7      0     2     1      0     0    3     0        0
Document 3  0     1      0     0     1      2     2    0     3        0
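A term vector can be built by counting how often each vocabulary term occurs in a document. The sketch below assumes whitespace tokenization and lowercasing, which is a simplification, and uses an invented sentence and a shortened vocabulary.

```python
from collections import Counter

# Hypothetical vocabulary: a subset of the terms shown on the slide.
vocabulary = ["team", "coach", "play", "ball", "score"]


def term_vector(text, vocabulary):
    """Count occurrences of each vocabulary term in the text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


document = "The team will play and the coach says the team can score"
vector = term_vector(document, vocabulary)

print(vector)
```

Terms that never occur simply get a count of zero, which is why document data tends to be sparse.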
Transaction Data
• A special type of record data, where
– Each record (transaction) involves a set of items.
– For example, consider a grocery store. The set of
products purchased by a customer during one
shopping trip constitutes a transaction, while the
individual products that were purchased are the
items.

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
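Because each transaction is a set of items, Python sets are a natural representation. The sketch below stores the five transactions from the slide and counts how many of them contain a given itemset. The helper name `support_count` is an assumption, borrowed from association analysis terminology.

```python
# The five transactions from the slide, keyed by TID.
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}


def support_count(itemset, transactions):
    """Number of transactions that contain every item in the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)


milk_and_diaper = support_count({"Milk", "Diaper"}, transactions)
beer_and_bread = support_count({"Beer", "Bread"}, transactions)

print(milk_and_diaper, beer_and_bread)
```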
Graph Data
• Examples: Generic graph, a molecule, and webpages
[Figure: example graphs, including the benzene molecule, C6H6]
Ordered Data
• Sequences of transactions
[Figure: a sequence of transactions; each element of the sequence is a set of items/events]
Ordered Data
• Genomic sequence data
GGTTCCGCCTTCAGCCCCGCGCC
CGCAGGGCCCGCCCCGCGCCGTC
GAGAAGGGCCCGCCTGGCGGGCG
GGGGGAGGCGGGGCCGCCCGAGC
CCAACCGAGTCCGACCAGGTGCC
CCCTCTGCTCGGCCTAGACCTGA
GCTCATTAGGCGGCAGCGGACAG
GCCAAGTAGAACACGCGAAGCGC
TGGGCTGCCTGCTGCGACCAGGG
Ordered Data

• Spatio-Temporal Data

[Figure: average monthly temperature of land and ocean]
The Machine Learning Cycle
» Identify the data: Identifying the relevant data sources is the first step in the cycle. In
addition, as you develop your machine learning algorithm, think about expanding the
target data to improve the system.
» Prepare data: Make sure your data is clean, secured, and governed. If you create a
machine learning application based on inaccurate data, the application will fail.
» Select the machine learning algorithm: You may have several machine learning algorithms
applicable to your data and business challenge.
» Train: You need to train the algorithm to create the model. Depending on the type of data
and algorithm, the training process may be supervised, unsupervised, or reinforcement
learning.
» Evaluate: Evaluate your models to find the best performing algorithm.
» Deploy: Machine learning algorithms create models that can be deployed to both cloud
and on-premises applications.
» Predict: After deployment, start making predictions based on new, incoming data.
» Assess predictions: Assess the validity of your predictions. The information you gather
from analyzing the validity of predictions is then fed back into the machine learning cycle
to help improve accuracy.
Training machine learning systems
• When you’re training a machine learning
system, you know the inputs.
• You also know your desired goal.
• Training a machine learning algorithm to create an
accurate model can be broken down into three
steps:
– Representation
– Evaluation
– Optimization
Representation
• The algorithm creates a model to transform
the inputted data into the desired results.
• As the learning algorithm is exposed to more
data, it will begin to learn the relationship
between the raw data and which data points
are strong predictors for the desired outcome.
Representation
• Decision trees
• Sets of rules / Logic programs
• Instances
• Graphical models (Bayes/Markov nets)
• Neural networks
• Support vector machines
• Model ensembles
• Etc.
Evaluation
• As the algorithm creates multiple models,
either a human or the algorithm will need to
evaluate and score the models based on which
model produces the most accurate predictions.
• It is important to remember that after the
model is operationalized, it will be exposed to
unknown data.
• As a result, make sure the model is generalized
and not overfit to your training data.
Evaluation
• Accuracy
• Precision and recall
• Squared error
• Likelihood
• Posterior probability
• Cost / Utility
• Margin
• Entropy
• K-L divergence
• Etc.
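Three of the measures listed above (accuracy, precision, and recall) can be computed directly from the counts of correct and incorrect predictions. The sketch below assumes binary labels encoded as 1 (positive) and 0 (negative), and the example labels are invented.

```python
def confusion_counts(y_true, y_pred):
    """Return (true positives, false positives, false negatives, true negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    return tp / (tp + fp)   # of everything predicted positive, how much was right


def recall(tp, fn):
    return tp / (tp + fn)   # of everything actually positive, how much was found


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn))
```

To check generalization rather than memorization, these measures would normally be computed on data held out from training.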
Optimization
• After the algorithm creates and scores
multiple models, select the best performing
algorithm.
• As you expose the algorithm to more diverse
sets of input data, select the most generalized
model.
Optimization
• Combinatorial optimization
– E.g.: Greedy search
• Convex optimization
– E.g.: Gradient descent
• Constrained optimization
– E.g.: Linear programming
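Gradient descent, named above as an example, repeatedly moves a parameter a small step against the gradient of the function being minimized. The sketch below applies it to a hypothetical one-parameter squared error, (w - 3)^2, whose minimum is at w = 3. The learning rate and step count are arbitrary illustrative choices.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Follow the negative gradient from `start` for a fixed number of steps."""
    w = start
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


def squared_error_gradient(w):
    # Derivative of the hypothetical loss (w - 3)^2 with respect to w.
    return 2.0 * (w - 3.0)


w_min = gradient_descent(squared_error_gradient, start=0.0)
print(w_min)
```

On a convex function like this one, a small enough learning rate takes the iterate steadily toward the single minimum.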
Types of Learning
• Supervised (inductive) learning –
Given: training data + desired outputs (labels)
• Unsupervised learning –
Given: training data (without desired outputs)
• Semi-supervised learning –
Given: training data + a few desired outputs
• Reinforcement learning –
Rewards from sequence of actions
• Unsupervised learning: Learning a model from
unlabeled data.
Methods: K-means, Gaussian mixtures,
hierarchical clustering, spectral clustering, etc.
• Supervised learning: Learning a model from
labeled data.
Methods: Support Vector Machines, neural
networks, decision trees, K-nearest neighbors,
naive Bayes, etc.
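As one example of learning from labeled data, here is a minimal K-nearest neighbors classifier, one of the supervised methods listed above. The points, labels, and the choice of k = 3 are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the majority label among the k training points nearest to query."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    top_labels = [label for _, label in neighbors[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical labeled training data: two well-separated groups.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2.0, 2.0)))
print(knn_predict(train_points, train_labels, (6.0, 6.5)))
```

The labels are what make this supervised: without `train_labels`, the same points could only be grouped, not classified.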
Supervised Learning: Uses
Example: decision trees, tools that create rules
• Prediction of future cases: Use the rule to
predict the output for future inputs
• Knowledge extraction: The rule is easy to
understand
• Compression: The rule is simpler than the data
it explains
• Outlier detection: Exceptions that are not
covered by the rule, e.g., fraud
Unsupervised Learning
• Learning “what normally happens”
• No output
• Clustering: Grouping similar instances
• Other applications: Summarization,
Association Analysis
• Example applications
– Customer segmentation in CRM
– Image compression: Color quantization
– Bioinformatics: Learning motifs
Reinforcement Learning
• The output of the system is a sequence of actions.
• No supervised output but delayed reward
• Applications:
– Game playing
– Robot in a maze
– Multiple agents, partial observability, ...

Unsupervised learning
• Unsupervised learning is best suited when the problem
requires a massive amount of data that is unlabeled.
• For example, social media applications, such as Twitter,
Instagram, Snapchat, and so on all have large amounts of
unlabeled data.
• Understanding the meaning behind this data requires
algorithms that can group the data based on the
patterns or clusters they find.
• Unsupervised learning can be used in email
spam-detection technology.
• Unsupervised learning algorithms segment data into
groups of examples (clusters) or groups of features.
• The parameter values and the classification are
derived from the unlabeled data itself.
• In effect, this process adds labels to the data so that
it can then be used for supervised learning.
• Unsupervised learning can determine the outcome
when there is a massive amount of data.
• For example, in healthcare, collecting huge amounts
of data about a specific disease can help practitioners
gain insights into the patterns of symptoms and
relate those to outcomes from patients.
Reinforcement learning
• Reinforcement learning is a behavioral
learning model.
• The algorithm receives feedback from the
analysis of the data so the user is guided to
the best outcome.
• Reinforcement learning differs from supervised
learning because the system isn’t trained with
a sample data set.
• Rather, the system learns through trial and
error.
• Therefore, a sequence of successful decisions
will result in the process being “reinforced”
because it best solves the problem at hand.
• One of the most common applications of
reinforcement learning is in robotics or game
playing.
• Reinforcement learning is also one of the
techniques applied in self-driving cars.
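Trial-and-error learning with delayed reward can be sketched with tabular Q-learning, a common reinforcement learning algorithm that the slides do not name. Everything below is an assumption made for illustration: a five-cell corridor with a reward only in the last cell, and arbitrary values for the learning rate, discount factor, and exploration rate.

```python
import random

N_STATES = 5                 # corridor cells 0..4; cell 4 is the goal
GOAL = N_STATES - 1
ACTIONS = [-1, +1]           # index 0 = move left, index 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2


def step(state, action):
    """Move within the corridor; the only reward is for reaching the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q_table, state, rng):
    """Epsilon-greedy choice with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q_table[state])
    best_actions = [a for a, q in enumerate(q_table[state]) if q == best]
    return rng.choice(best_actions)


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q_table = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            a = choose_action(q_table, state, rng)
            next_state, reward = step(state, ACTIONS[a])
            # Successful sequences are "reinforced" through this update.
            target = reward + GAMMA * max(q_table[next_state])
            q_table[state][a] += ALPHA * (target - q_table[state][a])
            state = next_state
    return q_table


q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:GOAL]]
print(policy)
```

No labeled examples are provided anywhere: the agent discovers that moving right pays off only because the reward at the goal propagates backward through the value estimates.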
Deep learning
• Deep learning is a specific method of machine
learning that incorporates neural networks in
successive layers in order to learn from data in an
iterative manner.
• Deep learning is especially useful when you’re
trying to learn patterns from unstructured data.
• Deep learning is often used in image recognition,
speech, and computer vision applications.
• A neural network consists of three or more layers:
an input layer, one or many hidden layers, and an
output layer.
• Data is ingested through the input layer. Then the
data is transformed in the hidden layers and the output
layer based on the weights applied to these
nodes.
• The term deep learning is used when there are
multiple hidden layers within a neural network.
• Deep learning is a machine learning technique that
uses hierarchical neural networks to learn from a
combination of unsupervised and supervised
algorithms.
• Deep learning can learn from unlabeled and unstructured data.
• While deep learning is very similar to a traditional neural
network, it will have many more hidden layers.
• The more complex the problem, the more hidden layers
there will be in the model.
• In the Internet of Things (IoT) manufacturing applications,
deep learning can be used to predict when a machine will
malfunction.
• Deep learning algorithms can help law enforcement
personnel keep track of the movements of a known suspect.
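The layer structure described above (an input layer, several hidden layers, and an output layer, each applying weights to its nodes) can be sketched as a forward pass. The weights, biases, and the sigmoid activation below are arbitrary assumptions; in practice the weights are learned during training.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of the inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the data through every layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical fixed weights: two hidden layers and one output layer.
layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]),   # hidden layer 1: 2 inputs, 2 nodes
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),      # hidden layer 2: 2 inputs, 2 nodes
    ([[1.0, -1.0]], [0.0]),                      # output layer: 2 inputs, 1 node
]

output = forward([1.0, 1.0], layers)
print(output)
```

Adding more entries to `layers` makes the network deeper without changing the forward pass itself.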
Difference between supervised and unsupervised learning

• Supervised learning uses known and labeled data as input.
Unsupervised learning uses unlabeled data as input.
• Supervised learning has a feedback mechanism.
Unsupervised learning has no feedback mechanism.
• Commonly used supervised learning algorithms are decision
trees, SVM, and logistic regression.
Commonly used unsupervised learning algorithms are k-means
clustering, hierarchical clustering, and the Apriori algorithm.
Types of machine learning algorithms
• Bayesian: Bayesian algorithms allow data
scientists to encode prior beliefs about what
models should look like, independent of what
the data states.
• Most other approaches focus on the data defining the model.
• Bayesian algorithms are especially useful when
you don’t have massive amounts of data to
confidently train a model.
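A minimal sketch of encoding a prior belief and updating it with a small amount of data, assuming a Beta prior over a success probability (a standard conjugate choice, not something the slides specify). The prior parameters and the observed counts are invented.

```python
def beta_posterior(prior_alpha, prior_beta, successes, failures):
    """Update a Beta(alpha, beta) prior with observed successes and failures."""
    return prior_alpha + successes, prior_beta + failures


def beta_mean(alpha, beta):
    return alpha / (alpha + beta)


# Hypothetical prior belief: the success rate is probably around 80%.
prior_alpha, prior_beta = 8.0, 2.0

# Very little data: 1 success and 3 failures.
successes, failures = 1, 3

posterior = beta_posterior(prior_alpha, prior_beta, successes, failures)
posterior_mean = beta_mean(*posterior)

# Estimate from the data alone, ignoring the prior.
data_only_estimate = successes / (successes + failures)

print(posterior, posterior_mean, data_only_estimate)
```

With only four observations, the data-only estimate swings to 0.25, while the posterior mean stays between the prior belief and the data. As more data arrives, the prior's influence shrinks.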
• Clustering: Clustering is a fairly
straightforward technique to understand —
objects with similar parameters are grouped
together (in a cluster).
• All objects in a cluster are more similar to
each other than to objects in other clusters.
• Clustering is a type of unsupervised learning
because the data is not labeled.
• The algorithm interprets the parameters that
make up each item and then groups them
accordingly.
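The grouping step can be sketched with k-means, a widely used clustering algorithm. The points and the starting centroids below are invented, and the fixed iteration count is a simplification; practical implementations usually stop when the assignments no longer change.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled data: two visibly separate groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]

centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are supplied; the two groups emerge purely from the distances between the points.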
• Decision tree: Decision tree algorithms use a
branching structure to illustrate the results of a
decision.
• Decision trees can be used to map the possible
outcomes of a decision.
• Each node of a decision tree represents a
possible outcome.
• Percentages are assigned to nodes based on
the likelihood of the outcome occurring.
• Decision trees are sometimes used for
marketing campaigns.
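The description above (nodes as possible outcomes, each with a likelihood) can be sketched as a small tree for a marketing decision. The options, probabilities, and payoffs are all invented, and comparing options by expected value is one simple way to use such a tree, assumed here for illustration.

```python
# Each decision maps to its possible outcomes as (probability, payoff) pairs.
# All figures are hypothetical.
campaign_tree = {
    "email campaign": [(0.25, 4000.0), (0.75, -1000.0)],
    "no campaign": [(1.0, 0.0)],
}


def expected_value(outcomes):
    """Probability-weighted average payoff of one branch."""
    return sum(probability * payoff for probability, payoff in outcomes)


expected_values = {
    decision: expected_value(outcomes)
    for decision, outcomes in campaign_tree.items()
}
best_decision = max(expected_values, key=expected_values.get)

print(expected_values, best_decision)
```

Decision trees used for classification are instead learned from labeled data, with each internal node testing an attribute.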
