
Course Code: CS - 364

Course Title : Data Analytics

Chapter 1
Introduction to Data Analytics
“Information is the oil of the 21st century, and
analytics is the combustion engine.”
— Peter Sondergaard, Senior Vice President, Gartner
Research
Contents
• Concept of data analytics
• Data analysis vs Data analytics
• Types of analytics
– Descriptive Analytics, Diagnostic Analytics, Predictive Analytics, Prescriptive Analytics, Exploratory Analysis, Mechanistic Analysis
• Mathematical models - Concept
• Model evaluation: metrics for evaluating classifiers - Class imbalance - AUC, ROC (Receiver Operating Characteristic) curves, Evaluating value prediction models
Concept of Data Analytics
• Data analytics is the science of analyzing raw
data to make conclusions about that
information.
• The techniques and processes of data
analytics have been automated into
mechanical processes and algorithms that
work over raw data for human consumption.
• Data analytics helps a business optimize its
performance.
• The data analytics process gives a clear picture
of where you are, where you have been, and
where you should go.
• Data analytics techniques can reveal trends
and metrics that would otherwise be lost in
the mass of information.
• This information can then be used to optimize
processes to increase the overall efficiency of
a business or system.
• Examples
– Manufacturing companies often record the
runtime, downtime, and work queue for various
machines and then analyze the data to better plan
the workloads so the machines operate closer to
peak capacity.
– Gaming companies use data analytics to set
reward schedules for players that keep the
majority of players active in the game.
– Content companies use many of the same data
analytics to keep the user clicking, watching, or
re-organizing content to get another view or
another click.
– The travel and hospitality industry, where turnarounds
can be quick, also uses data analytics: it can collect
customer data, figure out where problems, if any, lie,
and work out how to fix them.
– Healthcare combines the use of high volumes of
structured and unstructured data and uses data
analytics to make quick decisions.
– Retail industry uses copious amounts of data to
meet the ever-changing demands of shoppers.
The information retailers collect and analyze can
help them identify trends, recommend products,
and increase profits.
• Importance of Data analytics
– Helps businesses optimize their performances.
– Implementing it into the business model means
companies can help reduce costs by identifying
more efficient ways of doing business and by
storing large amounts of data.
– A company can also use data analytics to make
better business decisions and help analyze
customer trends and satisfaction, which can lead
to new—and better—products and services.
• Steps in Data analytics :
– The first step is to determine the data requirements or how
the data is grouped. Data may be separated by age,
demographic, income, or gender. Data values may be
numerical or be divided by category.
– The second step in data analytics is the process of
collecting it. This can be done through a variety of sources
such as computers, online sources, cameras,
environmental sources, or through personnel.
– Once the data is collected, it must be organized so it can be
analyzed. This may take place on a spreadsheet or other
form of software that can take statistical data.
– The data is then cleaned up before analysis. This means it is
scrubbed and checked to ensure there is no duplication or
error, and that it is not incomplete. This step helps correct
any errors before it goes on to a data analyst to be
analyzed.
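A minimal sketch of the collect/organize/clean steps above, assuming pandas is available; the file name and the column names ("age", "income", "gender") are hypothetical.

import pandas as pd

df = pd.read_csv("survey_data.csv")                      # collect: load raw records
df = df.drop_duplicates()                                # clean: remove duplicated rows
df["age"] = pd.to_numeric(df["age"], errors="coerce")    # clean: fix obvious type errors
df = df.dropna(subset=["age", "income"])                 # clean: drop incomplete records
summary = df.groupby("gender")["income"].describe()      # organize for analysis
print(summary)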
Data analysis vs Data analytics
• Considerable differences between them like
the two chutneys: Onion Chutney and
Coconut Chutney.
• They both are used as a side dish for the well-known
South Indian dish Idli. Since both are chutneys in
general that doesn’t mean they are the same in-depth.
Without Idli, there is no worth for both.
– Similarly, the terms Data Analysis and Data
Analytics have little importance without the Data.
• Data Analytics is a wide area involving
handling data with a lot of necessary tools to
produce helpful decisions with useful
predictions for a better output
• Data Analysis is actually a subset of Data
Analytics which helps us to understand the
data by questioning and to collect useful
insights from the information already
available.
• Data analysis refers to hands-on data
exploration and evaluation.
• Data analytics is a broader term and includes
data analysis as a necessary subcomponent.
• Analytics defines the science behind the
analysis.
• The science means understanding the relative
processes an analyst uses to understand
problems and explore data in meaningful
ways.
• In simple terms, Data Analytics is the process
of exploring the data from the past to make
appropriate decisions in the future by using
valuable insights.
• Whereas Data Analysis helps in understanding
the data and provides required insights from
the past to understand what happened so far.
• One way to understand the difference between analysis and
analytics is to think in terms of past and future.
• Data Analysis looks backwards, providing marketers with a
historical view of what has happened.
• Analytics, on the other hand, models the future or predicts
a result.
• Analytics makes extensive use of mathematics and statistics
and the use of descriptive techniques and predictive models
to gain valuable knowledge from data.
• These insights from data are used to recommend action or
to guide decision-making in a business context.
• Thus, analytics is not so much concerned with individual
analysis or analysis steps, but with the entire methodology
• Both the concepts, analytics and analysis, run
around the information called the Data.
• Data is a collection of information
• To understand the insights hidden behind the
datasets, analysis and analytics patterns play a
major role in fetching and showcasing much more
about the data; the data undergoes various
transformations and crosses numerous stages to
produce valuable output.
• Data Analytics consists of various stages
including identifying the problem, finding the
Data, Data Filtering, Data Validation, Data
Cleaning, Data Visualization, Data Analysis,
Inference, Prediction, etc.
• The most common tools employed in Data
Analytics are R, Python, SAS, SPARK, Google
Analytics, Excel, etc.
• Data Analysis comprises Data gathering, Data
validation, Interpretation, Analysis, Results,
etc.; in short, it tries to find what the data is
trying to express.
• The most common tools employed in Data
Analysis are Tableau, Excel, SPARK, Google
Fusion tables, Node XL, etc.
• Analytics is commonly used in many distinct
ways to find hidden patterns, such as identifying
preferences, computing various correlations,
forecasting trends, etc.
• The most common real-life findings obtained
through analytics are market trend forecasts,
customer preferences, and effective business
decisions.
• With the help of Analysis, it is quite simple
and easy to explore more valuable insights
from the available data by performing the
various types of Data Analysis such as
Exploratory Data Analysis, Predictive Analysis,
and Inferential Analysis, etc.
• They play a major role by providing more
insights in understanding the data.
• An Example :
– Almost every one of us has at least a little
knowledge of the share market. Suppose you are a
beginner and you want to start trading there with
some profit. What would you do initially?
– Most probably, before starting to trade, you would
examine the past trend records of shares in the
share market to understand what has happened so
far, in order to frame your strategies for more
profit. This kind of process is an example of Data
Analysis.
– After understanding the trend of the shares, you
may use different techniques to predict the future
price trend of the shares, and based on that you buy
some shares. This is an example of Data Analytics.
Real-Time Analytics/Decision Requirement
Use cases centered on big data analytics:
• Product recommendations that are relevant and compelling
• Learning why customers switch to competitors and their offers, in time to counter
• Friend invitations to join a game or activity that expands the business
• Improving the marketing effectiveness of a promotion while it is still in play
• Preventing fraud as it is occurring, and preventing more of it proactively
Analytics Models
The four analytics models, in increasing order of both difficulty and value:
• Descriptive Analytics – What happened? (information / hindsight)
• Diagnostic Analytics – Why did it happen? (insight)
• Predictive Analytics – What will happen? (foresight)
• Prescriptive Analytics – How can we make it happen? (optimization)
Descriptive Analytics
• Descriptive analytics, such as reporting/OLAP,
dashboards, and data visualization, have been widely
used for some time: monthly revenue reports, sales
leads, KPI (Key Performance Indicator) dashboards.
• They are the core of traditional BI.
• Descriptive analytics is about "what is happening now,
based on incoming data."
• It is a method for quantitatively describing the main
features of a collection of data.
• Key points about Descriptive analytics:
– Typically, it is the first kind of data analysis performed
on a dataset.
– Usually it is applied to large volumes of data, such as
census data.
– Description and interpretation processes are
different steps.
• One example where descriptive analysis can be
useful is the sales cycle, for example, to
categorize customers by their likely product
preferences and purchasing patterns.
• Another example is the Census Data Set, where
descriptive analysis is applied on a whole
population
• Often during data collection, researchers and
analysts are faced with raw data that needs to
be organized and summarized before it can be
analyzed.
• Data, when presented in an organized
manner, allows the observers to draw
conclusions, and reveal hidden patterns.
• Descriptive statistics facilitate in analyzing
and summarizing the data and are thus
instrumental to processes inherent in data
science.
• Data cannot be properly used if it is not
correctly interpreted.
• This requires appropriate statistics.
– For example, should we use the mean, median, or
mode, two of these, or all three? Each of these
measures is a summary that emphasizes certain
aspects of the data and overlooks others. They all
provide information we need to get a full picture
of the world we are trying to understand
• The important features of a data set need to
be extracted in order to describe it.
Diagnostic analysis
• Emphasizes "why it happened"
• Descriptive analysis focuses on "what has
happened"
• Tries to gain a deeper understanding of the
reasons behind the pattern of data found in
the past.
• Uses BI to dig down in depth to find the root
cause of the pattern or the nature of data
obtained.
• Eg: A data analyst can find out why the
performance of a student has risen (or degraded)
over the past 7 years.
• Diagnostic analysis deals with the critical aspect of
finding the reason behind a particular change or
cause in a phenomenon
• Machine learning techniques are used along with BI
to gain a deeper understanding of a problem.
• Sometimes this type of analytics when done
hands-on with a small dataset is also known as
causal analysis, since it involves at least one cause
(usually more than one) and one effect.
• This allows a look at past performance to
determine what happened and why.
• The result of the analysis is often referred to
as an analytic dashboard.
• For example,
– for a social media marketing campaign, you can
use descriptive analytics to assess the number of
posts, mentions, followers, fans, page views,
reviews, or pins, etc.
– There can be thousands of online mentions that
can be distilled into a single view to see what
worked and what did not work in your past
campaigns
Predictive analytics
• Predictive analytics has its roots in our ability to
predict what might happen.
• These analytics are about understanding the
future using the data and the trends we have seen
in the past, as well as emerging new contexts and
processes.
• An example is
– trying to predict how people will spend their tax
refunds based on how consumers normally behave
around a given time of the year (past data and trends),
and how a new tax policy (new context) may affect
people’s refunds.
• Deals with the prediction of the future based
on the available current and past data.
– Eg: Predict the performance of each player for an
upcoming international cricket world cup.
• Predictive analytics is applied in many domains
such as risk management, sales forecasting,
weather forecasting, etc.
• Prediction is generally made by
– Dividing the available data set into a training set
and a testing set
– Applying machine learning and checking the
accuracy level of the prediction
• A predicted solution provides an approximate,
forecasted result that may vary from the actual result,
since 100% accuracy is not guaranteed.
• Predictive analytics is done in stages, as follows:
– First, once the data collection is complete, it needs to go
through the process of cleaning
– Cleaned data can help us obtain hindsight in relationships
between different variables. Plotting the data (e.g., on a
scatterplot) is a good place to look for hindsight.
– Next, we need to confirm the existence of such
relationships in the data. This is where regression comes
into play. From the regression equation, we can confirm
the pattern of distribution inside the data. In other words,
we obtain insight from hindsight.
– Finally, based on the identified patterns, or insight, we can
predict the future, i.e., foresight.
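A hedged sketch of the hindsight -> insight -> foresight stages described above, using scikit-learn; the data is synthetic, generated only for illustration.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))              # cleaned explanatory variable
y = 3.0 * X.ravel() + rng.normal(0, 1.0, 200)      # response with noise

# Divide the data into a training set and a testing set, as described above.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_train, y_train)   # regression confirms the pattern (insight)
print("R^2 on unseen data:", model.score(X_test, y_test))
print("Forecast for x = 12:", model.predict([[12.0]]))  # foresight: predict an unseen value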
• Examples:
– many people turn to predictive analytics to produce their
credit scores.
– Financial services use such numbers to determine the
probability that a customer will make their credit payments
on time. FICO, in particular, has extensively used predictive
analytics to develop the methodology to calculate
individual FICO scores.
– Customer relationship management (CRM) constitutes
another common area for predictive analytics. Here, the
process contributes to objectives such as marketing
campaigns, sales, and customer service.
– Predictive analytics applications are also used in the
healthcare field. They can determine which patients are at
risk for developing certain conditions such as diabetes,
asthma, and other chronic or serious illnesses.
Prescriptive analytics
• Involves a higher degree of complexity
• It is the area of business analytics dedicated to
finding the best course of action for a given
situation
• The insights gained from the previous three
types of analytics are combined to determine
the kind of action to be taken to solve a
certain situation.
• Describes what steps are needed to avoid a future
problem.
• Prescriptive analytics start by first analyzing the
situation (using descriptive analysis), but then
moves toward finding connections among various
parameters/variables, and their relation to each
other to address a specific problem, more likely
that of prediction.
• The prescriptive approach is a process-intensive
task, that analyzes potential decisions, the
interactions between decisions, the influences
that bear upon these decisions, and the bearing
all of this has on an outcome to ultimately
prescribe an optimal course of action in real time
• Prescriptive analytics can also suggest options
for taking advantage of a future opportunity
or mitigate a future risk and illustrate the
implications of each.
• In practice, prescriptive analytics can
continually and automatically process new
data to improve the accuracy of predictions
and provide advantageous decision options.
• For example, in healthcare, we can better
manage the patient population by using
prescriptive analytics to measure the number
of patients who are clinically obese, then add
filters for factors like diabetes and LDL
cholesterol levels to determine where to focus
treatment.
Exploratory analysis
• Often when working with data, situations arise
where we may not have a clear understanding of
the problem or the situation, and yet we may be
called on to provide some insights.
• In other words, we are asked to provide an
answer without knowing the question! This is
where we go for an exploration.
• Exploratory analysis is an approach to analyzing
datasets to find previously unknown relationships.
• Often such analysis involves using various data
visualization approaches
• Plotting data in different forms, can provide us
with some clues regarding what we may find or
want to find in the data.
– This situation arises when we don’t have a clear
question or hypothesis.
• Such insights can then be useful for defining
future studies/questions, leading to other forms
of analysis.
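A minimal exploratory sketch of "plotting the data in different forms", as mentioned above, assuming pandas and matplotlib; the CSV file and its columns are hypothetical.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("census_sample.csv")

df.hist(figsize=(10, 6))                                 # distribution of each numeric column
pd.plotting.scatter_matrix(df.select_dtypes("number"))   # pairwise relationships
print(df.corr(numeric_only=True))                        # quick look for unexpected correlations
plt.show()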
• Exploratory analytics is a direct approach
whereby we allow the data itself to reveal its
underlying structure in the form of a model.
• Thus, exploratory analysis is not a mere
collection of techniques; rather, it offers a
philosophy as to
– how to dissect a dataset;
– what to look for;
– how to look;
– how to interpret the outcomes.
• Examples
– Finding patterns in data, e.g., finding groups of
similar genes from a collection of samples.
– Consider the census data of a country:
• Which state has the highest population? – Descriptive
analytics
• Which city will have the lowest influx of immigrant
population? – Predictive analytics
• To find interesting, unknown insights from this
massive data set, we need to explore – Exploratory analysis
Mechanistic Analytics
• Mechanistic analysis involves understanding the
exact changes in variables that lead to changes in
other variables for individual objects.
– For instance, we may want to know how the number
of free doughnuts per employee per day affects
employee productivity.
– Perhaps by giving them one extra doughnut we gain a
5% productivity boost, but two extra doughnuts could
end up making them lazy (and diabetic)!
• Studying the effects of carbon emissions on
bringing about the Earth’s climate change.
Here, we are interested in seeing how the
increased amount of CO2 in the atmosphere is
causing the overall temperature to change
• Mechanistic analytics is thus studying a
relationship between two variables. Such
relationships are often explored using
regression.
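A small sketch of exploring such a two-variable relationship with regression, assuming SciPy; the CO2 and temperature-anomaly numbers below are illustrative only, not real measurements.

from scipy.stats import linregress

co2_ppm = [340, 350, 360, 370, 380, 390, 400, 410]               # illustrative values
temp_anomaly = [0.10, 0.18, 0.25, 0.33, 0.42, 0.48, 0.60, 0.68]  # illustrative values

result = linregress(co2_ppm, temp_anomaly)
print("slope (degrees per ppm):", result.slope)
print("r-squared:", result.rvalue ** 2)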
Mathematical Model : concepts
Model Evaluation
• Modeling: the process of encapsulating
information into a tool which can forecast and
make predictions.
• Predictive models are structured around some
idea of what causes future events to happen.
• Effectively formulating models requires a detailed
understanding of the space of possible choices.
• Accurately evaluating the performance of a model
can be surprisingly hard, but it is essential for
knowing how to interpret the resulting
predictions.
• The best forecasting system is not necessarily
the most accurate one, but the model with
the best sense of its boundaries and
limitations.
• Simpler models provide for the best
explanation.
• Simpler models tend to be more robust and
understandable than complicated
alternatives.
• Principles for effective modeling:
– Think probabilistically
• Forecasts which make concrete statements are less
meaningful than those that are inherently probabilistic.
– A forecast that Trump has only 28.3% chance of winning is
more meaningful than one that categorically states that he
will lose.
• Real world is full of uncertainty; successful models
recognize this uncertainty.
• There are always a range of possible outcomes that can
occur with slight perturbations of reality, and this
should be captured in your model.
• Forecasts of numerical quantities should not be single
numbers, but instead report probability distributions.
– Change your forecast in response to new
information
• Live models are much more valuable and interesting
than dead ones.
• Live models are those that continually update their
predictions in response to new information.
• Building a live model is more complex than that of a
one-off computation, but much more valuable
• Live models are more intellectual than dead ones, since
fresh information should change the result of any
forecast.
• The model should be open to changing opinions in
response to new data.
• Dynamically-changing forecasts provide excellent
opportunities to evaluate your model.
– Do they ultimately converge on the correct answer?
– Does uncertainty diminish as the event approaches?
• Any live model should track and display its predictions
over time, so the viewer can gauge whether changes
accurately reflected the impact of new information.
– Look for consensus
• A good prediction model comes from multiple distinct
sources of evidence.
• Data should derive from as many different sources as
possible.
• Ideally, multiple models should be built, each trying to
predict the same thing in different ways.
• You should have an opinion as to which model is the
best, but be concerned when it substantially differs
from the herd.
• Often competitors generate competing forecasts,
which we can monitor and compare, thus providing a
reality check.
– Who has been doing better lately?
– What explains the differences in the forecast?
– Can your model be improved?
• Eg: Google's Flu Trends forecasting model
predicted disease outbreaks by monitoring key
words on search: a surge in people looking for
aspirin or fever might suggest that illness is
spreading.
• Google's forecasting model proved quite
consistent with the Center for Disease Control's
(CDC) statistics on actual flu cases for several
years, until they embarrassingly went astray.
• The world changes. Among the changes was that
Google's search interface began to suggest search
queries in response to a user's history.
• When offered the suggestion, many more
people started searching for aspirin after
searching for fever.
• And the old model suddenly wasn't accurate
anymore.
• Google's sins lay in not monitoring its
performance and adjusting over time.
– Employ Bayesian reasoning
• Bayesian reasoning starts from the prior distribution of
probability, then weighs further evidence by how strongly it
should impact the probability of the event.
• Bayes theorem provides a way to calculate how probabilities
change in response to new events.
• P(A|B) = P(B|A)P(A) / P(B)
• It provides a way to calculate how the probability of event A
changes in response to a new evidence B
• Applying Bayes' theorem requires a prior probability
P(A), the likelihood of event A before knowing the
status of a particular event B.
• This might be the result of running a classifier to predict
the status of A from other features, or background
knowledge about event frequencies in a population.
• Without a good estimate for this prior, it is very
difficult to know how seriously to take the classifier.
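A worked use of Bayes' theorem as stated above; the disease/test numbers are hypothetical, chosen only to make the arithmetic easy to follow.

def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)"""
    return p_b_given_a * p_a / p_b

p_disease = 0.01                       # prior P(A): 1% of the population has the disease
p_pos_given_disease = 0.95             # likelihood P(B|A): test sensitivity
p_pos = 0.95 * 0.01 + 0.05 * 0.99      # P(B): total probability of a positive test
print(bayes(p_pos_given_disease, p_disease, p_pos))   # ~0.161: posterior P(A|B)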
• Baseline models
– Baseline models are built to assess the complexity
of a task
– They are the simplest reasonable models that
produce answers we can compare with.
– More sophisticated models should do better than
the baseline models, thereby evaluating the
performance of the complex models.
Baseline models for classification
• Two common tasks for data science models :
– Classification
– Value prediction
• In classification tasks, we are given a small set of
possible labels for any given item, like (spam or
not spam), (man or woman), or (bicycle, car, or truck).
• We seek a system that will generate a label
accurately describing a particular instance of an
email, person, or vehicle
• Following are the representative baseline models
for classification:
– Uniform & Random selection among labels :
• If there is no knowledge about the prior
distribution/labeling on the objects, then we might make
arbitrary selections. (Blind classifiers)
– Eg : comparing stock market prediction model to random coin flips.
– The most common label appearing in the training
data:
• A large training set usually provides some notion of a prior
distribution on the classes.
• Selecting the most frequent label is better than selecting
them uniformly or randomly.
• Eg: theory behind sun-will-rise-tomorrow baseline model
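One way to build this "most frequent label" baseline is scikit-learn's DummyClassifier; the tiny label array below is just for illustration.

import numpy as np
from sklearn.dummy import DummyClassifier

X = np.zeros((8, 1))                      # features are ignored by this baseline
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])    # imbalanced labels: 75% are class 0

baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
print(baseline.predict(X))     # always predicts 0, the most common label
print(baseline.score(X, y))    # 0.75 - the bar a more sophisticated model must beat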
– The most accurate single-feature model :
• Powerful models strive to exploit all the useful features
present in a given data set.
• But it is valuable to know what the best single feature
can do.
• Building classifiers based on a single numerical feature,
say x, is easy.
– Eg: we declare that the item is in class 1 if x >= t, and
class 2 otherwise.
– To find the best threshold t, we can test all n possible
thresholds of the form t = x_i + ε, where x_i is the value of
the feature in the i-th of the n training instances. Then
select the threshold which yields the most accurate
classifier on your training data.
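A sketch of the single-feature threshold baseline just described: try a threshold just above each training value and keep the most accurate rule (the data here is made up).

import numpy as np

def best_threshold(x, y, eps=1e-9):
    """Return (t, accuracy) for the rule: class 1 if x >= t, else class 2."""
    best_t, best_acc = None, -1.0
    for xi in np.unique(x):
        t = xi + eps                       # threshold of the form t = x_i + eps
        pred = np.where(x >= t, 1, 2)
        acc = np.mean(pred == y)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

x = np.array([1.2, 2.5, 3.1, 4.8, 5.0, 6.7])
y = np.array([2, 2, 2, 1, 1, 1])
print(best_threshold(x, y))    # a threshold just above 3.1 separates the classes perfectly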
– Somebody else’s model:
– Often , there may be a legacy model in place, which needs to
be updated or revised.
» Perhaps a close variant of the problem has been
discussed in an academic paper, and maybe they even
released their code on the web for you to experiment
with.
– One of two things can happen when you compare your model
against someone else's work:
» either you beat them or you don't.
– If you beat them, you now have something worth bragging
about.
– If you don't, it is a chance to learn and improve.
– Not winning gives you certainty that your model can be
improved, at least to the level of the other guy's model.
Evaluating Models /Classifiers
• It is necessary to check how good our predictive
model is.
• The informal sniff test is perhaps an important
criterion for evaluating a model.
• The formal evaluations are generally based on
some summary statistics, aggregated over
many instances.
• A sniff test involves looking carefully at a few
example instances where the model got it
right, and a few where it got it wrong.
• The best way to assess models involves
out-of-sample predictions, results on data that
you never saw (or even better, did not exist)
when you built the model.
• Good performance on the data that you
trained models on is very suspect, because
models can easily be overfit.
– overfitting is "the production of an analysis that
corresponds too closely or exactly to a particular
set of data, and may therefore fail to fit to
additional data or predict future observations
reliably".
• Out of sample predictions are the key to being
honest, provided you have enough data and
time to test them
• Evaluating Classifiers:
– Evaluating a classifier means measuring how
accurately our predicted labels match the gold
standard labels in the evaluation set.
• In case of binary classification , there will be
two distinct labels or classes.
– Typically, the smaller and more interesting of the
two classes is labeled positive and the larger/other
class negative.
– In a spam classification problem, the spam would
typically be positive and the ham (non-spam)
would be negative.
• There are four possible results of what the
classification model could do on any given
instance, which defines the confusion matrix or
contingency table
– True Positives (TP): Here our classifier labels a positive
item as positive, resulting in a win for the classifier.
– True Negatives (TN): Here the classifier correctly
determines that a member of the negative class
deserves a negative label. Another win.
– False Positives (FP): The classifier mistakenly calls a
negative item as a positive, resulting in a “type I"
classification error.
– False Negatives (FN): The classifier mistakenly
declares a positive item as negative, resulting in a
“type II" classification error.
                                  Predicted class
                            YES                      NO
Actual     YES     True Positives (TP)      False Negatives (FN)
Class      NO      False Positives (FP)     True Negatives (TN)
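A quick way to obtain these four counts, assuming scikit-learn; the spam (1) / ham (0) labels below are hypothetical.

from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]   # gold standard labels (1 = spam, 0 = ham)
y_pred = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0]   # classifier's predicted labels

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)   # TP: 3 TN: 5 FP: 1 FN: 1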
• Accuracy, Precision, Recall and F-Score :
– Several evaluation statistics can be computed
from the Confusion Matrix.
– All these evaluation statistics help us to defend
our Classifier against two baseline opponents
• An opponent classifier that declares all items positive,
or perhaps all negative, thus making the evaluation
statistics look bad, by achieving a high score with a
useless classifier. (Sharp)
• An opponent classifier that randomly guesses on each
instance. (monkey)
– Thus, to defend our classifier, it is important to
establish by how much it beats both our
opponents.
– Accuracy of a classifier is the first measure of
evaluation.
• Defined as the ratio of the number of correct
predictions over the total number of predictions.
– accuracy = (TP + TN) / (TP + TN + FP + FN)
• By multiplying such fractions by 100, we can get the
percentage accuracy score.
• An accuracy score is meaningful only when we know
how accurate the baseline classifiers are on the same data.
• Precision measures how often the classifier is
correct when it dares to say positive.
– Precision = TP / (TP + FP), where TP + FP = Total
Predicted Positives
                             Predicted
                       Negative    Positive
Actual     Negative       TN          FP
           Positive       FN          TP
• Thus Precision = TP / Total Predicted Positives
• Precision tells us how precise/accurate the model is:
out of those predicted positive, how many are
actually positive.
• Precision is a good measure when the cost of a
False Positive is high.
• If the classifier issues too many positive labels, it is
doomed to low precision because so many bullets miss
their mark, resulting in many false positives.
• But if the classifier is stingy with positive labels, very
few of them are likely to connect with the rare positive
instances, so the classifier achieves low true positives
• Recall measures how often you prove right
on all positive instances
– Recall = TP / (TP + FN), where TP + FN = Actual
Positives.
– A high recall implies that the classifier has few false
negatives.
– Recall calculates how many of the Actual Positives
our model captures by labeling them as Positive
(True Positive).
– By the same reasoning, Recall is the metric to use for
selecting the best model when there is a high cost
associated with a False Negative.
• F-score (F1-score):
– A single measurement describing the performance
of the system, in terms of precision and recall
– F-score gives the harmonic mean of precision and
recall
• F-score = 2 *(( precision * recall) /(precision + recall))
– Achieving a high F-score requires both high recall
and high precision.
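The four evaluation statistics computed directly from confusion-matrix counts, matching the formulas above; the TP/TN/FP/FN values are hypothetical.

tp, tn, fp, fn = 3, 5, 1, 1

accuracy  = (tp + tn) / (tp + tn + fp + fn)                  # 0.8
precision = tp / (tp + fp)                                   # 0.75
recall    = tp / (tp + fn)                                   # 0.75
f_score   = 2 * precision * recall / (precision + recall)    # 0.75
print(accuracy, precision, recall, f_score)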
ROC (Receiver Operating Characteristic) Curve
AUC (Area Under the Curve)
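A brief sketch of how an ROC curve and its AUC can be computed, assuming scikit-learn; the labels and predicted scores below are illustrative only.

from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.9]    # classifier's positive-class scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)       # ROC: TPR vs FPR at each threshold
print("AUC:", roc_auc_score(y_true, y_score))           # 0.875 here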
