
Introduction To Business Analytics: DR Sandipan Karmakar Department of Management Studies MNIT Jaipur

The document provides an overview of business analytics, highlighting its importance in decision-making across various sectors such as banking, e-commerce, airlines, and healthcare. It discusses the processes involved in evaluating loan applications, product recommendations, pricing strategies, and disease diagnosis, emphasizing the role of data analysis techniques like descriptive, predictive, and prescriptive analytics. Additionally, it covers the growth of business analytics driven by technological advancements, methodological developments, and the increasing volume of data generated in various industries.

Uploaded by

hrithiks435

Introduction to Business Analytics

Dr Sandipan Karmakar
Department of Management Studies
MNIT Jaipur
Some Real World Challenges
You apply for a loan for the first time. How does the bank assess the riskiness of the loan it might make
to you?
How does Amazon.com know which books and other products to recommend to you whenever you log
in to their web site?
How do airlines determine what price to quote to you when you are shopping for a plane ticket?
How can doctors better diagnose and treat you when you are ill or injured?

And many more...


Evaluating Loan Applications
You might be applying for loans for the first time
But, millions of people around the world have applied for loans before
Many of these loan recipients have paid back their loans in full and on time, but some have not
The bank wants to know whether you are more like those who have paid back their loans or more like
those who defaulted
By comparing your credit history, financial situation, and other factors to the vast database of previous
loan recipients, the bank can effectively assess how likely you are to default on a loan
Product Recommendation by amazon.com
amazon.com has access to data on millions of purchases made by customers on its web site.
amazon.com examines your previous purchases, the products you have viewed, and any product
recommendations you have provided.
amazon.com then searches through its huge database for customers who are similar to you in terms of
product purchases, recommendations, and interests.
Once similar customers have been identified, their purchases form the basis of the recommendations
given to you
Quoting Air Ticket Prices
Prices for airline tickets are frequently updated.
The price quoted to you for a flight between New York and San Francisco today could be very different
from the price that will be quoted tomorrow.
These changes happen because airlines use a pricing strategy known as revenue management.
Revenue management works by examining vast amounts of data on past airline customer purchases and
using these data to forecast future purchases.
These forecasts are then fed into sophisticated optimization algorithms that determine the optimal price
to charge for a particular flight and when to change that price.
Revenue management has resulted in substantial increases in airline revenues
Diagnosing Diseases
Hundreds of medical papers may describe research studies done on patients facing similar diagnoses,
and thousands of data points exist on their outcomes.
However, it is extremely unlikely that your doctor has read every one of these research papers or is
aware of all previous patient outcomes.
Instead of relying only on her medical training and knowledge gained from her limited set of previous
patients, wouldn’t it be better for your doctor to have access to the expertise and patient histories of
thousands of doctors around the world?
Why Business Analytics
Analytics is a collection of techniques and tools for creating value from data; the techniques include
concepts such as AI, ML, and DL
Organizations across the world use performance measures such as ROI, market share, consumer
retention, sales growth, and customer satisfaction to quantify, monitor, benchmark, and improve
their performance.
Organizations strive to understand the relationship between different key performance indicators (KPIs)
and the factors that have a significant impact on those KPIs, so that they can be managed effectively.
Knowing the relationship between the KPIs and these factors gives decision makers
actionable knowledge
AI/ML algorithms can be used to identify the factors that influence the KPIs, which can then be
used in decision making for value creation
Growth of Business Analytics
Three developments spurred recent explosive growth in the use of analytical methods in business
applications
First, technological advances—such as improved point-of-sale scanner technology and the collection of
data through e-commerce and social networks, data obtained by sensors on all kinds of mechanical
devices such as aircraft engines, automobiles, and farm machinery through the so-called Internet of
Things and data generated from personal electronic devices—produce incredible amounts of data for
businesses
Second, ongoing research has resulted in numerous methodological developments, including advances in
computational approaches to effectively handle and explore massive amounts of data, faster algorithms
for optimization and simulation, and more effective approaches for visualizing data
Third, these methodological developments were paired with an explosion in computing power and
storage capability
Decision Making
Managers are responsible for planning, coordinating, organizing, and leading their organizations to better
performance
Ultimately, managers’ responsibilities require that they make strategic, tactical, or operational decisions
Strategic decisions involve higher-level issues concerned with the overall direction of the
organization; these decisions define the organization’s overall goals and aspirations for the future.
Strategic decisions are usually the domain of higher-level executives and have a time horizon of
three to five years
Tactical decisions concern how the organization should achieve the goals and objectives set by its
strategy, and they are usually the responsibility of midlevel management. Tactical decisions usually
span a year and thus are revisited annually or even every six months
Operational decisions affect how the firm is run from day to day; they are the domain of
operations managers, who are the closest to the customer.
Process of Decision Making
Decision making can have the following steps:
1. Identify and define the problem.
2. Determine the criteria that will be used to evaluate alternative solutions.
3. Determine the set of alternative solutions.
4. Evaluate the alternatives.
5. Choose an alternative.
There are a number of approaches to making decisions:
tradition (“We’ve always done it this way”),
intuition (“gut feeling”) and
rules of thumb (“As the restaurant owner, I schedule twice the number of waiters and cooks on
holidays”)
Managerial experience and intuition are valuable inputs to making decisions, but what if relevant data
were available to help us make more informed decisions?
How can managers convert these data into knowledge that they can use to be more efficient and
effective in managing their businesses?
Business Analytics Defined
What makes decision making difficult and challenging?
Uncertainty
Too many alternatives to evaluate them all
Business analytics is the scientific process of transforming data into insight for making better decisions
(INFORMS)
It is used for data-driven or fact-based decision making, which is often seen as more objective than other
approaches to decision making
It creates insight from data by improving our ability to forecast accurately for planning, by
helping us quantify risk, and by yielding better alternatives through analysis and optimization
Steps in Business Analytics
Identify the problem or opportunity for value creation
Identify sources of data (primary, secondary) and create a data lake (integrated data from various
sources)
Pre-process the data for issues such as missing and incorrect values; generate derived variables
(feature engineering) and transform the data if necessary; prepare the data for AI/ML based model
building
Divide the data into training, validation, and testing subsets
Build AI/ML models and identify the best model(s) using model performance on the validation data
Implement the Solution/Decision/Develop product
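
The data-splitting step above can be sketched in a few lines of Python. This is only an illustration, not a prescribed procedure: the 60/20/20 proportions, the fixed random seed, and the function name `split_dataset` are all assumptions made for the example.

```python
import random

def split_dataset(records, train_frac=0.6, valid_frac=0.2, seed=42):
    """Shuffle the records and split them into training, validation, and test subsets.

    The fractions and the seed are illustrative assumptions, not values from the slides.
    """
    if not 0 < train_frac + valid_frac < 1:
        raise ValueError("train_frac + valid_frac must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]    # whatever remains is the test subset
    return train, valid, test

records = list(range(100))                 # stand-in for 100 prepared data rows
train, valid, test = split_dataset(records)
print(len(train), len(valid), len(test))
```

Models would be fitted on `train`, compared on `valid`, and the chosen model checked once on `test`.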
Paradigms of Analytics
Business analytics can involve anything from simple reports to the most advanced optimization
techniques
Three broad categories of techniques: descriptive analytics, predictive analytics, and prescriptive
analytics
Descriptive analytics encompasses the set of techniques that describes what has happened in the past.
Examples are data queries, reports, descriptive statistics, data visualization including data dashboards,
some data-mining techniques, and basic what-if spreadsheet models
Predictive analytics consists of techniques that use models constructed from past data to predict the
future or ascertain the impact of one variable on another. For example, past data on product sales may be
used to construct a mathematical model to predict future sales
Prescriptive analytics differs from descriptive and predictive analytics in that it indicates a course of
action to take; that is, the output of a prescriptive model is a decision
Descriptive analytics - Examples
A data query is a request for information with certain characteristics from a database.
For example, a query to a manufacturing plant’s database might be for all records of shipments to a
particular distribution center during the month of March. This query provides descriptive information
about these shipments: the number of shipments, how much was included in each shipment, the date
each shipment was sent, and so on
Data dashboards are collections of tables, charts, maps, and summary statistics that are updated as new
data become available. Dashboards are used to help management monitor specific aspects of the
company’s performance related to their decision-making responsibilities
Data mining is the use of analytical techniques for better understanding patterns and relationships that
exist in large data sets. For example, by analyzing text on social network platforms like Twitter,
data-mining techniques (including cluster analysis and sentiment analysis) are used by companies to
better understand their customers
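
The data query described at the start of this slide can be sketched in Python as a filter over shipment records. The records, field names, and the helper `query_shipments` are hypothetical and exist only for illustration; a real plant database would normally be queried with SQL or a similar tool.

```python
from datetime import date

# Hypothetical shipment records; every field name and value is made up.
shipments = [
    {"center": "DC-North", "shipped": date(2024, 3, 4), "units": 120},
    {"center": "DC-North", "shipped": date(2024, 3, 18), "units": 80},
    {"center": "DC-South", "shipped": date(2024, 3, 9), "units": 200},
    {"center": "DC-North", "shipped": date(2024, 4, 2), "units": 150},
]

def query_shipments(records, center, year, month):
    """Return the records shipped to one distribution center in a given month."""
    return [
        r for r in records
        if r["center"] == center
        and r["shipped"].year == year
        and r["shipped"].month == month
    ]

# All shipments to one distribution center during March.
march_north = query_shipments(shipments, "DC-North", 2024, 3)

# Descriptive information about the matching shipments.
summary = {
    "number_of_shipments": len(march_north),
    "total_units": sum(r["units"] for r in march_north),
    "dates": [r["shipped"].isoformat() for r in march_north],
}
print(summary)
```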
Predictive Analytics - Examples
Past data on product sales may be used to construct a mathematical model to predict future sales. This model
can factor in the product’s growth trajectory and seasonality based on past patterns.
A packaged-food manufacturer may use point-of-sale scanner data from retail outlets to help in estimating the
lift in unit sales due to coupons or sales events. Survey data and past purchase behavior may be used to help
predict the market share of a new product
Linear regression, time series analysis, some data-mining techniques, and simulation, often referred to as risk
analysis, all fall under the banner of predictive analytics
Data mining, previously discussed as a descriptive analytics tool, is also often used in predictive analytics.
For example, a large grocery store chain might be interested in developing a targeted marketing campaign
that offers a discount coupon on potato chips. By studying historical point-of-sale data, the store may be able
to use data mining to predict which customers are the most likely to respond to an offer on discounted chips
by purchasing higher-margin items such as beer or soft drinks in addition to the chips, thus increasing the
store’s overall revenue
Prescriptive Analytics
A forecast or prediction, when combined with a rule, becomes a prescriptive model
We may develop a model to predict the probability that a person will default on a loan. If we create a rule
that says if the estimated probability of default is more than 0.6, we should not award a loan, now the
predictive model, coupled with the rule is prescriptive analytics. These types of prescriptive models that rely
on a rule or set of rules are often referred to as rule-based models
Portfolio models in finance, supply network design models in operations, and price-markdown models in
retailing – all these require some optimization models, that is, models that give the best decision subject to
the constraints of the situation
Another type of modeling in the prescriptive analytics category is simulation optimization which combines
the use of probability and statistics to model uncertainty with optimization techniques to find good decisions
in highly complex and highly uncertain settings
The techniques of decision analysis can be used to develop an optimal strategy when a decision maker is
faced with several decision alternatives and an uncertain set of future events.
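
The rule-based loan example above can be sketched as a predictive model combined with a decision rule. Only the 0.6 threshold comes from the text. The predictive model here is a hypothetical logistic function with made-up weights and inputs (`debt_to_income`, `late_payments`); in practice it would be estimated from historical loan data.

```python
import math

DEFAULT_THRESHOLD = 0.6  # threshold from the rule stated in the text

def estimate_default_probability(debt_to_income, late_payments):
    """Hypothetical predictive model: a logistic function with made-up weights."""
    score = -3.0 + 4.0 * debt_to_income + 0.8 * late_payments
    return 1.0 / (1.0 + math.exp(-score))

def loan_decision(probability_of_default, threshold=DEFAULT_THRESHOLD):
    """Rule: do not award the loan if the estimated probability exceeds the threshold."""
    return "reject" if probability_of_default > threshold else "approve"

# Two illustrative applicants (values are invented).
for name, dti, late in [("A", 0.2, 0), ("B", 0.7, 2)]:
    p = estimate_default_probability(dti, late)
    print(name, round(p, 3), loan_decision(p))
```

The prediction alone is predictive analytics; it is the added rule, whose output is a decision, that makes the combination prescriptive.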
Spectrum of Business Analytics
Big Data
Walmart handles over 1 million purchase transactions per hour. Facebook processes more than 250
million picture uploads per day
Six billion cell phone owners around the world generate vast amounts of data by calling, texting,
tweeting, and browsing the web daily
Google has stated that the amount of data currently created every 48 hours is equivalent to the entire
amount of data created from the dawn of civilization until the year 2003
The Internet, cell phones, retail checkout scanners, surveillance video, and sensors on everything from
aircraft to cars to bridges allow us to collect and store vast amounts of data in real time
Probably the most accepted and most general definition is that big data is any set of data that is too large
or too complex to be handled by standard data-processing techniques and typical desktop software
IBM describes the phenomenon of big data through the four Vs: volume, velocity, variety, and
veracity. A fifth V, variability, is sometimes added
Cross Industry Practice of Data Mining

• The Cross-Industry Standard Process for Data Mining (CRISP-DM) was developed by analysts
representing Daimler-Chrysler, SPSS, and NCR
• CRISP-DM provides a nonproprietary and freely available standard process for fitting data
mining into the general problem-solving strategy of a business or research unit
Four Basic Tasks of Analytics
Detailed Tasks of Analytics
Description & Estimation [Descriptive]
Classification [Predictive]
Association Rule Discovery [Descriptive]
Clustering [Descriptive]
Discriminant Analysis [Predictive]
Sequential Pattern Discovery [Descriptive]
Regression & Neural Networks [Predictive]
Deviation Detection [Predictive]
Time Series & Forecasting [Predictive]
Optimization Model [Prescriptive]
Decision Analysis [Prescriptive]
Classification - Definition
Given a collection of records (training set)
Each record contains a set of attributes; one of the attributes is the class.
Find a model for the class attribute as a function of the values of the other attributes
Goal: previously unseen records should be assigned a class as accurately as possible.
A test set is used to determine the accuracy of the model. Usually, the given data set is divided into
training and test sets, with training set used to build the model and test set used to validate it
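
The train-then-test workflow defined above can be sketched with a very simple classifier. The slides do not name a specific algorithm; a 1-nearest-neighbour rule is used here purely because it is short. The records (income in thousands, years employed) and their class labels are invented for illustration.

```python
import math

# Each record: (attributes, class). All values are hypothetical.
training_set = [
    ((60, 8), "repaid"), ((75, 10), "repaid"), ((55, 6), "repaid"),
    ((20, 1), "defaulted"), ((25, 2), "defaulted"), ((18, 0.5), "defaulted"),
]
test_set = [
    ((70, 9), "repaid"), ((22, 1), "defaulted"),
    ((50, 5), "repaid"), ((30, 3), "defaulted"),
]

def predict_1nn(training, attributes):
    """Assign the class of the training record closest to `attributes`."""
    nearest = min(training, key=lambda rec: math.dist(rec[0], attributes))
    return nearest[1]

def accuracy(training, test):
    """Fraction of previously unseen test records assigned the correct class."""
    correct = sum(predict_1nn(training, x) == y for x, y in test)
    return correct / len(test)

print("test accuracy:", accuracy(training_set, test_set))
```

The model is built only from `training_set`; `test_set` is held back so the reported accuracy reflects records the model has not seen.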
Classification Example
[Figure: a training set whose columns are two categorical attributes, one continuous attribute, and a class attribute is used to learn a classifier model, which is then applied to a test set.]
Classification – Application 1
Direct Marketing
Goal: Reduce cost of mailing by targeting a set of consumers likely to buy a new cell-phone product
Approach
• Use the data for a similar product introduced before.
• We know which customers decided to buy and which decided otherwise. This {buy, don’t buy}
decision forms the class attribute.
• Collect various demographic, lifestyle, and company-interaction related information about all
such customers.
• Type of business, where they stay, how much they earn, etc.
• Use this information as input attributes to learn a classifier model.
Classification – Application 2
Fraud Detection
Goal: Predict fraudulent cases in credit card transactions
Approach
• Use credit card transactions and the information on its account-holder as attributes.
• When does a customer buy, what does he buy, how often does he pay on time, etc.
• Label past transactions as fraud or fair transactions. This forms the class attribute.
• Learn a model for the class of the transactions.
• Use this model to detect fraud by observing credit card transactions on an account.
Classification – Application 3
Customer Attrition/Churn
Goal: To predict whether a customer is likely to be lost to a competitor
Approach
• Use detailed records of transactions with each past and present customer to find
attributes.
• How often the customer calls, where he calls, what time-of-the day he calls most, his
financial status, marital status, etc.
• Label the customers as loyal or disloyal.
• Find a model for loyalty
Clustering - Definition
Given a set of data points, each having a set of attributes, and a similarity measure among them, find
clusters such that
Data points in one cluster are more similar to one another
Data points in separate clusters are less similar to one another
Similarity Measures:
Euclidean Distance if attributes are continuous
Other Problem-specific Measures.
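
As a minimal sketch of clustering with Euclidean distance, the example below uses k-means, one common algorithm that the slides do not name. The 3-D points, the choice of k = 2, the iteration limit, and the seed are assumptions made for the illustration.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Group points into k clusters by repeatedly assigning each point to its
    nearest centroid (Euclidean distance) and recomputing the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                # The centroid is the coordinate-wise mean of the cluster members.
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:         # assignments have stopped changing
            break
        centroids = new_centroids
    return centroids, clusters

# Two well-separated, made-up groups of points in 3-D space.
group_a = [(1.0, 1.0, 1.0), (1.2, 0.8, 1.1), (0.9, 1.1, 1.0)]
group_b = [(9.0, 9.0, 9.0), (9.2, 8.8, 9.1), (8.9, 9.1, 9.0)]
centroids, clusters = k_means(group_a + group_b, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```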
Illustrative Clustering
[Figure: Euclidean distance based clustering in 3-D space. Intracluster distances are minimized; intercluster distances are maximized.]
Clustering – Application 1
Market Segmentation
Goal: subdivide a market into distinct subsets of customers where any subset may conceivably be
selected as a market target to be reached with a distinct marketing mix
Approach:
Collect different attributes of customers based on their geographical and lifestyle related
information.
Find clusters of similar customers.
Measure the clustering quality by observing buying patterns of customers in same cluster vs.
those from different clusters.
Clustering – Application 2
Document Clustering
Goal: To find groups of documents that are similar to each other based on the important terms
appearing in them
Approach:
Identify frequently occurring terms in each document. Form a similarity measure based on
the frequencies of different terms. Use it to cluster.
Gain:
Information Retrieval can utilize the clusters to relate a new document or search term to
clustered documents.
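
A similarity measure based on term frequencies could look like the sketch below. Cosine similarity is an assumption here, chosen as one widely used option; the slides do not say which measure is intended. The three short documents are invented.

```python
import math
from collections import Counter

def term_frequencies(text):
    """Count how often each (lower-cased, whitespace-separated) term occurs."""
    return Counter(text.lower().split())

def cosine_similarity(tf_a, tf_b):
    """Cosine of the angle between two term-frequency vectors (0 = no shared terms)."""
    shared = set(tf_a) & set(tf_b)
    dot = sum(tf_a[t] * tf_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical documents.
doc_loans = term_frequencies("loan default risk bank loan")
doc_credit = term_frequencies("bank loan credit risk")
doc_airline = term_frequencies("airline ticket price revenue")

print(cosine_similarity(doc_loans, doc_credit))   # share several terms
print(cosine_similarity(doc_loans, doc_airline))  # share no terms
```

Pairwise similarities like these could then be fed to a clustering algorithm so that documents with similar vocabularies end up in the same group.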
Association Rule Discovery - Definition
Given a set of records, each of which contains some number of items from a given collection
Goal: Produce dependency rules which will predict occurrence of an item based on occurrences of
other items

Rules Discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
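
One common way to score such rules is by support (how often the items occur together) and confidence (how often the consequent appears when the antecedent does). These two measures are standard in association rule mining but are not defined on the slide, and the five transactions below are hypothetical, so treat this as a sketch.

```python
# Hypothetical market-basket transactions, one set of items per purchase.
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def support(baskets, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also
    contain the consequent."""
    both = support(baskets, set(antecedent) | set(consequent))
    ante = support(baskets, antecedent)
    return both / ante if ante else 0.0

for antecedent, consequent in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(antecedent), "-->", sorted(consequent),
          "support:", round(support(transactions, antecedent | consequent), 2),
          "confidence:", round(confidence(transactions, antecedent, consequent), 2))
```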
Association Rule Discovery – Application 1
Marketing and Sales Promotion
Let the rule discovered be
{Bagels, … } --> {Potato Chips}
Potato Chips as consequent => Can be used to determine what should be done to boost its sales
Bagels in the antecedent => Can be used to see which products would be affected if the store
discontinues selling bagels
Bagels in antecedent and Potato chips in consequent => Can be used to see what products should be
sold with Bagels to promote sale of Potato chips!
Association Rule Discovery – Application 2
Supermarket shelf management
Goal: To identify items that are bought together by sufficiently many customers
Approach: Process the point-of-sale data collected with barcode scanners to find dependencies among
items.
A classic rule --
• If a customer buys diapers and milk, then he is very likely to buy beer.
• So, don’t be surprised if you find six-packs stacked next to diapers!
Association Rule Discovery – Application 3
Inventory Management
Goal: A consumer appliance repair company wants to anticipate the nature of repairs on its consumer
products and keep its service vehicles equipped with the right parts to reduce the number of visits to
consumer households.
Approach: Process the data on tools and parts required in previous repairs at different consumer
locations and discover the co-occurrence patterns
Regression
Predict a value of a given continuous valued variable based on the values of other variables, assuming a
linear or nonlinear model of dependency.
Extensively studied in statistics and in the neural network field.
Examples:
Predicting sales amounts of new product based on advertising expenditure
Predicting wind velocities as a function of temperature, humidity, air pressure, etc.
Time series prediction of stock market indices
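
For the linear case, the first example above (sales as a function of advertising expenditure) can be sketched with ordinary least squares on a single predictor. The data points and variable names are invented for the illustration, and a real analysis would usually rely on a statistics library rather than hand-written formulas.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising expenditure and resulting sales (arbitrary units).
advertising = [1, 2, 3, 4, 5]
sales = [7.1, 8.9, 11.2, 12.8, 15.0]

slope, intercept = fit_simple_linear_regression(advertising, sales)
predicted_sales = intercept + slope * 6    # prediction for a new expenditure level
print(round(slope, 2), round(intercept, 2), round(predicted_sales, 2))
```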
Why Python
Python is an interpreted, high-level, general-purpose programming language.
Its key design philosophies are code readability, ease of use, and high productivity; it has gained
significant popularity since 2012.
It has an amazing ecosystem and is excellent for developing prototypes very quickly
It has a comprehensive set of core libraries for data analysis and visualization
Unlike R, Python is not built only for data analysis; it is a general-purpose language
It can be used to build web applications and enterprise applications, and it is easier to integrate with
existing systems in an enterprise for data collection and preparation
Moreover, Python’s strong community continuously evolves its data science libraries and keeps it
cutting edge
It has libraries for Linear Algebra, Statistics, Machine Learning, Visualization, Optimization, Stochastic
Models etc.
