What Is Predictive Modeling
Most predictive models work fast and often complete their calculations in real
time. That’s why banks and retailers can, for example, calculate the risk of an
online mortgage or credit card application and accept or decline the request
almost instantly based on that prediction.
Also, being able to use more data in predictive modeling is an advantage only
to a point. Too much data can skew the calculation and lead to a meaningless
or an erroneous outcome. For example, more coats are sold as the outside
temperature drops, but only to a point. People do not buy more coats when
it's -20 degrees Fahrenheit outside than they do when it's -5 degrees. At a
certain point, cold is cold enough to spur the purchase of coats, and more
frigid temps no longer appreciably change that pattern.
However, these are not technologies that businesses can afford to adopt later,
after the tech reaches maturity and all the kinks are worked out. The near-
term advantages are simply too strong for a late adopter to overcome and
remain competitive.
Our advice: Understand and deploy the technology now and then grow the
business benefits alongside subsequent advances in the technologies.
Financial modeling and planning and budgeting are key areas to reap the
many benefits of using these advanced technologies without overwhelming
your team.
predictive modeling
By George Lawton, Joseph M. Carew and Ed Burns
"Predictive modeling is a form of data mining that analyzes historical data with
the goal of identifying trends or patterns and then using those insights to
predict future outcomes," explained Donncha Carroll, a partner in the revenue
growth practice of Axiom Consulting Partners. "Essentially, it asks the
question, 'have I seen this before' followed by, 'what typically comes after this
pattern.'"
Decision trees. Decision tree algorithms take data (mined, open source,
internal) and graph it out in branches to display the possible outcomes of
various decisions. Decision trees classify or predict response variables
based on past decisions, can be used with incomplete data sets, and are
easily explainable and accessible for novice data scientists.
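As a rough illustration of the branching idea, the sketch below fits a one-level decision tree (a "stump") that picks the single temperature split that best separates past coat purchases from non-purchases. The data, the function names and the single-feature setup are all assumptions made for this example; real decision tree libraries split on many features and use measures such as Gini impurity.

```python
# Hypothetical toy data: (outside temperature in Fahrenheit, coat purchased?)
observations = [
    (70, False), (55, False), (48, False),
    (40, True), (35, True), (20, True), (5, True),
]

def fit_stump(rows):
    """Find the temperature threshold that misclassifies the fewest rows."""
    values = sorted({temp for temp, _ in rows})
    best_threshold, best_errors = None, len(rows) + 1
    # Candidate thresholds are midpoints between neighboring observed values.
    for low, high in zip(values, values[1:]):
        candidate = (low + high) / 2
        # Rule under test: predict a purchase when temperature < candidate.
        errors = sum((temp < candidate) != bought for temp, bought in rows)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold, best_errors

threshold, errors = fit_stump(observations)

def predict(temp_f):
    """Apply the learned branch: colder than the threshold means 'buys a coat'."""
    return temp_f < threshold

print(f"split at {threshold} F with {errors} training errors")
```

The learned rule is a single readable branch, which is why tree models are often described as easy to explain.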
The most complex area of predictive modeling is the neural network. This type
of machine learning model independently reviews large volumes of labeled
data in search of correlations between variables in the data. It can detect even
subtle correlations that only emerge after reviewing millions of data points.
The algorithm can then make inferences about unlabeled data files that are
similar in type to the data set it trained on.
Predictive modeling algorithms include logistic regression, time series analysis and decision trees.
Bayesian spam filters use predictive modeling to identify the probability that a
given message is spam.
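A minimal sketch of how such a filter might estimate that probability is shown here, using a naive Bayes calculation with add-one smoothing. The training messages and function names are hypothetical, and production filters rely on far larger corpora and more careful tokenization.

```python
import math
from collections import Counter

# Hypothetical labeled training messages.
spam = ["win cash now", "free prize win", "claim free cash"]
ham = ["meeting at noon", "lunch at noon tomorrow", "project meeting notes"]

def word_counts(messages):
    return Counter(word for message in messages for word in message.split())

spam_counts, ham_counts = word_counts(spam), word_counts(ham)
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(words, counts):
    total = sum(counts.values())
    # Add-one (Laplace) smoothing keeps unseen words from zeroing the product.
    return sum(math.log((counts[w] + 1) / (total + len(vocab))) for w in words)

def spam_probability(message):
    """Estimate P(spam | message), assuming words are independent given the class."""
    words = message.split()
    prior_spam = len(spam) / (len(spam) + len(ham))
    log_spam = math.log(prior_spam) + log_likelihood(words, spam_counts)
    log_ham = math.log(1 - prior_spam) + log_likelihood(words, ham_counts)
    # Convert the two log scores into a normalized probability.
    return 1 / (1 + math.exp(log_ham - log_spam))

print(round(spam_probability("free cash"), 3))
print(round(spam_probability("meeting tomorrow"), 3))
```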
In fraud detection, predictive modeling is used to identify outliers in a data set
that point toward fraudulent activity. In customer relationship management,
predictive modeling is used to target messaging to customers who are most
likely to make a purchase.
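One simple way to surface outliers of the kind described for fraud detection is a robust score based on the median, sketched below. The transaction amounts are invented, and the 0.6745 scaling constant and 3.5 cutoff follow a common rule of thumb rather than anything stated in this article; real fraud systems combine many signals.

```python
from statistics import median

# Hypothetical transaction amounts in dollars for one account.
amounts = [42.0, 38.5, 51.0, 47.25, 40.0, 44.75, 980.0, 39.9]

def flag_outliers(values, cutoff=3.5):
    """Return the values whose modified z-score exceeds the cutoff."""
    center = median(values)
    # Median absolute deviation: a spread measure that one extreme
    # value cannot distort the way it distorts a standard deviation.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if 0.6745 * abs(v - center) / mad > cutoff]

suspicious = flag_outliers(amounts)
print(suspicious)
```

A median-based score is used here instead of a mean-based one because a single very large transaction inflates the mean and standard deviation enough to hide itself in small samples.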
Other areas where predictive models are used include the following:
capacity planning
change management
disaster recovery
engineering
city planning
How to build a predictive model
Building a predictive model starts with identifying historical data that's
representative of the outcome you are trying to predict.
"The model can infer outcomes from historical data but cannot predict what it
has never seen before," Carroll said. Therefore, the volume and breadth of
information used to train the model is critical to securing an accurate
prediction for the future.
The next step is to identify ways to clean, transform and combine the raw data
that lead to better predictions.
Skill is required in not only finding the appropriate set of raw data but also
transforming it into data features that are most appropriate for a given model.
For example, calculations of time-boxed weekly averages may be more useful
and lead to better algorithms than real-time levels.
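The weekly-average transformation mentioned above could look something like the following sketch, which collapses a daily series into one averaged feature per week. The sales figures and the function name are made up for illustration.

```python
from statistics import mean

# Hypothetical daily readings, oldest first (two full weeks).
daily_sales = [120, 135, 128, 140, 150, 210, 190,
               125, 138, 131, 145, 155, 220, 200]

def weekly_averages(values, window=7):
    """Collapse a daily series into non-overlapping window averages."""
    # Ignore a trailing partial window so every feature covers a full period.
    full = len(values) - len(values) % window
    return [mean(values[i:i + window]) for i in range(0, full, window)]

features = weekly_averages(daily_sales)
print([round(f, 2) for f in features])
```

Averaging over a fixed window smooths out day-of-week swings, such as the weekend spikes in this toy series, that would otherwise look like signal to a model.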
This is both an art and a science. The art lies in cultivating a gut feeling for the
meaning of things and intuiting the underlying causes. The science lies in
methodically applying algorithms to consistently achieve reliable results, and
then evaluating these algorithms over time. Just because a spam filter works
on day one does not mean marketers will not tune their messages, making the
filter less effective.
Reducing risk. Predictive analytics can detect activities that are out of the
ordinary such as fraudulent transactions, corporate spying or cyber attacks
to reduce reaction time and negative consequences.
Once the data has been sorted, organizations must be careful to avoid
overfitting. Over-testing on training data can result in a model that appears
very accurate but has memorized the key points in the data set rather than
learned how to generalize.
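The gap between memorizing and generalizing can be made concrete by scoring a model on data it has not seen. In the sketch below, which uses invented data and two deliberately simplistic hypothetical models, a lookup table scores perfectly on its training rows but poorly on a holdout set, while a plain threshold rule does the opposite.

```python
# Hypothetical labeled data: (temperature in Fahrenheit, coat purchased?).
# The training set contains one noisy record (a purchase at 46 degrees).
train = [(70, False), (55, False), (48, False),
         (46, True), (40, True), (20, True)]
holdout = [(65, False), (50, False), (47, False), (38, True), (10, True)]

# "Memorizing" model: returns the stored label for temperatures it has
# seen and falls back to False for everything else.
lookup = dict(train)

def memorizer(temp_f):
    return lookup.get(temp_f, False)

# Simple rule that generalizes: predict a purchase below a fixed threshold.
def threshold_rule(temp_f, threshold=44):
    return temp_f < threshold

def accuracy(model, rows):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(temp) == label for temp, label in rows) / len(rows)

for name, model in [("memorizer", memorizer), ("threshold rule", threshold_rule)]:
    print(name, round(accuracy(model, train), 2), round(accuracy(model, holdout), 2))
```

Judging a model only by its training accuracy would favor the memorizer here, which is why a holdout set that plays no part in fitting is kept aside for evaluation.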
Choosing the right business case. Another potential obstacle for predictive
modeling initiatives is making sure projects address significant business
challenges. Sometimes, data scientists discover correlations that seem
interesting at the time and build algorithms to investigate the correlation
further. However, just because they find something that is statistically
significant does not mean it presents an insight the business can use.
Predictive modeling initiatives need to have a solid foundation of business
relevance.
Bias. "One of the more pressing problems everyone is talking about, but few
have addressed effectively, is the challenge of bias," Carroll said. Bias is
naturally introduced into the system through historical data since past
outcomes reflect existing bias.
Nate Nichols, distinguished principal at Narrative Science, a natural language
generation tools provider, is excited about the role that new explainable
machine learning methods such as LIME or SHAP could play in addressing
concerns about bias and promoting trust.
"People trust models more when they have some understanding of what the
models are doing, and trust is paramount for predictive analytic capabilities,"
Nichols said. Being able to provide explanations for the predictions, he said, is
a huge positive differentiator in the increasingly crowded field of predictive
analytic products.
"Once data has been gathered, transformed and cleansed, then predictive
modeling is performed on the data," said Terri Sage, chief technology officer
at 1010data, an analytics consultancy.
Collecting, transforming and cleaning data are processes that are also used
for other types of analytic development.
This will differ across various industries and use cases, as there will be
diverse data used and different variables discovered during the modeling
iterations.
For example, in healthcare, predictive models may ingest a tremendous
amount of data pertaining to a patient and forecast a patient's response to
certain treatments and prognosis. Data may include the patient's specific
medical history, environment, social risk factors, genetics -- all of which vary
from person to person. The use of predictive modeling in healthcare marks a
shift from treating patients based on averages to treating patients as
individuals.
Similarly, with marketing analytics, predictive models might use data sets
based on a consumer's salary, spending habits and demographics. Different
data and modeling will be used for banking and insurance to help determine
credit ratings and identify fraudulent activities.
1. First, data modeling capabilities are being baked into more business
applications and citizen data science tools. These capabilities can provide
the appropriate guardrails and templates for business users to work with
predictive modeling.
2. Second, the tools and frameworks for low-code predictive modeling are
making it easier for data science experts to quickly cleanse data, create
models and vet the results.
3. Third, better tools are coming to automate many of the data engineering
tasks required to push predictive models into production. Carroll predicts
this will allow more organizations to shift from simply building models to
deploying them in ways that deliver on their potential value.