UNIT-1_BI
and Analytics
[BCDS-051]
UNIT1- BUSINESS INTELLIGENCE – INTRODUCTION
Lectures Details topics
8 Changing the aggregation, both for a single viz, or changing default aggregation
What is Business Intelligence (BI)?
Business intelligence (BI) is a set of technologies that are used to solve specific business problems. BI tools are typically
designed to deliver a mix of operational embedded analytics, analytics platform capabilities, and rich data visualization
functionality.
What does it include?
statistical analysis tools
database management systems
data mining applications.
BI is usually implemented as a standalone technology in-house or by an outside consulting firm or vendor.
The business intelligence platform is mostly a cloud-based solution that is designed to help an organization to achieve and
sustain the goals of its digital transformation initiatives. A business intelligence platform can be used to store, organize, and
analyze data that allows organizations to access the insights they need in order to meet their business goals.
It can also be used as a tool for employees and stakeholders alike to gain insights into a company’s performance. A
BI platform is more than just an analytics tool.
In other words, you can use a BI platform to enable collaboration between all parties in the organization as well as
with external stakeholders such as customers, partners, suppliers and others.
What is the history and evolution of business
intelligence (BI)?
BI has gone through several major shifts over the last 20 years.
The main eras in its development and evolution are:
❑ Traditional era of business intelligence: Organizations began combining data from multiple systems into a
single database, a process known as extract, transform, and load (ETL).
❑ Self-service era of business intelligence: Tools enabled data analysts to sort through large amounts of
data and find patterns quickly. They replaced the rows and columns of traditional data presentation
tools with pictures and charts that represent the data visually.
❑ Augmented analytics era of business intelligence: The field is moving away from self-service analytics and
toward automation, which is known as augmented analytics.
History and Evolution of Business intelligence
and Analytics
Business Analytics
❑ Analytics is the use of:
❑ data,
❑ information technology,
❑ statistical analysis,
❑ quantitative methods, and
❑ mathematical or computer-based models
❑ to help managers gain improved insight about their business operations and
make better, fact-based decisions.
Examples of Applications
Pricing
setting prices for consumer and industrial goods, government contracts, and
maintenance contracts
Customer segmentation
identifying and targeting key customer groups in retail, insurance, and credit card industries.
Merchandising
determining brands to buy, quantities, and allocations
Location
finding the best location for bank branches and ATMs, or where to service industrial equipment
Social Media
understanding trends and customer perceptions; assisting marketing managers and product
designers
A Visual Perspective of Business Analytics
Impact and Challenges
Challenges
Business Intelligence: Effective and Timely decisions
The main purpose of business intelligence systems is to provide knowledge workers with tools and methodologies that
allow them to make effective and timely decisions.
Effective decisions. The application of rigorous analytical methods allows decision makers to rely on information and knowledge
which are more dependable.
Timely decisions. The ability to rapidly react to the actions of competitors and to new market conditions is a critical factor in the
success or even the survival of a company.
Data, information and knowledge
Data:
Data is unprocessed facts and figures without any added interpretation or analysis. "The price of crude oil is $80 per barrel."
For a retailer, data refer to primary entities such as customers, points of sale and items, while sales receipts represent the commercial
transactions.
Information:
Information is the outcome of extraction and processing activities carried out on data, and it appears meaningful for those who
receive it in a specific domain.
Information is data that has been interpreted so that it has meaning for the user. "The price of crude oil has risen from $70 to $80
per barrel" gives meaning to the data and so is said to be information to someone who tracks oil prices.
Knowledge:
Information is transformed into knowledge when it is used to make decisions and develop the corresponding actions.
Knowledge is a combination of information, experience and insight that may benefit the individual or the organization. "When
crude oil prices go up by $10 per barrel, it's likely that petrol prices will rise by 2p per litre" is knowledge.
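The crude oil example above can be sketched in a few lines of Python. This is only an illustration: the variable names and the helper function `expected_petrol_rise` are hypothetical, and the 2p-per-litre rule of thumb is taken directly from the example in the text.

```python
# Data: raw observations with no interpretation attached.
previous_price = 70  # USD per barrel
current_price = 80   # USD per barrel

# Information: data interpreted so that it carries meaning for the user.
change = current_price - previous_price
information = f"Crude oil has risen from ${previous_price} to ${current_price} per barrel"

# Knowledge: information combined with experience to support a decision.
# Rule of thumb quoted in the text: +$10 per barrel -> about +2p per litre.
PENCE_PER_LITRE_PER_10_USD = 2


def expected_petrol_rise(price_change_usd):
    """Estimate the petrol price rise in pence per litre (hypothetical helper)."""
    return price_change_usd / 10 * PENCE_PER_LITRE_PER_10_USD


expected_rise = expected_petrol_rise(change)
print(information)
print(f"Expected petrol price rise: about {expected_rise:.0f}p per litre")
```

The two prices are data, the computed change and its description are information, and the rule that turns the change into an expected petrol price is knowledge.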
Architectural Representation: From data to information
to knowledge
The role of mathematical models
A business intelligence system provides decision makers with information and knowledge extracted from
data, through the application of mathematical models and algorithms. In some instances, this activity may
reduce to calculations of totals and percentages, graphically represented by simple histograms, whereas more
elaborate analyses require the development of advanced optimization and learning models.
First, the objectives of the analysis are identified and the performance indicators that will be used to evaluate
alternative options are defined.
Mathematical models are then developed by exploiting the relationships among system control variables,
parameters and evaluation metrics.
Finally, what-if analyses are carried out to evaluate the effects on the performance determined by variations in
the control variables and changes in the parameters.
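The three steps above can be sketched as a small what-if analysis. This is a minimal illustration under stated assumptions: the linear demand curve, the cost figures and the function names `demand` and `profit` are all hypothetical and do not come from the source material.

```python
def demand(price, base_demand=1000.0, sensitivity=8.0):
    """Hypothetical linear demand: units sold fall as the price rises."""
    return max(base_demand - sensitivity * price, 0.0)


def profit(price, unit_cost=40.0, fixed_cost=5000.0):
    """Performance indicator: profit for a given price (the control variable).

    unit_cost and fixed_cost are the parameters of the model.
    """
    return (price - unit_cost) * demand(price) - fixed_cost


# What-if 1: vary the control variable (price) with the parameters held fixed.
price_scenarios = {p: profit(p) for p in (60.0, 80.0, 100.0)}

# What-if 2: change a parameter (the unit cost rises) at a fixed price.
cost_scenarios = {c: profit(80.0, unit_cost=c) for c in (40.0, 50.0)}

for price, value in price_scenarios.items():
    print(f"price={price:7.2f} -> profit={value:10.2f}")
for cost, value in cost_scenarios.items():
    print(f"unit_cost={cost:7.2f} -> profit={value:10.2f}")
```

Here profit is the performance indicator, price is the control variable, and unit cost and fixed cost are parameters. Each dictionary records how the indicator responds to one kind of variation.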
Role of Mathematical Model
Structure of Mathematical models
Business Intelligence Architectures
The architecture of a business intelligence system includes three major components.
Data sources. In a first stage, it is necessary to gather and integrate the data stored in the various primary and
secondary sources, which are heterogeneous in origin and type. The sources consist for the most part of data belonging
to operational systems, but may also include unstructured documents, such as emails and data received from external
providers. Generally speaking, a major effort is required to unify and integrate the different data sources.
Data warehouses and data marts. Using extraction and transformation tools known as extract, transform, load
(ETL), the data originating from the different sources are stored in databases intended to support business intelligence
analyses. These databases are usually referred to as data warehouses and data marts.
Business intelligence methodologies. Data are finally extracted and used to feed mathematical models and analysis
methodologies intended to support decision makers. In a business intelligence system, several decision support
applications may be implemented.
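The flow from data sources through ETL to a data warehouse can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a warehouse. The two source layouts, the `sales` table and the `transform` function are hypothetical and chosen only for illustration.

```python
import sqlite3

# Extract: records from two hypothetical operational sources with different layouts.
pos_sales = [("2024-01-05", "milan", "19.90"), ("2024-01-05", "ROME", "5.00")]
web_orders = [{"day": "2024-01-06", "city": "Milan", "total_cents": 2550}]


def transform(pos_rows, web_rows):
    """Unify both sources into (date, city, amount) tuples with consistent types."""
    unified = [(day, city.title(), float(amount)) for day, city, amount in pos_rows]
    unified += [(o["day"], o["city"].title(), o["total_cents"] / 100) for o in web_rows]
    return unified


# Load: store the unified rows in a warehouse table (in-memory for this sketch).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (sale_date TEXT, city TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)", transform(pos_sales, web_orders)
)

# A business intelligence query can now run against the integrated data.
totals = dict(
    warehouse.execute(
        "SELECT city, ROUND(SUM(amount), 2) FROM sales GROUP BY city ORDER BY city"
    ).fetchall()
)
print(totals)
```

The transform step is where the heterogeneous sources are unified (city names normalized, amounts converted to one unit), which is the integration effort described above.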
Business Intelligence Architecture
1. Data sources
The sources consist for the most part of data belonging to operational systems, but may also include unstructured
documents, such as emails and data received from external providers.
4. Data mining
The fourth level includes active business intelligence methodologies, whose purpose is the extraction of information and knowledge from
data.
These include mathematical models for pattern recognition, machine learning and data mining techniques. Unlike the tools described at
the previous level of the pyramid, the models of an active kind do not require decision makers to formulate any prior hypothesis to be
later verified. Their purpose is instead to expand the decision makers’ knowledge.
5. Optimization
By moving up one level in the pyramid we find optimization models that allow us to determine the best solution out of a set of alternative actions,
which is usually fairly extensive and sometimes even infinite.
6. Decision
Finally, the top of the pyramid corresponds to the choice and the actual adoption of a specific decision, and in some way represents the natural
conclusion of the decision-making process. Even when business intelligence methodologies are available and successfully adopted, the choice of a
decision pertains to the decision makers, who may also take advantage of informal and unstructured information available to adapt and modify the
recommendations and the conclusions achieved through the use of mathematical models.
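The optimization level described above can be illustrated with a toy example that enumerates a finite set of alternatives and keeps the best feasible one. The campaigns, their costs and returns, the budget and the function `best_plan` are all hypothetical. Real optimization models typically rely on dedicated solvers rather than brute-force enumeration.

```python
from itertools import combinations

# Hypothetical alternatives: marketing campaigns as (cost, expected return).
campaigns = {"email": (10, 25), "tv": (60, 90), "social": (30, 55), "print": (20, 20)}
BUDGET = 70


def best_plan(options, budget):
    """Enumerate every subset of options and keep the feasible one with the highest return."""
    best, best_return = (), 0
    for size in range(1, len(options) + 1):
        for plan in combinations(sorted(options), size):
            cost = sum(options[name][0] for name in plan)
            total_return = sum(options[name][1] for name in plan)
            if cost <= budget and total_return > best_return:
                best, best_return = plan, total_return
    return best, best_return


plan, expected_return = best_plan(campaigns, BUDGET)
print(plan, expected_return)
```

The result is a recommendation only. As the decision level states, the choice itself still belongs to the decision makers, who may adjust it using informal information the model does not capture.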
Cycle of a business intelligence analysis
Analysis.
During the analysis phase, it is necessary to recognize and accurately spell out the problem at hand. Decision makers must then create a
mental representation of the phenomenon being analyzed, by identifying the critical factors that are perceived as the most relevant. The
availability of business intelligence methodologies may help already in this stage, by permitting decision makers to rapidly develop
various paths of investigation. For instance, the exploration of data cubes in a multidimensional analysis, according to different logical
views, allows decision makers to modify their hypotheses flexibly and rapidly, until they reach an interpretation scheme that they deem
satisfactory. Thus, the first phase in the business intelligence cycle leads decision makers to ask several questions and to obtain quick
responses in an interactive way.
Insight.
The second phase allows decision makers to better and more deeply understand the problem at hand, often at a causal level. For instance,
if the analysis carried out in the first phase shows that a large number of customers are discontinuing an insurance policy upon yearly
expiration, in the second phase it will be necessary to identify the profile and characteristics shared by such customers. The information
obtained through the analysis phase is then transformed into knowledge during the insight phase. On the one hand, the extraction of
knowledge may occur due to the intuition of the decision makers and therefore be based on their experience and possibly on unstructured
information available to them. On the other hand, inductive learning models may also prove very useful during this stage of analysis,
particularly when applied to structured data.
Decision.
During the third phase, knowledge obtained as a result of the insight phase is converted into decisions and subsequently into actions. The
availability of business intelligence methodologies allows the analysis and insight phases to be executed more rapidly so that more
effective and timely decisions can be made that better suit the strategic priorities of a given organization. This leads to an overall reduction
in the execution time of the analysis–decision–action–revision cycle, and thus to a decision-making process of better quality.
Evaluation.
Finally, the fourth phase of the business intelligence cycle involves performance measurement and evaluation. Extensive metrics should
then be devised that are not exclusively limited to the financial aspects but also take into account the major performance indicators defined
Enabling factors in business intelligence projects
Some factors are more critical than others to the success of a business intelligence project: technologies, analytics and human resources.
Technologies
Hardware and software technologies are significant enabling factors that have facilitated the development of business intelligence systems within
enterprises and complex organizations. On the one hand, the computing capabilities of microprocessors have increased on average by 100% every 18
months during the last two decades, and prices have fallen. This trend has enabled the use of advanced algorithms which are required to employ
inductive learning methods and optimization models, keeping the processing times within a reasonable range. Moreover, it permits the adoption of
state-of-the-art graphical visualization techniques, featuring real-time animations. A further relevant enabling factor derives from the exponential
increase in the capacity of mass storage devices, again at decreasing costs, enabling any organization to store terabytes of data for business intelligence
systems. And network connectivity, in the form of Extranets or Intranets, has played a primary role in the diffusion within organizations of information
and knowledge extracted from business intelligence systems. Finally, the easy integration of hardware and software purchased by different suppliers, or
developed internally by an organization, is a further relevant factor affecting the diffusion of data analysis tools.
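To give a sense of scale for the growth rate quoted above, the following sketch compounds a 100% increase (a doubling) every 18 months over 20 years. The function name `growth_factor` is hypothetical, and the calculation simply takes the figures in the text at face value.

```python
def growth_factor(years, doubling_period_months=18):
    """Cumulative growth when capability doubles once per doubling period."""
    doublings = years * 12 / doubling_period_months
    return 2 ** doublings


factor = growth_factor(20)
print(f"Doublings in 20 years: {20 * 12 / 18:.1f}")
print(f"Cumulative growth factor: about {factor:,.0f}x")
```

Under these assumptions, two decades correspond to a little over 13 doublings, that is, a growth factor on the order of ten thousand.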
Analytics
As stated above, mathematical models and analytical methodologies play a key role in information enhancement and knowledge extraction from the
data available inside most organizations. The mere visualization of the data according to timely and flexible logical views, as described, plays a
relevant role in facilitating the decision-making process, but still represents a passive form of support. Therefore, it is necessary to apply more
advanced models of inductive learning and optimization in order to achieve active forms of support for the decision-making process.
Human resources
The human assets of an organization are built up by the competencies of those who operate within its boundaries, whether as individuals or collectively.
The overall knowledge possessed and shared by these individuals constitutes the organizational culture. The ability of knowledge workers to acquire
information and then translate it into practical actions is one of the major assets of any organization, and has a major impact on the quality of the
decision-making process. If a given enterprise has implemented an advanced business intelligence system, there still remains much scope to emphasize
the personal skills of its knowledge workers, who are required to perform the analyses and to interpret the results, to work out creative solutions and to
devise effective action plans. All the available analytical tools being equal, a company employing human resources endowed with a greater mental
agility and willing to accept changes in the decision-making style will be at an advantage over its competitors.
Development of a business intelligence system
The development of a business intelligence system can be assimilated to a project, with a specific final objective,
expected development times and costs, and the usage and coordination of the resources needed to perform planned activities.
Analysis:
During the first phase, the needs of the organization relative to the development of a business intelligence system
should be carefully identified.
This preliminary phase is generally conducted through a series of interviews of knowledge workers performing
different roles and activities within the organization. It is necessary to clearly describe the general objectives and
priorities of the project, as well as to set out the costs and benefits deriving from the development of the business
intelligence system.
Design:
The second phase includes two sub-phases and is aimed at deriving a provisional plan of the overall architecture,
taking into account any development in the near future and the evolution of the system in the mid-term. First, it is
necessary to make an assessment of the existing information infrastructures. Moreover, the main decision-making
processes that are to be supported by the business intelligence system should be examined, in order to adequately
determine the information requirements. Later on, using classical project management methodologies, the project
plan will be laid down, identifying development phases, priorities, expected execution times and costs, together
with the required roles and resources.
Planning:
The planning stage includes a sub-phase where the functions of the business intelligence system are defined and described
in greater detail. Subsequently, existing data as well as other data that might be retrieved externally are assessed. This
allows the information structures of the business intelligence architecture, which consist of a central data warehouse and
possibly some satellite data marts, to be designed. Simultaneously with the recognition of the available data, the
mathematical models to be adopted should be defined, ensuring the availability of the data required to feed each model
and verifying that the efficiency of the algorithms to be utilized will be adequate for the magnitude of the resulting
problems. Finally, it is appropriate to create a system prototype, at low cost and with limited capabilities, in order to
uncover beforehand any discrepancy between actual needs and project specifications.
Data visualization is one of the steps of the data science process, which states that after data has been
collected, processed and modeled, it must be visualized for conclusions to be made.
Context of data visualization – Definition
Data visualization is the graphical representation of information and data. By using visual elements like
charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends,
outliers, and patterns in data. Additionally, it provides an excellent way for employees or business owners
to present data to non-technical audiences without confusion.
In the world of Big Data, data visualization tools and technologies are essential to analyze massive amounts
of information and make data-driven decisions.
Advantages and Disadvantages of data visualization
Advantages:
Disadvantages:
Data visualization helps people better understand data. Whether simple or complex, the right visualization can bring everyone onto the
same page, regardless of their level of expertise.
What is Tableau?
Tableau is a visual analytics platform transforming the way we use data to solve problems empowering
people and organizations to make the most of their data.
Tableau disrupted business intelligence with intuitive, visual analytics for everyone.
https://www.tableau.com/
Architecture of Tableau
Features of Tableau
Data Visualization Principles
Stephen Few's 8 Core Principles:
1. Simplify - Just like an artist can capture the essence of an emotion with just a few lines, good data visualization captures the essence of
data - without oversimplifying.
2. Compare - We need to be able to compare our data visualizations side by side. We can't hold the details of our data visualizations in our
memory - shift the burden of effort to our eyes.
3. Attend - The tool needs to make it easy for us to attend to the data that's really important. Our brains are easily encouraged to pay
attention to the relevant or irrelevant details.
4. Explore - Data visualization tools should let us just look. Not just to answer a specific question, but to explore data and discover things.
Directed and exploratory analysis are equally valid, but we need to be sure that our visualization tool makes both possible.
5. View Diversely - Different views of the same data provide different insights. It helps to be able to look at the same data
from different perspectives at the same time and see how they fit together.
6. Ask why - More than knowing "what's happening", we need to know "why it's happening". This is where actionable results
come from.
7. Be skeptical - We too rarely question the answers we get from our data because traditional tools have made data analysis
so hard. We accept the first answer we get simply because exploring any further is too hard. More powerful tools like
Tableau give us the luxury to ask more questions, as fast as we can think of them.
8. Respond - Simply answering questions for yourself has limited benefit. It's the ability to share our data that leads to
global enlightenment.
Why visualization came into the picture
Our brains are naturally inclined to process visual information more efficiently than textual or numerical
data. Visualization leverages this strength by representing data visually, using charts, graphs, diagrams, and
other visual elements. This allows us to perceive patterns, trends, and relationships that might not be
immediately apparent in raw data.
10 Good and Bad Examples of Data Visualization
https://www.polymersearch.com/blog/10-good-and-bad-examples-of-data-visualization
https://www.syntaxtechs.com/blog/data-visualization-examples#h4
Poor visualization vs Perfect visualization
https://docs.google.com/document/d/1W4C1Vy2sZnnok4DrzuYetuXt6tooAXVrGEVK9DetjfM/edit?usp=sharing
Books
https://www.tableau.com/learn/articles/books-about-data-visualization
Goal of Data visualization
The visual representation of data is more scientific than artistic in our modern world.
The main goal of data visualization is to communicate information effectively, efficiently, elegantly,
accurately, and meaningfully.
It fulfills its objectives only if it encodes the given input in such a manner that our eyes can recognize and
our brain can comprehend.
One of the main goals of data visualization is to support decision-making through appropriately
designed graphical representations of information.
Visualizing the Past
Different Data Visuals for Different Needs:
There are two common types of visual representations of data. Both are very important and both have different
requirements when it comes to designing great visualizations.
1. Presentation - Uses data visuals to communicate. This type of visual representation has two roles: a presenter and
an audience.
2. Visualization - This is a fairly new term and the idea is to use visuals to think. Here, the experience is active and
involves people trying to answer questions.
Visualizing the Past
1700-1900: Visualization is Transformed:
William Playfair, a Scottish engineer, is widely regarded as the father of statistical presentation. Playfair
published a book in 1786 called the Commercial and Political Atlas, which used graphical representations of data
to describe England’s balance of trade.
One famous example comes from Dr. John Snow, a British physician who used statistical graphics to deal with
London’s cholera epidemic of 1854.
Snow plotted individual cases of cholera as dots on a map of London. These dots showed that the majority of
cases could be traced to a water pump on Broad Street. An investigation of outlying cases showed they, too, had
connections to the Broad Street pump. The handle was removed from the contaminated pump and the cholera
epidemic subsided. This shows how the power of visualization can answer questions and, in this case, even work
for the public good. Snow’s map also works as an effective example of the Presentation style; Snow’s data was
strong enough to persuade city officials to remove the pump handle and quell the outbreak.
Tableau Products:
Tableau Desktop
Tableau Server
Tableau Online
Tableau Public
Tableau Reader
Tableau Mobile
https://docs.google.com/document/d/15pwFeswKVlUFmCqaFpixTDbwUOF7dEtjRy-ga9qgqcY/edit?usp=sharing
Why use Tableau?
Tableau is a powerful and fast-growing visualization tool that is easy to use. It does not require complex
formulas of the kind used in Excel and some other tools. It provides features for cleaning, organizing, and
visualizing data, which make it easier to create interactive visual analytics in the form of dashboards. These
dashboards make it easier for non-technical analysts and end users to turn data into an understandable
form.
Values in Tableau
There are two types of values in Tableau:
Dimensions: Values that are qualitative or categorical in nature, used to group and segment data, are called
Dimensions in Tableau. Example: city name, product name, country name.
Measures: Values that are numeric and quantitative in nature, and can be aggregated, are called Measures in
Tableau. Example: profit, sales, discount, population.
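The relationship between a dimension and a measure can be sketched outside Tableau in plain Python: a measure is aggregated within each value of a dimension, and the aggregation function can be changed. The sample rows and the `aggregate` function are hypothetical. This does not use any Tableau API. It only mimics what happens when a measure is summed or averaged by a dimension.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: "city" plays the role of a dimension, "sales" of a measure.
rows = [
    {"city": "Delhi", "sales": 120.0},
    {"city": "Mumbai", "sales": 200.0},
    {"city": "Delhi", "sales": 80.0},
    {"city": "Mumbai", "sales": 100.0},
    {"city": "Pune", "sales": 50.0},
]


def aggregate(records, dimension, measure, func=sum):
    """Group records by a dimension and aggregate a measure with func."""
    groups = defaultdict(list)
    for record in records:
        groups[record[dimension]].append(record[measure])
    return {key: func(values) for key, values in groups.items()}


total_sales = aggregate(rows, "city", "sales")          # aggregation: sum
average_sales = aggregate(rows, "city", "sales", mean)  # aggregation changed to average
print(total_sales)
print(average_sales)
```

Passing a different function as `func` corresponds to changing the aggregation of a measure for a single view, for example from a sum to an average.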
Advantages of Tableau
● Quick calculation- Calculations in Tableau are performed by the backend, so results are generally
returned quickly.
● Interactive dashboards– Tableau dashboards are highly interactive and easy to build.
● No manual calculation- Tableau performs calculations itself. No manual calculation is needed,
although calculated fields are used in some specific cases.
● A large amount of data- Tableau can handle a large amount of data. Different types of
visualization can be created with a large amount of data without significantly impacting the
performance of the dashboards.
Disadvantages of Tableau
● High Cost- Tableau is a paid visualization tool, and its cost is one reason it is not more widely
used.
● Static and single value parameters- Tableau’s parameters are static, and only a single value can
be selected using a parameter. Whenever the data changes, these parameters need to be
updated manually.
● Limited Data Preprocessing- Tableau is primarily a visualization tool. Tableau Desktop allows only
very basic preprocessing.