Business Analytics
Disadvantages:
1. Misinterpretation: Poorly designed visualizations or misinterpretation of
visual cues can lead to incorrect conclusions or decisions.
2. Bias and manipulation: Visualizations can be intentionally or unintentionally
biased, leading to uneven interpretations of data.
3. Data limitations: Visualizations may oversimplify complex data, so
important nuances or details can be missed.
4. Technical expertise: Creating effective visualizations may require specialized
skills and tools that not everyone possesses.
5. Overreliance on visuals: Relying solely on visualizations may overlook the
importance of other forms of analysis or data exploration.
6. Data Overload: Too many visualizations or excessive use of graphical
elements can overwhelm users, leading to mental overload and difficulty in
extracting meaningful insights.
Business analytics:
Business analytics refers to the process of collecting, processing,
analysing, and interpreting data to make informed decisions and drive
business success. It involves the use of various statistical and quantitative
techniques to find deep insights from data and guide strategic,
operational, and tactical decision-making within organizations.
In industries and companies, professionals collect, organise, and interpret
the data.
Because data is based on facts, it is a powerful tool that can increase
competitive advantage by supporting predictions that help a company
grow. Business analytics is the process of extracting insights from
data for the purpose of improving business decisions.
Thus, the process of collecting, analysing, and interpreting data by using
quantitative methods to make the best decisions for the organisation is
known as business analytics.
The term business analytics can be explained with the help of two
independent concepts, “business” and “analytics”. Business refers
to the economic or commercial activity related to a person's regular
occupation, profession, or trade. Analytics, on the other hand, can be
explained as the body of knowledge or principles of analysis.
Elements:
1. Data Collection: The foundation of business analytics is data. This
includes gathering data from various sources, such as internal databases,
external sources, customer interactions, social media, sensors, and more.
Data collection involves ensuring data quality, integrity, and relevance to
the business objectives.
2. Data Management: Once collected, data needs to be stored, organized,
and managed effectively. This involves structuring data in databases or
data warehouses, cleaning and preprocessing data to remove errors or
inconsistencies, and ensuring data security and compliance with
regulations.
3. Data Analysis: Data analysis involves applying statistical techniques,
mathematical models, and algorithms to explore, interpret, and extract
insights from data. This includes descriptive analytics to summarize data,
diagnostic analytics to understand why certain events occurred,
predictive analytics to forecast future outcomes, and prescriptive
analytics to recommend actions based on insights.
4. Data Visualization: Communicating insights effectively is crucial in
business analytics. Data visualization involves representing data visually
through charts, graphs, dashboards, and other visualizations to make
complex information more accessible and understandable to
stakeholders. Visualization aids in identifying patterns, trends, and
relationships within the data.
5. Statistical Analysis: Statistical analysis plays a significant role in business
analytics by providing quantitative methods for analysing data. This
includes hypothesis testing, regression analysis, correlation analysis.
Statistical techniques help validate findings, evaluate uncertainty, and
make data-driven decisions.
6. Predictive Modelling: Business analytics also involves building predictive
models to forecast future outcomes based on historical data. These
models can help businesses make informed decisions and anticipate
future trends.
7. Reporting and Communication: Once the analysis is complete, the
findings are communicated to stakeholders through reports,
presentations, or interactive dashboards. This helps in sharing insights
and driving data-informed decision-making.
8. Continuous Improvement: Business analytics is an iterative process that
requires continuous monitoring, evaluation, and refinement.
Organizations need to continuously improve their analytics capabilities
by incorporating feedback, updating models, refining processes, and
adapting to changing business requirements and technological
advancements.
In this career, you'll work with various data sources, such as sales figures,
market research, customer feedback, and social media data, to extract valuable
information. You'll use statistical techniques, data modeling, and data
visualization tools to analyze the data and present your findings in a
meaningful way.
Data warehousing:
There are two types of data:
1 Structured data: it is highly specific and is stored in a predefined
format. It is quantitative in nature, that is, it contains measurable
values such as numbers, dates, and times. Structured data is easy to
search and analyse, for example a table consisting of rows and
columns. Other examples include sales transactions and inventory
control data.
2 Unstructured data: this is the most common form of data that any
organisation possesses. This type of data is not present in the form of
tables with rows and columns, so it is very difficult to use for
analysis. It exists in various formats, such as text, images, audio,
and video files, etc., and is qualitative in nature.
Data warehousing is like having a big storage space for all the data that a
company collects. It's like a central hub where data from different sources, such
as databases, spreadsheets, and other systems, is gathered and organized in a
structured way. This organized data can then be used for analysis and
reporting.
Think of it as a giant library where all the books are neatly categorized and
stored. In a data warehouse, data is transformed and structured to make it
easier to access and analyze. It's designed to support the needs of business
intelligence and analytics.
So, in simple terms, data warehousing is like a big data storage space that helps
companies organize and analyze their data effectively. It's like having a library
for data where you can easily find and use the information you need.
Key characteristics of data warehousing:
1. Centralized Data: A data warehouse consolidates data from various sources
into a central repository, providing a unified view of the organization's data.
3. Historical Data: Data warehouses typically store historical data, allowing for
analysis and comparison of trends over time. This historical perspective is
valuable for making informed business decisions.
5. Query and Analysis: Data warehouses are designed for complex querying and
analysis. They provide tools and technologies to extract meaningful insights
and support business intelligence activities.
ETL
ETL stands for Extract, Transform, Load, which is a process used in data
warehousing to gather, transform, and load data into a target database or
data warehouse.
Let's break down each step of the ETL process:
1. Extract: In the extraction phase, data is collected from various sources
such as databases, spreadsheets, web services, or even external systems.
The goal is to retrieve the necessary data needed for analysis and
reporting. This can involve querying databases, using APIs, or accessing
files.
This phase can take time, as it involves gathering data from
multiple sources.
2. Transform: Once the data is extracted, it undergoes a series of
transformations. This involves cleaning, validating, and reformatting the
data to ensure consistency and quality. Data cleansing may involve
removing duplicates, correcting errors, or filling in missing values. The
transformed data is then standardized to a common format, making it easier
to combine and analyze.
3. Load: After the data has been transformed, it is loaded into the target
database or data warehouse. This step involves organizing the data in a
structured manner, such as tables or dimensions, to enable efficient
querying and analysis. The loaded data is typically stored in a way that
supports the organization's reporting and analytical needs.
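The three steps above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the sample data, the orders table, and the function names extract, transform, and load are all hypothetical, and an in-memory SQLite database stands in for a real data warehouse. Filling missing amounts with 0 is a simplifying assumption; a real pipeline would choose a rule that suits the business.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file, API, or database.
RAW_CSV = """order_id,customer,amount,order_date
1, alice ,120.50,2024-01-05
2,BOB,,2024-01-06
2,BOB,,2024-01-06
3,Carol,80,2024-01-07
"""


def extract(text):
    """Extract: read rows from the source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop duplicates, standardize names, fill missing amounts."""
    seen, cleaned = set(), []
    for row in rows:
        order_id = int(row["order_id"])
        if order_id in seen:  # skip duplicate orders
            continue
        seen.add(order_id)
        cleaned.append((
            order_id,
            row["customer"].strip().title(),  # one common name format
            float(row["amount"] or 0.0),      # assumption: missing amount becomes 0
            row["order_date"],
        ))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a structured target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```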
2 Oracle
Oracle is one of the industry's most widely used databases. It provides a
diverse range of data warehouse solutions, both on-premises and in the
cloud. It contributes to better customer experiences by increasing
operational efficiency.
3 Amazon Redshift
Amazon Redshift is a database tool. It is a straightforward and low-cost tool
for analysing all types of data using standard SQL and existing BI tools. It
also enables the execution of complex queries against petabytes of
structured data.
Star Schema
A schema is simply a structure that describes how things are organized. In
a database, it defines how data is arranged into tables and how those
tables relate to one another.
The star schema is the simplest and most basic data mart schema. It
is commonly used to design or construct data warehouses and
dimensional data marts.
A star schema is a type of data modelling technique used in data
warehouses. It consists of a central table (the fact table) surrounded by
multiple related tables (dimension tables) that represent different aspects
of the data. The fact table contains the measurable data, while the
dimension tables provide context and descriptive attributes. It's called a star
schema because the structure resembles a star with the fact table at the
centre and the dimension tables branching out like the arms of a star. It
helps with efficient querying and analysis of data.
Dimension Tables, on the other hand, store descriptive information for all
related fields that are included in the fact table's records. As a result,
dimensions are the means by which you want to analyze your data. For
example, if the fact is the company's sales revenue, the dimensions are
the various ways in which that revenue is categorized: sales can be
classified by location, product, or time of year.
Dimension Tables are generally expected to be much smaller in size than
Fact Tables.
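The fact and dimension tables described above can be sketched with an in-memory SQLite database. The table names (fact_sales, dim_product, dim_location, dim_date), the columns, and the sample rows are all hypothetical and serve only to show the shape of the schema: one central fact table holding the measure (revenue), joined to each dimension by a key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive context for each sale.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

-- Fact table: one row per sale, with a key into every dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    location_id INTEGER REFERENCES dim_location(location_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    revenue REAL
);

INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO dim_location VALUES (1, 'Pune', 'West'), (2, 'Delhi', 'North');
INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO fact_sales VALUES
    (1, 1, 1, 900.0), (1, 2, 1, 950.0), (2, 1, 2, 300.0), (2, 2, 2, 200.0);
""")

# Analyse the measure (revenue) by one dimension (location) with a single join.
revenue_by_region = conn.execute("""
    SELECT l.region, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    GROUP BY l.region
    ORDER BY l.region
""").fetchall()
print(revenue_by_region)
```

Swapping dim_location for dim_product or dim_date in the join gives revenue by product category or by quarter, which is the sense in which dimensions are the ways of slicing the facts.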
Advantages:
1. Simplicity: Star schema is easy to understand and navigate, making it
user-friendly for both business users and analysts.
2. Query Performance: The structure of star schema allows for faster query
performance as it involves fewer joins between tables.
3. Aggregation: Aggregating data is more efficient in star schema, which is
helpful for reporting and analysis purposes.
4. Flexibility: Star schema allows for easy addition of new dimensions and
measures without impacting existing data.
Disadvantages:
1. Redundancy (duplication of data): Star schema can lead to data
redundancy as dimension tables are denormalized, meaning that the same
values are repeated across many rows, resulting in larger storage
requirements.
2. Limited Relationship Representation: Star schema may not be suitable for
complex relationships between dimensions, as it primarily focuses on one-
to-many relationships.
3. Data Integrity: Due to denormalization, maintaining data integrity can be
challenging in star schema compared to more normalized schemas like
snowflake schema.
When not to use star schema
You might want to consider not using a star schema in the following
situations:
1. Complex Relationships: If your data has complex relationships between
dimensions that cannot be easily represented in a star schema, it may not
be the best choice.
Data Mining:
Data mining is a process that involves discovering patterns, relationships,
and insights from large volumes of data. In data mining, various techniques
and algorithms are used to analyze the data and uncover hidden patterns or
trends that can be used for decision-making and predictive modeling.
The process of data mining typically involves several steps. First, data is
collected from various sources and stored in a data warehouse or database.
Then, the data is preprocessed and transformed into a suitable format for
analysis. Next, data mining algorithms are applied to the prepared data to
identify patterns or relationships. These algorithms can include clustering,
classification, regression, association rule mining, and more.
The insights obtained from data mining can be incredibly valuable for
businesses and organizations. It can help in identifying customer behavior,
market trends, fraud detection, risk assessment, and optimizing business
processes. For example, a retail company can use data mining to analyze
customer purchase patterns and preferences to improve marketing
strategies and personalize recommendations.
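As a small illustration of the retail example, the sketch below computes support and confidence, the two basic measures used in association rule mining, for a rule such as "customers who buy bread also buy butter". The baskets and the function name rule_metrics are hypothetical, and real data mining tools use more efficient algorithms over far larger datasets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (illustrative data only).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in transactions:
    item_counts.update(basket)                           # baskets containing each item
    pair_counts.update(combinations(sorted(basket), 2))  # baskets containing each pair


def rule_metrics(antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    pair = tuple(sorted((antecedent, consequent)))
    support = pair_counts[pair] / len(transactions)
    confidence = pair_counts[pair] / item_counts[antecedent]
    return support, confidence


support, confidence = rule_metrics("bread", "butter")
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```

Support is the share of all baskets that contain both items; confidence is the share of baskets with the first item that also contain the second.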
Origin of data mining:
The origin of data mining can be traced back to the 1960s and 1970s. During
that time, statisticians and researchers started exploring ways to extract
useful information from large datasets. However, it wasn't until the 1990s
that data mining gained significant attention and popularity. With
advancements in computer technology and the exponential growth of data
storage capabilities, researchers from various fields developed algorithms
and methods to analyze data and uncover patterns, associations, and
trends. Today, data mining is widely used across industries such as
finance, marketing, and healthcare to gain insights and make informed
decisions.
Once the model is trained, we can use it to predict the class of new,
unseen emails. The model analyzes the content, subject line, sender
information, and other relevant features of the email and assigns it a
label of either spam or non-spam.
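The train-then-predict idea can be sketched with a tiny naive Bayes classifier. This is an assumption made for illustration: the notes do not say which model is used, and the sketch looks only at the words in the message text, not at the sender or other features. The training emails and the function name predict are hypothetical.

```python
import math
from collections import Counter

# Hypothetical labelled training emails (illustrative data only).
training = [
    ("win a free prize now", "spam"),
    ("free money claim now", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("project report attached", "non-spam"),
]

# Training: count how often each word appears under each label.
word_counts = {"spam": Counter(), "non-spam": Counter()}
label_counts = Counter()
for text, label in training:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocabulary = set(word_counts["spam"]) | set(word_counts["non-spam"])


def predict(text):
    """Return the label with the higher log-probability for a new email."""
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        # Start from the prior probability of the label.
        score = math.log(label_counts[label] / len(training))
        for word in text.split():
            # Add-one (Laplace) smoothing avoids zero probabilities for unseen words.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


print(predict("claim your free prize"))
print(predict("monday project meeting"))
```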
3. Drug Discovery and Development: Data mining techniques can analyze vast
amounts of biomedical data, including genomic data, clinical trials, and
scientific literature. This aids in identifying potential drug targets, predicting
drug interactions, and accelerating the drug discovery and development
process.
4. Healthcare Management and Resource Optimization: Data mining enables
healthcare organizations to analyze operational data, such as patient flow,
resource allocation, and staff scheduling. This helps optimize healthcare
management, improve efficiency, and reduce costs.
5. Public Health Surveillance: Data mining techniques can analyze public health
data, such as disease incidence, demographic information, and environmental
factors. This aids in early detection of disease outbreaks, monitoring
population health trends, and implementing effective public health
interventions.
Insurance Sector:
1. Risk Assessment: Data mining helps insurance companies analyze historical
data on claims, policyholders, and external factors to assess risks accurately.
This enables them to determine appropriate coverage, set premiums, and
prevent fraudulent activities.
Telecommunications Sector:
1. Customer Churn Prediction: Data mining techniques help telecom companies
analyze customer data, usage patterns, and customer interactions to predict
the likelihood of churn. This enables proactive retention strategies and
personalized offers to reduce customer attrition.