Unit-1
Introduction
Univariate
• ‘Uni’ means one and ‘variate’ indicates a variable. Therefore,
univariate analysis is a form of analysis that involves
only a single variable.
• It examines the distribution of cases on only one
variable at a time (e.g., weight of college students).
• The primary purpose of univariate analysis is to describe
data. The descriptions are obtained through graphs, tables,
and statistics.
Univariate Contd.,
• Graphical analysis
Various types of graphs can be used to understand data. The standard types of
graphs include, among others:
1. Histograms
2. Boxplots
3. Bar charts
4. Pie charts
• Univariate Tables
Tables help in univariate analysis and are typically used with categorical data or
numerical data with limited cardinality. They include:
1. Frequency Tables (the count of each unique value)
2. Grouped Tables (the count of values falling into each group or class interval)
3. Proportion (or) Percentage Tables (the frequency of each unique value as a
share of the total)
4. Cumulative Proportion Tables (similar to the proportion table, except that the
proportion is shown cumulatively)
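As a minimal sketch of the frequency, proportion, and cumulative proportion tables listed above, the following uses only the Python standard library. The `grades` sample is made up for illustration and is not data from these notes.

```python
from collections import Counter

# Hypothetical categorical sample (made-up values, not from the notes).
grades = ["B", "A", "C", "B", "B", "A", "C", "B"]

counts = Counter(grades)      # frequency table: count of each unique value
total = sum(counts.values())

cumulative = 0.0
table = []                    # rows: (value, frequency, proportion, cumulative proportion)
for value in sorted(counts):
    proportion = counts[value] / total
    cumulative += proportion
    table.append((value, counts[value], proportion, cumulative))

print("value  freq  proportion  cumulative")
for value, freq, prop, cum in table:
    print(f"{value:<6} {freq:>4}  {prop:>10.3f}  {cum:>10.3f}")
```

The last row's cumulative proportion should always reach 1, which is a quick check that no category was dropped.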
• Univariate Statistics
Univariate analysis can also be performed using statistics.
Two types of statistics can be used here:
• Descriptive
• Inferential
Univariate Contd.,
• Descriptive Statistics
Descriptive statistics are used to describe data. For instance, if you have to
describe a cube, you have to ‘measure’ it. By measuring its length, breadth,
and height, you can describe it.
1. Measures of Central Tendency (statistics such as mean, median, and mode are
considered here)
2. Measures of Variability (specific univariate statistics can be calculated, such as
range, interquartile range, variance, standard deviation, etc.)
3. Measures of Shape (whether the data is symmetrical, non-symmetrical, or left- or
right-skewed)
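The three groups of descriptive measures above can be sketched with Python's `statistics` module. The `weights` list is a hypothetical sample, and the skewness formula used here is one common moment-based definition, chosen as an assumption for illustration.

```python
import statistics

# Hypothetical weights (kg) of ten college students; made-up values.
weights = [52, 55, 55, 58, 60, 61, 63, 65, 70, 81]

# 1. Central tendency
mean = statistics.mean(weights)
median = statistics.median(weights)
mode = statistics.mode(weights)

# 2. Variability
value_range = max(weights) - min(weights)
q1, q2, q3 = statistics.quantiles(weights, n=4)   # quartile cut points
iqr = q3 - q1                                     # interquartile range
variance = statistics.variance(weights)           # sample variance (divides by n - 1)
std_dev = statistics.stdev(weights)               # sample standard deviation

# 3. Shape: moment-based skewness; a positive value suggests a right skew
n = len(weights)
pop_sd = statistics.pstdev(weights)
skewness = sum((x - mean) ** 3 for x in weights) / (n * pop_sd ** 3)

print(mean, median, mode, value_range, iqr, variance, std_dev, skewness)
```

In this sample the single large value (81) pulls the mean above the median, which is consistent with a positive skewness.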
Univariate Contd.,
• Inferential Statistics
• Z Test: Used for numerical (quantitative) data where the sample size is greater
than 30 and the population’s standard deviation is known.
• One-Sample t-Test: Used for numerical (quantitative) data where the sample size
is less than 30 or the population’s standard deviation is unknown.
• Chi-Square Test: Used with nominal categorical data.
• Kolmogorov-Smirnov Test: Used with ordinal (ordered) data.
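To make the Z test concrete, here is a minimal sketch of a two-sided one-sample Z test using only the standard library. The function name `one_sample_z_test`, the sample values, the hypothesized mean `mu0`, and the known standard deviation `sigma` are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma):
    """Two-sided one-sample Z test; sigma is the known population SD."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample of 36 weights (n > 30); made-up values.
sample = [62] * 18 + [66] * 18
z, p_value = one_sample_z_test(sample, mu0=62, sigma=6)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A one-sample t-test follows the same pattern but replaces `sigma` with the sample standard deviation and the normal distribution with a t distribution with n - 1 degrees of freedom, which the standard library does not provide.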
Bivariate
• Bivariate analysis is performed when two variables are involved.
• In bivariate data, two values are recorded for each observation, for example,
data on the income and weight of individuals.
Income (Y)    Weight (X)
1000          50
2000          55
3000          60
4000          62
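One common way to summarize such paired data is the Pearson correlation coefficient. The sketch below computes it by hand for the income and weight values in the table; the helper name `pearson_r` is an assumption, and correlation is only one of several possible bivariate summaries.

```python
import math

income = [1000, 2000, 3000, 4000]   # Y values from the table
weight = [50, 55, 60, 62]           # X values from the table

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

r = pearson_r(weight, income)
print(f"r = {r:.3f}")
```

A value of r close to 1 indicates that income and weight rise together almost linearly in these four observations; it does not by itself show that one causes the other.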
6. ANOVA
• Compares the means of more than two groups.
• Tests the null hypothesis that all of the populations being compared have the
same mean.
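As a rough sketch of what a one-way ANOVA computes, the F statistic is the ratio of between-group variability to within-group variability. The function name `one_way_anova_f` and the three groups are made up for illustration; obtaining a p-value would additionally require the F distribution, which is not in the standard library.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical measurements for three groups; made-up values.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat = one_way_anova_f(groups)
print(f"F = {f_stat:.2f}")
```

A large F suggests that at least one group mean differs from the others; an F near zero is what the null hypothesis of equal means would tend to produce.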
Multivariate
• When more than two variables are to be analyzed, such an analysis is called
multivariate analysis.
• Typically, many variables are used to predict another variable.
However, non-predictive analysis is also performed by creating a correlation
matrix, cross-frequency tables, dodged or stacked bar charts, etc.
• Categories:
1. Dependence
Dependence methods are used when one or more of the variables are
dependent on others.
Dependence looks at cause and effect; in other words, can the values of two or
more independent variables be used to explain, describe, or predict the value of
another, dependent variable? To give a simple example, the dependent variable
of “weight” might be predicted by independent variables such as “height” and
“age.”
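The weight example can be sketched as a multiple linear regression fitted by ordinary least squares. Everything below is an illustrative assumption: the helper `fit_linear_regression`, the height and age values, and the coefficients used to generate the synthetic weights. The normal equations are solved with plain Gauss-Jordan elimination so that no external library is needed.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]      # prepend an intercept column
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Synthetic data: (height in cm, age in years); made-up values.
people = [(150, 20), (160, 25), (170, 22), (180, 30), (165, 35), (175, 28)]
# Weights generated from assumed coefficients so the fit can be checked.
weights = [-60 + 0.7 * h + 0.2 * a for h, a in people]

intercept, b_height, b_age = fit_linear_regression(people, weights)
print(intercept, b_height, b_age)
```

Because the synthetic weights follow the assumed formula exactly, the fit should recover coefficients close to -60, 0.7, and 0.2. With real data the fit would be approximate, and a good fit alone would not establish cause and effect.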
Multivariate Contd.,
2. Interdependence
Interdependence methods are used to understand the structural makeup and
underlying patterns within a dataset.
In this case, no variables are dependent on others.
Techniques:
• Multiple linear regression
• Multiple logistic regression
• Multivariate analysis of variance (MANOVA)
• Factor analysis
• Cluster analysis.
For more information, see
https://careerfoundry.com/en/blog/data-analytics/multivariate-analysis/
Classification of Multivariate Techniques
The selection of a multivariate technique depends on:
a. Whether the variables are to be analyzed for dependence or interdependence.
Classification of Multivariate Techniques Contd.,
• Principal Component and Common Factor Analysis:
Classification of Multivariate Techniques Contd.,
Logistic regression:
Classification of Multivariate Techniques Contd.,
Canonical Correlation Analysis:
Canonical correlation analysis is used to identify and measure the associations
between two sets of variables.
MANOVA
Classification of Multivariate Techniques Contd.,
Conjoint Analysis:
• Conjoint analysis is a popular research method for understanding which attributes
are most important to customers when they consider purchasing your product.
Components:
1. Question
2. Profile
3. Attribute
4. Levels
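To show how attributes, levels, and profiles relate, the sketch below builds every possible profile as one level per attribute (a full-factorial design). The attribute names and levels are hypothetical, and real conjoint studies often show respondents only a subset of these profiles.

```python
from itertools import product

# Hypothetical attributes and their levels for a smartphone study.
attributes = {
    "price": ["$300", "$500", "$700"],
    "battery": ["1 day", "2 days"],
    "camera": ["single", "dual"],
}

# Each profile picks exactly one level for every attribute.
profiles = [dict(zip(attributes, levels))
            for levels in product(*attributes.values())]

print(len(profiles), "profiles")
for profile in profiles[:3]:
    print(profile)
```

The number of profiles is the product of the number of levels per attribute (here 3 x 2 x 2), which is why the attribute and level lists are usually kept short.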
Cluster Analysis:
Classification of Multivariate Techniques Contd.,
Multidimensional Scaling: