Data Analysis Technique.docx

Module 4 covers data analysis techniques using descriptive statistics, correlation, regression analysis, and tools like Solver and Analysis ToolPak in Excel. It explains key statistical measures, their formulas, and how to perform what-if analysis using Goal Seek, Data Tables, and Scenario Manager. Additionally, it introduces DAX for advanced calculations in Power BI and Power Pivot, emphasizing its benefits for large datasets.

Uploaded by

bijuaksel

MODULE-4

DATA ANALYSIS TECHNIQUE

Descriptive statistics in a spreadsheet involve summarizing and analyzing data
using key statistical measures such as mean, median, mode, standard deviation,
variance, and range. Most spreadsheet programs, such as Microsoft Excel, Google
Sheets, and LibreOffice Calc, offer built-in functions to calculate these statistics.

Descriptive statistics
1. Mean (Average)
Definition:
Mean is the sum of all values divided by the total number of values. It represents
the central value of a dataset.
SYNTAX: =AVERAGE(range)
2. Median (Middle Value)
Definition:
The median is the middle number when a dataset is arranged in ascending order.
If the dataset has an even number of values, the median is the average of the two
middle numbers.
SYNTAX: =MEDIAN(range)
3. Mode (Most Frequent Value)
Definition:
Mode is the number that appears most frequently in the dataset.
SYNTAX: =MODE.SNGL(range) → Returns a single mode
=MODE.MULT(range) → Returns multiple modes (if available)

Ms. Aleena Rose, Sacred Heart College (Autonomous), Chalakudy


Example:
Numbers: 20, 30, 30, 40, 50, 50, 50
Mode: 50 (as it appears 3 times)

Excel Formula: =MODE.SNGL(A1:A7)

Result: 50
If two values tie for most frequent (e.g., 30 and 50 each appear twice), then:
Formula: =MODE.MULT(A1:A7)

Result: 30, 50
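
As a cross-check outside the spreadsheet, the three measures above can be reproduced with Python's standard statistics module. This is a minimal sketch and not part of the original module: the first list is the example data above, while the second list is an assumed dataset in which 30 and 50 tie.

```python
import statistics

# Example data from the Mode example above
marks = [20, 30, 30, 40, 50, 50, 50]

mean_value = statistics.mean(marks)      # counterpart of =AVERAGE(range)
median_value = statistics.median(marks)  # counterpart of =MEDIAN(range)
mode_value = statistics.mode(marks)      # counterpart of =MODE.SNGL(range)

# Assumed dataset (not from the notes) where 30 and 50 each appear twice
tied = [20, 30, 30, 40, 50, 50]
all_modes = statistics.multimode(tied)   # counterpart of =MODE.MULT(range)

print(round(mean_value, 2), median_value, mode_value, all_modes)
```

The mean here is 270 / 7, so it is shown rounded; the median and mode are exact values taken from the data.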
4. Standard Deviation (Spread of Data)
Definition:
Standard Deviation tells us how much the values in a dataset deviate from the
mean.
• A high standard deviation means values are spread out.
• A low standard deviation means values are close to the mean.
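SYNTAX: =STDEV.P(range) → Standard deviation of a whole population
=STDEV.S(range) → Standard deviation estimated from a sample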
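
Excel offers both a population form (STDEV.P) and a sample form (STDEV.S) of this measure, and the two give different answers on the same data. The sketch below illustrates the difference with Python's standard statistics module, reusing the seven numbers from the Mode example; choosing that dataset here is an assumption made purely for illustration.

```python
import statistics

# Numbers reused from the Mode example (an illustrative choice)
values = [20, 30, 30, 40, 50, 50, 50]

# Population standard deviation: divides by n, like =STDEV.P(range)
population_sd = statistics.pstdev(values)

# Sample standard deviation: divides by n - 1, like =STDEV.S(range)
sample_sd = statistics.stdev(values)

print(round(population_sd, 2), round(sample_sd, 2))
```

The sample form is always a little larger because it divides by n - 1 instead of n.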
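SUMMARY:
Mean → =AVERAGE(range)
Median → =MEDIAN(range)
Mode → =MODE.SNGL(range) or =MODE.MULT(range)
Standard Deviation → =STDEV.P(range)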

Correlation and Regression Analysis
1. Correlation
• Measures the strength and direction of the relationship between two
variables.
• The value ranges from -1 to 1:
o 1 → Perfect positive correlation
o 0 → No correlation
o -1 → Perfect negative correlation
• SYNTAX: =CORREL(range1, range2)
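
To make the -1 to 1 scale concrete, the following sketch computes the Pearson correlation coefficient by hand in Python, which is the quantity CORREL returns. The function name pearson_r and all three data lists are hypothetical, chosen so that one pair rises together perfectly and the other moves in exactly opposite directions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of X and Y, and the variation of each on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]      # hypothetical X values
rising = [2, 4, 6, 8, 10]    # increases in step with X
falling = [10, 8, 6, 4, 2]   # decreases in step with X

r_up = pearson_r(hours, rising)     # perfect positive correlation
r_down = pearson_r(hours, falling)  # perfect negative correlation
print(r_up, r_down)
```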
2. Regression
• Shows how one variable (dependent) is influenced by another
(independent).
• The equation is: Y=a+bX
where:
o Y = dependent variable
o X = independent variable
o a = intercept
o b = slope
1. Using SLOPE and INTERCEPT Functions
To find the slope (b) and intercept (a) of the regression equation Y=a+bX:
• Slope (b): =SLOPE(Y_range, X_range)
• Intercept (a): =INTERCEPT(Y_range, X_range)
2. Using LINEST Function
To get the regression coefficients and additional statistics:
SYNTAX: =LINEST(Y_range, X_range, TRUE, TRUE)

• Y_range → Range of dependent variable (e.g., scores)
• X_range → Range of independent variable (e.g., study hours)
• TRUE (third argument) → Calculates the intercept normally instead of forcing
it to zero
• TRUE (fourth argument) → Returns additional regression statistics along with
the slope and intercept
EXAMPLE (X values in B2:B6, Y values in C2:C6):

Correlation: =CORREL(B2:B6, C2:C6)
Slope (b): =SLOPE(C2:C6, B2:B6)
Intercept (a): =INTERCEPT(C2:C6, B2:B6)
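
The slope and intercept formulas can be checked with a short least-squares calculation in Python. This is a sketch under stated assumptions: the function name fit_line and the study-hours and scores data are hypothetical, since the worksheet behind the example is not reproduced in these notes.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # what =SLOPE(Y_range, X_range) computes
    a = mean_y - b * mean_x    # what =INTERCEPT(Y_range, X_range) computes
    return a, b

study_hours = [1, 2, 3, 4, 5]   # hypothetical X values
scores = [52, 58, 61, 70, 74]   # hypothetical Y values

a, b = fit_line(study_hours, scores)
predicted = a + b * 6           # predicted score for 6 study hours
print(round(a, 2), round(b, 2), round(predicted, 2))
```

Note that both Excel functions take the Y range first and the X range second, which is why the helper computes the slope from Y's co-variation with X.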

Data analysis tools
1. Solver
Solver is an add-in in Excel that helps find the best solution for a problem by
changing certain variables while following given constraints.
Example: Maximize Profit
Imagine you run a bakery and sell cakes and cookies. You have limited
ingredients and need to maximize your profit.
• Objective: Maximize profit
• Variables: Number of cakes and cookies to make
• Constraints: Limited flour, sugar, and eggs
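
The bakery problem can be sketched in Python to show what an objective, variables, and constraints look like in practice. Every number below (profit per item, ingredient use, and stock) is hypothetical, and the exhaustive search is a simple stand-in for illustration, not the algorithm Solver itself uses.

```python
# Hypothetical figures, not from the notes
PROFIT = {"cake": 10, "cookie": 3}        # profit per item sold
USAGE = {                                 # ingredient units needed per item
    "flour": {"cake": 2, "cookie": 1},
    "sugar": {"cake": 1, "cookie": 1},
    "eggs": {"cake": 2, "cookie": 0},     # assumes egg-free cookies
}
STOCK = {"flour": 40, "sugar": 30, "eggs": 30}  # units available

def is_feasible(cakes, cookies):
    """True when the plan stays within every ingredient limit."""
    return all(
        use["cake"] * cakes + use["cookie"] * cookies <= STOCK[item]
        for item, use in USAGE.items()
    )

def best_plan(max_units=50):
    """Try every whole-number plan and keep the most profitable feasible one."""
    best = (0, 0, 0)  # (profit, cakes, cookies)
    for cakes in range(max_units + 1):
        for cookies in range(max_units + 1):
            if is_feasible(cakes, cookies):
                profit = PROFIT["cake"] * cakes + PROFIT["cookie"] * cookies
                if profit > best[0]:
                    best = (profit, cakes, cookies)
    return best

profit, cakes, cookies = best_plan()
print(profit, cakes, cookies)  # 180 15 10
```

With these assumed figures the best plan uses all of the flour and eggs, which is typical: the optimum usually sits where one or more constraints are exactly met.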

2. Analysis ToolPak
Analysis ToolPak is an Excel add-in that provides advanced data analysis
functions like regression, correlation, histograms, and more.
Example: Find Average Monthly Sales Growth
You have monthly sales data for a year and want to know if sales are increasing.
• Step 1: Enable "Analysis ToolPak" in Excel
• Step 2: Use the "Regression" tool to analyze the relationship between
months and sales
• Result: Excel will show how much sales are increasing each month and
whether the trend is significant.

1. Enabling Solver and Analysis ToolPak

Both Solver and Analysis ToolPak are Excel add-ins that need to be enabled
first.

Steps to Enable
1. Open Excel and go to File > Options.
2. Click on Add-ins (left panel).
3. At the bottom, next to Manage, select Excel Add-ins and click Go.
4. Check Solver Add-in and Analysis ToolPak, then click OK.

Now, you will see Solver under Data > Solver, and Analysis ToolPak functions
under Data > Data Analysis.

2. Using Analysis ToolPak

Let’s analyze monthly sales data and check the sales trend using Regression
Analysis.
Example:

Month    Sales ($)
Jan      5000
Feb      5200
Mar      5400
Apr      6000
May      6500

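Steps:
1. Click on Data > Data Analysis > Regression.
2. Input Y Range: Select the sales data.
3. Input X Range: Select the months, entered as numbers (1 to 5), because the
Regression tool accepts only numeric input.
4. Click OK, and Excel will generate a report showing how much sales change
per month and whether the trend is statistically significant.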
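
The trend that the Regression report estimates for this table can be reproduced by hand. The sketch below, in Python, numbers the months 1 to 5 (an assumption, since the tool needs numeric X values) and computes the slope, intercept, and R squared; the variable names are illustrative.

```python
months = [1, 2, 3, 4, 5]                 # Jan..May coded as 1..5 (assumption)
sales = [5000, 5200, 5400, 6000, 6500]   # sales figures from the table above

n = len(months)
mean_x = sum(months) / n
mean_y = sum(sales) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales))
sxx = sum((x - mean_x) ** 2 for x in months)
syy = sum((y - mean_y) ** 2 for y in sales)

slope = sxy / sxx                     # average change in sales per month
intercept = mean_y - slope * mean_x   # fitted sales at month 0
r_squared = sxy ** 2 / (sxx * syy)    # share of variation the line explains

print(slope, intercept, round(r_squared, 3))  # 380.0 4480.0 0.945
```

A slope of 380 means the fitted line rises by about $380 per month, and an R squared near 0.945 means the straight line accounts for most of the variation in these five points.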

Scenario Analysis and What-If Analysis in Excel
What-If Analysis in Excel helps you experiment with different values to see
how they affect the final result. It includes Goal Seek, Data Tables, and
Scenario Manager.

1. Goal Seek
Use Goal Seek when you know the desired result and need to find the input value
that achieves it.

Example :

SUBJECT MARK
English 73
Maths 78
Science 74

Percentage of mark: 75%

Steps to Use Goal Seek:

1. Click on the cell that holds the percentage of marks.
2. Go to Data → What-If Analysis → Goal Seek.
3. In Set Cell, select the cell containing the percentage formula.
4. In To Value, enter the goal percentage.
5. In By Changing Cell, select any mark cell (e.g., B2).
6. Click OK
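
The idea behind these steps can be sketched in Python: fix a target for the result and search for the input that produces it. The marks are those in the example above; the 80% target, the 100-mark maximum per subject, and the function names are assumptions, and bisection is used here only as a simple stand-in for whatever iterative search Excel performs.

```python
def percentage(english, maths=78, science=74):
    """Percentage of marks, assuming each subject is out of 100."""
    return (english + maths + science) / 300 * 100

def goal_seek(func, target, low, high, tolerance=1e-9):
    """Find x in [low, high] with func(x) close to target.

    Bisection search; assumes func increases with x and that the
    target is reachable somewhere inside the interval.
    """
    while high - low > tolerance:
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid     # result too small, so the input must be larger
        else:
            high = mid    # result large enough, so try a smaller input
    return (low + high) / 2

current = percentage(73)                       # 75.0, as in the example
needed = goal_seek(percentage, 80, 0, 100)     # English mark needed for 80%
print(current, round(needed, 2))
```

Here the "Set Cell" is the value returned by percentage, the "To Value" is 80, and the "By Changing Cell" is the English mark.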

2. Data Tables
Use a Data Table when you want to see multiple possible results by changing one or two inputs.

Example: Multiplication Table

Steps to Use Data Table

1. Enter the numbers 1 to 5 across a row and down a column, leaving the corner
cell where the row and column meet free for the formula.
2. In that corner cell, enter the formula =A1*B1.
3. Select the entire table.
4. Go to Data → What-If Analysis → Data Table.
5. In Row Input Cell, select A1 (an empty cell).
6. In Column Input Cell, select B1 (also an empty cell).
7. Click OK.
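
A two-input Data Table evaluates one formula for every combination of a row value and a column value. The Python sketch below mirrors that for the multiplication example; the function and variable names are illustrative and not part of Excel.

```python
def product(row_input, column_input):
    """Plays the role of the corner formula =A1*B1."""
    return row_input * column_input

row_values = [1, 2, 3, 4, 5]      # numbers entered across the top row
column_values = [1, 2, 3, 4, 5]   # numbers entered down the left column

# One list per table row: each entry substitutes a (row, column) pair
table = [[product(r, c) for r in row_values] for c in column_values]

for c, results in zip(column_values, table):
    print(c, results)
```

Each printed line corresponds to one row of the finished Data Table, starting with its column value.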

3. Scenario Manager
Use Scenario Manager when you want to compare multiple sets of input values at
once. It is a tool that lets you create, analyze, and compare different data
scenarios. It helps you understand how changing variables can affect the outcome.

Example: Create an Excel sheet of names and marks.

Steps to Use Scenario Manager

1. Go to Data → What-If Analysis → Scenario Manager.
2. Click Add, then enter a scenario name, the changing cells, and their values.
3. Select a scenario and click Show to apply its values to the sheet.
4. Click Summary to generate a Scenario Summary Report that lists all
scenarios side by side.
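
What Scenario Manager does can be sketched in Python as a set of named input sets run through the same calculation. The "Current" marks are borrowed from the Goal Seek example; the "Best case" and "Worst case" values, the scenario names, and the 100-mark maximum are all hypothetical.

```python
def percentage(marks, max_per_subject=100):
    """Percentage of marks across all subjects in one scenario."""
    return sum(marks.values()) / (max_per_subject * len(marks)) * 100

# Each scenario is a named set of values for the changing cells
scenarios = {
    "Current": {"English": 73, "Maths": 78, "Science": 74},
    "Best case": {"English": 90, "Maths": 95, "Science": 85},    # hypothetical
    "Worst case": {"English": 60, "Maths": 65, "Science": 55},   # hypothetical
}

# Counterpart of the Scenario Summary Report: one result per scenario
summary = {name: round(percentage(marks), 2) for name, marks in scenarios.items()}

for name, result in summary.items():
    print(name, result)
```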

Introduction to DAX (Data Analysis Expressions)


DAX (Data Analysis Expressions) is a special formula language used in Power
BI, Power Pivot (Excel), and Analysis Services. It helps in performing
advanced calculations and data analysis on large datasets. Unlike normal Excel
formulas that work on single cells, DAX works on entire tables and columns
at once.

Why Use DAX?

DAX is useful when simple Excel formulas are not enough for data analysis. It
helps in:
• Creating new calculated columns and measures.
• Summarizing and filtering large amounts of data.
• Performing time-based analysis, such as year-over-year growth.
• Working with relationships between different tables.

Key Features of DAX

• Column-Based Calculations → Works on tables and columns, not
individual cells.
• Functions and Operators → Similar to Excel, but more powerful.
• Context Awareness → Understands filters and relationships between
data.
• Performance Optimized → Works efficiently on large datasets.

Steps to Use DAX in Excel (Power Pivot)


Step 1: Enable Power Pivot in Excel

1. Open Excel → Go to File → Options → Add-ins.
2. Select COM Add-ins → Click Go.
3. Check Microsoft Power Pivot for Excel → Click OK.

Step 2: Load Data into Power Pivot

1. Click Power Pivot → Manage.
2. Import your Excel table into Power Pivot.

Step 3: Write a DAX Formula

1. Click on an empty column in Power Pivot.
2. Type a DAX formula like:

Total Sales = 'Sales'[Price] * 'Sales'[Quantity]

3. Press Enter and the entire column will be calculated.
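
The behaviour of a calculated column, where one formula is applied to every row of the table, can be imitated in Python. This is only an analogy and not DAX itself; the three rows of Sales data are hypothetical, and the final sum stands in for the kind of aggregate a measure would return.

```python
# Hypothetical rows of a Sales table
sales = [
    {"Price": 10.0, "Quantity": 3},
    {"Price": 4.5, "Quantity": 10},
    {"Price": 20.0, "Quantity": 1},
]

# Calculated column: the same expression is evaluated once per row
for row in sales:
    row["Total Sales"] = row["Price"] * row["Quantity"]

# Measure-style aggregate: a single value summarizing the whole column
grand_total = sum(row["Total Sales"] for row in sales)

print([row["Total Sales"] for row in sales], grand_total)
```

The formula is written once for the column as a whole, which is the main contrast with worksheet formulas that are copied cell by cell.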

Step 4: Use DAX in PivotTable

1. Go back to Excel and insert a PivotTable.
2. Drag and drop your DAX column or measure into the Values field.

✔ DAX is a powerful tool for data analysis in Power Pivot (Excel) and Power
BI.
✔ It helps in creating complex calculations easily.
✔ Works best for large datasets and advanced reports.
✔ Learning DAX can make data analysis faster and more efficient.
