AARYADAV
Technology
Department of Computer Science and Engineering
LAB MANUAL
CERTIFICATE
This is to certify that this manual contains practical work performed by Miss
Aarya Rai of B.E. – Computer Science and Engineering, studying in THIRD
YEAR – 6th Semester, having Enrollment No. 210450131039, who has satisfactorily
completed her practical work in the subject of Data Analysis and Visualization
(3161613) for the term ending in May 2023-2024.
DATE: -
PRACTICAL LIST
(Academic Year: 2022-23)
PRACTICAL – 1
Aim: Download and study the dataset for various data analysis
statistics.
SOLUTION:
Source: https://www.kaggle.com/datasets
Donor:
TATA Motors
Dataset Information:
This dataset contains the historical stock prices of Tata Motors Limited in INR on a daily
basis. The stock prices were collected through Yahoo Finance. Ticker symbol -
TATAMOTORS.NS
Attribute Information :
ExcelSheet:
PRACTICAL – 2
Mean :
The mean is one of the most widely used statistics. In Excel it is computed with the
AVERAGE function, which adds up all numbers within a data set and divides the total by
the number of points added together.
Excel Sheet :
Median :
The MEDIAN function returns the middle value of a data set once it is sorted. When the
data set has an even number of values, it returns the average of the two middle numbers.
The syntax for Median Calculation In Excel : =MEDIAN (array of numbers)
Excel Sheet :
Mode :
The MODE function in Excel returns the most commonly occurring number in any
numeric set of data.
Excel Sheet :
Standard Deviation :
The standard deviation functions in Excel (STDEV.S for a sample, STDEV.P for a whole
population) calculate how much the observed values vary or deviate from the average.
Excel Sheet :
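The mean, median, mode and standard deviation described above can also be sketched outside Excel. The JavaScript below is only an illustration: the function names and the sample values are assumptions, not part of the practical, and the standard deviation uses the sample formula (dividing by n - 1).

```javascript
// Arithmetic mean: sum of the values divided by their count.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Median: middle value of the sorted data, or the average of the two
// middle values when the count is even.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Mode: most frequently occurring value. On a tie, the value that
// reached the top count first is kept.
function mode(values) {
  const counts = new Map();
  let best = values[0];
  for (const v of values) {
    counts.set(v, (counts.get(v) || 0) + 1);
    if (counts.get(v) > counts.get(best)) best = v;
  }
  return best;
}

// Sample standard deviation: square root of the summed squared
// deviations from the mean, divided by n - 1.
function sampleStdDev(values) {
  const m = mean(values);
  const sumSq = values.reduce((acc, v) => acc + (v - m) ** 2, 0);
  return Math.sqrt(sumSq / (values.length - 1));
}

// Made-up sample values, used only to exercise the functions.
const sampleValues = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(mean(sampleValues), median(sampleValues), mode(sampleValues));
console.log(sampleStdDev(sampleValues));
```

Dividing by n instead of n - 1 in the last function would give the population standard deviation.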
Correlation Coefficient :
The CORREL function helps determine the strength of the linear relationship between two
variables. Data analysts mainly use it to study how closely two series move together. You
must remember that the correlation coefficient always lies between -1 and +1.
Excel Sheet :
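As a rough illustration of what the correlation coefficient measures, the sketch below computes the Pearson correlation of two equally long series. The function name and the sample arrays are assumptions made for this example.

```javascript
// Pearson correlation coefficient of two equally long numeric arrays.
function correlation(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((s, v) => s + v, 0) / n;
  const meanY = ys.reduce((s, v) => s + v, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;   // how the two series move together
    varX += dx * dx;  // spread of the first series
    varY += dy * dy;  // spread of the second series
  }
  return cov / Math.sqrt(varX * varY);
}

// Made-up series: the second rises with the first, the third falls.
console.log(correlation([1, 2, 3, 4], [2, 4, 6, 8]));
console.log(correlation([1, 2, 3, 4], [8, 6, 4, 2]));
```

A result near +1 means the series rise together, a result near -1 means one falls as the other rises, and a result near 0 means there is no linear relationship.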
Relative :
In Excel, a relative cell reference is used by default. Excel uses a relative reference
whenever we insert a cell reference or a range within a formula. A relative reference is
written as a plain combination of column letter and row number, such as A1. There is no
dollar ($) sign in the relative reference for the cell. When a formula containing a relative
reference is copied to another cell, the reference shifts by the same number of rows and
columns.
Relative cell references come in handy when you need to apply the same formula to a set of
cells and each copy must refer to the cells next to it.
Excel Sheet :
Absolute :
When copying or using AutoFill, there are times when the cell reference must stay the
same. A column and/or row reference is kept constant using dollar signs. To turn a
relative reference into an absolute one, put a dollar sign ($) before both the column
letter and the row number, as in $A$1.
Excel Sheet :
Mixed Reference :
An absolute column and relative row, or an absolute row and relative column, is a mixed
cell reference. You get an absolute column or absolute row when you individually put the
$ before the column letter or before the row number. Example: $B8 is relative to row 8
but absolute for column B, and B$8 is absolute for row 8 but relative for column B.
Here, the dollar ($) before the row number fixes/locks the row, and the dollar before the
column name fixes/locks the column.
Excel Sheet :
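The behaviour of relative, absolute and mixed references when a formula is copied can be modelled with a short sketch. This is not how Excel is implemented. It is an assumed helper, `copyReference`, that shifts only the unlocked parts of an A1-style reference by a row and column offset.

```javascript
// Convert column letters to a 1-based index ("A" -> 1, "Z" -> 26, "AA" -> 27).
function columnToIndex(letters) {
  let index = 0;
  for (const ch of letters) index = index * 26 + (ch.charCodeAt(0) - 64);
  return index;
}

// Convert a 1-based column index back to letters (27 -> "AA").
function indexToColumn(index) {
  let letters = "";
  while (index > 0) {
    const rem = (index - 1) % 26;
    letters = String.fromCharCode(65 + rem) + letters;
    index = Math.floor((index - 1) / 26);
  }
  return letters;
}

// Shift a reference as if its formula were copied rowOffset rows down and
// colOffset columns right. Parts preceded by "$" are locked and stay put.
// Offsets that would move past column A or row 1 are not handled here.
function copyReference(ref, rowOffset, colOffset) {
  const match = /^(\$?)([A-Z]+)(\$?)(\d+)$/.exec(ref);
  if (!match) throw new Error("Not an A1-style reference: " + ref);
  const [, colLock, col, rowLock, row] = match;
  const newCol = colLock ? col : indexToColumn(columnToIndex(col) + colOffset);
  const newRow = rowLock ? row : String(Number(row) + rowOffset);
  return colLock + newCol + rowLock + newRow;
}

// Copy each kind of reference one row down and one column to the right.
for (const ref of ["B8", "$B$8", "$B8", "B$8"]) {
  console.log(ref, "->", copyReference(ref, 1, 1));
}
```

In this model the relative reference B8 becomes C9, the absolute reference $B$8 is unchanged, $B8 keeps its column and B$8 keeps its row.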
AVERAGE :
The AVERAGE function is used to calculate the average (arithmetic mean) of a range of
numbers. The basic syntax of the AVERAGE function is as follows:
=AVERAGE(number1, [number2], ...)
COUNT :
The COUNT function is used to count the number of cells that contain numbers within a
specified range. The basic syntax of the COUNT function is as follows:
=COUNT(value1, [value2], ...)
Excel Sheet:
SUM IF :
The SUMIF function in Microsoft Excel is used to sum values based on a specified
condition. The basic syntax of the SUMIF function is:
=SUMIF(range, criteria, [sum_range])
AVERAGE IF :
The AVERAGEIF function in Microsoft Excel is used to calculate the average of a range
of cells that meet a specified condition. The basic syntax of the AVERAGEIF function is
as follows:
=AVERAGEIF(range, criteria, [average_range])
Excel Sheet:
COUNT IF :
The COUNTIF function in Microsoft Excel is used to count the number of cells within a
range that meet a specified condition. The basic syntax of the COUNTIF function is as
follows:
=COUNTIF(range, criteria)
SUM IFS :
The SUMIFS function in Microsoft Excel is used to sum values based on multiple
criteria. This function allows you to apply different conditions to different ranges and
sum the corresponding values that meet all specified criteria. The basic syntax of the
SUMIFS function is as follows:
=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...)
AVERAGE IFS :
The AVERAGEIFS function is used to calculate the average of a range of cells that meet
multiple specified conditions. This function is an extension of the AVERAGEIF function,
allowing you to apply multiple criteria. The basic syntax of the AVERAGEIFS function
is as follows:
=AVERAGEIFS(average_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...)
COUNT IFS :
The COUNTIFS function is used to count the number of cells that meet multiple
specified conditions. This function is an extension of the COUNTIF function and allows
you to apply multiple criteria. The basic syntax of the COUNTIFS function is as follows:
=COUNTIFS(criteria_range1, criteria1, [criteria_range2, criteria2], ...)
Excel Sheet:
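The conditional functions above all follow one pattern: keep the rows that satisfy every criterion, then sum, average or count them. The sketch below mirrors that pattern in JavaScript. The row fields (`region`, `product`, `amount`) and the data are hypothetical, and criteria are passed as predicate functions rather than Excel criteria strings.

```javascript
// Hypothetical rows standing in for a worksheet range.
const sales = [
  { region: "North", product: "Car", amount: 100 },
  { region: "South", product: "Car", amount: 200 },
  { region: "North", product: "Bike", amount: 50 },
  { region: "North", product: "Car", amount: 150 },
];

// Keep only the rows that pass every criterion (the "IFS" part).
function matching(rows, criteria) {
  return rows.filter((row) => criteria.every((test) => test(row)));
}

// COUNTIFS-like: number of rows meeting all criteria.
function countIfs(rows, ...criteria) {
  return matching(rows, criteria).length;
}

// SUMIFS-like: sum of valueOf(row) over the rows meeting all criteria.
function sumIfs(rows, valueOf, ...criteria) {
  return matching(rows, criteria).reduce((sum, row) => sum + valueOf(row), 0);
}

// AVERAGEIFS-like: mean of valueOf(row) over the matching rows.
// Returns NaN when nothing matches (Excel reports #DIV/0! in that case).
function averageIfs(rows, valueOf, ...criteria) {
  return sumIfs(rows, valueOf, ...criteria) / countIfs(rows, ...criteria);
}

const isNorth = (row) => row.region === "North";
const isCar = (row) => row.product === "Car";
console.log(sumIfs(sales, (r) => r.amount, isNorth));            // one criterion
console.log(sumIfs(sales, (r) => r.amount, isNorth, isCar));     // two criteria
console.log(countIfs(sales, isNorth, isCar));
console.log(averageIfs(sales, (r) => r.amount, isNorth, isCar));
```

Passing a single criterion gives the SUMIF, AVERAGEIF and COUNTIF behaviour. Passing several gives the IFS variants.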
VLookUp:
VLOOKUP, or Vertical Lookup, is a powerful function in Microsoft Excel that allows
you to search for a value in the first column of a specified range (table) and return a
corresponding value from the same row. The syntax for the VLOOKUP function is as follows:
=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
Excel Sheet:
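A minimal sketch of the exact-match case of VLOOKUP (range_lookup set to FALSE) is shown below. The `vlookup` helper and the table contents are assumptions for illustration. Approximate matching is not covered.

```javascript
// Search the first column of a table (array of rows) for lookupValue and
// return the value from column colIndex (1-based) of the same row.
// Returns undefined when nothing matches (Excel would show #N/A).
function vlookup(lookupValue, table, colIndex) {
  const row = table.find((r) => r[0] === lookupValue);
  return row ? row[colIndex - 1] : undefined;
}

// Hypothetical table: id, name, price.
const products = [
  [101, "Pen", 10],
  [102, "Notebook", 45],
  [103, "Bag", 650],
];

console.log(vlookup(102, products, 2)); // name of product 102
console.log(vlookup(103, products, 3)); // price of product 103
```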
HLookUp:
HLOOKUP, or Horizontal Lookup, is another useful function in Microsoft Excel that
allows you to search for a value in the first row of a table and return a corresponding
value from the same column. The syntax for the HLOOKUP function is as follows:
=HLOOKUP(lookup_value, table_array, row_index_num, [range_lookup])
Excel Sheet :
XLookUp:
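XLOOKUP is designed to replace older lookup functions like VLOOKUP, HLOOKUP, and
LOOKUP by offering improved features and flexibility.
Here is the basic syntax for the XLOOKUP function:
=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found], [match_mode], [search_mode])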
Excel Sheet:
INDEX FUNCTION:
The INDEX function returns the value of a cell in a specified row and column of a range.
Syntax:
=INDEX(array, row_num, [column_num])
MATCH FUNCTION:
The MATCH function searches for a specified value in a range and returns the relative
position of that item.
Syntax:
=MATCH(lookup_value, lookup_array, [match_type])
Excel Sheet:
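INDEX and MATCH are often combined: MATCH finds the position of a value, and INDEX reads the cell at that position. The sketch below imitates the exact-match case. The function names, the 1-based positions and the sample data are assumptions made for this example.

```javascript
// MATCH-like: 1-based position of lookupValue in lookupArray (exact match),
// or undefined when it is absent.
function match(lookupValue, lookupArray) {
  const pos = lookupArray.indexOf(lookupValue);
  return pos === -1 ? undefined : pos + 1;
}

// INDEX-like: value at the given 1-based row and column of a 2D array.
function index(array, rowNum, colNum = 1) {
  return array[rowNum - 1][colNum - 1];
}

// Hypothetical data: a column of names and a table of marks.
const names = ["Asha", "Ravi", "Meena"];
const marks = [
  [78, 82],
  [65, 90],
  [88, 71],
];

// Look up the second mark of "Ravi": find the row, then read the cell.
const row = match("Ravi", names);
console.log(row, index(marks, row, 2));
```

Unlike VLOOKUP, this combination does not require the lookup column to be the first column of the table.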
PRACTICAL – 3
Aim: Collect month-wise, state-wise COVID case data, plot this time series
data, and analyze the trend over time.
SOLUTION:
Coronaviruses are a family of viruses that cause illness ranging from the common cold and
cough to more severe disease. SARS-CoV-2 (n-coronavirus) is a new member of the
coronavirus family, first discovered in 2019, that had not previously been identified in
humans. It is a contagious virus that started spreading from Wuhan in December 2019 and was
later declared a pandemic by the WHO because of its high rate of spread throughout the world.
As of 27 March 2020, it had led to a total of 24K+ deaths across the globe, including 16K+
deaths in Europe alone. As the pandemic spreads all over the world, it becomes more important
to understand this spread.
The number of new cases is increasing day by day around the world. This dataset has
daily-level information from the states and union territories of India. The state-wise data
is fetched from the Ministry of Health & Family Welfare, and the ICMR testing data comes from
the Indian Council of Medical Research.
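Because the source data is daily and the aim asks for month-wise figures, the records first have to be aggregated by state and month. The sketch below shows one way to do that and to compute the month-over-month change as a simple trend indicator. The record fields (`date`, `state`, `newCases`) and the numbers are made up for illustration and are not taken from the dataset.

```javascript
// Sum daily new cases into totals[state]["YYYY-MM"].
function monthlyCasesByState(dailyRecords) {
  const totals = {};
  for (const { date, state, newCases } of dailyRecords) {
    const month = date.slice(0, 7); // "2020-03-30" -> "2020-03"
    totals[state] = totals[state] || {};
    totals[state][month] = (totals[state][month] || 0) + newCases;
  }
  return totals;
}

// For one state's monthly totals, return the change from each month to
// the next: positive values mean a rising trend.
function monthOverMonthChange(monthly) {
  const months = Object.keys(monthly).sort();
  return months.slice(1).map((month, i) => ({
    month,
    change: monthly[month] - monthly[months[i]],
  }));
}

// Made-up daily records, for illustration only.
const records = [
  { date: "2020-03-30", state: "Kerala", newCases: 10 },
  { date: "2020-03-31", state: "Kerala", newCases: 5 },
  { date: "2020-04-01", state: "Kerala", newCases: 20 },
  { date: "2020-04-01", state: "Delhi", newCases: 7 },
];

const totals = monthlyCasesByState(records);
console.log(totals);
console.log(monthOverMonthChange(totals["Kerala"]));
```

The monthly totals per state are the series that would be plotted as a line chart, with one line per state.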
PRACTICAL – 4
SOLUTION:
Data Cleaning:
Real-world data tend to be incomplete, noisy, and inconsistent. Data cleaning (or data
cleansing) routines work to clean the data by filling in missing values, smoothing noisy data,
identifying or removing outliers, and resolving inconsistencies. If users believe the data are
dirty, they are unlikely to trust the results of any data mining that has been applied to it.
Furthermore, dirty data can cause confusion for the mining procedure, resulting in unreliable
output. Although most mining routines have some procedures for dealing with incomplete or
noisy data, they are not always robust. Instead, they may concentrate on avoiding overfitting the
data to the function being modeled. Therefore, a useful pre-processing step is to run your data
through some data cleaning routines.
Missing Value - In the unemployed dataset there are some missing values, denoted by ?. We have
to apply the following missing-value treatments to fill in those values.
Ignore the tuple : This is usually done when the class label is missing (assuming the mining
task involves classification). This method is not very effective, unless the tuple contains several
attributes with missing values. It is especially poor when the percentage of missing values per
attribute varies considerably.
Use a global constant to fill in the missing value : Replace all missing attribute values by the
same constant, such as a label like "Unknown". If missing values are replaced by, say, "Unknown",
then the mining program may mistakenly think that they form an interesting concept, since they
all have a value in common, namely "Unknown". Hence, although this method is simple, it is not
foolproof.
Use the attribute mean to fill in the missing value: Replace all missing attribute values in a
single column by the average of that column's data.
Use the maximum attribute value to fill in the missing value: Replace all missing attribute
values in a single column by the maximum of that column's data.
Use the minimum attribute value to fill in the missing value: Replace all missing attribute
values in a single column by the minimum of that column's data.
Use the most frequent value to fill in the missing value: Replace all missing attribute
values in a single column by the most frequently occurring value of that column.
Processed Data :
Ignore the tuple: In the dataset, ignore every row that contains at least one missing value.
Use a global constant to fill the missing value : In the dataset, fill all the missing values
with -1, which is taken as the global constant. Note that -1 is chosen here because no value in
the dataset is equal to -1.
Use the attribute mean to fill the missing value : To fill the missing values in a
particular column, first calculate the average (=AVERAGE(A1:A40)) of the column and fill the
missing values of that column with it. Repeat the same process for all the remaining columns.
Use the maximum attribute value to fill the missing value : To fill the missing values in a
particular column, first find the maximum value (=MAX(A1:A40)) of the column and fill the
missing values of that column with it. Repeat the same process for all the remaining columns.
Use the minimum attribute value to fill the missing value : To fill the missing values in a
particular column, first find the minimum value (=MIN(A1:A40)) of the column and fill the
missing values of that column with it. Repeat the same process for all the remaining columns.
Use the most frequent value to fill the missing value : Replace all missing attribute
values in a single column by the most frequently occurring value of that column.
Use a constant to fill the missing value : Using a constant to fill missing values involves
entering a specific number or text (the constant) into cells where data is missing to maintain data
consistency.
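The treatments listed above can be summarised in a short sketch. The code below is an illustration only: it assumes a column is an array in which the string "?" marks a missing value (as in the dataset described above), and the function names and sample numbers are hypothetical.

```javascript
const MISSING = "?"; // marker used for missing values

// Fill the missing entries of one column using the chosen strategy:
// "mean", "max", "min", "mode" or "constant".
function fillMissing(column, strategy, constant = -1) {
  const present = column.filter((v) => v !== MISSING);
  let fill;
  switch (strategy) {
    case "mean":
      fill = present.reduce((s, v) => s + v, 0) / present.length;
      break;
    case "max":
      fill = Math.max(...present);
      break;
    case "min":
      fill = Math.min(...present);
      break;
    case "mode": {
      // Most frequent value; the first value to reach the top count wins ties.
      const counts = new Map();
      fill = present[0];
      for (const v of present) {
        counts.set(v, (counts.get(v) || 0) + 1);
        if (counts.get(v) > counts.get(fill)) fill = v;
      }
      break;
    }
    case "constant":
      fill = constant;
      break;
    default:
      throw new Error("Unknown strategy: " + strategy);
  }
  return column.map((v) => (v === MISSING ? fill : v));
}

// "Ignore the tuple": drop every row that has at least one missing value.
function dropIncompleteRows(rows) {
  return rows.filter((row) => !row.includes(MISSING));
}

// Made-up column with two missing entries.
const column = [10, "?", 20, 20, "?", 30];
for (const s of ["mean", "max", "min", "mode", "constant"]) {
  console.log(s, fillMissing(column, s));
}
console.log(dropIncompleteRows([[1, 2], [3, "?"], [5, 6]]));
```

Each strategy computes its fill value from the non-missing entries only, so the marker itself never influences the result.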
PRACTICAL – 5
Data Normalization is an important aspect of data management and analysis that plays a crucial
role in both data storage and data analysis. In databases, it is a systematic approach of
decomposing tables to eliminate redundant data and undesirable characteristics.
The primary goal of database normalization is to allow data to be added, deleted, and modified
without causing inconsistencies. It ensures that each data item is stored in only one place,
which reduces the overall disk space requirement and improves the consistency and reliability of
the system.
In data analysis and machine learning, normalization instead means preprocessing the data,
typically by rescaling values to a common range, before it is used in any analysis.
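For the data-analysis sense of normalization, two commonly used rescalings are min-max scaling and z-score standardization. The sketch below shows both. It is not necessarily the method used in this practical's input and output, and the function names and sample values are assumptions.

```javascript
// Min-max normalization: rescale values linearly into the range [0, 1].
// A constant column (max equals min) is mapped to all zeros.
function minMaxNormalize(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  if (max === min) return values.map(() => 0);
  return values.map((v) => (v - min) / (max - min));
}

// Z-score normalization: subtract the mean and divide by the population
// standard deviation, so the result has mean 0 and standard deviation 1.
function zScoreNormalize(values) {
  const n = values.length;
  const mean = values.reduce((s, v) => s + v, 0) / n;
  const variance = values.reduce((s, v) => s + (v - mean) ** 2, 0) / n;
  const sd = Math.sqrt(variance);
  return values.map((v) => (v - mean) / sd);
}

// Made-up values.
console.log(minMaxNormalize([10, 20, 30]));
console.log(zScoreNormalize([2, 4, 4, 4, 5, 5, 7, 9]));
```

Min-max scaling keeps the shape of the distribution but is sensitive to outliers, while z-scores express each value as a number of standard deviations from the mean.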
INPUT:
OUTPUT:
PRACTICAL – 6
INPUT:
OUTPUT:
PRACTICAL – 7
Linear Regression :
Linear regression is a technique used for numerical prediction. Regression is a statistical
method that attempts to determine the strength of the relationship between one dependent
variable (i.e. the label attribute) and a series of other changing variables known as
independent variables (regular attributes). Just as classification is used for predicting
categorical labels, regression is used for predicting a continuous value. For example, we may
wish to predict the salary of university graduates with 5 years of work experience, or the
potential sales of a new product given its price. Regression is often used to determine how much
specific factors, such as the price of a commodity, interest rates, or particular industries or
sectors, influence the price movement of an asset.
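To make the idea concrete, the sketch below fits a straight line with one independent variable by ordinary least squares. It illustrates the technique and is not the RapidMiner process used in this practical. The function name and the experience/salary numbers are made up.

```javascript
// Fit y = intercept + slope * x by ordinary least squares.
function fitLine(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((s, v) => s + v, 0) / n;
  const meanY = ys.reduce((s, v) => s + v, 0) / n;
  let sxy = 0; // sum of products of deviations
  let sxx = 0; // sum of squared deviations of x
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  return { slope, intercept, predict: (x) => intercept + slope * x };
}

// Made-up data: years of experience vs. salary (arbitrary units).
const years = [1, 2, 3, 4];
const salary = [3, 5, 7, 9];
const model = fitLine(years, salary);
console.log(model.slope, model.intercept, model.predict(5));
```

The slope estimates how much the dependent variable changes for each unit change in the independent variable, and `predict` returns the continuous value the fitted line gives for a new input.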
Steps :
Output :
PRACTICAL – 8
Aim: Implement and analyze the KNN algorithm on the given data set (discussed in
theory) using RapidMiner.
KNN Algorithm:
K-Nearest Neighbour (K-NN) is one of the simplest machine learning algorithms based on the
supervised learning technique. It assumes that a new case is similar to the available cases and
puts the new case into the category it is most similar to.
It stores all the available data and classifies a new data point based on similarity. This means
that when new data appears, it can easily be classified into a well-suited category using the
K-NN algorithm.
It can be used for regression as well as for classification, but it is mostly used for
classification problems. K-NN is a non-parametric algorithm, which means it does not make any
assumption about the underlying data. It is also called a lazy learner algorithm because it does
not learn from the training set immediately; instead it stores the dataset and performs the
computation on it at the time of classification.
At the training phase the KNN algorithm just stores the dataset, and when it gets new data, it
classifies that data into the category most similar to it.
Example : Suppose we have an image of a creature that looks similar to both a cat and a dog, and
we want to know whether it is a cat or a dog. For this identification we can use the KNN
algorithm, as it works on a similarity measure. Our KNN model will compare the features of the
new image with those of the cat and dog images and, based on the most similar features, put it
in either the cat or the dog category.
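The description above can be sketched in a few lines of code. This is an illustrative implementation, not the RapidMiner operator: it assumes numeric feature vectors, Euclidean distance and a simple majority vote, and the "cat"/"dog" feature values are made up.

```javascript
// Euclidean distance between two feature vectors of equal length.
function euclidean(a, b) {
  return Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));
}

// Classify `point` by majority vote among its k nearest training examples.
// There is no training step: the stored examples are used directly.
function knnClassify(training, point, k) {
  const nearest = training
    .map(({ features, label }) => ({ label, distance: euclidean(features, point) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);
  const votes = {};
  for (const { label } of nearest) votes[label] = (votes[label] || 0) + 1;
  // Pick the label with the most votes (first one wins a tie).
  return Object.keys(votes).reduce((best, label) =>
    votes[label] > votes[best] ? label : best
  );
}

// Made-up two-feature examples for the cat/dog illustration.
const training = [
  { features: [1, 1], label: "cat" },
  { features: [1, 2], label: "cat" },
  { features: [2, 1], label: "cat" },
  { features: [8, 8], label: "dog" },
  { features: [8, 9], label: "dog" },
  { features: [9, 8], label: "dog" },
];

console.log(knnClassify(training, [2, 2], 3));
console.log(knnClassify(training, [7, 8], 3));
```

An odd k is often chosen for two-class problems so that the vote cannot end in a tie.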
Steps:
Output:
PRACTICAL – 9
Aim: Execute and analyze the Random Forest algorithm on a demo example using
RapidMiner.
Random Forest is a powerful tree-based learning technique in machine learning. It works
by creating a number of decision trees during the training phase. Each tree is constructed using
a random subset of the data set, and a random subset of features is considered at each split.
INPUT:
OUTPUT:
PRACTICAL – 10
Dataset:
Steps :
1. Take the data into process space
2. Now put K – Means Clustering in process space.
3. Connect all.
4. Run the process.
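What the K-Means operator does when the process runs can be sketched as follows. This is an illustrative version of the standard algorithm, not RapidMiner's implementation. The function names, the fixed starting centroids and the sample points are assumptions chosen to keep the run deterministic.

```javascript
// Squared Euclidean distance between two points.
function squaredDistance(a, b) {
  return a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0);
}

// Index of the centroid closest to the point.
function nearestIndex(point, centroids) {
  let best = 0;
  for (let j = 1; j < centroids.length; j++) {
    if (squaredDistance(point, centroids[j]) < squaredDistance(point, centroids[best])) {
      best = j;
    }
  }
  return best;
}

// K-means: alternate between assigning points to the nearest centroid and
// moving each centroid to the mean of its points, until nothing changes.
function kMeans(points, initialCentroids, maxIterations = 100) {
  let centroids = initialCentroids.map((c) => [...c]);
  let assignments = [];
  for (let iter = 0; iter < maxIterations; iter++) {
    const next = points.map((p) => nearestIndex(p, centroids));
    if (iter > 0 && next.every((a, i) => a === assignments[i])) break;
    assignments = next;
    centroids = centroids.map((c, j) => {
      const members = points.filter((_, i) => assignments[i] === j);
      if (members.length === 0) return c; // keep an empty cluster's centroid
      return c.map((_, d) => members.reduce((s, p) => s + p[d], 0) / members.length);
    });
  }
  return { centroids, assignments };
}

// Made-up 2D points forming two obvious groups.
const points = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]];
console.log(kMeans(points, [[0, 0], [10, 10]]));
```

In practice the starting centroids are usually picked at random, so different runs can produce different clusters.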
Output:
Code:
<html>
<head>
<title>Selection</title>
<script src="https://d3js.org/d3.v7.min.js"></script>
</head>
<body>
<p>Hello</p> <br>
<p id="p1">Welcome to DAV Lab</p> <br>
<p class="c1">DAV is useful for data analysis and visualization.</p> <br>
<p class="c1">DAV is very interesting subject.</p>
<script>
d3.select("p").style("font-size", "20px");
d3.select("#p1").style("color", "red");
d3.selectAll(".c1").style("color", "pink");
</script>
</body>
</html>
Output :
PRACTICAL – 2
Code:
<html>
<head>
<title>Manipulation</title>
<script src="https://d3js.org/d3.v7.min.js"></script>
</head>
<body>
<p id="p1"></p> <br>
<div>
</div>
<p id="p3">DAV is for Data analysis and visualization</p>
<p id="p4">Hello</p>
<p><label>D3</label><input type="checkbox" /></p>
<p><label>jQuery</label><input type="checkbox" /></p>
<script>
d3.select("#p1").text("Inserted Text");
d3.select("div").insert("p").text("Text is inserted in appended class");
d3.select("#p4").remove();
d3.select("#p3").html("<span>Inner html span class</span>");
d3.select("input").property("checked",true);
</script>
</body>
</html>
Output :
PRACTICAL – 3
Code:
<html>
<head>
<title>Function</title>
<script src="https://d3js.org/d3.v7.min.js"></script>
</head>
<body>
<p class="p1"></p>
<p></p>
<p class="p1"></p>
<p class="p1"></p>
<script>
var data = ["Hello", "good", "morning"];
var paragraph = d3.select("body")
.selectAll(".p1")
.data(data)
.text(function (d, i) {
// d is the bound datum, i its index, and this the current DOM element
console.log("d: " + d);
console.log("i: " + i);
console.log("this: " + this);
return d;
});
</script>
</body>
</html>
Output :
PRACTICAL – 4
Code:
<html>
<head>
<title>Data Binding</title>
<script src="https://d3js.org/d3.v7.min.js"></script>
</head>
<body>
<p>Hello! How are you?</p>
<p id="p1">We will meet soon..!</p>
<script>
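var myData = ["Hi! I am fine."];
var myData1 = ["Sure!"];
// Bind one datum to the paragraphs: only the first <p> receives it
var p = d3.select("body")
.selectAll("p")
.data(myData)
.text(function (d) {
return d;
});
// Bind a second datum to the paragraph with id "p1"
var p1 = d3.select("body")
.selectAll("#p1")
.data(myData1)
.text(function (d) {
return d;
});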
</script>
</body>
</html>
Output :
PRACTICAL – 5
Aim : Write and execute program to create SVG Chart using D3.JS.
Code:
<html>
<head>
<title>Chart</title>
<script src="https://d3js.org/d3.v7.min.js"></script>
<style>
svg rect {
fill: skyblue;
}
svg text {
fill:purple;
font: 30px sans-serif;
text-anchor: start;
}
</style>
</head>
<body>
<script>
var data = [70, 25, 10, 66, 27, 46];
var width = 1500,
scaleFactor = 20,
barHeight = 100;
var graph = d3.select("body")
.append("svg")
.attr("width", width)
.attr("height", barHeight * data.length);
var bar = graph.selectAll("g")
.data(data)
.enter()
.append("g")
.attr("transform", function(d, i) {
return "translate(0," + i * barHeight + ")";
});
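bar.append("rect")
.attr("width", function(d) {
return d * scaleFactor;
})
// Assumed completion: the original listing is cut off after the width
// attribute, so the bar height, value labels and closing tags are a guess.
.attr("height", barHeight - 1);
bar.append("text")
.attr("x", function(d) { return d * scaleFactor; })
.attr("y", barHeight / 2)
.attr("dy", ".35em")
.text(function(d) { return d; });
</script>
</body>
</html>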
Output :
PRACTICAL – 6
Aim : Write and execute program to create Axes graph using D3.JS.
Code:
<html>
<head>
<title>Axes</title>
<script src="https://d3js.org/d3.v7.min.js"></script>
<style>
svg text {
fill:purple;
font: 25px times;
text-anchor: end;
}
</style>
</head>
<body>
<script>
var width = 800, height = 1000;
var data = [5, 10, 15];
var svg = d3.select("body")
.append("svg")
.attr("width", width)
.attr("height", height);
var xscale = d3.scaleLinear()
.domain([0, d3.max(data)])
.range([0, width - 100]);
var yscale = d3.scaleLinear()
.domain([0, d3.max(data)])
.range([height/2, 0]);
var x_axis = d3.axisBottom()
.scale(xscale);
var y_axis = d3.axisLeft()
.scale(yscale);
svg.append("g")
.attr("transform", "translate(50, 10)")
.call(y_axis);
var xAxisTranslate = height/2 + 10;
svg.append("g")
.attr("transform", "translate(50, " + xAxisTranslate +")")
.call(x_axis)
</script>
</body>
</html>
Output :
PRACTICAL – 7
Aim : Write and execute program to create Bar Chart graph using D3.JS.
Code:
<html>
<head>
<title>Bar Chart</title>
<style>
.bar {
fill: steelblue;
}
</style>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<svg width="600" height="500"></svg>
<script>
var svg = d3.select("svg"),
margin = 200,
width = svg.attr("width") - margin,
height = svg.attr("height") - margin
svg.append("text")
.attr("transform", "translate(100,0)")
.attr("x", 140)
.attr("y", 50)
.attr("font-size", "24px")
.text("Stock Price")
var xScale = d3.scaleBand().range([0, width]).padding(0.4),
yScale = d3.scaleLinear().range([height, 0]);
var g = svg.append("g")
.attr("transform", "translate(" + 100 + "," + 100 + ")");
d3.csv("XYZ.csv", function(error, data) {
if (error) {
throw error;
}
xScale.domain(data.map(function(d) { return d.year; }));
yScale.domain([0, d3.max(data, function(d) { return d.value; })]);
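g.append("g")
.attr("transform", "translate(0," + height + ")")
.call(d3.axisBottom(xScale));
// Assumed completion: the original listing is cut off here. The lines below
// follow the Animated Bar Chart practical, which loads the same XYZ.csv,
// but without the transition and mouse handlers.
g.append("g")
.call(d3.axisLeft(yScale).ticks(10));
g.selectAll(".bar")
.data(data)
.enter().append("rect")
.attr("class", "bar")
.attr("x", function(d) { return xScale(d.year); })
.attr("y", function(d) { return yScale(d.value); })
.attr("width", xScale.bandwidth())
.attr("height", function(d) { return height - yScale(d.value); });
});
</script>
</body>
</html>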
XYZ.csv :
Output :
PRACTICAL – 8
Aim : Write and execute program to create Pie Chart using D3.JS.
Code:
<html>
<head>
<style>
.arc text {
font: 10px sans-serif;
text-anchor: middle;
}
.arc path {
stroke: #fff;
}
.title {
fill: teal;
font-weight: bold;
}
</style>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<svg width="1000" height="500"></svg>
<script>
var svg = d3.select("svg"),
width = svg.attr("width"),
height = svg.attr("height"),
radius = Math.min(width, height) /3;
var g = svg.append("g")
.attr("transform", "translate(" + width/2 + "," + height /2 + ")");
var color = d3.scaleOrdinal(['#4daf4a','#377eb8','#ff7f00','#984ea3','#e41a1c']);
var pie = d3.pie().value(function(d) {
return d.Amount;
});
var path = d3.arc()
.outerRadius(radius - 50)
.innerRadius(0);
var label = d3.arc()
.outerRadius(radius)
.innerRadius(radius - 50);
d3.csv("8.csv", function(error, data) {
if (error) {
throw error;
}
var arc = g.selectAll(".arc")
.data(pie(data))
.enter().append("g")
.attr("class", "arc");
arc.append("path")
.attr("d", path)
.attr("fill", function(d) { return color(d.data.Expenses); });
console.log(arc)
arc.append("text")
.attr("transform", function(d) {
return "translate(" + label.centroid(d) + ")";
})
.text(function(d) { return d.data.Expenses; });
});
svg.append("g")
.attr("transform", "translate(" + 425 + "," + 50 + ")")
.append("text")
.text("Expense of year 2022")
.attr("class", "title");
</script>
</body>
</html>
8. csv :
Output :
PRACTICAL – 9
Aim : Write and execute program to create Scale Chart using D3.JS.
Code:
<html>
<head>
<title>Scale Bar Chart</title>
<script src="https://d3js.org/d3.v4.min.js"></script>
<style>
rect {
fill:darkcyan;
}
text {
font-size: 30px;
text-anchor: start;
}
</style>
</head>
<body>
<script>
var data = [70, 25, 10, 66, 27, 46];
var width = 800,
barHeight = 50,
margin = 10;
var scale = d3.scaleLinear()
.domain([d3.min(data), d3.max(data)])
.range([50, 500]);
var svg = d3.select("body")
.append("svg")
.attr("width", width)
.attr("height", barHeight * data.length);
var g = svg.selectAll("g")
.data(data)
.enter()
.append("g")
.attr("transform", function (d, i) {
return "translate(0," + i * barHeight + ")";
});
g.append("rect")
.attr("width", function (d) {
return scale(d);
})
.attr("height", barHeight - margin)
g.append("text")
.attr("x", function (d) { return (scale(d)); })
.attr("y", barHeight / 2)
.attr("dy", ".35em")
.text(function (d) { return d; });
</script>
</body>
</html>
Output :
PRACTICAL – 10
Aim : Write & execute program to create Animated Bar Chart using D3.JS
Code:
<html>
<head>
<title>Animated Bar Chart</title>
<style>
.bar {
fill: black;
}
.highlight {
fill: skyblue;
}
</style>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<svg width="600" height="500"></svg>
<script>
var svg = d3.select("svg"),
margin = 200,
width = svg.attr("width") - margin,
height = svg.attr("height") - margin;
svg.append("text")
.attr("transform", "translate(100,0)")
.attr("x", 140)
.attr("y", 50)
.attr("font-size", "24px")
.text("Stock Price")
var x = d3.scaleBand().range([0, width]).padding(0.4),
y = d3.scaleLinear().range([height, 0]);
var g = svg.append("g")
.attr("transform", "translate(" + 100 + "," + 100 + ")");
d3.csv("XYZ.csv", function(error, data) {
if (error) {
throw error;
}
x.domain(data.map(function(d) { return d.year; }));
y.domain([0, d3.max(data, function(d) { return d.value; })]);
g.append("g")
.attr("transform", "translate(0," + height + ")")
.call(d3.axisBottom(x))
.append("text")
.attr("y", height - 230)
.attr("x", width - 170)
.attr("text-anchor", "end")
.attr("stroke", "black")
.attr("font-size", "24px")
.text("Year");
g.append("g")
.call(d3.axisLeft(y).tickFormat(function(d){
return d;
}).ticks(10))
.append("text")
.attr("transform", "rotate(-90)")
.attr("x",-60)
.attr("y", 6)
.attr("dy", "-2.1em")
.attr("text-anchor", "end")
.attr("stroke", "black")
.attr("font-size", "24px")
.text("Stock Price");
g.selectAll(".bar")
.data(data)
.enter().append("rect")
.attr("class", "bar")
.on("mouseover", onMouseOver) //Add listener for the mouseover event
.on("mouseout", onMouseOut) //Add listener for the mouseout event
.attr("x", function(d) { return x(d.year); })
.attr("y", function(d) { return y(d.value); })
.attr("width", x.bandwidth())
.transition()
.ease(d3.easeLinear)
.duration(400)
.delay(function (d, i) {
return i * 50;
})
.attr("height", function(d) { return height - y(d.value); });
});
//mouseover event handler function
function onMouseOver(d, i) {
d3.select(this).attr('class', 'highlight');
d3.select(this)
.transition() // adds animation
.duration(400)
.attr('width', x.bandwidth() + 5)
.attr("y", function(d) { return y(d.value) - 10; })
.attr("height", function(d) { return height - y(d.value) + 10; });
g.append("text")
.attr('class', 'val')
.attr('x', function() {
return x(d.year);
})
.attr('y', function() {
return y(d.value) - 15;
})
.text(function() {
return [d.value]; // Value of the text
});
}
//mouseout event handler function
function onMouseOut(d, i) {
// use the text label class to remove label on mouseout
d3.select(this).attr('class', 'bar');
d3.select(this)
.transition() // adds animation
.duration(400)
.attr('width', x.bandwidth())
.attr("y", function(d) { return y(d.value); })
.attr("height", function(d) { return height - y(d.value); });
d3.selectAll('.val')
.remove()
}
</script>
</body>
</html>
Output :
PRACTICAL – 11
Aim : Write and execute program to create maps and scatter plot using D3.JS
Code:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>D3.js Map and Scatter Plot Example</title>
<script src="https://d3js.org/d3.v7.min.js"></script>
<style>
/* Add some basic styling */
.scatter {
fill: steelblue;
}
.scatter:hover {
fill: orange;
}
</style>
</head>
<body>
<!-- Create a container for the map and scatter plot -->
<div id="container" style="width: 800px; height: 600px; margin: auto;"></div>
<script>
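// Define some sample data for scatter plot
const scatterData = [
{ x: 100, y: 200 },
{ x: 300, y: 100 },
{ x: 500, y: 300 },
{ x: 700, y: 400 },
{ x: 600, y: 200 }
];
// Assumed: the original listing never creates `svg`, so it is created here
// inside the #container element.
const svg = d3.select("#container")
.append("svg")
.attr("width", 800)
.attr("height", 600);
// Add map (a plain rectangle stands in for the map area)
svg.append("rect")
.attr("width", 800)
.attr("height", 600)
.attr("fill", "lightgray");
// Assumed completion: the original listing is cut off here. The circles
// below plot scatterData using the .scatter style defined above.
svg.selectAll("circle")
.data(scatterData)
.enter().append("circle")
.attr("class", "scatter")
.attr("cx", d => d.x)
.attr("cy", d => d.y)
.attr("r", 8);
</script>
</body>
</html>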
Output :
PRACTICAL – 12
D3 allows us to manipulate DOM elements in the HTML document and for that we first need to
select a particular element, or a group of elements and then manipulate those elements using
various D3 methods.
Code:
<!doctype html>
<html>
<head>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<p>First paragraph</p>
<p>Second paragraph</p>
<script>
d3.select("p").style("color", "green");
</script>
</body>
</html>
Output:
2) Method Chaining in D3
D3 uses a convention called method chaining. Basically what it means is that when you call a
method on an object, it performs the method and returns an object. Since the method on an object
returns an object, another method can be called without having to explicitly reference an object
again.
Code:
d3.select("body")
.append("p")
.text("Third paragraph");
Output:
Third paragraph
3) Function of Data
D3 provides different DOM manipulation methods such as append(), style(), text() etc. Each of
these methods can take in a constant value or a function as a parameter. This function is a
function of data, so it is called once for each of our data values bound to the DOM. Consider
the following style() function.
Code:
<!doctype html>
<html>
<head>
<script src="https://d3js.org/d3.v4.min.js"></script>
</head>
<body>
<script>
d3.selectAll("p").style("color", function(d, i) {
var text = this.innerText;
if (text.indexOf("Error") >= 0) {
return "red";
} else if (text.indexOf("Warning") >= 0) {
return "yellow";
}
});
</script>
</body>
</html>
Output:
Dashboard
The Excel Dashboard is used to display overviews of large data sets. Excel Dashboards use dashboard
elements like tables, charts, and gauges to show the overviews. The dashboards ease the
decision-making process by showing the vital parts of the data in the same window.
Your manager needs the following information: total sales for each year, delivery cost for each
manufacturer, labor cost for each state, total amount of sales and sales matrix. Besides, it would be
interesting if the CEO could visualize the total amount of sales by state and if the sales are above or
below average.
(car Sale)
Dashboard
Pivot table
It made no sense to me until I found that the Year (Ano) column was being summed. That made it clear
that the Year values had been recognized as integers, so I had to transform the data, turning
those integers into text, and ended up with the right chart that, after a bit of formatting, looked like
this:
Job done! Next up is delivery cost by manufacturer. My first thought was to make another column chart
or at least a bar chart, but the course’s teacher said, and I quote, “You should not use the same chart
twice on a Dashboard to deliver different information”. Their advice was to maybe use a pie/donut
chart, but I, personally, didn’t think that would make much sense. My last take on this was to go back to
my first idea and make it into a column chart, but turn my Yearly Revenue chart from column to line. In
the end, I had these:
Although out of order, the next question I wanted to solve was total amount of sales by state and if the
sales are above or below average. To do that, I first thought of using a pie chart, which would in theory
be the best choice. But, as one can see here
two of the 6 states (mine included) were not showing due to a huge gap in revenue. I had been keeping
the map chart I wanted so badly to use for the labor cost by state task, but, in the end, had to use it here
and ended up with
Where the specific value of revenue by state can be seen by hovering the mouse over each state on the
map (not here, on the dashboard). There isn’t an explicit sign of the average revenue, but it is very
clear where the line between the most profitable and least profitable states is drawn through the
coloring.
Following up we have labor costs for each state, for which I had no better idea (like, at all) than to use
a bar chart. It is very similar to the column chart, so I didn’t want to use both in the same
dashboard, following what I’d learned in the course. But, in the end, there was really no other way. This
is what I ended up with:
The "car Sales Project.xlsx" Excel file likely serves the purpose of analyzing and presenting data related
to car sales. Here's a breakdown of its components and potential purpose:
Data Source: The Excel file likely contains multiple sheets or tabs, with one tab dedicated to raw data.
This raw data would include information such as sales figures, dates, types of cars sold, customer
demographics, etc.
Pivot Tables: Pivot tables are likely used to summarize and analyze the raw data. They can provide
insights into various aspects of car sales, such as total revenue, sales trends over time, best-selling car
models, regional sales performance, etc. Pivot tables can be dynamically adjusted to explore different
dimensions of the data.
Dashboard: The dashboard is a visually appealing summary of key metrics and insights derived from
the pivot tables. It may include charts, graphs, and summary tables to present the most relevant
information at a glance. The purpose of the dashboard is to provide a quick overview of car sales
performance and trends, allowing stakeholders to make informed decisions.
Purpose: The primary purpose of this project is likely to analyze car sales data to gain insights into
sales performance, identify trends, and make data-driven decisions. This could include optimizing
inventory management, targeting marketing efforts, identifying opportunities for expansion, and
improving overall business efficiency.
BE-III (Semester VI) CSE/IT 210450131039