IML LabManual (1)
Technology
Lab Manual
D.I Semester: - V
Sr. No.   Practical Aim   Date   Sign   Page No.
PRACTICAL-1
AIM: Explore any one machine learning tool such as Jupyter Notebook.
DATE:
The Jupyter Notebook is an open-source web application that allows developers to create and share
documents that combine narrative text, live code, visualizations, and equations. It is commonly
used for data cleaning and transformation, data visualization, numerical simulation, statistical
modeling, and machine learning (ML).
AIET(455) Page 1
Enrollment No.224550307069 Introduction to Machine Learning (4350702)
PRACTICAL-2
In Python, the simplest way to convert a list to a NumPy array is the numpy.array() function.
It takes a list (or another array-like object) as its argument and returns a new NumPy array,
copying the data into new memory.
# importing library
import numpy
# initializing list
lst = [1, 7, 0, 6, 2, 5, 6]
# converting list to array
arr = numpy.array(lst)
# displaying list
print("List:", lst)
# displaying array
print("Array:", arr)
Output:
List: [1, 7, 0, 6, 2, 5, 6]
Array: [1 7 0 6 2 5 6]
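The copy behaviour of numpy.array() can be checked with a minimal sketch. The variable names are illustrative; the point is that changing the list after the conversion does not change the array.

```python
import numpy as np

lst = [1, 7, 0, 6, 2, 5, 6]
arr = np.array(lst)      # builds a new array; the list's values are copied

lst[0] = 99              # changing the list afterwards ...
print(arr[0])            # ... leaves the array untouched: prints 1
print(type(arr).__name__, arr.shape)   # prints: ndarray (7,)
```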
import numpy as np
import pandas as pd
x = np.arange(2, 11).reshape(3,3)
print(x)
Output:
[[ 2  3  4]
 [ 5  6  7]
 [ 8  9 10]]
PRACTICAL-3
1. To split an array of 14 elements into 3 arrays with 2, 4, and 8 elements respectively, keeping
the original order
import numpy as np
import pandas as pd
x = np.arange(1, 15)
print("Original array:",x)
print("After splitting:")
print(np.split(x, [2, 6]))
Explanation:
x = np.arange(1, 15): This line creates a 1D array x containing the integers 1 to 14.
print(np.split(x, [2, 6])): The np.split() function is used to split the array x into multiple
subarrays. The split indices are provided as a list [2, 6]. This means that the array x will be split
into three subarrays: from the beginning to index 2 (exclusive), from index 2 to index 6
(exclusive), and from index 6 to the end.
OUTPUT:
Original array: [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14]
After splitting:
[array([1, 2]), array([3, 4, 5, 6]), array([ 7,  8,  9, 10, 11, 12, 13, 14])]
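The split indices [2, 6] are the running totals of the desired sizes 2, 4, and 8 without the last total. As a small sketch (the names sizes and indices are illustrative), they can be derived with np.cumsum() instead of being worked out by hand:

```python
import numpy as np

x = np.arange(1, 15)
sizes = [2, 4, 8]                      # desired length of each piece
indices = np.cumsum(sizes)[:-1]        # running totals without the last one -> [2, 6]
parts = np.split(x, indices)
print([len(p) for p in parts])         # [2, 4, 8]
```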
stack() is used for joining multiple NumPy arrays. Unlike concatenate(), it joins arrays along a
new axis. It returns a NumPy array.
To join 2 arrays, they must have the same shape (e.g. both (2,3) -> 2 rows, 3 columns).
stack() creates a new array which has 1 more dimension than the input arrays. If we stack two 1-D
arrays, the resultant array will have 2 dimensions.
For 1-D input arrays there are 2 possible axis options: 0 and 1. axis=0 stacks the input arrays
row-wise (each input becomes a row). axis=1 stacks them column-wise (each input becomes a column).
import numpy as np
# input array
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.stack((a, b),axis=0)
print(c)
Output:
[[1 2 3]
 [4 5 6]]
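The program above uses axis=0. For comparison, a short sketch of axis=1 and of concatenate() on the same two arrays follows. The names col and flat are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

col = np.stack((a, b), axis=1)    # each input becomes a column
flat = np.concatenate((a, b))     # joins along the existing axis, no new dimension

print(col.shape)    # (3, 2)
print(col)
print(flat.shape)   # (6,)
```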
PRACTICAL-4
Element-wise arithmetic operations are possible only if the arrays have the same shape, or shapes
that can be broadcast together. The operations follow the rules of array manipulation. NumPy
provides both functions and operators to perform them.
To perform each operation, we can either use the associated operator or built-in functions. For
example, to perform addition, we can either use the + operator or the add() built-in function.
import numpy as np
import pandas as pd
print("Add:")
print(np.add(1.0, 4.0))
print("Subtract:")
print(np.subtract(1.0, 4.0))
print("Multiply:")
print(np.multiply(1.0, 4.0))
print("Divide:")
print(np.divide(1.0, 4.0))
OUTPUT:
Add:
5.0
Subtract:
-3.0
Multiply:
4.0
Divide:
0.25
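The program above works on scalars. A minimal sketch of the same idea on arrays is given here, showing that the operator and the function agree and that incompatible shapes are rejected. The array values are chosen only for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# the operator and the function give the same element-wise result
same = np.array_equal(a + b, np.add(a, b))
print(same)          # True

# a scalar is broadcast against every element of the array
scaled = a * 2
print(scaled)

# shapes that cannot be broadcast together raise a ValueError
try:
    a + np.array([1.0, 2.0])
    failed = False
except ValueError:
    failed = True
print(failed)        # True
```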
# The numpy.rint() function rounds each element of an array to the nearest integer.
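import numpy as np
import pandas as pd
# sample values assumed for illustration; the original array is not shown in the source
x = np.array([-0.7, -1.5, -1.7, 0.3, 1.5, 1.8, 2.0])
print("Original array:")
print(x)
x = np.rint(x)
print(x)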
OUTPUT
Original array:
PRACTICAL-5
import numpy as np
import pandas as pd
a = np.arange(4).reshape((2,2))
print(a)
print(np.amax(a))
print(np.amin(a))
OUTPUT:
[[0 1]
 [2 3]]
3
0
2. To compute the mean, standard deviation, and variance of a given array along the second
axis
In NumPy, we can compute the mean, standard deviation, and variance of a given array along the
second axis in two ways: with the built-in functions, or directly from the formulas of the mean,
standard deviation, and variance.
import numpy as np
import pandas as pd
x = np.arange(6)
print("Original array:")
print(x)
# mean: np.mean (r1) and np.average (r2) give the same value
r1 = np.mean(x)
r2 = np.average(x)
print("Mean:", r1)
# standard deviation: built-in function (r1) and formula (r2)
r1 = np.std(x)
r2 = np.sqrt(np.mean((x - np.mean(x)) ** 2))
print("std:", r1)
# variance: built-in function (r1) and formula (r2)
r1 = np.var(x)
r2 = np.mean((x - np.mean(x)) ** 2)
print("variance:", r1)
OUTPUT
Original array:
[0 1 2 3 4 5]
Mean: 2.5
std: 1.707825127659933
variance: 2.9166666666666665
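The program above uses a one-dimensional array, which has only one axis. For a two-dimensional array, "along the second axis" corresponds to axis=1, i.e. one result per row. A minimal sketch, assuming a 2x3 array chosen for illustration:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

mean_rows = np.mean(x, axis=1)   # one value per row
std_rows = np.std(x, axis=1)
var_rows = np.var(x, axis=1)

# the same variance from the definition, keeping the row axis for broadcasting
var_manual = np.mean((x - x.mean(axis=1, keepdims=True)) ** 2, axis=1)

print(mean_rows)                                   # [1. 4.]
print(np.allclose(var_rows, var_manual))           # True
print(np.allclose(std_rows, np.sqrt(var_rows)))    # True
```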
PRACTICAL-6
Assume we have a one-dimensional NumPy array with a few values; the output shows the pandas
Series object created from that array. To convert a NumPy array to a pandas Series, we can use
the pandas.Series() constructor.
import numpy as np
import pandas as pd
# array values taken from the output shown below
np_array = np.array([10, 20, 30, 40, 50])
print("NumPy array:")
print(np_array)
new_series = pd.Series(np_array)
print(new_series)
OUTPUT
NumPy array:
[10 20 30 40 50]
0    10
1    20
2    30
3    40
4    50
dtype: int64
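By default the Series gets the integer index 0 to 4 shown in the output. As a small sketch, the index and name parameters of pandas.Series() can set custom labels, and to_numpy() converts back. The labels and the name "values" are hypothetical.

```python
import numpy as np
import pandas as pd

np_array = np.array([10, 20, 30, 40, 50])
labels = ["a", "b", "c", "d", "e"]            # hypothetical index labels

s = pd.Series(np_array, index=labels, name="values")
print(s["c"])              # 30
back = s.to_numpy()        # convert the Series back to a NumPy array
print(np.array_equal(back, np_array))   # True
```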
In pandas, a column of a DataFrame can be converted to a Series. This is sometimes needed when a
single column has to be analyzed separately from the rest of the data set.
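A minimal sketch of the conversion, assuming a small DataFrame with hypothetical column names and values (not the data used in this practical): selecting one column with square brackets returns a Series.

```python
import pandas as pd

# hypothetical data, used only for illustration
data = {"name": ["A", "B", "C"], "marks": [71, 85, 64]}
df = pd.DataFrame(data)

marks = df["marks"]                 # selecting one column returns a Series
print(type(marks).__name__)         # Series
print(marks.mean())

first_row = df.iloc[0]              # a single row is also returned as a Series
print(first_row["name"])            # A
```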
import pandas as pd
# Creating a dictionary
df = pd.DataFrame(data=dit)
# Original DataFrame
df
OUTPUT
PRACTICAL-7
A Python dict (dictionary), which stores key-value pairs, can be used to create a pandas
DataFrame. In practice, a DataFrame is mostly created by reading a CSV file or another data
source; however, sometimes you may need to create it from a dict object.
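A minimal sketch of the idea, with hypothetical names and values: each dictionary key becomes a column name and each list becomes that column's values.

```python
import pandas as pd

# hypothetical records; keys become column names, lists become column values
details = {
    "Name": ["Asha", "Ravi", "Meera"],
    "Age": [21, 22, 20],
}
df = pd.DataFrame(details)
print(df.shape)               # (3, 2)
print(list(df.columns))       # ['Name', 'Age']

# optional row labels can be passed through the index parameter
df_labeled = pd.DataFrame(details, index=["s1", "s2", "s3"])
print(df_labeled.loc["s2", "Age"])   # 22
```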
import pandas as pd
details = {
df = pd.DataFrame(details)
df
OUTPUT
import pandas as pd
'Population', 'Continent'])
df
OUTPUT
PRACTICAL-8
AIM: Write a Pandas program to create a line plot of the opening and closing stock prices of a
given company between two specific dates.
DATE:
A line plot is a graph that shows the relationship between variables, or changes in data over
time, using points ordered by their x-axis value and connected by straight line segments. The
independent variable is represented on the x-axis, while the y-axis represents the dependent
variable, i.e. the data that changes with the x-axis variable.
To generate a line plot with pandas, we typically create a DataFrame with the dataset to be
plotted. Then, the plot.line() method is called on the DataFrame.
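A self-contained sketch of plot.line(), using a few synthetic prices invented for illustration in place of a CSV file. The column names "Open" and "Close" are assumptions, and the Agg backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")            # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# synthetic prices, invented for illustration only
df = pd.DataFrame(
    {
        "Date": pd.date_range("2020-04-01", periods=5, freq="D"),
        "Open": [100.0, 101.5, 99.8, 102.3, 103.1],
        "Close": [101.2, 100.4, 101.9, 103.0, 102.6],
    }
).set_index("Date")

# one line per selected column, with the dates on the x-axis
ax = df.plot.line(y=["Open", "Close"], figsize=(6, 4))
ax.set_xlabel("Date")
ax.set_ylabel("Price")
n_lines = len(ax.get_lines())
print(n_lines)                   # 2
plt.close("all")                 # in a notebook, call plt.show() instead
```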
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-4-30')
df['Date'] = pd.to_datetime(df['Date'])
# boolean mask selecting the rows between the two dates (inclusive)
new_df = (df['Date'] >= start_date) & (df['Date'] <= end_date)
df1 = df.loc[new_df]
df2 = df1.set_index('Date')
plt.figure(figsize=(6,6))
# line plot of opening and closing prices, as stated in the aim
# (assumes the CSV has 'Open' and 'Close' columns)
df2[['Open', 'Close']].plot(ax=plt.gca())
plt.xlabel("Date",fontsize=12, color='black')
plt.show()
OUTPUT
PRACTICAL-9
AIM: Write a Pandas program to create a plot of Open, High, Low, Close, Adjusted Closing
prices and Volume of given company between two specific dates.
DATE:
Time series data is a sequence of data points in chronological order that is used by businesses to
analyze past data and make future predictions. These data points are a set of observations taken
at specified times, usually at equal intervals, typically with a datetime index and a
corresponding value. Daily stock prices, as used in this practical, are a common example of time
series data.
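Selecting the observations between two dates is the core step of this practical. A minimal sketch on synthetic data (the dates and values are invented for illustration) shows two equivalent ways to do it with a datetime index:

```python
import pandas as pd

# synthetic daily observations, invented for illustration
dates = pd.date_range("2020-04-01", periods=10, freq="D")
ts = pd.Series(range(10), index=dates)

# label-based slicing on a DatetimeIndex includes both end points
window = ts.loc["2020-04-03":"2020-04-05"]
print(len(window))               # 3

# the same selection written as a boolean mask
start = pd.Timestamp("2020-04-03")
end = pd.Timestamp("2020-04-05")
mask = (ts.index >= start) & (ts.index <= end)
print(ts[mask].equals(window))   # True
```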
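import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
# boolean mask selecting the rows between the two dates (inclusive)
new_df = (df['Date'] >= start_date) & (df['Date'] <= end_date)
df1 = df.loc[new_df]
stock_data = df1.set_index('Date')
# the plotting call is missing in the source; one subplot per column is assumed here
stock_data.plot(subplots=True, figsize=(8, 8))
plt.legend(loc = 'best')
plt.show()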
OUTPUT
PRACTICAL-10
1. To find and drop the missing values from the given dataset.
In order to drop null values from a DataFrame, we use the dropna() function. Depending on its
parameters (such as axis, how, thresh, and subset), it drops rows or columns that contain null
values.
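A minimal sketch of finding and dropping missing values, assuming a small DataFrame with hypothetical column names and scores (not the dataset used in this practical):

```python
import numpy as np
import pandas as pd

# hypothetical scores with some missing entries
df = pd.DataFrame(
    {
        "first": [100, 90, np.nan, 95],
        "second": [30, 45, 56, np.nan],
        "third": [np.nan, 40, 80, 98],
    }
)

missing = df.isnull().sum()       # number of missing values per column
print(missing)
rows_kept = df.dropna()           # drop every row that contains a NaN
cols_kept = df.dropna(axis=1)     # drop every column that contains a NaN
print(rows_kept.shape)            # (1, 3)
print(cols_kept.shape)            # (4, 0)
```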
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists
df = pd.DataFrame(dict)
df.dropna()
OUTPUT
The drop_duplicates() method removes duplicate rows. Use the subset parameter if only some
specified columns should be considered when looking for duplicates.
import pandas as pd
w_a_con = pd.read_csv('world_alcohol.csv')
print(w_a_con.head())
print(w_a_con.drop_duplicates('WHO region'))
OUTPUT
[5 rows x 5 columns]
[6 rows x 5 columns]
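The effect of the subset parameter can also be seen without the CSV file. The sketch below uses a few hypothetical rows (the column names and values are invented) and compares the default behaviour with subset and keep:

```python
import pandas as pd

# hypothetical rows standing in for the CSV file
df = pd.DataFrame(
    {
        "region": ["Africa", "Europe", "Africa", "Europe", "Asia"],
        "country": ["Kenya", "Spain", "Kenya", "Italy", "India"],
    }
)

full = df.drop_duplicates()                       # only fully identical rows are removed
by_region = df.drop_duplicates(subset="region")   # first row of each region is kept
last = df.drop_duplicates(subset="region", keep="last")

print(len(full))                  # 4
print(len(by_region))             # 3
print(last["country"].tolist())   # ['Kenya', 'Italy', 'India']
```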