
MACHINE LEARNING
LAB MANUAL (R22)
B.Tech III Year – II Semester
ACADEMIC YEAR (2024-2025)
Department of AI&DS
VISION
To empower female students with professional education using creative & innovative technical
practices of global competence and research aptitude to become competitive engineers with ethical
values and entrepreneurial skills.

MISSION
To impart value-based professional education through a creative and innovative teaching-learning
process to face the global challenges of the new era of technology.

To inculcate research aptitude and bring out creativity in students by imparting engineering
knowledge and interpersonal skills to promote innovation, research and entrepreneurship.

Department Vision & Mission

Vision:
To be a leading department of Artificial Intelligence and Data Science that provides
cutting-edge education, research, and innovation in the field, and prepares graduates
to become globally competitive professionals, researchers, and entrepreneurs.

Mission:

DM1: Providing comprehensive education and training in the principles, tools, and
applications of Artificial Intelligence and Data Science, to prepare graduates for a
wide range of careers and research opportunities.

DM2: Conducting cutting-edge research in the field of Artificial Intelligence and
Data Science, including the development of new algorithms, models, and platforms
for data analysis, machine learning, and deep learning.

DM3: Fostering collaborations and partnerships with industry, government, and
academia to promote the transfer of technology, innovation, and entrepreneurship.

Program Educational Objectives :

PEO1: To demonstrate technical excellence in Artificial Intelligence and Data
Science by applying their knowledge and skills to develop and implement advanced
algorithms, models, and platforms for data analysis, machine learning, and deep
learning.
PEO2: To engage in continuous learning and professional development to stay abreast
of the latest developments in Artificial Intelligence and Data Science and related
fields, and to advance their careers.

PEO3: To exhibit professionalism and leadership skills by upholding high ethical
standards and contributing to the development and dissemination of best practices in
Artificial Intelligence and Data Science.

Program Specific Outcomes :

PSO1: Ability to develop and implement data-driven models, algorithms, and
visualization techniques using state-of-the-art tools and platforms for Artificial
Intelligence and Data Science to solve real-world problems in diverse domains.

PSO2: Ability to apply advanced techniques such as machine learning, deep learning, and
natural language processing to design and develop intelligent systems that can learn
from data and adapt to changing environments.

PSO3: Collaborate effectively in interdisciplinary teams to design, develop, and
deploy Artificial Intelligence and Data Science solutions that meet business, social,
and scientific requirements. Conduct independent research in Artificial Intelligence
and Data Science, and contribute to the development of new techniques, algorithms,
and systems for the field.

Program Outcomes :

PO1 : Engineering Knowledge: Apply the knowledge of mathematics, science,
engineering fundamentals and an engineering specialization to the solution of
complex engineering problems.

PO2 : Problem Analysis: Identify, formulate, review research literature, and analyze
complex engineering problems reaching substantiated conclusions using first
principles of mathematics, natural sciences and Engineering sciences.

PO3 : Design/Development of Solutions: Design solutions for complex engineering
problems and design system components or processes that meet the specified needs
with appropriate consideration for public health and safety, and the cultural, societal,
and environmental considerations.

PO4 : Conduct Investigations of Complex Problems: Use research-based
knowledge and research methods including design of experiments, analysis and
interpretation of data, and synthesis of the information to provide valid conclusions.

PO5 : Modern Tool Usage: Create, select and apply appropriate techniques,
resources and modern engineering and IT tools including prediction and modeling to
complex engineering activities with an understanding of the limitations.
PO6 : The Engineer and Society: Apply reasoning informed by the contextual
knowledge to assess societal, health, safety, legal and cultural issues and the
consequent responsibilities relevant to the professional engineering practice.

PO7 : Environment and Sustainability: Understand the impact of the professional
engineering solutions in societal and environmental contexts and demonstrate the
knowledge of, and need for, sustainable development.

PO8 : Ethics: Apply ethical principles and commit to professional ethics and
responsibilities and norms of the engineering practice.

PO9 : Individual and Team Work: Function effectively as an individual and as a
member or leader in diverse teams and in multidisciplinary settings.

PO10 : Communication: Communicate effectively on complex engineering
activities with the engineering community and with society at large, such as being
able to comprehend and write effective reports and design documentation, make
effective presentations and give and receive clear instructions.

PO11 : Project Management and Finance: Demonstrate knowledge and
understanding of the engineering management principles and apply these to one's
own work, as a member and leader in a team to manage projects and in
multidisciplinary environments.

PO12 : Life-Long Learning: Recognize the need for and have the preparation and
ability to engage in independent and lifelong learning in the broadest context of
technological change.

Course Structure

Course Title: ML LAB
Course Code:
Programme: B.Tech III-II
Course Structure: Practical
L T P Credits
0 0 3 1.5
COURSE OBJECTIVES
S. No. Course Objective
1. The objective of this lab is to give an overview of the various machine learning techniques and to demonstrate them using Python.

CO. No. Course Outcomes (BTL)
CO1: Understand modern notions in predictive data analysis.
CO2: Select data and models, manage model complexity, and identify trends.
CO3: Understand a range of machine learning algorithms along with their strengths and weaknesses.
CO4: Build predictive models from data and analyze their performance.

ML LAB – Credits: 1.5

Instructions: 3 practicals/week        Sessional Marks: 40
End Exam: 3 hours                      End Exam Marks: 60

SCHEME OF EVALUATION
Total marks for each student to be evaluated in lab: 100 marks

Out of 100 marks:
A – Regularity   B – Record submission in time   C – Viva-voce   D – Experimentation

Exp No | Experiment Name | Date | A (2) | B (3) | C (4) | D (6) | TOTAL (T = A + B + C + D) | Faculty Sign
List of Experiments

1. Write a python program to compute Central Tendency Measures: Mean, Median, Mode; Measures of Dispersion: Variance, Standard Deviation
2. Study of Python Basic Libraries such as Statistics, Math, Numpy and Scipy
3. Study of Python Libraries for ML application such as Pandas and Matplotlib
4. Write a Python program to implement Simple Linear Regression
5. Implementation of Multiple Linear Regression for House Price Prediction using sklearn
6. Implementation of Decision tree using sklearn and its parameter tuning
7. Implementation of KNN using sklearn
8. Implementation of Logistic Regression using sklearn
9. Implementation of K-Means Clustering
10. Performance analysis of Classification Algorithms on a specific dataset (Mini Project)
1. Central Tendency Measures and Measures of Dispersion

Code:

import statistics as stats
import numpy as np

data = [12, 15, 14, 10, 8, 15, 16, 21, 18, 18]

# Central Tendency
mean = np.mean(data)
median = np.median(data)
mode = stats.mode(data)

# Measures of Dispersion
variance = np.var(data, ddof=1) # Sample variance
std_deviation = np.std(data, ddof=1) # Sample standard deviation

print(f"Mean: {mean}, Median: {median}, Mode: {mode}")


print(f"Variance: {variance}, Standard Deviation: {std_deviation}")

Output:

Mean: 14.7, Median: 15.0, Mode: 15
Variance: 14.78888888888889, Standard Deviation: 3.846994255448134
2. Aim: Study of Python Basic Libraries such as Statistics, Math, Numpy and Scipy

i) Statistics

Python’s statistics is a built-in Python library for descriptive statistics. You can use it if your
datasets are not too large or if you can’t rely on importing other libraries. NumPy is a third-
party library for numerical computing, optimized for working with single- and multi-
dimensional arrays.

Understanding Descriptive Statistics

Descriptive statistics is about describing and summarizing data. It uses two main approaches:

 The quantitative approach describes and summarizes data numerically.
 The visual approach illustrates data with charts, plots, histograms, and other graphs.
You can apply descriptive statistics to one or many datasets or variables. When you describe
and summarize a single variable, you’re performing univariate analysis. When you search for
statistical relationships among a pair of variables, you’re doing a bivariate analysis. Similarly,
a multivariate analysis is concerned with multiple variables at once.

There are many Python statistics libraries out there for you to work with, but in this tutorial,
you’ll be learning about some of the most popular and widely used ones:

 Python’s statistics is a built-in Python library for descriptive statistics. You can use it if your
datasets are not too large or if you can’t rely on importing other libraries.
 NumPy is a third-party library for numerical computing, optimized for working with
single- and multi-dimensional arrays. Its primary type is the array type called ndarray. This
library contains many routines for statistical analysis.
 SciPy is a third-party library for scientific computing based on NumPy. It offers
additional functionality compared to NumPy, including scipy.stats for statistical analysis.
 pandas is a third-party library for numerical computing based on NumPy. It excels in handling
labeled one-dimensional (1D) data with Series objects and two-dimensional (2D) data
with DataFrame objects.
 Matplotlib is a third-party library for data visualization. It works well in combination
with NumPy, SciPy, and pandas.
Note that, in many cases, Series and DataFrame objects can be used in place of NumPy arrays.
Often, you might just pass them to a NumPy or SciPy statistical function. In addition, you can
get the unlabeled data from a Series or DataFrame as a np.ndarray object by calling .values or
.to_numpy().
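
As a quick illustration, this short sketch (with made-up sample data) computes the same statistic with both statistics and NumPy, then passes a pandas Series directly to a NumPy function and converts it with .to_numpy():

import statistics
import numpy as np
import pandas as pd

data = [20.0, 21.5, 19.0, 22.5, 20.5]  # made-up sample values

print(statistics.mean(data))   # built-in statistics module
print(np.mean(data))           # NumPy accepts plain lists too

s = pd.Series(data)
print(np.median(s))            # a Series can be passed to a NumPy function
print(s.to_numpy())            # unlabeled data as an np.ndarray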

Getting Started With Python Statistics Libraries

The built-in Python statistics library has a relatively small number of the most important
statistics functions. The official documentation is a valuable resource to find the details. If
you’re limited to pure Python, then the Python statistics library might be the right choice.

A good place to start learning about NumPy is the official User Guide, especially the quickstart
and basics sections. The official reference can help you refresh your memory on specific
NumPy concepts. While you read this tutorial, you might want to check out the statistics
section and the official scipy.stats reference as well.

ii) Math

To carry out calculations with real numbers, the Python language contains many additional
functions collected in a library (module) called math.
To use these functions, at the beginning of the program you need to import the math library,
which is done by the command

import math

Python provides various operators for performing basic calculations, such as * for
multiplication, % for modulo, and / for division. If you are developing a program in
Python to perform certain tasks, you may need to work with trigonometric functions, as well as
complex numbers. Although you cannot use these functions directly, you can access them by
importing the math module, which gives access to hyperbolic, trigonometric and
logarithmic functions for real numbers. To work with complex numbers, you can use the cmath
module. When comparing math vs numpy, the math library is more lightweight and can be
used for extensive computation as well.

The Python Math Library is the foundation for the rest of the math libraries that are written on
top of its functionality and functions defined by the C standard. Please refer to the python
math examples for more information.

Number-theoretic and representation functions

This part of the mathematical library is designed to work with numbers and their
representations. It allows you to effectively carry out the necessary transformations with
support for NaN (not a number) and infinity and is one of the most important sections of the
Python math library. Below is a short list of these functions in Python 3. A more detailed
description can be found in the documentation for the math library.

math.ceil(x) – return the ceiling of x, the smallest integer greater than or equal to x

math.comb(n, k) – return the number of ways to choose k items from n items without
repetition and without order

math.copysign(x, y) – return a float with the magnitude (absolute value) of x but the sign of
y. On platforms that support signed zeros, copysign(1.0, -0.0) returns -1.0
math.fabs(x) – return the absolute value of x

math.factorial(x) – return x factorial as an integer. Raises ValueError if x is not integral or is
negative

math.floor(x) – return the floor of x, the largest integer less than or equal to x

math.fmod(x, y) – return fmod(x, y), as defined by the platform C library

math.frexp(x) – return the mantissa and exponent of x as the pair (m, e). m is a float and e is
an integer such that x == m * 2**e exactly

math.fsum(iterable) – return an accurate floating-point sum of values in the iterable

math.gcd(a, b) – return the greatest common divisor of the integers a and b

math.isclose(a, b, *, rel_tol=1e-09, abs_tol=0.0) – return True if the values a and b are close
to each other and False otherwise

math.isfinite(x) – return True if x is neither infinity nor a NaN, and False otherwise (note that
0.0 is considered finite)

math.isinf(x) – return True if x is positive or negative infinity, and False otherwise


math.isnan(x) – return True if x is a NaN (not a number), and False otherwise

math.isqrt(n) – return the integer square root of the nonnegative integer n. This is the floor of
the exact square root of n, or equivalently the greatest integer a such that a² ≤ n

math.ldexp(x, i) – return x * (2**i). This is essentially the inverse of function frexp()

math.modf(x) – return the fractional and integer parts of x. Both results carry the sign of x
and are floats

math.perm(n, k=None) – return the number of ways to choose k items from n items without
repetition and with order

math.prod(iterable, *, start=1) – calculate the product of all the elements in the input
iterable. The default start value for the product is 1

math.remainder(x, y) – return the IEEE 754-style remainder of x with respect to y

math.trunc(x) – return the Real value x truncated to an Integral (usually an integer)
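
A short sketch of a few of these functions in action (math.comb and math.perm require Python 3.8+); the printed values follow from the definitions above:

import math

print(math.ceil(3.2), math.floor(3.8))   # 4 3
print(math.gcd(24, 36))                  # 12
print(math.comb(5, 2), math.perm(5, 2))  # 10 20 (without/with order)
print(math.copysign(1.0, -0.0))          # -1.0 on platforms with signed zeros
print(math.isclose(0.1 + 0.2, 0.3))      # True, within the default rel_tol
print(math.fsum([0.1] * 10))             # 1.0, an accurate floating-point sum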

Power and logarithmic functions

The power and logarithmic functions section is responsible for exponential calculations,
which are important in many areas of mathematics, engineering, and statistics. These
functions can work with natural logarithms and exponentials, base-2 logarithms, and
logarithms to arbitrary bases.

math.exp(x) – return e raised to the power x, where e = 2.718281… is the base of natural
logarithms

math.expm1(x) – return e raised to the power x, minus 1. Here e is the base of natural
logarithms

math.log(x[, base]) – with one argument, return the natural logarithm of x (to base
e). With two arguments, return the logarithm of x to the given base, calculated as
log(x)/log(base)

math.log1p(x) – return the natural logarithm of 1+x (base e). The result is calculated in a way
that is accurate for x near zero

math.log2(x) – return the base-2 logarithm of x. This is usually more accurate than log(x, 2)

math.log10(x) – return the base-10 logarithm of x. This is usually more accurate than log(x,
10)

math.pow(x, y) – return x raised to the power y

math.sqrt(x) – return the square root of x


Trigonometric functions

Trigonometric functions, direct and inverse, are widely represented in the Python
Mathematical Library. They work with radian values, which is important. It is also possible to
carry out calculations with Euclidean functions.

math.acos(x) – return the arc cosine of x, in radians

math.asin(x) – return the arc sine of x, in radians

math.atan(x) – return the arctangent of x, in radians

math.atan2(y, x) – return atan(y / x), in radians. The result is between -pi and pi

math.cos(x) – return the cosine of x radians

math.dist(p, q) – return the Euclidean distance between two points p and q, each given as a
sequence (or iterable) of coordinates. The two points must have the same dimension

math.hypot(*coordinates) – return the Euclidean norm, sqrt(sum(x**2 for x in coordinates)).
This is the length of the vector from the origin to the point given by the coordinates

math.sin(x) – return the sine of x radians

math.tan(x) – return the tangent of x radians
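
A small sketch combining the trigonometric and Euclidean functions (math.dist requires Python 3.8+):

import math

angle = math.radians(60)               # trig functions expect radians
print(math.sin(angle))                 # ≈ 0.8660254037844386
print(math.degrees(math.atan2(1, 1)))  # 45.0; atan2 keeps quadrant information
print(math.dist((0, 0), (3, 4)))       # 5.0, Euclidean distance between points
print(math.hypot(3, 4))                # 5.0, length of the vector (3, 4)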

Angular conversion

Converting degrees to radians and vice versa is a fairly common task, and therefore the
developers have added these functions to the Python library. This allows you to write compact
and understandable code.

math.degrees(x) – convert angle x from radians to degrees

math.radians(x) – convert angle x from degrees to radians

Hyperbolic functions

Hyperbolic functions are analogs of trigonometric functions that are based on hyperbolas
instead of circles.

math.acosh(x) – return the inverse hyperbolic cosine of x

math.asinh(x) – return the inverse hyperbolic sine of x

math.atanh(x) – return the inverse hyperbolic tangent of x

math.cosh(x) – return the hyperbolic cosine of x

math.sinh(x) – return the hyperbolic sine of x


math.tanh(x) – return the hyperbolic tangent of x

Special functions

The special functions section is responsible for the error and gamma functions. These are
needed often enough that they are implemented in the standard Python mathematical
library.

math.erf(x) – Return the error function at x

math.erfc(x) – Return the complementary error function at x

math.gamma(x) – Return the Gamma function at x

math.lgamma(x) – Return the natural logarithm of the absolute value of the Gamma function
at x

Constants

The constant section provides ready-made values for basic constants and writes them with the
necessary accuracy for a given hardware platform, which is important for Python’s portability
as a cross-platform language. Also, the very important values infinity and “not a number” are
defined in this section of the Python library.

math.pi – the mathematical constant π = 3.141592…, to available precision

math.e – the mathematical constant e = 2.718281…, to available precision

math.tau – the mathematical constant τ = 6.283185…, to available precision. Tau is a circle
constant equal to 2π, the ratio of a circle’s circumference to its radius

math.inf – a floating-point positive infinity. (For negative infinity, use -math.inf.) Equivalent
to the output of float('inf')

math.nan – a floating-point "not a number" (NaN) value. Equivalent to the output of
float('nan')
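
A few of the constants in use:

import math

print(math.pi, math.tau / 2)     # tau is exactly 2*pi
print(-math.inf < 0 < math.inf)  # True: infinities bound all finite floats
print(math.nan == math.nan)      # False: NaN never compares equal...
print(math.isnan(math.nan))      # ...so use math.isnan to test for it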

iii) Scipy

SciPy is a library for the open-source Python programming language, designed to perform
scientific and engineering calculations.

The capabilities of this library are quite wide:

 Search for minima and maxima of functions
 Calculation of function integrals
 Support for special functions
 Signal processing
 Image processing
 Work with genetic algorithms
 Solving ordinary differential equations
SciPy in Python is a collection of mathematical algorithms and functions built as a Numpy
extension. It greatly extends the capabilities of an interactive Python session by providing the
user with high-level commands and classes for managing and visualizing data. With SciPy, an
interactive Python session becomes a data processing and prototyping system competing with
systems such as MATLAB, IDL, Octave, R-Lab, and SciLab.

An additional advantage of Python-based SciPy is that it is also a fairly powerful programming
language used in the development of complex programs and specialized applications.
Scientific applications also benefit from the development of additional modules in numerous
software niches by developers around the world. Everything from parallel programming for the
web to routines and database classes is available to the Python programmer. All of these
features are available in addition to the SciPy math library.
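
As a minimal sketch of those high-level commands, the snippet below integrates sin(x) over [0, π] with scipy.integrate and queries a normal distribution from scipy.stats:

import numpy as np
from scipy import integrate, stats

# Numerical integration: the exact value of the integral is 2.
value, abs_error = integrate.quad(np.sin, 0, np.pi)
print(value, abs_error)

# A standard normal distribution from scipy.stats.
print(stats.norm.cdf(0))      # 0.5: half the mass lies below the mean
print(stats.norm.ppf(0.975))  # ≈ 1.96: the familiar 97.5% quantile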

Packages for mathematical methods

SciPy is organized into sub-packages covering various scientific computing areas:

cluster – clustering algorithms

constants – physical and mathematical constants

fftpack – Fast Fourier Transform routines

integrate – integration and solution of ordinary differential equations

interpolate – interpolation and smoothing splines

io – input and output

linalg – linear algebra

ndimage – n-dimensional image processing

odr – orthogonal distance regression

optimize – optimization and root finding

signal – signal processing

sparse – sparse matrices and related routines

spatial – spatial data structures and algorithms

special – special functions

stats – statistical distributions and functions

weave – C/C++ integration

The SciPy ecosystem includes general and specialized tools for data management and
computation, productive experimentation, and high-performance computing. Below, we
overview some key packages, though there are many more relevant packages.
Main components of the SciPy ecosystem

Data and computation:

pandas, providing high-performance, easy-to-use data structures

SymPy, for symbolic mathematics and computer algebra

scikit-image is a collection of algorithms for image processing

scikit-learn is a collection of algorithms and tools for machine learning

h5py and PyTables can both access data stored in the HDF5 format

Productivity and high-performance computing:

IPython, a rich interactive interface, letting you quickly process data and test ideas

The Jupyter notebook provides IPython functionality and more in your web browser, allowing
you to document your computation in an easily reproducible form

Cython extends Python syntax so that you can conveniently build C extensions, either to speed
up critical code or to integrate with C/C++ libraries

Dask, Joblib or IPyParallel for distributed processing with a focus on numeric data

Quality assurance:

nose, a framework for testing Python code, being phased out in preference for pytest

numpydoc, a standard and library for documenting Scientific Python libraries

SciPy provides a very wide and sought-after feature set:

Clustering package (scipy.cluster)

Constants (scipy.constants)

Discrete Fourier transforms (scipy.fftpack)

Integration and ODEs (scipy.integrate)

Interpolation (scipy.interpolate)

Input and output (scipy.io)

Linear algebra (scipy.linalg)

Miscellaneous routines (scipy.misc)

Multi-dimensional image processing (scipy.ndimage)

Orthogonal distance regression (scipy.odr)

Optimization and Root Finding (scipy.optimize)

Signal processing (scipy.signal)


Sparse matrices (scipy.sparse)

Sparse linear algebra (scipy.sparse.linalg)

Compressed Sparse Graph Routines (scipy.sparse.csgraph)

Spatial algorithms and data structures (scipy.spatial)

Special functions (scipy.special)

Statistical functions (scipy.stats)

Statistical functions for masked arrays (scipy.stats.mstats)

Low-level callback functions

An example of how to calculate effectively on SciPy

In this tutorial, Basic functions — SciPy v1.4.1 Reference Guide, you can find how to
calculate polynomials, their derivatives, and integrals. Yes, with one line of code you can
compute a polynomial's derivative or integral in symbolic form. Imagine how many lines of
code you would need to do this without it. This is why this library is valuable in Python:

>>> from numpy import poly1d
>>> p = poly1d([3, 4, 5])
>>> print(p)
   2
3 x + 4 x + 5
>>> print(p * p)
   4      3      2
9 x + 24 x + 46 x + 40 x + 25
>>> print(p.integ(k=6))
   3     2
1 x + 2 x + 5 x + 6
>>> print(p.deriv())
6 x + 4
>>> p([4, 5])
array([ 69, 100])

Applications:

 Multidimensional image operations
 Solving differential equations and the Fourier transform
 Optimization algorithms
 Linear algebra
iv) Numpy

In early 2005, programmer and data scientist Travis Oliphant wanted to unite the community
around one project and created the NumPy library to replace the Numeric and NumArray
libraries. NumPy was created based on the Numeric code. The Numeric code was rewritten to
be easier to maintain, and new features could be added to the library. NumArray features have
been added to NumPy. NumPy was originally part of the SciPy library. To allow other projects
to use the NumPy library, its code was placed in a separate package.

The source code for NumPy is publicly available. NumPy is licensed under the BSD license.

Purpose of the NumPy library

Mathematical algorithms implemented in interpreted languages, for example, Python, often
work much slower than the same algorithms implemented in compiled languages (for example,
Fortran, C, and Java). The NumPy library provides implementations of computational
algorithms in the form of functions and operators, optimized for working with
multidimensional arrays. As a result, any algorithm that can be expressed as a sequence of
operations on arrays (matrices) and implemented using NumPy works as fast as the equivalent
code executed in MATLAB. If we compare numpy vs math, we quickly find that numpy has
more advantages for computation methods compared to math.

Here are some of the features of Numpy:

 A powerful N-dimensional array object
 Sophisticated (broadcasting) functions
 Tools for integrating C/C++ and Fortran code
 Useful linear algebra, Fourier transform, and random number capabilities
What’s the difference between a Python list and a NumPy array?

As described in the NumPy documentation, “NumPy gives you an enormous range of fast and
efficient ways of creating arrays and manipulating numerical data inside them. While a Python
list can contain different data types within a single list, all of the elements in a NumPy array
should be homogenous. The mathematical operations that are meant to be performed on arrays
would be extremely inefficient if the arrays weren’t homogenous.” Numpy provides the
following features to the user:

 Array objects
 Constants
 Universal functions (ufunc)
 Routines
 Packaging (numpy.distutils)
 NumPy Distutils – Users Guide
 NumPy C-API
 NumPy internals
 NumPy and SWIG
NumPy basics:

 Data types
 Array creation
 I/O with NumPy
 Indexing
 Broadcasting
 Byte-swapping
 Structured arrays
 Writing custom array containers
 Subclassing ndarray
One of the main objects of NumPy is ndarray. It allows you to create multidimensional data
arrays of the same type and perform operations on them with great speed. Unlike sequences in
Python, arrays in NumPy have a fixed size, the elements of the array must be of the same type.
You can apply various mathematical operations to arrays, which are performed more
efficiently than for Python sequences. The next example shows how to work with linear
algebra with NumPy. It is really simple and easy-to-understand for Python users.

>>> import numpy as np
>>> a = np.array([[1.0, 2.0], [3.0, 4.0]])
>>> print(a)
[[ 1.  2.]
 [ 3.  4.]]
>>> a.transpose()
array([[ 1.,  3.],
       [ 2.,  4.]])
>>> np.linalg.inv(a)
array([[-2. ,  1. ],
       [ 1.5, -0.5]])
>>> u = np.eye(2)  # unit 2x2 matrix; "eye" represents "I"
>>> u
array([[ 1.,  0.],
       [ 0.,  1.]])
>>> j = np.array([[0.0, -1.0], [1.0, 0.0]])
>>> j @ j  # matrix product
array([[-1.,  0.],
       [ 0., -1.]])
>>> np.trace(u)  # trace
2.0
>>> y = np.array([[5.], [7.]])
>>> np.linalg.solve(a, y)
array([[-3.],
       [ 4.]])
>>> np.linalg.eig(j)
(array([ 0.+1.j, 0.-1.j]), array([[ 0.70710678+0.j, 0.70710678-0.j],
       [ 0.00000000-0.70710678j, 0.00000000+0.70710678j]]))

Numpy allows processing information without explicit loops. Please take a look at
this article published by Brad Solomon about the advantages of Numpy: “It is sometimes said
that Python, compared to low-level languages such as C++, improves development time at the
expense of runtime. Fortunately, there are a handful of ways to speed up operation runtime in
Python without sacrificing ease of use. One option suited for fast numerical operations is
NumPy, which deservedly bills itself as the fundamental package for scientific computing with
Python.” It makes computation in Python really fast.
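
A small, illustrative timing sketch of that point: the same elementwise operation written as a Python list comprehension and as a single NumPy expression (actual timings depend on the machine):

import timeit
import numpy as np

data = list(range(1_000_000))
arr = np.arange(1_000_000)

loop_time = timeit.timeit(lambda: [x * 2 for x in data], number=10)
vec_time = timeit.timeit(lambda: arr * 2, number=10)

print(f"Python loop: {loop_time:.3f}s, NumPy: {vec_time:.3f}s")
# The vectorized version is typically one to two orders of magnitude faster.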

Applications:

 Extensively used in data analysis
 Creates powerful N-dimensional arrays
 Forms the base of other libraries, such as SciPy and scikit-learn
 Replacement of MATLAB when used with scipy and matplotlib
3. Study of Python Libraries for ML applications such as Pandas and
Matplotlib

i) Pandas
Pandas is primarily designed to perform data manipulation and analysis. It is known
that dataset preparation is essential before the training phase. The Pandas library
comes in handy in such a scenario as it provides a variety of data structures,
functions, and components that help in data extraction and preparation tasks. Data
preparation refers to data organization, wherein various methods are employed to
group, combine, reshape, and filter out different datasets.
Key advantages of the Pandas library include:
 Valid data frames: While the Pandas library has more utility for data
analysis, it is also used to handle machine learning operations through
data frames. Data frames refer to two-dimensional data similar to what is
used in SQL tables or spreadsheets. It enables programmers to get an
overview of the data, thereby improving the software product’s quality.
 Easy dataset handling: The Pandas library is typically helpful for
professionals intending to handle (structure, sort, reshape, filter) large
datasets with ease.
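
A minimal sketch (with a made-up dataset) of the preparation tasks described above: building a DataFrame, then filtering, grouping, and sorting it:

import pandas as pd

df = pd.DataFrame({
    "city": ["Hyderabad", "Chennai", "Hyderabad", "Chennai"],
    "price": [245000, 312000, 279000, 308000],
    "bedrooms": [3, 3, 4, 4],
})

three_bed = df[df["bedrooms"] == 3]           # filter rows
by_city = df.groupby("city")["price"].mean()  # group and aggregate
print(three_bed)
print(by_city.sort_values(ascending=False))   # sort for inspection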
ii) Matplotlib
Similar to the Pandas library, Matplotlib is not a machine-learning-heavy library. It is
typically used for data visualization, where developers can derive insights from the
visualized data patterns. Some of its modules, such as Pyplot, provide functionalities
to control line styles, manage fonts, and more while plotting 2D graphs and plots.
The features offered by Matplotlib are in line with those of MATLAB, and the library
is freely available as a Python package.
Key reasons for the popularity of Matplotlib include:
 Wide range of plotting tools: Using the Matplotlib library, plotting
various 2D charts, 3D diagrams, histograms, error charts, bar charts,
and graphs is possible. It allows experts to perform detailed data
analysis.
 Builds reliable ML models: Several plots allow thorough data analysis,
which further ensures that the developers have enough relevant data to
build reliable ML models.
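
A minimal Pyplot sketch of these features, plotting a line (with a custom line style and labeled axes) next to a histogram of made-up samples:

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

ax1.plot(x, np.sin(x), linestyle="--", label="sin(x)")  # line-style control
ax1.set_xlabel("x")
ax1.set_ylabel("sin(x)")
ax1.legend()

ax2.hist(np.random.default_rng(0).normal(size=500), bins=20)
ax2.set_title("Histogram of 500 normal samples")

plt.tight_layout()
plt.show()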
4. Simple Linear Regression

Code:

from sklearn.linear_model import LinearRegression
import numpy as np

# Sample data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([1.5, 3.7, 2.6, 4.9, 6.3])

# Model
model = LinearRegression()
model.fit(X, y)

# Predictions
predictions = model.predict(X)

print(f"Coefficients: {model.coef_}")
print(f"Intercept: {model.intercept_}")
print(f"Predictions: {predictions}")

Output:

Coefficients: [1.16]
Intercept: 0.8999999999999995
Predictions: [2.06 3.22 4.38 5.54 6.7 ]
5. Multiple Linear Regression for House Price Prediction

Code:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Sample data
data = {'Size': [1400, 1600, 1700, 1875],
'Bedrooms': [3, 3, 3, 4],
'Age': [10, 15, 20, 10],
'Price': [245000, 312000, 279000, 308000]}
df = pd.DataFrame(data)

X = df[['Size', 'Bedrooms', 'Age']]
y = df['Price']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

print(f"Coefficients: {model.coef_}")
print(f"Intercept: {model.intercept_}")
print(f"Mean Squared Error: {mean_squared_error(y_test, y_pred)}")

Output:

Coefficients: [ 139.84757423 16519.82099288 -823.00884956]
Intercept: 149683.18658280924
Mean Squared Error: 33067192.66055046
6. Implementation of Decision tree using sklearn and its parameter tuning

Code:

from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV

# Load data
data = load_iris()
X, y = data.data, data.target

# Hyperparameter tuning
param_grid = {'max_depth': [3, 5, None], 'min_samples_split': [2, 5, 10]}
grid_search = GridSearchCV(DecisionTreeClassifier(), param_grid, cv=5)
grid_search.fit(X, y)

print(f"Best Parameters: {grid_search.best_params_}")

Output:

Best Parameters: {'max_depth': 3, 'min_samples_split': 2}


7. Implementation of KNN using sklearn

Code:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

print(f"Accuracy: {knn.score(X_test, y_test)}")

Output:

Accuracy: 1.0
8.Implementation of Logistic Regression using sklearn

Code:

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

log_reg = LogisticRegression(max_iter=200)
log_reg.fit(X_train, y_train)

print(f"Accuracy: {log_reg.score(X_test, y_test)}")

Output:

Accuracy: 1.0
9.Implementation of K-Means Clustering

Code:

from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
import numpy as np

# Sample data
X = np.array([[1, 2], [1, 4], [1, 0],
[4, 2], [4, 4], [4, 0]])

kmeans = KMeans(n_clusters=2, random_state=0)
kmeans.fit(X)

# Plotting
plt.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='red',
label='Centroids')
plt.legend()
plt.show()
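
10. Performance analysis of Classification Algorithms on a specific dataset (Mini Project)

The manual leaves the mini project open-ended; below is one possible starting sketch (using the Iris dataset as an assumed example) that trains the classifiers from experiments 6–8 on the same split and compares their test performance with sklearn metrics:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load a dataset and hold out a test set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Candidate classifiers (hyperparameters here are illustrative choices).
models = {
    "Logistic Regression": LogisticRegression(max_iter=200),
    "KNN (k=3)": KNeighborsClassifier(n_neighbors=3),
    "Decision Tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# Train each classifier and compare performance on the same test split.
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print(f"{name}: accuracy = {accuracy_score(y_test, y_pred):.3f}")
    print(classification_report(y_test, y_pred))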
