EDA_UNIT_1

The document outlines a lab exercise focused on Exploratory Data Analysis (EDA) using Python, specifically with the Cars4U dataset from Kaggle. It includes instructions for downloading the dataset, installing necessary Python libraries (numpy, pandas, matplotlib, seaborn), and performing basic operations with numpy arrays and pandas dataframes. Additionally, it covers loading datasets, selecting rows and columns in dataframes, and provides code examples for each task.

Uploaded by

arafaths062


EDA LAB

UNIT-I

1.a) Download the dataset from Kaggle using the following link:

https://www.kaggle.com/datasets/sukhmanibedi/cars4u
b) Install the Python libraries required for Exploratory Data Analysis (numpy, pandas, matplotlib, seaborn).

Theory:

1.a) Open any browser and paste the above Kaggle link. A zip file will be downloaded. Unzip it and study the Cars4U dataset (used_cars_data.csv, 785.45 kB) in detail.
Explain the columns of the dataset and count the number of rows.
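
As a starting point for explaining the columns and counting the rows, here is a minimal sketch. The helper name summarize_columns and the three sample columns are hypothetical stand-ins, not the real Cars4U schema; after unzipping, replace the sample frame with pd.read_csv("used_cars_data.csv").

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, non-null count, and unique-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.count(),
        "unique": df.nunique(),
    })


# Hypothetical stand-in data (NOT the real dataset columns).
# After unzipping, use: sample = pd.read_csv("used_cars_data.csv")
sample = pd.DataFrame({
    "name": ["car A", "car B", "car C"],
    "year": [2012, 2015, 2015],
    "price": [4.5, None, 7.25],
})

n_rows, n_cols = sample.shape
summary = summarize_columns(sample)
print("rows:", n_rows, "columns:", n_cols)
print(summary)
```

The shape attribute gives the row and column counts directly, and the per-column summary shows which columns have missing values and how many distinct values each one holds.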

b) Install python libraries

pip is a package management system used to install and manage software packages/libraries written in Python. The name is usually expanded as "Pip Installs Packages" (some sources give "Preferred Installer Program").
Prerequisites:
Python should be installed on your Windows machine.
How to check if Python is installed?
Run the following command to test whether Python is installed:
>python --version
If it is installed, you will see something like this:
Python 3.12.4

Recent Python installers for Windows already include pip. If it is missing, it can be downloaded and installed as follows:

Step 1: Open the cmd terminal.
Step 2: curl is a command-line tool for transferring data to and from a server. Use it to download the get-pip.py script:
>curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

Step 3: Run the downloaded script:
>python get-pip.py

pandas can be installed from PyPI with pip:

>pip install pandas
NumPy can be installed with:
>pip install numpy

Matplotlib and seaborn are installed the same way:
>pip install matplotlib

>pip install seaborn

After installation, verify the libraries by importing them in a Jupyter notebook.
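
One way to run that verification in a single notebook cell is sketched below. The helper name check_installed is an assumption for illustration; it uses the standard-library importlib.util.find_spec to check whether each package can be found, without stopping at the first missing one.

```python
import importlib.util

# The four libraries this lab asks for.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn"]


def check_installed(names):
    """Map each package name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_installed(REQUIRED)
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'MISSING'}")
```

Any package reported as MISSING can be installed with the corresponding pip command above.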

Conclusions:

2. Perform NumPy array basic operations and explore NumPy built-in functions.

Theory:

For importing numpy, we will use the following code:

import numpy as np

For creating different types of numpy arrays, we will use the following code:

# importing numpy
import numpy as np

# Defining 1D array
my1DArray = np.array([1, 8, 27, 64])
print(my1DArray)

# Defining and printing 2D array

my2DArray = np.array([[1, 2, 3, 4], [2, 4, 9, 16], [4, 8, 18, 32]])
print(my2DArray)

#Defining and printing 3D array

my3Darray = np.array([[[1, 2, 3, 4], [5, 6, 7, 8]],
                      [[1, 2, 3, 4], [9, 10, 11, 12]]])
print(my3Darray)
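
The exercise title also asks for basic array operations. The following is a minimal sketch of a few common ones, reusing the same 2D values as above. The variable names are illustrative, and the selection of operations is an assumption about what "basic" covers here.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [2, 4, 9, 16], [4, 8, 18, 32]])

doubled = a * 2             # element-wise arithmetic
total = a.sum()             # sum of all elements
col_sums = a.sum(axis=0)    # one sum per column
row_means = a.mean(axis=1)  # one mean per row
reshaped = a.reshape(4, 3)  # same 12 values, new shape
sub = a[1:, :2]             # slicing: rows 1-2, columns 0-1

print(total)       # 103
print(col_sums)    # [ 7 14 30 52]
print(row_means)   # [ 2.5   7.75 15.5 ]
print(sub)
```

The axis argument controls the direction of the reduction: axis=0 collapses the rows (giving one value per column) and axis=1 collapses the columns (giving one value per row).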

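For displaying basic information about a NumPy array, such as its data type, shape, size, and strides, we will use the following code:

# Print the buffer object that exposes the array's raw data (a memoryview, not a plain address)

print(my2DArray.data)

# Print the shape of the array

print(my2DArray.shape)

# Print the total number of elements in the array

print(my2DArray.size)

# Print out the data type of the array

print(my2DArray.dtype)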

Strides in a NumPy array:

The strides tuple gives, for each axis, the number of bytes to skip in memory to move to the next position along that axis.
# Print the strides of the array.
print(my2DArray.strides)
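
To make the stride values concrete, here is a small worked sketch. It fixes the dtype to np.int64 explicitly (an assumption made so the byte counts are the same on every platform): each element is then 8 bytes, so in a row-major (3, 4) array the next column is 8 bytes away and the next row is 4 * 8 = 32 bytes away.

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.int64)

print(a.itemsize)    # 8 bytes per element
print(a.strides)     # (32, 8): 32 bytes to the next row, 8 to the next column
print(a.T.strides)   # (8, 32): transposing swaps the strides, no data is copied

# A smaller element type shrinks the strides accordingly.
b = a.astype(np.int16)
print(b.strides)     # (8, 2)
```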

For creating an array using built-in NumPy functions, we will use the following
code:

# Array of ones
ones = np.ones((3,4))
print(ones)

# Array of zeros
zeros = np.zeros((2,3,4),dtype=np.int16)
print(zeros)

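# Array with random values drawn uniformly from [0.0, 1.0)

randomArray = np.random.random((2,2))
print(randomArray)

# Empty array (allocated but not initialized, so the printed values are arbitrary)
emptyArray = np.empty((3,2))
print(emptyArray)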

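# Full array (every element set to 7)
fullArray = np.full((2,2),7)
print(fullArray)

# Array of evenly-spaced values with a fixed step (start 10, stop before 25, step 5)
evenSpacedArray = np.arange(10,25,5)
print(evenSpacedArray)

# Array of evenly-spaced values with a fixed count (9 points from 0 to 2, both ends included)

evenSpacedArray2 = np.linspace(0,2,9)
print(evenSpacedArray2)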
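
The difference between the two evenly-spaced constructors is easy to miss, so here is a short sketch with the same arguments as above. The variable names stepped and counted are illustrative only.

```python
import numpy as np

# arange: you choose the step; the stop value (25) is excluded.
stepped = np.arange(10, 25, 5)
print(stepped)             # [10 15 20]

# linspace: you choose the number of points; the stop value (2) is included.
counted = np.linspace(0, 2, 9)
print(counted)             # 9 values from 0.0 to 2.0
print(np.diff(counted))    # spacing between neighbours is 0.25 throughout
```

Use arange when the step size matters and linspace when the number of points matters.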

CONCLUSIONS:
3. Loading a dataset into a pandas dataframe

A. Reading a CSV file with pandas

import pandas as pd
# Reading the CSV file
df = pd.read_csv("Iris.csv")
# Printing top 5 rows
df.head()

B. Using scikit-learn to import a dataset

from sklearn.datasets import load_iris

# Load the Iris dataset

iris = load_iris()

# Access the features and target variable

X = iris.data # Features (sepal length, sepal width, petal length, petal width)
y = iris.target # Target variable (species: 0 for setosa, 1 for versicolor, 2 for virginica)

# Print the feature names and target names

print("Feature names:", iris.feature_names)
print("Target names:", iris.target_names)
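
Since this exercise is about loading data into a pandas dataframe, the scikit-learn dataset can also be obtained in that form. This is a sketch that assumes a scikit-learn version supporting the as_frame argument of load_iris; the name iris_df is illustrative.

```python
from sklearn.datasets import load_iris

# as_frame=True makes the returned bunch carry pandas objects.
iris_bunch = load_iris(as_frame=True)

# .frame holds the four feature columns plus a 'target' column.
iris_df = iris_bunch.frame

print(iris_df.shape)    # (150, 5)
print(iris_df.head())
```

With the data in a dataframe, the same head(), info(), and iloc calls used for a CSV file apply unchanged.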

CONCLUSION:

4. Selecting rows and columns in the dataframe

The following code displays the number of rows, the column names, their data types, and the memory used by the dataframe:

Refer to the code in part A, where df is loaded.

df.info()

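Let's now see how we can select rows and columns in any dataframe using position-based indexing with iloc:
# Selects the row at position 10 (the 11th row)
df.iloc[10]

# Selects the first 10 rows (positions 0 to 9)
df.iloc[0:10]
# Selects a range of rows (positions 10 to 14)
df.iloc[10:15]

# Selects the last 2 rows

df.iloc[-2:]

# Selects every other row, restricted to the columns at positions 3 and 4

df.iloc[::2, 3:5].head()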
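
The slice rules are easier to check on a tiny frame where every value is known. The sketch below uses a hypothetical 6x6 dataframe named demo (not the lab dataset) and also shows label-based selection with loc for comparison, which the handout does not cover.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: values 0..35, columns labelled a..f.
demo = pd.DataFrame(np.arange(36).reshape(6, 6), columns=list("abcdef"))

# iloc is position-based and excludes the slice end:
# rows 0, 2, 4 and the columns at positions 3 and 4 ('d' and 'e').
by_position = demo.iloc[::2, 3:5]
print(by_position)

# loc is label-based and includes the slice end:
# row labels 0, 1, 2 and columns 'd' through 'e'.
by_label = demo.loc[0:2, "d":"e"]
print(by_label)

# Last two rows, all columns.
last_two = demo.iloc[-2:]
print(last_two)
```

Note that iloc[0:2] returns two rows while loc[0:2] returns three when the index labels are 0, 1, 2, because only loc includes the end of the slice.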

Refer to the code in part B: selecting the feature columns and the target

# Access the features and target variable

X = iris.data # Features (sepal length, sepal width, petal length, petal width)
y = iris.target # Target variable (species: 0 for setosa, 1 for versicolor, 2 for virginica)
print(X)
print(y)
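
Because X here is a NumPy array rather than a dataframe, individual columns and rows are selected with array indexing. A minimal sketch, with illustrative variable names:

```python
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = iris.target

sepal_length = X[:, 0]     # first feature column, all rows
sepal_cols = X[:, :2]      # first two feature columns
setosa_rows = X[y == 0]    # rows whose target is 0 (setosa)

print(sepal_length.shape)  # (150,)
print(sepal_cols.shape)    # (150, 2)
print(setosa_rows.shape)   # (50, 4)
```

The boolean mask y == 0 plays the same role as a filter condition on a dataframe column.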

CONCLUSIONS:
