
22mbada303 Module 4

The document discusses NumPy, a Python library used for working with arrays and numerical data. It describes key NumPy concepts like arrays, data types, operations on arrays, and summarizes common NumPy functions for working with arrays such as sum, mean, median, transpose etc. Examples are provided to demonstrate creating and manipulating NumPy arrays.

Uploaded by

Kiran Vinnu

Dr Girish Kumar

Professor & HOD,


MCA Dept, BITM
• NumPy stands for Numerical Python.
• It is a Python library used for working with arrays.
• In plain Python, lists serve the purpose of arrays, but they are slow to process.
• The NumPy array is a powerful N-dimensional array object.
• It provides an array object that is much faster than traditional Python lists.
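The difference in how lists and arrays are processed can be sketched as follows. This is a minimal illustration, and the variable names are chosen only for this example. A list needs a loop or comprehension to change every element, while a NumPy array applies the operation to all elements at once.

```python
import numpy as np

values = [1, 2, 3, 4]            # plain Python list (illustrative data)

# With a list, each element is processed one at a time
doubled_list = [v * 2 for v in values]

# With a NumPy array, the operation is applied to the whole array at once
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)   # [2, 4, 6, 8]
print(doubled_arr)    # [2 4 6 8]
```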
Types of Array:
1. One Dimensional Array
2. Multi-Dimensional Array
A one-dimensional array is a type of linear array.

One Dimensional Array

# importing numpy module
import numpy as np

# creating list
ls = [1, 2, 3, 4]

# creating numpy array
sample_array = np.array(ls)

print("List in python : ", ls)
print("Numpy Array in python :", sample_array)

Check data type for list and array:
print(type(ls))
print(type(sample_array))
Data in multidimensional arrays are stored in tabular form.

Two Dimensional Array

# importing numpy module
import numpy as np

# creating list
list_1 = [1, 2, 3, 4]
list_2 = [5, 6, 7, 8]
list_3 = [9, 10, 11, 12]

# creating numpy array


sample_array = np.array([list_1,list_2,list_3])

print("Numpy multi dimensional array in python\n",sample_array)


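1. Axis: An axis is one dimension of an array. The axis number gives the position of that dimension in the order of indexing.

Axis 0 = first dimension (the rows of a two-dimensional array)
Axis 1 = second dimension (the columns of a two-dimensional array)
Axis 2 = third dimension (present only in arrays with three or more dimensions)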

2. Shape: The number of elements along each axis. It is given as a tuple.
Numpy array :

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]

Shape of the array : (3, 4)


3. Rank: The rank of an array is simply the number of axes (or dimensions) it has

Rank 1

The one-dimensional array has rank 1

Rank 2

The two-dimensional array has rank 2


4. Data type objects (dtype): A data type object (dtype) is an instance
of the numpy.dtype class. It describes how the bytes in the fixed-size block of
memory corresponding to an array item should be interpreted.

# Import module
import numpy as np

# Creating the array


sa1 = np.array([[0, 4, 2]])

sa2 = np.array([0.2, 0.4, 2.4])

# display data type


print("Data type of the array 1 :",sa1.dtype)

print("Data type of array 2 :", sa2.dtype)
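The four terms above (axis, shape, rank, dtype) can be checked on a single array. The sketch below reuses the 3 x 4 array from the Shape example. The names col_totals and row_totals are illustrative only.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

print("Shape:", arr.shape)        # (3, 4): 3 rows, 4 columns
print("Rank (ndim):", arr.ndim)   # 2
print("dtype:", arr.dtype)        # an integer type; the exact width depends on the platform

# Summing along axis 0 collapses the rows (one total per column)
col_totals = arr.sum(axis=0)
# Summing along axis 1 collapses the columns (one total per row)
row_totals = arr.sum(axis=1)

print(col_totals)   # [15 18 21 24]
print(row_totals)   # [10 26 42]
```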



Operations on single array

# Python program to demonstrate
# basic operations on single array
import numpy as np

a = np.array([1, 2, 5, 3])

# add 1 to every element
print("Adding 1 to every element:", a + 1)

# subtract 3 from each element
print("Subtracting 3 from each element:", a - 3)

# multiply each element by 10
print("Multiplying each element by 10:", a * 10)

# square each element
print("Squaring each element:", a ** 2)

# modify existing array
a *= 2
print("Doubled each element of original array:", a)

# transpose of array
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]])
print("\nOriginal array:\n", a)
print("Transpose of array:\n", a.T)

Unary operators: Many unary operations are provided as methods of the ndarray class.

These include sum, min, max, etc.


import numpy as np
arr = np.array([[1, 5, 6],[4, 7, 2],[3, 1, 9]])

# maximum element of array


print ("Largest element is:", arr.max())
print ("Row-wise maximum elements:",arr.max(axis = 1))

# minimum element of array


print ("Column-wise minimum elements:",arr.min(axis = 0))

# sum of array elements


print ("Sum of all array elements:",arr.sum())

# cumulative sum along each row


print ("Cumulative sum along each row:\n", arr.cumsum(axis = 1))

Binary operators: These operations apply to arrays elementwise, and a new array is
created. You can use all the basic arithmetic operators such as +, -, *, /, etc. With the +=, -=, *=
operators, the existing array is modified.
import numpy as np
a = np.array([[1, 2], [3, 4]])
b = np.array([[4, 3],[2, 1]])

# add arrays
print ("Array sum:\n", a + b)

# multiply arrays (elementwise multiplication)


print ("Array multiplication:\n", a*b)

# matrix multiplication
print ("Matrix multiplication:\n", a.dot(b))
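The difference between creating a new array and modifying the existing one can be sketched as follows. The names c, alias and x are illustrative and do not appear in the original example. The @ operator is shown as an alternative spelling of the matrix product computed by dot().

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[4, 3], [2, 1]])

# a + b builds a new array; a itself is unchanged
c = a + b
print(c is a)        # False

# a += b writes the result back into the existing array
alias = a            # a second name for the same array object
a += b
print(alias is a)    # True
print(a)             # every element is now 5

# The @ operator performs matrix multiplication, like dot()
x = np.array([[1, 2], [3, 4]])
print(np.array_equal(x @ b, x.dot(b)))   # True
```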
• Pandas is a Python library used for working with data sets.
• It has functions for analyzing, cleaning, exploring, and manipulating data.
• The name "Pandas" refers to both "Panel Data" and "Python Data Analysis".
• Pandas is particularly well suited to working with tabular data, such as spreadsheets or SQL tables.
• Its versatility and ease of use make it an essential tool for data analysts and scientists who work with structured data in Python.
Pandas provides two main data structures for manipulating data:
• Series
• DataFrame
A Pandas Series is a one-dimensional labeled array capable of holding
data of any type (integer, string, float, Python objects, etc.).
• The axis labels are collectively called the index.
• A Pandas Series can be thought of as a single column in an Excel sheet.
• In practice, a Pandas Series is usually created by loading data from existing storage, such as a SQL database, a CSV file, or an Excel file.
• A Pandas Series can also be created from lists, dictionaries, etc.
import pandas as pd
import numpy as np

# Creating empty series (an explicit dtype avoids a warning in some pandas versions)
ser = pd.Series(dtype=object)
print("Pandas Series: ", ser)

# simple array
data = np.array(['p', 'a', 'n', 'd', 'a', 's'])

ser = pd.Series(data)
print("Pandas Series:\n", ser)
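A Series can also be built from a dictionary, in which case the keys become the index labels. The sketch below uses made-up names and marks purely for illustration.

```python
import pandas as pd

# Hypothetical marks keyed by student name
marks = {"Asha": 82, "Ravi": 75, "Meena": 91}

ser = pd.Series(marks)     # dictionary keys become the index labels
print(ser)

print(ser["Ravi"])         # access a value by its label: 75
print(list(ser.index))     # ['Asha', 'Ravi', 'Meena']
```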
• A Pandas DataFrame is a two-dimensional data structure with labeled axes (rows and columns).
• In practice, a Pandas DataFrame is usually created by loading data from existing storage, such as a SQL database, a CSV file, or an Excel file.
• A Pandas DataFrame can also be created from lists, dictionaries, etc.
import pandas as pd

# Calling DataFrame constructor (creates an empty DataFrame)
df = pd.DataFrame()
print(df)

# list of strings
lst = ['Python', 'NumPy', 'Pandas', 'Data', 'Analytics']

# Calling DataFrame constructor on list
df = pd.DataFrame(lst)
print(df)
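A DataFrame can also be created from a dictionary of lists, where each key becomes a column name. The data below is invented for the purpose of this sketch.

```python
import pandas as pd

# Hypothetical data: each key becomes a column, each list holds that column's values
data = {"Name": ["Asha", "Ravi", "Meena"],
        "Age": [24, 27, 22]}

df = pd.DataFrame(data)
print(df)
print(df.shape)            # (3, 2): 3 rows, 2 columns
print(list(df.columns))    # ['Name', 'Age']
```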
# Import pandas
import pandas as pd

# reading csv file
df = pd.read_csv("people.csv")
print(df.head())

# reading only selected columns
df = pd.read_csv('people.csv', header=0,
                 usecols=["First Name", "Sex", "Email"])

# printing dataframe
print(df.head())
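The contents of people.csv are not shown here, so the sketch below substitutes a small in-memory text with the same three column names plus an extra Phone column. All values are hypothetical. It shows that usecols keeps only the listed columns.

```python
import io
import pandas as pd

# Hypothetical stand-in for people.csv; the real file's contents are not known
csv_text = """First Name,Sex,Email,Phone
Asha,Female,asha@example.com,555-0101
Ravi,Male,ravi@example.com,555-0102
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text), header=0,
                 usecols=["First Name", "Sex", "Email"])

print(df.head())
print(list(df.columns))    # ['First Name', 'Sex', 'Email']
```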
In descriptive statistics, we describe our data with the help of
representative methods such as charts, graphs, tables, and Excel files,
and present it in a meaningful way so that it can be easily understood.

Mean
It is the sum of observations divided by the total number of observations.
import numpy as np
# Sample Data
arr = [5, 6, 11]
# Mean
mean = np.mean(arr)
print("Mean = ", mean)
Median
It is the middle value of the data set. It splits the data into two halves.

import numpy as np
# sample Data
arr = [1, 2, 3, 4]
# Median
median = np.median(arr)
print("Median = ", median)
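The two definitions can be checked by hand against the NumPy results. The sketch below assumes an even number of observations, where the median is the average of the two middle values. The names manual_mean and manual_median are illustrative.

```python
import numpy as np

arr = [1, 2, 3, 4]

# Mean: sum of observations divided by the number of observations
manual_mean = sum(arr) / len(arr)

# Median for an even count: average of the two middle values after sorting
ordered = sorted(arr)
mid = len(ordered) // 2
manual_median = (ordered[mid - 1] + ordered[mid]) / 2

print(manual_mean, np.mean(arr))        # 2.5 2.5
print(manual_median, np.median(arr))    # 2.5 2.5
```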
• Missing data occurs when no information is provided for one or more items or for a whole unit.
• Missing data is a very big problem in real-life scenarios.
• Missing data is also referred to as NA (Not Available) values in pandas.

In Pandas, missing data is represented by two values:

• None: None is a Python singleton object that is often used for missing data in Python code.
• NaN: NaN (an acronym for Not a Number) is a special floating-point value recognized by all systems that use the standard IEEE floating-point representation.
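The behaviour of the two markers can be sketched as follows. The Series values are illustrative. In a numeric Series, pandas stores None as NaN, so both are reported as missing.

```python
import numpy as np
import pandas as pd

# NaN is a float and, under the IEEE rules, is not equal to itself
print(np.nan == np.nan)         # False
print(type(np.nan))             # <class 'float'>

# In a numeric Series, pandas stores None as NaN
ser = pd.Series([1, None, np.nan, 4])
print(ser.dtype)                # float64
print(ser.isnull().tolist())    # [False, True, True, False]
```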
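Pandas provides the following functions for handling missing data:
• isnull(): returns True for each value that is missing
• notnull(): returns True for each value that is not missing
• dropna(): drops the rows (or columns) that contain missing values
• fillna(): fills missing values with a given value
• replace(): replaces given values, such as NaN, with another value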
import pandas as pd
import numpy as np

# dictionary of lists
data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, 45, 56, np.nan],
        'Third Score': [np.nan, 40, 80, 98]}

# creating a dataframe from dictionary
df = pd.DataFrame(data)

# using isnull() function
print(df.isnull())
import pandas as pd
import numpy as np

# dictionary of lists
data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, 45, 56, np.nan],
        'Third Score': [np.nan, 40, 80, 98]}

# creating a dataframe from dictionary
df = pd.DataFrame(data)

# using notnull() function
print(df.notnull())
import pandas as pd
import numpy as np

# dictionary of lists
data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, 45, 56, np.nan],
        'Third Score': [np.nan, 40, 80, 98]}

# creating a dataframe from dictionary
df = pd.DataFrame(data)

# filling missing values with 0
print(df.fillna(0))
import pandas as pd
import numpy as np

# dictionary of lists
data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, np.nan, 45, 56],
        'Third Score': [52, 40, 80, 98],
        'Fourth Score': [np.nan, np.nan, np.nan, 65]}

# creating a dataframe from dictionary
df = pd.DataFrame(data)

# using dropna() function (drops every row with at least one missing value)
print(df.dropna())
import pandas as pd
import numpy as np

# dictionary of lists
data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, 45, 56, np.nan],
        'Third Score': [np.nan, 40, 80, 98]}

# creating a dataframe from dictionary
df = pd.DataFrame(data)

# replacing missing values with -99
print(df.replace(to_replace=np.nan, value=-99))
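The functions above can be combined in one short workflow. The sketch below reuses the score data from the earlier examples. Filling with each column's mean is an illustrative choice made for this sketch, not a recommendation from the original slides.

```python
import numpy as np
import pandas as pd

scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, 45, 56, np.nan],
          'Third Score': [np.nan, 40, 80, 98]}
df = pd.DataFrame(scores)

# Count the missing values in each column
missing_per_column = df.isnull().sum()
print(missing_per_column)             # 1 missing value in every column

# dropna() removes every row that has at least one missing value
complete_rows = df.dropna()
print(len(complete_rows))             # 1 (only the row at index 1 is complete)

# fillna() can take one fill value per column, here each column's mean
filled = df.fillna(df.mean())
print(filled.isnull().sum().sum())    # 0
```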
Module – 4 : Assignment Questions
1. Explain all the data types supported in Python
2. What is NumPy? Explain with an example
3. Define the terms a) Axis b) Shape c) Rank d) dtype
4. Explain NumPy basic operations on a 1-D array
5. Explain NumPy binary operations on a 1-D array
6. Explain all the methods of descriptive statistics (ndarray class) with an example program
7. What is Pandas? Explain the data structures in Pandas
8. Write a Python program to create a Series using Pandas
9. Write a Python program to create a DataFrame using Pandas
10. List and explain the functions used in handling missing data
