Pandas Ip PDF
Pandas is essentially your data's home. With pandas, you get acquainted with your
data by cleaning, transforming, and analyzing it.
For example, say you want to explore a dataset stored in a CSV file on your computer. Pandas
will load the data from that CSV into a DataFrame (a table, basically) and then let you
work with it. Each column of a DataFrame is a Series, a one-dimensional labeled array,
and the examples below create Series in several ways, starting with a scalar value:
import pandas as pd
# giving a scalar value with index
ser = pd.Series(10, index=[0, 1, 2, 3, 4, 5])
print(ser)
0 10
1 10
2 10
3 10
4 10
5 10
dtype: int64
Create an empty Series
import pandas as pd
S = pd.Series(dtype =int)
print(S)
Create a Series using range()
import pandas as pd
S = pd.Series(range(2,24,3))
print(S)
0 2
1 5
2 8
3 11
4 14
5 17
6 20
7 23
dtype: int64
Create a Series using range() and a for loop (list comprehension)
import pandas as pd
S = pd.Series(range(1,12,2),index=[x for x in 'abcdef'])
print(S)
a 1
b 3
c 5
d 7
e 9
f 11
dtype: int64
Create a Series using a list with floating-point values
import pandas as pd
S = pd.Series([1,8,9.5])
print(S)
0 1.0
1 8.0
2 9.5
dtype: float64
Create a Series using two different lists
import pandas as pd
Fruits=['Apple','Orange','Banana']
NoofFruitssold=[12,45,67]
S = pd.Series(NoofFruitssold,index=Fruits)
print(S)
Apple 12
Orange 45
Banana 67
dtype: int64
Create a Series with missing values
import pandas as pd
import numpy as np
Fruits=['Apple','Orange','Banana']
# np.nan marks a missing value (the np.NaN alias was removed in NumPy 2.0)
NoofFruitssold=[12,45,np.nan]
S = pd.Series(NoofFruitssold,index=Fruits)
print(S)
Apple 12.0
Orange 45.0
Banana NaN
dtype: float64
Create a Series using a mathematical expression
import pandas as pd
import numpy as np
L1 =np.arange(2,24,2)
ind =range(1,12)
S = pd.Series(index=ind,data=L1*2)
print(S)
1 4
2 8
3 12
4 16
5 20
6 24
7 28
8 32
9 36
10 40
11 44
dtype: int32
Exercise: write the output of the following code
import pandas as pd
import numpy as np
L1 =np.array([4,9,16,25])
ind =range(4)
S = pd.Series(index=ind,data=L1**0.5)
print(S)
0 2.0
1 3.0
2 4.0
3 5.0
dtype: float64
A Series in pandas can be created with the library's Series() constructor from any of the following:
Array
Dict
Scalar value or constant
Mathematical expression
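Of the four sources listed, the dict is the one without a worked example so far. The following is a minimal sketch, assuming made-up fruit sales figures (the names sales, S and T are illustrative only): the dict keys become the index labels and the dict values become the data.

```python
import pandas as pd

# Hypothetical data: dict keys become the index labels, values become the data.
sales = {'Apple': 12, 'Orange': 45, 'Banana': 67}
S = pd.Series(sales)
print(S)

# Passing an index selects and orders the keys. A label that is missing
# from the dict ('Mango' here) gets NaN, which makes the dtype float64.
T = pd.Series(sales, index=['Banana', 'Apple', 'Mango'])
print(T)
```

The second call shows that the index argument takes precedence over the dict's own key order.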
Data Structure    Dimensions    Description
Series            1             1D labeled homogeneous array; size immutable, values of data mutable
A Series is described as value-mutable but size-immutable: the values it holds can be changed in place, whereas
operations that would change its length, such as drop() or concatenation, return a new Series object and leave the original as it was.
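The difference between changing values and changing size can be sketched as follows. This is only an illustration, assuming a small made-up Series; the variable names shorter and longer are hypothetical.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

# Values are mutable: assigning to an existing label changes s in place.
s['b'] = 25

# drop() does not resize s; it returns a new, shorter Series.
shorter = s.drop('a')

# pd.concat() likewise builds a new, longer Series from the pieces.
longer = pd.concat([s, pd.Series([40], index=['d'])])

print(len(s), len(shorter), len(longer))
```

After both calls, s still has its original three elements, with only the value at 'b' changed.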
Mathematical Operations on Series
Mean, median, and mode are three kinds of "averages".
The "mean": add up all the numbers, then divide
by how many numbers there are.
The "median" is the "middle" value in the list of numbers.
To find the median, the numbers have to be listed in
numerical order from smallest to largest. If the list has an even
number of values, the median is the mean of the two middle values.
The "mode" is the value that occurs most often. If no
number in the list is repeated, then there is no mode for
the list.
Find the mean, median, mode, and range for the following list of values:
13, 18, 13, 14, 13, 16, 14, 21, 13
Mean
(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 135 ÷ 9 = 15
Median
The median is the middle value, so first rewrite the list in numerical order:
13, 13, 13, 13, 14, 14, 16, 18, 21
There are nine numbers in the list, so the middle one is the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number, which is 14.
Mode
The mode is the number that is repeated more often than any other, so 13 is the mode.
Range
The largest value in the list is 21, and the smallest is 13, so the range is 21 – 13 = 8.
mean: 15
median: 14
mode: 13
range: 8
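The same four results can be checked with pandas. This is a minimal sketch using the list from the exercise; the variable names are illustrative, and the range is computed as max minus min.

```python
import pandas as pd

values = pd.Series([13, 18, 13, 14, 13, 16, 14, 21, 13])

mean = values.mean()
median = values.median()
modes = values.mode()                        # a Series, because ties are possible
value_range = values.max() - values.min()    # largest minus smallest

print(mean, median, modes.tolist(), value_range)
```

Note that mode() returns a Series rather than a single number, so a list with several equally frequent values yields all of them.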
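import pandas as pd
s = pd.Series([3,45,1,2,3])
# sum and number of values
ss = s.sum()
print(ss)
cc = s.count()
print(cc)
# mean and standard deviation
mm = s.mean()
print(mm)
stds = s.std()
print(stds)
# smallest and largest values
mn = s.min()
mx = s.max()
print(mn)
print(mx)
# index labels of the smallest and largest values
indexmin = s.idxmin()
indexmax = s.idxmax()
print(indexmin)
print(indexmax)
# median and mode
md = s.median()
mo = s.mode()
print(md)
print(mo)
# frequency of each value
va = s.value_counts()
print(va)
# summary statistics
al = s.describe()
print(al)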
FUNCTION                    USE
s.sum()                     Returns the sum of all values in the Series
s.mean()                    Returns the mean of all values in the Series; equal to s.sum()/s.count()
s.std()                     Returns the standard deviation of all values (sample standard deviation by default)
s.min() or s.max()          Return the minimum and maximum values of the Series
s.idxmin() or s.idxmax()    Return the index label of the minimum or maximum value
s.median()                  Returns the median of all values
s.mode()                    Returns the mode(s) of the Series, as a Series
s.value_counts()            Returns a Series with the frequency of each value
s.describe()                Returns a Series of summary statistics (such as count and mean), depending on the dtype of the data passed
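As a rough check of the functions in the table, the sketch below applies several of them to the Series [3, 45, 1, 2, 3] used in this section and collects the results in a dict. The dict name summary is an illustrative choice, not part of pandas.

```python
import pandas as pd

s = pd.Series([3, 45, 1, 2, 3])

summary = {
    'sum': s.sum(),          # 3 + 45 + 1 + 2 + 3
    'count': s.count(),      # number of non-missing values
    'mean': s.mean(),        # sum divided by count
    'min': s.min(),
    'max': s.max(),
    'idxmin': s.idxmin(),    # index label of the smallest value
    'idxmax': s.idxmax(),    # index label of the largest value
    'median': s.median(),
}
for name, value in summary.items():
    print(name, value)

# mode() returns a Series because several values can tie for most frequent.
print(s.mode().tolist())

# For numeric data, describe() reports the statistics named in its index.
print(s.describe().index.tolist())
```

Here idxmin() and idxmax() return positions 2 and 1 only because the Series has the default integer index; with a labeled index they return the labels.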
Vector Operations on Series
import pandas as pd
s = pd.Series([1,2,3])
t = pd.Series([13,24,54])
u = s + t
print(u)
0 14
1 26
2 57
dtype: int64
w = s + 2
print(w)
0 3
1 4
2 5
dtype: int64
fruits = ['apples', 'oranges', 'cherries', 'pears']
S = pd.Series([20, 33, 52, 10], index=fruits)
S2 = pd.Series([17, 13, 31, 32], index=fruits)
print(S + S2)
print("sum of S: ", sum(S))
apples 37
oranges 46
cherries 83
pears 42
dtype: int64
sum of S: 115
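In the fruit example both Series share the same index, so the addition lines up row by row. The sketch below shows what happens when the labels differ; S3 and its values are hypothetical, introduced only for this illustration. pandas matches values by label rather than by position, and a label present on only one side produces NaN.

```python
import pandas as pd

S = pd.Series([20, 33, 52, 10], index=['apples', 'oranges', 'cherries', 'pears'])
# Hypothetical second Series: different order, and 'plums' instead of 'pears'.
S3 = pd.Series([31, 17, 13, 5], index=['cherries', 'apples', 'oranges', 'plums'])

# Values are paired by label, not by position.
total = S + S3
print(total)

# add() with fill_value=0 treats a label missing from one side as 0
# instead of producing NaN.
filled = S.add(S3, fill_value=0)
print(filled)
```

Using add() with fill_value is one way to keep the unmatched labels; which behaviour is appropriate depends on whether a missing label really means zero.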
import pandas as pd
S = pd.Series([11, 28, 72, 3, 5, 8])
print(S.index)
print(S.values)
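For reference, the sketch below inspects what the two attributes hold. It assumes the same Series as above, created without an explicit index, so pandas supplies a default RangeIndex.

```python
import pandas as pd

S = pd.Series([11, 28, 72, 3, 5, 8])

# With no index given, pandas uses a RangeIndex counting from 0.
print(type(S.index).__name__)
print(S.index.tolist())

# values is a NumPy array holding the data without the labels.
print(type(S.values).__name__)
print(S.values.tolist())
```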
import pandas as pd
s = pd.Series([56,45,90,45,32,78],index =['a','b','c','d','e','f'])
s =s.reindex(['f','a','c','d','e','b'])
print(s)
f 78
a 56
c 90
d 45
e 32
b 45
dtype: int64
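The example above only reorders labels that already exist. As a hedged sketch of the other case, the code below reindexes with a label that is not in the Series ('z', chosen arbitrarily for illustration): reindex() returns a new Series in which that label has no value.

```python
import pandas as pd

s = pd.Series([56, 45, 90], index=['a', 'b', 'c'])

# 'z' is not a label of s, so its value is NaN in the result.
r = s.reindex(['c', 'a', 'z'])
print(r)

# fill_value supplies a default for missing labels instead of NaN.
r0 = s.reindex(['c', 'a', 'z'], fill_value=0)
print(r0)
```

Labels left out of the new index ('b' here) are simply dropped from the result, while s itself is unchanged.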