Introduction to Pandas for Data Analysis

Pandas is an open-source Python library built on NumPy, designed for data manipulation and analysis, providing powerful data structures like Series and DataFrame. It offers features such as efficient data handling, time-series functionality, and easy integration with NumPy. The document details the creation, manipulation, and analysis of data using Series and DataFrames, including methods for accessing, modifying, and performing statistical operations on data.

Uploaded by

Venkata Lokendra

PANDAS

 What is Pandas?
->It is an open-source Python library built on top of the NumPy library.
->It is designed for data manipulation, data analysis, and data cleaning.
It can also handle missing data.
->It provides flexible and powerful data structures such as Series and DataFrame.
->It is fast and offers high performance and productivity.
 Features of Pandas
->Fast and efficient data manipulation and analysis
->Time-series functionality
->Easy handling of missing data
->Fast merging and joining of data
->Flexible reshaping and pivoting of data
->Data can be loaded from different file types (for example CSV and Excel files)
->Integrates with NumPy
 Data Structures in Pandas
->Data structures are used to organize, retrieve, and manipulate data.
->Pandas has two main data structures: Series and DataFrame.
 What is Series
->A Series is a one-dimensional labeled array.
->It can hold any data type (int, string, or Python objects).
->Its axis labels are collectively known as the index.
->A Series is homogeneous: all elements share one dtype (mixed values are stored with dtype object).
->The values of a Series are mutable, so we can modify its elements. Its size is generally treated
as fixed once declared: operations that remove elements, such as drop, return a new Series.
->Syntax: pd.Series(data, index, dtype, copy)
->Parameters: data (required) = the input data, for example a list or a dictionary
index (optional) = the labels for the elements
dtype (optional) = the data type of the elements
copy (optional) = makes a copy of the input data
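A minimal sketch of the constructor parameters listed above. The marks data is hypothetical and only for illustration; it shows data, index and dtype passed by keyword, and one value being modified through its label.

```python
import pandas as pd

# Hypothetical sample data, not taken from the notes.
marks = pd.Series(data=[70, 85, 90], index=["a", "b", "c"], dtype="float64")

# Values are mutable: assigning through a label changes the element.
marks.loc["b"] = 88.5

# The index labels and the dtype are available as attributes.
labels = list(marks.index)
kind = str(marks.dtype)

print(marks)
print(labels, kind)
```

Passing dtype up front avoids a later conversion step when the intended type is already known.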
 Different ways to create a series in pandas
1. Creating an empty Series
import pandas as pd
print(pd.Series()) o/p Series([], dtype: object)
2. Creating a Series from an array
import numpy as np
series_array=np.array(['m','Mukesh','bf','gf'])
pd.Series(series_array)
o/p 0 m
1 Mukesh
2 bf
3 gf
dtype: object
3. Creating a Series from an array with a custom index
series_array=np.array(['m','Mukesh','bf','gf'])
pd.Series(series_array,index=[100,'Love',103,'No'])
O/p 100 m
Love Mukesh
103 bf
No gf
dtype: object

4. Creating a Series from List

values=['hi', 100,'Mukesh', 1000]   # named values so the built-in list is not shadowed
pd.Series(values)
O/p 0 hi
1 100
2 Mukesh
3 1000
dtype: object

5. Creating a series from dictionary

data={ 'k1': 1000,
'k2' : 2000,
'k3' : 3000,
'k4' : 4000 }   # named data so the built-in dict is not shadowed
pd.Series(data)
O/p k1 1000
k2 2000
k3 3000
k4 4000
dtype: int64

6. Creating a series using numpy functions

->np.linspace(start, stop, num)
nu_fn=pd.Series(np.linspace(3,33,3))   # 3 evenly spaced values from 3 to 33
nu_fn
O/p 0 3.0
1 18.0
2 33.0
dtype: float64

->np.random.rand(n)
nu_fn=pd.Series(np.random.rand(3))   # 3 random values in [0, 1); they differ on every run
nu_fn
O/p 0 0.487446
1 0.375540
2 0.011341
dtype: float64

7. Creating a series using range function

range_series=pd.Series(range(5))   # named range_series so the built-in range is not shadowed
range_series
O/p 0 0
1 1
2 2
3 3
4 4
dtype: int64
 Accessing Data using Series Position (iloc)
->To access the elements of a Series by position we use iloc (integer-based indexing).
->iloc allows you to select elements by their integer positions.
->Ex: data=[10,20,30,40,50]
pos=pd.Series(data,index=['A','B','C','D','E'])
pos.iloc[4] o/p 50
pos.iloc[-1] o/p 50
pos.iloc[:] o/p A 10
B 20
C 30
D 40
E 50
dtype: int64

pos.iloc[1:4:2] o/p B 20
D 40
 Retrieve the data using Label(index) name (loc)
->Here we use loc (label-based indexing).
->Ex: pos.loc['A'] o/p 10
pos.loc['A':'E'] (slicing) o/p all the elements, because a label slice includes both endpoints
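The difference between the two slicing styles can be sketched as follows, reusing the pos Series from the example above. The variable names by_position and by_label are only illustrative.

```python
import pandas as pd

pos = pd.Series([10, 20, 30, 40, 50], index=["A", "B", "C", "D", "E"])

# Position-based slice: the stop position (2) is excluded.
by_position = pos.iloc[0:2]

# Label-based slice: the stop label ('C') is included.
by_label = pos.loc["A":"C"]

print(by_position)
print(by_label)
```

So iloc[0:2] returns two elements while loc['A':'C'] returns three, which is a common source of off-by-one mistakes.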

 Changing the type of data

data=[1,2,3,4,5,0]
s=pd.Series(data,dtype=object)
O/p 0 1
1 2
2 3
3 4
4 5
5 0
dtype: object

data=[1,2,3,4,5,0]
s=pd.Series(data,dtype=bool)
O/p 0 True
1 True
2 True
3 True
4 True
5 False
dtype: bool
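The examples above set the type while creating the Series. As a hedged sketch, the type of an existing Series can also be changed afterwards with astype, which returns a new Series; the variable names below are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5, 0])

# astype returns a converted copy; the original Series keeps its dtype.
as_float = s.astype("float64")
as_bool = s.astype(bool)      # 0 becomes False, every other number becomes True
as_text = s.astype(str)       # each number becomes its string form

print(as_float.dtype, as_bool.tolist(), as_text.iloc[0])
```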
 What is DataFrame ?
->It is a data structure in the pandas library.
->It is a two-dimensional labeled data structure.
->It has labeled axes: both rows and columns have labels,
which makes it easier to access or manipulate specific data.
->It is heterogeneous: a DataFrame can contain columns of different
data types (int, float, string, object).
->Its size is mutable: we can add or remove rows and columns.
 Different ways to create a DataFrame
1. Creating an empty DataFrame:
print(pd.DataFrame())
O/p : Empty DataFrame
Columns: []
Index: []

2. Creating a DataFrame using a list:

values=['hii',1,2,3,'hello']
pd.DataFrame(values)
(or)
print(pd.DataFrame(values))

3. Creating a DataFrame using a list of lists:

list_list=[[1,'Mukesh'],[2,'data_Science'],[3,'job']]
pd.DataFrame(list_list,columns=['hii','Bye'])
4. Creating a DataFrame using a dictionary:
dic={'team':['India','SouthAfrica','Australia','England','New Zealand'],
'Ranking':[1,2,4,3,5]}
pd.DataFrame(dic)
5. Creating a DataFrame using a list of dictionaries:
list_dic=[{1:'Mukesh',2:'Bleson',3:'Srinivasan'},
{1:'Safa',2:'Sreya',3:'Fareedha'}]
pd.DataFrame(list_dic)
6. Creating a DataFrame from a pandas Series:
sd=pd.Series(['hhi',1,3,4])
pd.DataFrame(sd)
7. Creating a DataFrame using a dict of ndarrays:
se={1:np.array([1,2,3]),
'hi':np.array(['ji','ki','li']),
3:np.array([4,5,6])}
pd.DataFrame(se)
8. Creating a DataFrame using a dict of lists:
data = { 'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago'] }
pd.DataFrame(data)
 Column Selection
->It is a fundamental operation in data manipulation and analysis.
->Methods to select columns:
1. Selecting a single column: (using brackets)
data={'programming':['SQL','Python','Java','Html'],
'level of proficiency':[4,3,2,1],
'Trainers':['self-learn','Madha_kiran','Akila','self_learn']}
df=pd.DataFrame(data)
df

df['programming']     # single brackets return a Series

df[['programming']]   # double brackets return a one-column DataFrame

(Using dot notation; this works only when the column name is a valid Python identifier)

df.programming
2. Selecting multiple columns (using a list of column names)
df[['programming','Trainers']]

3. Selecting columns by label (loc) or by condition

df.loc[:, 'programming':'Trainers']   # df.loc[rows, columns]; the label slice includes 'Trainers'

4. Selecting columns by position (iloc)

df.iloc[:, 0:3]   # all rows, columns at positions 0, 1 and 2

5. Selecting columns by data type

df.select_dtypes(include=['int64'])   # 'int64' is the dtype pandas gives to the integer column here

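The heading for loc mentions selection by condition, but the notes only show a label slice. A minimal sketch of a condition-based selection, assuming the same sample table and a threshold of 3 chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "programming": ["SQL", "Python", "Java", "Html"],
    "level of proficiency": [4, 3, 2, 1],
    "Trainers": ["self-learn", "Madha_kiran", "Akila", "self_learn"],
})

# Boolean mask: True for rows whose proficiency is at least 3.
mask = df["level of proficiency"] >= 3

# Rows come from the mask, columns from the list of labels.
selected = df.loc[mask, ["programming", "Trainers"]]

print(selected)
```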

 Column Addition
1. Adding a new column with a scalar value:
data={'A':[1,2,3],'B':[4,5,6]}
df=pd.DataFrame(data)
df['C']=10   # the scalar is repeated for every row

df
2. Adding a new column using a list
df['D']=[9,8,7]   # the list length must match the number of rows
df

3. Adding a new column with the help of an ndarray

df['E']=np.array(['kii','kalkii','Prabhs'])

4. Adding a new column using arithmetic operations

df['F']=df['A']+df['B']

5. Joining DataFrames

dl=[[10,20,30],[40,50,60],[70,80,90]]
ds=pd.DataFrame(dl)
ds
df=df.join(ds)   # adds the columns of ds (0, 1, 2) by matching row index
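Two further ways to add a column, shown here as a hedged sketch: assign returns a new DataFrame and leaves the original untouched, while insert places a column at a chosen position in place. The column names Total and ID are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# assign returns a new DataFrame; df itself is not modified by this call.
with_total = df.assign(Total=df["A"] + df["B"])

# insert modifies df in place and puts the new column at position 0.
df.insert(0, "ID", [101, 102, 103])

print(with_total)
print(df)
```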
 Column deletion:
1. Using the drop function
1.1 Dropping a single column:
data={'A':[1,2,3,4],
'B':[5,6,7,8],
'C':[9,10,11,12],
'D':[10,20,30,40],
'E':[40,50,60,70]}
df=pd.DataFrame(data)
df=df.drop(columns=['E'])
df

1.2 Dropping multiple columns:

df.drop(columns=['D','C'],inplace=True)
df

inplace=True modifies the original DataFrame instead of returning a new one.

2. Using the del keyword
del df['A']   # 'E' was already dropped in 1.1, so a remaining column is deleted here
df

3. Using the pop method:

The pop method removes a column and returns it as a Series.
a=df.pop('B')
a
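A small sketch contrasting the deletion styles above: drop without inplace leaves the original DataFrame unchanged, while pop removes the column from it. The three-column table is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# drop returns a new DataFrame; df still has all three columns afterwards.
trimmed = df.drop(columns=["C"])

# pop removes 'B' from df itself and hands the column back as a Series.
popped = df.pop("B")

print(list(trimmed.columns), list(df.columns), popped.tolist())
```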
 Descriptive Statistics
data = { 'A': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'B': [5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
'C': [9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
'D': [13, 14, 15, 16, 17, 18, 19, 20, 21, 22] }
df=pd.DataFrame(data)
df

1. describe()
df.describe()

2. mean()
mean_values=df.mean()
mean_values
3. median()
median_values=df.median()
median_values

4. Standard deviation: std()
std_=df.std()
std_

5. Variance: var()
var_=df.var()
var_

6. Skewness: skew()
skew_=df.skew()
skew_

7. Kurtosis: kurt()
kurt_=df.kurt()
kurt_
8. min()
min_=df.min()
min_

9. max()
max_=df.max()
max_

10. quantile()
quantile_=df.quantile([0.25,0.5,0.75])
quantile_

q1_A=df['A'].quantile(0.25)
q1_A

q3_D=df['D'].quantile(0.75)
q3_D
11. Covariance: cov()
cov_=df.cov()
cov_

12. Correlation: corr()
corr_=df.corr()
corr_

13. sum()
sum_=df.sum()
sum_

14. count()
count_=df.count()
count_
15. cumsum()
-> It calculates the cumulative sum of the elements along a given axis.
cumsum_=df.cumsum()
cumsum_

Column A: 0 : 1 , 1 : 1+2=3 , 2 : 3+3=6 , 3 : 6+4=10 , 4 : 10+5=15 , ...

cumsum_=df.cumsum(axis=1)   # axis=1 accumulates horizontally, across the columns of each row
cumsum_

16. cummin() -> cumulative minimum
17. cummax() -> cumulative maximum
18. cumprod() -> cumulative product
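The notes list cummin, cummax and cumprod without an example. A minimal sketch on a small hypothetical Series:

```python
import pandas as pd

s = pd.Series([3, 1, 4, 2])

running_min = s.cummin()    # smallest value seen so far: 3, 1, 1, 1
running_max = s.cummax()    # largest value seen so far: 3, 3, 4, 4
running_prod = s.cumprod()  # running product: 3, 3, 12, 24

print(running_min.tolist(), running_max.tolist(), running_prod.tolist())
```

The same methods exist on a DataFrame, where they work column by column unless axis=1 is passed.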
Iteration:
Iterating over a DataFrame:
iterrows()
Ex:
dic={'stu_id' : ['C1','C2','C3','C4'],
'Tool_Proficiency' : ['Power BI','Tableau','Excel','Sql'],
'Ratings' : [4,5,4,3]}
df=pd.DataFrame(dic)

for index, row in df.iterrows():
    print(f'Index: {index}')
    print(f"stu_id: {row['stu_id']}, Tool_Proficiency: {row['Tool_Proficiency']}, Ratings: {row['Ratings']}")
    print(f"Row as Series:\n{row}\n")

-It returns (index, Series) pairs, one per row (each row is converted into a Series object).
-It allows you to access the row's data using column names. Ex: {row['stu_id']}
-Index: the index label of the row
-Series: each row of the DataFrame is returned as a Series object
Row as Series:
stu_id C1
Tool_Proficiency Power BI
Ratings 4
Name: 0, dtype: object (this is the Series object)
-iterrows() is slower compared to itertuples(),
because iterrows() converts each row into a Series object.

->itertuples()
for row in df.itertuples():
    print(f"stu_id: {row.stu_id}, Tool_Proficiency: {row.Tool_Proficiency}, Ratings: {row.Ratings}")

->It returns each row as a named tuple.

->It includes the index as the first field (Index) by default; we can exclude it by passing index=False.
->Accessing the row data: using dot notation {row.stu_id}

items()
->We iterate over the DataFrame column by column.
->For each column we get the column name and a Series (the column data).
for col_name,col_data in df.items():
    print(f"Column: {col_name}")
    print(col_data)
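A hedged sketch of two points related to the iteration methods above: the index and name parameters of itertuples, and a vectorized selection that avoids the loop altogether. The tuple name Student and the rating threshold are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "stu_id": ["C1", "C2", "C3", "C4"],
    "Ratings": [4, 5, 4, 3],
})

# index=False leaves the Index field out; name sets the named tuple's type name.
rows = list(df.itertuples(index=False, name="Student"))
first = rows[0]

# Vectorized alternative to looping: filter with a boolean condition.
high_rated = df.loc[df["Ratings"] >= 4, "stu_id"].tolist()

print(first)
print(high_rated)
```

For simple filters and arithmetic, the vectorized form is usually both shorter and faster than any row-by-row loop.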
Sorting
->Sorting means arranging data in a specific order, such as ascending or descending order.
->We can sort data of types such as numbers, strings, and complex objects.
->Common sorting algorithms: bubble sort, merge sort, quick sort, insertion sort

Sorting by Values :
->Sort the DataFrame by one or more columns
df = pd.DataFrame({
'Name': ['John', 'Anna', 'Peter', 'Linda'],
'Age': [28, 24, 35, 32],
'City': ['New York', 'Paris', 'Berlin', 'London']
})
sort=df.sort_values(by='Name')
sort = df.sort_values(by='Age',ascending=False) # by default ascending=True
sort=df.sort_values(by=['City','Age'],ascending=False) # sorting by multiple columns

Sorting by Index :
sort=df.sort_index(ascending=False)
# sorts the rows by their index labels, in descending order
sort =df.sort_index(axis=1)
# sorts the columns by their labels

Sorting values in place:

->We use inplace=True to modify the original DataFrame
df.sort_values(by='City',inplace=True)

Sorting a Series :

s = pd.Series([3, 1, 4, 2], index=['d', 'b', 'a', 'c'])
sorts=s.sort_values(ascending=False)
# Sorting by the values
sorts=s.sort_index(ascending=False)
# Sorting by the index
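When sorting by several columns, ascending can also be a list with one flag per column. A minimal sketch, assuming hypothetical data in which cities repeat so that the second key matters:

```python
import pandas as pd

# Hypothetical data: two people per city, so 'Age' breaks the ties.
df = pd.DataFrame({
    "Name": ["John", "Anna", "Peter", "Linda"],
    "Age": [28, 24, 35, 32],
    "City": ["Paris", "Paris", "Berlin", "Berlin"],
})

# City ascending (A to Z), then Age descending within each city.
ordered = df.sort_values(by=["City", "Age"], ascending=[True, False])

print(ordered)
```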
Groupby
->It is used to split the data into groups based on some criteria
and apply a function to each group independently.
->We use groupby for aggregating data (sum, mean, count, max, min).

->Syntax : df.groupby('col_name').function()

Grouping by a single column

df = pd.DataFrame({
'Product': ['A', 'B', 'A', 'B', 'A', 'B'],
'Region': ['North', 'North', 'South', 'South', 'North', 'South'],
'Sales': [100, 200, 150, 250, 120, 300]
})
group =df.groupby('Region')['Sales'].sum()   # select 'Sales' so the text column 'Product' is not concatenated

Grouping by multiple columns:

group =df.groupby(['Region','Product']).sum()

Applying multiple aggregation functions

group= df.groupby('Region').agg({'Sales':['sum','mean','max','count']})

Resetting the index :

->After grouping, the group labels become the index. We can reset the index to turn them back into a regular column.
group= df.groupby('Region').agg({'Sales':['sum','mean','max','count']}).reset_index()
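The dictionary form of agg produces two-level column names. As a hedged alternative sketch, named aggregation gives flat column names directly; total_sales and average_sales are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Product": ["A", "B", "A", "B", "A", "B"],
    "Region": ["North", "North", "South", "South", "North", "South"],
    "Sales": [100, 200, 150, 250, 120, 300],
})

# Named aggregation: new_column=(source_column, function).
summary = (
    df.groupby("Region")
      .agg(total_sales=("Sales", "sum"), average_sales=("Sales", "mean"))
      .reset_index()
)

print(summary)
```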
Merging/Joining DataFrames
->These operations combine multiple DataFrames into a single DataFrame based on common keys or columns.

Concatenating DataFrames:

df1 = pd.DataFrame({
'ID': [1, 2, 3],
'Name': ['Alice', 'Bob', 'Charlie']
})

df2 = pd.DataFrame({
'ID': [4, 5, 6],
'Name': ['David', 'Edward', 'Frank']
})
concat_df=pd.concat([df1,df2])
concat_df

Merge function :
->Combines multiple DataFrames based on one or more keys
->Syntax : pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None)
df1 = pd.DataFrame({
'ID': [1, 2, 3, 4],
'Name': ['Alice', 'Bob', 'Charlie', 'David']
})
df2 = pd.DataFrame({
'ID': [3, 4, 5, 6],
'Score': [85, 90, 75, 60]
})
merge=pd.merge(df1,df2, on='ID',how='right')   # keeps every ID from df2 (3, 4, 5, 6); Name is NaN for 5 and 6
merge
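To make the effect of the how parameter concrete, here is a minimal sketch that merges the same two tables with each join type and counts the resulting rows. The indicator=True option adds a _merge column showing where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3, 4], "Name": ["Alice", "Bob", "Charlie", "David"]})
df2 = pd.DataFrame({"ID": [3, 4, 5, 6], "Score": [85, 90, 75, 60]})

# Row count produced by each join type on the shared 'ID' key.
row_counts = {
    how: len(pd.merge(df1, df2, on="ID", how=how))
    for how in ["inner", "left", "right", "outer"]
}

# indicator=True records whether a row was found in the left table, the right table, or both.
outer = pd.merge(df1, df2, on="ID", how="outer", indicator=True)
origin = outer.set_index("ID")["_merge"].astype(str).to_dict()

print(row_counts)
print(origin)
```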

Join function:
->The join function is used to join DataFrames based on their indexes or on a key column.
->Syntax : left_df.join(right_df, on=None, how='left', lsuffix='', rsuffix='', sort=False)

-> #Using set_index while creating the DataFrames

df1 = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'ID': [1, 2, 3, 4]
}).set_index('ID')

# DataFrame 2
df2 = pd.DataFrame({
'Score': [85, 90, 75, 60],
'ID': [3, 4, 5, 6]
}).set_index('ID')
join=df1.join(df2,how='left')   # keeps every ID from df1; Score is NaN for IDs 1 and 2
join
-> #Using set_index while performing the join
df1 = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'ID': [1, 2, 3, 4]
})

df2 = pd.DataFrame({
'Score': [85, 90, 75, 60],
'ID': [3, 4, 5, 6]
})

# Join DataFrames on the 'ID' column

join = df1.set_index('ID').join(df2.set_index('ID'),how='outer')
join
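The syntax line above lists lsuffix and rsuffix but the examples do not use them. A hedged sketch, assuming two hypothetical tables that both contain a Score column, so suffixes are needed to tell the columns apart:

```python
import pandas as pd

# Hypothetical tables with an overlapping column name ('Score').
left = pd.DataFrame({"Score": [85, 90]}, index=pd.Index([1, 2], name="ID"))
right = pd.DataFrame({"Score": [70, 95]}, index=pd.Index([2, 3], name="ID"))

# Without suffixes this join would raise an error because 'Score' appears on both sides.
joined = left.join(right, how="inner", lsuffix="_term1", rsuffix="_term2")

print(joined)
```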

set_index() = the function used to set one or more columns of a DataFrame as the index (row
labels).
Syntax : df.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)
Setting a single column as the index :
data = {
'ID': [1, 2, 3, 4],
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Score': [85, 90, 75, 60]
}

df = pd.DataFrame(data)

set_1 = df.set_index('ID')
set_1

Setting multiple columns as the index:

set_2 =df.set_index(['ID','Score'])
set_2

Keeping the original column

set_keep=df.set_index('ID',drop=False)
set_keep

Resetting the index :

set_reset=set_1.reset_index()   # moves 'ID' from the index back into a regular column
set_reset
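One practical reason to set an index is label-based lookup. A minimal sketch using the same table, where by_id and restored are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Score": [85, 90, 75, 60],
})

by_id = df.set_index("ID")

# With 'ID' as the index, a row can be looked up directly by its ID.
name = by_id.loc[3, "Name"]

# reset_index turns the index back into an ordinary column.
restored = by_id.reset_index()

print(name)
print(list(restored.columns))
```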
Concatenation
->It is used to combine multiple sources into one DataFrame.
->Syntax : result = pd.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None,
names=None, verify_integrity=False, sort=False)

Concatenation along rows (vertical concatenation)

df1 = pd.DataFrame({
'A': ['A0', 'A1', 'A2'],
'B': ['B0', 'B1', 'B2']
})

df2 = pd.DataFrame({
'A': ['A3', 'A4', 'A5'],
'B': ['B3', 'B4', 'B5']
})
concat=pd.concat([df1,df2],axis=0,ignore_index=True)
concat

ignore_index : controls how the index is handled while concatenating.

By default (ignore_index=False) a vertical concatenation keeps the original index of each DataFrame.
Pass ignore_index=True to renumber the index sequentially (0, 1, 2, ...).

Concatenation along columns (horizontal concatenation)

concat=pd.concat([df1,df2],axis=1,ignore_index=True)   # with axis=1, ignore_index=True relabels the columns 0, 1, 2, 3
concat

Concatenating with their own indexes

df3= pd.DataFrame({
'A': ['A0','A1','A2'],
'B':['B0','B1','B2']}, index=[0,1,2])
df4 = pd.DataFrame({
'A': ['A3', 'A4', 'A5'],
'B': ['B3', 'B4', 'B5']
}, index=[3, 4, 5])
result=pd.concat([df3,df4])
result

Concatenating with keys : we can create a hierarchical index in the DataFrame

result=pd.concat([df1,df2],keys=['df1','df2'],axis=1)   # with axis=0 the keys label the rows instead
result
Concatenating with different columns
df5 = pd.DataFrame({
'A': ['A0', 'A1', 'A2'],
'B': ['B0', 'B1', 'B2']
})

df6 = pd.DataFrame({
'A': ['A3', 'A4', 'A5'],
'C': ['C3', 'C4', 'C5']
})
result = pd.concat([df5, df6], axis=0)   # columns missing from one DataFrame are filled with NaN
result
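The join parameter from the syntax line decides what happens to columns that are not shared. A minimal sketch comparing the default join='outer' with join='inner' on the same two tables:

```python
import pandas as pd

df5 = pd.DataFrame({"A": ["A0", "A1", "A2"], "B": ["B0", "B1", "B2"]})
df6 = pd.DataFrame({"A": ["A3", "A4", "A5"], "C": ["C3", "C4", "C5"]})

# outer (default): keep the union of the columns and fill the gaps with NaN.
outer = pd.concat([df5, df6], axis=0, ignore_index=True)

# inner: keep only the columns present in every DataFrame.
inner = pd.concat([df5, df6], axis=0, join="inner", ignore_index=True)

print(outer)
print(inner)
```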
