18_Pandas
Python Library
Learning objectives
• What is pandas?
• Key features of pandas
• Working with Pandas
• Pandas – data structure
• Series and DataFrame
• Data analysis
• Data manipulation
What is Pandas?
• In 2008, Wes McKinney began developing pandas because he needed
a high-performance, flexible tool for data analysis.
• pandas provides tools for loading data from different file formats
into in-memory data objects.
Series:
• Key points:
• Homogeneous data
• Size immutable
• Values of data mutable
• A pandas Series can be created as follows:
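The Series example itself appears to be missing from this slide, so here is a minimal sketch of one way to create a Series. The values and the index labels 'a' to 'd' are illustrative assumptions, not taken from the original material.

```python
import pandas as pd

# Illustrative values; the index labels 'a'..'d' are arbitrary.
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s)

# Values are mutable: an existing element can be overwritten.
s['b'] = 25
print(s['b'])

# All elements share a single dtype (homogeneous data).
print(s.dtype)
```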
import pandas as pd
df5 = pd.read_csv('heart.csv')
print(df5)
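The file heart.csv is not included with these slides, so the same call can be tried on in-memory text instead. This is a sketch only: the column names and values below are hypothetical and are not the contents of heart.csv.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "age,chol,target\n63,233,1\n37,250,1\n41,204,0\n"

# read_csv accepts a file path or any file-like object.
df_demo = pd.read_csv(io.StringIO(csv_text))
print(df_demo.head())   # first rows of the DataFrame
print(df_demo.shape)    # (number of rows, number of columns)
```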
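import pandas as pd
# Sample values for illustration (not defined on the original slide)
names = ['arif', 'vijay']
ages = [44, 42]
cities = ['muscat', 'salalah']
# Create DataFrame
df = pd.DataFrame({'Name': names, 'Age': ages, 'City': cities})
print(df)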
Creating a DataFrame with Index Labels
data = {
'Name': ['arif','vijay'],
'Age': [44,42],
'City': ['muscat','salalah']
}
# Create DataFrame
df = pd.DataFrame(data, index=['A','B'])
print(df)
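Index labels are useful mainly because rows can then be looked up by label. A short sketch using the same data as above, with `.loc` for label-based selection:

```python
import pandas as pd

data = {
    'Name': ['arif', 'vijay'],
    'Age': [44, 42],
    'City': ['muscat', 'salalah']
}
df = pd.DataFrame(data, index=['A', 'B'])

# Select a whole row by its index label.
row_a = df.loc['A']
print(row_a)

# Select a single value by row label and column name.
print(df.loc['B', 'City'])
```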
Creating an empty DataFrame and then adding
columns and data
import pandas as pd
df = pd.DataFrame()
df['Empno'] = [101,102,103,104]
df['Ename'] = ['Ali','Mohammed','Nasser','Abdullah']
df['Salary']= [2000,3000,5000,7000]
print(df)
Data Manipulation using pandas
Filtering, Sorting, Specific columns, Grouping etc.
Filtering Data
import pandas as pd
df = pd.read_csv('Admission.csv')
admit_filter = df[df['admitted'] == 1]
print(admit_filter)
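Filters can also combine several conditions. The sketch below builds a small DataFrame in memory so that it runs without Admission.csv. The rows are hypothetical and only mimic the column names used on these slides.

```python
import pandas as pd

# Hypothetical rows; only the column names follow the slides.
df = pd.DataFrame({
    'gmat': [780, 690, 710, 590, 660],
    'gpa': [4.0, 3.3, 3.7, 2.3, 3.3],
    'work_experience': [3, 5, 3, 1, 5],
    'admitted': [1, 1, 1, 0, 0],
})

# Combine conditions with & (and) or | (or).
# Each condition must be wrapped in parentheses.
admitted_high_gpa = df[(df['admitted'] == 1) & (df['gpa'] >= 3.5)]
print(admitted_high_gpa)
```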
Sorting Data
import pandas as pd
df = pd.read_csv('Admission.csv')
sort_on_gpa = df.sort_values(by='gpa')
print(sort_on_gpa)
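By default `sort_values` sorts in ascending order. A minimal sketch of a descending sort on two columns, using hypothetical in-memory rows instead of Admission.csv:

```python
import pandas as pd

# Hypothetical rows for illustration.
df = pd.DataFrame({
    'gmat': [780, 690, 710, 590],
    'gpa': [4.0, 3.3, 3.7, 3.3],
})

# Highest gpa first; ties on gpa are broken by gmat, also descending.
sorted_df = df.sort_values(by=['gpa', 'gmat'], ascending=[False, False])
print(sorted_df)
```

Note that `sort_values` returns a new DataFrame and leaves `df` unchanged.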
Display single column
import pandas as pd
df = pd.read_csv('Admission.csv')
select_gmat = df['gmat']
print(select_gmat)
Display Specific columns only
import pandas as pd
df = pd.read_csv('Admission.csv')
selected_columns = df[['gmat','gpa','admitted']]
print(selected_columns)
Group data by work experience and get the
average gpa score
import pandas as pd
df = pd.read_csv('Admission.csv')
group_by_exp = df.groupby('work_experience')['gpa'].mean()
print(group_by_exp)
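More than one statistic per group can be computed with `agg`. The sketch below uses hypothetical in-memory rows so that the grouping can be checked by hand.

```python
import pandas as pd

# Hypothetical rows for illustration.
df = pd.DataFrame({
    'work_experience': [3, 5, 3, 1, 5],
    'gpa': [4.0, 3.3, 3.7, 2.3, 3.3],
})

# For each work_experience value, compute the mean gpa and the group size.
summary = df.groupby('work_experience')['gpa'].agg(['mean', 'count'])
print(summary)
```

The result is indexed by the distinct `work_experience` values, with one column per statistic.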
Get the statistics
import pandas as pd
df = pd.read_csv('Admission.csv')
print(df.describe())
Display information about DataFrame
import pandas as pd
df = pd.read_csv('Admission.csv')
df.info()  # info() prints the summary itself and returns None
Drop rows with any missing values
import pandas as pd
df = pd.read_csv('Admission.csv')
df_cleaned = df.dropna()
print(df_cleaned)
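Before dropping rows, it can help to count how many values are missing in each column. A minimal sketch with hypothetical data, where NaN marks a missing value:

```python
import pandas as pd

nan = float('nan')  # missing-value marker

# Hypothetical rows for illustration; two of them contain a missing value.
df = pd.DataFrame({
    'gmat': [780, nan, 710],
    'gpa': [4.0, 3.3, nan],
})

# Count missing values per column.
missing_counts = df.isna().sum()
print(missing_counts)

# dropna() returns a new DataFrame; the original df is not modified.
df_cleaned = df.dropna()
print(df_cleaned)
```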
You should now have learnt:
• What is pandas?
• Key features of pandas
• Working with Pandas
• Pandas – data structure
• Series and DataFrame
• Data analysis
• Data manipulation