Pandas Data Structures: Sections
Pandas is built on NumPy and provides easy-to-use data structures and data analysis
tools for the Python programming language.
Check out the sections below to learn the various functions and tools Pandas offers.
Sections:
5. DataFrame Summary
6. Selection
7. Applying Functions
8. Data Alignment
9. In/Out
Dropping
In this section, you’ll learn how to remove specific values from a Series, and how to
remove columns or rows from a Data Frame.
s and df in the code below are used as examples of a Series and Data Frame
throughout this section.
>>> s
a 6
b -5
c 7
d 4
>>> df
Country Capital Population
0 Belgium Brussels 111907
1 India New Delhi 1303021
2 Brazil Brasilia 208476
>>> s.drop(['a','c'])
b -5
d 4
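The section promises row and column removal for a Data Frame, while the snippet only covers the Series case. A minimal sketch of all three follows. The values in df are reconstructed from the output shown in this cheat sheet; the country "India" and the integer population values are assumptions.

```python
import pandas as pd

# Example data mirroring s and df from this section.
# "India" and the integer populations are assumptions, not taken from the source.
s = pd.Series([6, -5, 7, 4], index=["a", "b", "c", "d"])
df = pd.DataFrame({
    "Country": ["Belgium", "India", "Brazil"],
    "Capital": ["Brussels", "New Delhi", "Brasilia"],
    "Population": [111907, 1303021, 208476],
})

# Drop values from a Series by index label.
s_dropped = s.drop(["a", "c"])

# Drop rows from a DataFrame by index label (rows are the default axis).
df_without_rows = df.drop([0, 1])

# Drop a column by name.
df_without_country = df.drop(columns="Country")

print(s_dropped)
print(df_without_rows)
print(df_without_country)
```

drop returns a new object each time, so s and df themselves are left unchanged.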
df in the code below is used as an example Data Frame throughout this section.
>>> df
>>> df.sort_index()
>>> df.rank()
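As a rough illustration of what these calls return, here is a small sketch. The shuffled index is a hypothetical choice made only so that sorting has a visible effect, "India" is an assumed value, and sort_values is added for comparison even though it is not shown in this section.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Country": ["Belgium", "India", "Brazil"],  # "India" is an assumption
        "Capital": ["Brussels", "New Delhi", "Brasilia"],
        "Population": [111907, 1303021, 208476],
    },
    index=[2, 0, 1],  # deliberately unsorted (hypothetical) index
)

by_label = df.sort_index()                   # order rows by index label
by_value = df.sort_values(by="Population")   # order rows by a column's values
pop_rank = df["Population"].rank()           # rank 1.0 is the smallest value

print(by_label)
print(by_value)
print(pop_rank)
```

Ranking is shown on a single numeric column here to keep the result easy to read.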
df in the code below is used as an example Data Frame throughout this section.
>>> df
(rows, columns)
>>> df.shape
(3, 3)
Describe index
>>> df.index
Describe columns
>>> df.columns
Info on DataFrame
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
dtypes: object(3)
Number of non-NA values
>>> df.count()
Country 3
Capital 3
Population 3
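The same information can be captured in variables instead of being read off the console. The sketch below assumes the example df used in this cheat sheet, with "India" and the integer populations filled in as assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Belgium", "India", "Brazil"],  # "India" is an assumption
    "Capital": ["Brussels", "New Delhi", "Brasilia"],
    "Population": [111907, 1303021, 208476],
})

n_rows, n_cols = df.shape       # (rows, columns)
row_labels = list(df.index)     # index labels as a plain list
col_labels = list(df.columns)   # column names as a plain list
non_null = df.count()           # number of non-NA values per column

print(n_rows, n_cols)
print(row_labels, col_labels)
print(non_null)
df.info()                       # prints a summary and returns None
```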
DataFrame Summary
In this section, you’ll learn how to retrieve summary statistics of a Data Frame which
include the sum of each column, min/max values of each column, mean values of
each column, and others.
df in the code below is used as an example of a Data Frame throughout this section.
>>> df
Even Odd
0 2 1
1 4 3
2 6 5
Sum of values
>>> df.sum()
Even 12
Odd 9
Cumulative sum of values
>>> df.cumsum()
Even Odd
0 2 1
1 6 4
2 12 9
Minimum value
>>> df.min()
Even 2
Odd 1
Maximum value
>>> df.max()
Even 6
Odd 5
Summary statistics
>>> df.describe()
Even Odd
count 3.0 3.0
Mean of values
>>> df.mean()
Even 4.0
Odd 3.0
Median of values
>>> df.median()
Even 4.0
Odd 3.0
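The summary methods can also be combined in a short script. This sketch rebuilds the Even/Odd example and shows that describe() returns a table whose rows can be looked up by label.

```python
import pandas as pd

df = pd.DataFrame({"Even": [2, 4, 6], "Odd": [1, 3, 5]})

totals = df.sum()        # one total per column
running = df.cumsum()    # running total down each column
stats = df.describe()    # count, mean, std, min, quartiles, max per column

# Individual statistics can be read from the describe() table by row label.
even_std = stats.loc["std", "Even"]
odd_mean = stats.loc["mean", "Odd"]

print(totals)
print(running)
print(stats)
```

Note that describe() reports the sample standard deviation, which is why it needs at least two rows to be meaningful.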
Selection
In this section, you’ll learn how to retrieve specific values from a Series and Data
Frame.
s and df in the code below are used as examples of a Series and Data Frame
throughout this section.
>>> s
a 6
b -5
c 7
d 4
>>> df
>>> s['b']
-5
>>> df[1:]
>>> df.iloc[0,0]
'Belgium'
>>> df.loc[2]
Country Brazil
Capital Brasilia
Population 208476
>>> df.loc[:,'Capital']
0 Brussels
1 New Delhi
2 Brasilia
>>> df.loc[1,'Capital']
'New Delhi'
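For reference, here is a self-contained sketch of the selection styles in this section. It uses .loc for label-based access because the older .ix indexer was removed in pandas 1.0. The boolean filter at the end is an extra illustration, and "India" plus the integer populations are assumed values.

```python
import pandas as pd

s = pd.Series([6, -5, 7, 4], index=["a", "b", "c", "d"])
df = pd.DataFrame({
    "Country": ["Belgium", "India", "Brazil"],  # "India" is an assumption
    "Capital": ["Brussels", "New Delhi", "Brasilia"],
    "Population": [111907, 1303021, 208476],
})

by_label = s["b"]                  # one Series element by its label
tail_rows = df[1:]                 # slice of rows from position 1 onward
by_position = df.iloc[0, 0]        # row 0, column 0 by integer position
row_2 = df.loc[2]                  # the row labelled 2, returned as a Series
capitals = df.loc[:, "Capital"]    # one column, all rows
one_cell = df.loc[1, "Capital"]    # a single cell by row and column label
above_one = s[s > 1]               # boolean mask keeps values greater than 1

print(by_label, by_position, one_cell)
print(tail_rows)
print(row_2)
print(above_one)
```

With the default integer index, .loc and .iloc happen to agree on row numbers; they differ as soon as the index labels are not 0, 1, 2, and so on.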
Applying Functions
In this section, you’ll learn how to apply a function to all values of a Data Frame or a
specific column.
df in the code below is used as an example of a Data Frame throughout this section.
>>> df
Even Odd
0 2 1
1 4 3
2 6 5
Apply function
>>> df.apply(lambda x: x*2)
Even Odd
0 4 2
1 8 6
2 12 10
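The doubled output shown for this section can be reproduced as follows. This is a sketch: the helper name double is hypothetical, and the per-column range is an extra example of a function that reduces each column to one value.

```python
import pandas as pd

df = pd.DataFrame({"Even": [2, 4, 6], "Odd": [1, 3, 5]})


def double(x):
    """Hypothetical helper: multiply the input by two."""
    return x * 2


# On a DataFrame, apply passes each column to the function as a Series.
doubled = df.apply(double)

# On a single column (a Series), apply calls the function once per element.
even_doubled = df["Even"].apply(double)

# A function that returns one number per column yields a Series of results.
col_range = df.apply(lambda col: col.max() - col.min())

print(doubled)
print(even_doubled)
print(col_range)
```

For simple arithmetic like this, writing df * 2 gives the same result and is usually faster; apply is most useful when no built-in operation fits.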
Data Alignment
In this section, you’ll learn how to add, subtract, and divide two series that have
different indexes from one another.
s and s3 in the code below are used as examples of Series throughout this section.
>>> s
a 6
b -5
c 7
d 4
>>> s3
a 7
c -2
d 3
>>> s + s3
a 13.0
b NaN
c 5.0
d 7.0
>>> s.add(s3, fill_value=0)
a 13.0
b -5.0
c 5.0
d 7.0
>>> s.div(s3, fill_value=4)
a 0.857143
b -1.250000
c -3.500000
d 1.333333
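The alignment behaviour can be checked with a short script. The fill values for add and div match the outputs shown in this section; the fill value of 2 for sub is an assumption picked only for illustration.

```python
import pandas as pd

s = pd.Series([6, -5, 7, 4], index=["a", "b", "c", "d"])
s3 = pd.Series([7, -2, 3], index=["a", "c", "d"])

# The + operator aligns on index labels; "b" is missing from s3, so it is NaN.
plain = s + s3

# fill_value stands in for the missing side before the operation runs.
added = s.add(s3, fill_value=0)    # b: -5 + 0
subbed = s.sub(s3, fill_value=2)   # b: -5 - 2 (the 2 is an assumed fill value)
divided = s.div(s3, fill_value=4)  # b: -5 / 4

print(plain)
print(added)
print(subbed)
print(divided)
```

The results are floats even though the inputs are integers, because the aligned intermediate values must be able to hold NaN.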
In/Out
In this section, you’ll learn how to read a CSV file, an Excel file, and the result of a
SQL query into Python using Pandas. You will also learn how to export a Data Frame
from Pandas into a CSV file, an Excel file, and a SQL table.
>>> pd.read_csv('file.csv')
>>> df.to_csv('myDataFrame.csv')
>>> pd.read_excel('file.xlsx')
>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite:///:memory:')
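A round trip makes the read and write calls easier to follow. The sketch below uses an in-memory text buffer and the standard-library sqlite3 module in place of files on disk and SQLAlchemy, so it runs without extra setup; the table name numbers is hypothetical. Excel is left out because read_excel and to_excel need an additional engine package such as openpyxl.

```python
import io
import sqlite3

import pandas as pd

df = pd.DataFrame({"Even": [2, 4, 6], "Odd": [1, 3, 5]})

# CSV round trip through an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)   # index=False leaves the row labels out
buffer.seek(0)
from_csv = pd.read_csv(buffer)

# SQL round trip using an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
df.to_sql("numbers", conn, index=False)   # "numbers" is a hypothetical table name
from_sql = pd.read_sql("SELECT * FROM numbers WHERE Even > 2", conn)
conn.close()

print(from_csv)
print(from_sql)
```

Passing a file path such as 'file.csv' in place of the buffer works the same way.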
Python is the top dog in data science for now and for the foreseeable future.
Knowledge of Pandas, one of its most powerful libraries, is often a requirement
for Data Scientists today.
Use this cheat sheet as a guide in the beginning and come back to it when needed, and
you’ll be well on your way to mastering the Pandas library.