Class Notes: Class: XII Date: 7-Apr-2020 Subject: Informatics Practices Topic: 2. Python Pandas

Uploaded by

Ravi Ramrakhani

Introduction
Pandas, or Python Pandas, is Python's library for data analysis. Pandas derives its name from
"panel data", an econometrics term for multi-dimensional, structured data sets.
Today, Pandas has become a popular choice for data analysis. Data analysis refers to the
process of evaluating large data sets using analytical and statistical tools in order to
discover useful information and draw conclusions that support business decision-making.

Pandas provides various tools for data analysis and makes the process simpler and easier
than many other available tools. The main author of Pandas is Wes McKinney.

Installing pandas
Go to the search bar on your desktop and search for cmd. An application called Command Prompt
should show up. Click to start it.

Type in the command "pip install pandas".

Wait for the downloads to finish. Once they are done, you will be able to use Pandas inside your
Python programs on Windows.

NB.: This sheet is prepared from home.


Note: Each time we want to use pandas in a Python program, we need to write this line of code at the
top of the program:

import pandas as <identifier_name>

The above statement imports the pandas library into our program under the given name.
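
As a minimal sketch of the import statement, the identifier chosen here is pd, the conventional alias that the later examples in these notes also use. Printing the version is only a quick way to confirm that the installation worked.

```python
# Import pandas under the conventional alias "pd".
import pandas as pd

# pd.__version__ holds the installed version as a string;
# printing it is a quick check that the installation worked.
print(pd.__version__)
```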

We will use two different pandas data structures in our programs:


1. Series
2. DataFrames

1. Series
A Series is a Pandas data structure that represents a one-dimensional array-like object containing an
array of data (of any NumPy data type) and an associated array of data labels, called its index.
A Series object has two main components:
> an array of actual data
> an associated array of indexes or data labels.
Both components are one-dimensional arrays of the same length. The index is used to access
individual data values.
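
The two components of a Series can be illustrated with a short sketch. The marks and the variable names below are hypothetical sample data, not taken from the notes.

```python
import pandas as pd

# Data values and their labels (hypothetical sample data).
marks = pd.Series([78, 91, 65], index=["Anita", "Sajal", "Ayaan"])

print(marks)           # shows each label next to its value
print(marks["Sajal"])  # access one value through its label

# Without an explicit index, pandas labels the values 0, 1, 2, ...
default = pd.Series([10, 20, 30])
print(default[0])      # access through the default label 0
```

Here marks.values holds the data array and marks.index holds the labels; both have the same length.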

2. DataFrames

A DataFrame is a two-dimensional, labelled, array-like pandas data structure that stores an ordered
collection of columns, where each column can hold data of a different type.

A Pandas DataFrame is similar to an Excel sheet, with the data arranged in rows and columns.
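
The idea that each column can hold a different type of data can be checked with a small sketch. The Height column and its values are hypothetical and are added only to show a third data type.

```python
import pandas as pd

# Hypothetical sample data: a text column, an integer column and a float column.
students = pd.DataFrame({
    "Name": ["Anita", "Sajal"],
    "Age": [14, 32],
    "Height": [1.52, 1.75],
})

print(students.shape)   # (number of rows, number of columns)
print(students.dtypes)  # data type of each column
```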


How to create a Pandas DataFrame?
In the real world, a Pandas DataFrame is usually created by loading a dataset from permanent
storage, including but not limited to Excel files, CSV files and MySQL databases.
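
As a rough sketch of loading from storage, the example below reads CSV data with pd.read_csv. To keep it self-contained, an in-memory string stands in for a file on disk; the CSV contents are hypothetical. Pandas also provides pd.read_excel and pd.read_sql for the other sources mentioned, which are not shown here.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk (hypothetical contents).
csv_text = "Name,Age\nAnita,14\nSajal,32\n"

# pd.read_csv accepts a file path or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

With a real file, the call would take the file's path instead of the StringIO object.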
First, we will use Python data structures (dictionary and list) to create a DataFrame.
Using a Python dictionary to create a DataFrame object
name_dict = { 'name' : ["Anita", "Sajal", "Ayaan", "Abhey"], 'age' : [14,32, 3, 6] }

If we print this dictionary with print(name_dict), the output looks like this:
{'name': ['Anita', 'Sajal', 'Ayaan', 'Abhey'], 'age': [14, 32, 3, 6]}

We can create a Pandas DataFrame from a dictionary like this one:

import pandas as pd
name_dict = { 'Name' : ["Anita", "Sajal", "Ayaan", "Abhey"], 'Age' : [14,32, 4, 6] }
df = pd.DataFrame(name_dict)
print(df)

Output
    Name  Age
0  Anita   14
1  Sajal   32
2  Ayaan    4
3  Abhey    6

As you can see, the output generated for the DataFrame object looks similar to an Excel sheet.
The only difference is that the default index value for the first row is 0 in a DataFrame,
whereas in an Excel sheet it is 1. We can also customize the index values as per our need.
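
The notes mention lists as a second Python data structure for building a DataFrame. A minimal sketch, assuming each inner list is one row and the column names are passed separately through the columns parameter:

```python
import pandas as pd

# Each inner list is one row; column names are given separately.
rows = [["Anita", 14], ["Sajal", 32], ["Ayaan", 4], ["Abhey", 6]]
df_from_list = pd.DataFrame(rows, columns=["Name", "Age"])
print(df_from_list)
```

The result has the same columns and the same default index (0 to 3) as the DataFrame built from the dictionary.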


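Note: In Python versions before 3.7, a dictionary did not guarantee the order of its keys, so the
columns of a DataFrame built from a dictionary could appear in a different order from the one written.
From Python 3.7 onwards, dictionaries preserve insertion order, so the columns appear in the order
in which the keys were written.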
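
On Python 3.7 and later, where dictionaries keep their insertion order, the column order can be checked with a short sketch like the one below. The columns parameter is one way to request an explicit column order instead of relying on the dictionary.

```python
import pandas as pd

name_dict = {'Name': ["Anita", "Sajal"], 'Age': [14, 32]}
df = pd.DataFrame(name_dict)

# On Python 3.7+, both lists follow the order in which the keys were written.
print(list(name_dict.keys()))
print(list(df.columns))

# The columns parameter fixes the column order explicitly.
reordered = pd.DataFrame(name_dict, columns=['Age', 'Name'])
print(list(reordered.columns))
```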

One more example of a DataFrame, this time with customized index values:


import pandas as pd
name_dict = { 'Name' : ["Anita", "Sajal", "Ayaan", "Abhey"], 'Age' : [14, 32, 4, 6] }
df = pd.DataFrame(name_dict, index=[1, 2, 3, 4])
print(df)

Output
    Name  Age
1  Anita   14
2  Sajal   32
3  Ayaan    4
4  Abhey    6
In the preceding output, the index values start from 1 instead of 0.
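
With a customized index, a row can be looked up either by its label or by its position. The following sketch uses .loc and .iloc, which these notes have not introduced, so treat it as an illustration only.

```python
import pandas as pd

name_dict = {'Name': ["Anita", "Sajal", "Ayaan", "Abhey"], 'Age': [14, 32, 4, 6]}
df = pd.DataFrame(name_dict, index=[1, 2, 3, 4])

# .loc looks a row up by its index label; label 1 is the first row here.
print(df.loc[1, 'Name'])

# .iloc looks a row up by position, which always starts at 0.
print(df.iloc[0]['Name'])
```

Both statements print the same name, because label 1 and position 0 refer to the same row.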
