Lecture-6 Introduction Pandas

This lecture introduces Pandas, a data analysis tool. It discusses Pandas' data structures like Series and DataFrames. It covers input/output with Pandas, getting information about data, and selecting/extracting data from DataFrames. The lecture aims to help students explore, clean, transform and analyze data with Pandas.

Uploaded by

Abdul Basit
Copyright
© All Rights Reserved

Data Science

Lecture No.6

Introduction to Pandas

Shahzad Ali
Lecturer Dept. of Computer Science
City University Peshawar
Lecture Content
• Introduction
• Data Structures
• Input Output with Pandas
• Getting Info about Data
• DataFrame slicing, selecting, extracting
• Visualization with Pandas
Introduction
• Pandas has so many uses that it might be easier to list what it can't do than what it can.
• Pandas is essentially your data's home: through it, you get acquainted with your data by cleaning, transforming, and analyzing it.
Introduction
• For example, say you want to explore a dataset stored in a CSV file on your computer. Pandas will load the data from that CSV into a DataFrame (essentially a table) and then let you do things like:
  ◦ Calculate statistics and answer questions about the data, such as:
    - What is the average, median, max, or min of each column?
    - Does column A correlate with column B?
    - What does the distribution of data in column C look like?
  ◦ Clean the data by removing missing values and filtering rows or columns by some criteria
  ◦ Visualize the data with help from Matplotlib: plot bars, lines, histograms, bubbles, and more
  ◦ Store the cleaned, transformed data back into a CSV, another file, or a database
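The questions listed above can be sketched on a small in-memory table. The column names A, B, and C and all values are made up for illustration; a real dataset would normally be loaded from a file.

```python
import pandas as pd

# Hypothetical data; column names and values are invented for illustration.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, None],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0],
    "C": [5, 3, 5, 1, 5],
})

# Summary statistics per column (missing values are skipped by default).
means = df.mean()
medians = df.median()
maxima = df.max()
minima = df.min()

# Pearson correlation between column A and column B.
corr_ab = df["A"].corr(df["B"])

# Distribution of column C, expressed as counts per distinct value.
dist_c = df["C"].value_counts()

# Cleaning: drop rows with missing values, then filter rows by a condition.
clean = df.dropna()
filtered = clean[clean["B"] > 4]
```

Each step returns a new object, so the original `df` stays unchanged and intermediate results can be inspected one at a time.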
Pandas data structures
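• Pandas works with two main data structures:
  ◦ Series (one-dimensional, labeled)
  ◦ DataFrame (two-dimensional, labeled)
• A third structure, Panel (three-dimensional), existed in older versions but was deprecated and removed in pandas 0.25.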
Series
• A pandas Series can be created using the following constructor:
  pandas.Series(data, index, dtype, copy)
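A minimal sketch of the constructor in use. The values and labels are illustrative, and `copy` is left at its default.

```python
import pandas as pd

# data: the values; index: labels for those values; dtype: element type.
s = pd.Series(data=[10, 20, 30], index=["a", "b", "c"], dtype="float64")

print(s["b"])     # access by label
print(s.iloc[0])  # access by integer position
```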
DataFrame
• A pandas DataFrame can be created from various inputs, such as:
  ◦ Lists
  ◦ Dicts
  ◦ Series
  ◦ NumPy ndarrays
  ◦ Another DataFrame
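Each of the inputs listed above can be passed to the `pandas.DataFrame` constructor. The sketch below uses invented column names and values.

```python
import numpy as np
import pandas as pd

# From a list of lists, with explicit column names.
df_list = pd.DataFrame([[1, "x"], [2, "y"]], columns=["num", "letter"])

# From a dict mapping column names to lists of values.
df_dict = pd.DataFrame({"num": [1, 2], "letter": ["x", "y"]})

# From a dict of Series; the Series index becomes the row index.
df_series = pd.DataFrame({"num": pd.Series([1, 2], index=["r1", "r2"])})

# From a NumPy ndarray.
df_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["a", "b"])

# From another DataFrame (copy=True so the two do not share data).
df_copy = pd.DataFrame(df_dict, copy=True)
```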
Input Output with Pandas
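As a hedged illustration of CSV input and output, the sketch below reads from and writes to in-memory buffers so that it runs without any files. The CSV content is made up; in practice a file path would be passed instead of a buffer.

```python
import io
import pandas as pd

# Hypothetical CSV content; normally this would live in a file on disk.
csv_text = "name,score\nalice,90\nbob,85\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# to_csv writes to a path or buffer; index=False omits the row index column.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
round_trip = buffer.getvalue()
```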
Getting Info about Data
• Getting info about your data
  ◦ .info()
  ◦ .shape (an attribute, not a method)
• Handling duplicates
  ◦ .drop_duplicates()
• Column cleanup
  ◦ .columns
  ◦ .rename()
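The calls listed above can be tried on a small table. The data and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical data with one duplicated row and untidy column names.
df = pd.DataFrame({
    "First Name": ["ann", "bob", "ann"],
    "AGE": [30, 25, 30],
})

df.info()        # prints column dtypes and non-null counts
print(df.shape)  # shape is an attribute, not a method: (rows, columns)

# drop_duplicates returns a new DataFrame; the original is unchanged.
deduped = df.drop_duplicates()

# Inspect the column labels, then rename them.
print(list(df.columns))
renamed = deduped.rename(columns={"First Name": "first_name", "AGE": "age"})
```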
DataFrame slicing, selecting, extracting
• Selecting data by column, label, or position
  ◦ df["column_name"] selects a column by name
  ◦ .loc["label"] selects rows by index label
  ◦ .iloc[0] selects rows by integer position
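In pandas, `.loc` selects by label and `.iloc` by integer position. Given a single key on a DataFrame, both select rows, while a single column is selected with `df["column_name"]`. A short sketch with invented row labels and column names:

```python
import pandas as pd

# Hypothetical data; row labels and column names are invented.
df = pd.DataFrame(
    {"title": ["A", "B", "C"], "rating": [8.1, 7.4, 9.0]},
    index=["m1", "m2", "m3"],
)

col = df["rating"]             # a single column, returned as a Series
row_by_label = df.loc["m2"]    # row with index label "m2"
row_by_pos = df.iloc[1]        # row at integer position 1 (the second row)
cell = df.loc["m3", "rating"]  # one value by row label and column label
rows = df.iloc[0:2]            # slice by position; the end is exclusive
```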
