MBAS901 1 LectureB
Algorithm: example
Algorithm in computer terms
Why Python Programming Language?
Python can connect to a wide range of data sources and
integrate with many applications, including machine learning,
artificial intelligence, and motion graphics.
Package for scientific computing in Python.
• https://colab.research.google.com/
• Log in to your Google account
• Open a Python Notebook (.ipynb)
• Write your first Python program:
• print("I love Business Analytics")
Data Types
Data Types
• bool: True or False
Variables
• Python uses the input() function to get input from the user
• All input is stored as text (string data type)
• If required, the input can be converted to a number (int or float data type)
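The conversion described above can be sketched as follows. The literal strings stand in for what input() would return, so the example runs without waiting for the keyboard; the variable names and values are illustrative assumptions, not part of the lecture.

```python
# input() always returns text (str). With a real user you would write:
#     age_text = input("Enter your age: ")
# Here, literal strings stand in for what the user typed.
age_text = "21"
price_text = "19.95"

age = int(age_text)        # convert text to a whole number (int)
price = float(price_text)  # convert text to a decimal number (float)

# Show the data type of each value
print(type(age_text).__name__, type(age).__name__, type(price).__name__)

# Arithmetic only works as expected after the conversion
print(age + 1)
```

Without the conversion, "21" + 1 raises a TypeError, because Python does not add text and numbers together.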
Arithmetic Operators
Exercise: Operators and Expressions.
Selection or Decision
Selections: if statement
• Another example: if you want to buy an item and find two similar
products, you set conditions such as a price limit or the rating of the
product by experts.
Selections: if statement (Cont)
• if..else
• nested if
• nested else if (written elif in Python)
Selections: if statement
Structure of if statement in Python:
Rules:
• A lower case if keyword must be used
• The if statement must end with a colon :
• The keyword else must be lower case and start at
the same indentation level as the keyword if
• The keyword else must end with a colon :
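A minimal sketch of an if..else statement that follows the rules above. The price limit and item price are hypothetical values chosen for illustration.

```python
# Hypothetical values: the most we are willing to pay, and the item's price.
price_limit = 100
price = 85

if price <= price_limit:      # lower-case if, line ends with a colon
    decision = "Buy the item"
else:                         # else lines up with if and ends with a colon
    decision = "Do not buy the item"

print(decision)
```

The statements under if and else are indented; Python uses this indentation to decide which statements belong to each branch.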
Condition with non-numeric values
Sample output
Python Comparison Operators
Operator | Description | Examples
== (equal) | True if the values of the two operands are equal, otherwise False. | 2 == 2 [True], 3 == 2 [False], "sum" == "sum" [True], "ABC" == "Abc" [False]
!= (not equal) | True if the values of the two operands are not equal, otherwise False. The older <> form is not valid in Python 3. | 2 != 2 [False], 3 != 2 [True], "sum" != "sum" [False], "ABC" != "Abc" [True]
> (greater than) | True if the left operand is greater than the right operand. | 3 > 2 [True], 3 > 3 [False]
>= (greater than or equal) | True if the left operand is greater than or equal to the right operand. | 3 >= 2 [True], 3 >= 3 [True], 3 >= 4 [False]
< (less than) | True if the left operand is less than the right operand. | 2 < 3 [True], 3 < 3 [False]
<= (less than or equal) | True if the left operand is less than or equal to the right operand. | 3 <= 4 [True], 3 <= 3 [True], 4 <= 3 [False]
Logical Operators
When you have more than one condition in the same if statement [compound
condition], then you need to use a logical operator. These logical operators simply
allow you to request that both conditions must be met or only one of them.
• If both conditions must be True, use and.
• If at least one of the conditions must be True, use or.
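A short sketch of a compound condition using and and or. The price and rating values, and the thresholds, are hypothetical.

```python
# Hypothetical product details
price = 85
rating = 4.5

cheap = price <= 50        # False: 85 is above 50
well_rated = rating >= 4   # True: 4.5 is at least 4

both_met = cheap and well_rated    # and: every condition must be True
either_met = cheap or well_rated   # or: at least one condition must be True

if either_met:
    print("At least one condition is met")
if not both_met:
    print("Not all conditions are met")
```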
for i in range(4):
• for starts the for loop
• 4 is the number of times to repeat
Version 1: For loop with only end value
for i in range(endValue):
    Statements
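A minimal sketch of Version 1, assuming an end value of 4. The list named visited is only there to record which values the loop variable takes.

```python
end_value = 4
visited = []

# range(end_value) starts at 0 and stops before end_value
for i in range(end_value):
    visited.append(i)
    print(i)
```

The loop runs four times, with i taking the values 0, 1, 2 and 3. The end value itself is never reached.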
Version 2: For loop with start and end value
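As a sketch of Version 2, Python's range accepts a start value followed by an end value. The values 2 and 6 below are illustrative assumptions.

```python
start_value = 2
end_value = 6
visited = []

# range(start_value, end_value) starts at start_value
# and stops before end_value
for i in range(start_value, end_value):
    visited.append(i)
    print(i)
```

Here i takes the values 2, 3, 4 and 5.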
Version 3: For loop with increment value
for i in range(startValue, endValue, stepValue):
    Statements
The start value of the loop can be any given number, and the step value
can be changed from 1 to any other non-zero whole number.
Note: The step value must be negative if the start value is greater than
the end value.
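A sketch of Version 3 with one loop counting up and one counting down. The start, end and step values are illustrative.

```python
counting_up = []
for i in range(1, 10, 3):      # start at 1, stop before 10, add 3 each time
    counting_up.append(i)

counting_down = []
for i in range(10, 0, -2):     # start at 10, stop before 0, subtract 2 each time
    counting_down.append(i)

print(counting_up)
print(counting_down)
```

With a positive step and a start value greater than the end value, the loop body would not run at all, which is why the second loop uses a negative step.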
What is Pandas?
• A Python library is a collection of program code that can be used
repeatedly in different programs. It makes Python programming
simpler and more convenient for the programmer.
After running import pandas as pd, you can use the name pd to perform
pandas operations.
Data Files
• CSV data files are very common and are a safe format for working with data.
• These files have the extension .csv
• They can be opened, edited, and saved in Microsoft Excel or Notepad
• In Python, we will work with data from CSV files
Accessing CSV file and getting familiar with the data set
You need to load the data from a Comma Separated Values (CSV) file into a pandas
DataFrame.
• The import line imports pandas and creates the pd object.
• Give the file name with its full extension.
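A minimal sketch of loading CSV data into a DataFrame with pd.read_csv. The survey values and the file name survey.csv are hypothetical; the text is read from memory with io.StringIO so the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical survey data standing in for the contents of a CSV file.
csv_text = """Question1,Question2,Question3,Question4
5,3,4,2
4,4,5,1
3,5,2,3
5,2,4,4
"""

# Read the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# With a file on disk, pass the file name with its full extension instead:
# df = pd.read_csv("survey.csv")

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # the column names taken from the first line
```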
Viewing Sample Data
You can view sample data from the top or bottom of the dataset
df.head(5)
df["Question1"]
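A short sketch of viewing sample rows and a single column. The DataFrame below holds made-up values; only the column name Question1 comes from the slides.

```python
import pandas as pd

# Hypothetical survey answers
df = pd.DataFrame({
    "Question1": [5, 4, 3, 5, 2, 4, 1, 3],
    "Question2": [3, 4, 5, 2, 2, 1, 5, 4],
})

top = df.head(5)       # first 5 rows
bottom = df.tail(3)    # last 3 rows
q1 = df["Question1"]   # one column, returned as a Series

print(top)
print(bottom)
print(q1)
```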
Describing the data of a column
Working with loc in Pandas function
The parts of a loc statement:
• Data frame name: the name of your data frame object. In our example this is data2.
• Square brackets: please note the use of square brackets. Normal brackets will not work.
• First column: here you specify the first column name. Please note that you should use the column name and not numbers.
• Colon: the colon separates the start and end of the columns. It is a 'must have'.
• Last column: here you specify the last column name. Please note that you should use the column name and not numbers.
Example:
df.loc[5:10,"Question1":"Question2"]
The column range is "Question1":"Question2".
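A runnable sketch of the loc example above, using a made-up DataFrame with 12 rows. Only the column names come from the slides.

```python
import pandas as pd

# Hypothetical data: 12 rows labelled 0 to 11
df = pd.DataFrame({
    "Question1": list(range(1, 13)),
    "Question2": list(range(11, 23)),
    "Question3": list(range(21, 33)),
})

# Rows labelled 5 to 10 and columns Question1 to Question2
subset = df.loc[5:10, "Question1":"Question2"]
print(subset)
```

Unlike range, loc includes both ends of a slice: row 10 and column Question2 are part of the result.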
Working with loc in Pandas function 2
You can display columns that are not in sequence. For example, you can
display Question1 and Question4.
To display selected columns or rows, you need to list them inside
square brackets [ ].
Example:
Display rows 3, 8, and 20 and columns "Question1"
and "Question4".
df.loc[[3,8,20],["Question1","Question4"]]
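A runnable sketch of the example above. The 25 rows of answers are generated values for illustration only.

```python
import pandas as pd

# Hypothetical data: 25 rows of answers between 1 and 5
rows = range(25)
df = pd.DataFrame({
    "Question1": [r % 5 + 1 for r in rows],
    "Question2": [(r + 1) % 5 + 1 for r in rows],
    "Question3": [(r + 2) % 5 + 1 for r in rows],
    "Question4": [(r + 3) % 5 + 1 for r in rows],
})

# A list of row labels and a list of column names, each in square brackets
selected = df.loc[[3, 8, 20], ["Question1", "Question4"]]
print(selected)
```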
Sorting data
Syntax
df.sort_values("Question1")
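A small sketch of sorting by one column, with made-up values. Note that sort_values returns a new, sorted DataFrame and leaves the original unchanged.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({
    "Question1": [3, 1, 2],
    "Question2": [10, 20, 30],
})

sorted_df = df.sort_values("Question1")                       # ascending
sorted_desc = df.sort_values("Question1", ascending=False)    # descending

print(sorted_df)
print(sorted_desc)
```

Whole rows move together, so each Question2 value stays next to its Question1 value.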
Writing data to external file
Example:
Write the data you cleaned in the previous example to an external file.
The above lines store the DataFrame data in an Excel file
'NewData.xlsx', in a sheet with the name 'Sheet1'.
Summary of Pandas Commands
Reading or Importing Data
pd.read_csv(filename) | From a CSV file
pd.read_table(filename) | From a delimited text file (like TSV)
pd.read_excel(filename) | From an Excel file
pd.read_html(URL) | From HTML page
Viewing/Inspecting Data
df.head(n) | First n rows of the DataFrame
df.tail(n) | Last n rows of the DataFrame
df.shape | Number of rows and columns (an attribute, so no parentheses)
df.info() | Index, Datatype and Memory information
df.describe() | Summary statistics for numerical columns
Selection
df[col] | Returns column with label col as Series
df[[col1, col2]] | Returns columns as a new DataFrame
df.iloc[0,:] | First row
df.iloc[0,0] | First element of first column
Data Cleaning
df.columns = ['a','b','c'] | Rename columns
df.dropna() | Drop all rows that contain null values
df.fillna(x) | Replace all null values with x
df.rename(columns={'old_name': 'new_name'}) | Selective renaming
df.set_index('column_one') | Change the index
Filter, Sort, and Groupby
df[df[col] > 0.5] | Rows where the column col is greater than 0.5
df[(df[col] > 0.5) & (df[col] < 0.7)] | Rows where 0.7 > col > 0.5
df.sort_values(col2,ascending=False) | Sort values by col2 in descending order
df.groupby(col) | Returns a groupby object for values from one column
df.groupby([col1,col2]) | Returns groupby object for values from multiple columns
Statistics
df.describe() | Summary statistics for numerical columns
df.mean() | Returns the mean of all columns
df.corr() | Returns the correlation between columns in a DataFrame
df.count() | Returns the number of non-null values in each DataFrame column
df.max() | Returns the highest value in each column
df.min() | Returns the lowest value in each column
df.median() | Returns the median of each column
df.std() | Returns the standard deviation of each column
Exporting/Writing Data
df.to_csv(filename) | Write to a CSV file
df.to_excel(filename) | Write to an Excel file
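A brief sketch that combines a few of the commands above: filtering rows, filtering with two conditions, and grouping. The column names group and score, and all values, are hypothetical.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({
    "group": ["A", "B", "A", "B"],
    "score": [0.4, 0.6, 0.65, 0.9],
})

# Rows where score is greater than 0.5
high = df[df["score"] > 0.5]

# Rows where score is between 0.5 and 0.7; each condition needs its own
# round brackets, and & is used instead of the word and
between = df[(df["score"] > 0.5) & (df["score"] < 0.7)]

# Mean score for each group
means = df.groupby("group")["score"].mean()

print(df.shape)   # shape is an attribute, so it has no parentheses
print(high)
print(between)
print(means)
```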
Questions
• Python Tutorial
• https://www.w3schools.com/python/default.asp