NTU AB0403 Quiz Notes
5 % 2 = 1 (% is the modulo operator: it gives the remainder after division)
Basic Python knowledge:
- len() → returns the number of items in an object (e.g. characters in a string, elements in a list)
- Indexes start from 0
- range():
E.g. range(6) → produce numbers from 0 to 5
- ALWAYS TYPE print() WHEN YOU WANT TO DISPLAY AN ITEM
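The basics above can be tried out with a short sketch like the one below. The list `fruits` and its contents are made up purely for illustration.

```python
# Hypothetical sample data, chosen only for illustration.
fruits = ["apple", "banana", "cherry"]

count = len(fruits)        # number of items in the list
first = fruits[0]          # indexes start from 0, so this is the first item
numbers = list(range(6))   # range(6) produces 0, 1, 2, 3, 4, 5
remainder = 5 % 2          # % gives the remainder of a division

# Nothing is displayed unless print() is called.
print(count)
print(first)
print(numbers)
print(remainder)
```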
Errors:
- TypeError: occurs when an operation is applied to objects of an inappropriate data type
(e.g. dividing an integer by a string)
- NameError: occurs when you try to use a variable, function, or module name that
has not been defined (e.g. it is misspelled or used before it is assigned)
- ValueError: occurs when a function receives an argument of a valid data type but
with an invalid value
E.g. int("abc") → int() accepts a string, but "abc" cannot be converted to a number
- AttributeError: an exception that occurs when an attribute reference or
assignment fails. This can occur when an attempt is made to reference an
attribute on a value that does not support the attribute.
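Methods (functions that belong to an object) are attributes too
E.g. calling .append() on an integer raises AttributeError, because integers have no append method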
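As a rough illustration of the four errors above, the sketch below triggers each one on purpose and records its name. The helper `error_name` is hypothetical (it is not part of the course material), and it relies on try/except, which is covered in a later section of these notes.

```python
def error_name(action):
    """Run a zero-argument function and return the name of the error it raises."""
    try:
        action()
    except Exception as err:
        return type(err).__name__
    return "no error"

# Each lambda below deliberately contains one mistake.
type_err = error_name(lambda: 10 / "2")         # integer divided by a string
name_err = error_name(lambda: undefined_name)   # this name was never defined
value_err = error_name(lambda: int("abc"))      # valid type (str), invalid value
attr_err = error_name(lambda: (5).append(1))    # integers have no append method

print(type_err, name_err, value_err, attr_err)
```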
While loop:
General structure
n = 0
while condition:
    code
- Put print(n) before n += 1 in the while loop if you want to print the current value of n
first and then increase it by 1
- Qn: Print out the times table for the number the user entered. E.g. if the user entered 5, your
program will output 1 x 5 = 5 … until 10 x 5 = 50.
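One possible way to answer the times table question with a while loop is sketched below. The function name `times_table` is an assumption made for this example; in the actual quiz the number would probably come from `int(input(...))` instead of being passed in directly.

```python
def times_table(number):
    """Return the lines of the times table for `number`, from 1 to 10."""
    lines = []
    n = 1
    while n <= 10:
        # Use the current value of n first ...
        lines.append(f"{n} x {number} = {n * number}")
        # ... and only then increase it by 1.
        n += 1
    return lines

for line in times_table(5):
    print(line)
```

Swapping the two statements inside the loop would start the table at 2 and end it at 11, which is why the order matters.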
Try Except:
- try → test a block of code for errors
- except → handle the error
- pass → does nothing; use it where a statement is required but no action should be taken (e.g. to ignore an error)
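The three keywords above can be seen together in a minimal sketch like this one. The function names `to_int` and `ignore_bad_input` are hypothetical and only serve the example.

```python
def to_int(text):
    """Convert text to an integer, or return None if it is not a valid number."""
    try:
        return int(text)      # the code being tested for errors
    except ValueError:
        return None           # the error is handled here

def ignore_bad_input(text):
    """Try the conversion but silently ignore a failure."""
    try:
        int(text)
    except ValueError:
        pass                  # nothing happens; the program simply carries on
    return "done"

print(to_int("42"))
print(to_int("abc"))
print(ignore_bad_input("abc"))
```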
Dictionary:
- New dictionary → name_of_dictionary = {}
- Adding a value to a key in a dictionary:
name_of_dictionary["key"] = value
- Returning all keys of a dictionary: name_of_dictionary.keys()
- Returning all values of a dictionary: name_of_dictionary.values()
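A short sketch of the dictionary operations above. The dictionary `prices` and its contents are made-up illustration data.

```python
# Hypothetical example data.
prices = {}                # new, empty dictionary
prices["latte"] = 5.5      # add a value to the key "latte"
prices["muffin"] = 3.0     # add a value to the key "muffin"

keys = list(prices.keys())       # all keys
values = list(prices.values())   # all values

print(keys)
print(values)
```

Wrapping `.keys()` and `.values()` in `list()` is optional; it just turns the results into ordinary lists that are easier to index.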
String formatting:
- {:.2f} → the number produced will have 2 decimal places
f = float
- f-string:
f"${salary:,.2f}"
- Adding commas into the output → {:,}
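The format codes above can be compared side by side in a small sketch. The value of `salary` is invented for illustration.

```python
salary = 1234567.891   # hypothetical value

two_dp = "{:.2f}".format(salary)       # 2 decimal places
with_commas = "{:,}".format(1234567)   # commas as thousands separators
money = f"${salary:,.2f}"              # f-string combining both, with a $ in front

print(two_dp)        # 1234567.89
print(with_commas)   # 1,234,567
print(money)         # $1,234,567.89
```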
Datetime module:
- datetime.now() → returns current date and time
- datetime.today() → also returns the current local date and time (same result as datetime.now()
without a time zone); use date.today() if you want the date only
- datetime() → creates the date and time of a single point in time (e.g. independence day)
(you have to key in the year, month, day and, optionally, the time of that point into the
brackets of datetime())
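A minimal sketch of these calls, assuming the classes are imported with `from datetime import datetime, date`. The chosen date (9 August 1965, Singapore's independence) is just one possible example of "a single point in time".

```python
from datetime import datetime, date

now = datetime.now()        # current date and time
today = datetime.today()    # also the current local date and time
date_only = date.today()    # current date only, with no time part

# A fixed point in time: year, month, day, hour, minute.
national_day = datetime(1965, 8, 9, 0, 0)

print(now)
print(today)
print(date_only)
print(national_day)
```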
SQL basics:
● NOTE: SQL is case sensitive for PA 2!!
- * = everything
- NUMBER(7,2) or NUMERIC(7,2) → a number with up to 7 digits in total, 2 of which are
after the decimal point (e.g. 12345.67)
- SMALLINT → integer numbers up to 6 digits
- CHAR(L) → fixed-length text of L characters (up to 255). If a value has fewer characters,
the remaining space is left unused
L = number of characters
- VARCHAR(L) → variable-length text of up to L characters; shorter values do NOT leave unused space
L = maximum number of characters
- ; = denote that you’ve reached the end of your statement
- , = separates items in a list (e.g. the column names after SELECT)
- AND = combines conditions; a record is returned only when all of them are true
- FROM = specify which table you are selecting from
- SELECT DISTINCT = return only distinct (unique) values
- COUNT() (it is a function) = gives the number of records
COUNT(column_name) does not count records where that column is NULL; COUNT(*) counts every record
- DROP = delete an object in SQL, such as a database or a table
Conditions
- WHERE (condition) = extract data that fulfil a specific condition
- WHERE NOT (condition) = extracts records when the condition is not fulfilled
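To try these statements without a database server, one option is Python's built-in sqlite3 module with an in-memory database, as sketched below. The STAFF table, its columns and its rows are all invented for illustration. SQLite accepts type names such as VARCHAR(20) and NUMERIC(7,2) but does not enforce the lengths, so the database used in the course may behave differently on that point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary database that lives in memory

# Hypothetical table and data.
conn.execute(
    "CREATE TABLE STAFF ("
    "STAFF_NAME VARCHAR(20), "
    "DEPT CHAR(3), "
    "SALARY NUMERIC(7,2));"
)
conn.executemany(
    "INSERT INTO STAFF VALUES (?, ?, ?);",
    [("Ann", "HR", 3200.50), ("Ben", "IT", 4100.00),
     ("Cat", "IT", None), ("Dan", "HR", 2800.00)],
)

# * = everything
everything = conn.execute("SELECT * FROM STAFF;").fetchall()

# SELECT DISTINCT returns each department only once.
depts = conn.execute("SELECT DISTINCT DEPT FROM STAFF;").fetchall()

# COUNT(*) counts every record; COUNT(SALARY) skips the NULL salary.
count_all = conn.execute("SELECT COUNT(*) FROM STAFF;").fetchone()[0]
count_salary = conn.execute("SELECT COUNT(SALARY) FROM STAFF;").fetchone()[0]

# WHERE with AND: both conditions must be true.
it_high = conn.execute(
    "SELECT STAFF_NAME FROM STAFF WHERE DEPT = 'IT' AND SALARY > 4000;"
).fetchall()

# WHERE NOT: records for which the condition is not fulfilled.
not_it = conn.execute(
    "SELECT STAFF_NAME FROM STAFF WHERE NOT (DEPT = 'IT');"
).fetchall()

print(depts, count_all, count_salary, it_high, not_it)
```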
ORDER BY usage:
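● Ordering your data based on a specific column: SELECT * FROM table_name ORDER BY column_name;
● Sorting your data in descending order: SELECT * FROM table_name ORDER BY column_name DESC;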
Insert:
● Adding a new record: INSERT INTO table_name (column1, column2) VALUES (value1, value2);
Seeing if a value is null or not null:
● Use the IS NULL or IS NOT NULL operator
Checking for null: SELECT * FROM table_name WHERE column_name IS NULL;
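ORDER BY, INSERT and the null checks can be tried together in a sketch like the following, again using Python's built-in sqlite3 module. The PRODUCT table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and data.
conn.execute("CREATE TABLE PRODUCT (P_NAME VARCHAR(20), P_PRICE NUMERIC(7,2));")

# INSERT adds one record at a time; Tea is given no price (NULL).
conn.execute("INSERT INTO PRODUCT (P_NAME, P_PRICE) VALUES ('Muffin', 3.00);")
conn.execute("INSERT INTO PRODUCT (P_NAME, P_PRICE) VALUES ('Latte', 5.50);")
conn.execute("INSERT INTO PRODUCT (P_NAME, P_PRICE) VALUES ('Tea', NULL);")

# ORDER BY sorts in ascending order by default; DESC reverses it.
ascending = conn.execute(
    "SELECT P_NAME FROM PRODUCT ORDER BY P_NAME;"
).fetchall()
descending = conn.execute(
    "SELECT P_NAME FROM PRODUCT ORDER BY P_NAME DESC;"
).fetchall()

# NULL is tested with IS NULL / IS NOT NULL, not with = or <>.
no_price = conn.execute(
    "SELECT P_NAME FROM PRODUCT WHERE P_PRICE IS NULL;"
).fetchall()
has_price = conn.execute(
    "SELECT P_NAME FROM PRODUCT WHERE P_PRICE IS NOT NULL ORDER BY P_NAME;"
).fetchall()

print(ascending, descending, no_price, has_price)
```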
Example 1 (inner join):
SELECT * FROM MANAGER M INNER JOIN EXECUTIVE E
ON M.MANAGER_CODE = E.MANAGER_CODE;
M = alias for the MANAGER table
E = alias for the EXECUTIVE table
MANAGER_CODE = column that exists in both tables; only rows whose MANAGER_CODE value appears in both table M and table E are returned
MANAGER and EXECUTIVE = table names
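The inner join from Example 1 can be tested on a small scale as sketched below. Only the table names, the aliases and MANAGER_CODE come from the example; the other columns (MANAGER_NAME, EXEC_NAME) and all of the rows are assumptions made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical columns and data.
conn.execute("CREATE TABLE MANAGER (MANAGER_CODE INTEGER, MANAGER_NAME VARCHAR(20));")
conn.execute("CREATE TABLE EXECUTIVE (EXEC_NAME VARCHAR(20), MANAGER_CODE INTEGER);")
conn.executemany(
    "INSERT INTO MANAGER VALUES (?, ?);",
    [(1, "Ann"), (2, "Ben"), (3, "Cat")],
)
conn.executemany(
    "INSERT INTO EXECUTIVE VALUES (?, ?);",
    [("Dan", 1), ("Eve", 1), ("Fay", 2), ("Gus", 9)],
)

# Only rows whose MANAGER_CODE exists in BOTH tables survive the join.
rows = conn.execute(
    "SELECT M.MANAGER_NAME, E.EXEC_NAME "
    "FROM MANAGER M INNER JOIN EXECUTIVE E "
    "ON M.MANAGER_CODE = E.MANAGER_CODE "
    "ORDER BY E.EXEC_NAME;"
).fetchall()

for row in rows:
    print(row)
```

In this made-up data, manager Cat (code 3) has no executives and executive Gus refers to code 9, which has no manager, so neither appears in the result.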
Jupyter Notebook:
Basic knowledge
- Each column in a dataframe is a series
Select a single column → df["column_name"]
Selecting multiple columns → df[["1st_column", "2nd_column"]]
- df.shape → gives you the number of rows and columns in a dataframe in tuple format
(no. of rows, no. of columns)
To get no. of rows = df.shape[0]
Get no. of columns = df.shape[1]
- df.duplicated() → marks each record True if it is a duplicate of an earlier record, otherwise False
- df.duplicated().sum() → gives the number of duplicated records
- Checking if there are any duplicated records within a COLUMN →
df.column_name.duplicated()
No. of duplicated records within a COLUMN → df.column_name.duplicated().sum()
- df.isna() → marks each value in the dataframe True if it is null (missing), otherwise False
- astype() → converting value into another datatype
astype(str) = convert value into string data type
- inplace=True → tells pandas to alter the original dataframe instead of returning a modified copy
- drop_duplicates() → remove duplicated rows from a dataframe
- Replacing values:
df.replace(oldValue, newValue)
- Finding the mean:
● Mean of dataframe → df.mean() (gives the mean of each column; use df.mean(numeric_only=True) if some columns are not numeric)
● Mean of column → df.column_name.mean()
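The dataframe operations listed above can be practised on a tiny dataframe, as in the sketch below. It assumes pandas is installed and imported as `pd`; the column names and sales figures are invented for illustration.

```python
import pandas as pd

# Hypothetical sales data; the last row is an exact repeat of the third.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Mar"],
    "sales_latte": [100, None, 140, 140],
    "sales_muffin": [50, 60, 70, 70],
})

one_column = df["sales_muffin"]               # a single column is a Series
two_columns = df[["month", "sales_muffin"]]   # several columns form a DataFrame
n_rows, n_cols = df.shape                     # (no. of rows, no. of columns)

dup_rows = df.duplicated().sum()              # number of duplicated records
dup_months = df.month.duplicated().sum()      # duplicates within one column
nulls_per_column = df.isna().sum()            # null count for each column
as_text = df["sales_muffin"].astype(str)      # values converted to strings

deduped = df.drop_duplicates()                # returns a new dataframe
in_place = df.copy()
in_place.drop_duplicates(inplace=True)        # alters in_place itself

renamed = deduped.replace("Mar", "March")     # replace(oldValue, newValue)
muffin_mean = deduped.sales_muffin.mean()     # mean of one column
numeric_means = deduped.mean(numeric_only=True)  # mean of each numeric column

print(n_rows, n_cols, dup_rows, muffin_mean)
```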
Practice questions:
Words in bold are the answers to the questions
1. Complete the code to write a Python function called zooTicket() which takes in two input
arguments called adult and child, both integers. The program will then calculate the total price
of the zoo tickets. An adult ticket costs $25 and a child ticket costs $13. If an adult buys 3 or more
child tickets, he will get a $20 discount. Error handling is not required; use 2 spaces for each level
of indentation.
Example of calling the function with 2 adult tickets and 5 child tickets; the function will then
return the total cost:
amount = zooTicket(2, 5)
# define function with appropriate input argument
def zooTicket(adult, child):
  total = adult * 25 + child * 13
  # condition to get discount
  if child >= 3 and adult >= 1:
    # return discounted price
    return total - 20
  return total
2. The table named "movie" contains movie-related data. Complete the following SQL
command to view the average movie runtime for movies with a runtime of less than 120 mins. One
box for 1 word/syntax
Answers: SELECT avg(Movie_Runtime) AS Average FROM Movie WHERE Movie_Runtime < 120;
df.sales_latte.max()
Complete the following Python code to compute the mean sales of muffin from the first 3 months
of sales, rounded to 2 decimal places.
Answer: round(df.loc[:2, "sales_muffin"].mean(), 2)
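The answer relies on the fact that a `.loc` slice includes its end label, so `:2` selects the rows labelled 0, 1 and 2 (three rows) when the dataframe has the default integer index. A small check, assuming pandas is installed and using made-up sales figures:

```python
import pandas as pd

# Hypothetical data with the default index 0, 1, 2, 3.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "sales_muffin": [50.5, 60.25, 70.0, 999.0],
})

first_three = df.loc[:2, "sales_muffin"]          # rows 0, 1 and 2; April is excluded
mean_first_three = round(first_three.mean(), 2)   # mean rounded to 2 decimal places

print(len(first_three))
print(mean_first_three)
```

If the dataframe used a different index (for example, month names), `:2` would no longer mean "the first 3 rows", and `df.iloc[:3]` would be the position-based alternative.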