Questions Data
Q2. Out of these data structures, which are mutable and which are immutable?
Mutable data structures are those whose contents can be changed after they are created,
whereas immutable data structures are those whose contents cannot be altered once they
are created. Among the data structures mentioned, the mutable ones are lists, sets,
and dictionaries. Strings and tuples, on the other hand, are immutable.
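The distinction can be checked directly in the interpreter. The following is a minimal sketch; the variable names and sample values are illustrative and not part of the original question.

```python
# Illustrative sample values (not from the text).

# Mutable types can be changed in place.
numbers = [1, 2, 3]
numbers.append(4)            # the list grows in place

tags = {"a", "b"}
tags.add("c")                # the set gains an element

person = {"name": "Asha"}
person["age"] = 30           # the dictionary gains a key

# Immutable types raise TypeError on in-place modification.
point = (1, 2)
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

word = "data"
try:
    word[0] = "D"
    string_changed = True
except TypeError:
    string_changed = False

# "Changing" a string really builds a new string object.
capitalized = "D" + word[1:]
```

Note that the original string `word` is left untouched; only a new object is created.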
Q4. What are dictionaries in Python? How do you access the keys in a dictionary?
Dictionaries are a built-in Python data structure that stores key-value pairs. The idea is
that any value in the dictionary can be accessed using its corresponding key.
Let's look at the following example:
d = {'Name': 'Kaustubh', 'Age': 25, 'Blood Group': 'B+'}
Now, d['Name'] returns 'Kaustubh', d['Age'] returns 25, and d['Blood Group']
returns 'B+'.
The keys, on the other hand, can be accessed using the '.keys()' method. In the dictionary
above, d.keys() returns a view object, dict_keys(['Name', 'Age', 'Blood Group']), which
can be converted to a list with list(d.keys()).
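A short sketch of the lookups described above, using the same dictionary. The 'City' key and its value are assumptions added only to show what happens with a missing key.

```python
d = {'Name': 'Kaustubh', 'Age': 25, 'Blood Group': 'B+'}

name = d['Name']                 # lookup by key
key_view = d.keys()              # dynamic view of the keys
key_list = list(d)               # keys as a plain list, in insertion order

# .get() returns a default instead of raising KeyError for a missing key.
# 'City' is a hypothetical key, not part of the original example.
city = d.get('City', 'unknown')

# The view reflects later changes to the dictionary.
d['City'] = 'Pune'
view_sees_new_key = 'City' in key_view
```

Because `key_view` is a view and not a copy, it picks up the new key, while `key_list` keeps the three keys it was built from.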
Q9. What are lambda functions in Python? How are they used?
Lambda functions are single-expression anonymous functions that can be created on the
go. Like normal functions, lambda functions can take any number of arguments, and they
provide a concise way to write short functions. The syntax of a lambda
function is very simple. For example, if you want to write a lambda function that finds the
maximum of two integers, it can be simply written as:
f = lambda x, y: x if x > y else y
f(9, 5)
The call above returns 9. Had you written a normal function, on the other hand, you
would have needed a few additional lines of code. Another feature of a lambda function
is that it doesn't require a 'return' statement: the value of its expression is returned
automatically.
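For comparison, here is the lambda next to an equivalent named function, followed by a typical inline use as a sort key. The name `larger` and the list of words are illustrative choices; for this particular task the built-in max() would normally be used instead.

```python
# Lambda form, as in the example above.
f = lambda x, y: x if x > y else y

# Equivalent named function (hypothetical name).
def larger(x, y):
    if x > y:
        return x
    return y

# Lambdas are often passed inline, for example as a key function.
words = ["pandas", "numpy", "scipy", "seaborn"]
by_length = sorted(words, key=lambda w: len(w))
```

Since sorted() is stable, words of equal length keep their original relative order.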
Q1. What are the different packages that are necessary for data analysis in Python?
The following are the most important packages that are used for data analysis in Python:
Numerical/Mathematical Packages:
1. NumPy (or, Numerical Python)
2. SciPy
Dataframe Manipulation:
1. Pandas
Data Visualisation:
1. Matplotlib
2. Seaborn
Model Building:
1. scikit-learn
2. Statsmodels
Siddhartha    23    AB+
Sahil         22    A-
Long Format
Name          Attribute    Value
Siddhartha    Age          23
Sahil         Age          22
Q5. What is the difference between a Pandas series and a Pandas dataframe?
According to the official Pandas documentation, a dataframe is a two-dimensional,
size-mutable, potentially heterogeneous tabular data structure with labelled axes (rows
and columns). Arithmetic operations align on both row and column labels. It can be thought
of as a dict-like container for Series objects.
A series contains just one column of values, usually of a single data type, with
corresponding index labels. So, a series can be thought of as the data structure for a
single column of a pandas dataframe. In practice as well, a dataframe behaves as a
collection of series that share an index. If you call type() on any single column of the
dataframe, you will see that the output is pandas.core.series.Series.
An analogy to understand this is arrays and matrices, or 1D array and 2D arrays. 1D arrays
are like the building blocks of 2D arrays, and 2D arrays cannot be made without the
existence of 1D arrays.
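The relationship can be seen in a small sketch. It assumes pandas is installed; the sample values are the two records that appear earlier in this document, and the column names are assumed.

```python
import pandas as pd

# Column names are assumptions; the values are the sample records from this document.
df = pd.DataFrame({
    "Name": ["Siddhartha", "Sahil"],
    "Age": [23, 22],
    "Blood Group": ["AB+", "A-"],
})

age = df["Age"]                              # selecting one column
column_is_series = isinstance(age, pd.Series)
shares_index = age.index.equals(df.index)    # the column carries the dataframe's index
```

Selecting a single column returns a Series that keeps the row labels of the dataframe it came from.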
SQL QUESTION
Q2. What do you mean by Data Definition Language? What do you mean by Data
Manipulation Language?
Data Definition Language, or DDL, consists of the commands with which a user can create
or modify a database schema, such as CREATE, ALTER, and DROP. These commands create or
edit the structure of database objects such as tables and indexes.
Data Manipulation Language, or DML, consists of the commands with which a user can
access or manipulate data in the database, such as INSERT, DELETE, SELECT, and UPDATE.
It allows you to perform the following functions:
Insert data or rows in a database
Delete data from a database
Retrieve or fetch data
Update data in a database.
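The split between the two groups can be sketched with Python's built-in sqlite3 module and an in-memory database. The table name, columns, and rows are illustrative assumptions, and other database systems may differ in syntax details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define and alter the schema (hypothetical table).
cur.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("ALTER TABLE employees ADD COLUMN salary REAL")

# DML: insert, update, delete, and retrieve rows (illustrative data).
cur.executemany(
    "INSERT INTO employees (id, name, salary) VALUES (?, ?, ?)",
    [(1, "Asha", 50000.0), (2, "Ravi", 42000.0), (3, "Meera", 61000.0)],
)
cur.execute("UPDATE employees SET salary = salary + 1000 WHERE id = 2")
cur.execute("DELETE FROM employees WHERE id = 3")
rows = cur.execute(
    "SELECT id, name, salary FROM employees ORDER BY id"
).fetchall()
conn.close()
```

The first two statements change what the table looks like; the remaining ones only change or read what it contains.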
Q7. What is a clustered index? What is a non-clustered index? What is the difference
between clustered and non-clustered indexes?
Clustered indexes sort and store the data rows in the table or view based on their key
values. The key values come from the columns included in the index definition. There can
be only one clustered index per table because the data rows can be stored in only one order.
A non-clustered index is a separate structure that holds the key values along with
pointers to the physical rows. You can have many non-clustered indexes, although each new
index will increase the time it takes to write new records.
The differences between these two index types are as follows:
One table can have only one clustered index but multiple non-clustered indexes.
Clustered indexes can generally be read more rapidly than non-clustered indexes,
because the data rows are reached directly without an extra pointer lookup.
Clustered indexes store the data rows themselves in key order, whereas non-clustered
indexes do not store the data rows; they are a separate structure that points to
them.
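The idea can be modelled in plain Python. This is only a conceptual sketch, not how any database engine implements its indexes: a list kept sorted by id stands in for the clustered index, and a separate sorted list of (name, position) pairs stands in for a non-clustered index. All names and rows are hypothetical.

```python
import bisect

# "Clustered": the rows themselves are kept sorted by the key (id).
rows = [(1, "Meera"), (4, "Asha"), (7, "Ravi"), (9, "Kiran")]
ids = [r[0] for r in rows]

def find_by_id(target):
    """Binary search directly on the stored rows."""
    pos = bisect.bisect_left(ids, target)
    if pos < len(rows) and ids[pos] == target:
        return rows[pos]
    return None

# "Non-clustered": a separate sorted list of (name, row position) pointers.
name_index = sorted((name, pos) for pos, (_, name) in enumerate(rows))
names = [entry[0] for entry in name_index]

def find_by_name(target):
    """Search the secondary structure, then follow the pointer to the row."""
    i = bisect.bisect_left(names, target)
    if i < len(names) and names[i] == target:
        return rows[name_index[i][1]]
    return None
```

The second lookup needs one extra step (following the stored position), and every new row would also have to be added to `name_index`, which mirrors the read and write trade-offs described above.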
Q8. What is the difference between DELETE and TRUNCATE? What is the difference
between DROP and TRUNCATE?
The basic difference between DELETE and TRUNCATE is that DELETE is a DML
command, while TRUNCATE is a DDL command.
DELETE is used to delete specific rows from the table, whereas TRUNCATE removes
all the rows from the table.
DELETE can be used with the WHERE clause, which cannot be used with TRUNCATE.
TRUNCATE removes all rows from the table but keeps the table structure, while DROP
removes the entire table, including its structure, from the database.
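The three outcomes can be compared in a small sqlite3 sketch. One assumption matters here: SQLite has no TRUNCATE statement, so a DELETE without a WHERE clause is used as a stand-in to show the end state (empty table, structure intact). The table and rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical table and data.
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
cur.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "open"), (2, "closed"), (3, "closed")],
)

# DELETE with WHERE removes only the matching rows.
cur.execute("DELETE FROM orders WHERE status = 'closed'")
after_delete = cur.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# Stand-in for TRUNCATE: all rows go, the table itself stays.
cur.execute("DELETE FROM orders")
after_truncate = cur.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# DROP removes the table definition as well.
cur.execute("DROP TABLE orders")
table_exists = cur.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'orders'"
).fetchone()[0] == 1
conn.close()
```

In systems that do provide it, such as MySQL or Oracle, the middle step would be written as TRUNCATE TABLE orders.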
Q9. What do you mean by subquery? How many row comparison operators are used
while working with a subquery?
A query nested within another query is called a subquery. A subquery is also called an
inner query; it returns output that is used by the outer query.
There are three multiple-row comparison operators used with subqueries, namely IN, ANY,
and ALL.
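A sketch of these operators using sqlite3 follows. SQLite supports IN with a subquery but does not support ANY or ALL, so the last two queries use MAX and MIN as equivalents; this equivalence assumes the subquery returns at least one row and no NULLs. The table and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical table and data.
cur.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
cur.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "eng", 70), ("Ravi", "eng", 50),
     ("Meera", "ops", 60), ("Kiran", "ops", 40)],
)

# IN: employees whose department has at least one salary above 65.
in_rows = cur.execute(
    "SELECT name FROM employees "
    "WHERE dept IN (SELECT dept FROM employees WHERE salary > 65) "
    "ORDER BY name"
).fetchall()

# Equivalent of "salary > ALL (ops salaries)": compare with the maximum.
all_rows = cur.execute(
    "SELECT name FROM employees "
    "WHERE salary > (SELECT MAX(salary) FROM employees WHERE dept = 'ops') "
    "ORDER BY name"
).fetchall()

# Equivalent of "salary > ANY (eng salaries)": compare with the minimum.
any_rows = cur.execute(
    "SELECT name FROM employees "
    "WHERE salary > (SELECT MIN(salary) FROM employees WHERE dept = 'eng') "
    "ORDER BY name"
).fetchall()
conn.close()
```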
Q10. What is the difference between a nested subquery and a correlated subquery?
A subquery within another subquery is called a nested subquery. If the output of a
subquery is dependent on the column values of the parent query table, the query is called a
correlated subquery.
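The two kinds of subquery can be contrasted in a short sqlite3 sketch. The table, alias, and rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical table and data.
cur.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
cur.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "eng", 70), ("Ravi", "eng", 50),
     ("Meera", "ops", 60), ("Kiran", "ops", 40)],
)

# Correlated: the inner query refers to e.dept from the current outer row,
# so it is evaluated per outer row (above-average earners in each department).
correlated = cur.execute(
    "SELECT name FROM employees AS e "
    "WHERE salary > (SELECT AVG(salary) FROM employees WHERE dept = e.dept) "
    "ORDER BY name"
).fetchall()

# Nested: a subquery inside another subquery, neither referring to the outer
# row (everyone in the department of the top earner).
nested = cur.execute(
    "SELECT name FROM employees "
    "WHERE dept IN ("
    "  SELECT dept FROM employees "
    "  WHERE salary = (SELECT MAX(salary) FROM employees)"
    ") ORDER BY name"
).fetchall()
conn.close()
```

The nested version could be computed once from the inside out, whereas the correlated version depends on which outer row is being examined.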
Q14. What is the difference between the CHAR and VARCHAR2 data types in SQL?
Both these data types are used for characters, but VARCHAR2 is used for character strings
of variable lengths, whereas CHAR is used for character strings of a fixed length. For
example, if we specify the type as CHAR(5), every stored value occupies exactly five
characters: a shorter string is padded with trailing blanks, and a longer string is
rejected. However, if we specify the type as VARCHAR2(5), we can store strings of any
length up to five characters, and they are stored without padding.
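The storage behaviour of the two types in Oracle (blank-padding for CHAR, no padding for VARCHAR2, and rejection of over-long values for both) can be mimicked in plain Python. The helper functions below are hypothetical and exist only to illustrate these semantics; they do not talk to a database.

```python
def store_char(value: str, length: int) -> str:
    """Mimic CHAR(length): reject over-long input, blank-pad short input."""
    if len(value) > length:
        raise ValueError(f"value too large for CHAR({length})")
    return value.ljust(length)

def store_varchar2(value: str, max_length: int) -> str:
    """Mimic VARCHAR2(max_length): reject over-long input, store the rest as-is."""
    if len(value) > max_length:
        raise ValueError(f"value too large for VARCHAR2({max_length})")
    return value

fixed = store_char("abc", 5)          # padded with two trailing blanks
variable = store_varchar2("abc", 5)   # stays three characters long
```

Passing a six-character string to either helper raises ValueError, which stands in for the error a database would report.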
Q15. What is a UNIQUE constraint in MySQL? What is the difference between PRIMARY
KEY and UNIQUE constraints?
A UNIQUE constraint ensures that the values in a single field or a combination of fields
are unique across records. Some of the fields can contain null values as long as the
combination of values is unique.
The differences between PRIMARY KEY and UNIQUE constraints are:
1. A PRIMARY KEY cannot have NULL values, while UNIQUE constraints can have NULL
values.
2. A table can have only one PRIMARY KEY, but the UNIQUE constraint can be placed on
as many columns as we want.
3. The PRIMARY KEY creates the clustered index by default (for example, in MySQL's
InnoDB engine), but the UNIQUE constraint does not.
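Some of these rules can be tried out with sqlite3 as a stand-in for MySQL. The sketch shows several UNIQUE columns on one table, NULLs being accepted in UNIQUE columns, and duplicates being rejected by both constraints. The NOT NULL rule for primary keys is not demonstrated because SQLite treats a NULL in an INTEGER PRIMARY KEY column as a request to auto-assign a value. The table and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical table: one PRIMARY KEY, two UNIQUE columns.
cur.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT UNIQUE,"
    " phone TEXT UNIQUE)"
)

# UNIQUE columns accept several NULLs.
cur.execute("INSERT INTO users (id, email, phone) VALUES (1, NULL, NULL)")
cur.execute("INSERT INTO users (id, email, phone) VALUES (2, NULL, NULL)")
cur.execute("INSERT INTO users (id, email, phone) VALUES (3, 'a@example.com', NULL)")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        cur.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_email = rejected("INSERT INTO users (id, email) VALUES (4, 'a@example.com')")
duplicate_key = rejected("INSERT INTO users (id, email) VALUES (1, 'b@example.com')")
row_count = cur.execute("SELECT COUNT(*) FROM users").fetchone()[0]
conn.close()
```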