Cloudera Data Analyst
Ensemble methods are machine learning techniques that combine several base models to
produce a single, stronger predictive model. Random Forest is one type of ensemble method.
The number of component classifiers in an ensemble has a large impact on prediction
accuracy, although ensemble construction is subject to diminishing returns: each additional
classifier tends to improve accuracy less than the one before it.
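The diminishing-returns effect can be illustrated with a small calculation. This is a minimal sketch under simplifying assumptions that are not from the text above: every classifier is correct with the same probability (0.6 here), their errors are independent, and the ensemble predicts by strict majority vote over an odd number of classifiers. The function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_classifiers: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent classifiers is correct.

    Assumes each classifier is right with probability p_correct and that
    their errors are independent (real ensembles only approximate this).
    """
    needed = n_classifiers // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_classifiers, k)
        * p_correct**k
        * (1 - p_correct) ** (n_classifiers - k)
        for k in range(needed, n_classifiers + 1)
    )


# Grow the ensemble in equal steps of 10 classifiers.
sizes = [1, 11, 21, 31, 41]
accuracies = [majority_vote_accuracy(n, 0.6) for n in sizes]

# Accuracy gained by each step of 10 extra classifiers.
gains = [later - earlier for earlier, later in zip(accuracies, accuracies[1:])]

for n, acc in zip(sizes, accuracies):
    print(f"{n:>2} classifiers -> ensemble accuracy {acc:.3f}")
print("gain per step:", [round(g, 3) for g in gains])
```

Under these assumptions the accuracy keeps rising as classifiers are added, but each step of ten contributes less than the previous one.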
Constraints are rules that restrict the data a table can hold. That is, a constraint limits
the values that can be stored in a particular column of a table.
NOT NULL, UNIQUE, DEFAULT, PRIMARY KEY, FOREIGN KEY, and CHECK are the different
constraints in SQL.
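The six constraints can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not a reference schema: the `departments` and `employees` tables, their columns, and the helper `is_rejected` are hypothetical names chosen for illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical schema that uses each constraint type once.
conn.executescript("""
CREATE TABLE departments (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employees (
    emp_id  INTEGER PRIMARY KEY,              -- PRIMARY KEY
    email   TEXT NOT NULL UNIQUE,             -- NOT NULL, UNIQUE
    status  TEXT DEFAULT 'active',            -- DEFAULT
    salary  REAL CHECK (salary > 0),          -- CHECK
    dept_id INTEGER,
    FOREIGN KEY (dept_id) REFERENCES departments (dept_id)  -- FOREIGN KEY
);
INSERT INTO departments (dept_id, name) VALUES (1, 'Analytics');
INSERT INTO employees (emp_id, email, salary, dept_id)
VALUES (1, 'a@example.com', 5000, 1);
""")


def is_rejected(sql: str) -> bool:
    """Return True if the statement violates a constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


insert = "INSERT INTO employees (emp_id, email, salary, dept_id) VALUES "
violations = {
    "NOT NULL": is_rejected(insert + "(2, NULL, 4000, 1)"),
    "UNIQUE": is_rejected(insert + "(3, 'a@example.com', 4000, 1)"),
    "CHECK": is_rejected(insert + "(4, 'b@example.com', -1, 1)"),
    "FOREIGN KEY": is_rejected(insert + "(5, 'c@example.com', 4000, 99)"),
    "PRIMARY KEY": is_rejected(insert + "(1, 'd@example.com', 4000, 1)"),
}

# DEFAULT filled in the status column that the first insert left out.
default_status = conn.execute(
    "SELECT status FROM employees WHERE emp_id = 1"
).fetchone()[0]

print(violations)
print("default status:", default_status)
```

Each entry in `violations` reports whether the database refused a row that breaks the named constraint, while `default_status` shows the value supplied by DEFAULT.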
4. How do you apply a single format to all the sheets present in a workbook?
To apply the same format to all the sheets of a workbook, follow the given steps:
The RANK() function assigns a rank to each row within an ordered partition of the result set.
If two or more rows tie, they share the same rank, and the next rank is the tied rank plus the
number of tied rows. If we have three records at rank 4, for example, the next rank assigned is
7. The DENSE_RANK() function also gives tied rows the same rank, based on the provided
column value, but leaves no gaps: each distinct value receives the next consecutive rank. If we
have three records at rank 4, for example, the next rank assigned is 5.
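The difference between the two numbering schemes can be reproduced in a few lines of plain Python. This is a minimal sketch that mimics the ranking logic on an already sorted list; it is not how a database implements window functions, and the function names `rank` and `dense_rank` and the sample scores are hypothetical.

```python
def rank(sorted_values):
    """RANK()-style numbering: ties share a rank, and gaps follow the ties."""
    ranks = []
    for position, value in enumerate(sorted_values):
        if position > 0 and value == sorted_values[position - 1]:
            ranks.append(ranks[-1])      # tie: reuse the previous rank
        else:
            ranks.append(position + 1)   # rank equals the 1-based row position
    return ranks


def dense_rank(sorted_values):
    """DENSE_RANK()-style numbering: ties share a rank, with no gaps."""
    ranks = []
    for position, value in enumerate(sorted_values):
        if position == 0:
            ranks.append(1)
        elif value == sorted_values[position - 1]:
            ranks.append(ranks[-1])      # tie: reuse the previous rank
        else:
            ranks.append(ranks[-1] + 1)  # next distinct value: next integer
    return ranks


# Three rows tie at rank 4, as in the example above.
scores = sorted([100, 90, 80, 70, 70, 70, 60], reverse=True)

print(rank(scores))        # [1, 2, 3, 4, 4, 4, 7]
print(dense_rank(scores))  # [1, 2, 3, 4, 4, 4, 5]
```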
2. Explain One-hot encoding and Label Encoding. How do they affect the dimensionality of
the given dataset?
● Tableau uses a workbook and sheet file structure, much like Microsoft Excel.
● A workbook contains sheets, which can be a worksheet, dashboard, or a story.
● A worksheet contains a single view along with shelves, legends, and the Data
pane.
● A dashboard is a collection of views from multiple worksheets.
● A story contains a sequence of worksheets or dashboards that work together
to convey information.
You can split a column into 2 or more columns by following the steps below:
1. Select the cell that you want to split. Then navigate to the Data tab and select
Text to Columns.
2. Select the delimiter.
3. Choose the column data format and select the destination where you want the split
data to appear.
4. The text is split into multiple columns at the chosen destination.
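The effect of splitting on a delimiter can also be sketched outside the spreadsheet. The following is a rough Python analogue, not a description of how Excel works internally; the function name `text_to_columns` and the sample names are hypothetical.

```python
def text_to_columns(cells, delimiter=","):
    """Split each cell on the delimiter and pad rows to the same width."""
    rows = [[part.strip() for part in cell.split(delimiter)] for cell in cells]
    width = max((len(row) for row in rows), default=0)
    # Pad short rows with empty strings so every row has the same column count.
    return [row + [""] * (width - len(row)) for row in rows]


names = ["Ada Lovelace", "Grace Hopper", "Linus"]
columns = text_to_columns(names, delimiter=" ")

for row in columns:
    print(row)
```

A cell with fewer parts than the others simply leaves its trailing columns empty.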