ML Algorithms Cheat Sheet
1. Linear Regression
**Overview**: Linear Regression fits a linear model that minimizes the residual sum of squares between the observed targets and the predictions.
**Key Hyperparameters**:
- `fit_intercept`: Whether to calculate the intercept for the model. Default is `True`.
- `normalize`: If `True`, the regressors X were normalized before regression. Default was `False`. This parameter was removed in scikit-learn 1.2; scale features explicitly (e.g., with `StandardScaler`) instead.
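Since `normalize` is gone from current scikit-learn, the usual replacement is an explicit scaling step. A minimal sketch of that pattern, using synthetic data for illustration:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data for illustration
X, y = make_regression(n_samples=200, n_features=3, noise=0.1, random_state=0)

# A scaling step in a Pipeline replaces the removed normalize=True option
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X, y)
print(model.score(X, y))  # R^2 on the training data
```

Putting the scaler in a pipeline also ensures that test data is scaled with the statistics learned from the training data.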
**Example Code**:
```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Example data (synthetic, for illustration)
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=42)
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model initialization (normalize was removed in scikit-learn 1.2)
lr = LinearRegression(fit_intercept=True)
# Model fitting
lr.fit(X_train, y_train)
# Predictions
y_pred = lr.predict(X_test)
# Evaluation
mse = mean_squared_error(y_test, y_pred)
print(f'MSE: {mse:.4f}')
```
2. Logistic Regression
**Overview**: Logistic Regression is used for binary classification problems. It models the probability of class membership with the logistic (sigmoid) function.
**Key Hyperparameters**:
- `penalty`: Used to specify the norm used in the penalization (`'l1'`, `'l2'`, `'elasticnet'`, or `None`; the string `'none'` was removed in scikit-learn 1.4).
- `solver`: Algorithm to use in the optimization problem (`'newton-cg'`, `'lbfgs'`, `'liblinear'`, `'sag'`,
`'saga'`).
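Not every solver supports every penalty: `'liblinear'` handles only `'l1'` and `'l2'`, while `'elasticnet'` requires `'saga'` together with an `l1_ratio`. A hedged sketch of the elasticnet combination, on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data for illustration
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# elasticnet mixes L1 and L2 regularization; it is only supported by saga.
# l1_ratio=0.5 weights the two penalties equally.
clf = LogisticRegression(penalty='elasticnet', solver='saga',
                         l1_ratio=0.5, max_iter=5000)
clf.fit(X, y)
print(clf.score(X, y))
```

Passing an unsupported solver/penalty pair raises a `ValueError` at fit time, so it is worth checking compatibility before tuning.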
**Example Code**:
```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Example data (synthetic, for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model initialization
log_reg = LogisticRegression(penalty='l2', solver='lbfgs', max_iter=1000)
# Model fitting
log_reg.fit(X_train, y_train)
# Predictions
y_pred = log_reg.predict(X_test)
# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
3. Decision Tree
**Overview**: Decision Tree is a non-parametric supervised learning method used for classification
and regression.
**Key Hyperparameters**:
- `criterion`: The function to measure the quality of a split (`'gini'` for Gini impurity, `'entropy'` for
information gain).
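The two criteria measure node impurity differently: Gini impurity is 1 − Σp², while entropy is −Σ p·log₂(p). A quick hand computation shows how they score a mixed node versus a pure one:

```python
import math

def gini(probs):
    # Gini impurity: chance of misclassifying a randomly drawn sample
    return 1.0 - sum(p ** 2 for p in probs)

def entropy(probs):
    # Shannon entropy in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A perfectly mixed two-class node
print(gini([0.5, 0.5]))     # 0.5
print(entropy([0.5, 0.5]))  # 1.0

# A pure node has zero impurity under both criteria
print(gini([1.0]), entropy([1.0]))
```

In practice the two criteria usually produce very similar trees; entropy is slightly more expensive to compute because of the logarithm.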
**Example Code**:
```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Example data (synthetic, for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model initialization
dt = DecisionTreeClassifier(criterion='gini', max_depth=None,
                            min_samples_split=2, min_samples_leaf=1)
# Model fitting
dt.fit(X_train, y_train)
# Predictions
y_pred = dt.predict(X_test)
# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
4. Random Forest
**Overview**: Random Forest is an ensemble method that combines multiple decision trees to improve predictive accuracy and control overfitting.
**Key Hyperparameters**:
- `n_estimators`: The number of trees in the forest. Default is 100.
- `max_depth`: The maximum depth of each tree. Default is `None` (nodes are expanded until leaves are pure).
**Example Code**:
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Example data (synthetic, for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model initialization
rf = RandomForestClassifier(n_estimators=100, max_depth=None,
                            min_samples_split=2, min_samples_leaf=1)
# Model fitting
rf.fit(X_train, y_train)
# Predictions
y_pred = rf.predict(X_test)
# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
5. AdaBoost
**Overview**: AdaBoost is an ensemble method that combines multiple weak classifiers to create a
strong classifier.
**Key Hyperparameters**:
- `base_estimator`: The base estimator from which the boosted ensemble is built (e.g.,
`DecisionTreeClassifier`). Renamed to `estimator` in scikit-learn 1.2.
**Example Code**:
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Example data (synthetic, for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model initialization (use base_estimator= on scikit-learn < 1.2)
ada = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                         n_estimators=50)
# Model fitting
ada.fit(X_train, y_train)
# Predictions
y_pred = ada.predict(X_test)
# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
6. K-Nearest Neighbors (KNN)
**Overview**: KNN is a non-parametric method used for classification and regression by finding the k nearest training samples and voting (or averaging) over their labels.
**Key Hyperparameters**:
- `n_neighbors`: Number of neighbors to use. Default is 5.
- `algorithm`: Algorithm used to compute the nearest neighbors (`'auto'`, `'ball_tree'`, `'kd_tree'`,
`'brute'`).
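The `algorithm` choice affects only how neighbors are found, not which neighbors are found, so predictions should be identical across settings; the trade-off is index build time versus query time. A small check of this on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data for illustration
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

preds = {}
for algo in ('brute', 'kd_tree', 'ball_tree'):
    knn = KNeighborsClassifier(n_neighbors=5, algorithm=algo)
    knn.fit(X, y)
    preds[algo] = knn.predict(X)

# All three index structures find the same neighbors, hence the same predictions
print(all((preds['brute'] == preds[a]).all() for a in ('kd_tree', 'ball_tree')))
```

`'auto'` picks a reasonable structure based on the data, which is usually the right default; tree-based indices pay off mainly in low dimensions with many samples.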
**Example Code**:
```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Example data (synthetic, for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model initialization
knn = KNeighborsClassifier(n_neighbors=5, algorithm='auto')
# Model fitting
knn.fit(X_train, y_train)
# Predictions
y_pred = knn.predict(X_test)
# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```