ML Experiments 3-7 Manual
Experiment 3:
Implement a Python program to prepare plots such as bar plot, histogram, distribution plot, box plot, and scatter plot.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(0)
# Sample data for the plots
categories = ['A', 'B', 'C', 'D']
values = np.random.randint(10, 100, size=4)
data = np.random.randn(1000)
# 1. Bar Plot
plt.figure(figsize=(10, 6))
plt.bar(categories, values, color='skyblue')
plt.title('Bar Plot')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.show()
# 2. Histogram
plt.figure(figsize=(10, 6))
plt.hist(data, bins=30, color='salmon', edgecolor='black')
plt.title('Histogram')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()
# 3. Distribution Plot
plt.figure(figsize=(10, 6))
sns.kdeplot(data, fill=True, color='purple')
plt.title('Distribution Plot')
plt.xlabel('Value')
plt.ylabel('Density')
plt.show()
# 4. Box Plot
plt.figure(figsize=(10, 6))
sns.boxplot(x=data, color='lightgreen')
plt.title('Box Plot')
plt.xlabel('Value')
plt.show()
# 5. Scatter Plot
x = np.random.rand(100)
y = np.random.rand(100) * 100
plt.figure(figsize=(10, 6))
plt.scatter(x, y, color='teal', alpha=0.7)
plt.title('Scatter Plot')
plt.xlabel('X Axis')
plt.ylabel('Y Axis')
plt.show()
Explanation:
The program generates sample data with NumPy and then uses Matplotlib and Seaborn to render each of the five plot types: a bar plot of category values, a histogram and a kernel density (distribution) plot of the same random sample, a box plot of that sample, and a scatter plot of two random variables.
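If all five charts are wanted in a single window instead of one at a time, the same data can be laid out on a subplot grid. This is a minimal sketch rather than part of the original program; the grid shape and colours are illustrative choices:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(0)
categories = ['A', 'B', 'C', 'D']
values = np.random.randint(10, 100, size=4)
data = np.random.randn(1000)
x, y = np.random.rand(100), np.random.rand(100) * 100
fig, axes = plt.subplots(2, 3, figsize=(15, 8))
axes[0, 0].bar(categories, values, color='skyblue')
axes[0, 0].set_title('Bar Plot')
axes[0, 1].hist(data, bins=30, color='salmon', edgecolor='black')
axes[0, 1].set_title('Histogram')
sns.kdeplot(data, fill=True, ax=axes[0, 2], color='purple')
axes[0, 2].set_title('Distribution Plot')
sns.boxplot(x=data, ax=axes[1, 0], color='lightgreen')
axes[1, 0].set_title('Box Plot')
axes[1, 1].scatter(x, y, color='teal', alpha=0.7)
axes[1, 1].set_title('Scatter Plot')
axes[1, 2].axis('off')  # unused sixth panel
plt.tight_layout()
plt.show()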
Experiment 4:
Implement linear regression using the gradient descent algorithm in Python.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
# Generate synthetic data with a linear relationship plus noise
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]  # prepend a bias column of ones
# Plot the synthetic data in the first subplot
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.scatter(X, y)
plt.xlabel('X')
plt.ylabel('y')
# Cost function: mean squared error for the given parameters
def compute_cost(X, y, theta):
    m = len(y)
    predictions = X @ theta
    cost = (1 / (2 * m)) * np.sum((predictions - y) ** 2)
    return cost
# Implement gradient descent
def gradient_descent(X, y, theta, learning_rate, iterations):
    m = len(y)
    cost_history = np.zeros(iterations)
    for i in range(iterations):
        gradients = (1 / m) * X.T @ (X @ theta - y)
        theta = theta - learning_rate * gradients
        cost_history[i] = compute_cost(X, y, theta)
    return theta, cost_history
learning_rate = 0.1
iterations = 1000
theta_initial = np.random.randn(2, 1)
theta, cost_history = gradient_descent(X_b, y, theta_initial, learning_rate, iterations)
# Plot the cost history to show convergence
plt.subplot(1, 2, 2)
plt.plot(cost_history)
plt.title('Cost Function Convergence')
plt.xlabel('Iteration')
plt.ylabel('Cost')
# Show plots
plt.tight_layout()
plt.show()
# Plot the synthetic data and the fitted regression line
plt.figure(figsize=(12, 6))
plt.scatter(X, y)
plt.plot(X, X_b @ theta, color='red')
plt.title('Fitted Regression Line')
plt.xlabel('X')
plt.ylabel('y')
plt.show()
print('Optimal parameters (intercept, slope):', theta.ravel())
1. Data Generation:
o Creates synthetic data with a linear relationship plus noise.
o Plots the synthetic data in the first subplot.
2. Cost Function:
o Computes the cost (mean squared error) for the given parameters.
3. Gradient Descent:
o Updates the parameters iteratively to minimize the cost function (both formulas are written out after this list).
o Tracks the cost history for visualization.
4. Visualization:
o Plots the cost function history to show convergence.
o Plots the synthetic data and the fitted regression line.
5. Output:
o Prints the optimal parameters learned by the model.
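For reference, the two formulas the code implements are the standard ones for linear regression with a mean-squared-error cost. With X the feature matrix (including the bias column of ones), θ the parameter vector, m the number of samples, and α the learning rate (0.1 in the program above):

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( x^{(i)} \theta - y^{(i)} \right)^2

\theta \leftarrow \theta - \alpha \cdot \frac{1}{m} X^{\top} (X\theta - y)

The gradient in the update rule is exactly the expression computed inside the gradient_descent loop, so each iteration moves θ one step downhill on J.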
Experiment 5:
Implement the multiple linear regression algorithm using Python.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
np.random.seed(42)
X1 = 2 * np.random.rand(100, 1)
X2 = 3 * np.random.rand(100, 1)
X = np.hstack([X1, X2])
y = 4 + 3 * X1 + 5 * X2 + np.random.randn(100, 1)  # linear target plus noise
# Standardize features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Fit the multiple linear regression model
model = LinearRegression()
model.fit(X_scaled, y)
print('Intercept:', model.intercept_, 'Coefficients:', model.coef_)
fig = plt.figure(figsize=(10, 6))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X1, X2, y)
x1_grid, x2_grid = np.meshgrid(np.linspace(0, 2, 20), np.linspace(0, 3, 20))
X_grid = np.c_[x1_grid.ravel(), x2_grid.ravel()]
X_grid_scaled = scaler.transform(X_grid)
y_grid = model.predict(X_grid_scaled).reshape(x1_grid.shape)
ax.plot_surface(x1_grid, x2_grid, y_grid, alpha=0.3, color='red')
ax.set_xlabel('X1')
ax.set_ylabel('X2')
ax.set_zlabel('y')
plt.show()
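As a quick sanity check, the same coefficients can be recovered with NumPy's least-squares solver. This is a minimal sketch that assumes the X_scaled and y arrays from the program above; its output should match sklearn's LinearRegression up to floating-point error:
import numpy as np
# X_scaled and y are assumed from the Experiment 5 program above.
X_design = np.c_[np.ones(len(X_scaled)), X_scaled]  # prepend an intercept column
coef, residuals, rank, sv = np.linalg.lstsq(X_design, y, rcond=None)
print('Intercept:', coef[0], 'Coefficients:', coef[1:].ravel())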
Experiment 6:
Implement a Python program to build logistic regression and decision tree models using the Python packages statsmodels and sklearn.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import confusion_matrix, classification_report
# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=200, n_features=4, random_state=42)
data = pd.DataFrame(X, columns=['F1', 'F2', 'F3', 'F4'])
data['Target'] = y
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Logistic regression with statsmodels (add an intercept column)
X_train_sm = sm.add_constant(X_train)
logit_model = sm.Logit(y_train, X_train_sm)
logit_result = logit_model.fit()
print(logit_result.summary())
# Predict on the test set and threshold the probabilities at 0.5
X_test_sm = sm.add_constant(X_test)
y_pred_proba = logit_result.predict(X_test_sm)
y_pred = (y_pred_proba >= 0.5).astype(int)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
# Feature scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Initialize and train the Decision Tree model
dt_model = DecisionTreeClassifier(random_state=42)
dt_model.fit(X_train_scaled, y_train)
y_pred_dt = dt_model.predict(X_test_scaled)
print(confusion_matrix(y_test, y_pred_dt))
print(classification_report(y_test, y_pred_dt))
# Visualize the fitted decision tree
plt.figure(figsize=(20,10))
plot_tree(dt_model, filled=True, feature_names=['F1', 'F2', 'F3', 'F4'], class_names=['0', '1'])
plt.show()
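Since the experiment calls for both the statsmodels and sklearn APIs, the same classifier can also be built with sklearn's LogisticRegression. This is a minimal sketch assuming the scaled splits (X_train_scaled, X_test_scaled, y_train, y_test) from the program above; unlike statsmodels it does not print a coefficient summary table, but the confusion matrix and report are directly comparable:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report
# The scaled splits are assumed from the Experiment 6 program above.
sk_logit = LogisticRegression(random_state=42)
sk_logit.fit(X_train_scaled, y_train)
y_pred_sk = sk_logit.predict(X_test_scaled)
print(confusion_matrix(y_test, y_pred_sk))
print(classification_report(y_test, y_pred_sk))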
Output:
The run prints the statsmodels logit summary, the confusion matrices and classification reports for both models, and displays the decision tree diagram.
Experiment 7:
Implement a Python program to perform the following activities: split the dataset into training and validation datasets, build a model using a Python package on the training dataset, and test it on the validation dataset.
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix, classification_report
# 1. Generate a synthetic classification dataset
X, y = make_classification(n_samples=200, n_features=4, random_state=42)
# 2. Split into training (70%) and validation (30%) sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=42)
# 3. Feature scaling for the logistic regression model
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
# 4. Build and Train the Model
# Logistic Regression
logistic_model = LogisticRegression(random_state=42)
logistic_model.fit(X_train_scaled, y_train)
y_pred_logistic = logistic_model.predict(X_val_scaled)
print('Logistic Regression Performance:')
print(confusion_matrix(y_val, y_pred_logistic))
print(classification_report(y_val, y_pred_logistic))
# Decision Tree (trained on the unscaled data)
decision_tree_model = DecisionTreeClassifier(random_state=42)
decision_tree_model.fit(X_train, y_train)
y_pred_tree = decision_tree_model.predict(X_val)
print('Decision Tree Performance:')
print(confusion_matrix(y_val, y_pred_tree))
print(classification_report(y_val, y_pred_tree))
Explanation:
1. Data Generation:
o A synthetic dataset is created with make_classification.
2. Data Splitting:
o train_test_split divides the data into training and validation sets. Here, 70% of
the data is used for training and 30% for validation.
3. Feature Scaling:
o StandardScaler is used to standardize features for the logistic regression model.
Scaling is not required for the decision tree but is important for models
sensitive to feature scales.
4. Model Building and Training:
o Logistic Regression: Model is trained on the scaled training data.
o Decision Tree: Model is trained on the original training data.
5. Model Evaluation:
o Logistic Regression: Predictions are made on the scaled validation set, and
performance metrics are printed.
o Decision Tree: Predictions are made on the validation set, and performance
metrics are printed. (A one-line accuracy comparison is sketched after this list.)
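To compare the two models with a single number, sklearn's accuracy_score can be applied to the predictions already computed. This is a minimal sketch assuming y_val and the two prediction arrays from the Experiment 7 program above:
from sklearn.metrics import accuracy_score
# y_val, y_pred_logistic, and y_pred_tree are assumed from the program above.
acc_logistic = accuracy_score(y_val, y_pred_logistic)
acc_tree = accuracy_score(y_val, y_pred_tree)
print(f'Logistic Regression accuracy: {acc_logistic:.2f}')
print(f'Decision Tree accuracy: {acc_tree:.2f}')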
Output:
Logistic Regression Performance:
[[30  5]
 [ 7 18]]
              precision    recall  f1-score   support
    accuracy                           0.79        60
   macro avg       0.79      0.79      0.79        60
weighted avg       0.79      0.79      0.79        60

Decision Tree Performance:
              precision    recall  f1-score   support
    accuracy                           0.82        60
   macro avg       0.83      0.83      0.83        60
weighted avg       0.83      0.82      0.82        60
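For reading the reports above: with TP, FP, and FN the counts of true positives, false positives, and false negatives for a given class, classification_report uses the standard definitions:

\text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}

The macro average weights each class equally, while the weighted average weights each class by its support (the number of true samples in that class).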