Code

The document outlines a machine learning workflow using Python, specifically focusing on loading a dataset, preprocessing it, and applying three classification algorithms: Random Forest, K-Nearest Neighbors, and Support Vector Machine. It includes code snippets for model training, evaluation, and optional hyperparameter tuning using GridSearchCV for the SVM model. There are also error messages indicating issues with dataset loading and variable definitions.

Uploaded by

tkfx2jf9zs

import pandas as pd

import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

Load your dataset

data = pd.read_csv() # Replace with your dataset path

File "<ipython-input-2-fb2fbd479d97>", line 1
    data = pd.read_csv(BCL-2.RAW.RData) # Replace with your dataset path
                           ^
SyntaxError: invalid decimal literal
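The SyntaxError above most likely arises because the file name was passed as bare tokens rather than as a string, so Python tries to parse `2.RAW` as a number. A minimal sketch of the quoted form follows; the file `example.csv` and its columns are hypothetical stand-ins created in a temporary directory so the snippet runs anywhere. Note also that `.RData` is R's binary workspace format, so `pd.read_csv` would probably not parse such a file even once the name is quoted; exporting the data to CSV first is one way around that.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in file; the notebook's real dataset path is not shown.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "example.csv")
pd.DataFrame({"feature1": [0.1, 0.4], "target": [0, 1]}).to_csv(csv_path, index=False)

# The path must be a quoted string (or a variable holding one).
data = pd.read_csv(csv_path)
print(data.shape)
```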

from google.colab import drive


drive.mount('/content/drive')


Assume the dataset has features in columns 'feature1', 'feature2', ..., 'featureN' and target labels in a column named 'target'.

https://colab.research.google.com/drive/1YSaqI1S13jeRxk_Y9nE3-bjzLW_OETL3?usp=sharing 17/01/25, 12:19
features = data.drop(columns=['target'])
labels = data['target']

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-5-5943e81b0eae> in <cell line: 1>()
----> 1 features = data.drop(columns=['target'])
      2 labels = data['target']

NameError: name 'data' is not defined
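The NameError follows from the failed load: `data` was never assigned, so every later cell that uses it fails too. As a rough sketch for exercising the rest of the workflow, a synthetic stand-in with the assumed column layout can be built as below; the data, sample count, and feature count are illustrative assumptions, not the original dataset.

```python
import pandas as pd
from sklearn.datasets import make_classification

# Synthetic stand-in for the real dataset (assumption: 100 rows, 4 features).
X, y = make_classification(n_samples=100, n_features=4, random_state=42)
columns = [f"feature{i + 1}" for i in range(X.shape[1])]
data = pd.DataFrame(X, columns=columns)
data["target"] = y

# Same separation of features and labels as in the notebook cell.
features = data.drop(columns=["target"])
labels = data["target"]
print(features.shape, labels.shape)
```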

Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=
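The call above is cut off after `test_size=`, so the value the author used is unknown. The sketch below shows one complete form of the call on synthetic data; `test_size=0.2`, `random_state=42`, and `stratify` are assumptions chosen for illustration, not values recovered from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for features and labels (assumption).
features, labels = make_classification(n_samples=100, n_features=4, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    features,
    labels,
    test_size=0.2,    # assumed value; the original is truncated
    random_state=42,  # assumed; mirrors the seed used for the models
    stratify=labels,  # optional: keeps class proportions similar in both sets
)
print(X_train.shape, X_test.shape)
```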

Random Forest Classifier

rf_model = RandomForestClassifier(random_state=42)
rf_model.fit(X_train, y_train)
rf_predictions = rf_model.predict(X_test)
print("Random Forest Results:")
print("Accuracy:", accuracy_score(y_test, rf_predictions))
print("Classification Report:\n", classification_report(y_test, rf_predictions))
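The notebook imports a confusion-matrix function from `sklearn.metrics` but never calls it in the cells shown. A minimal sketch of how it could complement the accuracy and classification report, using synthetic data and an assumed 80/20 split:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data (assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

rf_model = RandomForestClassifier(random_state=42)
rf_model.fit(X_train, y_train)
rf_predictions = rf_model.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, rf_predictions, labels=[0, 1])
print("Confusion Matrix:\n", cm)
```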

K-Nearest Neighbors (KNN) Classifier

knn_model = KNeighborsClassifier(n_neighbors=5) # Adjust 'n_neighbors' based on


knn_model.fit(X_train, y_train)
knn_predictions = knn_model.predict(X_test)
print("KNN Results:")
print("Accuracy:", accuracy_score(y_test, knn_predictions))
print("Classification Report:\n", classification_report(y_test, knn_predictions))
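KNN is distance-based, so features on very different scales can dominate the neighbor search. The notebook does not show any scaling step; the sketch below is one possible addition, wrapping a `StandardScaler` and the same `n_neighbors=5` classifier in a pipeline. The synthetic data and the inflated first feature are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with one feature deliberately put on a much larger scale.
X, y = make_classification(n_samples=200, n_features=6, random_state=42)
X[:, 0] *= 1000.0
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The scaler is fitted on the training data only, then applied before KNN.
knn_scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn_scaled.fit(X_train, y_train)
accuracy = knn_scaled.score(X_test, y_test)
print("Scaled KNN accuracy:", accuracy)
```

The same pipeline pattern would apply to the SVM model, which is also sensitive to feature scale.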

Support Vector Machine (SVM) Classifier

svm_model = SVC(kernel='rbf', C=1, gamma='scale', random_state=42)


svm_model.fit(X_train, y_train)
svm_predictions = svm_model.predict(X_test)
print("SVM Results:")
print("Accuracy:", accuracy_score(y_test, svm_predictions))
print("Classification Report:\n", classification_report(y_test, svm_predictions))

Optional: Hyperparameter tuning using GridSearchCV for SVM

svm_param_grid = {
    'C': [0.1, 1, 10],
    'gamma': ['scale', 'auto'],
    'kernel': ['linear', 'rbf']
}

svm_grid = GridSearchCV(SVC(random_state=42), svm_param_grid, cv=5, scoring='accuracy')


svm_grid.fit(X_train, y_train)
print("Best SVM Parameters:", svm_grid.best_params_)
print("Best SVM Accuracy:", svm_grid.best_score_)
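Two points about the grid search may be worth illustrating. First, `gamma` has no effect on the linear kernel, so a single dictionary grid fits redundant linear candidates; passing a list of dictionaries avoids that. Second, `best_score_` is a mean cross-validated score on the training data, not a test-set score. The sketch below shows both on synthetic data; the dataset and the 80/20 split are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=150, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# gamma is searched only for the rbf kernel, where it matters.
svm_param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": ["scale", "auto"]},
]
svm_grid = GridSearchCV(SVC(), svm_param_grid, cv=5, scoring="accuracy")
svm_grid.fit(X_train, y_train)

# The refitted best estimator can be scored on the held-out test set.
tuned_predictions = svm_grid.best_estimator_.predict(X_test)
test_accuracy = accuracy_score(y_test, tuned_predictions)
print("Best SVM Parameters:", svm_grid.best_params_)
print("Held-out test accuracy:", test_accuracy)
```

This grid evaluates 9 parameter combinations instead of the 12 produced by the single-dictionary form.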
