Assignment-2 IDS
Problem Statement
Formulate a classification problem that closely aligns with a real-life system within our workplace. The
problem should be significant and should stand to benefit from machine learning classification techniques.
Choose a problem that is genuinely relevant to our work environment.
Dataset Requirements
1. The dataset should be relevant to the chosen problem and reflect its real-life
aspects.
2. The dataset should consist of a large number of records, approximately 5000 or more, to ensure a
robust analysis.
3. It should have a sufficient number of attributes (10 or more) with various types, including
numerical, nominal, and categorical.
4. You may reuse the dataset chosen for Assignment-1, provided it is a classification dataset.
Write Python scripts for:
Decision Tree Implementation
Split the dataset into training and testing sets
Implement a decision tree classifier using scikit-learn's DecisionTreeClassifier class.
Train the decision tree classifier on the training dataset.
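The three steps above can be sketched as follows. This is a minimal illustration, not a prescribed solution: the synthetic data from make_classification is only a placeholder for the group's own dataset, and the 80/20 split, stratification, and random_state value are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (assumption): 5000 records, 10 numeric attributes.
# Replace this with the group's own workplace dataset.
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=6, random_state=42
)

# Hold out 20% of the records for testing; stratify keeps the class ratio
# the same in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Untuned decision tree with scikit-learn's default hyperparameters.
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

print("Training records:", X_train.shape[0])
print("Testing records:", X_test.shape[0])
print("Tree depth:", clf.get_depth())
```

A real dataset with nominal or categorical attributes would need to be encoded as numbers before being passed to fit, since DecisionTreeClassifier expects numeric input.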
Model Evaluation
Evaluate the performance of the trained decision tree classifier using appropriate evaluation
metrics such as accuracy, precision, recall, F1-score, and confusion matrix. Interpret the findings
from each of the evaluation metrics.
Visualize the decision tree using graphviz or any other suitable visualization tool/library.
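One possible way to compute the listed metrics and produce a Graphviz description of the tree is sketched below. The data is again a synthetic placeholder, the problem is assumed to be binary, and the names feature_0 ... feature_9, class_0, and class_1 are hypothetical labels for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Placeholder binary dataset (assumption); substitute the real data.
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=6, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Predictions on the held-out test set drive every metric below.
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)  # of predicted positives, share correct
recall = recall_score(y_test, y_pred)        # of actual positives, share found
f1 = f1_score(y_test, y_pred)                # harmonic mean of precision and recall
cm = confusion_matrix(y_test, y_pred)        # rows: actual class, columns: predicted

print(f"Accuracy:  {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1-score:  {f1:.3f}")
print("Confusion matrix:")
print(cm)
print(classification_report(y_test, y_pred))

# DOT source for the top three levels of the tree. With out_file=None,
# export_graphviz returns the DOT text as a string instead of writing a file.
dot_source = export_graphviz(
    clf,
    out_file=None,
    max_depth=3,
    filled=True,
    feature_names=[f"feature_{i}" for i in range(X.shape[1])],  # hypothetical names
    class_names=["class_0", "class_1"],                         # hypothetical names
)
print(dot_source[:200])
```

If the graphviz Python package and the Graphviz binaries are installed, the DOT text can be rendered in a Jupyter notebook with graphviz.Source(dot_source). sklearn.tree.plot_tree is an alternative that needs only matplotlib.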
Hyperparameter Tuning
Explore different hyperparameters of the decision tree classifier (e.g., max_depth,
min_samples_split, min_samples_leaf, etc.).
Evaluate the performance of the tuned model and compare it with the untuned model.
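A hedged sketch of the tuning and comparison steps, using GridSearchCV as one of several possible search strategies. The grid values, the 5-fold cross-validation, and the choice of F1 as the scoring metric are assumptions made for illustration, and the data is again a synthetic placeholder.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder binary dataset (assumption); substitute the real data.
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=6, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Untuned baseline with default hyperparameters.
baseline = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Candidate values for the hyperparameters named in the assignment
# (illustrative choices, not recommended settings).
param_grid = {
    "max_depth": [3, 5, 10, None],
    "min_samples_split": [2, 10, 50],
    "min_samples_leaf": [1, 5, 20],
}

# Exhaustive search with 5-fold cross-validation on the training set only,
# so the test set stays untouched until the final comparison.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="f1",
)
search.fit(X_train, y_train)
tuned = search.best_estimator_

# Compare both models on the same held-out test set.
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))
tuned_acc = accuracy_score(y_test, tuned.predict(X_test))
baseline_f1 = f1_score(y_test, baseline.predict(X_test))
tuned_f1 = f1_score(y_test, tuned.predict(X_test))

print("Best hyperparameters:", search.best_params_)
print(f"Untuned: accuracy={baseline_acc:.3f}, F1={baseline_f1:.3f}")
print(f"Tuned:   accuracy={tuned_acc:.3f}, F1={tuned_f1:.3f}")
```

The tuned model is not guaranteed to score higher on the test set. Reporting both sets of numbers side by side is what makes the comparison meaningful.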
Instructions
This is a group assignment with three members in each group.
Choose a unique problem statement and data set for our analysis.
Utilize Jupyter notebook for scripting and documentation.
Include visuals where applicable.
Ensure the code is well-documented, providing clear explanations for each step.
Submit the assignment as a single document in PDF or Jupyter notebook format.
Deliverables
How to Submit
Evaluation Criteria