Data4800 Report Ai
Fig 1 : Orange WorkFlow

3.3. Exploratory Data Analysis (EDA)

EDA involves summarizing and visualizing the main characteristics of the dataset. Before performing EDA, however, the data must be preprocessed. Preprocessing ensures the quality and usability of the dataset and involves handling missing values, encoding categorical variables, and scaling numeric features.

Fig 3: Department

Research & Development is the department with the most employees, though it has a moderate level of attrition. The attrition rate is higher in the Sales department, which suggests that this department may have problems retaining employees. The Human Resources department has the highest retention rate, which means that its employees are the most likely to remain with the company.
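The three preprocessing steps named in Section 3.3 (handling missing values, encoding categorical variables, scaling numeric features) can be illustrated with a minimal sketch. The report performs these steps in Orange; the Python below is only an assumed equivalent, and the column names, toy records, and helper functions are hypothetical rather than taken from the report's dataset.

```python
from statistics import mean, median, pstdev

# Hypothetical toy records; column names and values are illustrative
# assumptions, not the report's data.
rows = [
    {"Age": 41, "MonthlyIncome": 5993, "Department": "Sales"},
    {"Age": 49, "MonthlyIncome": None, "Department": "Research & Development"},
    {"Age": 37, "MonthlyIncome": 2090, "Department": "Human Resources"},
    {"Age": None, "MonthlyIncome": 2909, "Department": "Sales"},
]

NUMERIC = ["Age", "MonthlyIncome"]
CATEGORICAL = ["Department"]


def impute_median(rows, column):
    """Replace missing (None) values in a numeric column with the column median."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    for r in rows:
        if r[column] is None:
            r[column] = fill


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    for r in rows:
        value = r.pop(column)
        for c in categories:
            r[f"{column}={c}"] = 1 if value == c else 0


def standardize(rows, column):
    """Rescale a numeric column to zero mean and unit (population) standard deviation."""
    values = [r[column] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    for r in rows:
        r[column] = (r[column] - mu) / sigma if sigma else 0.0


def preprocess(rows):
    """Impute, encode, then scale; the input list is left unmodified."""
    rows = [dict(r) for r in rows]  # work on a copy
    for col in NUMERIC:
        impute_median(rows, col)
    for col in CATEGORICAL:
        one_hot(rows, col)
    for col in NUMERIC:
        standardize(rows, col)
    return rows


clean = preprocess(rows)
```

Imputing before scaling matters here: the mean and standard deviation used for standardization are computed only after the gaps have been filled, so every row ends up on the same scale.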
Fig 4: Education
The above graph reveals that employees with less than two years with the current manager show a higher attrition rate, which suggests that the time spent with the current manager influences their decision to leave the organization. The attrition rate is lower for employees who have worked with the current manager for a moderate to long period (3 to 9+ years).
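The group-wise attrition rates discussed for the figures above (by department, by years with the current manager) are simply the share of leavers within each group. A minimal sketch follows, assuming hypothetical field names (`department`, `years_with_manager`, `left`) and made-up records; the two-year cut-off mirrors the threshold mentioned in the text, but the banding function itself is an assumption.

```python
from collections import defaultdict

# Hypothetical records; values are invented for illustration only.
records = [
    {"department": "Sales", "years_with_manager": 0, "left": True},
    {"department": "Sales", "years_with_manager": 1, "left": True},
    {"department": "Sales", "years_with_manager": 4, "left": False},
    {"department": "Research & Development", "years_with_manager": 1, "left": False},
    {"department": "Research & Development", "years_with_manager": 7, "left": False},
    {"department": "Research & Development", "years_with_manager": 3, "left": True},
    {"department": "Research & Development", "years_with_manager": 9, "left": False},
    {"department": "Human Resources", "years_with_manager": 5, "left": False},
]


def attrition_rate_by_group(records, key):
    """Return {group: leavers / headcount} for the grouping defined by key(record)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [leavers, headcount]
    for rec in records:
        group = key(rec)
        counts[group][0] += int(rec["left"])
        counts[group][1] += 1
    return {g: leavers / total for g, (leavers, total) in counts.items()}


def tenure_band(rec):
    """Assumed banding: under two years with the current manager vs. two or more."""
    return "<2 years" if rec["years_with_manager"] < 2 else "2+ years"


by_department = attrition_rate_by_group(records, key=lambda r: r["department"])
by_tenure = attrition_rate_by_group(records, key=tenure_band)
```

Reporting the rate rather than the raw count of leavers keeps large groups such as Research & Development comparable with small ones such as Human Resources.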
However, its performance declines on test data, with AUC at 0.759, CA at 0.861, and MCC at 0.345, which indicates that the model has overfit. The low MCC value on test data indicates a poor balance between true and false predictions across both classes. (Pratt, Boudhane and Cakula, 2021)

Fig 7: Gradient Boosting Configuration

3.6. Model 3 : Neural Network
Fig 8 : Neural Network Configuration

3.7. Model 4 : Logistic Regression

The Logistic Regression model performs consistently on training and test data. The AUC, CA, and MCC values on training data are 0.838, 0.874, and 0.438, respectively. On test data, they are 0.847, 0.864, and 0.376. This indicates that the model has generalized the patterns effectively without overfitting. (Chakraborty et al., 2021)

Training Data

Test Data

Metric      Random Forest   Gradient Boosting   Neural Network   Logistic Regression
AUC         0.817           0.781               0.817            0.847
CA          0.864           0.878               0.871            0.864
F1 Score    0.830           0.865               0.858            0.839
Precision   0.852           0.865               0.856            0.846
Recall      0.864           0.878               0.871            0.864
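The MCC values quoted in the model sections, and the training-versus-test comparison used to judge overfitting, can be made concrete with a short sketch. The `mcc` function follows the standard Matthews correlation coefficient formula; the confusion-matrix counts are hypothetical, while the Logistic Regression training and test scores are the ones reported in Section 3.7. The `generalization_gap` helper is an illustrative name, not something the report or Orange defines.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


def classification_accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct (CA)."""
    return (tp + tn) / (tp + tn + fp + fn)


def generalization_gap(train, test):
    """Training score minus test score per metric; large positive gaps hint at overfitting."""
    return {metric: round(train[metric] - test[metric], 3) for metric in train}


# Hypothetical counts for an imbalanced test set (not taken from the report):
# 10 actual leavers, 90 actual stayers.
tp, tn, fp, fn = 5, 85, 5, 5
ca_example = classification_accuracy(tp, tn, fp, fn)   # 0.90
mcc_example = mcc(tp, tn, fp, fn)                      # about 0.444

# Logistic Regression scores as reported in Section 3.7.
lr_train = {"AUC": 0.838, "CA": 0.874, "MCC": 0.438}
lr_test = {"AUC": 0.847, "CA": 0.864, "MCC": 0.376}
lr_gap = generalization_gap(lr_train, lr_test)
```

In the hypothetical example, accuracy is 0.90 while MCC is only about 0.44, because MCC uses all four confusion-matrix cells and is not inflated by the majority class. This is why a model can show a high CA alongside a modest MCC on an imbalanced attrition dataset.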