Student Success Prediction Using Neural Networks
COMP-258 Neural Networks
Group 5
Our Group
❖ Antony Tibursias - Designed the CNN architecture.
❖ Jaspreet Kaur - Handled data preprocessing and feature engineering.
❖ Sumanth Koppinadi - Developed the Flask-based REST API and Dockerized the back-end.
❖ Ajaypal Singh - Built the React front-end and integrated it with the API.
❖ Mohana Subramanian - Managed deployment on Render and coordinated the testing process.
Our Focus: Predicting academic outcomes with AI
Integration Hurdles
Establishing smooth communication between the Flask API and React front-end.
Synchronizing environments to ensure seamless functionality.
Deployment on Render
Configuring back-end and front-end deployment pipelines with Docker.
Addressing real-world accessibility and performance issues.
Team Coordination Challenges
Balancing workload and aligning deliverables among diverse team roles.
Overcoming communication barriers for smoother collaboration.
Real-Time Application
Ensuring the predictive model's reliability under real-time data inputs.
Adapting the system for unforeseen user behaviors and dynamic datasets.
Our Workflow
Data Collection & Preprocessing:
•Raw academic and demographic data collected.
•Data cleaning, feature engineering, and scaling.
Model Training:
•Neural network trained on processed data using TensorFlow.
•Fine-tuning hyperparameters such as the learning rate, number of epochs, and layer sizes.
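A minimal training sketch, assuming a Keras Sequential model; the layer sizes, learning rate, and epoch count here are illustrative, not the project's exact values:

```python
import numpy as np
import tensorflow as tf

# Placeholder data matching the dataset's shape: 14 features, binary target.
X_train = np.random.rand(200, 14).astype("float32")
y_train = np.random.randint(0, 2, size=(200, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(14,)),                     # 14 independent variables
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of success
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy",
              metrics=["accuracy"])
history = model.fit(X_train, y_train, epochs=50,
                    batch_size=32, validation_split=0.2)
```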
Backend Development:
•Flask API to serve predictions from the trained model.
•API endpoint accepts a JSON feature payload and returns the model's prediction.
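A minimal sketch of such an endpoint; the route name, model path, and payload format are assumptions for illustration:

```python
from flask import Flask, request, jsonify
import numpy as np
import tensorflow as tf

app = Flask(__name__)
model = tf.keras.models.load_model("model.h5")  # assumed model path

@app.route("/predict", methods=["POST"])  # assumed route name
def predict():
    payload = request.get_json()
    # Expect a list of 14 preprocessed feature values.
    features = np.array(payload["features"], dtype=float).reshape(1, -1)
    probability = float(model.predict(features)[0][0])
    return jsonify({"success": int(probability >= 0.5),
                    "probability": probability})

if __name__ == "__main__":
    app.run()
```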
Frontend Development:
•React.js application to interact with users.
•Form to input features and display predictions.
Integration:
•Frontend communicates with backend via API.
•JSON payload used to send feature data and receive predictions.
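The request/response shape can be illustrated with Python's requests library (the actual client is the React front-end; the URL and feature values below are hypothetical):

```python
import requests

# Hypothetical Render URL and 14-value feature vector.
resp = requests.post(
    "https://student-success-api.onrender.com/predict",
    json={"features": [3.2, 3.5, 82.0, 41.0, 1, 0, 1, 0, 2, 1, 0, 1, 1, 0]},
)
print(resp.json())  # e.g. {"probability": 0.87, "success": 1}
```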
Deployment:
•Application hosted on Render.com.
•Backend and frontend deployed seamlessly with environment configurations.
Dataset Overview
Dataset Description:
•Includes academic, demographic, and program-related features for students.
•Target Variable: binary (1 = Success, 0 = Failure).
Features (14 Independent Variables):
•Numerical:
•First Term GPA, Second Term GPA, High School Average Marks.
•Math Score, English Grade.
•Categorical:
•Gender, Age Group, Residency.
•Previous Education, First Language.
•School, Coop, Fast Track, Funding.
Target Variable Distribution:
•Success: 80%.
•Failure: 20% (class imbalance).
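This imbalance can be confirmed quickly once the data is loaded; a sketch assuming a pandas DataFrame with a hypothetical FirstYearPersistence target column:

```python
import pandas as pd

df = pd.read_csv("student_data.csv")  # assumed file name
counts = df["FirstYearPersistence"].value_counts(normalize=True)
print(counts)  # expected roughly: 1 -> 0.80, 0 -> 0.20
```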
Dataset Features
Academic Performance:
• First Term GPA, Second Term GPA (Scale: 0.0–4.5).
• High School Average Mark (Scale: 0–100).
• Math Score (Scale: 0–50).
Demographics:
• Gender (Male/Female).
• Age Group (Under 20, 20–25, Above 25).
• Residency Status (Domestic/International).
Programs:
• Fast Track, Co-op Participation, Funding Type (Scholarship/No Scholarship).
• School Categories: Business, Engineering, Community, and Health.
Target Variable:
• Binary Classification:
• 1 = Success (Persisted).
• 0 = Failure (Did Not Persist).
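Because several of these features are categorical, they must be numerically encoded before training; a sketch using pandas one-hot encoding (the encoding choice and column names are assumptions, since the slides do not specify them):

```python
import pandas as pd

df = pd.read_csv("student_data.csv")  # assumed file name
# Hypothetical names for a subset of the categorical columns.
categorical_cols = ["Gender", "AgeGroup", "Residency", "School"]
df = pd.get_dummies(df, columns=categorical_cols, drop_first=True)
```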
Data Preprocessing
1. Missing Value Handling:
•Step: Removed rows with missing values to ensure clean and complete data for model training.
•Tools Used:
•Pandas
•Impact: This ensures no null values disrupt the model's learning process but may reduce the dataset size.
2. Data Filtering and Normalization:
•Filtering:
•Outlier detection and removal (if applicable) to avoid skewing the model.
•Filtering rows or columns based on domain knowledge or redundancy.
•Normalization:
•Used normalization techniques to bring numerical data to a standard scale, such as between 0 and 1.
•Purpose: Improves the convergence speed of neural networks and ensures features contribute equally to the learning process.
•Tools Used:
•Scikit-learn's StandardScaler or MinMaxScaler.
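A minimal sketch combining both steps, assuming MinMaxScaler and hypothetical column names:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("student_data.csv")  # assumed file name
df = df.dropna()                      # 1. remove rows with missing values

# Hypothetical names for the numerical columns.
numeric_cols = ["FirstTermGPA", "SecondTermGPA", "HighSchoolAverage", "MathScore"]
scaler = MinMaxScaler()               # 2. scale numeric features to [0, 1]
df[numeric_cols] = scaler.fit_transform(df[numeric_cols])
```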
Loss Function
Purpose: Measures the error in predictions during training.
Type: Binary Cross-Entropy Loss.
Reason: Suitable for binary classification problems as it calculates the difference between predicted probabilities and actual labels.
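For reference, binary cross-entropy over $N$ samples with true labels $y_i \in \{0, 1\}$ and predicted probabilities $p_i$:

$$\mathcal{L}_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[\,y_i \log p_i + (1 - y_i)\log(1 - p_i)\,\right]$$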
Optimizer
Purpose: Updates weights to minimize the loss.
Type: Adam Optimizer.
Reason: Combines the advantages of momentum and adaptive learning rate for faster convergence.
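The standard Adam update for parameters $\theta$ with gradient $g_t$, moment estimates $m_t$ and $v_t$, decay rates $\beta_1, \beta_2$, and learning rate $\alpha$:

$$m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^2$$
$$\hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}, \qquad \theta_t = \theta_{t-1} - \frac{\alpha\,\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$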
Model Evaluation
• Training and validation accuracy/loss plots.
• Observations:
• Model convergence.
• Minimal overfitting.
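A sketch of how the accuracy plot can be produced from the History object returned by model.fit (reusing the history variable from the training sketch above):

```python
import matplotlib.pyplot as plt

# Compare training and validation accuracy to check convergence and overfitting.
plt.plot(history.history["accuracy"], label="train accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()
```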
Deployment Workflow
Back-End:
•Flask API with Dockerization.
Front-End:
•React app for user interaction.
Platform:
•Render deployment.
End-to-end workflow: data preprocessing → model training → Flask API → React front-end → Render deployment.
Challenges and Lessons Learned
Challenges:
•Data preprocessing complexities.
•Integration bugs between API and front-end.
Lessons:
•Team coordination is crucial.
•Importance of deployment testing.
Conclusion and Summary:
•Achieved a reliable prediction system.
•Deployed successfully for real-time usage.