Educational Data Mining For Student Placement Prediction Using Machine Learning Algorithms - Sreenivasa Rao - International Journal of Engineering & Technology

International Journal of Engineering & Technology, 7 (1) (2018) 33-4

Research paper

Educational data mining for student placement prediction using machine learning algorithms

K. Sreenivasa Rao 1*, N. Swapna 2, P. Praveen Kumar 3

1 Department of Computer Science and Engineering, Vignana Bharathi Institute of Technology, Hyderabad, India, [email protected]
2 Department of Computer Science and Engineering, Vignana Bharathi Institute of Technology, Hyderabad, India, swapnakiran29@gmail.com
3 Research Scholar, Veltech Dr. RR & SR University, Chennai, India, [email protected]
*Corresponding author E-mail: [illegible]

Abstract

Data mining is the process of extracting useful information from large sets of data. It gives users insight into the data and lets them make useful decisions from the knowledge mined from databases. The purpose of higher education organizations is to offer superior opportunities to their students. Like data mining in general, Educational Data Mining (EDM) is nowadays considered a powerful tool in the field of education. It offers an effective method for mining a student's performance on various parameters in order to predict and analyze whether the student will be recruited in the campus placement. Predictions are made using the machine learning algorithms J48, Naïve Bayes, Random Forest, and Random Tree in the Weka tool, and Multiple Linear Regression, binomial logistic regression, Recursive Partitioning and Regression Tree (rpart), conditional inference tree (ctree), and Neural Network (nnet) in R Studio. The results obtained from each approach are then compared with respect to performance and accuracy by graphical analysis. Based on the results, higher education organizations can offer superior training to their students.

Keywords: Data Mining, Educational Data Mining, machine learning algorithms.

Copyright © 2018 Dr. K. Sreenivasa Rao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

The purpose of higher education organizations is to offer superior opportunities to their students. Placements are considered very important for every student in a college. Parents and students choose colleges based on the placement record of the organization, and organizations are ranked on the same basis. Hence it is beneficial for every organization to have an approach for predicting the placement chances of each student from a set of attributes and parameters. Educational data mining involves new methods and approaches for discovering knowledge by analyzing student databases, in order to support the decision making process of educational institutions in offering the best training to their students.

The placement of a student depends not only on his/her academic capabilities but also on attributes such as performance in placement assessment examinations conducted by assessment agencies (e.g. Co-Cube), communication skills, etc. Decisions are therefore made towards the best prediction of campus placement, and towards identifying which parameter of the student contributes most to placement. In this work we collected final year student data comprising 5 attributes: SSC %, Inter %, B.Tech aggregate %, Co-Cube score, and an attribute called "placed" that tells us whether the student got placed or not. Machine learning algorithms are applied in the Weka tool and R Studio on the final year student dataset. Actual and predicted placement status are compared for accuracy. The efficiency/accuracy of each model is visualized and tested, and the results of each model are discussed based on the performance analysis.

2. Literature survey

Molina et al. presented a case study with educational datasets using a meta-learning approach for automatic parameter tuning [1]. They used 14 educational datasets and the J48 algorithm with only 2 of its parameters, and concluded that the meta-learning approach can be used for parameter tuning of decision tree algorithms. T. Jeevalatha et al. used decision tree algorithms to predict the selection of students for placements [2]. They used Decision Tree (DT) algorithms such as C4.5, ID3, and CHAID, developed using the Rapid Miner data mining software tool.

Neelam Naik and Seema Purohit built a model to classify the placement performance of students [3]. The error produced in classifying validation data was 38.46% for the result prediction classification tree and 15.38% for the placement prediction classification tree. Ajay Kumar Pal and Saurabh Pal collected data for the study and analysis of students' educational performance, mainly for training and placement. The authors used different classification algorithms and the WEKA data mining tool [4]. They concluded that the Naïve Bayes classification model is the better algorithm on the placement data, with an accuracy of 86.15% and an overall model build time of 0 sec. Compared with the others, the Naïve Bayes classifier had the lowest average error, i.e. 0.28. Ajay Shiv Sharma et al. used the logistic regression model and developed the placement prediction system (PPS) [5]. The accuracy [...] and testing of the algorithm was 98.93% and [...]. Bahar Sen, Emine Ucar and Dursun Delen collected a large and feature-rich dataset and built a model to predict placement test results [6]. They used a support vector machine, the C5 Decision Tree algorithm, and an artificial neural network. They concluded that the C5 Decision Tree algorithm is the better prediction model with an efficiency of 95%; the accuracy of the support vector machine and the artificial neural network is 91% and 89% respectively. Mangasuli Sheetal B and Prof. Savita Bakare made predictions using the data mining algorithms "Fuzzy logic" and "K nearest neighbor (KNN)" [7]. The accuracy obtained after analysis is 97.33% for KNN and 92.07% for Fuzzy logic. Ramanathan et al. predicted the placement of students using a similarity measure with a mathematical method called sum of difference (SOD) [8]. They made it clear that placement is not easy to predict because it depends on many attributes, even though their paper considers only four attributes.

Sai Ganesh et al. [9] discussed various applications of data mining. They summarized various data mining techniques and algorithms and their contribution to various areas of Educational Data Mining. They concluded that, apart from the contribution of EDM in higher education, EDM can be extended to analyze the knowledge process of primary class students to identify their learning problems.

John Jacob et al. [10] predicted student performance using data mining techniques such as regression and decision trees to identify academic failure of students. They also used clustering to group students by academic performance based on their strengths and weaknesses. They identified that, apart from the challenges and cost involved in EDM, EDM implementation requires attention to the privacy and ethics of all the stakeholders involved in the EDM process.

3. Methodology

The proposed placement prediction system is equipped with various data mining tasks and is depicted in the following architecture diagram. Educational data consisting of students' details, their marks, etc. are collected. In our dataset we collected Serial No., HTNo, SSC %, B.Tech % and Co-Cube score. Preprocessing is then performed to eliminate irrelevant attributes such as Serial No. and HTNo, since they do not play any role in the analysis. The entire dataset is replicated into two sets, namely training data and test data; there is no difference between the two sets except that the predicted placement status is placed in one column before the original placement status column. Machine learning algorithms such as J48, Naïve Bayes, Random Forest, and Random Tree are run in the Weka tool to build the model, and the learned model is used to predict the placement on the test data. After predicting the placement on the test data, we compare the actual placement and the predicted placement for accuracy analysis. Similarly, the algorithms Multiple Linear Regression, binomial logistic regression, Recursive Partitioning and Regression Tree (rpart), conditional inference tree (ctree), and Neural Network (nnet) are applied in R Studio and placement is predicted.

Fig. 3.1: Placement Prediction System Architecture. [Figure: Database; Preprocessing; Training dataset and Test dataset; apply the algorithms on training data (model learning); apply the learned model on test data (prediction); accuracy comparison and conclusions.]

4. Experimental results discussion

4.1. Performance of algorithms in weka tool

We ran various machine learning algorithms on our student dataset in the Weka tool and tabulated the analysis parameters in the table given below. The table shows that the Random Forest and Random Tree algorithms give 100 % accuracy on the student placement dataset, the J48 algorithm has 88.89 % accuracy, and Naïve Bayes reaches only 61.10 %. This depends on the nature of the dataset: since our data has only numerical values, the Random Forest and Random Tree algorithms performed well. On other datasets J48 and Naïve Bayes may perform well.
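The accuracy percentages quoted for each classifier follow from the four confusion-matrix counts obtained by comparing actual and predicted placement status. As a minimal sketch of that comparison (not the Weka or R code used in the paper), the Python snippet below counts true/false positives and negatives and derives accuracy as (TP + TN) / total. The function names, the "placed" / "not placed" labels, and the 18 sample records are assumptions made for illustration only; the sample is simply chosen so that 16 of 18 predictions are correct.

```python
def confusion_counts(actual, predicted, positive="placed"):
    """Count TP, FP, TN, FN by comparing actual and predicted labels pairwise."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted placed, actually placed
        elif p == positive:
            fp += 1  # predicted placed, actually not placed
        elif a == positive:
            fn += 1  # predicted not placed, actually placed
        else:
            tn += 1  # predicted not placed, actually not placed
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def accuracy(counts):
    """Accuracy = correctly classified records / all records."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no records to evaluate")
    return (counts["tp"] + counts["tn"]) / total


# Hypothetical records (NOT taken from the paper's dataset):
# 12 students actually placed, 6 actually not placed.
actual = ["placed"] * 12 + ["not placed"] * 6
predicted = (
    ["placed"] * 11 + ["not placed"] * 1    # predictions for the 12 placed students
    + ["placed"] * 1 + ["not placed"] * 5   # predictions for the 6 not-placed students
)

counts = confusion_counts(actual, predicted)
print(counts)
print(f"accuracy = {100 * accuracy(counts):.2f} %")
```

With these illustrative records, 16 of the 18 predictions agree with the actual status, so the reported figure is 16/18. The same two functions can be applied to the actual and predicted columns exported from any of the tools.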
Table 4.1: Performance of Various Algorithms on Student Dataset

                   J48           Naïve Bayes   Random Forest   Random Tree
True Positives     [illegible]   11            2               3
False Positives    1             3             0               0
True Negatives     [illegible]   4             6               6
False Negatives    8             [illegible]   0               0
Accuracy           88.89%        61.10%        100%            100%

Fig. 3.2: Classification Tree Generated by Random Tree Algorithm. [Figure not recoverable from the source.]

4.2. Performance of algorithms in R Studio

We performed multiple regression on our dataset and analyzed which attribute of the student contributes most towards placement by eliminating and adding each attribute in the multiple regression. The regression output is tabulated in Table 4.2.

From the table it is evident that B.Tech percentage has contributed most towards placement of the students whose data is collected in this paper. This insight holds for this dataset only; it may be different for other datasets.

We also ran various machine learning algorithms in R Studio, namely Multiple Regression, binomial logistic regression, recursive partitioning & regression tree, conditional inference tree, and neural networks, and tabulated the accuracy in Table 4.3. Here it is clear that recursive partitioning & regression tree offers high accuracy compared to the other methods. The performance of the algorithms also depends on the nature of the dataset.

Table 4.2: Multiple Regression Summary of Dataset

Model   Variables               P-values      Model P-value   Adjusted R-Square   Remarks
1       SSC                     [illegible]   0.06222         [illegible]
2       B.Tech                  [illegible]   [illegible]     [illegible]         MORE SIGNIFICANT
3       SSC, B.Tech, Co-Cube    [illegible]   [illegible]     [illegible]
4       B.Tech, Co-Cube         [illegible]   [illegible]     [illegible]
5       Co-Cube                 [illegible]   0.706           0.009               NOT SIGNIFICANT

[...] indicates significance.

Table 4.3: Performance Analysis of Various Algorithms in R Studio on Our Dataset

                   Multiple      Binomial Logistic   Recursive Partitioning   Conditional        Neural
                   Regression    Regression          & Regression Tree        Inference Tree     Network
True Positives     [illegible]   [illegible]         [illegible]              [illegible]        [illegible]
False Positives    [illegible]   [illegible]         [illegible]              [illegible]        [illegible]
True Negatives     67            64                  [illegible]              [illegible]        [illegible]
False Negatives    23            20                  [illegible]              [illegible]        [illegible]
Accuracy           74.44%        74.44%              90%                      [illegible]        [illegible]

Only three values survive in the source for each of the True Positives (3, 10, 0) and False Positives (3, 4, 0) rows; their column assignment is not recoverable.

5. Conclusion

Here educational data mining is performed on the final year student information of an organization to predict campus placements of students. Machine learning algorithms are run in the Weka environment and R Studio. The results of applying the algorithms are tabulated and analyzed; they show that the Random Tree algorithm gives 100 % prediction accuracy on our dataset, and that in the R environment Recursive Partitioning & Regression Tree performs better and gives 90 % accuracy. We also accept that performance depends on the nature of the dataset. We conclude that the B.Tech percentage attribute contributes most towards placement of students. Again, this may vary from dataset to dataset.

References

[1] Molina, M. M., Luna, J. M., Romero, C., & Ventura, S., 2012, "Meta-learning approach for automatic parameter tuning: a case of study with educational datasets", in Proceedings of the 5th international conference on educational data mining, pp. 180-183.
[2] T. Jeevalatha, N. Ananthi, D. Saravana Kumar, "Performance Analysis of Undergraduate Students Placement Selection using Decision Tree Algorithms", International Journal of Computer Applications (0975 – 8887), Volume 108 – No. 15, December 2015.
[3] Neelam Naik and Seema Purohit, "Prediction of Final Result and Placement of Students using Classification Algorithm", International Journal of Computer Applications (0975 – 8887), Volume 56 – No. 12, October 2012.
[4] Ajay Kumar Pal and Saurabh Pal, "Classification Model of Prediction for Placement of Students", I.J. Modern Education and Computer Science, 2013, 11, 48-56.
[5] Ajay Shiv Sharma, Swaraj Prince, Shubham Kapoor and Keshav Kumar, "PPS - Placement Prediction System using Logistic Regression", IEEE International Conference on MOOC, Innovation and Technology in Education (MITE), 2014.
[6] Bahar Sen, Emine Ucar and Dursun Delen, "Predicting and analyzing secondary education placement-test scores: A data mining approach", International Journal of Expert Systems with Applications, Volume 39, 2012, Issue 10, pp. 9468-9476.
[7] Mangasuli Sheetal B, Prof. Savita Bakare, "Prediction of Campus Placement Using Data Mining Algorithm - Fuzzy logic and K nearest neighbor", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 5, Issue 6, June.
[8] Ramanathan et al., "Mining Educational Data for Students' Placement Prediction using Sum of Difference Method", International Journal of Computer Applications (0975 – 8887), Volume 99 – No. 8, August 2014.
[9] Sai Ganesh A, Joy Christy, "Applications of Educational Data Mining: A Survey", IEEE Sponsored 2nd International Conference ICIECS.
[10] John Jacob, Kavya Jha, Paarth Kotak, Shubha Puthran, "Educational Data Mining Techniques And Their Applications", IEEE International Conference On Green Computing and Internet Of Things (ICGCIoT), 2015. https://doi.org/10.1109/ICGCIoT.2015.7380675
