A Customized Hiring Process: Problem Statement
INTRODUCTION
Many companies compete for talent in the market, and many training sites help candidates,
both freshers and experienced professionals, prepare for and find a suitable role in the
industry. On the hiring side, a company may have screened dozens of applicants, vetted a
select few through multiple stages of its hiring process, and narrowed the field to the final
two candidates. Interviewing the first candidate is like playing a great game of tennis. You
serve the question and she smashes it right back with a well-crafted answer. At times the
conversation is like the perfect rally. You cannot fault her game. The second candidate looks
great on paper, but she is stumbling, struggling to find her feet. She is just not giving you
any kind of game. You think she can do the job, but she is not convincing you.
Problem Statement
As new technologies evolve day by day, human resourcing faces peculiar challenges in meeting
requirements that differ from client to client. The same set of resumes for the same job
description (JD) does not work for all clients, because every organization reads a resume from
a different point of view. For serious organizations, merely matching skills and experience is
no longer sufficient on its own. For example, some companies consider domain expertise, while
others give more importance to the number of skills and the total years of professional
experience. Human Resource (HR) agencies use various head-hunting tools and online search
methods. These search methods are connected to databases of millions of resumes.
There is no portal where candidates can take tests and find out what kind of company suits
them. Hence, this project makes an effort to judge a candidate's capabilities and to find the
cluster for the candidate based on answer analysis using a KNN classifier.
METHODOLOGY
[Figure: system architecture, showing the middleware, controlling layer, delegate layer and
resume upload components]
This module is responsible for generating the front-end views using the Angular and Ext JS
frameworks along with JavaServer Pages (JSP).
Many servers are available for handling web requests. Most of them are heavyweight and
commercial in nature. Here we use the open-source, lightweight Apache Tomcat server.
This module is responsible for handling the web request and forwarding it to the
authentication layer. It also performs basic validations such as empty checks and regex
validations. If any validation fails, a response is sent to the front end; otherwise the
request is forwarded to the authentication layer and the respective services.
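The empty checks and regex validations described above could look roughly like the following sketch. It is written in Python for brevity rather than in the project's own stack, and the field names, the email pattern and the function name `validate_registration` are assumptions made for illustration; the source does not specify the actual rules.

```python
import re

# Hypothetical pattern: the source does not say which regex rules are applied.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_registration(form):
    """Return a list of error messages; an empty list means the request may be forwarded."""
    errors = []
    # Empty checks on the fields assumed to be mandatory.
    for field in ("name", "password", "email"):
        if not form.get(field, "").strip():
            errors.append(f"{field} must not be empty")
    # Regex check, applied only when the email field is non-empty.
    email = form.get("email", "").strip()
    if email and not EMAIL_RE.match(email):
        errors.append("email is not well formed")
    return errors
```

A non-empty result would be returned to the front end, while an empty result lets the request continue to the authentication layer.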
Delegate Layer
This layer is used to call the respective service in order to perform a specific task.
Registration Service
This module allows the user to register with the application by providing details such as
Name, Password, Confirm Password, Email Id, Gender, Phone No, City, State and Country.
Login Service
This module allows the user to provide a username and password and log in to the application
either as an administrator or as a user.
This module is used by an administrator in order to create a set of questions and each
question will have the following information.
Test Analysis
This module provides the user with aptitude, technical and general questions, analyses all
the answers, and generates a matrix that serves as input to the KNN classification algorithm.
KNN Classification
This module takes the total rating of the answers, counts the nearest neighbors across the
three clusters (Norm Package, Medium Package and High Package), and predicts which kind of
package the candidate is suitable for.
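A minimal sketch of this nearest-neighbor vote is shown below. It assumes the answer analysis is reduced to a single total rating per candidate and that labelled examples are available; the value of `k`, the distance measure (absolute difference) and the function name `knn_predict` are illustrative assumptions, not details taken from the project.

```python
from collections import Counter

def knn_predict(training, rating, k=3):
    """Predict a cluster label for a total rating.

    training: list of (total_rating, cluster_label) pairs (assumed to exist).
    """
    # Sort the labelled examples by absolute distance to the candidate's rating
    # and keep the k closest ones.
    nearest = sorted(training, key=lambda item: abs(item[0] - rating))[:k]
    # Count how many of the k nearest neighbors fall into each cluster.
    votes = Counter(label for _, label in nearest)
    # The cluster with the most neighbors wins; ties go to the closest neighbor's label.
    return votes.most_common(1)[0][0]
```

With an odd `k` and three clusters ties are still possible, so a real implementation would need an explicit tie-breaking rule.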
This module is responsible for creating company information such as company name, company
URL and company description, along with the cluster (Norm, Medium or High), so that after the
KNN classification is done the list of companies can be provided based on the cluster.
This module provides links that candidates can use to prepare for the aptitude, technical
and general questions. Each Preparation Item contains a Name, Link and Category.
Recommendations Service
This module will provide the recommendations of which companies are most suitable for
the candidate.
Resume Upload
The data cleaning algorithm is responsible for removing stop words. Each resume is cleaned
by removing the stop words from it. Stop words are words that do not carry any specific
meaning. The data mining community has defined a set of such words, for example a, able,
about, across, after, all, almost, also, am, among, an, etc.
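The cleaning step can be sketched as follows. The stop-word set here contains only the examples listed above and is therefore incomplete; the function name `remove_stop_words` and the choice to lowercase and split on whitespace are assumptions for illustration.

```python
# Illustrative subset only: these are the example stop words named in the text.
STOP_WORDS = {"a", "able", "about", "across", "after", "all",
              "almost", "also", "am", "among", "an"}

def remove_stop_words(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    kept = [word for word in text.lower().split() if word not in STOP_WORDS]
    return " ".join(kept)
```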
Tokenization of Resumes
Tokenization is the process of converting the clean data into a set of words known as
tokens.
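One possible tokenizer is sketched below. The character class, which keeps `+` and `#` so that skills such as "c++" and "c#" survive, is an assumption; the source does not describe how token boundaries are chosen.

```python
import re

def tokenize(clean_text):
    """Split cleaned resume text into lowercase tokens."""
    # Keep runs of letters, digits, '+' and '#'; everything else is a separator.
    return re.findall(r"[a-z0-9+#]+", clean_text.lower())
```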
Frequency Computation of Resumes
In this step the frequency computation is performed for each of the resumes. The frequency
is the number of times the i-th token appears in the j-th resume.
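As a sketch, the per-resume token counts can be held in one counter per resume, so that entry j gives the frequency of each token in the j-th resume. The function name `term_frequencies` and the list-of-token-lists input are assumptions.

```python
from collections import Counter

def term_frequencies(tokenized_resumes):
    """Return one Counter per resume; result[j][token] is the count of token in resume j."""
    return [Counter(tokens) for tokens in tokenized_resumes]
```

A `Counter` returns 0 for tokens that never occur, which matches the definition of frequency for absent tokens.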
TF-IDF Computation of Resumes
This module computes the inverse document frequency (IDF) from the number of resumes and
combines it with the token frequency (TF) within each resume.
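The source does not state which TF-IDF variant is used. The sketch below assumes the common form tf(t, j) * log(N / df(t)), where N is the number of resumes and df(t) is the number of resumes containing token t; the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter

def tf_idf(tokenized_resumes):
    """Return one dict per resume mapping each token to tf * log(N / df)."""
    n = len(tokenized_resumes)
    # Document frequency: number of resumes that contain each token at least once.
    df = Counter()
    for tokens in tokenized_resumes:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized_resumes:
        tf = Counter(tokens)
        scores.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return scores
```

Under this variant a token that appears in every resume scores zero, since it does not help distinguish one resume from another.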
This module trains the support vector machine on the training data set and then computes the
attribute frequencies. It finds an appropriate kernel and then classifies the domain to which
the resume most likely belongs. The module also computes the distance and uses it to classify
the domain to which each resume belongs.
Ranking of Resumes
The entire query is divided into tokens, the frequency of those tokens across the various
resumes is found, and finally the resumes are ranked in descending order of that frequency.
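The ranking described above can be sketched as follows, assuming each resume is already tokenized and that the score of a resume is the summed frequency of the query tokens in it. The function name `rank_resumes` and the dictionary input are assumptions.

```python
from collections import Counter

def rank_resumes(query, resumes):
    """Rank resumes by how often the query tokens occur in them.

    resumes: dict mapping a resume id to its list of tokens (assumed shape).
    Returns (resume_id, score) pairs in descending order of score.
    """
    query_tokens = query.lower().split()
    scores = {}
    for resume_id, tokens in resumes.items():
        counts = Counter(tokens)
        scores[resume_id] = sum(counts[t] for t in query_tokens)
    # Python's sort is stable, so resumes with equal scores keep their input order.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```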
This module combines multiple criteria of the resume and ranks the best resumes for
multi-attribute searches by intersecting the result sets produced by the various algorithms.
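One way to read this intersection step is sketched below: each algorithm returns its own ranked list of resume ids, and only the resumes present in every list are kept. Ordering the result by the first list is an assumption, since the source does not say how the combined order is decided; `intersect_rankings` is a hypothetical name.

```python
def intersect_rankings(*ranked_lists):
    """Keep the resume ids that appear in every ranked list, in the order of the first list."""
    if not ranked_lists:
        return []
    # Set intersection over the result sets of all algorithms.
    common = set(ranked_lists[0]).intersection(*ranked_lists[1:])
    return [resume_id for resume_id in ranked_lists[0] if resume_id in common]
```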
Objectives
1. The first objective is to classify candidates using the KNN machine learning algorithm
for various company packages: HIGH, MEDIUM and LOW.
2. The second objective is to recommend a list of companies to the candidate based on the
answer analysis.
3. The third objective is to classify resumes into testing and development profiles using
SVM.
4. The fourth objective is to give HR the capability of ranking resumes by specific criteria
keywords, so that the resume that best suits the requirement is ranked highest based on a
modified feature vector.
Hardware Requirements
Sl No   Parameter   Description
1       RAM         4 GB - 8 GB
2       Hard Disk   500 GB - 1 TB
Software Requirements