M9 Documentation

The document presents a mini project focused on forecasting cancer death cases in India using supervised machine learning techniques. It analyzes cancer mortality data from 1990 to 2017 and employs algorithms like linear regression, decision tree regression, and random forest regression, concluding that the random forest model is the most effective. The research aims to aid policymakers and healthcare providers in addressing cancer mortality and improving cancer care in India.

Uploaded by

SANDEEP KUMAR
Copyright
© All Rights Reserved

CANCER DEATH CASES FORECASTING USING SUPERVISED MACHINE LEARNING

A mini project submitted to the JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY in partial fulfillment of the
requirement for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING (AIML)
Submitted by

P.SATHWIKA 22D31A6640
CH.ROHITH REDDY 22D31A6611
J.AJITH REDDY 22D31A6621
K.VAISHNAVI 22D31A6627

Under the Guidance of


Mrs.P.SNEHA
Asst. Professor, Dept. of CSE

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


INDUR INSTITUTE OF ENGINEERING & TECHNOLOGY
(Affiliated to J.N.T.University, Hyderabad)
Ponnala (Vill), Siddipet (Dist), Telangana State – 502277

June, 2025
INDUR INSTITUTE OF ENGINEERING & TECHNOLOGY
Ponnala (V), Siddipet (Mdl&Dist), Telangana State, PIN: 502277
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Date: / /2025

CERTIFICATE

This is to certify that the thesis “CANCER DEATH CASES FORECASTING USING
SUPERVISED MACHINE LEARNING” being submitted by

P.SATHWIKA 22D31A6640
CH.ROHITH REDDY 22D31A6611
J.AJITH REDDY 22D31A6621
K.VAISHNAVI 22D31A6627

in partial fulfillment of the requirements for the award of “Bachelor of Technology” (B. Tech) in Computer Science & Engineering (AIML) to the “Jawaharlal Nehru Technological University” is a record of bonafide mini project work carried out by them under our guidance and supervision.
The results embodied in this thesis have not been submitted to any other University or Institute for the award of any degree or diploma.

Mrs. P. SNEHA
Project Guide
Asst. Professor, Dept. of CSE

DR. T. BENARJI
Head of the Department
Assoc. Professor, Dept. of CSE

EXTERNAL EXAMINER

ACKNOWLEDGEMENT

We are thankful to Mrs. P. SNEHA, Project Guide, Asst. Professor, Dept. of CSE, who guided us with her valuable suggestions to complete our project. She is a research-oriented person with extensive technical exposure.

We are thankful to DR. T. BENARJI, Head of the Department, Assoc. Professor, Dept. of CSE, Indur Institute of Engineering & Technology, for extending his help in the department's academic activities throughout the course. He is a dynamic person who is enthusiastic about academic activities.

We extend our thanks to DR. VP. RAJU, Principal, Indur Institute of Engineering and Technology, Siddipet, for extending his help throughout the duration of this project.

We sincerely thank all the lecturers of the Dept. of CSE for their motivation during our B. Tech course.

We would like to thank all of our friends for their timely help and encouragement.

P.SATHWIKA 22D31A6640
CH.ROHITH REDDY 22D31A6611
J.AJITH REDDY 22D31A6621
K.VAISHNAVI 22D31A6627

DECLARATION

We hereby declare that the project work entitled “CANCER DEATH CASES FORECASTING USING SUPERVISED MACHINE LEARNING”, submitted to the university, is a record of original work done by us under the guidance of Mrs. P. SNEHA, Asst. Prof., Department of Computer Science & Engineering, INDUR INSTITUTE OF ENGINEERING & TECHNOLOGY, that this project work has not formed the basis for the award of any degree/diploma/associateship/fellowship or any other similar title, and that no part of it has been published or sent for publication at the time of submission.

BY:
P.SATHWIKA 22D31A6640
CH.ROHITH REDDY 22D31A6611
J.AJITH REDDY 22D31A6621
K.VAISHNAVI 22D31A6627

ABSTRACT

In India, as in the rest of the world, cancer is a major cause of death. The objective of this research is to predict cancer mortality in India using supervised machine learning methods. Cancer mortality rates in India between 1990 and 2017, broken down by age group, gender, and region, are taken from the Global Burden of Disease Study. After data preprocessing, which includes missing value imputation and feature engineering, we employ three distinct supervised learning algorithms: linear regression, decision tree regression, and random forest regression.

Using a variety of criteria, we analyze the effectiveness of these models and conclude that the random forest regression model is superior to the other two. The scope of the research is to provide a long-term prediction of cancer mortality in India using the best model, so that the health department can act on it. Our research has implications for policymakers and healthcare providers in India, where it may inform efforts to reduce cancer rates and improve cancer care.

The model is trained on a dataset containing essential health indicators such as age,
BMI, blood pressure levels, cholesterol, smoking habits, diabetes status, family history, and
physical activity levels. The dataset undergoes preprocessing steps, including data cleaning,
feature selection, and class balancing techniques, to enhance model performance.

CONTENTS

1 INTRODUCTION
2 LITERATURE SURVEY
3 SYSTEM ANALYSIS
3.1 Existing System
3.1.1 Disadvantages of Existing System
3.2 Proposed System
3.2.1 Advantages of Proposed System
3.3 System Environment
3.3.1 Python
3.3.2 Django
3.4 System Study
3.4.1 Feasibility Study
3.5 Requirement Analysis
3.5.1 Requirement Specification
3.6 System Requirements
3.6.1 Software Requirements
3.6.2 Hardware Requirements
4 SYSTEM DESIGN
4.1 System Architecture
4.2 Data Flow Diagram
4.3 UML Diagrams
4.3.1 Use Case Diagram
4.3.2 Class Diagram
4.3.3 Sequence Diagram
4.3.4 Activity Diagram
5 IMPLEMENTATION
5.1 Modules
5.2 Modules Description
5.2.1 User
5.2.2 Admin
5.2.3 Machine Learning
5.3 Sample Code
5.4 Input and Output Design
5.4.1 Input Design
5.4.2 Output Design
6 RESULTS
7 SYSTEM TESTING
7.1 Unit Testing
7.2 Integration Testing
7.3 Functional Testing
7.4 System Testing
7.5 White box Testing
7.6 Black box Testing
7.7 Test strategy and approach
7.8 Sample Test Cases
8 CONCLUSION
8.1 Further Enhancement
9 BIBLIOGRAPHY

LIST OF FIGURES

FIGURE NO: FIGURE NAME:

1 Django Architecture
2 System Architecture
3 Data Flow Diagram
4 Use Case Diagram
5 Class Diagram
6 Sequence Diagram
7 Activity Diagram
8 Home Page
9 User Registration
10 Admin Login
11 Admin Home
12 User List
13 User Login
14 User Home
15 Dataset
16 Classification Results
17 Prediction

CHAPTER 1

INTRODUCTION

In India, as in the rest of the globe, cancer is a leading cause of death. Population growth, aging, changes in lifestyle, and environmental factors are all contributing to a rising tide of cancer in India. National Cancer Registry Program projections put the number of new cancer diagnoses in India at 1.39 million in 2025, up from an expected 1.16 million in 2018. Predicting cancer fatalities in India is critical for planning and allocating resources for cancer prevention, early detection, and treatment.

Cancer prediction and diagnosis are only two of the areas where machine learning has shown promising improvements in recent years. Supervised machine learning methods have been applied extensively to predictions of mortality, morbidity, and disease incidence. To predict cancer mortality in India, we present a supervised machine learning method. Cancer mortality rates in India between 1990 and 2017, broken down by age group, gender, and region, are taken from the Global Burden of Disease Study.

CHAPTER 2
LITERATURE SURVEY
CANCER DEATH CASES FORECASTING USING SUPERVISED
MACHINE LEARNING

The rates of both new cases and deaths from cancer are expected to rise in the coming years, making it a major public health concern. Numerous studies have tried to forecast cancer incidence and death rates using statistical models and machine learning algorithms in an effort to better understand and anticipate the burden of cancer. The purpose of this literature review is to synthesise the results of current research into the problem of estimating cancer incidence and death rates across geographic areas and scientific paradigms.

1) “A Novel Approach to Modeling and Forecasting Cancer Incidence and Mortality Rates through Web Queries and Automated Forecasting Algorithms”

Authors: Tudor, Cristiana

Simple Summary: Cancer remains a global burden, currently causing nearly one in six deaths
worldwide. Accurate projections of cancer incidence and mortality are needed for effective and
efficient policymaking, accurate resource allocation, and to assess the impact of newly introduced
policies and measures. However, the COVID-19 pandemic disrupted public health systems and
caused a significant number of cancers to remain undiagnosed, thus affecting the quality of official
statistics and their usefulness for health studies. This paper addresses this issue by proposing novel
cancer incidence/cancer mortality models based on population web-search habits and historical
links with official health variables.

The models are empirically estimated using data from one of the most vulnerable
European Union (EU) members, Romania, a country that consistently reports lower survival rates
than the EU average, and are further used to forecast cancer incidence and mortality rates in the
country. Research findings have important policy implications, and the novel framework, owing
to its generalizability, can be applied to the same task in other countries. Overall, the results
indicate a continuation of the increasing trends in cancer incidence and mortality in Romania and
thus underline the urgency to change the status quo in the Romanian public-health system. Cancer
remains a leading cause of worldwide mortality and is a growing, multifaceted global burden. As
a result, cancer prevention and cancer mortality reduction are counted among the most pressing
public health issues of the twenty-first century. In turn, accurate projections of cancer incidence
and mortality rates are paramount for robust policymaking, aimed at creating efficient and
inclusive public health systems and also for establishing a baseline to assess the impact of newly
introduced public health measures. Within the European Union (EU), Romania consistently
reports higher mortality from all types of cancer than the EU average, caused by an inefficient and
underfinanced public health system and lower economic development that in turn have created the
phenomenon of "oncotourism".

This paper aims to develop novel cancer incidence/cancer mortality models based on
historical links between incidence and mortality occurrence as reflected in official statistics and
population web-search habits. Subsequently, it employs estimates of the web query index to
produce forecasts of cancer incidence and mortality rates in Romania. Various statistical and
machine-learning models—the autoregressive integrated moving average model (ARIMA), the
Exponential Smoothing State Space Model with Box-Cox Transformation, ARMA Errors, Trend,
and Seasonal Components (TBATS), and a feed-forward neural network nonlinear autoregression
model, or NNAR—are estimated through automated algorithms to assess in-sample fit and out-of-
sample forecasting accuracy for web-query volume data.

Forecasts are produced with the overperforming model in the out-of-sample context
(i.e., NNAR) and fed into the novel incidence/mortality models. Results indicate a continuation of
the increasing trends in cancer incidence and mortality in Romania by 2026, with projected levels
for the age-standardized total cancer incidence of 313.8 and the age-standardized mortality rate of
233.8 representing an increase of 2%, and, respectively, 3% relative to the 2019 levels.

Research findings thus indicate that, under the no-change hypothesis, cancer will
remain a significant burden in Romania and highlight the need and urgency to improve the status
quo in the Romanian public health system.

2) Predicting the burden of cancer in Switzerland up to 2025

AUTHORS: B. Trächsel, E. Rapiti, A. Feller, V. Rousson, I. Locatelli, and J.-L. Bulliard

Predicting the short-term evolution of the number of cancers is essential for planning investments
and allocating health resources. The objective of this study was to predict the numbers of cancer
cases and of the 12 most frequent cancer sites, and their age-standardized incidence rates, for the
years 2019–2025 in Switzerland. Projections of the number of malignant cancer cases were
obtained by combining data from two sources: forecasts of national age-standardized cancer
incidence rates and population projections from the Swiss Federal Statistical Office. Age-
standardized cancer incidence rates, approximating the individual cancer risk, were predicted by a
low-order Autoregressive Integrated Moving Average (ARIMA) model. The contributions of
changes in cancer risk (epidemiological component) and population aging and growth
(demographic components) to the projected number of new cancer cases were each quantified.
Between 2018 and 2025, age-standardized cancer incidence rates are predicted to stabilize for men
and women at around 426 and 328/100,000, respectively (<1% change).

These projected trends are expected for most cancer sites. The annual number of cancers
is expected to increase from 45,676 to 52,552 (+15%), more so for men (+18%) than for women
(+11%). These increases are almost entirely due to projected changes in population age structure
(+12% for men and +6% for women) and population growth (+6% for both sexes). The rise in
numbers of expected cancers for each site is forecast to range from 4.15% (thyroid in men) to 26%
(bladder in men). While the ranking of the three most frequent cancers will remain unchanged for men
(1st prostate, 2nd lung, 3rd colon-rectum), colorectal cancer will overtake lung cancer by 2025 as
the second most common female cancer in Switzerland, behind breast cancer. Effective and
sustained prevention measures, as well as infrastructural interventions, are required to counter the
increase in cancer cases and prevent any potential shortage of professionals in cancer care.

3) Prediction of cancer incidence rates for the European continent using
machine learning models
AUTHORS : B. Sekeroglu and K. Tuncal,

Cancer is one of the most important and common public health problems on Earth and can occur in many different types. Treatments and precautions are aimed at minimizing the deaths caused by cancer; however, incidence rates continue to rise. Thus, it is important to analyze and estimate incidence rates to support the determination of more effective precautions. In this research, the 2018 Cancer Datasheet of the World Health Organization (WHO) is used, and all countries on the European continent are considered, to analyze and predict the incidence rates until 2020 for lung cancer, breast cancer, colorectal cancer, prostate cancer and all types of cancer, which have the highest incidence and mortality rates. Each cancer type is trained with six machine learning models, namely Linear Regression, Support Vector Regression, Decision Tree, Long Short-Term Memory neural network, Backpropagation neural network, and Radial Basis Function neural network, separately for each gender. Linear regression and support vector regression outperformed the other models, with R^2 scores of 0.99 and 0.98, respectively, in initial experiments, and were then used for prediction of incidence rates of the considered cancer types. The ML models estimated that the maximum rise in incidence rates would be in colorectal cancer for females, at 6%.

4) Analysis on Prediction of COVID-19 with Machine Learning Algorithms

AUTHORS: V. R. J and A. Jakka

During the pandemic, the most significant reason for the deep concern about COVID-19 is that it spreads from individual to individual through contact or by staying close to an infected individual. COVID-19 has been recognized as a global pandemic, and a number of assessments are being performed using various numerical models. Machine Learning (ML) is commonly used in every field. Forecasting systems based on ML have shown their importance in interpreting perioperative effects to accelerate decision-making on the potential course of action. ML models have long been used to define and prioritize adverse threat variables in several technology domains. To manage forecasting challenges, many prediction approaches have been used extensively. The paper shows the ability of ML models to estimate the number of forthcoming COVID-19 victims, which is now considered a serious threat to civilization. It describes a comparative study of ML algorithms for predicting COVID-19, depicts the data to be predicted, and analyses the attributes of COVID-19 cases in different places. It gives an underlying benchmark to exhibit the capability of ML models for future examination.

CHAPTER 3
SYSTEM ANALYSIS

3.1 EXISTING SYSTEM

The existing system for forecasting cancer death cases using supervised machine learning involves
collecting historical cancer-related data, preprocessing it, splitting it into training and testing sets,
selecting or engineering relevant features, and training a machine learning model. The model's
performance is evaluated using metrics like MAE, MSE, RMSE, and R². Interpretability is crucial,
especially in healthcare, and the model is deployed for predictions in healthcare systems or
applications. Continuous monitoring and ethical considerations are paramount, and collaboration
with domain experts is essential for validation and insights. This system aims to predict cancer
deaths for early intervention and resource allocation in healthcare.
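
As a minimal sketch of how the evaluation metrics named above (MAE, MSE, RMSE, and R²) can be computed, the following Python 3 snippet uses only the standard library. The function name regression_metrics and the sample values are illustrative assumptions; they are not taken from the project's code or dataset.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R^2 for two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n      # mean absolute error
    mse = sum(e * e for e in errors) / n       # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up illustrative values, NOT taken from the project's dataset.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Lower MAE, MSE, and RMSE indicate smaller errors, while an R² closer to 1 indicates that the model explains more of the variation in the observed values.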

3.1.1 Disadvantages of Existing System:

• Data dependency: Requires large, high-quality datasets for optimal performance.

• Computational complexity: Advanced models demand significant processing power.

• Interpretability challenges: Some deep learning models lack transparency.

• Ethical concerns: Privacy risks associated with handling sensitive patient data.

3.2 Proposed System:

The proposed system for forecasting cancer death cases using supervised machine learning introduces several enhancements to the existing system. These enhancements include more comprehensive data collection, advanced data preprocessing, feature engineering, a wider range of model selection, and advanced explainability techniques. The system also emphasizes real-time data integration, ethical and privacy measures, continuous learning, deployment in clinical decision support systems, scalability, and performance optimization. Collaboration with healthcare professionals and validation through clinical trials are key components. Overall, the proposed system aims to provide more accurate and timely predictions, benefiting patient care and healthcare resource allocation while maintaining ethical and transparent practices.

3.2.1 Advantages of Proposed System:

• Higher accuracy: Advanced models improve prediction reliability.

• Better generalization: Works across diverse patient demographics.

• Real-time adaptability: Updates predictions as new data arrives.

• Improved decision-making: Helps policymakers allocate healthcare resources effectively.

3.3 System Environment:

3.3.1 Python:

Python is a general-purpose, interpreted, interactive, object-oriented, high-level programming language. Its design philosophy emphasizes code readability (notably using whitespace indentation to delimit code blocks rather than curly brackets or keywords), and its syntax allows programmers to express concepts in fewer lines of code than might be needed in languages such as C++ or Java. It provides constructs that enable clear programming on both small and large scales. Python interpreters are available for many operating systems. CPython, the reference implementation of Python, is open source software and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation. Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library.

Interactive Mode Programming

Invoking the interpreter without passing a script file as a parameter brings up the following prompt

$ python
Python 2.4.3 (#1, Nov 11 2010, 13:34:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Type the following text at the Python prompt and press Enter −

>>> print "Hello, Python!"

If you are running a newer version of Python (Python 3), print is a function and must be called with
parentheses, as in print("Hello, Python!"). In Python version 2.4.3, however, the statement above produces the
following result −

Hello, Python!

Script Mode Programming

Invoking the interpreter with a script parameter begins execution of the script and continues until
the script is finished. When the script is finished, the interpreter is no longer active.

Let us write a simple Python program in a script. Python files have extension .py. Type the
following source code in a test.py file −

print "Hello, Python!"
We assume that you have Python interpreter set in PATH variable. Now, try to run this program
as follows −

$ python test.py
This produces the following result −

Hello, Python!
Let us try another way to execute a Python script. Here is the modified test.py file −

#!/usr/bin/python

print "Hello, Python!"


We assume that you have the Python interpreter available in the /usr/bin directory. Now, try to run this
program as follows −

$ chmod +x test.py # This is to make file executable
$ ./test.py
This produces the following result −

Hello, Python!

Python Identifiers

A Python identifier is a name used to identify a variable, function, class, module or other object.
An identifier starts with a letter A to Z or a to z or an underscore (_) followed by zero or more
letters, underscores and digits (0 to 9).

Python does not allow punctuation characters such as @, $, and % within identifiers. Python is a
case sensitive programming language. Thus, Manpower and manpower are two different
identifiers in Python.
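
The identifier rules above can be checked programmatically. The following is a minimal sketch in Python 3 syntax (print as a function, unlike the Python 2 listings in this section); the helper name is_valid_name is an illustrative assumption. Note that Python 3's str.isidentifier() also accepts non-ASCII letters, which is broader than the A to Z rule stated above.

```python
import keyword


def is_valid_name(name):
    """True if name is a legal identifier and not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


# A few candidate names: valid, case variants, and invalid ones.
for candidate in ["Manpower", "manpower", "_total", "2nd_value", "cost$", "class"]:
    print(candidate, is_valid_name(candidate))
```

Names that start with a digit, contain punctuation such as $, or collide with a reserved word are rejected, while Manpower and manpower are both accepted as two distinct identifiers.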

Here are naming conventions for Python identifiers −

Class names start with an uppercase letter. All other identifiers start with a lowercase letter.

Starting an identifier with a single leading underscore indicates that the identifier is private.

Starting an identifier with two leading underscores indicates a strongly private identifier.

If the identifier also ends with two trailing underscores, the identifier is a language-defined special
name.

Reserved Words
The following list shows the Python keywords. These are reserved words and you cannot use them as
constant or variable or any other identifier names. All the Python keywords contain lowercase letters only.

and       exec      not
assert    finally   or
break     for       pass
class     from      print
continue  global    raise
def       if        return
del       import    try
elif      in        while
else      is        with
except    lambda    yield
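
The table above reflects Python 2. As a hedged illustration, the standard keyword module can list the reserved words of whichever interpreter is running; the sketch below assumes Python 3, where print and exec are no longer keywords and True, False, and None are. The variable names are illustrative.

```python
import keyword

# Keywords from the table above that Python 3 turned into ordinary functions.
python2_only = {"exec", "print"}

# Reserved words of the interpreter that runs this snippet.
current = set(keyword.kwlist)

print(sorted(current))
print(python2_only & current)                  # overlap with the running interpreter
print({"True", "False", "None"} <= current)    # capitalised keywords in Python 3
```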

Lines and Indentation
Python provides no braces to indicate blocks of code for class and function definitions or flow
control. Blocks of code are denoted by line indentation, which is rigidly enforced.

The number of spaces in the indentation is variable, but all statements within the block must be
indented the same amount. For example −

if True:
    print "True"
else:
    print "False"
However, the following block generates an error −

if True:
    print "Answer"
    print "True"
else:
    print "Answer"
  print "False"
Thus, in Python all the continuous lines indented with same number of spaces would form a block.
The following example has various statement blocks −

Note − Do not try to understand the logic at this point of time. Just make sure you understood
various blocks even if they are without braces.

#!/usr/bin/python

import sys

# file_name, file_finish and file_text are assumed to be defined earlier
try:
    # open file stream
    file = open(file_name, "w")
except IOError:
    print "There was an error writing to", file_name
    sys.exit()
print "Enter '", file_finish,
print "' When finished"
while file_text != file_finish:
    file_text = raw_input("Enter text: ")
    if file_text == file_finish:
        # close the file
        file.close()
        break
    file.write(file_text)
    file.write("\n")
file.close()
file_name = raw_input("Enter filename: ")
if len(file_name) == 0:
    print "Next time please enter something"
    sys.exit()
try:
    file = open(file_name, "r")
except IOError:
    print "There was an error reading file"
    sys.exit()
file_text = file.read()
file.close()
print file_text
Multi-Line Statements
Statements in Python typically end with a new line. Python does, however, allow the use of the
line continuation character (\) to denote that the line should continue. For example −

total = item_one + \
        item_two + \
        item_three
Statements contained within the [], {}, or () brackets do not need to use the line continuation
character. For example −

days = ['Monday', 'Tuesday', 'Wednesday',
        'Thursday', 'Friday']
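
The two continuation styles described above can be compared side by side. This is a small sketch in Python 3 syntax; the values assigned to item_one, item_two, and item_three are hypothetical and chosen only so that the snippet runs.

```python
item_one, item_two, item_three = 1, 2, 3   # hypothetical values

# Explicit continuation with a trailing backslash
total_backslash = item_one + \
                  item_two + \
                  item_three

# Implicit continuation inside parentheses (no backslash needed)
total_parens = (item_one
                + item_two
                + item_three)

print(total_backslash, total_parens)
```

Both forms compute the same sum; the bracketed form is often preferred because a stray space after a backslash is a syntax error.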
Quotation in Python
Python accepts single ('), double (") and triple (''' or """) quotes to denote string literals, as long as
the same type of quote starts and ends the string.

The triple quotes are used to span the string across multiple lines. For example, all the following
are legal −

word = 'word'
sentence = "This is a sentence."
paragraph = """This is a paragraph. It is
made up of multiple lines and sentences."""
Comments in Python
A hash sign (#) that is not inside a string literal begins a comment. All characters after the # and
up to the end of the physical line are part of the comment and the Python interpreter ignores them.

#!/usr/bin/python

# First comment
print "Hello, Python!" # second comment
This produces the following result −

Hello, Python!
You can type a comment on the same line after a statement or expression −

name = "Madisetti" # This is again comment
You can comment multiple lines as follows −

# This is a comment.
# This is a comment, too.
# This is a comment, too.
# I said that already.
The following triple-quoted string is also ignored by the Python interpreter and can be used as a multiline
comment:
'''
This is a multiline
comment.
'''
Using Blank Lines
A line containing only whitespace, possibly with a comment, is known as a blank line and Python
totally ignores it.

In an interactive interpreter session, you must enter an empty physical line to terminate a multiline
statement.

Waiting for the User


The following line of the program displays the prompt, the statement saying “Press the enter key
to exit”, and waits for the user to take action −

#!/usr/bin/python

raw_input("\n\nPress the enter key to exit.")


Here, "\n\n" is used to create two new lines before displaying the actual line. Once the user
presses the key, the program ends. This is a nice trick to keep a console window open until the
user is done with an application.
Multiple Statements on a Single Line
The semicolon ( ; ) allows multiple statements on a single line, provided that no statement
starts a new code block. Here is a sample snippet using the semicolon.
import sys; x = 'foo'; sys.stdout.write(x + '\n')
Multiple Statement Groups as Suites
A group of individual statements that make up a single code block is called a suite in Python.
Compound or complex statements, such as if, while, def, and class, require a header line and a
suite.
Header lines begin the statement (with the keyword), terminate with a colon ( : ), and are
followed by one or more lines which make up the suite. For example −

if expression :
    suite
elif expression :
    suite
else :
    suite

Command Line Arguments
Many programs can be run to provide you with some basic information about how they should be
run. Python enables you to do this with -h −

$ python -h
usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
Options and arguments (and corresponding environment variables):
-c cmd : program passed in as string (terminates option list)
-d : debug output from parser (also PYTHONDEBUG=x)
-E : ignore environment variables (such as PYTHONPATH)
-h : print this help message and exit
You can also program your script in such a way that it should accept various options. Command
Line Arguments is an advanced topic and should be studied a bit later once you have gone through
rest of the Python concepts.
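
As a hedged illustration of a script that accepts its own options, the sketch below uses the standard argparse module in Python 3 syntax. The option names (--years, --verbose) and the file name deaths.csv are hypothetical and are not part of the project's code. argparse adds a -h/--help option automatically, mirroring the interpreter's own -h shown above.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical forecasting script."""
    parser = argparse.ArgumentParser(description="Hypothetical forecasting script")
    parser.add_argument("input_file", help="path to a CSV file (illustrative)")
    parser.add_argument("-y", "--years", type=int, default=5,
                        help="number of years to forecast")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra detail")
    return parser


# Parsing an explicit list keeps the example runnable without a real command line.
args = build_parser().parse_args(["deaths.csv", "--years", "10"])
print(args.input_file, args.years, args.verbose)
```

In a real script, calling parse_args() with no argument would read the options from the actual command line instead of the explicit list used here.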

Python Lists

The list is the most versatile datatype available in Python; it can be written as a list of comma-separated values (items) between square brackets. An important thing about a list is that its items need not be of the same type.

Creating a list is as simple as putting different comma-separated values between square brackets.
For example −

list1 = ['physics', 'chemistry', 1997, 2000]
list2 = [1, 2, 3, 4, 5]
list3 = ["a", "b", "c", "d"]
Similar to string indices, list indices start at 0, and lists can be sliced, concatenated and so on.
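
The indexing, slicing, and concatenation mentioned above can be sketched as follows, in Python 3 syntax and reusing the list values from the example; the variable names first, middle, and combined are illustrative.

```python
list1 = ['physics', 'chemistry', 1997, 2000]
list2 = [1, 2, 3, 4, 5]

first = list1[0]            # indexing starts at 0
middle = list2[1:4]         # slice: elements at indices 1, 2 and 3
combined = list1 + list2    # concatenation builds a new list
list1[2] = 2001             # lists are mutable, so items can be reassigned

print(first, middle, len(combined), list1)
```

Because concatenation creates a new list, the later assignment to list1[2] does not change combined.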
Python Tuples

A tuple is a sequence of immutable Python objects. Tuples are sequences, just like lists. The
differences between tuples and lists are that tuples cannot be changed, unlike lists, and that tuples use
parentheses, whereas lists use square brackets.

Creating a tuple is as simple as putting different comma-separated values. Optionally you can put
these comma-separated values between parentheses also. For example −

tup1 = ('physics', 'chemistry', 1997, 2000)
tup2 = (1, 2, 3, 4, 5)
tup3 = "a", "b", "c", "d"
The empty tuple is written as two parentheses containing nothing −

tup1 = ();
To write a tuple containing a single value you have to include a comma, even though there is
only one value −

tup1 = (50,);
Like string indices, tuple indices start at 0, and they can be sliced, concatenated, and so on.

Accessing Values in Tuples
To access values in tuple, use the square brackets for slicing along with the index or indices to
obtain value available at that index. For example −

#!/usr/bin/python

tup1 = ('physics', 'chemistry', 1997, 2000)
tup2 = (1, 2, 3, 4, 5, 6, 7)
print "tup1[0]: ", tup1[0]
print "tup2[1:5]: ", tup2[1:5]
When the above code is executed, it produces the following result −

tup1[0]:  physics
tup2[1:5]:  (2, 3, 4, 5)
Updating Tuples
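
Because tuples are immutable, as noted earlier, an element cannot be reassigned in place; in practice, "updating" a tuple means building a new one. The following is a minimal sketch in Python 3 syntax, with illustrative values and variable names.

```python
tup1 = (12, 34.56)
tup2 = ('abc', 'xyz')

try:
    tup1[0] = 100           # not allowed: tuples are immutable
except TypeError as exc:
    error_message = str(exc)

tup3 = tup1 + tup2          # build a new tuple from existing ones instead

print(error_message)
print(tup3)
```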

Accessing Values in Dictionary


To access dictionary elements, you can use the familiar square brackets along with the key to
obtain its value. Following is a simple example −

#!/usr/bin/python

dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}


print "dict['Name']: ", dict['Name']
print "dict['Age']: ", dict['Age']
When the above code is executed, it produces the following result −

dict['Name']: Zara
dict['Age']: 7
If we attempt to access a data item with a key, which is not part of the dictionary, we get an error
as follows −

#!/usr/bin/python3

dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
print("dict['Alice']:", dict['Alice'])
When the above code is executed, it produces the following result −

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    print("dict['Alice']:", dict['Alice'])
KeyError: 'Alice'
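When a key may be missing, the KeyError can be avoided or handled explicitly. The following minimal Python 3 sketch shows three common options; the variable name record is illustrative.

```python
record = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}

# Option 1: dict.get returns a default value instead of raising KeyError
alice = record.get('Alice', 'unknown')

# Option 2: test for membership before indexing
age = record['Age'] if 'Age' in record else None

# Option 3: catch the exception explicitly
try:
    value = record['Alice']
except KeyError as exc:
    value = None
    missing_key = exc.args[0]   # the key that was not found

print(alice, age, value, missing_key)
```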
Updating Dictionary
You can update a dictionary by adding a new entry (a key-value pair), modifying an existing
entry, or deleting an existing entry, as shown below in the simple example −

#!/usr/bin/python3

dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
dict['Age'] = 8                 # update existing entry
dict['School'] = "DPS School"   # add new entry

print("dict['Age']:", dict['Age'])
print("dict['School']:", dict['School'])
When the above code is executed, it produces the following result −

dict['Age']: 8
dict['School']: DPS School
Delete Dictionary Elements
You can either remove individual dictionary elements or clear the entire contents of a dictionary.
You can also delete entire dictionary in a single operation.

To explicitly remove an entire dictionary, just use the del statement. Following is a simple
example −

#!/usr/bin/python3

info = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
del info['Name']   # remove entry with key 'Name'
info.clear()       # remove all entries in info
del info           # delete entire dictionary

print("info['Age']:", info['Age'])
print("info['School']:", info['School'])
This produces the following result. Note that an exception is raised because, after del info, the
dictionary does not exist any more. (The variable is named info rather than dict here so that
deleting it does not fall back to Python's built-in dict type.)

Traceback (most recent call last):
  File "test.py", line 8, in <module>
    print("info['Age']:", info['Age'])
NameError: name 'info' is not defined
Note − del() method is discussed in subsequent section.

Properties of Dictionary Keys


Dictionary values have no restrictions. They can be any arbitrary Python object, either standard
objects or user-defined objects. However, same is not true for the keys.

There are two important points to remember about dictionary keys −

(a) More than one entry per key is not allowed, which means duplicate keys are not permitted.
When duplicate keys are encountered during assignment, the last assignment wins. For example −

#!/usr/bin/python3

dict = {'Name': 'Zara', 'Age': 7, 'Name': 'Manni'}
print("dict['Name']:", dict['Name'])
When the above code is executed, it produces the following result −

dict['Name']: Manni
(b) Keys must be immutable, which means you can use strings, numbers or tuples as dictionary
keys, but a mutable object such as the list ['key'] is not allowed. Following is a simple example −

#!/usr/bin/python3

dict = {['Name']: 'Zara', 'Age': 7}
print("dict['Name']:", dict['Name'])
When the above code is executed, it produces the following result −

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    dict = {['Name']: 'Zara', 'Age': 7}
TypeError: unhashable type: 'list'
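The two key properties above can be checked directly. The sketch below, in Python 3, uses a hypothetical lookup table keyed by (country code, year) tuples; the numbers are placeholders and not real data, and the helper is_hashable is introduced here only for illustration.

```python
# Hypothetical lookup keyed by (country code, year); the values are placeholders.
rates = {('IND', 1990): 1.0, ('IND', 1991): 2.0}
rates[('IND', 1991)] = 2.5    # same key again: the last assignment wins

def is_hashable(obj):
    """Return True if obj can be used as a dictionary key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(len(rates), rates[('IND', 1991)])
print(is_hashable(('IND', 1990)), is_hashable(['Name']), is_hashable((1, [2])))
```

A tuple is only usable as a key when all of its elements are themselves hashable, which is why the tuple containing a list is rejected.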
Tuples are immutable which means you cannot update or change the values of tuple elements.
You are able to take portions of existing tuples to create new tuples as the following example
demonstrates −

#!/usr/bin/python3

tup1 = (12, 34.56)
tup2 = ('abc', 'xyz')

# Following action is not valid for tuples
# tup1[0] = 100

# So let's create a new tuple as follows
tup3 = tup1 + tup2
print(tup3)
When the above code is executed, it produces the following result −

(12, 34.56, 'abc', 'xyz')


Delete Tuple Elements

Removing individual tuple elements is not possible. There is, of course, nothing wrong with
putting together another tuple with the undesired elements discarded.

To explicitly remove an entire tuple, just use the del statement. For example −

#!/usr/bin/python3

tup = ('physics', 'chemistry', 1997, 2000)
print(tup)
del tup
print("After deleting tup :")
print(tup)
This produces the following result. Note that an exception is raised because, after del tup, the
tuple does not exist any more −

('physics', 'chemistry', 1997, 2000)
After deleting tup :
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    print(tup)
NameError: name 'tup' is not defined
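Although individual elements cannot be removed from a tuple, a new tuple without the undesired elements can be put together, as noted above. A minimal Python 3 sketch with illustrative names:

```python
tup = ('physics', 'chemistry', 1997, 2000)

# Drop the element at index 1 by joining the slices on either side of it
without_second = tup[:1] + tup[2:]

# Keep only the elements that satisfy a condition (here: the strings)
only_strings = tuple(item for item in tup if isinstance(item, str))

print(without_second)
print(only_strings)
print(tup)   # the original tuple is unchanged
```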

3.3.2 Django:
Django is a high-level Python Web framework that encourages rapid development and
clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of
Web development, so you can focus on writing your app without needing to reinvent the wheel.
It’s free and open source.

Django's primary goal is to ease the creation of complex, database-driven websites. Django
emphasizes reusability and "pluggability" of components, rapid development, and the principle
of don't repeat yourself. Python is used throughout, even for settings files and data models.

Django also provides an optional administrative create, read, update and delete interface that is
generated dynamically through introspection and configured via admin models.

Create a Project :
Whether you are on Windows or Linux, just get a terminal or a cmd prompt and navigate to the
place you want your project to be created, then use this code −

$ django-admin startproject myproject


This will create a "myproject" folder with the following structure −

myproject/
   manage.py
   myproject/
      __init__.py
      settings.py
      urls.py
      wsgi.py
The Project Structure
The “myproject” folder is just your project container, it actually contains two elements −

manage.py − This file acts as a project-local django-admin for interacting with your
project via the command line (start the development server, sync db...). To get a full list of
commands accessible via manage.py you can use the code −

$ python manage.py help


The “myproject” subfolder − This folder is the actual python package of your project. It contains
four files −

__init__.py − Just for python, treat this folder as package.

settings.py − As the name indicates, your project settings.

urls.py − All links of your project and the function to call. A kind of ToC of your project.

wsgi.py − If you need to deploy your project over WSGI.

Setting Up Your Project


Your project is set up in the subfolder myproject/settings.py. Following are some important
options you might need to set −

DEBUG = True
This option lets you set if your project is in debug mode or not. Debug mode lets you get more
information about your project's error. Never set it to ‘True’ for a live project. However, this has
to be set to ‘True’ if you want the Django light server to serve static files. Do it only in the
development mode.
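One common way to avoid leaving debug mode on in a live project is to read the flag from the environment instead of hard-coding it. The following is only a sketch under stated assumptions: the helper env_flag and the variable name DJANGO_DEBUG are hypothetical names chosen for this example and are not part of Django.

```python
import os

def env_flag(name, default=False):
    """Interpret an environment variable as a boolean flag."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ('1', 'true', 'yes', 'on')

# In settings.py this line would take the place of the hard-coded DEBUG = True.
# DJANGO_DEBUG is an assumed variable name; debug mode stays off unless it is set.
DEBUG = env_flag('DJANGO_DEBUG', default=False)
```

With this arrangement a developer sets DJANGO_DEBUG=1 on the development machine, while a deployment that sets nothing runs with DEBUG disabled.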

DATABASES = {
   'default': {
      'ENGINE': 'django.db.backends.sqlite3',
      'NAME': 'database.sql',
      'USER': '',
      'PASSWORD': '',
      'HOST': '',
      'PORT': '',
   }
}
The database is configured in the ‘DATABASES’ dictionary. The example above is for the SQLite
engine. As stated earlier, Django also supports −

MySQL (django.db.backends.mysql)
PostGreSQL (django.db.backends.postgresql_psycopg2)
Oracle (django.db.backends.oracle) and NoSQL DB
MongoDB (django_mongodb_engine)
Before setting any new engine, make sure you have the correct db driver installed.

You can also set other options like: TIME_ZONE, LANGUAGE_CODE, TEMPLATE…

Now that your project is created and configured make sure it's working −

$ python manage.py runserver


You will get something like the following on running the above code −

Validating models...

0 errors found
September 03, 2015 - 11:41:50
Django version 1.6.11, using settings 'myproject.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.

A project is a sum of many applications. Every application has an objective and can be reused
in another project; for example, the contact form on a website can be an application that is
reused elsewhere. See it as a module of your project.

Create an Application
We assume you are in your project folder. In our main “myproject” folder, the same folder as
manage.py −

$ python manage.py startapp myapp


You just created the myapp application and, as with the project, Django creates a “myapp” folder
with the application structure −

myapp/
   __init__.py
   admin.py
   models.py
   tests.py
   views.py
__init__.py − Just to make sure python handles this folder as a package.

admin.py − This file helps you make the app modifiable in the admin interface.

models.py − This is where all the application models are stored.

tests.py − This is where your unit tests are.

views.py − This is where your application views are.

Get the Project to Know About Your Application


At this stage we have our "myapp" application, now we need to register it with our Django
project "myproject". To do so, update INSTALLED_APPS tuple in the settings.py file of your
project (add your app name) −

INSTALLED_APPS = (
   'django.contrib.admin',
   'django.contrib.auth',
   'django.contrib.contenttypes',
   'django.contrib.sessions',
   'django.contrib.messages',
   'django.contrib.staticfiles',
   'myapp',
)
Creating forms in Django is very similar to creating a model. Here again, we just need to
inherit from a Django class, and the class attributes will be the form fields. Let's add a forms.py
file in the myapp folder to contain our app forms. We will create a login form.

myapp/forms.py

#-*- coding: utf-8 -*-
from django import forms

class LoginForm(forms.Form):
   # The field is named "username" to match the login template and the login view
   username = forms.CharField(max_length = 100)
   password = forms.CharField(widget = forms.PasswordInput())
As seen above, a field type can take a "widget" argument for HTML rendering; in our case, we
want the password to be hidden, not displayed. Many other widgets are available in Django:
DateInput for dates, CheckboxInput for checkboxes, etc.

Using Form in a View


There are two kinds of HTTP requests, GET and POST. In Django, the request object passed as
parameter to your view has an attribute called "method" where the type of the request is set, and
all data passed via POST can be accessed via the request.POST dictionary.

Let's create a login view in our myapp/views.py −

#-*- coding: utf-8 -*-
from django.shortcuts import render
from myapp.forms import LoginForm

def login(request):
   username = "not logged in"

   if request.method == "POST":
      # Get the posted form
      MyLoginForm = LoginForm(request.POST)

      if MyLoginForm.is_valid():
         username = MyLoginForm.cleaned_data['username']
   else:
      MyLoginForm = LoginForm()

   return render(request, 'loggedin.html', {"username" : username})


The view will display the result of the login form posted through the loggedin.html. To test it, we
will first need the login form template. Let's call it login.html.

<html>
<body>

<form name = "form" action = "{% url "myapp.views.login" %}"


method = "POST" >{% csrf_token %}

<div style = "max-width:470px;">


<center>
<input type = "text" style = "margin-left:20%;"
placeholder = "Identifiant" name = "username" />
</center>
</div>

<br>

<div style = "max-width:470px;">


<center>
<input type = "password" style = "margin-left:20%;"
placeholder = "password" name = "password" />
</center>
</div>

<br>

<div style = "max-width:470px;">


<center>

<button style = "border:0px; background-color:#4285F4; margin-top:8%;


height:35px; width:80%;margin-left:19%;" type = "submit"
value = "Login" >
<strong>Login</strong>
</button>

</center>
</div>

</form>

</body>
</html>
The template will display a login form and post the result to our login view above. You have
probably noticed the tag in the template, which is just to prevent Cross-site Request Forgery
(CSRF) attack on your site.

{% csrf_token %}
Once we have the login template, we need the loggedin.html template that will be rendered after
form treatment.

<html>

<body>
You are : <strong>{{username}}</strong>
</body>

</html>
Now, we just need our pair of URLs to get started: myapp/urls.py

from django.conf.urls import patterns, url
from django.views.generic import TemplateView

urlpatterns = patterns('myapp.views',
   url(r'^connection/', TemplateView.as_view(template_name = 'login.html')),
   url(r'^login/', 'login', name = 'login'))
When accessing "/myapp/connection", we will get the following login.html template rendered −
Setting Up Sessions
In Django, enabling session is done in your project settings.py, by adding some lines to the
MIDDLEWARE_CLASSES and the INSTALLED_APPS options. This should be done while
creating the project, but it's always good to know, so MIDDLEWARE_CLASSES should have −

'django.contrib.sessions.middleware.SessionMiddleware'
And INSTALLED_APPS should have −

'django.contrib.sessions'
By default, Django saves session information in database (django_session table or collection),
but you can configure the engine to store information using other ways like: in file or in cache.

When session is enabled, every request (first argument of any view in Django) has a session
(dict) attribute.
Let's create a simple example to see how to create and save sessions. We have built a simple
login system above. Let us save the username in the session so that, if the user has not signed
out, the login form is not shown again when the login page is accessed. In effect, this makes the
login system more secure by keeping the data server side instead of in a client-side cookie.

For this, first let's change our login view to save the username server side −

def login(request):
   username = 'not logged in'

   if request.method == 'POST':
      MyLoginForm = LoginForm(request.POST)

      if MyLoginForm.is_valid():
         username = MyLoginForm.cleaned_data['username']
         # Store the username in the server-side session
         request.session['username'] = username
   else:
      MyLoginForm = LoginForm()

   return render(request, 'loggedin.html', {"username" : username})


Then let us create formView view for the login form, where we won’t display the form if cookie
is set −

def formView(request):
   # Skip the login form when a username is already stored in the session
   if 'username' in request.session:
      username = request.session['username']
      return render(request, 'loggedin.html', {"username" : username})
   else:
      return render(request, 'login.html', {})
Now let us change the urls.py file so the URL pairs with our new view −

from django.conf.urls import patterns, url

urlpatterns = patterns('myapp.views',
   url(r'^connection/', 'formView', name = 'loginform'),
   url(r'^login/', 'login', name = 'login'))
When accessing /myapp/connection, you will get to see the following page

3.4 System Study:


3.4.1 Feasibility Study:

The feasibility of the project is analyzed in this phase, and a business proposal is put forth
with a very general plan for the project and some cost estimates. During system analysis, the
feasibility study of the proposed system is carried out. This is to ensure that the
proposed system is not a burden to the company. For feasibility analysis, some understanding of
the major requirements for the system is essential.

Three key considerations involved in the feasibility analysis are,

• Economical Feasibility
• Technical Feasibility
• Social Feasibility

Economical Feasibility:

This study is carried out to check the economic impact that the system will have on
the organization. The amount of funds that the company can pour into the research and
development of the system is limited, so the expenditures must be justified. The developed
system is well within the budget, and this was achieved because most of the technologies used
are freely available. Only the customized products had to be purchased.

Technical Feasibility:
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on the
available technical resources, as this would lead to high demands being placed on the client.
The developed system must have modest requirements, as only minimal or no changes are
required for implementing this system.

Social Feasibility:

This aspect of the study checks the level of acceptance of the system by the user. This
includes the process of training the user to use the system efficiently. The user must not feel
threatened by the system, but must instead accept it as a necessity. The level of acceptance by
the users depends solely on the methods employed to educate users about the system
and to make them familiar with it. Their level of confidence must be raised so that they are also
able to offer constructive criticism, which is welcomed, as they are the final users of the system.

3.5 Requirement Analysis:

The project involved analyzing the design of a few applications so as to make the
application more user friendly. To do so, it was important to keep the navigation from
one screen to the other well ordered while reducing the amount of typing the
user needs to do. In order to make the application more accessible, the browser version had to
be chosen so that it is compatible with most browsers.

3.5.1 Requirement Specification:

Functional Requirements:

• Graphical user interface for the user.

Software Requirements:

For developing the application the following are the Software Requirements:

1. Python

2. Django

Operating Systems supported:

1. Windows 10 64 bit OS

Technologies and Languages used to Develop:

1. Python

Debugger and Emulator:


• Any Browser (Particularly Chrome)

Hardware Requirements:

For developing the application the following are the Hardware Requirements:

• Processor: Minimum Intel i5
• RAM: 8 GB
• Space on Hard Disk: Minimum 1 TB

3.6 System Requirements:


3.6.1 Software Requirements:

• Operating system : Windows 10
• Technology : Python
• Front-End : HTML, CSS
• Designing : HTML, CSS, JavaScript
• Data Base : SQLite

3.6.2 Hardware Requirements:

• System : Intel i3
• Processor Speed : 1.2 GHz-4.20 GHz
• Hard Disk : 1 TB
• Monitor : 14" Colour Monitor
• Mouse : Optical Mouse
• RAM : 8 GB

CHAPTER 4
SYSTEM DESIGN

4.1 SYSTEM ARCHITECTURE:

4.2 DATA FLOW DIAGRAM:

1. The DFD is also called a bubble chart. It is a simple graphical formalism that can be used
to represent a system in terms of the input data to the system, the various processing carried
out on this data, and the output data generated by the system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It is used to
model the system components. These components are the system process, the data used by
the process, an external entity that interacts with the system and the information flows in
the system.
3. A DFD shows how the information moves through the system and how it is modified by a
series of transformations. It is a graphical technique that depicts information flow and the
transformations that are applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction. A DFD may be
partitioned into levels that represent increasing information flow and functional detail.

4.3 UML DIAGRAMS

UML stands for Unified Modeling Language. UML is a standardized general-purpose
modeling language in the field of object-oriented software engineering. The standard is managed,
and was created, by the Object Management Group.
The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form UML comprises two major components: a meta-model
and a notation. In the future, some form of method or process may also be added to, or associated
with, UML.
The Unified Modeling Language is a standard language for specifying, visualizing,
constructing and documenting the artifacts of software systems, as well as for business modeling
and other non-software systems.
The UML represents a collection of best engineering practices that have proven successful
in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented software and the software
development process. The UML uses mostly graphical notations to express the design of software
projects.

GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so that they can
develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core concepts.
3. Be independent of particular programming languages and development process.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of OO tools market.
6. Support higher level development concepts such as collaborations, frameworks, patterns
and components.
7. Integrate best practices.

4.3.1 USE CASE DIAGRAM:

A use case diagram in the Unified Modeling Language (UML) is a type of behavioral
diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented as
use cases), and any dependencies between those use cases. The main purpose of a use case diagram
is to show what system functions are performed for which actor. Roles of the actors in the system
can be depicted.

4.3.2 CLASS DIAGRAM:

In software engineering, a class diagram in the Unified Modeling Language (UML) is a
type of static structure diagram that describes the structure of a system by showing the system's
classes, their attributes, operations (or methods), and the relationships among the classes. It
explains which class contains information.

4.3.3 SEQUENCE DIAGRAM:

A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram that
shows how processes operate with one another and in what order. It is a construct of a Message
Sequence Chart. Sequence diagrams are sometimes called event diagrams, event scenarios, and
timing diagrams.

4.3.4 ACTIVITY DIAGRAM:

Activity diagrams are graphical representations of workflows of stepwise activities and actions
with support for choice, iteration and concurrency. In the Unified Modeling Language, activity
diagrams can be used to describe the business and operational step-by-step workflows of
components in a system. An activity diagram shows the overall flow of control.

CHAPTER 5
IMPLEMENTATION

5.1 MODULES:
• User
• Admin
• Machine Learning Results

5.2 MODULES DESCRIPTION:

5.2.1 User:
The user registers first. Registration requires a valid email address and mobile number for
further communication. Once the user has registered, the admin can activate the account, and
only after activation can the user log in to the system. The user can upload a dataset whose
columns match those of our dataset. For algorithm execution the data must be in float format.
Here we use the disease burden rates by cancer types dataset
(08 disease-burden-rates-by-cancer-types.csv). The user can also add new data to the existing
dataset through our Django application. The user can select a country and a cancer type on the
web page so that the regression algorithms are executed on the data. The user can view the
machine learning results as well as the forecast (prediction) results.

5.2.2 Admin:
The admin can log in with his login details. The admin can activate the registered users; only
after activation can a user log in to the system. The admin can view the overall data in the
browser. The admin can click Results on the web page so that the results calculated by the
algorithms are displayed. Once all algorithms have finished executing, the admin can see the
overall results on the web page.

5.2.3 Machine learning:
This project uses supervised regression algorithms, namely linear regression, decision tree
regression, random forest regression and polynomial regression, to forecast cancer death rates
for a selected country and cancer type. The results of each regressor are calculated and
displayed on the results page. The regressor that performs best on the held-out test data can be
taken as the best model.
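The regression views in the sample code below split year and value pairs into training and test sets and pass them to helper functions whose implementation is not shown in this document. As a rough illustration of that workflow only, the following sketch fits a straight line with ordinary least squares using the Python standard library. It is a stand-in, not the project's scikit-learn code: the function names split_train_test, fit_line and mean_absolute_error are hypothetical, and the series is synthetic placeholder data, not real mortality figures.

```python
import random

def split_train_test(xs, ys, test_size=0.2, seed=42):
    """Shuffle the paired samples and hold out a fraction for testing."""
    pairs = list(zip(xs, ys))
    random.Random(seed).shuffle(pairs)
    n_test = max(1, int(round(len(pairs) * test_size)))
    return pairs[n_test:], pairs[:n_test]   # (train, test)

def fit_line(train):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_absolute_error(model, test):
    """Average absolute difference between predictions and held-out values."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in test) / len(test)

# Synthetic, exactly linear placeholder series (NOT real mortality data)
years = list(range(1990, 2018))
values = [0.5 * (year - 1990) + 10.0 for year in years]

train, test = split_train_test(years, values)
model = fit_line(train)
mae = mean_absolute_error(model, test)
forecast_2020 = model[0] * 2020 + model[1]   # extrapolate beyond the data
print(model, mae, forecast_2020)
```

Comparing several models then amounts to computing the same error measure for each one on the same test set and preferring the model with the lowest error.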

5.3 Sample Code:

User side views:


import os.path
import pandas as pd
from django.shortcuts import render, HttpResponse
from django.contrib import messages
from .forms import UserRegistrationForm
from .models import UserRegistrationModel
from django.conf import settings
from django.core.files.storage import FileSystemStorage
import numpy as np

# Create your views here.


def UserRegisterActions(request):
    if request.method == 'POST':
        form = UserRegistrationForm(request.POST)
        if form.is_valid():
            print('Data is Valid')
            form.save()
            messages.success(request, 'You have been successfully registered')
            form = UserRegistrationForm()
            return render(request, 'UserRegistrations.html', {'form': form})
        else:
            messages.success(request, 'Email or Mobile Already Existed')
            print("Invalid form")
    else:
        form = UserRegistrationForm()
    return render(request, 'UserRegistrations.html', {'form': form})

def UserLoginCheck(request):
    if request.method == "POST":
        loginid = request.POST.get('loginid')
        pswd = request.POST.get('pswd')
        # Log only the login id; the password is deliberately not printed
        print("Login ID = ", loginid)
        try:
            check = UserRegistrationModel.objects.get(loginid=loginid, password=pswd)
            status = check.status
            print('Status is = ', status)
            if status == "activated":
                # Keep the user's details in the session for later requests
                request.session['id'] = check.id
                request.session['loggeduser'] = check.name
                request.session['loginid'] = loginid
                request.session['email'] = check.email
                print("User id At", check.id, status)
                return render(request, 'users/UserHomePage.html', {})
            else:
                messages.success(request, 'Your account is not yet activated')
                return render(request, 'UserLogin.html')
        except Exception as e:
            print('Exception is ', str(e))
        messages.success(request, 'Invalid Login id and password')
    return render(request, 'UserLogin.html', {})

def UserHome(request):
    return render(request, 'users/UserHomePage.html', {})

def DatasetView(request):
    path = os.path.join(settings.MEDIA_ROOT, 'datasets',
                        '08 disease-burden-rates-by-cancer-types.csv')
    df = pd.read_csv(path)
    country = df['Entity'].unique()
    if request.method == 'POST':
        # Keep only the rows of the country selected in the form
        countryCode = request.POST.get('countryCode')
        df = df[df.Entity.isin([countryCode])]

    # Give the cancer-type columns short display names
    df = df.rename(columns={
        df.columns[3]: 'Other pharynx', df.columns[4]: 'Liver', df.columns[5]: 'Breast',
        df.columns[6]: 'Tracheal', df.columns[7]: 'Gallbladder & biliary tract',
        df.columns[8]: 'Kidney', df.columns[9]: 'Larynx', df.columns[10]: 'Esophageal',
        df.columns[11]: 'Nasopharynx', df.columns[12]: 'Colon & rectum',
        df.columns[13]: 'Non-melanoma skin', df.columns[14]: 'lip & oral',
        df.columns[15]: 'Malignant skin melanoma',
        df.columns[16]: 'Other malignant neoplasms',
        df.columns[17]: 'Mesothelioma', df.columns[18]: 'Hodgkin lymphoma',
        df.columns[19]: 'Non-Hodgkin lymphoma'})
    # Drop the columns that are not displayed, then rows with missing values
    df = df.drop(df.columns[[3, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]], axis=1)
    df = df.dropna()
    df = df.to_html(index=False)

    return render(request, 'users/viewdataset.html', {'data': df, 'county': country})

def UserRegressions(request):
    path = os.path.join(settings.MEDIA_ROOT, 'datasets',
                        '08 disease-burden-rates-by-cancer-types.csv')
    if request.method == 'POST':
        df = pd.read_csv(path)
        df.dropna(inplace=True)
        countryCode = request.POST.get('countryCode')
        cancerType = request.POST.get('cancerType')
        df = df[df.Code.isin([countryCode])]

        # Wrap every value in its own list to obtain 2-D inputs for the regressors
        X_X = [[i] for i in df['Year'].to_list()]
        y_y = [[j] for j in df[cancerType].to_list()]

        from sklearn.model_selection import train_test_split
        X_train, X_test, y_train, y_test = train_test_split(X_X, y_y, test_size=0.2, random_state=42)

        from .utility import CancerRegressionModel
        lr_dict = CancerRegressionModel.process_LinearRegression(X_train, X_test, y_train, y_test)
        dt_dict = CancerRegressionModel.process_decesionTree(X_train, X_test, y_train, y_test)
        rf_dict = CancerRegressionModel.process_randomForest(X_train, X_test, y_train, y_test)
        pf_dict = CancerRegressionModel.process_polynomialRegressor(X_train, X_test, y_train, y_test)
        print("My Predictions:", lr_dict)

        # Reload the full dataset to repopulate the country and cancer-type choices
        df = pd.read_csv(path)
        countryCode = df['Code'].unique()
        Type_of_cancer = df.columns[3:].to_list()
        return render(request, 'users/regressionModelResults.html',
                      {'lr_dict': lr_dict, 'dt_dict': dt_dict, 'rf_dict': rf_dict, 'pf_dict': pf_dict,
                       'countryCode': countryCode,
                       'cancerType': Type_of_cancer})

    else:
        df = pd.read_csv(path)
        df.dropna(inplace=True)
        countryCode = df['Code'].unique()
        Type_of_cancer = df.columns[3:].to_list()
        return render(request, 'users/regressionModel.html', {'countryCode': countryCode,
                                                              'cancerType': Type_of_cancer})

def ForecastAnalysis(request):
    path = os.path.join(settings.MEDIA_ROOT, 'datasets',
                        '08 disease-burden-rates-by-cancer-types.csv')
    if request.method == 'POST':
        df = pd.read_csv(path)
        df.dropna(inplace=True)
        countryCode = request.POST.get('countryCode')
        cancerType = request.POST.get('cancerType')
        df = df[df.Code.isin([countryCode])]

        # Build a two-column frame (year, value) for the forecasting helper
        X = df['Year'].values
        y = df[cancerType].values
        myDf = pd.DataFrame(list(zip(X, y)),
                            columns=['year', 'val'])

        from .utility.predections import FuturePredImpl
        fut = FuturePredImpl()
        pred_ci = fut.startFuturePrediction(myDf)
        print('Am Which type ', type(pred_ci))
        pred_ci['lower val'] = pred_ci['lower val'].astype(float)
        pred_ci['upper val'] = pred_ci['upper val'].astype(float)
        pred_ci['lower val'] = pred_ci['lower val'] / 800
        pred_ci['upper val'] = pred_ci['upper val'] / 800
        print(pred_ci.head())
        pred_ci = pred_ci.tail(600)
        pred_ci = pred_ci.to_html

        # Reload the full dataset to repopulate the country and cancer-type choices
        df = pd.read_csv(path)
        countryCode = df['Code'].unique()
        Type_of_cancer = df.columns[3:].to_list()

        return render(request, 'users/forecastModel.html',
                      {'data': pred_ci, 'countryCode': countryCode,
                       'cancerType': Type_of_cancer})

    else:
        df = pd.read_csv(path)
        df.dropna(inplace=True)
        countryCode = df['Code'].unique()
        Type_of_cancer = df.columns[3:].to_list()
        return render(request, 'users/forecastModel.html', {'countryCode': countryCode,
                                                            'cancerType': Type_of_cancer})

base.html:

{%load static%}
<!doctype html>
<!--[if IE 7 ]> <html lang="en-gb" class="isie ie7 oldie no-js"> <![endif]-->
<!--[if IE 8 ]> <html lang="en-gb" class="isie ie8 oldie no-js"> <![endif]-->
<!--[if IE 9 ]> <html lang="en-gb" class="isie ie9 no-js"> <![endif]-->
<!--[if (gt IE 9)|!(IE)]><!-->
<html lang="en-gb" class="no-js">
<!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<!--[if lt IE 9]>
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<![endif]-->
<title>WebThemez - Single page website</title>
<meta name="description" content="">
<meta name="author" content="WebThemez">
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<!--[if lte IE 8]>
<script type="text/javascript" src="http://explorercanvas.googlecode.com/svn/trunk/excanvas.js"></script>
<![endif]-->
<link rel="stylesheet" href="{%static 'css/bootstrap.min.css'%}" />
<link rel="stylesheet" type="text/css" href="{%static 'css/isotope.css'%}" media="screen" />
<link rel="stylesheet" href="{%static 'js/fancybox/jquery.fancybox.css'%}" type="text/css"
media="screen" />
<link href="{%static 'css/animate.css'%}" rel="stylesheet" media="screen">
<!-- Owl Carousel Assets -->
<link href="{%static 'js/owl-carousel/owl.carousel.css'%}" rel="stylesheet">
<link rel="stylesheet" href="{%static 'css/styles.css'%}" />
<!-- Font Awesome -->
<link href="{%static 'font/css/font-awesome.min.css'%}" rel="stylesheet">
</head>

<body>
<header class="header">
<div class="container">
<nav class="navbar navbar-inverse" role="navigation">
<div class="navbar-header">
<button type="button" id="nav-toggle" class="navbar-toggle" data-toggle="collapse" data-target="#main-nav">
<span class="sr-only">Toggle navigation</span> <span class="icon-bar"></span> <span class="icon-bar"></span> <span class="icon-bar"></span> </button>
<a href="#" class="navbar-brand scroll-top logo animated
bounceInLeft"><b><i></i>Cancer Death Cases Forecasting</b></a> </div>
<!--/.navbar-header-->
<div id="main-nav" class="collapse navbar-collapse">
<ul class="nav navbar-nav" id="mainNav">
<li><a href="{%url 'index'%}" class="scroll-link">Home</a></li>
<li><a href="{%url 'UserLogin'%}" class="scroll-link">Users</a></li>
<li><a href="{%url 'AdminLogin'%}" class="scroll-link">Admin Login</a></li>
<li><a href="{%url 'UserRegister'%}" class="scroll-link">Registrations</a></li>
</ul>
</div>
<!--/.navbar-collapse-->
</nav>
<!--/.navbar-->
</div>
<!--/.container-->
</header>
<!--/.header-->
<div id="#top"></div>
<div id="#top"></div>

<div id="#top"></div>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
{%block contents%}
{%endblock%}
<section id="team" class="page-section">
<div class="container">
<div class="heading text-center">
<!-- Heading -->
<h2>Our Team</h2>
<p>At variations of passages of Lorem Ipsum available, but the majority have suffered
alteration..</p>
</div>
<!-- Team Member's Details -->
<div class="team-content">
<div class="row">
<div class="col-md-3 col-sm-6 col-xs-6">
<!-- Team Member -->
<div class="team-member pDark">
<!-- Image Hover Block -->
<div class="member-img">
<!-- Image -->
<img class="img-responsive" src="{%static 'images/photo-1.jpg'%}" alt=""> </div>
<!-- Member Details -->
<h4>John Doe</h4>
<!-- Designation -->
<span class="pos">CEO</span> </div>
</div>
<div class="col-md-3 col-sm-6 col-xs-6">
<!-- Team Member -->
<div class="team-member pDark">
<!-- Image Hover Block -->
<div class="member-img">
<!-- Image -->
<img class="img-responsive" src="{%static 'images/photo-2.jpg'%}" alt=""> </div>
<!-- Member Details -->
<h4>Larry Doe</h4>
<!-- Designation -->
<span class="pos">Art Director</span> </div>
</div>
<div class="col-md-3 col-sm-6 col-xs-6">
<!-- Team Member -->
<div class="team-member pDark">
<!-- Image Hover Block -->
<div class="member-img">
<!-- Image -->
<img class="img-responsive" src="{%static 'images/photo-3.jpg'%}" alt=""> </div>
<!-- Member Details -->
<h4>Ranith Kays</h4>
<!-- Designation -->
<span class="pos">Manager</span> </div>
</div>
<div class="col-md-3 col-sm-6 col-xs-6">
<!-- Team Member -->
<div class="team-member pDark">
<!-- Image Hover Block -->
<div class="member-img">
<!-- Image -->
<img class="img-responsive" src="{%static 'images/photo-4.jpg'%}" alt=""> </div>
<!-- Member Details -->
<h4>Joan Ray</h4>
<!-- Designation -->
<span class="pos">Creative</span> </div>
</div>
</div>
</div>
</div>
<!--/.container-->
</section>

<!--/.page-section-->
<section class="copyright">
<div class="container">
<div class="row">
<div class="col-sm-12 text-center"> Copyright 2024 | All Rights Reserved -- Template by <a
href="#">Alex Corporations</a> </div>
</div>
<!-- / .row -->
</div>
</section>
<a href="#top" class="topHome"><i class="fa fa-chevron-up fa-2x"></i></a>

<!--[if lte IE 8]><script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.0/jquery.min.js"></script><![endif]-->
<script src="{%static 'js/modernizr-latest.js'%}"></script>
<script src="{%static 'js/jquery-1.8.2.min.js'%}" type="text/javascript"></script>
<script src="{%static 'js/bootstrap.min.js'%}" type="text/javascript"></script>
<script src="{%static 'js/jquery.isotope.min.js'%}" type="text/javascript"></script>
<script src="{%static 'js/fancybox/jquery.fancybox.pack.js'%}" type="text/javascript"></script>
<script src="{%static 'js/jquery.nav.js'%}" type="text/javascript"></script>
<script src="{%static 'js/jquery.fittext.js'%}"></script>
<script src="{%static 'js/waypoints.js'%}"></script>
<script src="{%static 'js/custom.js'%}" type="text/javascript"></script>
<script src="{%static 'js/owl-carousel/owl.carousel.js'%}"></script>
</body>
</html>

Index.html:
{%extends 'base.html'%}
{% load static %}

{%block contents%}
<section id="" class="page-section page">
<div class="container text-center">
<div class="heading">
<h2>METHODOLOGY</h2>
<p>Modeling is the connection between a dependent variable and a set of independent
variables.</p>
</div>
<div class="row">
<div class="col-md-12">
<div id="portfolio">
<ul class="items list-unstyled clearfix animated fadeInRight showing" data-animation="fadeInRight" style="position: relative; height: 438px;">
<li class="item branding" style="position: absolute; left: 0px; top: 0px;"> <a href="{%static 'images/work/1.jpg'%}" class="fancybox"> <img src="{%static 'images/work/1.jpg'%}" alt="">
<div class="overlay"> <span>Etiam porta</span> </div>
</a> </li>
<li class="item photography" style="position: absolute; left: 292px; top: 0px;"> <a href="{%static 'images/work/2.jpg'%}" class="fancybox"> <img src="{%static 'images/work/2.jpg'%}" alt="">
<div class="overlay"> <span>Lorem ipsum</span> </div>
</a> </li>
<li class="item branding" style="position: absolute; left: 585px; top: 0px;"> <a href="{%static 'images/work/3.jpg'%}" class="fancybox"> <img src="{%static 'images/work/3.jpg'%}" alt="">
<div class="overlay"> <span>Vivamus quis</span> </div>
</a> </li>
<li class="item photography" style="position: absolute; left: 877px; top: 0px;"> <a href="{%static 'images/work/4.jpg'%}" class="fancybox"> <img src="{%static 'images/work/4.jpg'%}" alt="">
<div class="overlay"> <span>Donec ac tellus</span> </div>
</a> </li>
<li class="item photography" style="position: absolute; left: 0px; top: 219px;"> <a href="{%static 'images/work/5.jpg'%}" class="fancybox"> <img src="{%static 'images/work/5.jpg'%}" alt="">
<div class="overlay"> <span>Etiam volutpat</span> </div>
</a> </li>
<li class="item web" style="position: absolute; left: 292px; top: 219px;"> <a href="{%static 'images/work/6.jpg'%}" class="fancybox"> <img src="{%static 'images/work/6.jpg'%}" alt="">
<div class="overlay"> <span>Donec congue </span> </div>
</a> </li>
<li class="item photography" style="position: absolute; left: 585px; top: 219px;"> <a href="{%static 'images/work/7.jpg'%}" class="fancybox"> <img src="{%static 'images/work/7.jpg'%}" alt="">
<div class="overlay"> <span>Nullam a ullamcorper diam</span> </div>
</a> </li>
<li class="item web" style="position: absolute; left: 877px; top: 219px;"> <a href="{%static 'images/work/8.jpg'%}" class="fancybox"> <img src="{%static 'images/work/8.jpg'%}" alt="">
<div class="overlay"> <span>Etiam consequat</span> </div>
</a> </li>
</ul>
</div>
</div>
</div>
</div>
</section>
{%endblock%}

Admin side views:


from django.shortcuts import render, HttpResponse
from django.contrib import messages
from users.models import UserRegistrationModel

# Create your views here.


def AdminLoginCheck(request):
    if request.method == 'POST':
        usrid = request.POST.get('loginid')
        pswd = request.POST.get('pswd')
        print("User ID is = ", usrid)
        # The admin credentials are hard-coded in this project.
        if usrid == 'admin' and pswd == 'admin':
            return render(request, 'admins/AdminHome.html')
        else:
            messages.success(request, 'Please Check Your Login Details')
    # GET request or failed login: show the login form again.
    return render(request, 'AdminLogin.html', {})


def AdminHome(request):
    return render(request, 'admins/AdminHome.html')


def RegisterUsersView(request):
    data = UserRegistrationModel.objects.all()
    return render(request, 'admins/viewregisterusers.html', {'data': data})


def ActivaUsers(request):
    if request.method == 'GET':
        user_id = request.GET.get('uid')
        status = 'activated'
        print("PID = ", user_id, status)
        # Mark the selected registration as activated.
        UserRegistrationModel.objects.filter(id=user_id).update(status=status)
    # Always return the refreshed user list so the view never returns None.
    data = UserRegistrationModel.objects.all()
    return render(request, 'admins/viewregisterusers.html', {'data': data})

5.4 Input and Output Design:

5.4.1 Input Design:

Input design is the link between the information system and the user. It comprises the
specifications and procedures for data preparation, that is, the steps needed to put
transaction data into a usable form for processing. This can be achieved either by
instructing the computer to read data from a written or printed document or by having
people key the data directly into the system. The design of input focuses on controlling
the amount of input required, controlling errors, avoiding delay, avoiding extra steps,
and keeping the process simple. The input is designed so that it provides security and
ease of use while retaining privacy. Input design considers the following:

• What data should be given as input?
• How should the data be arranged or coded?
• The dialog that guides the operating personnel in providing input.
• Methods for preparing input validations and the steps to follow when errors occur.

Objectives:

1. Input design is the process of converting a user-oriented description of the input into a
computer-based system. This design is important to avoid errors in the data input process and
to show management the correct way to get accurate information from the computerized
system.

2. It is achieved by creating user-friendly data entry screens that can handle large volumes of
data. The goal of designing input is to make data entry easier and free from errors. The
data entry screen is designed so that all data manipulations can be performed. It
also provides record viewing facilities.

3. When data is entered, it is checked for validity. Data is entered with the help of
screens. Appropriate messages are provided as and when needed so that the user is not left
confused at any point. Thus the objective of input design is to create an input layout that is
easy to follow.
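
The validity check described above could be sketched as follows for the forecast form, which posts the fields countryCode and cancerType. The helper name validate_forecast_input, the error messages, and the example codes and cancer types are hypothetical; in practice the allowed values would be read from the dataset.

```python
def validate_forecast_input(post_data, allowed_codes, allowed_types):
    """Return (cleaned, errors) for the two forecast form fields.

    Hypothetical helper: post_data is any mapping, such as request.POST.
    """
    cleaned = {}
    errors = {}
    code = (post_data.get('countryCode') or '').strip()
    cancer_type = (post_data.get('cancerType') or '').strip()

    # Each field is checked for presence first, then for membership.
    if not code:
        errors['countryCode'] = 'Please select a country code.'
    elif code not in allowed_codes:
        errors['countryCode'] = 'Unknown country code.'
    else:
        cleaned['countryCode'] = code

    if not cancer_type:
        errors['cancerType'] = 'Please select a cancer type.'
    elif cancer_type not in allowed_types:
        errors['cancerType'] = 'Unknown cancer type.'
    else:
        cleaned['cancerType'] = cancer_type

    return cleaned, errors


# Example values below are made up for illustration.
ok, ok_errors = validate_forecast_input(
    {'countryCode': 'IND', 'cancerType': 'Breast cancer'},
    allowed_codes={'IND', 'USA'},
    allowed_types={'Breast cancer', 'Liver cancer'},
)
bad, bad_errors = validate_forecast_input(
    {'countryCode': 'XYZ'},
    allowed_codes={'IND', 'USA'},
    allowed_types={'Breast cancer', 'Liver cancer'},
)
```

Returning the errors as a dictionary keyed by field name lets the screen show an appropriate message next to the field that needs correcting.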

5.4.2 Output Design:

A quality output is one that meets the requirements of the end user and presents the
information clearly. In any system, the results of processing are communicated to the users and
to other systems through outputs. Output design determines how the information is to be
displayed for immediate need and also as hard copy output. It is the most important and direct
source of information for the user. Efficient and intelligent output design improves the
system's relationship with the user and supports decision-making.

Designing computer output should proceed in an organized, well-thought-out manner. The right
output must be developed while ensuring that each output element is designed so that people
find the system easy and effective to use. When analysts design computer output, they should:

1. Identify the specific output that is needed to meet the requirements.

2. Select methods for presenting information.

3. Create documents, reports, or other formats that contain information produced
by the system.

The output form of an information system should accomplish one or more of the
following objectives:
• Convey information about past activities, current status, or projections of the future.
• Signal important events, opportunities, problems, or warnings.
• Trigger an action.
• Confirm an action.
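
As a rough sketch of the first two objectives (conveying projections and signalling warnings), the function below renders forecast bounds as an aligned plain-text report and flags rows whose upper bound exceeds a threshold. The name format_forecast_report, the threshold, and the sample figures are assumptions for this example and are not part of the project code.

```python
def format_forecast_report(rows, warn_above=None):
    """Render (year, lower, upper) rows as an aligned plain-text table.

    Hypothetical helper; rows whose upper bound exceeds warn_above are flagged.
    """
    lines = [f"{'Year':<6}{'Lower':>10}{'Upper':>10}  Note"]
    for year, lower, upper in rows:
        note = ''
        if warn_above is not None and upper > warn_above:
            note = 'WARNING: above threshold'
        # rstrip() removes the trailing spaces left when there is no note.
        lines.append(f"{year:<6}{lower:>10.2f}{upper:>10.2f}  {note}".rstrip())
    return '\n'.join(lines)


# Made-up figures, used only to show the layout.
report = format_forecast_report(
    [(2018, 95.5, 110.25), (2019, 97.0, 125.0)], warn_above=120)
```

Calling print(report) shows one line per forecast year, with a note only on the rows that need attention.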

CHAPTER 6
RESULTS

Homepage:

User register:

Admin Home:

Admin login:

User details:

User login:

User Home:

Dataset:

Regression:

Forecast:

CHAPTER 7
SYSTEM TESTING

The purpose of testing is to discover errors. Testing is the process of trying to discover
every conceivable fault or weakness in a work product. It provides a way to check the
functionality of components, sub-assemblies, assemblies, and/or a finished product. It is the
process of exercising software with the intent of ensuring that the software system meets its
requirements and user expectations and does not fail in an unacceptable manner. There are
various types of tests, and each test type addresses a specific testing requirement.

Types of Tests:
• Unit Testing
• Integration Testing
• Functional Testing
• System Testing
• White Box Testing
• Black Box Testing
• Acceptance Testing

7.1 Unit Testing:
Unit testing involves designing test cases that validate that the internal program logic
functions properly and that program inputs produce valid outputs. All decision branches
and internal code flow should be validated. It is the testing of individual software units of the
application, and it is done after an individual unit is completed and before integration. This is
structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests
perform basic tests at the component level and test a specific business process, application,
and/or system configuration. Unit tests ensure that each unique path of a business process
performs accurately to the documented specifications and contains clearly defined inputs and
expected results.
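
A minimal sketch of a unit test that covers every decision branch of one small unit is given below, using Python's built-in unittest module. The function login_outcome and the record layout are hypothetical stand-ins for the project's login logic, and the plain password comparison is for illustration only (real code should compare password hashes).

```python
import unittest


def login_outcome(record, password):
    """Classify a login attempt; record is a dict or None (hypothetical shape)."""
    if record is None:
        return 'unknown user'
    if record.get('status') != 'activated':
        return 'not activated'
    if record.get('password') != password:
        return 'wrong password'
    return 'ok'


class LoginOutcomeTests(unittest.TestCase):
    """One test per decision branch of login_outcome."""

    def setUp(self):
        self.user = {'status': 'activated', 'password': 's3cret'}

    def test_unknown_user(self):
        self.assertEqual(login_outcome(None, 's3cret'), 'unknown user')

    def test_user_not_activated(self):
        waiting = {'status': 'waiting', 'password': 's3cret'}
        self.assertEqual(login_outcome(waiting, 's3cret'), 'not activated')

    def test_wrong_password(self):
        self.assertEqual(login_outcome(self.user, 'nope'), 'wrong password')

    def test_successful_login(self):
        self.assertEqual(login_outcome(self.user, 's3cret'), 'ok')


def run_login_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoginOutcomeTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test supplies a clearly defined input and states the expected result, so a failure points directly at the branch that misbehaves.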

7.2 Integration Testing:
Integration tests are designed to test integrated software components to determine whether
they actually run as one program. Testing is event driven and is more concerned with the basic
outcome of screens or fields. Integration tests demonstrate that, although the components were
individually satisfactory, as shown by successful unit testing, the combination of
components is correct and consistent. Integration testing is specifically aimed at exposing the
problems that arise from the combination of components.

7.3 Functional Testing:

Functional tests provide systematic demonstrations that the functions tested are available as
specified by the business and technical requirements, system documentation, and user
manuals.
Functional testing is centered on the following items:
Valid Input : identified classes of valid input must be accepted.
Invalid Input : identified classes of invalid input must be rejected.
Functions : identified functions must be exercised.
Output : identified classes of application outputs must be exercised.
Systems/Procedures : interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on requirements, key
functions, or special test cases. In addition, systematic coverage of identified
business process flows, data fields, predefined processes, and successive processes
must be considered for testing. Before functional testing is complete, additional tests
are identified and the effective value of current tests is determined.
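
The valid-input and invalid-input items above can be sketched as a table-driven check. The function parse_year and the listed input classes are hypothetical; the 1990 to 2017 range is taken from the period the report states for its dataset and is used here only as an example bound.

```python
def parse_year(text, first=1990, last=2017):
    """Return the year as an int, or raise ValueError for any invalid input class."""
    try:
        year = int(str(text).strip())
    except ValueError:
        raise ValueError(f'not a whole number: {text!r}') from None
    if not first <= year <= last:
        raise ValueError(f'year {year} outside {first}-{last}')
    return year


# Identified classes of valid input, with the value each must produce.
VALID_CASES = {'1990': 1990, ' 2005 ': 2005, '2017': 2017}
# Identified classes of invalid input: empty, text, out of range, fractional.
INVALID_CASES = ['', 'abc', '1989', '2018', '20.5']


def run_functional_checks():
    """Exercise every listed input class; return a list of failure descriptions."""
    failures = []
    for raw, expected in VALID_CASES.items():
        try:
            if parse_year(raw) != expected:
                failures.append(f'wrong value for valid input: {raw!r}')
        except ValueError:
            failures.append(f'valid input rejected: {raw!r}')
    for raw in INVALID_CASES:
        try:
            parse_year(raw)
        except ValueError:
            continue  # Rejection is the expected behaviour here.
        failures.append(f'invalid input accepted: {raw!r}')
    return failures
```

An empty list from run_functional_checks() means every valid class was accepted and every invalid class was rejected.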

7.4 System Testing:


System testing ensures that the entire integrated software system meets requirements. It
tests a configuration to ensure known and predictable results. An example of system testing is
the configuration-oriented system integration test. System testing is based on process
descriptions and flows, emphasizing pre-driven process links and integration points.

7.5 White Box Testing:
White box testing is testing in which the software tester has knowledge of the inner
workings, structure, and language of the software, or at least its purpose. It is used to test
areas that cannot be reached from a black box level.

7.6 Black Box Testing:

Black box testing is testing the software without any knowledge of the inner workings,
structure, or language of the module being tested. Black box tests, like most other kinds of
tests, must be written from a definitive source document, such as a specification or
requirements document. The software under test is treated as a black box: you cannot “see”
into it. The test provides inputs and responds to outputs without considering how the software
works.

7.7 Test Strategy and Approach:

Field testing will be performed manually, and functional tests will be written in detail.

Test objectives:
• All field entries must work properly.
• Pages must be activated from the identified link.
• The entry screen, messages, and responses must not be delayed.

Features to be tested:
• Verify that the entries are of the correct format.
• No duplicate entries should be allowed.
• All links should take the user to the correct page.

Test Results:
All the test cases mentioned above passed successfully. No defects were encountered.

Acceptance Testing:
User acceptance testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional requirements.

7.8 Sample Test Cases

S.no | Test Case | Expected Result | Result | Remarks (if fails)
-----|-----------|-----------------|--------|-------------------
1 | User Register | User registration completes successfully. | Pass | Fails if the user's email already exists.
2 | User Login | If the username and password are correct, the valid page is shown. | Pass | Unregistered users will not be logged in.
5 | Regression | The request is accepted by the regression model, which is used to predict disease progression based on input features. | Pass | Fails if the request is not accepted by the regression model.
6 | View dataset by user | The dataset is displayed to the user. | Pass | Fails if the results are not true.
7 | User classification | Reviews are displayed with true results. | Pass | Fails if the results are not true.
8 | Calculate accuracy, macro avg and weighted avg | Macro avg and weighted avg are calculated. | Pass | Fails if macro avg and weighted avg are not displayed.
9 | Admin login | Admin can log in with his login credentials. On success he gets his home page. | Pass | Invalid login details are not allowed.
10 | Admin can activate the registered users | Admin can activate the registered user ID. | Pass | If the user ID is not found, the user cannot log in.

CHAPTER 8
CONCLUSION
The application of supervised machine learning techniques to forecast cancer-related death
cases represents a step forward at the intersection of healthcare and artificial intelligence.
This project has demonstrated that predictive analytics, when backed by robust machine learning
models, offers significant potential for strategic planning and informed decision-making in
healthcare systems. By employing regression algorithms such as linear regression, decision tree
regression, and random forest regression, we have been able to build a system that analyzes
historical cancer mortality data and generates predictions of future mortality trends.

The capability to forecast cancer death cases enables healthcare providers and
policymakers to allocate resources more efficiently, plan for future demands on medical
infrastructure, and implement targeted public health interventions. Early warning systems based
on such models can prompt timely screenings, encourage preventive measures, and ultimately
reduce cancer mortality rates. Moreover, this proactive approach can help in reducing the burden
on healthcare systems, especially in countries with limited resources.

Despite these advantages, several challenges still need to be addressed. The quality and
completeness of input data play a crucial role in the accuracy and reliability of predictive models.
Issues such as missing data, bias in data collection, and inconsistencies across healthcare databases
can impact model performance. Furthermore, the computational complexity of training advanced
machine learning models can require significant infrastructure and expertise, which may not be
available in all regions. Ethical concerns such as patient data privacy, algorithmic transparency,
and the potential for misuse of predictive results also need to be thoughtfully considered and
regulated.

8.1 Further Enhancement:

Cancer death cases forecasting using supervised machine learning has the potential to be a
valuable tool for public health officials and researchers. By accurately predicting the number of
cancer deaths that will occur in the future, these officials can better allocate resources and develop
more effective prevention and treatment strategies.
One way to enhance the accuracy of cancer death cases forecasting is to combine the predictions
of several machine learning algorithms instead of relying on a single one. This is known as
ensemble learning. Ensemble learning algorithms work by combining the predictions of multiple
individual algorithms to produce a more accurate overall prediction.
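
The idea of combining the predictions of multiple individual algorithms can be sketched as a weighted average of simple forecasters. The three base models, their names, and the sample figures below are assumptions chosen to keep the example self-contained; they are not the models evaluated in this project.

```python
def mean_model(years, values):
    """Predict the historical average for any year."""
    average = sum(values) / len(values)
    return lambda year: average


def last_value_model(years, values):
    """Predict that the most recent value persists."""
    last = values[-1]
    return lambda year: last


def linear_model(years, values):
    """Predict by extending a least-squares straight line."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda year: intercept + slope * year


def ensemble_predict(models, year, weights=None):
    """Combine the models' predictions for one year by weighted averaging."""
    predictions = [model(year) for model in models]
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions) or sum(weights) == 0:
        raise ValueError('weights must match the models and not sum to zero')
    return sum(w * p for w, p in zip(weights, predictions)) / sum(weights)


# Made-up history, used only to exercise the functions.
history_years = [2014, 2015, 2016, 2017]
history_values = [10.0, 12.0, 14.0, 16.0]
models = [builder(history_years, history_values)
          for builder in (mean_model, last_value_model, linear_model)]
combined = ensemble_predict(models, 2018)
```

Weights could be chosen from each model's validation error so that better-performing models contribute more. A random forest is itself an ensemble of this kind, since it averages the outputs of many decision trees.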

Another way to improve the accuracy of cancer death cases forecasting is to use real-time
data. This data can include information such as cancer incidence rates, cancer treatment trends,
and population demographics. By using real-time data, machine learning algorithms can better
account for changes in the cancer landscape and produce more accurate predictions.
Finally, the accuracy of cancer death cases forecasting can be improved by using artificial
intelligence (AI) techniques. AI techniques can be used to develop more sophisticated machine
learning algorithms that can better understand the complex relationships between the various
factors that influence cancer mortality.

CHAPTER 9
BIBLIOGRAPHY

REFERENCES
[1] K. Sathishkumar, M. Chaturvedi, P. Das, S. Stephen, and P. Mathur, “Cancer incidence
estimates for 2022 & projection for 2025: Result from National Cancer Registry Programme,
India.,” The Indian journal of medical research, vol. 156, no. 4&5, pp. 598–607, 2022, doi:
10.4103/ijmr.ijmr_1821_22.
[2] C. Tudor, “A Novel Approach to Modeling and Forecasting Cancer Incidence and
Mortality Rates through Web Queries and Automated Forecasting Algorithms: Evidence from
Romania.,” Biology, vol. 11, no. 6, Jun. 2022, doi: 10.3390/biology11060857.
[3] R. Gupta, A. Sharma, V. Anand, and S. Gupta, “Automobile Price Prediction using
Regression Models,” in 2022 International Conference on Inventive Computation Technologies
(ICICT), 2022, pp. 410–416. doi: 10.1109/ICICT54344.2022.9850657.
[4] M. Dalmartello et al., “European cancer mortality predictions for the year 2022 with focus
on ovarian cancer,” Annals of Oncology, vol. 33, no. 3, pp. 330–339, 2022,
doi: https://doi.org/10.1016/j.annonc.2021.12.007.
[5] B. Trächsel, E. Rapiti, A. Feller, V. Rousson, I. Locatelli, and J.-L. Bulliard, “Predicting
the burden of cancer in Switzerland up to 2025,” PLOS Global Public Health, vol. 2, no. 10, p.
e0001112, Oct. 2022, [Online]. Available: https://doi.org/10.1371/journal.pgph.0001112
[6] K. W. Jung, Y. J. Won, M. J. Kang, H. J. Kong, J. S. Im, and H. G. Seo, “Prediction of
Cancer Incidence and Mortality in Korea, 2022,” Cancer Research and Treatment, vol. 54, no. 2,
pp. 345–351, 2022, doi: 10.4143/crt.2022.179.
[7] B. Sekeroglu and K. Tuncal, “Prediction of cancer incidence rates for the European
continent using machine learning models,” Health Informatics Journal, vol. 27, no. 1, p.
1460458220983878, Jan. 2021, doi: 10.1177/1460458220983878.
[8] S. Shaikh, J. Gala, A. Jain, S. Advani, S. Jaidhara, and M. R.Edinburgh, “Analysis and
Prediction of COVID-19 using Regression Models and Time Series Forecasting,” in 2021 11th

68
International Conference on Cloud Computing, Data Science & Engineering (Confluence),
2021, pp. 989–995. doi: 10.1109/Confluence51648.2021.9377137.
[9] L. Rahib, M. R. Wehner, L. M. Matrisian, and K. T. Nead, “Estimated Projection of US
Cancer Incidence and Death to 2040,” JAMA Network Open, vol. 4, no. 4, pp. e214708–e214708,
Apr. 2021, doi: 10.1001/jamanetworkopen.2021.4708.
[10] T. Jain, V. K. Verma, M. Agarwal, A. Yadav, and A. Jain, “Supervised Machine Learning
Approach For The Prediction of Breast Cancer,” in 2020 International Conference on System,
Computation, Automation and Networking (ICSCAN), 2020, pp. 1–6. doi:
10.1109/ICSCAN49426.2020.9262403.
[11] V. R. J and A. Jakka, “Forecasting COVID-19 cases in India Using Machine Learning
Models,” in 2020 International Conference on Smart Technologies in Computing, Electrical and
Electronics (ICSTCEE), 2020, pp. 466–471. doi:
10.1109/ICSTCEE49637.2020.9276852.
[12] E.-D. J Med Discov and L. Xie, “Time Series Analysis and Prediction on Cancer Incidence
Rates,” Journal of Medical Discovery, vol. 2, p. jmd17030, Sep. 2017, doi:
10.24262/jmd.2.3.17030.
[13] X. X. Ai, H. Jia, and L. Xin, “SVM-based Cancer Incidence Forecasting of Patients,” in
2016 9th International Symposium on Computational Intelligence and Design (ISCID), 2016, vol.
2, pp. 281– 284. doi: 10.1109/ISCID.2016.2074.
[14] G. Dong and V. Taslimitehrani, “Pattern-aided regression modeling and prediction model
analysis,” in 2016 IEEE 32nd International Conference on Data Engineering (ICDE), 2016, pp.
1508–1509. doi: 10.1109/ICDE.2016.7498398.
[15] I. Ali, W. A. Wani, and K. Saleem, “Cancer scenario in India with future perspectives,”
Cancer Therapy, vol. 8, no. ISSUE A, pp. 56–70, 2011.
