Final Year Project[1]

The document presents a major project report on an 'AI Based Health Monitoring Assistant' developed by students of AMC Engineering College as part of their Bachelor of Engineering in Artificial Intelligence and Machine Learning. The project utilizes Optical Character Recognition (OCR) to automate data extraction from blood test reports, enabling efficient monitoring and analysis of health metrics over time. The system aims to enhance preventive healthcare through predictive analytics and continuous learning, ultimately improving patient engagement and supporting healthcare providers.


VISVESVARAYA TECHNOLOGICAL UNIVERSITY

BELAGAVI-590018

A Major Project Report On


“AI BASED HEALTH MONITORING ASSISTANT”
Submitted in partial fulfilment of the requirements for the award of the degree
of
Bachelor Of Engineering In
Artificial Intelligence & Machine Learning

Submitted by
DEEKSHA BHAVANI S :1AM21AI008
SAMUEL KENNETH JOSEPH :1AM21AI034
SHRINIDHI J :1AM21AI037

Under the Support and Guidance of


Mrs. C. Subhashri
Assistant Professor, Dept. of
AIML

AMC ENGINEERING COLLEGE


DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
18th K.M. Bannerghatta Main Road, Bengaluru-560083

2024-2025
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
AMC ENGINEERING COLLEGE
18th K.M. Bannerghatta Main Road Bengaluru – 560083

CERTIFICATE

This is to certify that the Major Project work entitled “AI BASED HEALTH MONITORING ASSISTANT”
was carried out by DEEKSHA BHAVANI S (1AM21AI008), SHRINIDHI J (1AM21AI037), and
SAMUEL KENNETH JOSEPH (1AM21AI034), bonafide students of AMC Engineering College, in
partial fulfilment of the requirements of the VII semester (Major Project Work (21AIP76)) of the Bachelor of
Engineering in Artificial Intelligence and Machine Learning, Visvesvaraya Technological University, Belagavi,
during the year 2024 – 2025. It is certified that all corrections/suggestions indicated for Internal Assessment
have been incorporated in the Report deposited in the departmental library. The Major Project report has been
approved as it satisfies the academic requirements.

Signature of the Guide: Mrs. C. Subhashri, Assistant Professor, Dept. of AIML
Signature of the HOD: Dr. Sridhar C.S, Professor & HOD, Dept. of AIML
Signature of the Principal: Dr. K Kumar, Principal

External Viva
Name of the examiners Signature with date

1.

2.
DECLARATION

We, DEEKSHA BHAVANI S (1AM21AI008), SAMUEL KENNETH JOSEPH (1AM21AI034), and
SHRINIDHI J (1AM21AI037), students of VII semester of BE, Artificial Intelligence & Machine
Learning, AMC Engineering College, hereby declare that the Major Project work entitled “AI BASED
HEALTH MONITORING ASSISTANT” has been carried out by us at AMC Engineering College,
Bengaluru, and submitted in partial fulfilment of the course requirements of Bachelor of Engineering
in ARTIFICIAL INTELLIGENCE and MACHINE LEARNING of Visvesvaraya Technological
University, Belagavi, during the academic year 2024-2025. We also declare that, to the best of our
knowledge and belief, the work reported here does not form part of any other dissertation on the basis
of which a degree or an award was conferred on an earlier occasion on any other student.

Date:
Place: Bengaluru

Name USN Signature

DEEKSHA BHAVANI S 1AM21AI008

SAMUEL KENNETH JOSEPH 1AM21AI034

SHRINIDHI J 1AM21AI037
ACKNOWLEDGEMENT

It gives us immense pleasure to present before you our project titled “AI BASED HEALTH
MONITORING ASSISTANT”. The joy and satisfaction that accompany the successful
completion of any task would be incomplete without the mention of those who made it possible.

We are glad to express our gratitude towards our prestigious institution AMC ENGINEERING
COLLEGE for providing us with utmost knowledge, encouragement, and the maximum facilities
in undertaking this major project.

First of all, we would like to thank the Management of AMC Engineering College for providing
such a healthy environment for the successful completion of the Major project work.

In this regard, we express our sincere gratitude to Dr. K Kumar, Principal, AMCEC, for providing us
with all the facilities in this college.

We express our deepest gratitude and special thanks to Dr. Sridhar C.S, Professor & H.O.D,
Dept. Of Artificial Intelligence and Machine Learning, for all his guidance and encouragement.

We sincerely acknowledge the guidance and constant encouragement of our Major Project guide,
Mrs. C. Subhashri, Assistant Professor, Dept. of Artificial Intelligence & Machine Learning.

DEEKSHA BHAVANI S 1AM21AI008


SAMUEL KENNETH JOSEPH 1AM21AI034
SHRINIDHI J 1AM21AI037
ABSTRACT

In recent years, advancements in Artificial Intelligence (AI) and Deep Learning (DL) have led to
revolutionary changes across numerous fields, including healthcare. This paper explores a novel
application of AI technology that uses Optical Character Recognition (OCR) to extract detailed
data from blood test reports, analyze key health metrics over time, and generate trend
visualizations. By automating data capture from health reports, this system circumvents the need
for manual data entry, enhancing the accuracy and efficiency of patient data management. The
OCR tool captures values for various health parameters—such as glucose, cholesterol,
hemoglobin, and white blood cell count—directly from scanned reports or photographs. The
extracted data is then processed by deep learning models that are designed to identify patterns,
track changes, and make predictions based on historical data trends.

Using line charts as a primary visualization method, this system allows for an easy-to-understand,
graphical representation of blood test results over time. Line charts offer patients and healthcare
providers a clear view of how individual parameters have fluctuated across multiple test periods,
enabling a proactive approach in managing health. Such trend analyses can aid in early detection
of irregularities, helping to identify potential health risks like diabetes, infections, or
cardiovascular concerns before they escalate. The AI-based system also includes algorithms that
identify anomalies, flagging parameters that deviate significantly from the patient’s baseline,
which can prompt further medical investigation.

The implementation of this AI-based solution involves training deep learning models to accurately
recognize numeric patterns and potential correlations between different health factors. By
leveraging a large dataset of blood report samples, the model is fine-tuned to produce reliable and
precise outputs. Moreover, the system is designed to improve over time through continuous
learning from new data inputs. This ensures that the predictive analytics provided by the system
become increasingly refined, enhancing the ability of healthcare providers to make data-informed
decisions.

Overall, the proposed AI application demonstrates significant potential for improving patient
health monitoring and management. By providing an automated, data-driven approach to
interpreting blood reports, this solution could transform preventive healthcare, allowing for a more
personalized and timely response to emerging health conditions. The insights derived from
continuous monitoring empower both patients and healthcare professionals to better understand
and manage health on a regular basis, moving towards a more proactive, AI-enabled healthcare.
INDEX

Chapter 1: INTRODUCTION
1.1 AI BASED HEALTH MONITORING ASSISTANT
1.2 INTRODUCTION TO OCR
1.3 INTRODUCTION TO NLP
1.4 CONCEPT, ANALYSIS AND APPLICATIONS

Chapter 2: LITERATURE SURVEY
2.1 ADVANCEMENT
2.2 REMOTE MONITORING
2.3 ENHANCING DIAGNOSIS

Chapter 3: OBJECTIVES AND METHODOLOGY
3.1 OBJECTIVES
3.2 ADVANTAGES
3.3 DRAWBACKS

Chapter 4: SYSTEM ANALYSIS
4.1 WHAT IS SYSTEM ANALYSIS
4.2 SOFTWARE REQUIREMENT
4.3 HARDWARE REQUIREMENT

Chapter 5: DATASET AND LIBRARIES
5.1 DATASET
5.2 PYTHON LIBRARIES

Chapter 6: IMPLEMENTATION
6.1 SYSTEM IMPLEMENTATION
6.2 ALGORITHMS
6.3 MODEL TRAINING
6.4 DETAIL PROCESS
6.5 CODE SUMMARY

Chapter 7: ARCHITECTURE
7.1 SYSTEM ARCHITECTURE DIAGRAM
7.2 WORKFLOW DIAGRAM

Chapter 8: RESULTS, SCREENSHOTS AND CONCLUSION
8.1 RESULTS AND SCREENSHOTS
8.2 PLOT OF ROC-AUC CURVE
8.3 EVALUATION METRIC GRAPH

Chapter 9: CONCLUSION, FUTURE SCOPE AND REFERENCES
9.1 CONCLUSION
9.2 FUTURE SCOPE
9.3 REFERENCES

AI Based Health Monitoring Assistant

Chapter 1

INTRODUCTION

1.1 AI based health monitoring assistant

Advancements in Artificial Intelligence (AI) and Deep Learning (DL) have revolutionized various
fields, including healthcare. This project explores an innovative application of AI technology that
leverages Optical Character Recognition (OCR) to extract detailed data from blood test reports,
analyze key health metrics over time, and generate trend visualizations. By automating data capture
from health reports, this system eliminates the need for manual data entry, thereby enhancing the
accuracy and efficiency of patient data management.

The OCR tool captures values for various health parameters—such as glucose, cholesterol,
hemoglobin, and white blood cell count—directly from scanned reports or photographs. The
extracted data is processed by deep learning models designed to identify patterns, track changes, and
make predictions based on historical data trends. Using line charts as the primary visualization
method, this system provides an easy-to-understand graphical representation of blood test results
over time.

This automated, data-driven approach facilitates a proactive and personalized response to emerging
health conditions, allowing both patients and healthcare professionals to better understand and
manage health on a regular basis. Through continuous monitoring and predictive analytics, the
system aims to improve preventive healthcare and empower users with actionable health insights.

1.2 Introduction to Optical Character Recognition (OCR):

Optical Character Recognition (OCR) is a technology that converts different types of documents,
such as scanned paper documents, PDFs, or images captured by a digital camera, into editable and
searchable data. OCR software works by analyzing the structures of the characters and the words in
a document, interpreting them, and converting them into machine-encoded text.

Key Features and Benefits:

1. Text Recognition: OCR can identify printed or handwritten text characters within scanned
documents or images, making it possible to digitize hardcopy records.

2. Efficiency and Accuracy: It significantly reduces the time and effort required to manually
transcribe documents, enhancing accuracy and productivity.

3. Data Management: OCR enables easy storage, retrieval, and editing of documents, facilitating
better data management and accessibility.

4. Searchable Text: Once converted, the text can be searched, edited, and even translated, making
it useful for a variety of applications across different sectors.

Department of AIML, AMCEC 2024-25 2


5. Accessibility: OCR technology helps in creating accessible content for visually impaired users
by converting text into speech or Braille.

Applications of OCR:

• Document Digitization: Used by libraries, offices, and archives to convert physical documents
into digital formats.

• Data Entry Automation: Helps businesses automate data entry processes, reducing manual
labor and errors.

• Invoice and Receipt Scanning: Frequently used in finance to scan and manage invoices,
receipts, and other financial documents.

• Text Extraction for Analysis: Useful in extracting text from images for further analysis in
research, legal, and medical fields.

• Language Translation: Assists in translating text from scanned foreign language documents.

OCR technology continues to evolve, incorporating advanced AI and machine learning techniques
to improve its accuracy and range of applications. It plays a crucial role in modernizing document
management and enhancing the accessibility and usability of information in the digital age.
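
To make the text-extraction use case concrete, the following is a minimal sketch of how numeric values might be pulled out of text that an OCR engine has already produced from a blood report. The sample text, the label patterns, and the `parse_report` helper are illustrative assumptions and not the project's actual implementation; real report layouts vary and would need their own patterns.

```python
import re

# Hypothetical OCR output; a real engine would produce this from a scanned report.
SAMPLE_OCR_TEXT = """
Hemoglobin      13.5 g/dL
Glucose (Fasting) 98 mg/dL
Total Cholesterol: 182 mg/dL
WBC Count  7200 /cumm
"""

# Canonical parameter name -> regex for its printed label (assumed labels).
PARAMETER_PATTERNS = {
    "hemoglobin": r"hemoglobin",
    "glucose": r"glucose(?:\s*\(fasting\))?",
    "cholesterol": r"(?:total\s+)?cholesterol",
    "wbc": r"wbc(?:\s+count)?",
}


def parse_report(text):
    """Return {parameter: float} for every known label found in the OCR text."""
    results = {}
    for name, label in PARAMETER_PATTERNS.items():
        # Label, optional colon, whitespace, then the first number that follows.
        match = re.search(rf"{label}\s*:?\s*(\d+(?:\.\d+)?)", text, re.IGNORECASE)
        if match:
            results[name] = float(match.group(1))
    return results


if __name__ == "__main__":
    print(parse_report(SAMPLE_OCR_TEXT))
```

Keeping the label patterns in a single dictionary means a new parameter can be supported by adding one entry, without touching the parsing loop.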

1.3 Introduction to Natural Language Processing (NLP)

Natural Language Processing (NLP) is a branch of artificial intelligence that focuses on the
interaction between computers and humans through natural language. The goal of NLP is to enable
computers to understand, interpret, and generate human language in a way that is both meaningful
and useful.

Key Components of NLP:

1. Text Analysis: Breaking down text into its component parts to understand its structure and
meaning. This includes tasks like tokenization (splitting text into words or sentences), part-of-
speech tagging (identifying the grammatical parts of a text), and named entity recognition
(identifying names of people, organizations, locations, etc.).

2. Language Modeling: Developing models that can predict the next word in a sequence of words,
which is crucial for tasks like text generation and auto-completion.

3. Machine Translation: Translating text from one language to another using algorithms and
models that understand and generate text in multiple languages.

4. Sentiment Analysis: Determining the sentiment or emotion expressed in a piece of text, such as
whether a review is positive, negative, or neutral.

5. Text Summarization: Automatically generating a concise summary of a longer text, which helps
in quickly understanding large volumes of information.

6. Question Answering: Developing systems that can answer questions posed in natural language,
by understanding the context and retrieving relevant information.

7. Speech Recognition and Synthesis: Converting spoken language into text (speech recognition)
and generating spoken language from text (speech synthesis).

Applications of NLP:

• Customer Support: Chatbots and virtual assistants that understand and respond to customer
queries.

• Healthcare: Analyzing patient records, medical literature, and research papers to support
healthcare professionals.

• Finance: Processing and analyzing financial documents, news, and reports to make informed
decisions.

• Education: Developing tools that help in language learning and understanding educational
content.

• Social Media Monitoring: Analyzing sentiments and trends on social media platforms to gauge
public opinion.

NLP is continuously evolving, powered by advancements in machine learning and deep learning,
making it an essential technology for automating and enhancing communication and information
processing in numerous fields.
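
As an illustration of tokenization, the first text analysis task listed above, here is a minimal sketch using only Python's standard `re` module. The two helper functions are hypothetical and deliberately naive (for example, an abbreviation such as "Dr." followed by a space would be treated as a sentence end); dedicated NLP libraries handle such cases far more carefully.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize_words(sentence):
    """Return lowercase word tokens, keeping decimal numbers such as 13.5 whole."""
    return re.findall(r"\d+(?:\.\d+)?|[a-z]+", sentence.lower())


if __name__ == "__main__":
    for s in split_sentences("Glucose is high. Repeat the test!"):
        print(tokenize_words(s))
```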

1.4 AI Based Health Monitoring Assistant: Concept, Analysis and Applications

Concept
• The AI-Based Health Monitoring Assistant leverages advancements in Artificial Intelligence
(AI) and Deep Learning (DL) to revolutionize healthcare. This system utilizes Optical Character
Recognition (OCR) technology to extract data from blood test reports, automate data capture,
and analyse key health metrics over time. By eliminating the need for manual data entry, this
solution enhances the accuracy and efficiency of patient data management.

Analysis

• Data Extraction and Processing: The OCR tool captures values for health parameters such as
glucose, cholesterol, hemoglobin, and white blood cell count directly from scanned reports or
photographs. Deep learning models process this data to identify patterns, track changes, and
make predictions based on historical trends.

• Visualization: The system uses line charts to provide a graphical representation of blood test
results over time, making it easy for patients and healthcare providers to understand fluctuations
in health metrics.


• Predictive Analytics: By analyzing historical data, the AI system can predict potential
health issues, enabling early intervention. The system also includes algorithms that flag
anomalies, prompting further medical investigation when necessary.

• Continuous Learning: The system improves over time by learning from new data inputs,
enhancing the precision of its predictive analytics and supporting more informed healthcare
decisions.
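
The anomaly flagging described above can be sketched, under simple assumptions, as a comparison of a new reading against the patient's own earlier readings. The z-score rule, the threshold of 2.0, and the `flag_anomaly` helper are illustrative choices rather than the algorithm used in this project.

```python
from statistics import mean, stdev


def flag_anomaly(history, new_value, threshold=2.0):
    """Return True if new_value lies more than `threshold` sample standard
    deviations away from the mean of the patient's earlier readings."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # All earlier readings identical: any different value is a deviation.
        return new_value != baseline
    return abs(new_value - baseline) / spread > threshold


if __name__ == "__main__":
    glucose_history = [95, 98, 102, 97, 100]  # made-up readings
    print(flag_anomaly(glucose_history, 140))
    print(flag_anomaly(glucose_history, 99))
```

Appending each accepted reading to `history` lets the baseline shift as new reports arrive, which is one simple way to read the continuous learning point above.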

Applications

• Proactive Health Management: Provides continuous monitoring and early detection of
health irregularities, aiding in the prevention and management of conditions like diabetes,
infections, and cardiovascular diseases.

• Enhanced Patient Engagement: Empowers patients with clear, visual insights into their
health metrics, facilitating better understanding and management of their health.

• Support for Healthcare Providers: Assists healthcare providers in tracking patient health
trends and making data-driven decisions, improving the quality of care.

• Efficiency in Data Management: Automates the data capture and analysis process, reducing
the workload for healthcare professionals and minimizing human errors.

Problem Statement

The traditional approach to managing patient health data, particularly from blood test reports,
involves manual data entry and interpretation, which is time-consuming and prone to errors. This
manual process can lead to inaccuracies in patient records and delayed health interventions.
Additionally, patients and healthcare providers often lack efficient tools to visualize and analyze
health trends over time, which is crucial for early detection of potential health issues. The need
for an automated, accurate, and efficient system to manage, analyze, and visualize blood test data
has become increasingly critical in modern healthcare.

Goals and Objectives of the Project

Automate Data Extraction: Develop an AI-based system that uses Optical Character
Recognition (OCR) to automatically extract detailed health data from blood test reports,
eliminating the need for manual data entry.

Enhance Data Accuracy: Improve the accuracy of patient records by reducing human errors
associated with manual data handling.

Analyze Health Metrics: Utilize deep learning models to process and analyze key health metrics
over time, identifying patterns and trends in patient data.

Visualize Health Data: Create clear and understandable visual representations of blood test
results using line charts, enabling both patients and healthcare providers to easily track health
metrics over multiple test periods.

Early Detection of Health Risks: Implement algorithms to detect anomalies and deviations in
health parameters, allowing for early identification of potential health risks such as diabetes,
infections, and cardiovascular issues.

Continuous Learning and Improvement: Design the system to continuously learn from new
data inputs, refining its predictive analytics and enhancing its ability to provide accurate and
reliable health insights.

Proactive Health Management: Empower patients and healthcare professionals with actionable
insights derived from continuous monitoring and analysis, promoting a proactive approach to
health management.

Personalized Healthcare: Support personalized healthcare by providing tailored health insights
based on individual patient data, improving the overall quality of care.


Chapter-2

LITERATURE SURVEY

2.1 Advancements in IoT-Based Health Monitoring Systems

Year: 2023

Authors: Aritra Dey and Biswamoy Pal

Problem Addressed:
The study by Aritra Dey and Biswamoy Pal (2023) addresses the need for a consolidated, up-to-date
picture of IoT-based health monitoring systems. Its aim is to map the key technological
innovations, applications, challenges, and future directions in this area.

Methodology:
The authors conducted a comprehensive literature survey to provide an up-to-date understanding
of the current state of IoT-based health monitoring systems. They examined a wide range of
research articles, conference papers, and industry reports to identify key technological
innovations, applications, challenges, and future directions. The survey focused on
technological advancements such as sensor technologies, wearable devices, communication
protocols, and data analytics techniques.

Drawbacks:
Despite the promising advancements, the study highlighted several challenges associated with
IoT-based health monitoring systems. These include privacy concerns, as the collection and
transmission of health data raise significant privacy issues. Security is another major challenge,
as these systems must be protected against cyber-attacks and data breaches. Interoperability
issues arise when different devices and systems need to work together seamlessly. Data
management is also a concern, as the vast amount of data generated by these systems requires
efficient storage, processing, and analysis. Finally, regulatory considerations must be addressed
to ensure compliance with healthcare regulations and standards.


2.2 Remote Monitoring and Management of Elderly Patients Using IoT-Based Healthcare Systems

Year: 2023

Authors: Abdikarim Abi Hassan, Kemal Tutuncu, Husein Osman Abdullahi, Abdifatah Farah Ali

Problem Addressed:

The study by Abdikarim Abi Hassan, Kemal Tutuncu, Husein Osman Abdullahi, and Abdifatah
Farah Ali (2023) addresses the need for effective remote monitoring and management of elderly
patients using IoT-based healthcare systems. The aging population poses significant healthcare
challenges, as elderly individuals often require continuous monitoring due to chronic illnesses
and other health conditions.

Methodology:

The researchers developed an IoT-based healthcare system specifically designed to monitor vital
signs of elderly patients. The system included wearable devices equipped with sensors to track
various health metrics such as body temperature, blood pressure, heart rate, and sleep patterns.
These wearable devices continuously collected data and transmitted it to a centralized system
where healthcare providers could access and analyze the information in real-time. The system
aimed to provide timely alerts and interventions, improving the overall quality of care for elderly
patients.

Drawbacks:

Despite the promising capabilities of the IoT-based healthcare system, the study identified
several significant challenges. Privacy and security were major concerns, as the continuous
monitoring and transmission of sensitive health data require robust measures to protect against
unauthorized access and cyber threats. Interoperability issues also emerged, as the system needed
to seamlessly integrate with other healthcare technologies and platforms. The reliability of real-
time data alerts was another challenge, as any delays or inaccuracies in data transmission could
impact patient care. Additionally, the study highlighted the importance of user-friendly interfaces
to ensure that both healthcare providers and elderly patients could effectively use the system. The
need for continuous monitoring and maintenance of the devices to ensure their proper functioning
was also emphasized.


2.3 Enhancing Diagnosis and Treatment Planning Using AI in Healthcare

Year:2023

Author: Sachin Shetty V S, Vedanth M, Manjunath S

Problem Addressed:

The study by Sachin Shetty V S, Vedanth M, and Manjunath S (2023) focuses on the role of
Artificial Intelligence (AI) in enhancing the diagnosis and treatment planning within healthcare. The
aim is to leverage AI technologies to improve the accuracy and efficiency of medical diagnostics
and develop better treatment strategies, which are critical for patient outcomes.

Methodology:

The researchers conducted a comprehensive survey of various AI applications in healthcare. They
explored machine learning algorithms, natural language processing (NLP) techniques, and other
AI-driven approaches that can be utilized for medical diagnostics. The study examined how these
technologies can be applied to analyze medical images, interpret clinical data, and predict patient
outcomes. The focus was on developing models that can assist healthcare professionals by providing
insights derived from large datasets, thereby improving the precision of diagnoses and the
effectiveness of treatment plans.

Drawbacks:

The study identified several challenges and ethical considerations associated with the use of AI in
healthcare. One major concern is data privacy, as AI systems often require access to vast amounts
of sensitive patient data to function effectively. Ensuring that this data is protected and used
responsibly is paramount. Additionally, there are regulatory hurdles that need to be addressed to
ensure that AI applications in healthcare meet safety and efficacy standards. The study also pointed
out the potential for bias in AI algorithms, which can result from skewed training data, leading to
inaccurate or unfair treatment recommendations. The integration of AI into existing healthcare
systems requires substantial investment and changes to current workflows, which can be barriers to
adoption. Finally, the study emphasized the importance of maintaining the human element in
healthcare, as AI should augment rather than replace the critical judgement and empathy provided
by healthcare professionals.


Chapter-3

OBJECTIVES AND METHODOLOGY

3.1 Objectives of the AI-Based Health Monitoring Assistant

1. Automate Data Extraction: Develop an AI-based system using OCR technology to
automatically extract health data from blood test reports, eliminating the need for manual data
entry.

2. Enhance Data Accuracy: Improve the precision of patient records by reducing human errors
associated with manual data handling.

3. Analyze Health Metrics: Utilize deep learning models to process and analyze key health
metrics over time, identifying patterns and trends in patient data.

4. Visualize Health Data: Create clear and understandable visual representations of blood test
results using line charts, enabling easy tracking of health metrics over multiple test periods.

5. Early Detection of Health Risks: Implement algorithms to detect anomalies and deviations in
health parameters, allowing early identification of potential health risks such as diabetes,
infections, and cardiovascular issues.

6. Continuous Learning and Improvement: Design the system to continuously learn from new
data inputs, refining its predictive analytics and enhancing its ability to provide accurate and
reliable health insights.

7. Proactive Health Management: Empower patients and healthcare professionals with
actionable insights derived from continuous monitoring and analysis, promoting a proactive
approach to health management.

8. Personalized Healthcare: Support personalized healthcare by providing tailored health
insights based on individual patient data, improving the overall quality of care.
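
Objectives 3 and 4 (analyzing metrics over time and plotting them as line charts) both depend on arranging extracted values into per-parameter time series. The sketch below shows one way this could be done, along with a simple least-squares trend estimate. The record layout and the helpers `build_series` and `trend_per_day` are assumptions made for illustration and do not describe the project's deep learning models.

```python
from collections import defaultdict
from datetime import date


def build_series(records):
    """Group (test_date, parameter, value) records into per-parameter lists
    sorted by date, which is the shape a line chart needs."""
    series = defaultdict(list)
    for test_date, parameter, value in records:
        series[parameter].append((test_date, value))
    return {name: sorted(points) for name, points in series.items()}


def trend_per_day(points):
    """Least-squares slope of value against days elapsed since the first test."""
    if len(points) < 2:
        return 0.0
    xs = [(d - points[0][0]).days for d, _ in points]
    ys = [v for _, v in points]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    denom = sum((x - x_mean) ** 2 for x in xs)
    if denom == 0:
        return 0.0  # all tests on the same day: no trend can be estimated
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / denom


if __name__ == "__main__":
    records = [  # made-up values, deliberately out of order
        (date(2024, 1, 21), "glucose", 100.0),
        (date(2024, 1, 1), "glucose", 90.0),
        (date(2024, 1, 11), "glucose", 95.0),
    ]
    series = build_series(records)
    print(series["glucose"])
    print(trend_per_day(series["glucose"]))
```

Each sorted list of `(date, value)` pairs can be passed directly to a plotting library as the x and y data of one line.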

3.2 Advantages of the proposed system:

• Improved Accuracy: Automating data extraction and analysis reduces the likelihood of human
error, leading to more reliable patient records.

• Efficiency: The system saves time by eliminating the need for manual data entry, allowing
healthcare providers to focus more on patient care.

• Early Detection: By identifying trends and anomalies, the system enables early detection of
potential health issues, allowing for timely intervention.


• Continuous Improvement: The system's ability to learn from new data ensures that its
predictions and analyses become increasingly accurate over time.

• Better Patient Engagement: Visual representations of health data make it easier for
patients to understand their health status and track changes, encouraging proactive health
management.

• Personalized Care: Tailored health insights based on individual data help provide more
personalized and effective healthcare.

3.3 Drawbacks of the proposed system:

• Privacy Concerns: Handling sensitive health data requires stringent measures to ensure
privacy and security, which can be challenging to implement.

• Initial Costs: Developing and implementing the AI-based system can involve significant
initial costs, which may be a barrier for some healthcare providers.

• Technical Challenges: Ensuring the accuracy of OCR and deep learning models requires
extensive training and fine-tuning, which can be complex and resource-intensive.

• Dependence on Technology: Relying heavily on technology may lead to issues if the
system experiences technical failures or malfunctions.

• Data Integration: Integrating the system with existing healthcare technologies and
platforms can be challenging and may require substantial effort and resources.

• Ethical Considerations: Ensuring that AI algorithms are free from biases and that decisions
made by the system are fair and transparent is crucial.


Chapter 4

SYSTEM ANALYSIS

4.1 What is system analysis?

Analysis is the process of breaking a complex topic or substance into smaller parts to gain a better
understanding of it. Analysts in the field of engineering look at requirements, structures,
mechanisms, and system dimensions. Analysis is an exploratory activity, and the Analysis Phase is
where the project lifecycle begins. In this phase, the deliverables in the high-level Project Charter
are broken down into more detailed business requirements. It is also the part of the project where
the overall direction of the project is identified through the creation of the project strategy
documents.

Gathering requirements is the main activity of the Analysis Phase. It usually involves more than
simply asking the users what they need and writing their answers down. Depending on the
complexity of the application, requirements gathering follows a clearly defined process of its own,
made up of repeatable steps that use specific techniques to capture, document, communicate, and
manage requirements.

4.2 Software Requirement Specification

A Software Requirements Specification (SRS), a requirements specification for a software system,
is a complete description of the behaviour of the system to be developed. In addition to a description
of the software functions, the SRS also contains non-functional requirements. Software requirements
engineering is a sub-field of software engineering that deals with the elicitation, analysis,
specification, and validation of requirements for software.

4.2.1 Software Requirements:

• Operating System: Windows or macOS
• Python: Install the latest version of Python (preferably Python 3.x) from the official Python
website (https://www.python.org/).
• Web Browser: Google Chrome
• Code Editor: VS Code
• Python Libraries: Install the necessary libraries and frameworks for OCR.

4.3 Hardware Requirements:

• CPU: A CPU with multiple cores is recommended for faster processing.
• Memory (RAM): At least 8GB of RAM is recommended.
• Storage: Ensure that you have enough disk space to store your dataset, pre-trained models, and
any intermediate or output files generated during the process.
• Display: A monitor with a suitable resolution is required for visualizing the results and analyzing
the output.
• Network Connection: A high-speed router or a stable network connection is necessary for training
the model.


Chapter 5
Dataset and Libraries
5.1 Dataset
The Indian Council of Medical Research (ICMR) has developed extensive datasets to support
dietary recommendations tailored for various patient groups. These datasets provide scientific
insights into nutrition, addressing specific medical conditions and health needs. By leveraging
these resources, healthcare providers can create personalized diet plans to promote better patient
outcomes.

Source: The datasets produced by the Indian Council of Medical Research (ICMR) can typically
be accessed through their official website or publications, such as the ICMR-NIN (National
Institute of Nutrition) reports. Key documents like the "Nutrient Requirements and
Recommended Dietary Allowances for Indians" are available online. Researchers and healthcare
professionals can explore these resources on the ICMR official website or through affiliated portals
like the National Institute of Nutrition's website.

Python Libraries used

Core Libraries:

1. Pandas (import pandas as pd): Used for data manipulation, handling datasets, and
performing data analysis.

2. NumPy (import numpy as np): Essential for numerical computations, array operations, and
handling mathematical operations efficiently.

3. Re (import re): Provides support for working with regular expressions, often used for text
processing tasks.

4. NLTK (import nltk): The Natural Language Toolkit offers various tools and resources for
natural language processing tasks, including tokenization, stemming, stopwords, etc.

5. Scikit-learn (from sklearn.*): Scikit-learn is a powerful machine learning library that
includes tools for data preprocessing, feature extraction, model building, and evaluation. It
encompasses various classification, regression, clustering, and model evaluation
algorithms.


6. Seaborn (import seaborn as sns)

7. Matplotlib (import matplotlib.pyplot as plt): Seaborn and Matplotlib are the visualization
libraries used to create plots, charts, and visual representations of the data and model
evaluation metrics.

8. Wordcloud (from wordcloud import WordCloud): A library specifically used for generating
word clouds from textual data, visually representing the frequency of words.

9. PIL (from PIL import Image): The Python Imaging Library, used for handling images and
supporting image operations.

Specific Modules:

• nltk.tokenize.word_tokenize: Tokenizes text into words.

• nltk.corpus.stopwords: Provides a list of common stopwords for text data.

• nltk.stem.porter.PorterStemmer: Implements stemming to reduce words to their root form.

• sklearn.feature_extraction.text.CountVectorizer and TfidfVectorizer: Used for converting text
data into numerical features (Bag-of-Words and TF-IDF representations).

• sklearn.model_selection.train_test_split: Splits data into training and testing sets.

• sklearn.linear_model.LogisticRegressionCV, sklearn.svm.SVC,
sklearn.ensemble.RandomForestClassifier: Machine learning models used for sentiment
analysis.

• sklearn.metrics.classification_report, sklearn.metrics.roc_auc_score, sklearn.metrics.roc_curve:
Functions for evaluating classification model performance.

• pickle: Used for serializing and deserializing Python objects, here specifically for saving trained
models.


Data Splitting: Training and Testing:

1. Import Libraries: The code imports the train_test_split function from sklearn.model_selection
to perform the data split.
2. Data Preparation:
• X represents the vectorized features derived from movie reviews.
• y contains the corresponding sentiment labels (positive, negative) for each review.

3. Splitting the Data:
• The train_test_split function divides the data into four subsets: X_train, X_test, y_train, and
y_test.
• test_size=0.30 specifies that 30% of the data will be allocated for testing, and 70% will be used
for training.
• random_state=1 sets a random seed, so that a shuffled split is the same each time the code is run
(providing reproducibility).
• shuffle=False means the data is not shuffled before splitting: the first 70% of the rows form the
training set and the last 30% form the test set. In this mode the split is already deterministic, so
random_state has no effect. Keeping the order can be useful for ordered data, but shuffling helps
prevent any inherent ordering bias in the dataset.
4. Result:
• X_train and y_train will contain 70% of the original data, used for training the sentiment analysis
model.
• X_test and y_test will contain 30% of the original data, used to evaluate the model's
performance.
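
The split described above can be sketched as follows. This is a minimal illustration on made-up stand-in data (the lists X and y below are assumptions, not the project's vectorized reviews); it only shows how test_size=0.30 and shuffle=False divide the rows.

```python
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 10 samples with 2 features each, binary labels.
X = [[i, i * 2] for i in range(10)]
y = [0, 1] * 5

# shuffle=False keeps the original row order, so the first 70% of the rows
# become the training set and the last 30% become the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, shuffle=False
)

print(len(X_train), len(X_test))
print(X_test[0])
```

With shuffle=True (the scikit-learn default), passing random_state would make the shuffled split repeatable.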


1. Importing Libraries:
• The code imports the MaxAbsScaler class from sklearn.preprocessing, which is used for
scaling features.

2. Data Scaling:
• MaxAbsScaler scales each feature by its maximum absolute value, bringing all features within the
range [-1, 1] without changing their directionality.

3. Fitting and Transforming:


• scaling = MaxAbsScaler().fit(X_train) initializes the scaler and fits it to the training data (X_train).
This step calculates the maximum absolute values of each feature in the training set.
• X_train = scaling.transform(X_train) applies the scaling transformation to the training data. It
scales the features in X_train based on the maximum absolute values calculated earlier.
• X_test = scaling.transform(X_test) applies the same transformation to the testing data (X_test). It
ensures that the scaling is consistent across both training and testing datasets, maintaining the same
scaling factors applied to X_train.
4. Purpose in Sentiment Analysis of Movie Reviews:
• In this project on movie reviews, scaling the features will be beneficial, especially when the
original features (word frequencies, TF-IDF values, etc.) have varying scales or ranges.
• Scaling ensures that the features’ magnitudes don't affect the learning process disproportionately,
especially in models sensitive to feature scales (e.g., SVMs, neural networks). It aids in improving.
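
The fit-then-transform pattern described above can be sketched as follows. The small matrices are illustrative assumptions rather than the project's features; the point is that the scaler learns its maxima from the training data only, so test values may fall outside [-1, 1].

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical feature matrices (for example word counts); values are illustrative.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [4.0, 40.0]])
X_test = np.array([[2.0, 80.0]])

# fit() learns the maximum absolute value of each column from the training data.
scaling = MaxAbsScaler().fit(X_train)

# transform() divides each column by the maximum learned from X_train.
X_train_scaled = scaling.transform(X_train)
X_test_scaled = scaling.transform(X_test)

print(scaling.max_abs_)
print(X_train_scaled)
print(X_test_scaled)
```

Fitting on the training set alone keeps information from the test set out of the preprocessing step.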


Chapter 6

IMPLEMENTATION

Implementation is the realization of an application, or execution of a plan, idea, model, design,
specification, standard, algorithm, or policy. In other words, an implementation is a realization of
a technical specification or algorithm as a program, software component, or other computer system
through programming and deployment. Many implementations may exist for a given specification
or standard.

Implementation is one of the most important phases of the Software Development Life Cycle
(SDLC). It encompasses all the processes involved in getting new software or hardware operating
properly in its environment, including installation, configuration, running, testing, and making
necessary changes. Specifically, it involves coding the system using a particular programming
language and turning the design into an actual working system. This phase is conducted with the
idea that whatever is designed should be implemented, keeping in mind that it fulfils the user
requirements, objectives, and scope of the system. The implementation phase produces the
solution to the user problem.

6.1 Overview of System Implementation


There are many ways of implementing this project. We have chosen Python because it is a widely
used high-level, general-purpose, interpreted, dynamic programming language. Its design
philosophy emphasizes code readability, and its syntax allows programmers to express concepts
in fewer lines of code than would be possible in languages such as C++ or Java. The language
provides constructs intended to enable clear programs on both a small and large scale.

Python supports multiple programming paradigms, including object-oriented, imperative,
functional, and procedural styles. It features a dynamic type system and automatic memory
management and has a large and comprehensive standard library. CPython, the reference
implementation of Python, is free and open-source software and has a community-based
development model, as do nearly all of its alternative implementations. Python is managed by
the non-profit Python Software Foundation.

Next, we partition our dataset into training, validation, and testing sets, ensuring a robust
evaluation of model performance. During the training phase, we employ diverse
algorithms, experimenting with various models known for their efficacy in sentiment
analysis. This includes algorithms like Support Vector Machines, Logistic Regression,
Ensemble methods such as Random Forests or Bagging, and Naive Bayes classifiers.


6.2 Algorithms

Logistic Regression: Logistic regression, a statistical technique, excels in predicting binary
outcomes, making it fitting for sentiment analysis tasks. By analysing relationships between
independent variables (such as review features) and a binary sentiment outcome, it facilitates
predictions, enabling straightforward categorization of reviews into positive or negative
sentiments.

Logistic regression can also play a role in data preparation activities by allowing data sets to be put
into specifically predefined buckets during the extract, transform, load (ETL) process in order to
stage the information for analysis.

Logistic regression is important because it transforms complex calculations around probability into
a straightforward arithmetic problem. This dramatically simplifies analyzing the impact of multiple
variables and helps to minimize the effect of confounding factors. As a result, statisticians can
quickly model and explore the contribution of various factors to a given outcome.

Support vector machines (SVM): Support vector machines are a set of supervised learning
methods used for classification, regression, and outlier detection. All of these are common tasks
in machine learning.
A simple linear SVM classifier works by drawing a straight line between two classes. All of the
data points on one side of the line represent one category, and the data points on the other side
of the line are put into a different category.
There can be an infinite number of such lines to choose from. What distinguishes the linear SVM
algorithm from some other algorithms, like k-nearest neighbours, is that it chooses the best line
to classify the data points: the line that separates the data and is as far away as possible from
the closest data points.

Bagging Classifier: Bootstrap Aggregation (bagging) is an ensemble method that attempts to
resolve overfitting for classification or regression problems. Bagging aims to improve the accuracy
and performance of machine learning algorithms.

It does this by taking random subsets of an original dataset, with replacement, and fits either a
classifier (for classification) or regressor (for regression) to each subset. The predictions for each
subset are then aggregated through majority vote for classification or averaging for regression,
increasing prediction accuracy.
The main advantage of bagging is that it can reduce the variance of the predictions made by a
supervised learning algorithm without significantly compromising its accuracy.


Random Forest Classifier: Random Forest is a popular supervised machine learning algorithm.
It can be used for both classification and regression problems in ML.
As the name suggests, Random Forest is a classifier that contains a number of decision trees
trained on various subsets of the given dataset and combines their outputs to improve predictive
accuracy. Instead of relying on one decision tree, the random forest takes the prediction from
each tree and predicts the final output based on the majority vote of those predictions.
A greater number of trees generally makes the predictions more stable and reduces the risk of
overfitting, although the gains level off as more trees are added.

Naive Bayes: Naive Bayes, which is computationally very efficient and easy to implement, is a
learning algorithm frequently used in text classification problems.

Naive Bayes treats language as a bag of words, disregarding word order and assigning fixed class
labels to new documents based on their word features. Particularly suited for high-dimensional
problems in text classification due to its efficiency, Naive Bayes utilizes two event models:
Multivariate Bernoulli and Multinomial, with the latter commonly referred to as Multinomial
Naive Bayes in Natural Language Processing (NLP).

This algorithm estimates tag likelihood for text samples using Bayes theorem, considering each
feature's independence from others—a feature's presence or absence does not influence others.

While simple and efficient, Naive Bayes’ prediction accuracy might be lower compared to other
algorithms and it's not suitable for regression, limited to classifying textual data without estimating
numerical values.

Stacked Classifier: A stacked classifier, also known as a stacked ensemble, is a powerful machine
learning technique that combines multiple individual classifiers or models to improve predictive
performance. It operates by training a new model, often referred to as a meta-learner, on the
predictions or outputs of base classifiers.

In a stacked classifier, the base models can be diverse, ranging from different algorithms like
decision trees, support vector machines, or neural networks, each trained on the same dataset but
potentially capturing different aspects of the data.

The key idea behind stacking is to leverage the diverse perspectives of multiple models, allowing
the meta-learner to learn from their collective outputs and potentially create a more accurate and
robust final prediction.


6.3 Model Training and Implementation


Logistic Regression
A Logistic Regression model is trained for this classification. To tune the C parameter,
LogisticRegressionCV is used, which performs k-fold cross-validation over a grid of candidate
values to find the optimal parameter based on accuracy.

• lr_model = LogisticRegressionCV : Instantiating the Logistic Regression model with
cross-validation.
• cv=5 : 5-fold cross-validation.
• scoring='accuracy' : Metric for evaluation.
• max_iter=300 : Maximum number of iterations.
• n_jobs=-1 : Utilize all available CPU cores.
• verbose=3 : Verbosity level for logging.
• random_state=0 : Setting a random seed for reproducibility.
• lr_model.fit(X_train, y_train) : Fitting the model with training data.
• pred = lr_model.predict(X_test) : Predicting sentiments on the test dataset.
• print(classification_report(y_test, pred)) : Generating a classification report for performance
evaluation. The true labels are passed first and the predictions second, so that precision and
recall are not swapped.
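
The steps listed above can be sketched end to end as follows. The data here is synthetic (generated with make_classification as a stand-in, which is an assumption of this sketch), and n_jobs and verbose are omitted to keep the example quiet and portable.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the report's actual features are not reproduced here.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1
)

# 5-fold cross-validation over a default grid of C values, scored by accuracy.
lr_model = LogisticRegressionCV(
    cv=5, scoring="accuracy", max_iter=300, random_state=0
)
lr_model.fit(X_train, y_train)
pred = lr_model.predict(X_test)

# True labels first, predictions second.
print(classification_report(y_test, pred))
print("Selected C:", lr_model.C_)
```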


Support Vector Machine (SVM)

A Linear Support Vector Machine (SVM) model (LinearSVC) is used for sentiment analysis of
movie reviews. The code initializes and trains the model and evaluates its performance metrics.
Note that LinearSVC outputs decision scores rather than probability estimates.

• from sklearn.svm import LinearSVC : Importing the Linear Support Vector Machine class.
• l2_norm = 25 : Setting the L2 norm regularization value.
• l2_norm_inverse = 1 / l2_norm : Calculating the inverse of the L2 norm for the model
parameter C.
• maximum_iterations = 4000 : Defining the maximum number of iterations for the model.
• model_svm = LinearSVC(C=l2_norm_inverse, max_iter=maximum_iterations) : Creating a
Linear SVM model with the specified parameters.
• model_svm.fit(X_train, y_train) : Training the model using the training data.
• y_pred_svm = model_svm.predict(X_test) : Generating predictions on the test data using the
SVM model.
• print(classification_report(y_test, y_pred_svm)) : Printing a classification report
comparing the true test labels against the predicted labels.

The code initializes a Linear Support Vector Machine (LinearSVC) model with the specified
regularization (L2 norm) and maximum iterations and fits it to the training data. It then
generates predictions on the test data and prints a classification report comparing the
predictions against the true test labels.
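
A minimal runnable sketch of the listed steps, assuming synthetic stand-in data in place of the project's feature matrix:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in data (an assumption of this sketch).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1
)

l2_norm = 25
l2_norm_inverse = 1 / l2_norm  # C = 0.04; a smaller C means stronger regularization
maximum_iterations = 4000

model_svm = LinearSVC(C=l2_norm_inverse, max_iter=maximum_iterations)
model_svm.fit(X_train, y_train)
y_pred_svm = model_svm.predict(X_test)

print(classification_report(y_test, y_pred_svm))
```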


Bagging Classifier
This step uses a Bagging Classifier with Linear Support Vector Machines (LinearSVC) as base
estimators. The classifier is trained on the provided data and its accuracy is then evaluated on
both the training and test sets. The processing time is also recorded, giving insight into
computational efficiency.

• Imported necessary modules: time for time tracking, and BaggingClassifier and LinearSVC from
sklearn.ensemble and sklearn.svm respectively.
• Defined regularization values (l2_norm, l2_norm_inverse) and maximum iterations for the
Linear Support Vector Machine (maximum_iterations).
• Created a Bagging Classifier with LinearSVC as the base estimator, utilizing 30 estimators and a
specified random state.
• Trained the Bagging Classifier using the provided training data (X_train,
y_train).
• Generated predictions on the test data (X_test) using the trained model. Calculated accuracy
scores on both the training and test sets.
• Recorded the start time before generating predictions. Recorded the end time after predictions to
measure the duration.
• Printed the accuracy scores for the training and test sets. Displayed the time taken for the entire
process in milliseconds.
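
The procedure above might look roughly like the following sketch. The data is synthetic, the variable names are illustrative, and the random state value 0 is an assumption since the report does not state it.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in data (an assumption of this sketch).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1
)

l2_norm = 25
base_svm = LinearSVC(C=1 / l2_norm, max_iter=4000)

# The base estimator is passed positionally because its keyword name
# differs between scikit-learn versions.
model_bc = BaggingClassifier(base_svm, n_estimators=30, random_state=0)
model_bc.fit(X_train, y_train)

# Time the prediction and scoring step.
start = time.time()
train_accuracy = accuracy_score(y_train, model_bc.predict(X_train))
test_accuracy = accuracy_score(y_test, model_bc.predict(X_test))
elapsed_ms = (time.time() - start) * 1000

print(f"Train accuracy: {train_accuracy:.3f}")
print(f"Test accuracy: {test_accuracy:.3f}")
print(f"Time taken: {elapsed_ms:.1f} ms")
```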


Random Forest Classifier


A Random Forest Classifier model is trained for this classification with 200 decision trees,
leveraging parallel processing (n_jobs = -1) and displaying training progress (verbose = 1). It then
generates predictions on test data and prints a classification report, comparing these predictions
against the true test labels.

• from sklearn.ensemble import RandomForestClassifier : Importing the Random Forest
Classifier class.
• rf_model = RandomForestClassifier(n_estimators=200, n_jobs=-1, verbose=1) : Initiating a
Random Forest Classifier with 200 decision trees, utilizing parallel processing for improved
efficiency, and displaying the training progress during the process.
• rf_model.fit(X_train, y_train) : Training the Random Forest Classifier using the provided training
data.
• pred = rf_model.predict(X_test) : Generating predictions on the test data using the trained model.
• print(classification_report(y_test, pred)) : Printing a detailed classification report that compares
the true labels in the test dataset against the predicted labels.
The implementation of the Random Forest Classifier with 200 decision trees, parallel processing,
and training progress display signifies a robust approach to modeling complex relationships within
the data.
This method not only enhances predictive accuracy but also provides insights into feature
importance due to the ensemble nature of Random Forests. The resulting classification report
allows for a comprehensive evaluation of the model's performance, aiding in the assessment of its
efficacy in sentiment analysis for movie reviews.


Bernoulli Naive Bayes model


This initiates a Bernoulli Naive Bayes model for classification and measures the time taken for
training and prediction.

• Records the start time using time.time() to measure the execution time.
• Imports the Bernoulli Naive Bayes module from sklearn.naive_bayes.
• Creates a Bernoulli Naive Bayes model (nb_model).
• Trains the Naive Bayes model using the provided training data (X_train, y_train) via the fit()
function.
• Records the end time after model training to calculate the duration of the process.
• Generates predictions (y_pred_nb) on the test data (X_test) using the trained model.
• Prints a classification report, comparing the predicted labels (y_pred_nb) against the true test labels
(y_test), using the classification_report() function from the relevant library.

Stacked Classifier
A stacked classifier is used to combine the predictions of multiple base classifiers, which, in this
case, are a Support Vector Machine (SVM), a Bagging Classifier, and a Random Forest Classifier.
Stacking involves training a meta-learner on the predictions made by these base classifiers, aiming
to improve overall predictive performance. The combined model leverages the diverse strengths of
each base classifier to enhance its accuracy and robustness.
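
One way to sketch this stacking setup is with scikit-learn's StackingClassifier. The report does not name the meta-learner, so the logistic regression used here is an assumption, as are the synthetic data and the reduced estimator counts chosen to keep the example fast.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in data (an assumption of this sketch).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1
)

# The three base classifiers named in the text.
base_estimators = [
    ("svm", LinearSVC(C=1 / 25, max_iter=4000)),
    ("bagging", BaggingClassifier(LinearSVC(max_iter=4000),
                                  n_estimators=10, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Assumed meta-learner: logistic regression trained on the base models'
# cross-validated outputs.
stacked_model = StackingClassifier(
    estimators=base_estimators,
    final_estimator=LogisticRegression(),
    cv=5,
)
stacked_model.fit(X_train, y_train)

print("Test accuracy:", stacked_model.score(X_test, y_test))
```

Training the meta-learner on cross-validated outputs, rather than on predictions for data the base models have already seen, limits the leakage that would otherwise inflate the stacked model's apparent accuracy.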


6.4 Smart Health Monitoring and Diet Planner - Detailed Process

Step 1: Warm Greeting

The initial step is to warmly greet the user according to their local time to foster a personalized
connection. This involves:

• Incorporating their Name: If the user’s name is known, include their name in the greeting for a more
personal touch. For example, "Good morning, [User Name]!"

• General Greeting: If the user’s name is not known, provide a friendly, general greeting based on the
time of day. For example, "Good afternoon!"

This ensures that every interaction starts on a warm and welcoming note, setting a positive tone for
the rest of the experience.
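
A minimal sketch of the greeting logic follows. The function name and the hour boundaries are illustrative assumptions; the report does not specify them.

```python
from datetime import datetime
from typing import Optional


def greeting(now: datetime, user_name: Optional[str] = None) -> str:
    """Return a time-of-day greeting, personalized when a name is known."""
    # Hour boundaries are assumptions made for this sketch.
    if 5 <= now.hour < 12:
        part = "Good morning"
    elif 12 <= now.hour < 17:
        part = "Good afternoon"
    else:
        part = "Good evening"
    return f"{part}, {user_name}!" if user_name else f"{part}!"


# datetime.now() uses the clock of the machine running the code; a deployed
# tool would need the user's own time zone instead.
print(greeting(datetime.now()))
print(greeting(datetime(2025, 1, 1, 9, 0), "Asha"))
```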

Step 2: Language Selection

To ensure user-friendliness and inclusivity, the next step involves presenting the user with a choice
of 13 languages. This can be done through clickable buttons for ease of use. The process includes:

• Language Options: Displaying the languages as clickable buttons for the user’s convenience.

• User Interaction: Users can simply click or type their preferred language to proceed.

This allows all subsequent interactions to take place in their chosen language, enhancing
understanding and comfort.

Step 3: Explaining the Purpose

After the user has selected their preferred language, the purpose of the tool is briefly explained. This
step involves:

• Purpose Clarification: Informing users that the tool analyzes their blood reports, identifies health-
related abnormalities, and provides a diet plan tailored to their specific needs.

• Setting Expectations: Clearly outlining the tool's functionality to set accurate expectations for the
user.

By doing this, users are made aware of what to expect from the tool, ensuring a smoother and more
informed experience.


Step 4: Blood Report Upload

• Users are guided to securely upload their blood reports via action buttons.
• The tool supports various file formats, including PDF, JPG, JPEG, and PNG.
• Uploaded reports are encrypted to ensure privacy.
• The reports are used solely for analysis purposes.
• This responsible handling of user data ensures its security and confidentiality.

Step 5: Blood Report Analysis


• The tool utilizes advanced AIML classifiers to process the uploaded blood report.
• Various metrics such as cholesterol levels, blood sugar, iron levels, and vitamins are analyzed.
• Results are displayed in a clear, Excel-style table with grid lines.
• The table showcases the user’s values compared to normal ranges.
• Any abnormalities in the results are flagged for the user.
• This visual format aids in easy interpretation of the results.
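
The comparison against normal ranges could be sketched as below. The reference ranges and sample values are hypothetical placeholders for illustration only; real ranges depend on the laboratory, units, age, and sex, and must come from a clinical source.

```python
# Hypothetical reference ranges (low, high), for illustration only.
NORMAL_RANGES = {
    "Hemoglobin (g/dL)": (12.0, 17.5),
    "Fasting Glucose (mg/dL)": (70.0, 100.0),
    "Total Cholesterol (mg/dL)": (0.0, 200.0),
}


def flag_results(values):
    """Return one (metric, value, range, status) row per reported metric."""
    rows = []
    for metric, value in values.items():
        low, high = NORMAL_RANGES[metric]
        if value < low:
            status = "LOW"
        elif value > high:
            status = "HIGH"
        else:
            status = "NORMAL"
        rows.append((metric, value, f"{low}-{high}", status))
    return rows


# Made-up sample report values.
report = {
    "Hemoglobin (g/dL)": 10.5,
    "Fasting Glucose (mg/dL)": 92.0,
    "Total Cholesterol (mg/dL)": 240.0,
}

for row in flag_results(report):
    print("{:<28}{:>8}{:>14}{:>8}".format(*row))
```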

Step 6: Visual Representation

• To enhance user understanding, the tool offers the option to view a visual graph.
• If users agree, a bar chart is generated.
• The bar chart provides a clear comparison between user values and normal ranges.
• The chart is color-coded for simplicity:
o Red for abnormal values
o Green for normal values
o Blue for the normal range
• This makes it easier for users to interpret and understand their health data.
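
A sketch of the color-coded chart using Matplotlib follows. The metric values and ranges are made-up numbers, and drawing the normal range as a blue line segment beside each bar is one possible reading of the color scheme described above.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# (metric, user value, normal low, normal high); hypothetical numbers.
results = [
    ("Hemoglobin", 10.5, 12.0, 17.5),
    ("Glucose", 92.0, 70.0, 100.0),
    ("Cholesterol", 240.0, 0.0, 200.0),
]


def bar_color(value, low, high):
    """Green for values inside the normal range, red otherwise."""
    return "green" if low <= value <= high else "red"


labels = [name for name, _, _, _ in results]
values = [value for _, value, _, _ in results]
colors = [bar_color(value, low, high) for _, value, low, high in results]
positions = list(range(len(results)))

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(positions, values, color=colors, width=0.5)

# Blue segments mark the normal range next to each bar.
for x, (_, _, low, high) in zip(positions, results):
    ax.vlines(x + 0.35, low, high, color="blue", linewidth=6)

ax.set_xticks(positions)
ax.set_xticklabels(labels)
ax.set_ylabel("Value")

# Render to an in-memory PNG, as a web application might before sending it.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```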


Step 7: Observations and Explanations

• Simplified Language: The tool explains the observations in clear, easy-to-understand language,
avoiding technical jargon.
• Examples of Findings:
o Instead of "elevated LDL cholesterol," it says, "Your cholesterol is higher than it should be, which
might be bad for your heart."
o Instead of "low hemoglobin," it says, "Your hemoglobin, the part of your blood that carries oxygen,
is lower than normal, which might cause tiredness."
• Health Tips: The tool provides general health tips based on the findings, such as "Try to include
more fiber in your diet" or "Consider reducing your sugar intake."

Step 8: Dietary Preferences

• User Selection: The user is prompted to specify their dietary preference.


• Options Available:
o Vegetarian
o Non-Vegetarian
o Vegan
• Customization: The diet plan is tailored to match the user’s dietary choice, ensuring it aligns with
their lifestyle and personal preferences.
• Cultural Considerations: The tool also considers cultural dietary practices to provide relevant meal
options.

Step 9: Duration Selection

• Prompting Users: Users are asked to select the duration of the diet plan they wish to follow.
• Available Options:
o 1 Month
o 3 Months
o 6 Months
• Goal Setting: Each duration option is designed to align with the user’s health goals and commitment
level.
• Flexible Plans: The tool allows for adjustments and updates to the plan as the user progresses,
ensuring it remains effective and relevant.


Step 10: Customized Diet Plan


• Based on the analysis and user preferences, the tool generates a customized diet plan.
• The diet plan includes a variety of meal options and snacks.
• Nutritional information and portion sizes are provided for each meal.
• The plan is designed to meet the user’s dietary needs and preferences while addressing any health
concerns identified in the blood report.

Step 11: Exercise Recommendations


• To complement the diet plan, the tool offers personalized exercise recommendations.
• Recommendations are based on the user’s health status and goals.
• Exercises may include a mix of cardio, strength training, and flexibility exercises.
• Each exercise comes with instructions and tips to ensure proper form and safety.

Step 12: Progress Tracking


• The tool includes a feature for users to track their progress over time.
• Users can log their meals, exercise, and health metrics.
• Progress is visually represented through charts and graphs.
• Regular updates and reminders help users stay motivated and on track.

Step 13: Feedback and Support


• Users are encouraged to provide feedback on their experience with the tool.
• A support system is available for any questions or issues that may arise.
• Feedback is used to continuously improve the tool and enhance user satisfaction.

Step 14: Educational Resources


• The tool provides access to educational resources related to nutrition and health.
• Resources may include articles, videos, and tips on maintaining a healthy lifestyle.
• Users are encouraged to learn more about their health and make informed decisions.


6.5 Code Summary:


1. Logistic Regression
Purpose: A linear model for binary or multiclass classification.
Implementation:
from sklearn.linear_model import LogisticRegression
logreg = LogisticRegression()
logreg.fit(x_train, y_train)
y_pred_logreg = logreg.predict(x_test)
Evaluation:
Metrics: Accuracy, Precision, Recall, F1-Score.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
print(f"Accuracy: {accuracy_score(y_test, y_pred_logreg)}")
print(f"Precision: {precision_score(y_test, y_pred_logreg, average='weighted')}")
print(f"Recall: {recall_score(y_test, y_pred_logreg, average='weighted')}")
print(f"F1 Score: {f1_score(y_test, y_pred_logreg, average='weighted')}")

2. Quadratic Discriminant Analysis (QDA)


Purpose: A probabilistic model suitable for classifying data by fitting class-specific distributions.
Implementation:
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
qda_model = QuadraticDiscriminantAnalysis()
qda_model.fit(x_train, y_train)
y_pred_qda = qda_model.predict(x_test)

3. K-Nearest Neighbors (KNN)


Purpose: A non-parametric model for classification based on the proximity of neighbors.
Implementation:
from sklearn.neighbors import KNeighborsClassifier
KNN_model = KNeighborsClassifier()
KNN_model.fit(x_train, y_train)
y_pred_knn = KNN_model.predict(x_test)


4. Support Vector Machine (SVM)


Purpose: A powerful model for classification that finds the hyperplane separating different classes.
Implementation:
from sklearn.svm import SVC
svm_model = SVC()
svm_model.fit(x_train, y_train)
y_pred_svm = svm_model.predict(x_test)

5. Random Forest Classifier


Purpose: An ensemble model based on multiple decision trees for robust classification.
Implementation:
from sklearn.ensemble import RandomForestClassifier
rf_model = RandomForestClassifier()
rf_model.fit(x_train, y_train)
y_pred_rf = rf_model.predict(x_test)

6. Gaussian Naive Bayes


Purpose: A simple probabilistic classifier based on Bayes’ theorem.
Implementation:
from sklearn.naive_bayes import GaussianNB
gnB_model = GaussianNB()
gnB_model.fit(x_train, y_train)
y_pred_gnb = gnB_model.predict(x_test)

Evaluation Metrics Used:


For all models, the following metrics were used to evaluate performance:
• Accuracy: Percentage of correct predictions.
• Precision: The ratio of true positives to predicted positives.
• Recall: The ratio of true positives to actual positives.
• F1-Score: Harmonic mean of precision and recall.
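
To make the four definitions concrete, the following sketch computes them by hand for a binary case. The label lists are made up and the function name is illustrative; the project code itself uses the sklearn.metrics functions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Here 6 of 8 predictions are correct, and both precision and recall equal 3/4, so all four metrics come out to 0.75.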


Chapter 7
Architecture
7.1 System Architecture Diagram:

Components:
1. User Interface (Web/App):
o Users interact with the system through a web or mobile application.
o Provides options to upload blood reports and view results.
2. Input (Blood Reports - PDF/Images):
o Accepts blood reports in various formats (PDF, JPG, PNG).
o Initiates the data extraction process.
3. OCR Engine:
o Optical Character Recognition (OCR) extracts text data (health metrics) from the uploaded
reports.
o Converts the content into structured, machine-readable text.
4. Preprocessing:
o Cleans and formats extracted data.
o Ensures compatibility for the deep learning model by handling inconsistencies or missing
values.

5. Deep Learning Model:


o Processes the cleaned data to identify patterns and trends.
o Generates insights by leveraging historical data and predictive analytics.
6. Pattern Recognition & Anomaly Detection:
o Identifies deviations from normal health metrics.
o Flags potential health risks for further medical investigation.
7. Database:
o Stores user data, processed metrics, and historical trends.
o Facilitates continuous learning for the AI model.
8. Visualization Module:
o Generates line charts and bar graphs for clear, visual representation of health trends.
o Helps users and healthcare providers interpret results.
9. Insights & Recommendations:
o Provides tailored health recommendations and dietary plans based on the analyzed data.
o Supports proactive health management.
10. Report Delivery (Email/Notifications):
o Delivers results and recommendations to users via email or app notifications.
o Ensures timely communication of important insights.
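The hand-off from the OCR Engine to Preprocessing (components 3 and 4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function name `parse_ocr_text`, the line pattern, and the sample text are all hypothetical, and real reports vary widely in layout.

```python
import re

# Assumed layout: "<metric name> [:|-] <number> <unit>", e.g. "Hemoglobin: 13.5 g/dL".
LINE_PATTERN = re.compile(
    r"^\s*([A-Za-z][A-Za-z ]*?)\s*[:\-]?\s*(\d+(?:\.\d+)?)\s*([A-Za-z/%]*)\s*$"
)


def parse_ocr_text(text):
    """Turn raw OCR text into a dict of {metric: {"value": float, "unit": str or None}}."""
    records = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if not match:
            continue  # skip headers, blank lines, and rows with missing values
        name, value, unit = match.groups()
        records[name.strip().lower()] = {"value": float(value), "unit": unit or None}
    return records


# Made-up OCR output used only to exercise the parser.
sample_text = """Sample Lab Report
Hemoglobin: 13.5 g/dL
WBC Count 7200 /uL
Glucose -- mg/dL
"""
parsed = parse_ocr_text(sample_text)
```

Rows that do not match the pattern are skipped rather than guessed, so a missing value such as the glucose row is simply absent from the result and can be handled explicitly downstream.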

Code for the system Architecture:


import matplotlib.pyplot as plt

# Create the figure and axis
fig, ax = plt.subplots(figsize=(12, 8))

# Define positions of components as (x, y) coordinates
components = {
    "User Interface (Web/App)": (1, 8),
    "Input (Blood Reports - PDF/Images)": (3, 6),
    "OCR Engine": (5, 6),
    "Preprocessing": (5, 5),
    "Deep Learning Model": (7, 5),
    "Pattern Recognition & Anomaly Detection": (9, 5),
    "Database": (7, 8),
    "Visualization Module": (9, 6),
    "Insights & Recommendations": (9, 4),
    "Report Delivery (Email/Notifications)": (11, 4),
}

# Draw each component as a labelled, rounded box
for component, pos in components.items():
    ax.text(pos[0], pos[1], component, ha="center", va="center",
            bbox=dict(boxstyle="round", facecolor="lightblue", edgecolor="black",
                      pad=0.5))

# Define arrows as (source, target) pairs
arrows = [
    ("User Interface (Web/App)", "Input (Blood Reports - PDF/Images)"),
    ("Input (Blood Reports - PDF/Images)", "OCR Engine"),
    ("OCR Engine", "Preprocessing"),
    ("Preprocessing", "Deep Learning Model"),
    ("Deep Learning Model", "Pattern Recognition & Anomaly Detection"),
    ("Deep Learning Model", "Visualization Module"),
    ("Deep Learning Model", "Insights & Recommendations"),
    ("Database", "Deep Learning Model"),
    ("Pattern Recognition & Anomaly Detection", "Insights & Recommendations"),
    ("Insights & Recommendations", "Report Delivery (Email/Notifications)"),
]

# Assumed completion: the listing stops after the arrows list, so the
# drawing loop and axis setup below are a minimal sketch.
for source, target in arrows:
    ax.annotate("", xy=components[target], xytext=components[source],
                arrowprops=dict(arrowstyle="->", color="black"))

# Fit the axes to the component coordinates and hide the frame
ax.set_xlim(0, 12.5)
ax.set_ylim(3, 9)
ax.axis("off")
plt.show()

7.2 Workflow diagram:

1. Login/Sign Up:
o User Information: The system begins with a user login or sign-up, gathering essential user
information (perhaps for personalization or record-keeping).

2. Language Preference Selection:


o Users select their preferred language, making the system more user-friendly and accessible. This
choice impacts how the information is presented later.

3. Report Dataset:
o Users upload health reports, typically in formats like PDF, JPG, PNG.
o The system preprocesses these files, converting them into a readable format.

4. Data Extraction:
o Extracting Data: Optical Character Recognition (OCR) or similar technologies extract data
from these reports and save it in a structured format like CSV.

5. Categorization:
o Categorize Values: The extracted data is categorized — determining which health metrics
(vitals) fall within normal ranges and which are abnormal. This likely involves comparing values
against predefined thresholds.

6. Report Segmentation:
o Normal vs. Abnormal: The report is split into two parts:
▪ Normal Part of Report: Vitals that are within normal ranges.
▪ Abnormal Part of Report: Vitals that deviate from normal ranges, requiring further attention.

7. Visualization:
o Graphs and charts are generated to visualize the trends of both normal and abnormal vitals,
making it easier for users and healthcare providers to interpret the data.

8. Diet Plan Recommendation:


o Abnormal Vitals: If any vitals are flagged as abnormal, the system uses a pre-trained AI model
(possibly GPT or similar) to suggest an appropriate diet plan tailored to the individual's needs.
o Decision Tree:
▪ If the suggested diet plan is suitable, it’s accepted and presented to the user.
▪ If not, the user or the system can modify it to better fit the user's preferences or requirements.

9. Conclusion:
o For normal vitals, a simple summary is provided.
o For abnormal vitals, a more detailed conclusion, possibly with follow-up actions or monitoring
advice, is given.
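Steps 4 to 6 of the workflow (structured extraction, categorization against thresholds, and segmentation into normal and abnormal parts) can be sketched as below. The function names `segment_report` and `to_csv` are hypothetical, and the values in `REFERENCE_RANGES` are placeholders for illustration only, not clinical reference ranges or the thresholds used in the project.

```python
import csv
import io

# Placeholder ranges for illustration only; not clinical reference values.
REFERENCE_RANGES = {
    "hemoglobin": (12.0, 17.0),
    "glucose": (70.0, 100.0),
}


def segment_report(metrics, ranges=REFERENCE_RANGES):
    """Split metrics into normal and abnormal parts by comparing against ranges."""
    normal, abnormal = {}, {}
    for name, value in metrics.items():
        if name not in ranges:
            continue  # no threshold known, so the value cannot be categorized
        low, high = ranges[name]
        target = normal if low <= value <= high else abnormal
        target[name] = value
    return normal, abnormal


def to_csv(normal, abnormal):
    """Serialize both parts of the report into one CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["metric", "value", "status"])
    for status, part in (("normal", normal), ("abnormal", abnormal)):
        for name, value in part.items():
            writer.writerow([name, value, status])
    return buffer.getvalue()


# Made-up values used only to exercise the functions.
normal, abnormal = segment_report({"hemoglobin": 13.5, "glucose": 126.0})
report_csv = to_csv(normal, abnormal)
```

Keeping the thresholds in a single dictionary means the categorization rule can be reviewed and updated without touching the segmentation logic.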

Chapter 8

Result, Screenshots and Conclusion


8.1 Results:
This section presents the outputs produced at different stages of the project.
Some outputs appear immediately after a cell is executed, while others depend on
a stable, fast network connection and take longer to appear.

Logistic Regression model

The image displayed above shows the result of executing the cell which contains the
particular code involved with training the Logistic Regression model.

Linear Support Vector Classifier model

The image displayed above shows the result of executing the cell which contains the
particular code involved with training the Linear Support Vector Classifier model.

Random Forest Classifier model

The image displayed above shows the result of executing the cell which contains the
particular code involved with training the Random Forest Classifier model.

Bagging Classifier model

The image displayed above shows the result of executing the cell which contains the
particular code involved with training the Bagging Classifier model.

Naive Bayes model

The image displayed above shows the result of executing the cell which contains the
particular code involved with training the Naive Bayes model.

Stacked Classifier

The image displayed above shows the result of executing the cell which contains the particular code
involved with training the stacked classifier. A stacked classifier combines the predictions of
multiple base classifiers, which in this case are a Support Vector Machine (SVM), a Bagging
Classifier, and a Random Forest Classifier.
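A stacked classifier with these three base models can be sketched with scikit-learn's `StackingClassifier`. This is a minimal sketch under stated assumptions rather than the project's actual cell: the data is synthetic (generated with `make_classification`), the meta-learner is assumed to be logistic regression because the report does not name one, and the hyperparameters are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data; the project's real features come from blood reports.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Base classifiers named in the report.
base_estimators = [
    ("svm", SVC()),
    ("bagging", BaggingClassifier(n_estimators=10, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

stacked_model = StackingClassifier(
    estimators=base_estimators,
    final_estimator=LogisticRegression(),  # assumed meta-learner, not stated in the report
    cv=5,  # base-model outputs for the meta-learner come from 5-fold cross-validation
)
stacked_model.fit(x_train, y_train)
y_pred_stacked = stacked_model.predict(x_test)
stacked_accuracy = stacked_model.score(x_test, y_test)
```

The meta-learner is trained on cross-validated outputs of the base models rather than on their training-set predictions, which reduces the risk of it simply learning to trust an overfitted base model.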

Screenshots

8.2 Plot of ROC-AUC curve


The classification reports show that all the models perform well and close to one
another. The ROC curve for each model is plotted below. Each curve plots the true
positive rate (TPR) against the false positive rate (FPR), and the area under it is
the AUC. The higher the AUC, the better the model performs.

Logistic Regression and the Support Vector Machine have very similar areas under
the curve, while Random Forest and Naive Bayes underperform in comparison. The
Support Vector Machine is chosen as the final model because it slightly outperforms
the Logistic Regression model.
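The AUC values compared here are areas under the TPR-versus-FPR curve. The sketch below shows how such a curve and its area can be computed from model scores; the function names `roc_curve_points` and `auc` and the sample scores are hypothetical and do not reproduce the project's results. It assumes binary labels with 1 as the positive class and scores where higher means more likely positive.

```python
def roc_curve_points(y_true, scores):
    """Return (fpr, tpr) lists with one point per distinct score threshold."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("ROC needs at least one positive and one negative sample")

    # Lower the threshold step by step by visiting samples from highest score down.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Illustrative labels and scores only, not project results.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_curve_points(y_true, scores)
area = auc(fpr, tpr)
```

For a model without probability outputs, such as a default `SVC`, the value returned by `decision_function` can serve as the score, since only the ranking of the samples matters for the curve.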

8.3 Evaluation Metric Graphs:

Chapter 9

Conclusion, Future Scope and References

9.1 Conclusion

The AI-Based Health Monitoring Assistant represents a transformative advancement in healthcare,
leveraging the power of Artificial Intelligence and Deep Learning to enhance patient data
management and proactive health monitoring. By automating the extraction and analysis of blood
test data through Optical Character Recognition (OCR), this system improves accuracy and
efficiency, reducing the potential for human error.

The use of deep learning models to identify patterns and predict health trends enables early
detection of potential health issues, fostering a more proactive approach to healthcare. The visual
representation of data through line charts provides clear and accessible insights for both patients
and healthcare providers, facilitating better understanding and management of health.

Moreover, continuous learning from new data ensures that the system's predictive analytics
become increasingly refined over time, supporting more informed and timely medical decisions.
This AI-driven solution not only empowers patients to take charge of their health but also aids
healthcare professionals in delivering higher quality care.

Overall, the AI-Based Health Monitoring Assistant stands at the forefront of preventive healthcare,
offering a data-driven, automated, and personalized approach to health management. By bridging
the gap between technology and healthcare, it has the potential to significantly improve patient
outcomes and revolutionize the way we approach health monitoring and management in the digital
age.

The integration of such advanced AI technology into healthcare not only represents a significant
leap in patient care but also paves the way for future innovations in health monitoring. As the
system continues to evolve and incorporate more sophisticated algorithms, it has the potential to
further enhance predictive analytics, making it possible to anticipate health issues with even
greater accuracy.

9.2 Future Scope

The proposed AI-based system for blood test report analysis holds immense potential for future
development and application. The following are some of the key areas for future scope:

Integration with Electronic Health Records (EHR):


The system can be integrated into existing EHR platforms to provide seamless data
synchronization, enabling healthcare providers to access comprehensive patient histories and make
more informed decisions.

Expansion of Health Metrics:


The AI model can be extended to include additional health parameters, such as liver function tests,
kidney profiles, or hormone levels, broadening its utility across various medical specialties.

Real-Time Data Processing:


Incorporating real-time data capture and analysis capabilities from wearable health devices or
remote monitoring systems could provide continuous health tracking and early alerts for critical
conditions.

Personalized Health Recommendations:


By integrating with advanced recommendation systems, the AI tool can provide tailored lifestyle,
diet, or medication suggestions based on individual health trends and risk factors.

Cross-Disciplinary Applications:
The system could be adapted for use in other areas of medical diagnostics, such as analyzing
imaging data (e.g., X-rays, MRIs), pathology slides, or genomic data, further
enhancing its versatility.

9.3 References

1. Dey, Aritra, and Pal, Biswamoy. (2023). Advancements in IoT-Based Health Monitoring
Systems - Discusses the role of IoT and AI in health monitoring, focusing on innovations,
challenges, and integration into healthcare systems.

2. Hassan, Abdikarim Abi, Tutuncu, Kemal, et al. (2023). Remote Monitoring and Management
of Elderly Patients Using IoT-Based Healthcare Systems - Explores IoT applications for
elderly care, emphasizing real-time monitoring and wearable technologies.

3. Shetty, Sachin, Vedanth, M., and Manjunath, S. (2023). Enhancing Diagnosis and Treatment
Planning Using AI in Healthcare - Focuses on the integration of AI for diagnostics and
treatment optimization.

4. ICMR-NIN Report (Latest Edition). Nutrient Requirements and Recommended Dietary
Allowances for Indians - Provides scientific dietary guidelines relevant to health monitoring
systems.

5. Jiang, F., Jiang, Y., et al. (2020). Artificial Intelligence in Healthcare: Past, Present, and Future
- Reviews the evolution of AI applications in healthcare and their implications.

6. Topol, Eric J. (2019). High-Performance Medicine: The Convergence of Human and Artificial
Intelligence - Explores AI's transformative impact on personalized medicine and patient
monitoring.

7. Krittanawong, C., et al. (2021). Machine Learning and Deep Learning in Cardiovascular
Health - Discusses AI models for analyzing health metrics, such as heart rate and cholesterol
levels.

8. Lecun, Y., Bengio, Y., and Hinton, G. (2015). Deep Learning - Explains foundational concepts
of deep learning used in health monitoring systems.

9. Witten, I. H., and Frank, E. (2017). Data Mining: Practical Machine Learning Tools and
Techniques - A foundational text for understanding data extraction and predictive analytics.

10. Obermeyer, Z., and Emanuel, E. J. (2016). Predicting the Future — Big Data, Machine
Learning, and Clinical Medicine - Analyzes the potential of predictive analytics in healthcare.
