Research Paper Summarizer Using AI
International Research Journal on Advanced Engineering and Management e ISSN: 2584-2854
Volume: 02 Issue: 08 August 2024
https://goldncloudpublications.com Page No: 2579-2583
https://doi.org/10.47392/IRJAEM.2024.0374
Abstract
This paper presents an AI-powered tool that analyzes and summarizes research papers, making it easier for
students to understand complex academic articles. The amount of digital information is growing rapidly, which
makes it hard to handle and understand all the text available in different areas. Summarizing long articles and
research papers quickly and accurately is important for finding information, combining knowledge, and making
decisions. This paper explains how we developed and tested a system that turns long documents into short,
clear summaries. The system extracts key phrases and terms that capture the core concepts and topics of a
research paper, and it provides features for keyword highlighting, read-aloud, plagiarism checking, image
extraction, and focus areas. The tool helps researchers and professors stay up to date with current
technologies in their respective fields. The Research Paper Summarizer project uses Natural Language
Processing (NLP) to analyze and summarize research papers effectively.
Keywords: Natural Language Processing (NLP), Highlighting Keywords, Read Aloud, Plagiarism, Images,
Research.
1. Introduction
In today's digital world, so many research papers are published that it can be challenging for students,
researchers, and professors to keep up with the latest developments. Understanding and staying up to date
with such a large amount of information takes a lot of time, which is not always feasible. To address this
issue, we developed an AI-powered tool specifically designed to analyze and summarize research papers
efficiently. The tool not only generates clear, concise summaries but also highlights key phrases and terms,
reads the text aloud, checks for plagiarism, extracts relevant images, and focuses on core concepts. Using
Natural Language Processing (NLP) techniques, the tool condenses complex academic literature into simple and
accessible summaries. This project aims to support academics and researchers by providing an effective way
to manage and comprehend the large amount of complex information in various fields [10].
2. Literature Survey
[1] This work creates a summary by first organizing the document in layers and then choosing sentences step
by step, taking into account what has already been included. It treats sentence selection as a
decision-making problem in which the document provides the information and selecting each sentence is an
action.
[2] This review of text summarization was conducted using a Systematic Literature Review (SLR) approach. SLR
is a method for finding, evaluating, and interpreting all relevant research on a specific topic or set of
research questions.
[3] This software uses the external tool WordNet to improve the generated summary. WordNet is a database
that groups words by their meanings. The Natural Language Toolkit (NLTK)
for Python is used to connect to WordNet from the program. The quality of the summarization is evaluated
using ROUGE.
[4] Sentence scoring features are grouped into seven categories. One category, frequency-keyword heuristics,
uses the most common words in the document to identify its main themes. Sentences that include these
frequent words are scored according to how often the words appear. Another category, indicator phrases,
focuses on words that usually appear in important or informative parts of the text.
[5] Extractive unsupervised summarization creates a summary from a document without using any pre-labeled
data or classifications. There are three main methods: graph-based, latent variable, and term frequency.
These methods are reported to be easy to implement, to give good results, and often to produce better
outcomes than more advanced techniques [16].
3. Proposed System
In our proposed system, we developed an AI summarization tool with which users can upload a research paper
and obtain its summary [6]. The system works only on the paper uploaded by the user [13]. It generates an
extractive summary by selecting meaningful sentences from the paper. Along with the summary, the system
displays the images extracted from the paper that relate to its main content, so that users can view the
figures present in the paper. It also includes a plagiarism checker that reports the percentage of the text
flagged as plagiarized. A read-aloud feature lets users listen to the generated summary [7-9]. The system
also underlines the keywords of the research paper so that they stand out in the summary. The keyword and
read-aloud modules enhance user interaction with the paper: the keyword module identifies the important
words in the paper and highlights them, which helps users locate essential information [17]. The system
creates simplified and coherent summaries, making complex papers more accessible and understandable. It also
lets users perform various actions, such as viewing the images in the paper, viewing the summary of the
paper, and checking for plagiarism [11].
The flow of the system is as follows. First, the user uploads the research paper, and text processing is
performed on it. The system then generates a summary, applies post-processing, removes the stop words, and
returns to the text processing stage [12]. This process continues until a meaningful summary of the paper is
produced, as shown in Figure 1.
Figure 1 Architecture of The System
4. Implementation
4.1. Natural Language Toolkit Module
Our system uses the nltk module. It includes text processing libraries for tasks such as tokenization,
stemming, lemmatization, part-of-speech tagging, and named entity recognition. Tokenization splits the text
from the paper into words or simple sentences [14]. Stemming reduces words to their root form.
Lemmatization reduces each word to its dictionary form, which keeps the words used in the summary well
formed. Stop-word removal drops common words such as "is", "to", and "in", which do not affect the meaning
of the sentences. Named entity recognition identifies the proper nouns in the text so that they can be added
to the summary [15].
We also incorporated the Image and ImageTk modules from PIL; the Image module is used for opening,
5. Result
5.1. Generated Summary
First, the user uploads the file (PDF). After the upload, the file path is displayed in the interface. When
the user then selects the option named ‘Summarize Paper’, the summary is generated and displayed as shown in
Figure 3.
5.2. Images
The images in the uploaded paper, which may be charts, flowcharts, or graphs, are extracted and displayed
under the option named ‘Image’ [19], as shown in Figure 4 and Figure 5.
Figure 5 Image Generated from Paper