EASESUM: An Online Abstractive and Extractive Text Summarizer Using Deep Learning Technique
Corresponding Author:
Sunday Adeola Ajagbe
Department of Computer Science, University of Zululand
Kwadlangezwa 3886, South Africa
Email: [email protected]
1. INTRODUCTION
Over the years, there has been a drastic increase in the data generated daily [1], [2]. The global datasphere is projected to reach 175 zettabytes by 2025, according to the International Data Corporation (IDC) in its Data Age 2025 analysis for Seagate [3], [4]. This increase has been attributed to technological advancement and the datafication of the world, which resulted in the birth of big data [5]. Data are either structured or unstructured. Unlike its amorphous counterpart, which usually includes text and multimedia, structured data are more organized (usually in tabular form). A significant portion of the data generated is unstructured, necessitating the study of unstructured data analytics. Unstructured data contains many irregularities and ambiguities; therefore, it needs to be analyzed to draw meaningful insights. Manually manipulating and compressing unstructured data is highly time-intensive and cannot keep up with the daily growth of data, hence the introduction of electronic means [1]. Unstructured data is also harder to process with conventional methods than structured data; it must first be converted into a machine-readable representation. Text mining and natural language processing are essential in overcoming this obstacle.
Text mining, also known as knowledge discovery in text, involves deriving insight and looking for patterns in textual data. Much of the information inherent in a document is unknown and can hardly be uncovered without automatic mining techniques. Automatic text summarization is a subfield of data mining and natural language processing concerned with extracting meaningful information from textual documents [6]. Automatic text summarization differs substantially from human summarization, as humans can identify and connect significant meanings and patterns in text documents [7].
Text summarization can be categorized as extractive or abstractive based on the output of the summary procedure [8]. Extractive text summarization builds its output from sentences taken directly from the original manuscript. When abstractive text summarization is applied, the resulting summary contains only concepts from the original text, expressed in newly generated sentences. In the literature, more study has been conducted on extractive text summarization [9], [10]. Text summarization can also be categorized by approach. The feature-based approach uses statistical methods to determine the importance of a sentence in a text. The latent semantic analysis-based method reduces sentence vector dimension using singular value decomposition [11]. The topic-based technique uses the topics in a sentence to rate its value. The relevance-measure approach uses statistical similarity to decide whether a sentence should be included in a summary. The graph-based method builds a graph from the input text and ranks sentences using the graph. The template-based method generates templates from the input text and uses them for summarization. The more recent machine learning-based approach [10], [12], [13] applies machine learning algorithms to text summarization. This study examines the use of a graph-based approach and a deep learning approach to summarize text documents online with little loss of the documents' ideas.
Various techniques have been utilized for abstractive text summarization. This study contributes to the body of knowledge by using the TextRank algorithm to implement the extractive summarizer, while a bi-directional recurrent neural network (RNN) was used to implement the abstractive summarizer. Furthermore, word embedding was used to improve the quality of the summaries produced. The average recall-oriented understudy for gisting evaluation (ROUGE) recall scores, ranging from 30.00 to 60.00 for the abstractive summarizer and 0.75 to 0.82 for the extractive summarizer, are encouraging compared with state-of-the-art results.
The study has the potential to provide significant benefits to users by helping them save time, improve comprehension, and keep up with the ever-increasing amount of information available online. It can also help users make better-informed decisions by providing a concise overview of the information they need to consider, which is particularly useful for businesses, policymakers, and others who must base decisions on large amounts of data. The motivation is to help readers understand complex material by breaking it down into more manageable chunks, which is particularly helpful for people who are not experts in a given field or who have limited time to read.
The remainder of this paper is organized as follows. The next section reviews relevant literature. The methodology follows the literature review and examines the methods used in the proposed system. The results and discussion section presents the results obtained and their implications. The last section concludes the paper and highlights possible areas of future work.
2. RELATED STUDIES
Several studies have been conducted on text summarization. They can broadly be categorized into extractive and abstractive text summarization. In this section, we examine the literature in both categories.
performance evaluation of the proposed summarization technique indicated it to be effective. Jang and Kang [12] examined extractive summarization using a graph-based approach. The approach considered the degree to which nodes on the edges of the graph are similar, and weights were distributed based on similarity with the topic. A semantic measure was also used to find the similarity between nodes. The proposed method produced a precision, recall, and F-measure of 0.154, 0.229, and 0.445, respectively.
Liu et al. [16] examined multi-document text summarization using the firefly algorithm. Their fitness function introduced three features: readability, coherence, and topic-relation factors. The proposed system was evaluated using the ROUGE score and compared with other nature-inspired algorithms such as particle swarm optimization and the genetic algorithm. The proposed method produced a ROUGE-1 recall, precision, and F-score of 0.43803, 0.48095, and 0.47821, and ROUGE-2 results of 0.21212, 0.25012, and 0.22951, respectively. Ajagbe et al. [17] considered the use of common hand-crafted features for text summarization across multiple documents. These features include the number of sentences, phrase frequency, title similarity, sentence position, sentence length, and sentence-sentence frequency. Two fuzzy inference systems and a multilayer perceptron were utilized for phrase extraction and document understanding after various combinations of these features were examined. The recall, precision, and F-score were 0.409, 0.512, and 0.370 for ROUGE-1; ROUGE-2 produced a recall, precision, and F-score of 0.290, 0.360, and 0.264, respectively.
Bhuiyan et al. [18] presented a document summarization technique using a quantum-inspired genetic algorithm. In their method, the preprocessing steps include sentence segmentation, tokenization, stop-word removal, case folding, part-of-speech tagging, and stemming. Sentence scoring made use of statistical features, sentence-to-document and sentence-to-title cosine similarity, and the quantum-inspired genetic algorithm. The results showed a recall, precision, and F-score of 0.4779, 0.4757, and 0.4767 for ROUGE-1, and 0.1289, 0.1286, and 0.1287 for ROUGE-2, respectively. Mallick et al. [19] presented an approach to unsupervised extractive text summarization. The system used a sentence graph generated automatically from each document. The method was extended from single-document to multi-document summarization by using both a document graph and proximity-based cross-document edges.
Mattupalli et al. [20] proposed an unsupervised extractive summarization model called the learning-free integer programming summarizer. Their approach avoids the laborious training stage required by supervised extractive summarization methods. In their system, an integer programming problem was formulated from pre-trained sentence embedding vectors, and principal component analysis was used to select the sentences to extract from the document. The F1-scores obtained for ROUGE-1, ROUGE-2, and ROUGE-L after testing on the WikiHow dataset were 24.28, 5.32, and 18.69, respectively; on the CNN dataset, the corresponding F1-scores were 36.45, 14.29, and 24.56.
3. METHOD
3.1. Input dataset
For the evaluation of the proposed system, two datasets were used: the Amazon fine food reviews dataset and the Newsroom dataset. Abstractive text summarization techniques are supervised learning techniques; therefore, they require a labelled corpus (dataset) to be trained on. In this study, the Amazon fine food reviews dataset was used for this purpose. It is a CSV file in the English language consisting of reviews of fine foods from Amazon, covering 74,258 products, 256,059 users, and 568,454 reviews collected between October 1999 and October 2012. The dataset was downloaded from Kaggle and is available at https://www.kaggle.com/snap/amazon-fine-food-reviews [5], [21].
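As a quick illustration, the (text, summary) pairs needed for supervised training can be pulled from the CSV with pandas. The miniature in-memory CSV below is a hypothetical stand-in for the real file, and the column names `Text` and `Summary` are assumptions for this sketch:

```python
import io
import pandas as pd

# Hypothetical miniature stand-in for the Kaggle reviews CSV; the
# column names (Id, Summary, Text) are assumed for illustration.
csv_data = io.StringIO(
    "Id,Summary,Text\n"
    "1,Good coffee,This coffee tastes great and arrived quickly.\n"
    "2,Too salty,The chips were far too salty for my taste.\n"
)

df = pd.read_csv(csv_data)
# Keep only what training needs: the review body as the source text
# and the short headline-style summary as the target.
pairs = df[["Text", "Summary"]].dropna()
print(len(pairs))  # → 2
```

The same two lines apply unchanged to the full downloaded file by passing its path to `read_csv` instead of the buffer.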
The Newsroom dataset is a collection of 1.3 million stories and summaries written and edited by people working in the newsrooms of 38 major news organizations. This high-quality text, extracted from search and social media metadata between 1998 and 2017, exhibits a wide range of summarization styles. The dataset is available at Cornell University's dataset repository [22]–[24].
Figure 1 shows the proposed system's block diagram.
EASESUM: an online abstractive and extractive text summarizer using natural … (Jide Kehinde Adeniyi)
1892 ISSN: 2252-8938
3.2.2. Tokenization
Tokenization is the process of breaking down a written document into small components called tokens. A token can be a word, a fragment of a word, or merely a character such as a period (which was removed in the cleansing stage). Tokenization essentially divides the material into small chunks of words and removes stop words [16]. In this study, tokenization was used to extract the words from each sentence.
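The tokenization step can be sketched as follows; the stop-word list here is a tiny assumed sample, not the full list used in the paper:

```python
import re

# Tiny assumed stop-word sample for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

def tokenize(sentence: str) -> list[str]:
    # Lowercase and keep alphabetic word tokens only (punctuation was
    # already removed in the cleansing stage), then drop stop words.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(tokenize("The quick brown fox jumps over the lazy dog"))
# → ['quick', 'brown', 'fox', 'jumps', 'over', 'lazy', 'dog']
```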
||D|| = √(D1² + D2² + ⋯ + Dn²)
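The Euclidean norm of a sentence or document vector, as written above, can be computed directly; a minimal sketch:

```python
import math

# ||D|| = sqrt(D1^2 + D2^2 + ... + Dn^2): the Euclidean norm of a
# sentence/document vector, written out in plain Python for clarity.
def norm(vector: list[float]) -> float:
    return math.sqrt(sum(d * d for d in vector))

print(norm([3.0, 4.0]))  # → 5.0
```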
PageRank algorithm: websites are ranked in search engine results using the PageRank algorithm developed by Google. PageRank is named after Larry Page, one of Google's original founders. Using PageRank, one may assess the significance of web pages: by considering the quantity and quality of links pointing to a page, PageRank generates an approximate evaluation of its importance. The underlying assumption is that pages with greater authority are more likely to receive links from other pages. Suppose that pages T1 through Tn all point to page A (i.e., cite it). A variable called the damping factor d has a range of 0 to 1 (usually set around 0.85); more information about d is given below. C(A) refers to the number of links that leave page A. A page's PageRank is calculated using (2) [29]:
PR(A) = (1 − d) + d(PR(T1)/C(T1) + ⋯ + PR(Tn)/C(Tn)) (2)
where Tk = a page pointing to page A; PR(Tk) = the PageRank of page Tk; d = the damping factor; and C(Tk) = the number of outgoing links of page Tk, k = 1, …, n.
Because PageRank is a probability distribution over web pages, the total PageRank of all web pages sums to one. PageRank, or PR(A), corresponds to the principal eigenvector of the web's normalized link matrix and may be computed using a simple iterative process. In this study, the PageRank algorithm was used to rank sentences rather than web pages. The algorithm ranks each sentence in order of importance using the number of words in the sentence that appear in the topic of the article.
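The iterative process behind equation (2) can be sketched for a small sentence graph. The toy link structure below is an assumed example; in the paper's setting the edges would come from sentence similarity with the topic rather than hyperlinks:

```python
# Minimal sketch of the PageRank update PR(A) = (1 - d) + d * sum(PR(T)/C(T)),
# iterated to convergence over a toy graph of three "sentences".
def pagerank(links: dict[int, list[int]], d: float = 0.85, iters: int = 50) -> dict[int, float]:
    nodes = list(links)
    pr = {n: 1.0 / len(nodes) for n in nodes}
    out_degree = {n: len(links[n]) for n in nodes}
    for _ in range(iters):
        new_pr = {}
        for n in nodes:
            # Sum PR(T)/C(T) over every node T that links to n.
            incoming = sum(pr[t] / out_degree[t] for t in nodes if n in links[t])
            new_pr[n] = (1 - d) + d * incoming
        pr = new_pr
    return pr

# Toy graph: node 0 is pointed to by both 1 and 2, so it should rank highest.
graph = {0: [1], 1: [0], 2: [0]}
scores = pagerank(graph)
print(max(scores, key=scores.get))  # → 0
```

A node with no incoming edges settles at the floor value 1 − d (0.15 here), mirroring the damping-factor behaviour described above.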
EASESUM: an online abstractive and extractive text summarizer using natural … (Jide Kehinde Adeniyi)
1894 ISSN: 2252-8938
SoftMax, which converts the decoder outputs into a probability distribution over a fixed-size vocabulary. This likelihood is predicted based on the recurrent decoder state and the previously produced token. The encoded representations of the source article are fed into the decoder together with a context vector from the attention layer. Figure 3 shows the abstractive model structure. The model was obtained after tuning several hyper-parameters.
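The final SoftMax step mentioned above can be shown in isolation; this is a generic sketch of the function, not the model's actual output layer:

```python
import math

# SoftMax: maps a vector of raw decoder scores over a fixed-size
# vocabulary to a probability distribution.
def softmax(scores: list[float]) -> list[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(round(sum(probs), 6))  # probabilities sum to 1
```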
Precision: precision is the proportion of the information retrieved by a system that is actually correct, as opposed to the incorrectly retrieved portion. The precision P is obtained using (4) [6]:

P = TP/(TP + FP) (4)
F-score: the F-score is a metric that combines precision and recall by calculating their harmonic mean. The F1-score, which balances recall and precision, is the most commonly used F-score. The F-score is obtained using (5) [6]:

F = 2TP/(2TP + FP + FN) (5)
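These metrics follow directly from the true-positive, false-positive, and false-negative counts; a small sanity-check sketch (with made-up counts), not the paper's evaluation code:

```python
# Precision, recall, and F-score computed from TP/FP/FN counts,
# matching the formulas above.
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)

def f_score(tp: int, fp: int, fn: int) -> float:
    # Harmonic mean of precision and recall, in closed form.
    return 2 * tp / (2 * tp + fp + fn)

tp, fp, fn = 8, 2, 2
print(precision(tp, fp), recall(tp, fn), f_score(tp, fp, fn))
# → 0.8 0.8 0.8
```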
ROUGE: ROUGE is a collection of measures for assessing machine translation and automatic text summarization. The objective is to compare the quality of the generated summary against a standard document automatically. The recall is determined by counting the number of units (N-grams) shared by the generated summary and the reference summaries. Because a text may have numerous valid summaries, this method allows the use of multiple reference summaries. ROUGE compares an automatically generated summary to a collection of pre-set, or golden, summaries. Many ROUGE variants have been proposed, including ROUGE-N, ROUGE-L, ROUGE-W, ROUGE-S, and ROUGE-SU. For this study, ROUGE-N and ROUGE-L were used for summary evaluation [32], [33].
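The n-gram counting behind ROUGE-N recall can be sketched in a few lines; this is a bare-bones illustration, not the full ROUGE toolkit:

```python
from collections import Counter

# ROUGE-N recall: the fraction of reference n-grams that also occur in
# the candidate summary, with clipped (multiset) counts.
def ngrams(tokens: list[str], n: int) -> Counter:
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())

score = rouge_n_recall("the cat sat on the mat", "the cat is on the mat")
print(round(score, 4))  # → 0.8333
```

With multiple reference summaries, the same computation is repeated per reference and the results aggregated, as described above.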
Table 2. Extractive text summarizer test case II results
Precision Recall F1-score
R-1 0.7619 0.9697 0.8533
R-2 0.7000 0.9459 0.8046
R-L 0.7619 0.9697 0.8533
Table 3. Abstractive text summarizer test case I results
Precision Recall F1-score
R-1 0.250 1.000 0.399
R-2 0.000 0.000 0.000
R-L 0.250 1.000 0.399

Table 4. Abstractive text summarizer test case II results
Precision Recall F1-score
R-1 0.200 1.000 0.333
R-2 0.000 0.000 0.000
R-L 0.200 1.000 0.333
as human summaries, since humans can think about and choose the best option. However, most readers cannot easily grasp a long text and piece a summary together by applying basic logic. So, if a suitable summarization approach is employed, automatically generated summaries may be a good substitute for human summaries, and they can make dealing with vast amounts of data much easier and faster. Providing this summarization approach online, as done in this study, gives easier access to text summarization. For future studies, comparisons could be made between machine learning techniques, and other ranking algorithms could be compared with the PageRank algorithm to determine which is more efficient.
REFERENCES
[1] I. Awasthi, K. Gupta, P. S. Bhogal, S. S. Anand, and P. K. Soni, “Natural language processing (NLP) based text summarization - a
survey,” in 2021 6th International Conference on Inventive Computation Technologies (ICICT), 2021, pp. 1310–1317, doi:
10.1109/ICICT50816.2021.9358703.
[2] N. Alami, M. Meknassi, N. En-nahnahi, Y. El Adlouni, and O. Ammor, “Unsupervised neural networks for automatic Arabic text
summarization using document clustering and topic modeling,” Expert Systems with Applications, vol. 172, 2021, doi:
10.1016/j.eswa.2021.114652.
[3] D. T. Anh and N. T. T. Trang, “Abstractive text summarization using pointer-generator networks with pre-trained word embedding,”
in Proceedings of the Tenth International Symposium on Information and Communication Technology - SoICT 2019, 2019,
pp. 473–478, doi: 10.1145/3368926.3369728.
[4] R. Bhargava and Y. Sharma, “Deep extractive text summarization,” Procedia Computer Science, vol. 167, no. 2019, pp. 138–146,
2020, doi: 10.1016/j.procs.2020.03.191.
[5] R. C. Belwal, S. Rai, and A. Gupta, “A new graph-based extractive text summarization using keywords or topic modeling,” Journal
of Ambient Intelligence and Humanized Computing, vol. 12, no. 10, pp. 8975–8990, 2021, doi: 10.1007/s12652-020-02591-x.
[6] Y. Dong, S. Wang, Z. Gan, Y. Cheng, J. C. K. Cheung, and J. Liu, “Multi-fact correction in abstractive text summarization,” in
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020, pp. 9320–9331, doi:
10.18653/v1/2020.emnlp-main.749.
[7] W. S. El-Kassas, C. R. Salama, A. A. Rafea, and H. K. Mohamed, “Automatic text summarization: A comprehensive survey,”
Expert Systems with Applications, vol. 165, Mar. 2021, doi: 10.1016/j.eswa.2020.113679.
[8] I. D. Oladipo, M. AbdulRaheem, J. B. Awotunde, A. K. Bhoi, E. A. Adeniyi, and M. K. Abiodun, "Machine learning and deep
learning algorithms for smart cities: a state-of-the-art review," IoT and IoE Driven Smart Cities, pp. 143-162, 2021.
[9] R. Rawat, O. Oki, R. K. Chakrawarti, T. S. Adekunle, J. M. Lukose and S. A. Ajagbe, "Autonomous artificial intelligence systems
for fraud detection and forensics in dark web environments," Informatica, vol. 47, no. 9, pp. 51-62, 2023,
https://doi.org/10.31449/inf.v47i9.4538.
[10] M. Grusky, M. Naaman, and Y. Artzi, “Newsroom: A Dataset of 1.3 million summaries with diverse extractive strategies,” in
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human
Language Technologies, Volume 1 (Long Papers), 2018, pp. 708–719, doi: 10.18653/v1/N18-1065.
[11] S. L. Hou et al., “A survey of text summarization approaches based on deep learning,” Journal of Computer Science and Technology,
vol. 36, no. 3, pp. 633–663, 2021, doi: 10.1007/s11390-020-0207-x.
[12] M. Jang and P. Kang, “Learning-free unsupervised extractive summarization model,” IEEE Access, vol. 9, pp. 14358–14368, 2021,
doi: 10.1109/ACCESS.2021.3051237.
[13] B. Jing, Z. You, T. Yang, W. Fan, and H. Tong, “Multiplex graph neural network for extractive text summarization,” in Proceedings
of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 133–139, doi: 10.18653/v1/2021.emnlp-
main.11.
[14] J. Li, C. Zhang, X. Chen, Y. Hu, and P. Liao, “Survey on automatic text summarization,” Jisuanji Yanjiu yu Fazhan/Computer
Research and Development, vol. 58, no. 1, pp. 1–21, 2021, doi: 10.7544/issn1000-1239.2021.20190785.
[15] Y. Kumar, K. Kaur, and S. Kaur, “Study of automatic text summarization approaches in different languages,” Artificial Intelligence
Review, vol. 54, no. 8, pp. 5897–5929, 2021, doi: 10.1007/s10462-021-09964-4.
[16] J. Liu, D. J. D. Hughes, and Y. Yang, “Unsupervised extractive text summarization with distance-augmented sentence graphs,” in
Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp.
2313–2317, doi: 10.1145/3404835.3463111.
[17] S. A. Ajagbe, A. A. Adegun, A. B. Olanrewaju, J. B. Oladosu, and M. O. Adigun, “Performance investigation of two-stage detection
techniques using traffic light detection dataset,” IAES International Journal of Artificial Intelligence, vol. 12, no. 4, pp. 1909–1919,
2023, doi: 10.11591/ijai.v12.i4.pp1909-1919.
[18] M. R. Bhuiyan, M. H. Mahedi, N. Hossain, Z. N. Tumpa, and S. A. Hossain, “An attention based approach for sentiment analysis
of food review dataset,” in 2020 11th International Conference on Computing, Communication and Networking Technologies
(ICCCNT), 2020, pp. 1–6, doi: 10.1109/ICCCNT49239.2020.9225637.
[19] C. Mallick, A. K. Das, M. Dutta, A. K. Das, and A. Sarkar, “Graph-based text summarization using modified TextRank,” in Soft
Computing in Data Analytics, Singapore: Springer, 2019, pp. 137–146, doi: 10.1007/978-981-13-0514-6_14.
[20] S. Mattupalli, A. Bhandari, and B. Praveena, “Text summarization using deep learning,” International Journal of Recent
Technology and Engineering (IJRTE), vol. 9, no. 1, pp. 2663–2667, 2020, doi: 10.35940/ijrte.a3056.059120.
[21] M. Mojrian and S. A. Mirroshandel, “A novel extractive multi-document text summarization system using quantum-inspired genetic
algorithm: MTSQIGA,” Expert Systems with Applications, vol. 171, 2021, doi: 10.1016/j.eswa.2020.114555.
[22] B. Mutlu, E. A. Sezer, and M. A. Akcayol, “Multi-document extractive text summarization: A comparative assessment on features,”
Knowledge-Based Systems, vol. 183, p. 104848, 2019, doi: 10.1016/j.knosys.2019.07.019.
[23] D. Patel, S. Shah, and H. Chhinkaniwala, “Fuzzy logic based multi document summarization with improved sentence scoring and
redundancy removal technique,” Expert Systems with Applications, vol. 134, pp. 167–177, 2019, doi: 10.1016/j.eswa.2019.05.045.
[24] H. P. Chan and I. King, “A condense-then-select strategy for text summarization,” Knowledge-Based Systems, vol. 227, 2021, doi:
10.1016/j.knosys.2021.107235.
[25] J. Q.- Espino, R. M. Romero-González, and A.-M. Herrera-Navarro, “A deep look into extractive text summarization,” Journal of
Computer and Communications, vol. 9, no. 6, pp. 24–37, 2021, doi: 10.4236/jcc.2021.96002.
[26] M. M. Rahman and F. H. Siddiqui, “Multi-layered attentional peephole convolutional LSTM for abstractive text summarization,”
ETRI Journal, vol. 43, no. 2, pp. 288–298, 2021, doi: 10.4218/etrij.2019-0016.
[27] D. Reinsel, J. Gantz, and J. Rydning, The digitization of the world - from edge to core. Needham, Massachusetts: Framingham:
International Data Corporation, 2018.
[28] B. Rekabdar, C. Mousas, and B. Gupta, “Generative adversarial network with policy gradient for text summarization,” in 2019
IEEE 13th International Conference on Semantic Computing (ICSC), 2019, pp. 204–207, doi: 10.1109/ICOSC.2019.8665583.
[29] R. K. Roul, “Topic modeling combined with classification technique for extractive multi-document text summarization,” Soft
Computing, vol. 25, no. 2, pp. 1113–1127, 2021, doi: 10.1007/s00500-020-05207-w.
[30] N. K. Sirohi, D. M. Bansal, and D. S. N. R. Rajan, “Text summarization approaches using machine learning & LSTM,” Revista
Gestão Inovação e Tecnologias, vol. 11, no. 4, pp. 5010–5026, 2021, doi: 10.47059/revistageintec.v11i4.2526.
[31] S. Song, H. Huang, and T. Ruan, “Abstractive text summarization using LSTM-CNN based deep learning,” Multimedia Tools and
Applications, vol. 78, no. 1, pp. 857–875, 2019, doi: 10.1007/s11042-018-5749-3.
[32] T. Yorozu, M. Hirano, K. Oka, and Y. Tagawa, “Electron spectroscopy studies on magneto-optical media and plastic substrate
interface,” IEEE Translation Journal on Magnetics in Japan, vol. 2, no. 8, pp. 740–741, 1987, doi: 10.1109/TJMJ.1987.4549593.
[33] S. A. Ajagbe and M. O. Adigun, “Deep learning techniques for detection and prediction of pandemic diseases: a systematic literature
review,” Multimedia Tools and Applications, vol. 83, no. 2, pp. 5893–5927, 2024, doi: 10.1007/s11042-023-15805-z.