Trend Analysis in Machine Learning Research
Abstract—This paper aims to identify trends in machine learning research using text mining. Research articles contain significant knowledge and research results. However, they are long and contain much noise, so analyzing them takes considerable human effort. Text mining can be used to analyze and extract useful information from a large number of research articles quickly and automatically. Text mining is the process of discovering novel, previously unseen knowledge from unstructured, semi-structured and structured textual data. This knowledge provides important information that can be derived from textual data. In this paper, text mining methods are applied to detect trends in the terms that occur in research articles and how they vary over time. We collected 21,906 scientific papers from six top journals in the field of machine learning published in the period 1988-2017 and analyzed them using text mining. Our analysis shows how the prevalence of various terms in machine learning research has changed over three decades. The analysis helps upcoming researchers to explore the significant research areas of machine learning.

Keywords—text mining; machine learning; research trend analysis; data analysis

I. INTRODUCTION

Text mining denotes a process of mining meaningful, non-trivial patterns or knowledge from a set of unstructured texts [1]. It is an essential task for uncovering trends in large volumes of textual data [2]. In particular, the advent of high-speed internet generates large amounts of textual data in a variety of forms [3]. As an aspect of this trend, research using text mining techniques is actively being carried out to find patterns and extract implicit data from large volumes of data in various fields, such as academic article information and news article information [1,4,5]. The goal of text mining is to uncover hidden knowledge that was not known earlier [6]. In [7], text mining is described as a group of techniques employed to identify trends and produce knowledge from data.

Text mining techniques derive the frequencies of important terms in the content of textual data such as internet chat rooms, articles, or web pages, and classify associations between features [8]. Text mining often converts unorganized text into an effective collection of data suitable for data mining and thorough investigation [9]. Similar work on various research areas has been performed using text mining. In [10], text mining was applied to analyze trends in consumer policy. In [11], text mining was employed to identify the primary trends on Big Data in marketing. The research outcome helped to direct efforts more precisely toward business applications of Big Data in the marketing arena. In [12], text mining was applied to knowledge discovery in academic research.

This paper proposes to identify the trends in machine learning research. The research articles were drawn from well-established mainstream journals over the past three decades, i.e., 1988~2017. The journals included as primary data sources are IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE-PAMI), Journal of Machine Learning Research (JMLR), ScienceDirect Pattern Recognition (ScD-PR), IEEE Transactions on Neural Networks (IEEE-NN), Springer Machine Learning (Sp-ML), and ScienceDirect Neural Networks (ScD-NN). In this paper, text mining techniques are employed in a framework for determining the trends in machine learning research articles published over three decades. Each article comprises a title, an abstract, and the complete contents [13]. This approach may help new researchers to explore their research area further. The data processed in this study were only the titles and abstracts of the research articles. Analyzing the title and abstract of a research article is relevant because together they capture the overall objective of the article while omitting unneeded components such as figures and tables [13]. The remainder of the paper is arranged as follows: Section II provides the data collection and preprocessing steps. Section III presents the result analysis. Finally, Section IV concludes the paper.

II. METHODOLOGY

In this section, we discuss the method of data preparation, the description of the corpus, and the preprocessing applied to the corpus before text transformation. Fig. 1 shows the methodology for trend analysis in machine learning.
B. Description of Corpus
International Conference on Advances in Computing, Communication Control and Networking (ICACCCN2018)
converted to TfIdf vectors using Eq. (1), which gives the TfIdf weight of term $i$ in document $j$ in a corpus of $D$ documents.

$$\mathrm{weight}_{i,j} = \mathrm{frequency}_{i,j} \times \log_2 \frac{D}{\mathrm{documentfreq}_{i}} \qquad (1)$$

where $\mathrm{frequency}_{i,j}$ is the number of occurrences of the word in the document (term frequency); $\mathrm{documentfreq}_{i}$ is the number of documents containing the word (document frequency); $D$ is the count of all documents; and $\mathrm{weight}_{i,j}$ is the relative significance of the word in the document.

E. Feature and Attribute Selection

In this phase, a subset of the features was selected to represent each text document. The selected features produce a better textual description than a larger set of features that carry very little information about the data. The number of indicator variables was reduced by eliminating a list of stop words. Stemming was performed on the terms, converting them into their root form. The terms were then filtered using two parameters: word frequency and inverse document frequency. Terms with low occurrence in the corpus and low occurrence in each document were removed. Features were selected based on the classification, and the few insignificant attributes were eliminated.

III. RESULT AND DISCUSSION

In this section, we discuss the classification of the research articles in the dataset and the contribution of articles in each decade. Finally, we present the trend analysis in machine learning research.

A. Research Article Classification

The Gensim package is used to perform text mining on the titles and abstracts of the collected articles. It is based on the idea of processing large unstructured text corpora document by document, in a memory-independent fashion. It also implements the Vector Space Model (VSM) algorithms [17] and includes corpus transformations such as TfIdf, LSI, and random projection. For the experiments, Gensim was used as a Python library for implementing the trend analysis and document streaming [18]. The articles were classified into 14 machine learning areas by applying the steps discussed in Section II. Table III shows the classification produced by the term frequencies and weights. The descriptive terms (1-14) serve as classification labels, each with its corresponding terms. The table also shows the count of articles retrieved in each decade according to the terms identified using the TfIdf model.

B. Research Contribution In Each Decade

In this subsection, we discuss the contribution of the descriptive terms in each decade. Fig. 2 shows the percentage of research contribution of the descriptive terms (1-14) in each decade from 1988~2017. The donut shape for each descriptive term represents the percentage contribution for decade1 (i.e., from 1988~1997), decade2 (i.e., from 1998~2007) and decade3 (i.e., 2008~2017). The research contribution in each decade is calculated using Eq. (2).

$$\mathit{CED}(t,d) = \frac{\mathit{act}}{\sum_{t=1}^{n} \mathit{act}} \qquad (2)$$

where $\mathit{CED}(t,d)$ is the research contribution of each descriptive term $t$ in each decade $d$, $\mathit{act}$ is the article count for each descriptive term $t$, $n$ is the total number of descriptive terms, and $\sum_{t=1}^{n} \mathit{act}$ is the sum of the article counts for all descriptive terms in each decade.

TABLE III. ARTICLE CLASSIFICATION WITH TERMS FROM 1988~2017

| S. No. | Descriptive Terms | Terms | 1988~1997 | 1998~2007 | 2008~2017 |
|---|---|---|---|---|---|
| 1 | Artificial Neural Network and Deep learning | belief, boltzmann, convolutional, deep, forward, learning, logic, network, propagation, recurrent | 1459 | 1052 | 2703 |
| 2 | Bayesian statistics | base, knowledge, average, bayesian, dependence, estimator, gaussian, multinomial, naive, network | 410 | 635 | 945 |
| 3 | Classifiers | binary, classifier, discriminant, hierarchical, linear, machine, multi, naive, probability, support | 560 | 1020 | 1363 |
| 4 | Cluster analysis | birch, dbscan, fuzzy, hierarchical, mean, algorithm, cluster, group, optics, expectation | 721 | 1214 | 1711 |
| 5 | Decision tree algorithm | c4.5, c5.0, decision, detect, id3, iterative, random, sliq, stump, tree | 659 | 860 | 1135 |
| 6 | Dimensionality reduction | component, correlation, discriminant, extraction, factor, feature, least, mapping, principal, stochastic | 754 | 1572 | 2432 |
| 7 | Ensemble learning | ada, aggregate, average, boost, ensemble, forest, gradient, machine, random, tree | 248 | 464 | 714 |
| 8 | Instance-based learning | algorithm, base, learn, map, near, object, organize, quant, self, vector | 309 | 926 | 1082 |
| 9 | Regression Analysis | adapt, least, linear, logistic, multi, ordinary, regression, spline, step, variable | 158 | 457 | 800 |
| 10 | Regularization algorithm | absolute, angle, elastic, least, net, operator, regression, ridge, select, square | 88 | 271 | 526 |
| 11 | Reinforcement learning | action, advance, algorithm, automata, difference, learn, prior, reward, state, temporal | 201 | 269 | 372 |
| 12 | Semi-supervised learning | active, density, generate, graph, learn, method, model, separate, train, trans | 577 | 723 | 713 |
| 13 | Supervised learning | algorithm, annova, boost, classify, hidden, learn, model, near, support, target | 1720 | 2347 | 3052 |
| 14 | Unsupervised learning | expect, maximize, algorithm, generate, map, method, text, mine, group, vector | 503 | 649 | 600 |

The percentage contribution of each descriptive term is shown in the donut. The top five research areas in decade1 were supervised learning, artificial neural network, dimensionality reduction, cluster analysis, and decision tree algorithm. Similarly, in decade2 the top five areas were supervised learning, dimensionality reduction, cluster analysis, artificial neural network, and classifiers. Finally, in decade3 the top five areas were supervised learning, artificial neural network and deep learning, dimensionality reduction, cluster analysis, and classifiers.

from 1998~2007) and green color for decade3 (i.e., 2008~2017). The research contribution across decades is calculated using Eq. (3).

$$\mathit{CAD}(t,d) = \frac{\mathit{CED}(t,d)}{\sum_{k=1}^{d} \mathit{CED}(t,k)} \qquad (3)$$

where $\mathit{CAD}(t,d)$ is the research contribution of each descriptive term $t$ across decade $d$, $\mathit{CED}(t,d)$ is the research contribution of each descriptive term $t$ in each decade $d$, and $\sum_{k=1}^{d} \mathit{CED}(t,k)$ is the sum of the per-decade contributions for descriptive term $t$.
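The contribution measures of Eq. (2) and Eq. (3) can be illustrated with a short Python sketch that uses the article counts from Table III. This is not the authors' code: the names `ARTICLE_COUNTS`, `ced`, `cad` and `top_terms` are hypothetical, and the sketch assumes that the sum in Eq. (3) runs over all three decades.

```python
# Article counts per descriptive term, copied from Table III.
# Tuple order: decade1 (1988~1997), decade2 (1998~2007), decade3 (2008~2017).
ARTICLE_COUNTS = {
    "Artificial neural network and deep learning": (1459, 1052, 2703),
    "Bayesian statistics": (410, 635, 945),
    "Classifiers": (560, 1020, 1363),
    "Cluster analysis": (721, 1214, 1711),
    "Decision tree algorithm": (659, 860, 1135),
    "Dimensionality reduction": (754, 1572, 2432),
    "Ensemble learning": (248, 464, 714),
    "Instance-based learning": (309, 926, 1082),
    "Regression analysis": (158, 457, 800),
    "Regularization algorithm": (88, 271, 526),
    "Reinforcement learning": (201, 269, 372),
    "Semi-supervised learning": (577, 723, 713),
    "Supervised learning": (1720, 2347, 3052),
    "Unsupervised learning": (503, 649, 600),
}
NUM_DECADES = 3


def ced(counts):
    """Eq. (2): a term's article count divided by the decade's total count."""
    totals = [sum(row[d] for row in counts.values()) for d in range(NUM_DECADES)]
    return {
        term: tuple(row[d] / totals[d] for d in range(NUM_DECADES))
        for term, row in counts.items()
    }


def cad(ced_values):
    """Eq. (3): a term's CED in one decade divided by its CED summed over decades.

    Assumption: the sum in the denominator covers all three decades.
    """
    return {
        term: tuple(value / sum(row) for value in row)
        for term, row in ced_values.items()
    }


def top_terms(ced_values, decade, k=5):
    """Return the k terms with the largest contribution in the given decade (0-based)."""
    return sorted(ced_values, key=lambda t: ced_values[t][decade], reverse=True)[:k]


ced_values = ced(ARTICLE_COUNTS)
cad_values = cad(ced_values)

for decade in range(NUM_DECADES):
    print(f"decade{decade + 1} top five:", top_terms(ced_values, decade))
```

Under these assumptions the ranking by CED within a decade is the same as the ranking by raw article count, so the printed lists can be compared directly with the top five areas named in the text.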
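Returning to the term weighting of Eq. (1), the following minimal sketch computes the weights directly from tokenized documents using only the Python standard library. The function name `tfidf_weights` and the toy corpus are hypothetical and serve only to illustrate the formula. The study itself used Gensim, whose TfIdf transformation may additionally normalize the vectors, so its output need not match these raw values.

```python
import math
from collections import Counter


def tfidf_weights(documents):
    """Compute Eq. (1) for every term in every document.

    documents: list of token lists (already stop-word filtered and stemmed).
    Returns one {term: weight} dictionary per document.
    """
    num_docs = len(documents)

    # Document frequency: number of documents that contain each term.
    document_freq = Counter()
    for tokens in documents:
        document_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        term_freq = Counter(tokens)  # occurrences of each term in this document
        weights.append({
            term: freq * math.log2(num_docs / document_freq[term])
            for term, freq in term_freq.items()
        })
    return weights


# Hypothetical toy corpus of four stemmed "abstracts".
docs = [
    ["deep", "network", "learn"],
    ["cluster", "learn", "cluster"],
    ["tree", "learn", "deep"],
    ["ridge", "regress", "learn"],
]
weights = tfidf_weights(docs)
print(weights[0])
print(weights[1])
```

In this toy corpus the term "learn" occurs in every document, so its weight is zero, which illustrates why terms that appear everywhere contribute little to distinguishing one research area from another.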
decades. Each descriptive term increased significantly from earlier decades, but the regularization algorithm and regression analysis showed the highest rise in decade3.

Fig. 8. Percentage increase of each descriptive term over three decades

Thus, this section concludes the result analysis of machine learning research trends using text mining.

IV. CONCLUSION

In this paper, text mining techniques are used to perform trend analysis of the machine learning research area over three decades. The corpus was prepared from research articles published in six well-established journals. To capture scholars' research interest in the descriptive terms over the previous 30 years, the dataset was split into three sets for the time spans 1988~1997, 1998~2007, and 2008~2017. This study can be useful to upcoming researchers in machine learning for gaining an intuition of the trends in their area of interest. The framework can also be applied to identify trends in research areas associated with other fields of study.

REFERENCES

[1] J.-L. Hung and K. Zhang, “Examining mobile learning trends 2003–2008: a categorical meta-trend analysis using text mining techniques,” Journal of Computing in Higher Education, vol. 24, no. 1, pp. 1–17, Oct. 2011.
[2] A. Kao and S. R. Poteet, Natural language processing and text mining. London: Springer, 2010.
[3] I. O. R. Patterns and T. T. Mining, “Identification of Research Patterns and Trends through Text Mining,” International Journal of Information and Education Technology, vol. 2, no. 3, pp. 233–235, 2012.
[4] S. Lee et al., “Using Patent Information for New Product Development: Keyword-Based Technology Roadmapping Approach,” 2006 Technology Management for the Global Future - PICMET 2006 Conference, Istanbul, 2006, pp. 1496-1502.
[5] A. Balahur and R. Steinberger, “Rethinking Sentiment Analysis in the News: from Theory to Practice and back,” Proceedings of the 1st Workshop on Opinion Mining and Sentiment Analysis, University of Sevilla, pp. 1-12, 2009.
[6] V. Gupta and G. S. Lehal, “A survey of text mining techniques and applications,” Journal of Emerging Technologies in Web Intelligence, vol. 1, no. 1, pp. 60-76, 2009.
[7] L. Francis and M. Flynn, “Text Mining Handbook,” Casualty Actuarial Society E-Forum, Spring, pp. 1-61, 2006.
[8] L. Francis, “Taming Text: An Introduction to Text Mining,” Casualty Actuarial Society Forum, Winter, pp. 51-88, 2010.
[9] P. Cerrito, “Inside text mining. Text mining provides a powerful diagnosis of hospital quality rankings,” Health Management Technology, vol. 25, no. 3, pp. 28-31, 2004.
[10] M.-J. Kim, K. Ohk, and C.-S. Moon, “Trend Analysis by Using Text Mining of Journal Articles Regarding Consumer Policy,” New Physics: Sae Mulli, vol. 67, no. 5, pp. 555–561, 2017.
[11] A. Amado, P. Cortez, P. Rita, and S. Moro, “Research trends on Big Data in Marketing: A text mining and topic modeling based literature analysis,” European Research on Management and Business Economics, vol. 24, no. 1, pp. 1–7, 2018.
[12] A. K. Ojo and A. B. Adeyemo, “Knowledge Discovery In Academic Electronic Resources Using Text Mining,” International Journal of Computer Science and Information Security, vol. 11, no. 2, pp. 1-10, 2013.
[13] Z. Shaik, S. Garia, and G. Chakraborty, “SAS® Since 1976: An Application of Text Mining to Reveal Trends,” Proceedings of the SAS Global Forum 2012 Conference, Data Mining and Text Analytics, SAS Institute Inc., Cary.
[14] S. Bird, “NLTK: the natural language toolkit,” in Proceedings of the COLING/ACL on Interactive presentation sessions, Association for Computational Linguistics, pp. 69-72, 2006.
[15] Available for download from ftp://ftp.cs.cornell.edu/pub/smart/english.stop.
[16] M. Porter, “An algorithm for suffix stripping,” Program, vol. 14, no. 3, pp. 130-137, 1980.
[17] G. Salton, “A vector space model for automatic indexing,” Communications of the ACM, vol. 18, no. 11, pp. 613-620, 1975.
[18] R. Řehůřek and P. Sojka, “Software Framework for Topic Modeling with Large Corpora,” in Proceedings of LREC workshop New Challenges for NLP Frameworks, Valletta, Malta: University of Malta, pp. 46-50, 2010.