27 A Review of Some Semi Supervised Learning Methods
250-259
In recent years, the use of data-mining techniques and smart algorithms has become common.
Several tasks that formerly required significant amounts of time and money can now be performed
by means of these techniques and algorithms. At the same time, many of our sources are textual.
Over the years, different approaches to classifying such texts have been proposed. It is
noteworthy that the possibility of automating these classifications relies on new texts. This paper deals
with the basic concepts of data mining and text mining and reviews some semi-supervised learning
methods. It also reviews some common algorithms in this area and finally presents a summary and
conclusion.
1. Introduction
In natural language processing, and text processing in particular, one of the basic tasks is the
automatic classification of texts [1], [2]. Identifying a text's category or class can provide
useful information for processes such as machine translation, text-to-speech conversion, and
OCR (Optical Character Recognition). Classification begins by learning from a set of documents whose
classes are pre-determined. A classification model is learned from this set and later used
to determine the class of each new incoming document. The aim of text classification is to assign pre-defined
classes to textual documents. For instance, when a new piece of news is inserted into the system, the system
will be able to determine whether it belongs to the class of sports, politics, or art [3]. There
are various methods for classifying documents. The majority of methods used in previous works were based
on supervised learning, where a sufficient amount of labeled data is available. A few
were based on unsupervised learning on unlabeled data. Since obtaining labels is
hard and costly while unlabeled data are abundant, semi-supervised learning is a good way
to reduce human effort and improve accuracy and reliability. Semi-supervised learning uses
a large amount of unlabeled data along with the labeled data to improve categorization. Since
semi-supervised learning achieves higher accuracy with less human effort, there is much interest in it both
Article History:
JKBEI DOI: 649123/11036
Received Date: 14 Sep. 2015
Accepted Date: 06 Dec. 2015
Available Online: 14 Jan. 2016
Mohsen Hajighorbani et al. / Vol. 2(4) Jan. 2016, pp. 250-259 JKBEI DOI: 649123/11036
theoretically and practically. For classifying English texts, different techniques have been used,
such as the Bayesian classifier [4], artificial neural networks [5], nearest neighbor algorithms [6],
support vector machines [7], the EM algorithm, and others. For classifying Persian texts,
techniques such as unsupervised methods [8], the Bayesian classifier [9], nearest neighbor algorithms [10],
n-gram indicators [11], and the use of semantic knowledge [12] have been employed, yet there has not been
any notable work on semi-supervised methods.
2. Basic Concepts
2.1. Data Mining
Data mining is a set of intelligent methods for extracting hidden knowledge from data [13]. As the
term suggests, “data mining” is understood as the extraction of hidden information, patterns, or
relations from a large amount of data within one or several large datasets [14].
Data mining analyzes large databases and datasets, automatically or semi-automatically,
in order to discover and extract knowledge. Such studies can be considered a
continuation of statistics [15]. The main difference lies in the scale, breadth, and variety
of the domains and applications, as well as in the dimensionality and size of modern data, which
require machine methods for learning, modeling, and training.
Nowadays, as database systems and the amount of data stored in them grow, there is a
need for tools that can process the stored data and provide the resulting information
to users. With ordinary SQL-based reporting tools and
simple queries, one can give users reports from which they can draw conclusions
about the data and their logical relations. However, when the amount of data is colossal,
users, however skilled and experienced, cannot identify suitable patterns in the mass of data,
or, even if they can, the cost of doing so is very high in both human effort and
money.
Moreover, users usually pose a hypothesis and then try to prove or disprove it against the
observed reports, whereas today there is a need for methods that discover knowledge themselves, i.e.
methods that find and present patterns and logical relations automatically, with the
least user intervention.
Data mining is one of the most important examples of such methods. It identifies suitable patterns in
the data with minimal user intervention and provides users and analysts with
information on which important and vital organizational decisions can be based [16]. Data
mining uses a part of statistics called exploratory data analysis, which emphasizes the discovery of
hidden and unknown information in a mass of data. In addition, data mining is closely related to
artificial intelligence and machine learning; one can therefore say that in data mining the basic theories of
data, artificial intelligence, machine learning, and statistics are combined to form a functional
foundation.
Note that data mining is used when we are faced with a large volume of data, a point emphasized
throughout the data mining literature [17]. The larger the volume of data and the more complicated the
relations among them, the harder it is to reach the information hidden in the data, and the clearer the
role of data mining as a method of knowledge discovery. In general, learning methods in data
mining can be categorized into three groups: supervised learning (classification), unsupervised learning
(clustering), and semi-supervised learning. In what follows, while reviewing the concepts of text mining,
we deal with the differences between supervised and unsupervised learning in 2.2.3, while the methods of
semi-supervised learning are discussed in 2.4.
Fig. 1 A) In classification, the data are assigned to known sets by means of prior information. In
clustering, the data are assigned to clusters according to the selected algorithm. B) In clustering,
learning is done without a supervisor, i.e. there is no need for initial training data.
Generally, in supervised learning, of which classification is a part, the classes are known from the
beginning and each item of the training data is assigned to a particular class; it is said that there is
a supervisor to train the system [23]. In unsupervised learning, of which clustering is a part, the
learner has no such training information at its disposal and must itself seek out
structure in the data [24].
[Figure: intra-cluster distances are minimized; inter-cluster distances are maximized.]
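The clustering goal named in the figure can be checked numerically. The following is a minimal Python sketch, not taken from the paper: the function names and the toy points are assumptions made for illustration. It compares a sensible grouping of four points with a poor grouping of the same points.

```python
from itertools import combinations
from math import dist

def mean_intra_cluster_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    pairs = [dist(a, b) for c in clusters for a, b in combinations(c, 2)]
    return sum(pairs) / len(pairs)

def mean_inter_cluster_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    pairs = [dist(a, b)
             for c1, c2 in combinations(clusters, 2)
             for a in c1 for b in c2]
    return sum(pairs) / len(pairs)

# Toy data (illustrative only): two well-separated groups.
good = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 5.0), (5.0, 6.0)]]
# The same four points, grouped badly.
bad = [[(0.0, 0.0), (5.0, 5.0)], [(0.0, 1.0), (5.0, 6.0)]]

for name, clusters in (("good", good), ("bad", bad)):
    print(name,
          round(mean_intra_cluster_distance(clusters), 2),
          round(mean_inter_cluster_distance(clusters), 2))
```

A clustering that matches the structure of the data should show a smaller intra-cluster value and a larger inter-cluster value than a poor one.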
In supervised learning (classification), the class label of each training item is given. A model is
constructed from these training data and is then used to predict class labels for unseen data (test
data).
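The train-then-predict cycle can be sketched as follows. This is an illustrative example only: the nearest-centroid classifier, the two numeric features per document, and the class names are assumptions chosen for brevity, not a method prescribed by the paper.

```python
from collections import defaultdict
from math import dist

def fit_centroids(train):
    """Build the model: one mean vector (centroid) per class label."""
    groups = defaultdict(list)
    for features, label in train:
        groups[label].append(features)
    return {label: tuple(sum(col) / len(rows) for col in zip(*rows))
            for label, rows in groups.items()}

def predict(model, features):
    """Assign the label whose centroid is closest to `features`."""
    return min(model, key=lambda label: dist(model[label], features))

# Hypothetical labeled training data: (feature vector, class label).
train = [
    ((1.0, 0.0), "sports"), ((0.9, 0.2), "sports"), ((0.8, 0.1), "sports"),
    ((0.0, 1.0), "politics"), ((0.1, 0.9), "politics"), ((0.2, 0.8), "politics"),
]
# Unseen (test) data whose labels are to be predicted.
test = [(0.95, 0.05), (0.05, 0.95)]

model = fit_centroids(train)
predictions = [predict(model, x) for x in test]
print(predictions)
```

Any other supervised learner could take the place of the centroid model; only the split between a fitting step and a prediction step matters here.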
To begin, the cluster centers are selected randomly; whatever the selection, the must-link constraints
should be preserved (i.e. must-linked points cannot be selected as the center of another cluster).
In the assignment stage, every point is assigned to the nearest cluster center whose choice does not
violate any constraint. If no such assignment exists, the entire clustering is rejected.
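The constrained assignment stage can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function names, the representation of constraints as pairs of point indices, and the toy data are hypothetical, and the center-update step of a full constrained k-means is omitted.

```python
from math import dist

def violates(i, cluster, assignment, must_link, cannot_link):
    """True if putting point i in `cluster` breaks a constraint against
    a point that has already been assigned."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        if other is not None and assignment.get(other, cluster) != cluster:
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        if other is not None and assignment.get(other) == cluster:
            return True
    return False

def assign(points, centers, must_link, cannot_link):
    """One constrained assignment pass. Returns {point index: cluster index},
    or None when some point has no feasible cluster (clustering rejected)."""
    assignment = {}
    for i, p in enumerate(points):
        # Try the cluster centers from nearest to farthest.
        order = sorted(range(len(centers)), key=lambda c: dist(p, centers[c]))
        for c in order:
            if not violates(i, c, assignment, must_link, cannot_link):
                assignment[i] = c
                break
        else:
            return None  # no cluster satisfies the constraints
    return assignment

# Toy data (illustrative only).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
centers = [(0.0, 0.5), (5.0, 5.5)]
print(assign(points, centers, must_link=[(0, 1)], cannot_link=[(1, 2)]))
```

With a single center and a cannot-link pair, no feasible assignment exists and the function returns `None`, which corresponds to rejecting the clustering.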
The policies available for measuring the inter-cluster distance are Min (single link), Max (complete
link), Group Average, and Distance between Centroids. Each initializes the distance matrix in its own
way. In this method, the Must-link and Cannot-link constraints are also imposed on the distance matrix.
Must-link Constraint: The distance between must-link pairs is set to zero.
Cannot-link Constraint: Assume that the hierarchical clustering uses complete link (max), so that
the distance between two clusters is determined by the greatest pairwise distance. Then the distance
between two cannot-link points is set to the maximum entry of the distance matrix plus one
($\max_{i,j} D_{ij} + 1$). The pseudocode for semi-supervised hierarchical clustering with complete
link and constraints in the distance matrix is given below:
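As an illustration only, and not the original listing, the procedure described above can be sketched in Python. The function names, the stopping parameter `k`, and the toy data are assumptions. The sketch also makes one simplification: it does not propagate must-link constraints to other entries of the distance matrix.

```python
from math import dist

def constrained_distance_matrix(points, must_link, cannot_link):
    """Pairwise distances with must-link pairs set to 0 and cannot-link
    pairs set to (largest original distance + 1)."""
    n = len(points)
    D = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    big = max(max(row) for row in D) + 1
    for i, j in must_link:
        D[i][j] = D[j][i] = 0.0
    for i, j in cannot_link:
        D[i][j] = D[j][i] = big
    return D, big

def complete_link(D, big, k):
    """Merge the two closest clusters (complete link: cluster distance is
    the largest pairwise distance) until k clusters remain or every
    remaining merge would join a cannot-link pair."""
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(D[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d >= big:  # only constraint-violating merges are left
            break
        merged = clusters.pop(b)
        clusters[a].extend(merged)
    return clusters

# Toy data (illustrative only): points 3 and 4 must share a cluster,
# points 0 and 1 must not.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
must_link, cannot_link = [(3, 4)], [(0, 1)]
D, big = constrained_distance_matrix(points, must_link, cannot_link)
clusters = complete_link(D, big, k=2)
print(clusters)
```

Because a cannot-link pair is farther apart than any original pair, complete link never prefers a merge containing such a pair while another merge is available, and the loop stops early if no other merge remains.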
This method can be used with any supervised learning algorithm and often performs well. Its
drawback is that errors are reinforced at each stage. For instance, consider a KNN classifier as the
base learning algorithm, as in Fig. 4.2. As can be seen, if outliers exist, the learning algorithm
makes a mistake and the errors are reinforced.
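A minimal Python sketch of such a loop is given below, assuming a 1-nearest-neighbor base learner and taking "most confident" to mean "closest to an already labeled point". Both choices, the function names, and the one-dimensional toy data are assumptions made for illustration.

```python
from math import dist

def nearest_labeled(labeled, x):
    """Return (distance, label) of the labeled point closest to x (1-NN)."""
    return min((dist(p, x), lbl) for p, lbl in labeled)

def self_train(labeled, unlabeled):
    """Repeatedly pseudo-label the unlabeled point that lies closest to any
    labeled point and add it to the labeled set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    while unlabeled:
        best = min(unlabeled, key=lambda x: nearest_labeled(labeled, x)[0])
        _, label = nearest_labeled(labeled, best)
        labeled.append((best, label))  # the pseudo-label is never revisited
        unlabeled.remove(best)
    return labeled

# Toy data (illustrative only): one labeled point per class and a chain of
# unlabeled points stretching from class A towards class B.
labeled = [((0.0,), "A"), ((10.0,), "B")]
unlabeled = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (6.0,)]

result = dict(self_train(labeled, unlabeled))
print(result[(6.0,)])                           # label after self-training
print(nearest_labeled(labeled, (6.0,))[1])      # label from the original data
```

In this toy run the point at 6.0 ends up with label A through the chain of earlier pseudo-labels, although the original labeled data alone would assign it to B. A single early mistake caused by an outlier would spread through the remaining points in the same way.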
• This method works well if the classifiers are confident in their predictions and are
sufficiently diverse (each is correct on different regions of the sample space).
• The method can be extended to more than two classifiers. To obtain diverse and
independent classifiers, two separate feature sets (two views) are employed.
• Co-training with several views decreases the error of the separate views. A further decrease can
be attained by combining the predictions of the two classifiers.
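The points above can be illustrated with a small co-training sketch in Python. Everything in it is an assumption made for illustration: each example has two numeric views, each view uses a nearest-mean classifier, confidence is measured by the margin between the two closest class means, and at least two classes are labeled.

```python
def fit(view_data):
    """Nearest-mean classifier on one view: class label -> mean feature value."""
    sums = {}
    for value, label in view_data:
        total, n = sums.get(label, (0.0, 0))
        sums[label] = (total + value, n + 1)
    return {label: total / n for label, (total, n) in sums.items()}

def predict(model, value):
    """Return (label, margin); a larger margin means higher confidence."""
    ranked = sorted(model, key=lambda lbl: abs(model[lbl] - value))
    best, second = ranked[0], ranked[1]
    return best, abs(model[second] - value) - abs(model[best] - value)

def co_train(labeled, unlabeled, rounds=5):
    """labeled: [((view1, view2), label)], unlabeled: [(view1, view2)].
    In each round, the classifier of each view pseudo-labels the unlabeled
    example it is most confident about and adds it to the shared set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not unlabeled:
                return labeled
            model = fit([(x[view], y) for x, y in labeled])
            best = max(unlabeled, key=lambda x: predict(model, x[view])[1])
            labeled.append((best, predict(model, best[view])[0]))
            unlabeled.remove(best)
    return labeled

def combined_predict(labeled, x):
    """Combine the two views by trusting the more confident one."""
    preds = [predict(fit([(p[v], y) for p, y in labeled]), x[v]) for v in (0, 1)]
    return max(preds, key=lambda p: p[1])[0]

# Toy data (illustrative only).
labeled = [((0.0, 0.0), "neg"), ((10.0, 10.0), "pos")]
unlabeled = [(1.0, 2.0), (9.0, 8.0), (4.0, 1.0), (6.0, 9.0)]
result = co_train(labeled, unlabeled)
print(dict(result))
print(combined_predict(result, (5.5, 9.0)))
```

In this run the example (6.0, 9.0) is nearly ambiguous in the first view but clear in the second, so it is the second view's classifier that labels it. This is the benefit that separate views are meant to provide.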
References
[1] Fernando Enríquez, Fermín L. Cruz, F. Javier Ortega, Carlos G. Vallejo, José A. Troyano, A comparative study of classifier
combination applied to NLP tasks, Information Fusion, Volume 14, Issue 3, July 2013, Pages 255-267, ISSN 1566-2535
[2] Asif Ekbal, Sriparna Saha, Simulated annealing based classifier ensemble techniques: Application to part of speech tagging,
Information Fusion, Volume 14, Issue 3, July 2013, Pages 288-300, ISSN 1566-2535
[3] SMR. Hashemi "A Survey of Visual Attention Models" Ciência e Natura, v. 37 Part 2 2015, p. 297−306 ISSN impressa: 0100-
8307 ISSN on-line: 2179-460X.
[4] Limeng Cui, Yong Shi, A Method based on One-class SVM for News Recommendation, Procedia Computer Science, Volume
31, 2014, Pages 281-290, ISSN 1877-0509
[5] Li, Y.H., Jain, A.K. “Classification of text documents”, Computer Journal, 41 (8), 1998, pp. 537-546.
[6] Li, Wei; Lee, Bob; Krausz, Franl and Sahin, Kenan. Text Classification by a Neural Network. In Proceedings of the 23rd Annual
Summer Computer Simulation Conference, 1991, pp. 313-318.
[7] Soucy, P., Mineau, G.W. “A simple KNN algorithm for text categorization”, in Proceedings of the IEEE International
Conference on Data Mining (ICDM), 2001, pp. 647-648.
[8] SMR. Hashemi, M. Zangian, M. Shakeri, and M. Faridpoor, "Survey Article about Image Fuzzy Processing Algorithms." The
Journal of Mathematics and Computer Science, Vol 13, Issue 1 2014, pp 26-40.
[9] Joachims, Thorsten. Text Categorization with Support Vector Machines: Learning with Many Relevant Features. In
Proceedings of the 10th European Conference on Machine Learning, 1998, pp. 137-42.
[10] M. Arabsorkhi, M. Shamsfard, "Unsupervised Discovery of Persian Morphemes," in Proceedings of the 11th Conference of
the European chapter of the Association for Computational Linguistics (EACL), Italy, 2006.
[11] A. Bagheri, H. Farzanehfar, M. H. Soraie, M. R. Ahmadzadeh, "Persian News Text Classification Using Naïve Bayes Algorithm",
in 2nd National Conference of Iranian Data Mining, Tehran, 2009.
[12] M. E. Basiri, S. Nemati, N. G. Aqaie, "Comparison of Persian Text Classification Using kNN, fkNN and Feature Selection
Algorithms Based on Information Gain and Document Frequency", in 13th Annual Conference of Computer Society of Iran,
Tehran, 2008.
[13] SMR. Hashemi, A.Broumandnia. "A Review of Attention Models in Image Protrusion and Object Detection." The Journal of
Mathematics and Computer Science, Vol 15, Issue 4 2015, pp 273-283.
[14] B. Bina, M. Rahgozar, and A. D. Mobad, "Automatic Persian Text Classification", in 13th Annual Conference of Computer
Society of Iran, Tehran, 2008.
[15] N. Maqsudi, and M. M. Homayunpur, "A Novel Method for Persian Text Classification Using Semantic Knowledge", in 15th
Annual Conference of Computer Society of Iran, Tehran, 2010.
[16] M.S.B. PhridviRaj, C.V. GuruRao, Data Mining – Past, Present and Future – A Typical Survey on Data Streams, Procedia
Technology, Volume 12, 2014, Pages 255-263, ISSN 2212-0173
[17] Mohsen Ramezani, Parham Moradi, Fardin Akhlaghian, A pattern mining approach to enhance the accuracy of collaborative
filtering in sparse data domains, Physica A: Statistical Mechanics and its Applications, Volume 408, 15 August 2014, Pages
72-84, ISSN 0378-4371
[18] SMR. Hashemi, S. Mohammadalipour and A. Broumandnia, " Evaluation and classification new algorithms in Image
Resizing.", International Journal of Mechatronics, Electrical and Computer Technology Vol. 5(18) Special Issue, Dec. 2015,
PP. 2649-2654, ISSN: 2305-0543
[19] Chrystalleni Lazarou, Minas Karaolis, Antonia-Leda Matalas, Demosthenes B. Panagiotakos, Dietary patterns analysis using
data mining method. An application to data from the CYKIDS study, Computer Methods and Programs in Biomedicine,
Volume 108, Issue 2, November 2012, Pages 706-714, ISSN 0169-2607
[20] Edrisi Muñoz, Elisabet Capón-García, José M. Laínez-Aguirre, Antonio Espuña, Luis Puigjaner, Using mathematical knowledge
management to support integrated decision-making in the enterprise, Computers & Chemical Engineering, Volume 66, 4
July 2014, Pages 139-150, ISSN 0098-1354
[21] Zhenhua Wang, Lai Tu, Zhe Guo, Laurence T. Yang, Benxiong Huang, Analysis of user behaviors by mining large network data
sets, Future Generation Computer Systems, Volume 37, July 2014, Pages 429-437, ISSN 0167-739
[22] Seyyed Mohammad Reza. Hashemi, A. Broumandnia, "A New Method for Image Resizing Algorithm via Object Detection."
International Journal of Mechatronics, Electrical and Computer Technology, Vol 5, Issue 16 2015.
[23] Dirk Thorleuchter, Dirk Van den Poel, Anita Prinzie, Mining ideas from textual information, Expert Systems with Applications,
Volume 37, Issue 10, October 2010, Pages 7182-7188, ISSN 0957-4174
[24] Xiaofei Zhou, Yue Hu, Li Guo, Text Categorization based on Clustering Feature Selection, Procedia Computer Science, Volume
31, 2014, Pages 398-405, ISSN 1877-0509
[25] Yao-Tsung Chen, Meng Chang Chen, Using chi-square statistics to measure similarities for text categorization, Expert
Systems with Applications, Volume 38, Issue 4, April 2011, Pages 3085-3090, ISSN 0957-4174
[26] Jan Kalina, Classification methods for high-dimensional genetic data, Biocybernetics and Biomedical Engineering, Volume
34, Issue 1, 2014, Pages 10-18, ISSN 0208-5216
[27] Mohamed Maher Ben Ismail, Hichem Frigui, Unsupervised clustering and feature weighting based on Generalized Dirichlet
mixture modeling, Information Sciences, Volume 274, 1 August 2014, Pages 35-54, ISSN 0020-0255
[28] Masanori Kawakita, Jun’ichi Takeuchi, Safe semi-supervised learning based on weighted likelihood, Neural Networks,
Volume 53, May 2014, Pages 146-164, ISSN 0893-6080
[29] Martin Längkvist, Lars Karlsson, Amy Loutfi, A review of unsupervised feature learning and deep learning for time-series
modeling, Pattern Recognition Letters, Volume 42, 1 June 2014, Pages 11-24, ISSN 0167-8655
[30] Yangchang Zhao, Chapter 10 - Text Mining, In R and Data Mining, edited by Yangchang Zhao, Academic Press, 2013, Pages
105-122, ISBN 9780123969637
[31] M. ZANGIAN, SMR. HASHEMI, F. YAGHMAEE, and E. MOSHTAGH " COMPARATIVE EVALUATION OF FACE RECOGNITION
ALGORITHMS USING AND NONINDIVIDUAL ALGORITHMS, Vol 2, Issue1 2014 , pp 16-19.( Special Online Issue-February
2014)
[32] SMR. Hashemi, M. Kalantari, and M. Zangian, "Giving a New Method for Face Recognition Using Neural Networks.",
International Journal of Mechatronics, Electrical and Computer Technology Vol. 4(11), Apr. 2014, pp. 744-761, ISSN: 2305-0543
[33] Mohammad Mahdi Deramgozin, Smr.Hashemi, Azam Bastan Fard "Face Recognition Improvement in Angled Status Using
Invasive Weed Optimization Algorithm And fuzzy System"2016 1st International Conference on New Research Achievements
in Electrical and Computer Engineering (2016).
[34] Smr.Hashemi, A. Broumandnia, Z. Zangian and E. Moshtagh "ANNOTATING THE IMAGES USING BACKGROUND"Indian
Journal of Scientific Research, Special Online Issue-April2014.
[35] A Nouhi, SMR Hashemi, F Yaghmaee, M Zangian " Indexing for PERSIAN Textual Images " Applied Mathematics in
Engineering, Management and Technology 2 (1) (Jan 2014)
[36] M.HajiGhorbani, SMR.Hashemi, B.Minaei-Bidgoli, Shabnam Safari "A Review of Some Semi-Supervised Learning Methods"
IEEE-2016, First International Conference on New Research Achievements in Electrical and Computer Engineering 2016.
[37] MM.Deramgozin, SMR.Hashemi, A.BastanFard, M.HajiGhorbani "Face Recognition Improvement in Angled Status Using
Invasive Weed Optimization Algorithm And fuzzy System" IEEE-2016, First International Conference on New Research
Achievements in Electrical and Computer Engineering 2016.
[38] SMR.Hashemi, MM.Deramgozin, M. Hajighorbani, B.Minaei-Bidgoli "A review of the methods of watermarking in digital
texts" IEEE-2016, First International Conference on New Research Achievements in Electrical and Computer Engineering
2016.