Vol. 2(4), Jan. 2016, pp. 250-259

A Review of Some Semi-Supervised Learning Methods

Mohsen Hajighorbani1*, Seyyed Mohammad Reza Hashemi2, Ali Broumandnia3 and Maryam Faridpour4
1 Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
2 Department of Computer Engineering and Information Technology, Payame Noor University, Iran
3 Faculty of Computer and Information Technology Engineering, South Tehran Branch, Islamic Azad University, Tehran, Iran
4 Young Researchers and Elite Club, Qazvin Branch, Islamic Azad University, Qazvin, Iran
*Corresponding Author's E-mail: [email protected]
Abstract

In recent years the use of data-mining techniques and intelligent algorithms has become common. Several tasks that formerly cost significant amounts of time and money can now be performed by means of these techniques and algorithms. At the same time, many of our sources are textual. Over the years, texts have been classified in different ways, with varying approaches. It is noteworthy that the possibility of automating these classifications depends on new texts. This paper deals with basic concepts of data mining and text mining and reviews some semi-supervised learning methods. It also reviews some common algorithms in this area and finally presents the summary and conclusion.

Keywords: data-mining, text-mining, learning, semi-supervised, classification

1. Introduction
In natural language processing, and text processing in particular, one of the basic tasks is the automatic classification of texts [1], [2]. Identifying a text's category or class can provide useful information for processes such as machine translation, text-to-speech conversion, and OCR (Optical Character Recognition). In classification, learning is performed on a set of documents whose classes are pre-determined. A classification model is learned from this set and is later used to determine the class of a new incoming document. The aim of text classification is to assign pre-defined classes to textual documents. For instance, when a new piece of news is entered into the system, the system should be able to determine whether it belongs to the class of sports, politics, or art [3]. There are various methods for classifying documents. The majority of the methods used in previous works were based on supervised learning, where a sufficient amount of labeled data is available. A few were based on unsupervised learning on unlabeled data. Since obtaining labels is hard and costly while unlabeled data are abundant, semi-supervised learning is a good way to reduce human effort and improve accuracy and reliability. Semi-supervised learning, which uses a large amount of unlabeled data along with the labeled data, can improve categorization. Since semi-supervised learning needs less human effort to reach higher accuracy, there is much interest in it both theoretically and practically. For classifying English texts, different techniques have been used, such as the Bayesian classifier [4], artificial neural networks [5], nearest neighbor algorithms [6], support vector machines [7], the EM algorithm, and others. For classifying Persian texts, techniques such as unsupervised methods [8], the Bayesian classifier [9], nearest neighbor algorithms [10], n-gram indicators [11], and the use of semantic knowledge [12] have been employed, yet there has not been any notable work on semi-supervised methods.

Article History:
JKBEI DOI: 649123/11036
Received Date: 14 Sep. 2015
Accepted Date: 06 Dec. 2015
Available Online: 14 Jan. 2016

2. Basic Concepts
2.1. Data Mining
Data mining is a set of intelligent methods for extracting hidden knowledge from data [13]. As the term suggests, "data mining" is the extraction of hidden information, patterns, or relations from a large amount of data in one or several big datasets [14].
Data mining subjects large databases and datasets to automatic (and semi-automatic) analysis in order to discover and extract knowledge. Such studies can be considered a continuation of statistics [15]. The main difference lies in the scale, breadth, and variety of the domains and applications, as well as in the dimensionality and size of modern data, which require machine methods related to learning, modeling, and training.
Nowadays, as database systems and the amount of data stored in them grow, there is a need for tools with which the stored data can be processed and the resulting information provided to users. With ordinary SQL-based reporting tools and simple queries, one can provide users with reports from which they can draw conclusions about the data and their logical relations. However, when the amount of data is colossal, users, however skilled and experienced, cannot identify suitable patterns in the mass of data, or, even if they can, the cost of doing so is very high in both human and financial terms.
Moreover, users usually pose a hypothesis and then try to prove or disprove it according to the observed reports, whereas today there is a need for methods that discover knowledge themselves, i.e. that discover and present patterns and logical relations automatically, with the least user intervention.
Data mining is one of the most important examples of such methods. It identifies suitable patterns in the data with minimal user intervention and provides users and analysts with information on which important and vital organizational decisions can be based [16]. Data mining uses a part of statistics called exploratory data analysis, which emphasizes the discovery of hidden and unknown information from a bulk of data. Data mining is also closely related to artificial intelligence and machine learning; one can therefore say that in data mining, basic theories of data, artificial intelligence, machine learning, and statistics are intermingled to form a functional foundation.
Note that data mining is used when we are faced with a large bulk of data; all data mining sources emphasize this [17]. The greater the bulk of data and the more complicated the relations among them, the harder it is to access the hidden information within the data, and the clearer the role of data mining as a method of knowledge discovery. In general, learning methods in data mining can be categorized into three groups: supervised learning (classification), unsupervised learning (clustering), and semi-supervised learning. In the following, after reviewing the concepts of text mining, we deal with the differences between supervised and unsupervised learning in Section 2.3, while semi-supervised learning methods are discussed from Section 2.3.1 onward.

Journal of Knowledge-Based Engineering and Innovation (JKBEI)

Universal Scientific Organization, http://www.aeuso.org/jkbei
ISSN: 2413-6794 (Online)

2.2. Text Mining

A considerable part of the available information is stored as text, in large collections of documents from different sources (for instance news articles, papers, books, emails, web pages, etc.). Textual databases grow quickly as the amount of information available in electronic form increases. Nowadays most of the information in industry, business, and organizations is stored electronically in textual databases.
Data stored electronically are often semi-structured, because they are neither completely unstructured nor thoroughly structured. For instance, a document contains structured fields such as the title, authors, date of publication, and category, but also unstructured textual elements such as the summary and contents. Information retrieval methods (such as text indexing methods) were developed for managing unstructured documents. Older information retrieval methods are inefficient for the large and ever-growing amount of textual data. Without knowing a document's contents, it is difficult to formulate suitable queries to extract useful information from the data. Users need tools to compare different documents, organize them by relevance, and find patterns. Text mining, one of the more recent research areas in data mining, developed for this purpose. Text mining is the search for patterns in unstructured text [18], used to automatically discover desired or useful knowledge from the text. Many techniques have been used for text mining, including conceptual structures, association rule mining, decision trees, rule induction methods, and information retrieval methods, for tasks such as document matching, organizing, and clustering.
One problem of text mining, which has attracted much attention, is discovering useful knowledge from semi-structured or unstructured texts. Traditional data-mining methods assume that the data are held in relational databases and are hence unsuitable for many applications, such as electronic information available in semi-structured or unstructured form. Without text mining, unstructured textual data must be processed by hand by the users, which is costly and frustrating. Text mining can thus be said to automate a great deal of the users' work in exploring texts.
The terms textual data mining or knowledge discovery in text are sometimes used instead of text mining [19]. Text mining aims to find new knowledge in the text (usually knowledge that is implicit in the documents), whereas information retrieval finds the documents that are most relevant. Text mining can be considered an interdisciplinary field drawing on information retrieval, machine learning, statistics, computational linguistics, and particularly data mining [20]. Since text mining is rooted in different technologies, there are many definitions of it. People with a background in data mining wanted to apply the same concepts and methods of data mining, and their definitions were based on that area. Those coming from computational linguistics aimed to enable the computer to understand the text, which is the ultimate goal expected of text mining. The use of natural language processing techniques is thus indispensable for this purpose. Section 3.2 deals with the processing required for texts in Persian.

2.3. Difference between Supervised and Unsupervised Learning

In supervised learning (classification), the classes are pre-determined and each data item is assigned to one of them [21]. In unsupervised learning (clustering), however, nothing is assumed about the classes existing in the data; in other words, the clusters themselves are extracted from the data [22]. Fig. 1 shows the difference between clustering and classification.

Fig. 1 A) In classification, the data are assigned to known classes by means of initial information, whereas in clustering the data are assigned to clusters according to the selected algorithm. B) In clustering, learning is done without a supervisor, i.e. there is no need for initial training data.

Generally, in supervised learning, of which classification is a part, the classes are known from the beginning and each item of the training data is assigned to a particular class; it is said that there is a supervisor that trains the system [23]. In unsupervised learning, of which clustering is a part, no information such as training data is at the learner's disposal, and the learner itself must seek structure in the data [24].

2.3.1. Semi-Supervised Learning Methods

In a semi-supervised learning problem we are faced with two types of data, namely labeled and unlabeled data. Creating labeled data is usually difficult, costly, and time-consuming, while obtaining unlabeled data is easier and cheaper. The training data can be labeled by hand. Since the learning method is semi-supervised, there is no need to label all possible cases. The aim of using both labeled and unlabeled data is to improve performance. Semi-supervised learning can be viewed in two ways:
Unsupervised learning + extra labeled data: In this method a small amount of labeled data is used to guide the clustering of the unlabeled data.
Supervised learning + extra unlabeled data: In this method, training on the labeled data and using the extra unlabeled data often results in more accurate classification.
Unsupervised learning (clustering) groups similar objects together to find the clusters. The aim of clustering is to minimize the intra-cluster distances and maximize the inter-cluster distances. Fig. 2 demonstrates this concept.

Fig. 2 How clustering is done in unsupervised learning: intra-cluster distances are minimized and inter-cluster distances are maximized
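
The clustering aim stated above can be checked numerically. The following is a minimal sketch, not part of the original paper: the helper name `mean_intra_inter` and the toy points are assumptions made for illustration. It compares the mean pairwise distance inside clusters with the mean pairwise distance across clusters for two candidate labelings.

```python
from itertools import combinations
from math import dist

def mean_intra_inter(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance)."""
    intra, inter = [], []
    for i, j in combinations(range(len(points)), 2):
        d = dist(points[i], points[j])
        # A pair counts as intra-cluster when both points share a label.
        (intra if labels[i] == labels[j] else inter).append(d)
    return sum(intra) / len(intra), sum(inter) / len(inter)

# Toy data (hypothetical): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = mean_intra_inter(points, [0, 0, 1, 1])  # groups kept together
bad = mean_intra_inter(points, [0, 1, 0, 1])   # groups split apart
```

Under these assumptions the first labeling has the smaller intra-cluster and the larger inter-cluster distance, which is what a clustering algorithm aims for.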

In supervised learning (classification), the class label of each training item is given. A model is constructed from these training data and is then used to predict class labels for unseen data (test data).
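
As an illustration of this train-then-predict cycle, here is a small sketch using a nearest-centroid classifier. The choice of classifier, the function names `fit_centroids` and `predict`, and the toy data are assumptions for the example; the paper does not prescribe them.

```python
from math import dist

def fit_centroids(X, y):
    """Build the model: one mean vector (centroid) per class label."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {label: tuple(sum(c) / len(c) for c in zip(*pts))
            for label, pts in groups.items()}

def predict(model, x):
    """Predict the class whose centroid is nearest to x."""
    return min(model, key=lambda label: dist(model[label], x))

# Hypothetical labeled training data (two features per document).
X_train = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
y_train = ["sports", "sports", "politics", "politics"]
model = fit_centroids(X_train, y_train)

# Class labels for unseen (test) data are predicted from the model.
predicted = [predict(model, x) for x in [(2.0, 1.0), (7.0, 9.0)]]
```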

2.4. Semi-Supervised Clustering Algorithms

In search-based semi-supervised clustering, we alter clustering algorithms that search for a good partitioning in the following ways:
• The objective function is changed so that it rewards clusterings that agree with the supervised labels.
• Constraints (must-link, cannot-link, etc.) derived from the labeled data are enforced during clustering.
• Labeled data are used to initialize the clusters in an iterative refinement algorithm (k-means).
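
One way to read the first point is as a penalty added to the usual k-means cost for every violated pairwise constraint, which is equivalent to rewarding agreement with the supervision. The sketch below assumes this reading; the function name `penalized_cost` and the weight `w` are hypothetical and not taken from the paper.

```python
from math import dist

def penalized_cost(points, assign, centers, must_link, cannot_link, w=10.0):
    """K-means distortion plus a penalty w per violated pairwise constraint."""
    cost = sum(dist(p, centers[a]) ** 2 for p, a in zip(points, assign))
    # A must-link pair is violated when its points are in different clusters.
    cost += w * sum(assign[i] != assign[j] for i, j in must_link)
    # A cannot-link pair is violated when its points share a cluster.
    cost += w * sum(assign[i] == assign[j] for i, j in cannot_link)
    return cost

points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.5, 0.0), (4.0, 0.0)]
must_link, cannot_link = [(0, 1)], [(1, 2)]
respecting = penalized_cost(points, [0, 0, 1], centers, must_link, cannot_link)
violating = penalized_cost(points, [0, 1, 1], centers, must_link, cannot_link)
```

A search procedure that minimizes this cost prefers the assignment that respects both constraints.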

2.4.1. Semi-Supervised K-Means Algorithm

This method comprises several algorithms:
• Selected K-Means
• Constrained K-Means
• COP K-Means

2.4.1.1. Selected K-Means Algorithm

In this method, labeled data provided by the user are used for initialization:
• The initial center of cluster i is the mean of the points labeled i.
• The labeled points are used only for initialization; their labels are not fixed in later stages and may change.
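
The two points above can be sketched as follows. This initialization appears to correspond to what the clustering literature usually calls Seeded K-Means. The function name `seeded_kmeans`, the `seeds` dictionary (point index to cluster id), and the toy data are assumptions made for this example.

```python
from math import dist

def mean(pts):
    return tuple(sum(c) / len(c) for c in zip(*pts))

def seeded_kmeans(points, seeds, k, iters=20):
    """seeds: dict index -> cluster id (0..k-1); every cluster needs a seed."""
    # Initial center of cluster j = mean of the points labeled j.
    centers = [mean([points[i] for i, c in seeds.items() if c == j])
               for j in range(k)]
    assign = []
    for _ in range(iters):
        # Seed points are reassigned like any other point.
        assign = [min(range(k), key=lambda j: dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign, centers

points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
# Point 2 is (deliberately) seeded into the "wrong" cluster 1.
assign, centers = seeded_kmeans(points, {0: 0, 3: 1, 2: 1}, k=2)
```

In this run the mislabeled seed at index 2 ends up in cluster 0, illustrating that seed labels may change after initialization.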

2.4.1.2. Constrained K-Means Algorithm

In this method, too, the labeled data provided by the user are used for initialization. Unlike in the previous method, the labels of the labeled points remain unchanged in later stages, and only the labels of the other points may change.
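
A minimal sketch of this variant, under the same assumptions as a seed-initialized k-means (the name `constrained_kmeans`, the `seeds` dictionary, and the toy data are hypothetical): the only change is that seeded points keep their given cluster in every assignment step.

```python
from math import dist

def mean(pts):
    return tuple(sum(c) / len(c) for c in zip(*pts))

def constrained_kmeans(points, seeds, k, iters=20):
    """seeds: dict index -> cluster id; seeded points never change cluster."""
    centers = [mean([points[i] for i, c in seeds.items() if c == j])
               for j in range(k)]
    assign = []
    for _ in range(iters):
        # Labeled points stay clamped; only the other points are reassigned.
        assign = [seeds[i] if i in seeds
                  else min(range(k), key=lambda j: dist(p, centers[j]))
                  for i, p in enumerate(points)]
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign, centers

points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
# Point 2 is seeded into cluster 1 and must stay there.
assign, centers = constrained_kmeans(points, {0: 0, 3: 1, 2: 1}, k=2)
```

Here the point at index 2 stays in cluster 1 even though it lies close to the other group, because user-provided labels are treated as fixed.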

2.4.1.3. COP K-Means Algorithm

This is the k-means method with must-link constraints (two points must be in the same cluster) and cannot-link constraints (two points must not be in the same cluster) on the data points.

Initially, the cluster centers are selected randomly; whatever the selection, the must-link constraints must be preserved (i.e. points joined by a must-link cannot be selected as centers of different clusters).
In the assignment stage, every point is assigned to the nearest cluster center that does not violate any constraint. If no such assignment exists, the entire clustering is rejected.
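
The assignment rule can be sketched as below. This is a simplified illustration under stated assumptions: the names `violates` and `cop_kmeans` are hypothetical, constraints are given as pairs of point indices, and the initial centers are sampled at random without the extra must-link check described above.

```python
import random
from math import dist

def violates(i, cluster, assign, must_link, cannot_link):
    """True if putting point i into `cluster` breaks a constraint with
    respect to the points already assigned in this pass."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        if other is not None and assign[other] not in (None, cluster):
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        if other is not None and assign[other] == cluster:
            return True
    return False

def cop_kmeans(points, k, must_link, cannot_link, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # simplification: plain random start
    assign = None
    for _ in range(iters):
        new_assign = [None] * len(points)
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest; take the first feasible.
            ranked = sorted(range(k), key=lambda j: dist(p, centers[j]))
            choice = next((j for j in ranked if not violates(
                i, j, new_assign, must_link, cannot_link)), None)
            if choice is None:
                return None  # no feasible cluster: reject the clustering
            new_assign[i] = choice
        if new_assign == assign:
            break
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = tuple(sum(c) / len(c) for c in zip(*members))
    return assign

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
result = cop_kmeans(points, 2, must_link=[(0, 1)], cannot_link=[(1, 2)])
# Contradictory constraints on the same pair cannot be satisfied.
rejected = cop_kmeans(points, 2, must_link=[(0, 1)], cannot_link=[(0, 1)])
```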

2.5. Semi-Supervised Hierarchical Clustering Algorithm

There are two methods of hierarchical clustering: agglomerative and divisive. The basic agglomerative hierarchical clustering algorithm is as follows:
1. Compute the distance matrix
2. Let each data point be a cluster
3. Repeat
4. Merge the two closest clusters
5. Update the distance matrix
6. Until only a single cluster remains
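
The six steps above can be sketched directly. This is a naive illustration rather than an efficient implementation; the function name `agglomerative`, the `linkage` parameter (`max` for complete link, `min` for single link), and the one-dimensional toy points are assumptions for the example.

```python
from math import dist

def agglomerative(points, linkage=max):
    """Return the merge history as a list of (member indices, distance)."""
    clusters = {i: [i] for i in range(len(points))}  # 2. one cluster per point
    # 1. Distance matrix, stored as a dict keyed by pairs of cluster ids.
    D = {(i, j): dist(points[i], points[j])
         for i in clusters for j in clusters if i < j}
    merges, next_id = [], len(points)
    while len(clusters) > 1:  # 3./6. repeat until a single cluster remains
        (a, b), d = min(D.items(), key=lambda kv: kv[1])  # 4. closest pair
        merged = clusters.pop(a) + clusters.pop(b)
        merges.append((sorted(merged), d))
        # 5. Update the distance matrix: drop rows of a and b, add the new one.
        D = {pair: v for pair, v in D.items()
             if a not in pair and b not in pair}
        for c, members in clusters.items():
            D[(c, next_id)] = linkage(dist(points[i], points[j])
                                      for i in members for j in merged)
        clusters[next_id] = merged
        next_id += 1
    return merges

merges = agglomerative([(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)])
```

The merge history plays the role of a dendrogram: cutting it at a chosen distance yields a flat clustering.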

The policies available for measuring the inter-cluster distance are: Min (single link), Max (complete link), Group Average, and Distance between Centroids. Each fills in the distance matrix in its own way. In this method, too, the two constraints must-link and cannot-link are enacted in the distance matrix.
Must-link constraint: The distance between must-link pairs is set to zero.
Cannot-link constraint: Assume that hierarchical clustering is done with complete link (Max), so that the distance between two clusters is the greatest distance between their members. Then the distance between cannot-link pairs is set to the maximum entry of the distance matrix plus one ($\max_{i,j} D_{ij} + 1$). The pseudo-code for semi-supervised hierarchical clustering with complete link and the constraints enacted in the distance matrix is given below:
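
The authors' pseudo-code listing is not available in this text. As a stand-in, the following sketch shows one way the two rules above could be applied; the function names `constrain_distances` and `complete_link`, the stopping criterion of k clusters, and the toy data are assumptions, not the authors' listing.

```python
def constrain_distances(D, must_link, cannot_link):
    """Return a copy of distance matrix D with the constraints enacted."""
    C = [row[:] for row in D]
    big = max(max(row) for row in D) + 1  # max_{i,j} D_ij + 1
    for i, j in must_link:
        C[i][j] = C[j][i] = 0.0           # must-link pairs: distance zero
    for i, j in cannot_link:
        C[i][j] = C[j][i] = big           # cannot-link pairs: beyond any other
    return C

def complete_link(C, k):
    """Complete-link agglomerative clustering on matrix C down to k clusters."""
    clusters = [[i] for i in range(len(C))]
    while len(clusters) > k:
        pairs = ((a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters)))
        # Cluster distance = greatest distance between any two members.
        a, b = min(pairs, key=lambda ab: max(
            C[i][j] for i in clusters[ab[0]] for j in clusters[ab[1]]))
        clusters[a] += clusters.pop(b)
    return sorted(sorted(c) for c in clusters)

xs = [0.0, 1.0, 2.0, 10.0]                    # hypothetical 1-D points
D = [[abs(a - b) for b in xs] for a in xs]
unconstrained = complete_link(D, k=2)
constrained = complete_link(
    constrain_distances(D, must_link=[(2, 3)], cannot_link=[(1, 2)]), k=2)
```

With the constraints enacted, points 2 and 3 are merged first and point 1 is kept away from point 2, so the result changes from [[0, 1, 2], [3]] to [[0, 1], [2, 3]].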

3. Semi-Supervised Classification Algorithms

The most important algorithms, used for semi-supervised learning in this class, are [25]:
• Self-Training Learning
• Co-Training Learning

3.1. Self-Training Learning Algorithm

In this method a small amount of labeled training data exists. First, a classifier is trained on the labeled data. Afterwards, predictions are made on the unlabeled data, and finally the most confident predictions are added to the training data and the classifier is trained again.

Fig. 3 The process of the self-training learning algorithm

This method can be used with any supervised learning algorithm and often works well. Its problem is that errors are reinforced at each stage. For instance, consider a KNN classifier as the base learning algorithm, as in Fig. 4. As can be seen, if outliers exist, the learning algorithm makes a mistake and then reinforces the error.

Fig. 4 The learning process of the KNN algorithm in different iterations
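
A minimal self-training sketch with a nearest-neighbor base learner (KNN with k = 1) is given below. The function names, the use of negative distance as a confidence score, and the toy data are assumptions for illustration; any supervised learner with a confidence estimate could be substituted.

```python
from math import dist

def nn_predict(labeled, x):
    """1-NN: return (label, confidence); a closer neighbor is more confident."""
    point, label = min(labeled, key=lambda item: dist(item[0], x))
    return label, -dist(point, x)

def self_train(labeled, unlabeled, per_round=1):
    """Repeatedly move the most confident predictions into the labeled set."""
    labeled, pool = list(labeled), list(unlabeled)
    while pool:
        scored = [(nn_predict(labeled, x), x) for x in pool]
        scored.sort(key=lambda s: s[0][1], reverse=True)
        for (label, _), x in scored[:per_round]:
            labeled.append((x, label))  # the prediction becomes training data
            pool.remove(x)
    return labeled

labeled = [((0.0,), "a"), ((10.0,), "b")]
unlabeled = [(1.0,), (2.0,), (3.0,), (9.0,), (8.0,), (6.0,)]
result = dict(self_train(labeled, unlabeled))
```

Because every accepted prediction is later treated as ground truth, a single wrong label near an outlier would be propagated to its neighbors, which is the error reinforcement described above.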

3.2. Co-Training Learning Algorithm

The learning process in this algorithm is as follows:
• We have labeled (L) and unlabeled (U) data.
• We create two datasets L1 and L2 from L, using view 1 and view 2 respectively.
• We train classifier f1 on L1 and classifier f2 on L2.
• We apply f1 and f2 to the unlabeled data to predict labels. The predictions and their confidence come from each learning algorithm itself, based on the features of its view.
• We add the K predictions of f1 with the highest confidence to L2.
• We add the K predictions of f2 with the highest confidence to L1.
• We remove these samples from the unlabeled data.
• We train f1 again on L1 and f2 on L2.
• The procedure is similar to self-training, with the difference that two classifiers are trained and each one supplies labels for the other.
• Finally, we use voting or averaging over the classifiers to predict the test data.
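
The steps above can be sketched as follows, assuming each classifier passes its most confident predictions to the other. The 1-NN base learners, the function names, the confidence score, and the final rule of taking the more confident classifier (a simple stand-in for voting or averaging with two classifiers) are all assumptions for this example.

```python
from math import dist

def nn_predict(train, x):
    """1-NN on one view: return (label, confidence)."""
    point, label = min(train, key=lambda item: dist(item[0], x))
    return label, -dist(point, x)

def co_train(labeled, unlabeled, k=1, rounds=10):
    """labeled: [((view1, view2), label)]; unlabeled: [(view1, view2)]."""
    L1 = [(v1, y) for (v1, _), y in labeled]
    L2 = [(v2, y) for (_, v2), y in labeled]
    U = list(unlabeled)
    for _ in range(rounds):
        if not U:
            break
        # The k most confident samples according to each classifier.
        by_f1 = sorted(U, key=lambda u: nn_predict(L1, u[0])[1], reverse=True)[:k]
        by_f2 = sorted(U, key=lambda u: nn_predict(L2, u[1])[1], reverse=True)[:k]
        # f1's predictions extend L2 and f2's predictions extend L1.
        new_for_L2 = [(u[1], nn_predict(L1, u[0])[0]) for u in by_f1]
        new_for_L1 = [(u[0], nn_predict(L2, u[1])[0]) for u in by_f2]
        L2 += new_for_L2
        L1 += new_for_L1
        U = [u for u in U if u not in by_f1 and u not in by_f2]
    return L1, L2

def predict(L1, L2, sample):
    """Combine both views by trusting the more confident classifier."""
    return max(nn_predict(L1, sample[0]), nn_predict(L2, sample[1]),
               key=lambda r: r[1])[0]

labeled = [(((0.0,), (0.0,)), "a"), (((10.0,), (10.0,)), "b")]
unlabeled = [((1.0,), (4.0,)), ((9.0,), (6.0,)),
             ((4.0,), (1.0,)), ((6.0,), (9.0,))]
L1, L2 = co_train(labeled, unlabeled)
p_b = predict(L1, L2, ((5.5,), (9.5,)))
p_a = predict(L1, L2, ((1.0,), (2.0,)))
```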

The characteristics of this method are as follows:

• The method works well if the classifiers are confident in their predictions and sufficiently diverse (each is correct on different regions of the sample space).
• The method can be extended to more than two classifiers. To obtain diversity and independence of the classifiers, two separate feature sets (two views) are employed.
• Co-training with several views decreases the error of the separate views. A further decrease can be attained by combining the predictions of the two classifiers.

Summary and Conclusion

Nowadays, because of the bulk and rapid growth of Persian texts, automatic classification of documents and texts is of great practical value and is becoming an increasingly important area of research. The present paper reviewed some semi-supervised learning methods for textual documents. Many learning methods, such as the supervised ones, rely only on labeled training data, which are costly to obtain, whereas a great bulk of unlabeled data is available quickly and at low cost. Methods such as unsupervised learning, on the other hand, rely only on unlabeled data. We then surveyed two groups of semi-supervised learning methods, which lie between the supervised and unsupervised methods and use a combination of unlabeled data and a limited amount of labeled data, so that these techniques can be employed to classify Persian texts.

References
[1] Fernando Enríquez, Fermín L. Cruz, F. Javier Ortega, Carlos G. Vallejo, José A. Troyano, A comparative study of classifier
combination applied to NLP tasks, Information Fusion, Volume 14, Issue 3, July 2013, Pages 255-267, ISSN 1566-2535
[2] Asif Ekbal, Sriparna Saha, Simulated annealing based classifier ensemble techniques: Application to part of speech tagging,
Information Fusion, Volume 14, Issue 3, July 2013, Pages 288-300, ISSN 1566-2535
[3] SMR. Hashemi "A Survey of Visual Attention Models" Ciência e Natura, v. 37 Part 2 2015, p. 297−306 ISSN impressa: 0100-
8307 ISSN on-line: 2179-460X.
[4] Limeng Cui, Yong Shi, A Method based on One-class SVM for News Recommendation, Procedia Computer Science, Volume
31, 2014, Pages 281-290, ISSN 1877-0509
[5] Li, Y.H., Jain, A.K. “Classification of text documents”, Computer Journal, 41 (8), 1998, pp. 537-546.
[6] Li, Wei; Lee, Bob; Krausz, Franl and Sahin, Kenan. Text Classification by a Neural Network. In Proceedings of the 23rd Annual
Summer Computer Simulation Conference, 1991, pp. 313-318.
[7] Soucy, P., Mineau, G.W.”A simple KNN algorithm for text categorization”, in Proceedings of the IEEE International
Conference on Data Mining (ICDM), 2001, pp. 647-648.

[8] SMR. Hashemi, M. Zangian, M. Shakeri, and M. Faridpoor, "Survey Article about Image Fuzzy Processing Algorithms." The
Journal of Mathematics and Computer Science, Vol 13, Issue 1 2014, pp 26-40.
[9] Joachims, Thorsten.Text Categorization with Support Vector Machines: Learning with Many Relevant Features. In
Proceedings of the 10th European Conference on Machine Learning, 1998, pp. 137-42.
[10] M. Arabsorkhi, M. Shamsfard, "Unsupervised Discovery of Persian Morphemes," in Proceedings of the 11th Conference of
the European chapter of the Association for Computational Linguistics (EACL), Italy, 2006.
[11] A. Bagheri, H. Farzanehfar, M. H. Soraie, M. R. Ahmadzadeh, "Persian News Text Classification Using Naïve Bayes Algorithm", in 2nd National Conference of Iranian Data Mining, Tehran, 2009.
[12] M. E. Basiri, S. Nemati, N. G. Aqaie, "Comparison of Persian Text Classification Using kNN, fkNN and Feature Selection Algorithms Based on Information Gain and Document Frequency", in 13th Annual Conference of Computer Society of Iran, Tehran, 2008.
[13] SMR. Hashemi, A.Broumandnia. "A Review of Attention Models in Image Protrusion and Object Detection." The Journal of
Mathematics and Computer Science, Vol 15, Issue 4 2015, pp 273-283.
[14] B. Bina, M. Rahgozar, and A. D. Mobad, "Automatic Persian Text Classification", in 13th Annual Conference of Computer Society of Iran, Tehran, 2008.
[15] N. Maqsudi, and M. M. Homayunpur, "A Novel Method for Persian Text Classification Using Semantic Knowledge", in 15th Annual Conference of Computer Society of Iran, Tehran, 2010.
[16] M.S.B. PhridviRaj, C.V. GuruRao, Data Mining – Past, Present and Future – A Typical Survey on Data Streams, Procedia
Technology, Volume 12, 2014, Pages 255-263, ISSN 2212-0173
[17] Mohsen Ramezani, Parham Moradi, Fardin Akhlaghian, A pattern mining approach to enhance the accuracy of collaborative
filtering in sparse data domains, Physica A: Statistical Mechanics and its Applications, Volume 408, 15 August 2014, Pages
72-84, ISSN 0378-4371
[18] SMR. Hashemi, S. Mohammadalipour and A. Broumandnia, " Evaluation and classification new algorithms in Image
Resizing.", International Journal of Mechatronics, Electrical and Computer Technology Vol. 5(18) Special Issue, Dec. 2015,
PP. 2649-2654, ISSN: 2305-0543
[19] Chrystalleni Lazarou, Minas Karaolis, Antonia-Leda Matalas, Demosthenes B. Panagiotakos, Dietary patterns analysis using
data mining method. An application to data from the CYKIDS study, Computer Methods and Programs in Biomedicine,
Volume 108, Issue 2, November 2012, Pages 706-714, ISSN 0169-2607
[20] Edrisi Muñoz, Elisabet Capón-García, José M. Laínez-Aguirre, Antonio Espuña, Luis Puigjaner, Using mathematical knowledge
management to support integrated decision-making in the enterprise, Computers & Chemical Engineering, Volume 66, 4
July 2014, Pages 139-150, ISSN 0098-1354
[21] Zhenhua Wang, Lai Tu, Zhe Guo, Laurence T. Yang, Benxiong Huang, Analysis of user behaviors by mining large network data
sets, Future Generation Computer Systems, Volume 37, July 2014, Pages 429-437, ISSN 0167-739
[22] Seyyed Mohammad Reza. Hashemi, A. Broumandnia, "A New Method for Image Resizing Algorithm via Object Detection."
International Journal of Mechatronics, Electrical and Computer Technology, Vol 5, Issue 16 2015.
[23] Dirk Thorleuchter, Dirk Van den Poel, Anita Prinzie, Mining ideas from textual information, Expert Systems with Applications,
Volume 37, Issue 10, October 2010, Pages 7182-7188, ISSN 0957-4174
[24] Xiaofei Zhou, Yue Hu, Li Guo, Text Categorization based on Clustering Feature Selection, Procedia Computer Science, Volume
31, 2014, Pages 398-405, ISSN 1877-0509
[25] Yao-Tsung Chen, Meng Chang Chen, Using chi-square statistics to measure similarities for text categorization, Expert
Systems with Applications, Volume 38, Issue 4, April 2011, Pages 3085-3090, ISSN 0957-4174
[26] Jan Kalina, Classification methods for high-dimensional genetic data, Biocybernetics and Biomedical Engineering, Volume
34, Issue 1, 2014, Pages 10-18, ISSN 0208-5216
[27] Mohamed Maher Ben Ismail, Hichem Frigui, Unsupervised clustering and feature weighting based on Generalized Dirichlet
mixture modeling, Information Sciences, Volume 274, 1 August 2014, Pages 35-54, ISSN 0020-0255
[28] Masanori Kawakita, Jun’ichi Takeuchi, Safe semi-supervised learning based on weighted likelihood, Neural Networks,
Volume 53, May 2014, Pages 146-164, ISSN 0893-6080
[29] Martin Längkvist, Lars Karlsson, Amy Loutfi, A review of unsupervised feature learning and deep learning for time-series
modeling, Pattern Recognition Letters, Volume 42, 1 June 2014, Pages 11-24, ISSN 0167-8655
[30] Yangchang Zhao, Chapter 10 - Text Mining, In R and Data Mining, edited by Yangchang Zhao, Academic Press, 2013, Pages
105-122, ISBN 9780123969637
[31] M. ZANGIAN, SMR. HASHEMI, F. YAGHMAEE, and E. MOSHTAGH " COMPARATIVE EVALUATION OF FACE RECOGNITION
ALGORITHMS USING AND NONINDIVIDUAL ALGORITHMS, Vol 2, Issue1 2014 , pp 16-19.( Special Online Issue-February
2014)
[32] SMR. Hashemi, M. Kalantari, and M. Zangian, "Giving a New Method for Face Recognition Using Neural Networks.", International Journal of Mechatronics, Electrical and Computer Technology Vol. 4(11), Apr. 2014, pp. 744-761, ISSN: 2305-0543

[33] Mohammad Mahdi Deramgozin, Smr.Hashemi, Azam Bastan Fard "Face Recognition Improvement in Angled Status Using
Invasive Weed Optimization Algorithm And fuzzy System"2016 1st International Conference on New Research Achievements
in Electrical and Computer Engineering (2016).
[34] Smr.Hashemi, A. Broumandnia, Z. Zangian and E. Moshtagh "ANNOTATING THE IMAGES USING BACKGROUND"Indian
Journal of Scientific Research, Special Online Issue-April2014.
[35] A Nouhi, SMR Hashemi, F Yaghmaee, M Zangian " Indexing for PERSIAN Textual Images " Applied Mathematics in
Engineering, Management and Technology 2 (1) (Jan 2014)
[36] M.HajiGhorbani, SMR.Hashemi, B.Minaei-Bidgoli, Shabnam Safari "A Review of Some Semi-Supervised Learning Methods"
IEEE-2016, First International Conference on New Research Achievements in Electrical and Computer Engineering 2016.
[37] MM.Deramgozin, SMR.Hashemi, A.BastanFard, M.HajiGhorbani "Face Recognition Improvement in Angled Status Using
Invasive Weed Optimization Algorithm And fuzzy System" IEEE-2016, First International Conference on New Research
Achievements in Electrical and Computer Engineering 2016.
[38] SMR.Hashemi, MM.Deramgozin, M. Hajighorbani, B.Minaei-Bidgoli "A review of the methods of watermarking in digital
texts" IEEE-2016, First International Conference on New Research Achievements in Electrical and Computer Engineering
2016.

7 pages
Machine Learning
100% (1)
Machine Learning
6 pages
Article 6
No ratings yet
Article 6
6 pages
Differentiating Between Data-Mining and Text-Mining Terminology
No ratings yet
Differentiating Between Data-Mining and Text-Mining Terminology
15 pages
machine2
No ratings yet
machine2
3 pages
282.A Survey On Machine Learning Concept
No ratings yet
282.A Survey On Machine Learning Concept
9 pages
A Comparative Study of K-Means and K-Medoid Clustering For Social Media Text Mining
No ratings yet
A Comparative Study of K-Means and K-Medoid Clustering For Social Media Text Mining
6 pages
tmp6D8D TMP
No ratings yet
tmp6D8D TMP
5 pages
(IJCST-V7I2P2) :azhar Ushmani
No ratings yet
(IJCST-V7I2P2) :azhar Ushmani
4 pages
Ijcst V3i2p17
No ratings yet
Ijcst V3i2p17
5 pages
Adbms Ans
No ratings yet
Adbms Ans
4 pages
The Survey of Data Mining Applications and Feature Scope
No ratings yet
The Survey of Data Mining Applications and Feature Scope
16 pages
Technical Report 2.0
No ratings yet
Technical Report 2.0
8 pages
Implement A Mining Web Document Through New Data Clustering Algorithm PDF
No ratings yet
Implement A Mining Web Document Through New Data Clustering Algorithm PDF
7 pages
DMDW Case Study Finished
No ratings yet
DMDW Case Study Finished
28 pages
Data Mining With Linked Data: Past, Present, and Future: Rohit Beniwal, Vikas Gupta, Manish Rawat, and Rishabh Aggarwal
No ratings yet
Data Mining With Linked Data: Past, Present, and Future: Rohit Beniwal, Vikas Gupta, Manish Rawat, and Rishabh Aggarwal
5 pages
ML
No ratings yet
ML
18 pages
CaseStudy 4203 Content Document 20230410121302PM
No ratings yet
CaseStudy 4203 Content Document 20230410121302PM
7 pages
2565-Article Text-7901-3-10-20231018
No ratings yet
2565-Article Text-7901-3-10-20231018
8 pages
DWDMunit 2
No ratings yet
DWDMunit 2
27 pages
Classification Algorithm in Data Mining: An
No ratings yet
Classification Algorithm in Data Mining: An
6 pages
Introduction to Data Analysis in Qualitative Research
From Everand
Introduction to Data Analysis in Qualitative Research
Asher Shkedi
No ratings yet
Data Mining Techniques
No ratings yet
Data Mining Techniques
8 pages
Machine Learning Algorithms: A Review: Department of CSE, Gautam Buddha University, Greater Noida, Uttar Pradesh, India
No ratings yet
Machine Learning Algorithms: A Review: Department of CSE, Gautam Buddha University, Greater Noida, Uttar Pradesh, India
6 pages
ML Chapter 1
No ratings yet
ML Chapter 1
37 pages
Data Science S (2 Files Merged)
No ratings yet
Data Science S (2 Files Merged)
30 pages
Democratic Co Learning
No ratings yet
Democratic Co Learning
9 pages
40-algorithms-every-data-scientist-should-know-jurgen-weichenberger-huw-kwon
No ratings yet
40-algorithms-every-data-scientist-should-know-jurgen-weichenberger-huw-kwon
39 pages
Outlier Detection in Sensor Data Using Ensemble Learning
No ratings yet
Outlier Detection in Sensor Data Using Ensemble Learning
10 pages
Data Science Process and Machine Learning
No ratings yet
Data Science Process and Machine Learning
6 pages
Machine Learinig Ja Bca 2nd Year Part 1
No ratings yet
Machine Learinig Ja Bca 2nd Year Part 1
10 pages
futureinternet-15-00271
No ratings yet
futureinternet-15-00271
21 pages
Understanding AI Technology
No ratings yet
Understanding AI Technology
20 pages
ChatGPT - Deep Learning vs Machine Learning
No ratings yet
ChatGPT - Deep Learning vs Machine Learning
61 pages
Machine Learning new
No ratings yet
Machine Learning new
41 pages
Final Report
No ratings yet
Final Report
36 pages
Jntuk Machine Learning 3-2 Unit-4
No ratings yet
Jntuk Machine Learning 3-2 Unit-4
32 pages
Lecture04 Graph SVM
No ratings yet
Lecture04 Graph SVM
54 pages
AIML Algorithms and Applications in VLSI Design and Technology
No ratings yet
AIML Algorithms and Applications in VLSI Design and Technology
41 pages
Industry 4.0 Answer Key
No ratings yet
Industry 4.0 Answer Key
14 pages
Artificial Intelligence For Smart Manufacturing Methods Applications And Challenges Kim Phuc Tran download
100% (1)
Artificial Intelligence For Smart Manufacturing Methods Applications And Challenges Kim Phuc Tran download
78 pages
Artificial Intelligence and Business Value: A Literature Review
No ratings yet
Artificial Intelligence and Business Value: A Literature Review
26 pages
Lecture 4
No ratings yet
Lecture 4
21 pages
Complete
No ratings yet
Complete
27 pages
NYT Dataset
No ratings yet
NYT Dataset
16 pages
Unit 3
No ratings yet
Unit 3
13 pages
AI Pastpaper Solve by M.Noman Tariq
No ratings yet
AI Pastpaper Solve by M.Noman Tariq
23 pages
Final Doc1
No ratings yet
Final Doc1
57 pages
The Machine Learning Landscape
No ratings yet
The Machine Learning Landscape
25 pages
s13634-022-00941-9
No ratings yet
s13634-022-00941-9
20 pages
Prayag Report
No ratings yet
Prayag Report
39 pages
Greens Function Homework
100% (1)
Greens Function Homework
5 pages
Basak Pseudo-Label Guided Contrastive Learning For Semi-Supervised Medical Image Segmentation CVPR 2023 Paper
No ratings yet
Basak Pseudo-Label Guided Contrastive Learning For Semi-Supervised Medical Image Segmentation CVPR 2023 Paper
12 pages

Vol. 2(4), Jan. 2016, pp.

A Review of Some Semi-Supervised Learning Methods

Keywords: data-mining, text-mining, learning, semi-supervised, classification

Journal of Knowledge-Based Engineering and Innovation (JKBEI)

2.2. Text Mining

2.3. Difference Between Supervised and Unsupervised Learning

2.3.1. Semi-Supervised Learning Methods

Fig. 2 How clustering is done in unsupervised learning

2.4. Semi-Supervised Clustering Algorithms

2.4.1. Semi-Supervised K-Means Algorithm

2.4.1.1. Selected K-Means Algorithm

2.4.1.2. Constrained K-Means Algorithm

2.4.1.3. COP K-Means Algorithm

2.5. Semi-Supervised Hierarchical Clustering Algorithm

3. Semi-Supervised Classification Algorithms

3.1. Self-Training Learning Algorithm

Fig. 3 The process of the self-training learning algorithm

Fig. 4 The learning process of the KNN algorithm in different iterations

3.2. Co-Training Learning Algorithm

The characteristics of this method are as follows:

Summary and Conclusion
