Research on Data Mining System Based on Artificial Intelligence and Improved Genetic Algorithm
DOI:10.3233/JIFS-189507
IOS Press
Abstract. Due to the explosive growth in data scale, traditional database management technology can no longer meet the need to analyze these data. Data mining is a tool that can process such data effectively. Research on data mining has produced many new concepts and methods, which enrich and improve data mining technology and establish its theoretical system. Association rule mining is an important branch of data mining and one of its most important research fields. Genetic algorithms have been widely used to mine association rules, but traditional genetic algorithms are prone to getting trapped in local optima. Therefore, how to apply an improved genetic algorithm to mine association rules is the key problem addressed in this paper.
Keywords: Artificial intelligence, improved genetic algorithm, genetic BP neural network, data mining
Research on data mining has produced many new concepts and methods, which enrich and improve data mining technology and establish its theoretical system. Association rule mining is an important branch of data mining research. Genetic algorithms have been widely used to mine association rules, but traditional genetic algorithms are prone to getting trapped in local optima. Therefore, applying genetic algorithms more effectively to mine association rules is the key issue addressed in this paper.

2. Related work

Regarding improved genetic algorithms, reference [1] analyzed the convergence behavior of traditional genetic algorithms. The causes are mainly related to the selection operator, the population distribution and the problem allocation. It lists some individuals that can have a significant impact on the convergence of genetic algorithms and combines the concept of the joint role of these individuals and environmental evolution. Reference [2] analyzes traditional genetic algorithms in static optimization through the selection, crossover and mutation operators; the schema theorem cannot guarantee that traditional genetic algorithms follow the optimal approach when solving optimization problems. Reference [3] studies the inherent parallelism of genetic algorithms, summarizes the options for achieving it, and discusses the possible application fields of parallel genetic algorithms. Reference [4] designed a selection operator with control parameters, introduced the concept and mathematical basis of the design, analyzed the relationship between the selection operator and the control parameters, compared the weaknesses of scale adjustment and global convergence in the selection operator, and adopted the “power gradient method” to control the selection operator. Reference [5] puts forward some suggestions for improvement based on the information that the traditional genetic algorithm accumulates in the early stage; this information identifies the direction of evolution and is used as an index to guide chromosome adaptation. Reference [6] developed a modified genetic algorithm. This algorithm proposes a particle swarm optimization method and combines it with a numerically coded genetic algorithm. A chaotic sequence produces the initial population, the selection process uses nonlinear ranking, the progressive method is based on competitive selection, and the genetic algorithm is improved by optimizing the particle swarm to create new individuals. Reference [7] analyzes the particle swarm, synthetic mass, ant colony and artificial immune algorithms. A genetic algorithm combined with the particle swarm algorithm and the artificial immune algorithm forms a hybrid genetic algorithm, whose advantage is that the convergence speed does not easily decline; a test function is used to verify the effectiveness of the particle swarm and artificial immune components. Reference [8] explores the relationship between the crossover and mutation probabilities and individual fitness, and designs crossover and mutation probability functions for individuals. During the whole operation, highly adapted individuals are protected from damage, and tests show good results. Reference [1] improved the formulas for the crossover and mutation probabilities to address their shortcomings and to adapt to changes in these probabilities, so that the mutation probability evolves relatively slowly in the early stage of development. Reference [9] describes other genetic operations and macro-operations of biomes and genetic algorithms, including mutations, dominant genes, and diploid and polyploid structures.

As for research on association rule mining systems, reference [1] focuses on the use of association rule algorithms in large databases. Through analysis and improvement, these algorithms are successfully applied to the mining of massive data. Reference [10] covers the basic concepts and processes of data mining, as well as the usual methods and techniques. Reference [11] describes the programming of association rule algorithms and how to mine association rules by computer. Reference [12] provides detailed information on the use of WEKA software (a set of machine learning algorithms for data mining tasks) for data preprocessing, classification, regression, and clustering. Reference [13] introduces the methods and techniques used in data mining and analyzes these methods in combination with biological data mining cases. Reference [14] provides an overview of new data mining research areas and a detailed description of the data mining process and related data mining methods. Reference [1] studied the mining of association rules with a genetic algorithm, improving the structure of the fitness function and the data coding to avoid premature convergence. Reference [1] proposes a mining algorithm based on a multi-level association genetic algorithm. According
S. Ruifeng / Research on data mining system based on artificial intelligence and improved genetic algorithm 6733
The output layer also uses an S-shaped (sigmoid) activation function, which constrains the data of the output layer. The output of neuron j is:

$$o_j = f(x_j) = f(O W_j) \tag{12}$$

These outputs together form the output vector of the whole network. The error of the network over the P samples and K output neurons is:

$$E = \frac{1}{2P} \sum_{p=1}^{P} \sum_{k=1}^{K} (t_{pk} - o_{pk})^2 \tag{14}$$

When the error of the network is obtained, the error first reaches the output layer and adjusts the weights of the output layer. The mathematical expression of the weight matrix adjustment is as follows:

$$\Delta w_{ij} = lr \cdot \delta_j \cdot oh_i, \qquad w_{ij} = w_{ij} + \Delta w_{ij} \tag{15}$$

$$\delta_j = oo_j (1 - oo_j)(y_j - oo_j)$$

After substitution, the above formula is simplified as follows:

$$W_{jk}(t+1) = W_{jk}(t) + \Delta W_{jk}, \qquad \Delta W_{jk} = -\eta \frac{\partial E}{\partial W_{jk}} \tag{16}$$

$$\delta_k = (t_k - O_k) O_k (1 - O_k)$$

$$\Delta v_{ij} = lr \cdot \delta_j \cdot \chi_i$$

The formula is simplified as follows:

$$W_{ij}(t+1) = W_{ij}(t) + \Delta W_{ij}, \qquad \Delta W_{ij} = -\eta \frac{\partial E}{\partial W_{ij}} = \eta \delta_j x_i \tag{18}$$

$$\delta_j = O_j (1 - O_j) \sum_{k} \delta_k W_{jk}$$

Learning then continues with the next set of sample data and repeats until the network performance is optimal; see Fig. 1 for details.
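The weight updates described in this part can be illustrated with a minimal sketch of one training step for a single sample (so P = 1 in the error formula). It assumes one hidden layer, sigmoid activations in both layers, and no bias terms; the names `train_step`, `W_ih`, `W_ho`, and `eta` are hypothetical and chosen only for this illustration, not taken from the paper.

```python
import math


def sigmoid(x):
    """S-shaped activation function f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def train_step(x, t, W_ih, W_ho, eta=0.5):
    """One backpropagation step for one sample (illustrative assumption).

    x: input vector, t: target vector,
    W_ih[i][j]: weight from input i to hidden neuron j,
    W_ho[j][k]: weight from hidden neuron j to output neuron k.
    Weights are updated in place; the error before the update is returned.
    """
    n_in, n_hid, n_out = len(x), len(W_ih[0]), len(W_ho[0])

    # Forward pass through the hidden and output layers.
    hidden = [sigmoid(sum(x[i] * W_ih[i][j] for i in range(n_in)))
              for j in range(n_hid)]
    out = [sigmoid(sum(hidden[j] * W_ho[j][k] for j in range(n_hid)))
           for k in range(n_out)]

    # Output-layer error term: delta_k = (t_k - O_k) O_k (1 - O_k).
    delta_k = [(t[k] - out[k]) * out[k] * (1.0 - out[k])
               for k in range(n_out)]

    # Hidden-layer error term: delta_j = O_j (1 - O_j) * sum_k delta_k W_jk,
    # computed with the output weights from before the update.
    delta_j = [hidden[j] * (1.0 - hidden[j])
               * sum(delta_k[k] * W_ho[j][k] for k in range(n_out))
               for j in range(n_hid)]

    # Gradient-descent updates: W(t+1) = W(t) + eta * delta * input.
    for j in range(n_hid):
        for k in range(n_out):
            W_ho[j][k] += eta * delta_k[k] * hidden[j]
    for i in range(n_in):
        for j in range(n_hid):
            W_ih[i][j] += eta * delta_j[j] * x[i]

    # Error for this sample: E = 1/2 * sum_k (t_k - o_k)^2.
    return 0.5 * sum((t[k] - out[k]) ** 2 for k in range(n_out))
```

Calling `train_step` repeatedly on the same sample should drive the returned error down, which mirrors the "repeat until the network performance is optimal" loop in the text.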
3.2. Basic principles of genetic algorithms

A GA encodes each possible solution as a vector (a chromosome); each element of the chromosome vector is called a gene. All chromosomes are evaluated according to the objective function, and fitness values are assigned according to their respective characteristics. Starting from a randomly generated set of chromosomes, their fitness is calculated, and selection, crossover, and mutation are applied, eliminating low-fitness chromosomes and retaining high-fitness ones; in general, the new chromosome group is better adapted than the group that generated it. This process repeats until the optimization goal is achieved. Fig. 2 illustrates the basic principles of genetic algorithms; see Fig. 2 for details.

(1) Selection operator: The basic operations of genetic algorithms are selection, crossover, and mutation, also known as operators. The function of the selection operator is to determine whether an individual is eliminated or retained in the next generation, so that the best parents are selected according to their merit. In general, three types of selection are most common in the field.

Let the population size be N, the fitness of individual i be f(i), the probability of selecting individual i be P_i, and the cumulative probability be Q_i. The cumulative probability is compared with a random number drawn uniformly from [0, 1] to determine which individual is replicated in the next generation.

$$P_i = \frac{f(i)}{\sum_{k=1}^{N} f(k)}, \qquad Q_i = \sum_{j=1}^{i} P_j \quad (j = 1, 2, \cdots, i) \tag{19}$$

Therefore, the probability reflects the proportion of the individual's fitness in the fitness of the whole group: the greater the individual's fitness, the greater its chance of selection; conversely, the lower the fitness, the lower the chance of selection.

(2) Crossover operator: Crossover, also called “recombination and pairing”, takes place between two paired chromosomes, which exchange some of their genes in one way or another, resulting in two new chromosomes. The effectiveness of genetic algorithms mainly comes from the selection and crossover operations; crossover plays a central role and determines the overall search ability of genetic algorithms.

First, two strings of the parent generation are selected at random; then the crossover point is determined at random. If the length of the string is L, there are L − 1 possible crossover points. For the result, see Fig. 3 for details.
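The selection and crossover operators described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: fitness values are positive, chromosomes are equal-length strings, and the function names (`selection_probabilities`, `roulette_select`, `single_point_crossover`) are hypothetical. The selection scheme shown is fitness-proportional ("roulette wheel") selection, which matches the P_i and Q_i definitions in Eq. (19).

```python
import random
from itertools import accumulate


def selection_probabilities(fitness):
    """Return (P, Q): selection probabilities and cumulative probabilities."""
    total = sum(fitness)
    p = [f / total for f in fitness]   # P_i = f(i) / sum of all f
    q = list(accumulate(p))            # Q_i = P_1 + ... + P_i
    return p, q


def roulette_select(population, fitness, rng):
    """Pick one individual by comparing a random number with Q_i."""
    _, q = selection_probabilities(fitness)
    r = rng.random()                   # uniform random number in [0, 1)
    for individual, q_i in zip(population, q):
        if r <= q_i:
            return individual
    return population[-1]              # guard against rounding in Q_N


def single_point_crossover(parent_a, parent_b, rng):
    """Exchange the tails of two equal-length strings at a random point."""
    length = len(parent_a)
    if length != len(parent_b) or length < 2:
        raise ValueError("parents must have the same length of at least 2")
    point = rng.randint(1, length - 1)  # one of the L - 1 crossover points
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b
```

For example, with fitness values `[1, 3, 4, 2]` the selection probabilities are 0.1, 0.3, 0.4 and 0.2, so the third individual is the most likely to be copied into the next generation. Passing a seeded `random.Random` instance as `rng` makes runs reproducible.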
Table 4
Table of correspondence between arrays and attributes

Array      A[1]   A[2]  A[3]  A[4]  A[5]  A[6]
Attribute  month  temp  RH    wind  rain  Area
Fig. 8. Breakdown2.

5.1.7. Extraction and evaluation of rules

According to the algorithm described above, the association rules are mined as follows; see Table 5 for details.

Table 5
Selected association rules mined

Rule code  Support  Confidence  Interest
002010     61.7%    100%        1.01
320011     16.4%    51%         1.07
332210     13.7%    51%         1.65
002122     11.9%    54%         0.94
320210     17.9%    56%         0.97
300013     11.6%    98%         1.00
302110     17.8%    84%         1.17
...        ...      ...         ...

If the average fitness of adjacent generations is lower than a certain level, the flow of the improved simulated annealing genetic algorithm described above applies; see Fig. 5 for details.
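The three parameters reported for each rule in Table 5 can be computed as in the sketch below. The paper does not define "interest" in this section, so the code assumes the common definition of interest as lift (confidence divided by the relative frequency of the consequent); this is an assumption, as are the function name `rule_metrics` and the representation of each record as a set of items.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, interest) for the rule A -> B.

    transactions: iterable of item collections (one per record).
    Interest is computed as lift, which is an assumption of this sketch.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    records = [set(t) for t in transactions]
    n = len(records)
    if n == 0:
        raise ValueError("no transactions given")

    n_a = sum(1 for r in records if antecedent <= r)     # records containing A
    n_b = sum(1 for r in records if consequent <= r)     # records containing B
    n_ab = sum(1 for r in records if (antecedent | consequent) <= r)

    support = n_ab / n                                   # P(A and B)
    confidence = n_ab / n_a if n_a else 0.0              # P(B | A)
    interest = confidence / (n_b / n) if n_b else 0.0    # P(B | A) / P(B)
    return support, confidence, interest
```

Under this definition, an interest value above 1 indicates that the antecedent makes the consequent more likely than its base rate, a value near 1 indicates near independence, and a value below 1 indicates a negative association.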
(NO: 2019GG350); Project Supported by Basic Foundation of Inner Mongolia Agricultural University (NO: JC2013001).

References

[1] J. Rafferty, et al., Automatic summarization of activities depicted in instructional videos by use of speech analysis. In: Pecchia L et al (eds.) Ambient assisted living and daily activities. Lecture notes in computer science. Springer, New York 35(8) (2014), 123–130.
[2] J. Rafferty, et al., NFC based provisioning of instructional videos to assist with instrumental activities of daily living. In: 2014 36th annual international conference of the IEEE engineering in medicine and biology society, EMBC 56(8) (2014), 4131–4134.
[3] J. Rafferty, L. Chen et al., Goal lifecycles and ontological models for intention based assistive living within smart environments, Comput Syst Sci Eng 30(1) (2015), 7–18.
[4] J. Rafferty, C. Nugent, et al., Automatic metadata generation through analysis of narration within instructional videos, J Med Syst 39(9) (2015), 1–7.
[5] A.H. Shabani, J.S. Zelek and D.A. Clausi, Multiple scale-specific representations for improved human action recognition, Pattern Recognit Lett 34(15) (2013), 1771–1779.
[6] H. Yang and C. Meinel, Content based lecture video retrieval using speech and video text information, IEEE Trans Learn Technol 7(2) (2014), 142–154.
[7] J.I. Ababneh and M.H. Bataineh, Linear phase FIR filter design using p swarm optimization and genetic algorithms, Digital Signal Process 18(4) (2008), 657–668.
[8] M.A.E. Aziz, A.A. Ewees and A.E. Hassanien, Multi-objective whale optimization algorithm for content-based image retrieval, Multimed Tools 77(4) (2018), 26135–26172.
[9] M.A.E. Aziz and A.E. Hassanien, Modified cuckoo search algorithm with rough sets for feature selection, Neural Comput 29(4) (2018), 925–934.
[10] G. Boqing, Y. Wang, J. Liu and X. Tang, Automatic facial expression recognition on a single 3D face by exploring shape deformation, In: Proc. 17th ACM Int. Conf. Multimed 58(6) (2009), 569–572.
[11] I. Buciu, C. Kotropoulos and I. Pitas, ICA and gabor representation for facial expression recognition, In: Proceedings International Conference on Image Processing 89(5) (2003), 855–858.
[12] H.T.Y. Chang, Facial expression recognition using a combination of multiple facial features and support vector machine, Soft Comput 22(2) (2017), 4389–4405.
[13] S.C. Chu, P.W. Tsai and J.S. Pan, Cat Swarm Optimization, LNAI 3(1) (2006), 854–858.
[14] M.J. Cossetin, J.C. Nievola and A.L. Koerich, Facial expression recognition using a pairwise feature selection and classification approach, In: 2016 international joint conference on Neural networks (IJCNN), IEEE 63(1) (2016), 5149–5155.
[15] A.A. Ewees, M.A. EL Aziz and A.E. Hassanien, Chaotic multi-verse optimizer-based feature selection, Neural Comput Appl 31(4) (2019), 991–1006.
[16] X. Fan and T. Tjahjad, A dynamic framework based on local zernike moment and motion history image for facial expression recognition, Pattern Recognit 64(9) (2017), 399–406.
[17] C. Fuentes, V. Herskovic, I. Rodríguez, et al., A systematic literature review about technologies for self-reporting emotional information, J Ambient Intell Human Comput 8(3) (2017), 593–606.
[18] R. Gross, I. Matthews, J. Cohn, T. Kanade and S. Baker, Multi-PIE. In: 8th IEEE International Conference on automatic face & gesture recognition, Amsterdam 46(2) (2008), 1–8.
[19] S.L. Happy, S. Member and A. Routray, Automatic facial expression recognition using features of salient facial patches, IEEE Trans Affect Comput 6(4) (2015), 1–12.
[20] H. Sikkandar and R. Thiyagarajan, Soft biometrics-based face image retrieval using improved grey wolf optimization, IET Image Process 14(3) (2020), 451–461.
[21] H.R. Kanan, K. Faez and M. Hosseinzadeh, Face recognition system using ant colony optimization-based selected features. In: Proceedings of the 2007 IEEE symposium on computational intelligence in security and defense applications (CISDA 2007), IEEE 62(5) (2007), 57–62.
[22] N. Karaboga, A new design method based on artificial bee colony algorithm for digital IIR filters, J Frankl Inst 346(4) (2009), 328–348.
[23] V. Kazemi and J. Sullivan, One millisecond face alignment with an ensemble of regression trees, In: 2014 IEEE conference on computer vision and pattern recognition 43(9) (2014), 1867–1874.
[24] A. Krizhevsky, I. Sutskever and G.E. Hinton, Image net classification with deep convolutional neural networks, Adv Neural Inf Process Syst 82(6) (2012), 1097–1105.
[25] L. Pappula and D. Ghosh, Cat swarm optimization with normal mutation for fast convergence of multimodal functions, Appl Soft Comput 66(2) (2018), 473–491.
[26] Y. LeCun, Y. Bengio and G. Hinton, Deep learning, Nature 521(3) (2015), 436–444.
[27] P. Lucey, J.F. Cohn, T. Kanade, J. Saragih, Z. Ambadar and I. Matthews, The extended Cohn–Kanade Dataset (CK+): a complete dataset for action unit and emotion-specified expression, IEEE Comput Soc Conf Comput Vision Pattern Recogn 26(7) (2010), 1325–1338.
[28] M. Lyons, M. Kamachi and J. Gyoba, The Japanese Female Facial Expression (JAFFE) Database, Zenodo 10(5) (1998), 235–249.
[29] A. Mehrabian, Communication without words, Psychol Today 2(4) (1968), 53–56.
[30] M. Minsky and S. Papert, Perceptrons: an introduction to computational geometry, MIT Press, Cambridge 78(3) (1969), 780–782.