Dynamic gesture recognition based on deep learning in human-to-computer interfaces

The document presents a deep learning approach for dynamic gesture recognition in human-computer interfaces, addressing limitations of traditional methods that rely on manual feature extraction. It utilizes an improved inverted residual network architecture based on the SSD (Single Shot MultiBox Detector) for efficient feature extraction and employs transfer learning to optimize the model. Experimental results demonstrate that the proposed method effectively recognizes various gestures quickly and accurately.


Journal of Applied Science and Engineering, Vol. 23, No. 1, pp. 31-38 (2020) DOI: 10.6180/jase.202003_23(1).0004

Dynamic Gesture Recognition Based on Deep Learning in Human-to-Computer Interfaces
Jing Yu1, Hang Li2*, Shou-Lin Yin2*, Qingwu Shi4* and Shahid Karim3
1 Luxun Academy of Fine Arts, Shenyang 110034, P.R. China
2 Software College, Shenyang Normal University, Shenyang 110034, P.R. China
3 Institute of Image and Information Technology, Harbin Institute of Technology, Harbin 150000, P.R. China
4 College of Information Science & Electronic Technique, Jiamusi University

Abstract
Gesture recognition provides a faster, simpler, more convenient, more effective and more natural
way of human-computer interaction, and it has attracted wide attention. Gesture recognition plays an
important role in real life. The manual feature extraction in traditional gesture recognition methods is
time-consuming and laborious. Moreover, to improve recognition accuracy, both the quantity and the
quality of the extracted features must be very high, which is a bottleneck for traditional gesture
recognition methods. Therefore, we propose a deep learning method for dynamic gesture recognition in
Human-to-Computer interfaces. An improved inverted residual network architecture is used as the base
network of SSD (Single Shot MultiBox Detector) for feature extraction. The prediction convolution
structure of the auxiliary layer uses the inverted residual structure combined with dilated
convolution. It uses multi-scale information, which reduces the amount of calculation and the number
of parameters. Transfer learning is used to optimize the trained network model so as to reduce the
training time and help the model converge. Finally, experimental results show that the proposed
method can recognize different gestures quickly and effectively.

Key Words: Gesture Recognition, Deep Learning, Human-to-Computer Interfaces, Feature Extraction

*Corresponding author. E-mail: [email protected]; [email protected]; [email protected]

1. Introduction
A Human-to-Computer interface is the process by which people exchange information with computers
in a certain way [1]. Recently, gesture interaction based on computer vision has become a research
hotspot in the field of Human-to-Computer interfaces because the required equipment is simple and
convenient. Traditional feature-based gesture recognition methods, such as those built on HOG and
SIFT, have low recognition accuracy, and they have difficulty identifying multiple gesture targets
in one image [2,3].

Molchanov [4] proposed an algorithm for drivers' hand gesture recognition from challenging depth
and intensity data using 3D convolutional neural networks. This method combined information from
multiple spatial scales for the final prediction. It also employed spatio-temporal data
augmentation for more effective training and to reduce potential overfitting. Wilson [5] extended
the standard hidden Markov model (HMM) method of gesture recognition by including a global
parametric variation in the output probabilities of the HMM states. Using a linear model of
dependence, it formulated an expectation-maximization (EM) method for training the parametric HMM.
Caramiaux [6] presented a gesture recognition/adaptation system for human-computer interaction
applications that, as a complement to gesture labeling, characterized the movement execution. It
described a template-based recognition method that simultaneously aligned the input gesture
to the templates using a Sequential Monte Carlo inference technique. Many other approaches have
also been proposed to detect gestures. However, problems such as low efficiency and long
processing time remain.

A deep learning model is a complex, multi-layer artificial neural network. Deep learning models
have strong nonlinear modeling ability and use a general learning process to learn features from
data. Compared with traditional hand-crafted features, a deep learning model can express
higher-level and more abstract internal features [7-9].

The deep convolutional neural network (CNN) is an effective method for image feature extraction.
Because of its invariance to translation and rotation of image information, it has become a
popular method in the field of image processing and target recognition. At present, most research
on gesture recognition focuses on a single hand. In the process of gesture interaction, however,
two-handed operation and the hands of other people often occur. For gesture recognition of
multiple hands, this paper proposes a dynamic gesture recognition method based on a deep
convolutional neural network. Our contributions are as follows:
1. Features are extracted by an improved inverted residual network architecture based on SSD.
2. The prediction convolution structure of the auxiliary layer uses the inverted residual
structure combined with dilated convolution and multi-scale information, which reduces the amount
of calculation and the number of parameters.
3. Transfer learning is used to optimize the trained network model so as to reduce the training
time and help the model converge.
4. Experimental results show that the proposed method can recognize different gestures quickly
and effectively.

The rest of this paper is organized as follows. In the next section, we introduce the proposed
SSD method for gesture recognition in detail. Then, we give experiments and analysis in
Section 3. A conclusion is drawn in Section 4.

2. Gesture Recognition Model in Deep Learning
The main methods of object recognition based on deep convolutional neural networks are RCNN, Fast
RCNN, Faster RCNN and SSD [10-13]. On the PASCAL VOC data set, the object recognition rate of
Faster RCNN was 73.2%, and 7 frames were recognized per second. The recognition rate of the SSD
method was 72.1%, and 58 frames were recognized per second. The recognition rate of Faster RCNN
was thus slightly higher than that of SSD, but its recognition speed was much lower. The
recognition rate of the YOLO method was 63.4%, and it could recognize 45 frames per second. Its
recognition speed was similar to that of the SSD method, but its recognition rate was
significantly lower than that of SSD. In this paper, a modified SSD (MSSD) model is selected as
the recognition model.

2.1 SSD Network Structure
The SSD target detection model does not require time-consuming region generation and feature
re-sampling steps. By directly convolving the whole image and predicting the category and the
corresponding coordinates of the objects contained in the image, the detection speed is greatly
improved. Meanwhile, the accuracy of target detection is greatly improved by using small
convolution kernels and multi-scale prediction.

The SSD network structure is divided into a base network and an auxiliary network. The base
network is a network with high accuracy in the field of image classification, with its
classification layer removed. The auxiliary network is a convolutional structure added on top of
the base network for target detection. The size of these layers gradually decreases so that
multi-scale prediction can be made. Each added auxiliary layer produces a fixed set of
predictions through a series of convolution kernels. For an m × n × p feature layer (p is the
number of channels, m and n are the spatial size), each auxiliary layer uses a 3 × 3 × p
convolution kernel to produce the score for one class, and it predicts the corresponding values
at all m × n positions.

The SSD model predicts k bounding boxes at each position of the feature map. At the same time,
the score of an object appearing at this position and the offset of the object position relative
to the bounding box are predicted. Thus, with c denoting the number of categories, c × k scores
and 4k position offsets are predicted at each position of the feature map.
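The per-position counts described above can be checked with a small calculation. The following is only a sketch: the function names are hypothetical, and the example values (c = 5 categories, k = 6 boxes per position, a 19 × 19 feature map) are assumptions chosen for illustration, not settings reported for the network.

```python
def outputs_per_position(num_classes: int, boxes_per_position: int) -> int:
    """Values predicted at one feature-map position: c*k scores plus 4*k offsets."""
    class_scores = num_classes * boxes_per_position
    box_offsets = 4 * boxes_per_position
    return class_scores + box_offsets


def outputs_per_feature_map(rows: int, cols: int, num_classes: int,
                            boxes_per_position: int) -> int:
    """Values predicted over a whole m x n feature map."""
    return rows * cols * outputs_per_position(num_classes, boxes_per_position)


if __name__ == "__main__":
    # Assumed example values: c = 5 categories, k = 6 boxes, a 19 x 19 feature map.
    print(outputs_per_position(5, 6))
    print(outputs_per_feature_map(19, 19, 5, 6))
```

Because the count grows linearly in k and in the number of positions, the larger early feature maps contribute most of the predictions.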
For a feature map of size m × n, it predicts (c + 4) × k × m × n outputs. Finally, non-maximum
suppression is applied to obtain the final predicted object categories and positions in the
image.

2.2 Modified SSD Network Structure
The SSD model uses the VGG network as the base network. However, the VGG model has a large number
of parameters and occupies most of the running time in the process of feature extraction. In
addition, during forward propagation, the nonlinear transformations always cause a loss of
information.

Shen [14] put forward the nonlinear activation function ReLU based on manifold learning theory.
In a high-dimensional space ReLU retains information well, whereas in a low-dimensional space it
causes a greater loss of information. Therefore, the input layer should increase the feature
dimension before the nonlinear transformation, and the output layer should use a linear
activation function when reducing the feature dimension, so as to reduce the loss of
information. This is the motivation for the inverted residual block.

The down-sampling operation in the inverted residual structure causes a loss of feature
information while increasing the receptive field of the convolution kernel. Therefore, we abandon
the down-sampling operation in the convolution structure and introduce dilated convolution to
solve this problem. Dilated convolution adds a dilation parameter to the original convolution
operation. It expands the convolution kernel to the corresponding scale and fills the unused
positions of the expanded kernel with zeros. Dilated convolution can thus increase the receptive
field of the convolution kernel without down-sampling. However, dilated convolution makes the
data sampled by the convolution kernel discontinuous, so small objects are not identified well.
This paper uses hierarchical feature fusion to solve the problems caused by the introduction of
dilated convolution.

Hierarchical feature fusion sums the outputs of the convolution units in the dilated convolution
layer, and the resulting sums are concatenated to give the final output.

The inverted residual structure adopts ReLU6 as the activation function, whose output is

Y = min(max(X, 0), 6)    (1)

where Y is the output of the ReLU6 activation function and X is the input feature value.

Compared with ReLU, ReLU6 is more robust in low-precision computing scenarios. In addition,
3 × 3 convolution kernels are used. Dropout and batch normalization are used during training to
reduce overfitting. The improved inverted residual structure is shown in Figure 1, where Dilated
denotes the dilated convolution, Linear is the linear activation function, HFF represents the
hierarchical feature fusion, and Dwise represents a depthwise separable convolution structure.

Figure 1. Improved inverted residual network.

Combining the improved inverted residual structure, we modify the base layer and the auxiliary
layer in the SSD model. (1) The original SSD uses the VGG network as the base layer for feature
extraction, but the VGG model is not suitable for deployment on mobile devices. The inverted
residual network MobileNetV2, which has fewer parameters, a smaller footprint and a higher
running speed, is therefore used as the SSD feature extraction network to reduce the model size
and the amount of calculation. (2) The SSD auxiliary layer uses a traditional convolutional
structure, which leads to a large number of parameters and a large amount of calculation. With
the improved structure as the basic structure of the auxiliary layer, the improved auxiliary
network layer reduces the information loss caused by the nonlinear transformation in the
learning process, and its convolution kernels have a multi-scale receptive field.

2.3 Loss Function in MSSD Network Structure
Generating the recognition box in the MSSD model is a regression process, and judging the
category within the recognition box is a classification process. The total objective loss
function is the weighted sum of the position loss (loc) and the confidence loss (conf):

L(c, l, g) = (1 / N) (L_conf(c) + α L_loc(l, g))    (2)

where N is the number of default boxes matched to the real boxes, and α = 1 is the weight term
chosen according to the actual experiments. L_conf(c) is the Softmax cross-entropy
classification loss, and c is the confidence of each category. In L_loc(l, g),
l = (l_x, l_y, l_w, l_h) denotes the predicted center (x, y), width (w) and height (h) of the
box, and g = (g_x, g_y, g_w, g_h) represents the true center position (x, y), width (w) and
height (h).

[Equation (3)]

where

[Equation (4)]

3. Experiments on Gesture Recognition

3.1 Data Set Analysis
In order to train the MSSD model, a gesture image data set taken from the first-person
perspective is used. The experiment adopts the EgoHands gesture data set created by Indiana
University [15]. The EgoHands images were shot with the wearable device Google Glass while two
people interacted with each other, seen from the first-person perspective. The data set contains
4800 images, and each image contains 4 categories: the wearer's own left hand (owlh), the
wearer's own right hand (owrh), the opposite person's left hand (oplh) and the opposite person's
right hand (oprh). Each image is labeled with the gesture region positions of the 4 categories,
as shown in Figure 2.

Figure 2. Samples in EgoHands.

The training set, verification set and test set used for the MSSD model are shown in Table 1.

Table 1. Data sets used in this paper

Data set          Description                                             Number
Training set      Multiple scenes, multiple people, multiple activities   2500
Verification set  Same as above                                           500
Test set          Same as above                                           1000

3.2 Evaluation Index
In this paper, we adopt the following evaluation indexes to analyze the effectiveness of the
proposed model.
1. IoU (intersection over union) is defined as the ratio of the intersection to the union of the
areas occupied by two boxes [16]:

IoU = area(P ∩ GT) / area(P ∪ GT)    (5)

where P is the predicted box and GT is the ground truth.
2. Precision and recall are two well-known quantitative indexes. The gesture recognition model
classifies the contents of the identified boxes, predicts the probability of each of the four
gesture categories, and takes the most likely category as the classification result.

Precision = TP / (TP + FP)    (6)

Recall = TP / (TP + FN)    (7)

F-score = 2 × Precision × Recall / (Precision + Recall)    (8)

where TP is the number of correctly detected gestures.
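As a rough illustration of the evaluation indexes in this section, the sketch below computes the IoU of two axis-aligned boxes and the precision, recall and F-score from detection counts. The box format (x_min, y_min, x_max, y_max), the function names, and the choice of the F-score as the harmonic mean of precision and recall are assumptions made for this example; tp, fp and fn stand for correct detections, wrong detections and missed gestures.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(pred: Box, truth: Box) -> float:
    """Intersection over union of a predicted box and a ground-truth box."""
    inter_w = max(0.0, min(pred[2], truth[2]) - max(pred[0], truth[0]))
    inter_h = max(0.0, min(pred[3], truth[3]) - max(pred[1], truth[1]))
    inter = inter_w * inter_h
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_truth = (truth[2] - truth[0]) * (truth[3] - truth[1])
    union = area_pred + area_truth - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and their harmonic mean from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    # Two 2 x 2 boxes overlapping in a 1 x 1 square.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    # Hypothetical counts, not results from the paper.
    print(precision_recall_f(tp=90, fp=10, fn=30))
```

A detection would typically be counted as correct only when its IoU with a ground-truth box of the same class exceeds a chosen threshold, which is how the IoU setting influences the other three indexes.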
In Equations (6)-(8), FP is the number of detections assigned to another posture, and FN is the
number of missed gestures. The F-score balances Precision and Recall; the closer it is to 1, the
better the model.
3. mAP (mean Average Precision) is an index that reflects the global performance of the model.

[Equation (9)]

3.3 Fine-tuning Network and Transfer Learning
We first verify the effect of IoU on the recognition accuracy of the proposed method. In
Figure 3, the blue bar is the accuracy rate of recognition and the red bar is the error rate of
recognition. When IoU = 0.3, the recognition rate is high, but the error rate is high too. When
IoU = 0.6, the result is similar to that of IoU = 0.9, but IoU = 0.9 needs more time to process
one image. Therefore, we choose IoU = 0.6 in this paper.

Figure 3. Effect of IoU on recognition.

For the gesture recognition problem in the gesture interaction process, the parameters of the
MSSD model are changed. The VGG-16 recognition model trained on the PASCAL VOC dataset is used to
initialize the parameters of the base network in the MSSD model. The first two layers are fixed
and do not participate in back propagation. The targets to be identified are divided into four
categories plus one background category, so the total number of categories is set to 5. The
maximum number of recognition results per frame is set to 4, and the maximum number of
recognition results per class is set to 1. This setting shows only the most likely recognition
result for each gesture class, which greatly reduces false recognition in each class. Training
and testing of the MSSD model use the Caffe deep learning framework, and the graphics card is an
NVIDIA GTX 1060. The original image size of the EgoHands dataset is 1240 × 720 pixels, which is
adjusted to 600 × 600 during training. The training strategy is shown in Table 2. In this paper,
fine-tuning and transfer learning are used to improve the MSSD model.

Table 2. Parameters in SSD model

Name                             Value
Size                             600 × 600 pixels
Learning rate                    10^-4
Forgetting rate (momentum)       0.9
Weight decay                     5 × 10^-4
Image number in each iteration   3
Iteration number                 64000

The size of the input image and the size of the feature map with the true box affect the
recognition accuracy of the MSSD model [17]. An added BN layer also affects the recognition
accuracy of the deep learning model. This experiment fine-tunes the MSSD model structure.

In the experiment, the size of the input image is adjusted from 1240 × 720 to 600 × 600 and to
300 × 300, and the trained models are denoted as MSSD6 and MSSD3, respectively. In addition,
default boxes are added at each pixel of the Conv3×3 layer extracted from the VGG-16 base
network. The Conv3×3 layer is also introduced into the calculation of the loss function and the
back propagation process of box recognition, and the training result is the MSSD+Conv3 model.
The results are shown in Table 3 and Figure 4.

Table 3. mAP results with different models

Model          mAP/%   Average recognized images per second
MSSD6          91.3    10
MSSD3          89.5    12
MSSD + Conv3   84.9    9
MSSD + BN      70.8    5

Figure 4. Effect of different models on mAP.

Transfer learning means that a learning algorithm can use the commonalities among different
learning tasks to share statistical strength and transfer knowledge between tasks. Transfer
learning can shorten the training time and improve the recognition rate of the model.

Bambach [18] proposed a model for EgoHands gesture recognition based on the Caffenet network. In
the experiment, the base network in the MSSD model was changed accordingly, and then the
parameters of the Caffenet model and of the residual network model (Resnet) were transferred to
the MSSD model for training.

In the experiment, the MSSD model is adjusted by replacing the VGG base network with the top-5
layer network of the Caffenet model. Then, the parameters of the Caffenet model in [18] are
transferred to the base network in the MSSD model to initialize it, and the network is trained.
The training result is the transfer Caffenet model. In addition, when the parameters of the
Caffenet model (top-5 layer structure) are fixed and do not participate in back propagation, the
training result is the transfer Caffenet top-5 model.

The base network of the MSSD model is also changed from VGG to a residual network with 101
layers. The parameters of the residual network trained on the PASCAL VOC data set are
transferred to the base network of the MSSD model to initialize it. Then the network is trained,
and the training result is the transfer Resnet101 model. The residual network is relatively
complex. In order to shorten the training time, the size of the image is adjusted to 256 × 256
for training. The training process of each transfer learning model is shown in Figure 5.
Table 4 gives the mAP results with different transfer learning methods.

Figure 5. Effect of different transfer learning models on mAP.

Table 4. mAP results with different transfer learning models

Model                     mAP/%
MSSD6                     92.6
Transfer Caffenet top-5   91.7
Transfer Caffenet         86.2
Transfer Resnet 101       73.4

3.4 Comparison Experiment
We conduct comparison experiments with four other state-of-the-art dynamic gesture recognition
methods: RPS [19], GRM [20], FMCW [21] and LSPD [22]. The results are shown in Table 5 and
Table 6. Figure 6 displays the mAP value of the four hands, and Figure 7 presents some gesture
recognition results of our proposed method.

Table 5. Comparison results with different methods

Method   Hand   Precision   Recall   F-score
RPS      owlh   91.73       87.58    89.54
RPS      owrh   92.77       86.31    89.64
RPS      oplh   90.88       85.23    87.46
RPS      oprh   91.25       87.37    89.46
GRM      owlh   92.54       88.71    90.58
GRM      owrh   94.63       90.28    92.86
GRM      oplh   92.86       83.77    88.67
GRM      oprh   93.78       89.65    92.07
FMCW     owlh   93.18       89.67    90.24
FMCW     owrh   94.71       90.58    92.45
FMCW     oplh   93.14       84.65    87.56
FMCW     oprh   94.87       88.56    92.15
LSPD     owlh   94.38       90.84    91.54
LSPD     owrh   95.37       91.57    92.84
LSPD     oplh   93.94       85.72    89.41
LSPD     oprh   95.88       90.63    93.72
MSSD6    owlh   95.21       91.62    93.79
MSSD6    owrh   96.42       91.08    94.14
MSSD6    oplh   94.83       90.58    93.18
MSSD6    oprh   96.88       91.27    94.27

Table 6. mAP results with different methods

Model   mAP/%
RPS     78.6
GRM     84.3
FMCW    84.8
LSPD    85.2
MSSD6   88.9

Figure 6. Four hands' mAP value.

Figure 7. Part of the results: left segment result, right recognition result.

Our proposed MSSD method achieves better results on the recognition of all four hands in terms
of the mAP. When crossed hands cover a large area, the recognition result is not ideal, but the
method is still more effective than the other methods.

Using the wearable device, we record a first-person video, and 100 video frames are randomly
selected as test images. The mAP obtained with the trained MSSD6 model is 93.2%, and it can
recognize 20 pictures per second. The dynamic gesture recognition performance is good.

4. Conclusion
In this paper, we propose a modified deep learning model for dynamic gesture recognition in
Human-to-Computer Interfaces. Multiple gestures in an image can be recognized at the same time,
and the average mAP of gesture recognition with the proposed MSSD6 model is larger than 90
percent. The model can be used in real-time recognition based on visual gesture interaction.
Experiments show that the method in this paper can quickly and accurately recognize multiple
hands in video. In the future, we will design a more advanced CNN to improve the accuracy of
gesture recognition.

References
[1] Yang, C., J. Long, M. A. Urbin, et al. (2018) Real-time myocontrol of a human–computer interface by paretic muscles after stroke, IEEE Transactions on Cognitive & Developmental Systems 10(4), 1126-1132. doi: 10.1109/TCDS.2018.2830388
[2] Mert, A., and A. Akan (2018) Emotion recognition from EEG signals by using multivariate empirical mode decomposition, Pattern Analysis & Applications 21(1), 81-89. doi: 10.1007/s10044-016-0567-6
[3] Yu, J., H. Li, and S. L. Yin (2019) New intelligent interface study based on K-means gaze tracking, International Journal of Computational Science and Engineering 18(1), 12-20. doi: 10.1504/IJCSE.2019.096971
[4] Molchanov, P., S. Gupta, K. Kim, et al. (2015) Hand gesture recognition with 3D convolutional neural networks, Computer Vision & Pattern Recognition Workshops. doi: 10.1109/CVPRW.2015.7301342
[5] Wilson, A. D., and A. F. Bobick (1999) Parametric hidden Markov models for gesture recognition, IEEE Transactions on Pattern Analysis & Machine Intelligence 21(9), 884-900. doi: 10.1109/34.790429
[6] Caramiaux, B., N. Montecchio, and A. Tanaka (2014) Adaptive gesture recognition with variation estimation
for interactive systems, ACM Transactions on Interactive Intelligent Systems 4(4), 1-34. doi: 10.1145/2643204
[7] Gao, J., P. Li, and Z. K. Chen (2019) A canonical polyadic deep convolutional computation model for big data feature learning in Internet of Things, Future Generation Computer Systems. doi: 10.1016/j.future.2019.04.048
[8] Lin, T., H. Li, and S. L. Yin (2018) Modified pyramid dual tree direction filter-based image de-noising via curvature scale and non-local mean multi-grade remnant filter, International Journal of Communication Systems 31(16). doi: 10.1002/dac.3486
[9] Yin, S. L., and J. Bi (2019) Medical image annotation based on deep transfer learning, Journal of Applied Science and Engineering 22(2), 385-390. doi: 10.6180/jase.201906_22(2).0020
[10] Yin, S. L., Y. Zhang, and S. Karim (2018) Large scale remote sensing image segmentation based on fuzzy region competition and Gaussian mixture model, IEEE Access 6, 26069-26080. doi: 10.1109/ACCESS.2018.2834960
[11] Yin, S. L., Y. Zhang, and S. Karim (2019) Region search based on hybrid CNN in optical remote sensing images under cloud computing environment, International Journal of Distributed Sensor Networks 15(5). doi: 10.1177/1550147719852036
[12] Ren, S., K. He, R. Girshick, et al. (2017) Faster R-CNN: towards real-time object detection with region proposal networks, IEEE Transactions on Pattern Analysis & Machine Intelligence 39(6), 1137-1149. doi: 10.1109/TPAMI.2016.2577031
[13] Li, J., H. C. Wong, S. L. Lo, et al. (2018) Multiple object detection by deformable part-based model and R-CNN, IEEE Signal Processing Letters PP(99):1-1. doi: 10.1109/LSP.2017.2789325
[14] Shen, J., J. Bu, B. Ju, et al. (2012) Refining Gaussian mixture model based on enhanced manifold learning, Neurocomputing 87(1), 19-25. doi: 10.1016/j.neucom.2012.01.029
[15] Bambach, S., S. Lee, D. J. Crandall, et al. (2015) Lending A hand: detecting hands and recognizing activities in complex egocentric interactions, 2015 IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society. doi: 10.1109/ICCV.2015.226
[16] Lepetit-Aimon, G., R. Duval, and F. Cheriet (2018) Large receptive field fully convolutional network for semantic segmentation of retinal vasculature in fundus images, International Workshop on Computational Pathology 201-209. doi: 10.1007/978-3-030-00949-6_24
[17] Liu, W., D. Anguelov, D. Erhan, et al. (2016) SSD: single shot MultiBox detector, European Conference on Computer Vision. ECCV, 21-37. doi: 10.1007/978-3-319-46448-0_2
[18] Bambach, S., S. Lee, D. J. Crandall, et al. (2015) Lending A hand: detecting hands and recognizing activities in complex egocentric interactions, 2015 IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society. doi: 10.1109/ICCV.2015.226
[19] Zhou, Z., Z. Cao, and Y. Pi (2018) Dynamic gesture recognition with a Terahertz Radar based on range profile sequences and Doppler signatures, Sensors 18(1), 10. doi: 10.3390/s18010010
[20] Verma, B., and A. Choudhary (2018) Framework for dynamic hand gesture recognition using Grassmann manifold for intelligent vehicles, IET Intelligent Transport Systems 12(7), 721-729. doi: 10.1049/iet-its.2017.0331
[21] Zhang, Z., Z. Tian, and Z. Mu (2018) Latern: dynamic continuous hand gesture recognition using FMCW radar sensor, IEEE Sensors Journal 18(8), 1-1. doi: 10.1109/JSEN.2018.2808688
[22] Nguyen, X. S., L. Brun, O. Lezoray, et al. (2019) Skeleton-based hand gesture recognition by learning SPD matrices with neural networks, IEEE International Conference on Automatic Face & Gesture Recognition (FG). IEEE. doi: 10.1109/FG.2019.8756512

Manuscript Received: Jul. 22, 2019
Accepted: Oct. 19, 2019

No ratings yet
Sign Language Recognition System Using Deep Neural Network
5 pages
gray word6
No ratings yet
gray word6
7 pages
Towards Smart Interaction Hand Gesture Recognition Using Machine Learning in IoT Scenarios
No ratings yet
Towards Smart Interaction Hand Gesture Recognition Using Machine Learning in IoT Scenarios
5 pages
Mathematics 12 01393
No ratings yet
Mathematics 12 01393
34 pages
U18Ini5600 - Engineering Cilincs - V Project Report
No ratings yet
U18Ini5600 - Engineering Cilincs - V Project Report
14 pages
Electronics and Communication s7 & s8
No ratings yet
Electronics and Communication s7 & s8
38 pages
Real-Time Recognition of The Users Arm Gestures in 2D Space With A Smart Camera
No ratings yet
Real-Time Recognition of The Users Arm Gestures in 2D Space With A Smart Camera
9 pages
An_Investigation_of_Deep_Neural_Network_based_Techniques_for_Object_Detection_an
No ratings yet
An_Investigation_of_Deep_Neural_Network_based_Techniques_for_Object_Detection_an
6 pages
Full Text
No ratings yet
Full Text
6 pages
Motion Sensors-Based Human Behavior Recognition and Analysis
No ratings yet
Motion Sensors-Based Human Behavior Recognition and Analysis
139 pages
E3sconf Iconnect2023 04032
No ratings yet
E3sconf Iconnect2023 04032
11 pages
Automatic Target Recognition: Fundamentals and Applications
From Everand
Automatic Target Recognition: Fundamentals and Applications
Fouad Sabry
No ratings yet
OBJECT DETECTION AND GESTURE RECOGNITION
No ratings yet
OBJECT DETECTION AND GESTURE RECOGNITION
11 pages
A Unified Learning Approach For Hand Gesture Recog
No ratings yet
A Unified Learning Approach For Hand Gesture Recog
19 pages
Hand Gesture Recognition Thesis
100% (2)
Hand Gesture Recognition Thesis
5 pages
Gesture Recognition For Human Computer Interaction
No ratings yet
Gesture Recognition For Human Computer Interaction
3 pages
Compiler Manual
No ratings yet
Compiler Manual
62 pages
WOCMAT 2018 Conference Program December 7th, Saturday, 2018: Program Chair Place Time
No ratings yet
WOCMAT 2018 Conference Program December 7th, Saturday, 2018: Program Chair Place Time
3 pages
NTC Thermistors:: Type CL
No ratings yet
NTC Thermistors:: Type CL
2 pages
Ece4750 Tut3 Verilog PDF
No ratings yet
Ece4750 Tut3 Verilog PDF
79 pages
File Signature Analysis
No ratings yet
File Signature Analysis
12 pages
38576
No ratings yet
38576
66 pages
Statistics & Probability Week 1-2
No ratings yet
Statistics & Probability Week 1-2
16 pages
Id SNB MWPN 0101 500019
100% (1)
Id SNB MWPN 0101 500019
77 pages
Sales and Product For Dax
No ratings yet
Sales and Product For Dax
6 pages
AI Activated Value Co Creation An Exploratory Stu 2022 Industrial Marketing
No ratings yet
AI Activated Value Co Creation An Exploratory Stu 2022 Industrial Marketing
13 pages
Learning Aim A
No ratings yet
Learning Aim A
14 pages
DxDiag Jardas
No ratings yet
DxDiag Jardas
58 pages
Dice Profile Ambreen Imran
No ratings yet
Dice Profile Ambreen Imran
5 pages
Revised PPT For Online Lecture 6 HVAC-Types of Systems
100% (1)
Revised PPT For Online Lecture 6 HVAC-Types of Systems
22 pages
TriMast Mount For 75cm-1m Antennas
No ratings yet
TriMast Mount For 75cm-1m Antennas
4 pages
Lab File Front Pages
No ratings yet
Lab File Front Pages
4 pages
Daewoo dtq20s1ssfv Chassis cn-402fn
No ratings yet
Daewoo dtq20s1ssfv Chassis cn-402fn
4 pages
Samba Tech at Chin ICT
No ratings yet
Samba Tech at Chin ICT
10 pages
Fortigate 100f Series PDF
No ratings yet
Fortigate 100f Series PDF
6 pages
T.D. Williamson, Inc.: Your Source For Piping Solutions
No ratings yet
T.D. Williamson, Inc.: Your Source For Piping Solutions
4 pages
Battle of The Backbones - A Large-Scale Comparison of Pretrained Models Across Computer Vision Tasks
No ratings yet
Battle of The Backbones - A Large-Scale Comparison of Pretrained Models Across Computer Vision Tasks
29 pages
Yamaha SW-P240 (NS-P240) Owner's Manual
No ratings yet
Yamaha SW-P240 (NS-P240) Owner's Manual
16 pages
Critical
No ratings yet
Critical
3 pages
Rural Access Ghana
No ratings yet
Rural Access Ghana
41 pages
C Arrays Q - A With Explana
No ratings yet
C Arrays Q - A With Explana
28 pages
Fijal - 1 - Final
No ratings yet
Fijal - 1 - Final
47 pages
Agile Requirements Engineering With User Stories (FINAL PDF)
No ratings yet
Agile Requirements Engineering With User Stories (FINAL PDF)
4 pages

Dynamic gesture recognition based on deep learning in human-to-computer interfaces


Journal of Applied Science and Engineering, Vol. 23, No. 1, pp. 31-38 (2020) DOI: 10.6180/jase.202003_23(1).

Dynamic Gesture Recognition Based on Deep

1. Introduction hand gesture recognition from challenging depth and in-

Figure 1. Improved inverted residual network.

Figure 2. Samples in EgoHands.

Table 1. Some of the data sets used in this paper

Table 2. Parameters in SSD model

Table 3. mAP results with different models

work in the MSSD model to initialize it. Then the net-

Table 6. mAP results with different methods

Figure 6. Four hands’ mAP value.
