Deep-Learning-Based Stair Detection Using 3D Point Cloud Data For Preventing Walking Accidents of The Visually Impaired
ABSTRACT Visually impaired individuals worldwide are at risk of accidents while walking. In particular, falling from a raised place, such as stairs, can lead to serious injury. Therefore, we attempted to determine the best accident prevention method, one that notifies visually impaired individuals of the existence of stairs, together with their height and step information, when they approach them. In this study, we investigated stair detection through deep learning. First, three-dimensional point cloud data generated from depth information are used to train a deep learning model, and stairs are then detected from the model's outputs. To make the point cloud data suitable for deep-learning-based training, we propose preprocessing stages that reduce the weight of the point cloud data. The accuracy of stair detection was 97.3%, which is the best performance among the compared conventional methods. Therefore, we confirmed the effectiveness of the proposed method.
INDEX TERMS Visually impaired support systems, depth sensor, 3D point cloud data, deep-learning,
PointNet.
support systems for the physically challenged [20], [21]. The specifications of the depth camera are listed in Table 1.

III. STAIRS DETECTION USING DEEP LEARNING OF 3D POINT CLOUDS
A. PREPARING THE DATA SET
In this study, we first generated 3D point cloud data from the depth information captured by the depth camera. Thereafter, to reduce the processing time, we down-sampled [22] the number of points in each point cloud sample to prepare a lightweight 3D point cloud dataset.

We prepared 1000 training and 500 validation samples for each of the above classes and conducted the experiments with 3000 training and 1500 validation samples in total. Fig. 4 shows sample 3D point cloud data of stairs together with their 2D image (Fig. 4(a)). Fig. 4(b) shows an example of depth data from the RealSense, while Fig. 4(c) shows the extraction of the approximate stair region by Open3D. Fig. 4(d) shows the down-sampled result of the depth image in Fig. 4(c). The down-sampling process is explained in the next sub-section.

D. DEEP LEARNING FOR 3D POINT CLOUD DATA
3D point cloud data describe a 3D shape as a set of 3D points (x, y, and z). Point cloud data have two important properties that must be considered when handling them in deep learning: order and translation invariance [24].

First, let us discuss order invariance. It is the property that the output is unchanged even if the points are input into the model in a different order. Because point cloud data have no fixed format and no order can be assigned to the individual points, the order of input to the model is arbitrary. Therefore, a point cloud of N points can be presented in N! different orders, yet the object it represents remains the same. Consequently, a deep learning model is required to output the same value for every permutation of the input point cloud. Next, we discuss translation invariance. Translation invariance is the property that the output is unchanged even if the point cloud input to the deep learning model is translated or rotated. The invariance to translation is expressed by (2).
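Before describing PointNet, a brief numerical illustration may help. The following minimal sketch (our own illustrative addition, not part of the original pipeline) checks that a symmetric aggregation, here the per-dimension maximum over all points, returns the same output for any permutation of the points, which is exactly the behavior required of a model that consumes point clouds:

import numpy as np

rng = np.random.default_rng(0)
points = rng.random((1024, 3))        # a point cloud of N = 1024 points (x, y, z)
permuted = rng.permutation(points)    # same object, different point order

# Symmetric (order-invariant) aggregation: per-dimension maximum over the points.
feat_original = points.max(axis=0)
feat_permuted = permuted.max(axis=0)

print(np.allclose(feat_original, feat_permuted))  # True: the output is unchanged by reordering

This is the same mechanism that PointNet exploits through its max-pooling layer, as described next.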
E. POINTNET
PointNet is a deep-learning model that considers the order and movement invariances described above [19]. In conventional 3D convolutional neural networks, point clouds are voxelized and each voxel layer is treated as an image that is used as the input. PointNet, by contrast, accepts point clouds directly as input, which facilitates the handling of point cloud data and resolves the shortcomings of the conventional methods.

This section describes how PointNet addresses the two properties of order and movement invariance described above. A symmetric function is a function whose value does not change even if the order of its variables is changed [19]. PointNet obtains order invariance by using a symmetric function, max-pooling, which outputs the largest element among its inputs. In other words, even if the input elements of the max-pooling are permuted, the output will be the same as before the permutation, because the function always outputs the largest element.

Next, we describe the movement invariance of PointNet, which estimates the affine transformation matrix of the input point cloud and multiplies the point cloud by this matrix to obtain approximate movement invariance. The structure of this network is illustrated in Fig. 5. The affine transformation combines rotation, translation, and scaling, and can be represented by a single 3 × 3 matrix. The matrix is estimated by T-Net [25], and by multiplying the input point cloud by this estimated matrix, the output does not change even if the point cloud is translated or rotated. Here, T-Net is a network consisting of feature extraction, max-pooling, and fully connected layers.

We now describe the flow of PointNet classification. The structure of this network is illustrated in Fig. 6.
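To make this classification flow concrete, the following is a minimal, simplified PointNet-style sketch in PyTorch. It is our own hedged example: the layer sizes, the three-class output, and the reduced T-Net are assumptions for illustration and do not reproduce the exact network used in this work. A T-Net-like branch estimates a 3 × 3 transform that is applied to the input points, a shared per-point MLP extracts features, max-pooling produces an order-invariant global feature, and fully connected layers output the class scores.

import torch
import torch.nn as nn

class TNet(nn.Module):
    """Estimates a 3 x 3 transform applied to the input points (reduced T-Net-like branch)."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 256, 1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 9),
        )

    def forward(self, x):                         # x: (batch, 3, num_points)
        f = self.mlp(x).max(dim=2).values         # order-invariant global feature
        m = self.fc(f).view(-1, 3, 3)
        return m + torch.eye(3, device=x.device)  # bias the transform towards the identity

class PointNetClassifier(nn.Module):
    def __init__(self, num_classes=3):            # three stair-related classes assumed here
        super().__init__()
        self.tnet = TNet()
        self.feat = nn.Sequential(                 # shared per-point MLP (weights shared over points)
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.head = nn.Sequential(                 # classification head on the global feature
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, num_classes),
        )

    def forward(self, points):                     # points: (batch, num_points, 3)
        x = points.transpose(1, 2)                 # -> (batch, 3, num_points)
        t = self.tnet(x)                           # estimated 3 x 3 input transform
        x = torch.bmm(t, x)                        # align the point cloud
        g = self.feat(x).max(dim=2).values         # symmetric max-pooling -> order invariance
        return self.head(g)                        # class scores

# Example: classify a batch of eight down-sampled clouds of 1024 points each.
logits = PointNetClassifier(num_classes=3)(torch.rand(8, 1024, 3))

Because the only interaction across points happens in the max-pooling step, permuting the input points leaves the class scores unchanged, mirroring the order invariance discussed above.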
with a very high accuracy rate of 97.3% and exhibited the best performance compared to other conventional methods.

REFERENCES
[1] Fact Sheet Blindness and Vision Impairment, World Health Org., Geneva, Switzerland, 2019.
[2] Results of the 2016 Survey on Difficulties in Daily Life (National Survey on Children and Persons With Disabilities at Home), Department of Health and Welfare for Persons with Disabilities, Social Welfare and War Victims' Relief Bureau, Ministry of Health, Labour and Welfare, 2018.
[3] R. R. A. Bourne, S. R. Flaxman, T. Braithwaite, M. V. Cicinelli, A. Das, and J. B. Jonas, "Magnitude, temporal trends, and projections of the global prevalence of blindness and distance and near vision impairment: A systematic review and meta-analysis," Lancet Global Health, vol. 5, no. 9, pp. 888–897, Sep. 2017.
[4] (2020). International Guide Dog Federation. Guide Dogs Worldwide. [Online]. Available: https://www.guidedogs.org.U.K./about%20us/what%20we%20do/guide%20dogs%20worldwide
[5] J. Ballemans, G. I. Kempen, and G. R. Zijlstra, "Orientation and mobility training for partially-sighted older adults using an identification cane: A systematic review," Clin. Rehabil., vol. 25, no. 10, pp. 880–891, Oct. 2011.
[6] A. Nobuyuki and H. Norihisa, "Walking accident national survey to maintain visually handicapped persons walking environment," Bull. Hachinohe Inst. Technol., vol. 24, pp. 81–92, Feb. 2005.
[7] N. Abe and N. Hashimoto, "Walking accident national survey to maintain visually handicapped persons walking environment," Bull. Hachinohe Inst. Technol., vol. 24, pp. 81–92, Feb. 2004.
[8] N. Molton, S. Se, M. Brady, D. Lee, and P. Probert, "Robotic sensing for the partially sighted," Robot. Auto. Syst., vol. 26, nos. 2–3, pp. 185–201, Feb. 1999.
[9] U. Patil, A. Gujarathi, A. Kulkarni, A. Jain, L. Malke, R. Tekade, K. Paigwar, and P. Chaturvedi, "Deep learning based stair detection and statistical image filtering for autonomous stair climbing," in Proc. 3rd IEEE Int. Conf. Robotic Comput. (IRC), Feb. 2019, pp. 159–166.
[10] S. Carbonara and C. Guaragnella, "Efficient stairs detection algorithm assisted navigation for vision impaired people," in Proc. IEEE Int. Symp. Innov. Intell. Syst. Appl. (INISTA), Jun. 2014, pp. 313–318.
[11] A. Ramteke, B. Parabattina, and P. K. Das, "A neural network based technique for staircase detection using smart phone images," in Proc. 6th Int. Conf. Wireless Commun., Signal Process. Netw. (WiSPNET), Mar. 2021, pp. 374–379.
[12] E. Mihankhah, A. Kalantari, E. Aboosaeedan, H. D. Taghirad, S. Ali, and A. Moosavian, "Autonomous staircase detection and stair climbing for a tracked mobile robot using fuzzy controller," in Proc. IEEE Int. Conf. Robot. Biomimetics, Feb. 2009, pp. 1980–1985.
[13] C. Zhong, Y. Zhuang, and W. Wang, "Stairway detection using Gabor filter and FFPG," in Proc. Int. Conf. Soft Comput. Pattern Recognit. (SoCPaR), Oct. 2011, pp. 578–582.
[14] S. Murakami, M. Shimakawa, K. Kivota, and T. Kato, "Study on stairs detection using RGB-depth images," in Proc. Joint 7th Int. Conf. Soft Comput. Intell. Syst. (SCIS) 15th Int. Symp. Adv. Intell. Syst. (ISIS), Dec. 2014, pp. 699–702.
[15] R. Munoz, X. Rong, and Y. Tian, "Depth-aware indoor staircase detection and recognition for the visually impaired," in Proc. IEEE Int. Conf. Multimedia Expo Workshops (ICMEW), Sep. 2016, pp. 1–6.
[16] S. Wang, H. Pan, C. Zhang, and Y. Tian, "RGB-D image-based detection of stairs, pedestrian crosswalks and traffic signs," J. Vis. Commun. Image Represent., vol. 25, no. 2, pp. 263–272, Feb. 2014.
[17] M. Hayami and M. Hild, "Detection of stairs using stereo images as a walking aid for visually impaired persons," in Proc. Conf. Inf. Process. Soc. Japan, 2010.
[18] (2021). Intel RealSense Technology. Intel RealSense Camera D400 Series Product Family Datasheet. [Online]. Available: https://www.intelrealsense.com/wp-content/uploads/2020/06/Intel-RealSense-D400-Series-Datasheet-June-2020.pdf
[19] R. Q. Charles, H. Su, M. Kaichun, and L. J. Guibas, "PointNet: Deep learning on point sets for 3D classification and segmentation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 652–660.
[20] Y. Endo and C. Premachandra, "Development of a bathing accident monitoring system using a depth sensor," IEEE Sensors Lett., vol. 6, no. 2, pp. 1–4, Feb. 2022.
[21] Y. Ito, C. Premachandra, S. Sumathipala, H. W. H. Premachandra, and B. S. Sudantha, "Tactile paving detection by dynamic thresholding based on HSV space analysis for developing a walking support system," IEEE Access, vol. 9, pp. 20358–20367, 2021.
[22] E. Nezhadarya, E. Taghavi, R. Razani, B. Liu, and J. Luo, "Adaptive hierarchical down-sampling for point cloud classification," in Proc. CVPR, Jun. 2020, pp. 12956–12964.
[23] Z. Yang, Y. Sun, S. Liu, X. Qi, and J. Jia, "CN: Channel normalization for point cloud recognition," in Proc. ECCV, 2020, pp. 600–616.
[24] Y. Liu, C. Wang, Z. Song, and M. Wang, "Efficient global point cloud registration by matching rotation invariant features through translation search," in Proc. ECCV, 2018, pp. 448–463.
[25] J. Kossaifi, A. Bulat, G. Tzimiropoulos, and M. Pantic, "T-Net: Parametrizing fully convolutional nets with a single high-order tensor," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 7822–7831.

HARUKA MATSUMURA received the B.S. degree in electronic engineering from the Shibaura Institute of Technology, Tokyo, Japan, in 2022. Her research interests include depth image processing, visually impaired support systems, and 3D vision.

CHINTHAKA PREMACHANDRA (Senior Member, IEEE) was born in Sri Lanka. He received the B.Sc. and M.Sc. degrees from Mie University, Tsu, Japan, in 2006 and 2008, respectively, and the Ph.D. degree from Nagoya University, Nagoya, Japan, in 2011.
From 2012 to 2015, he was an Assistant Professor with the Department of Electrical Engineering, Faculty of Engineering, Tokyo University of Science, Tokyo, Japan. From 2016 to 2017, he was an Assistant Professor. From 2018 to 2022, he was an Associate Professor with the Department of Electronic Engineering, School of Engineering, Shibaura Institute of Technology, Tokyo. In 2022, he was promoted to a Professor with the Department of Electronic Engineering, Graduate School of Engineering, Shibaura Institute of Technology, where he is currently the Manager of the Image Processing and Robotic Laboratory. His research interests include AI, UAV, image processing, audio processing, intelligent transport systems (ITS), and mobile robotics.
Dr. Premachandra is a member of IEICE, Japan; SICE, Japan; and SOFT, Japan. He received the FIT Best Paper Award and the FIT Young Researchers Award from IEICE and IPSJ, Japan, in 2009 and 2010, respectively. He was a recipient of the IEEE Japan Medal, in 2022. He has served many international conferences and journals as a steering committee member and an editor, respectively. He is the Founding Chair of the International Conference on Image Processing and Robotics (ICIPRoB), which is technically co-sponsored by the IEEE.