PRO-VAS: Utilizing AR and VSLAM For Mobile Apps Development in Visualizing Objects
Sofianita Mutalib1,2, Mohd Alif Izhar1, Shuzlina Abdul-Rahman1,2, Mohd Zaki Zakaria1,2,
Mastura Hanafiah3
1Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
2Research Initiative Group Intelligent Systems, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
3Accenture Solutions Sdn Bhd, Malaysia, Level 29 & 30, Menara Exchange, 106, Lingkaran TRX, Tun Razak Exchange, 55188 Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur, Malaysia
Corresponding Author:
Sofianita Mutalib
Research Initiative Group Intelligent Systems, Faculty of Computer and Mathematical Sciences
Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
Email: [email protected]
1. INTRODUCTION
Mobile devices are widely used by most people in the world. Developing an augmented reality (AR)
enabled system on mobile devices can make people learn more interactively than simple camera filters allow.
Realism is one of the main factors in the ideation of and approach to such systems. Interaction between the
real world and the virtual world also shows how virtual objects can behave like real objects,
which is what the execution of AR brings [1]. The broad range of available technology can connect humans with machines easily.
The world is now connected without restriction through online platforms. Everything is at our
fingertips, and this gives users full control over what they want and desire. What users need right now are
applications that really help in their daily lives, such as a dermatological diagnosis app [2] that assists
users in identifying major skin diseases in three main languages, and a child tracking observation and location
tracking system [3].
This paper proposes an app that can visualize objects and provide a mapping of the scene.
The interaction between real time and virtual reality is based on pose estimation, which relies on
points, lines, distances, and line formations to estimate the geometric data of the environment [4].
Visual simultaneous localization and mapping (VSLAM) is usually used in the robotics field to
simultaneously localize a robot and create maps so it can move around the environment [5], [6]. Depth
cameras can produce red green blue depth (RGB-D) mapping widely, and some phones, such as the iPhone 12,
even carry a LiDAR sensor. Commonly, the simultaneous localization and mapping (SLAM) features
applied in applications are limited in reading the visual images and points of the environment,
especially in mapping the environment data [6], [7]. Meanwhile, most applications that use markerless
tracking do not achieve the realistic view that they should.
This work aims to produce a mobile app that can scan and visualize the environment in real time,
thus helping the user choose the right furniture or products for designing their home interior.
The contributions of this paper are: 1) the use of VSLAM for localization of the object; and 2) the
proposed components for the app, namely AR, VSLAM, point plane, RGB-D mapping, and the markerless
tracking method. These components create distinct characteristics compared to other existing similar mobile
apps. The remainder of this paper is organized as follows: section 2 describes past studies and presents similar
mobile apps in the market. Section 3 presents the components of the pro-visualizer app (PRO-VAS), while
section 4 discusses the development. The results and findings of the study are presented in section 5 and,
finally, section 6 concludes the paper.
2.2. Houzz
Houzz is another interior design mobile application that uses AR, with trending ideas and the price of
every product displayed in the application, as shown in Figure 1(a). Through its commenting features,
user engagement increases, and comments can serve as a reference point to ensure that users make
the right choice [11]. A newer feature detects the floor orientation, allowing users to
estimate the amount of tile needed in their house. The developers have also expanded the functionality of
the app so that users can decorate their walls with the vertical plane detection feature [12].
2.3. Intiaro
Intiaro is a platform where users can buy and sell products in the mobile application, as shown in
Figure 1(b), which makes its functionality more interesting. A group of developers at Intiaro produces 3D model
designs for users throughout the world, including business organizations and individuals. This functionality
supports entrepreneurs in the furniture and interior design industries [13]. The 3D digitization and AR mobile
application keeps every piece of furniture at a fixed scale, making the application realistic in that the
measurements have to match the real products exactly [14], with high-level visualization.
Figure 1. Interior design mobile applications: (a) Houzz and (b) Intiaro
3.1. AR
One of the methods of applying AR is using ARCore to track the position of a mobile device as it moves
and to build knowledge of the surrounding environment. It can also detect feature points whenever the mobile
device moves. The contour detection for wall paint is developed using OpenCV, a library that comes
with Python programming [16]. Among the important classes in AR systems are task focus, nature of
augmentation, and OP-a-S for interaction modelling [1]. The nature of augmentation can be divided into two
parts: execution and evaluation. For the execution part, the user can perform many quality tasks,
while the evaluation part is based on user perception, in which more realistic information is provided
to the user [17].
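As a rough illustration of the wall-paint contour detection mentioned above, the following is a minimal OpenCV sketch in Python. The file names, blur kernel, Canny thresholds, and area cutoff are assumptions for illustration only, not the parameters used in the actual app.

```python
import cv2

# Load the scene image and convert to grayscale ("wall.jpg" is a placeholder name).
image = cv2.imread("wall.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Smooth the image, then detect edges that separate the wall from its surroundings.
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)

# Find external contours; each contour outlines a candidate wall region.
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Keep only sufficiently large contours and draw them for inspection.
large = [c for c in contours if cv2.contourArea(c) > 1000.0]
cv2.drawContours(image, large, -1, (0, 255, 0), 2)
cv2.imwrite("wall_contours.jpg", image)
```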
3.2. VSLAM
The most important component in this project is VSLAM. VSLAM is used to identify visual
images, angles, and points on surfaces. It is an algorithm usually used on robots that navigate
an environment with the help of a vision sensor [18]. VSLAM allows a robot to
navigate and clean large spaces in satisfyingly straight lines. A challenge for VSLAM algorithms is
handling dynamic, multi-segmentation targets in the scene captured by the camera sensors [19].
The VSLAM algorithm consists of several main components: feature extraction, feature matching,
pose estimation, pose optimization, and map updating, as shown in Figure 2. Feature extraction
extracts features from every image captured as the camera sensor moves and changes position. Feature
matching associates the extracted features across frames captured by the RGB-D camera sensor, on which
the map is built. Pose estimation estimates the camera rotation based on position and angle. Pose
optimization minimizes the error made when estimating the coordinates of the camera pose. Map updating
updates the created map based on every camera orientation [19]. SLAM is a method to map the environment
and represent it as a collection of points. To take advantage of the recent success of graph-based
approaches to SLAM, the framework is constructed on the advanced feature-based SLAM system ORB-SLAM,
where ORB stands for oriented FAST and rotated BRIEF. This allows us to exploit the sparsity of the map
while incorporating semantically important geometric primitives, such as planes, into it [7], [20], [21].
Figure 2. VSLAM algorithm main components, from input image to map updating
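To make these stages concrete, the following is a minimal sketch of frame-to-frame feature extraction, matching, and pose estimation using OpenCV's ORB features in Python. The frame file names and the intrinsic matrix K are assumptions; a full VSLAM system such as ORB-SLAM would additionally optimize the poses over a graph and update the map.

```python
import cv2
import numpy as np

# Two consecutive camera frames (placeholder file names).
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Assumed camera intrinsics: focal length and principal point.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])

# Feature extraction: detect ORB keypoints and compute binary descriptors.
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Feature matching: brute-force Hamming matching with cross-checking.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Pose estimation: recover the relative rotation R and translation t
# from the essential matrix, with RANSAC rejecting outlier matches.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                               prob=0.999, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
```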
Meanwhile, Figure 4 shows the three main screens of PRO-VAS. Since PRO-VAS depends heavily
on the RGB-D camera, it first notifies the user whether the device is supported, as in Figure 4(a). If it is,
the second screen appears, as in Figure 4(b), and guides the user to scan the area for a suitable
location for the 3D objects, as in Figure 4(c). Finally, the third screen appears to confirm the
chosen product. Without VSLAM, the object would follow the camera movement instead of
staying in one place. Meanwhile, Table 1 compares the components of PRO-VAS with the other
apps discussed in the previous section.
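The anchoring behavior described above, where an object stays in one place as the camera moves, can be pictured as fixing the object's pose in world coordinates while VSLAM tracks only the camera pose. The following is a minimal, framework-independent sketch of that idea; the 4x4 poses and the 2 m placement are illustrative assumptions rather than values from PRO-VAS.

```python
import numpy as np

def view_matrix(camera_pose):
    """Invert a 4x4 camera-to-world pose to get the world-to-camera view matrix."""
    return np.linalg.inv(camera_pose)

# The object's pose is fixed once in world coordinates (the "anchor"):
# here, 2 m in front of the initial camera position.
anchor_pose = np.eye(4)
anchor_pose[2, 3] = -2.0

# As the camera moves (tracked by VSLAM), only the view matrix changes;
# the anchor stays put, so the object appears fixed in the environment.
camera_pose = np.eye(4)
camera_pose[0, 3] = 0.5  # the camera slides 0.5 m to the right
model_view = view_matrix(camera_pose) @ anchor_pose
print(model_view[:3, 3])  # the object now appears 0.5 m to the camera's left
```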
Figure 4. PRO-VAS main screens with (a) popup message when the phone supports RGB-D mapping,
(b) hand animation to guide the user in using the camera, and (c) screen for choosing the object
Figure 5. PRO-VAS: (a) process flow, (b) detection of the object location on a vertical plane,
(c) on a horizontal plane, and (d) the RGB-D preview
The controller component scripts comprise the depth menu, instant placement menu, instant placement prefab,
and first-person camera. The depth menu lets the user choose whether to enable the depth
application programming interface (API) and the depth map. The depth API inside this application gives full
optimization to AR users. The instant placement menu script activates the object prefab once the user
generates the object into the real world. The first-person camera is the back-facing
camera, which provides the view of the scene in which 3D objects are generated. The plane discovery guide
helps users explore the environment to detect planes. As shown in Figure 4, a hand animation guides
users in searching for surfaces on which to place objects. The prefab used for the plane is the detected plane visualizer.
The prefab is generated once the camera detects plane surfaces in the environment, and new planes are
instantiated every time the camera finds new surfaces, whether horizontal or vertical. Figure 5(b)
and Figure 5(c) show the vertical plane and the horizontal plane in the environment; the camera detects the
planes as the user hovers the phone over the environment's surfaces. Figure 5(d) shows the RGB-D preview
of the scene through PRO-VAS on the phone.
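The flow above can be summarized in a short sketch. Every class and method name below is hypothetical and only mirrors the roles of the controller scripts described in this section; the actual app implements them as ARCore components.

```python
from dataclasses import dataclass, field
from enum import Enum

class PlaneType(Enum):
    HORIZONTAL = "horizontal"
    VERTICAL = "vertical"

@dataclass
class Plane:
    plane_type: PlaneType
    center: tuple  # world-space (x, y, z) of the detected surface

@dataclass
class PlacementController:
    """Hypothetical stand-in for the controller scripts: track planes, place prefabs."""
    depth_api_enabled: bool = False
    planes: list = field(default_factory=list)
    placed_objects: list = field(default_factory=list)

    def on_plane_detected(self, plane):
        # A visualizer prefab is instantiated for every new surface the camera finds.
        self.planes.append(plane)
        print(f"Visualizing new {plane.plane_type.value} plane at {plane.center}")

    def on_user_tap(self, plane, model_name):
        # Instant placement: spawn the chosen object prefab anchored to the plane.
        self.placed_objects.append((model_name, plane.center))
        print(f"Placed '{model_name}' on a {plane.plane_type.value} plane")

controller = PlacementController(depth_api_enabled=True)
floor = Plane(PlaneType.HORIZONTAL, (0.0, 0.0, -2.0))
controller.on_plane_detected(floor)
controller.on_user_tap(floor, "table")
```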
Figure 6. 3D object appearance in the PRO-VAS app in different views: (a) 3D object on a horizontal plane,
(b) 3D object rotated 180° about the y-axis, (c) 3D object rotated 180° about the x-axis, and (d) a closer look at the 3D object
Figure 7. Location testing for two objects: (a) table and (b) television
6. CONCLUSION
This paper presents the capability of AR objects to appear in the environment by using VSLAM.
The tracking method used in this study was markerless tracking, in which the camera is not restricted to a
tracking marker to create the object. In this way, AR and VSLAM are shown to connect the
virtual and the real in real time, in so-called mixed reality, without limits. The process of simultaneous
localization and mapping helps users apply this application in real time. Based on the testing,
good hardware and an operating system that supports this application make the VSLAM experience
run smoothly. Previously, expensive hardware such as the Microsoft Kinect was
needed for mapping. Currently, phone cameras with a built-in depth
camera ease the process of mapping the environment. This study shows that
VSLAM can be successfully applied in a mobile app with AR, and that the app
performs smoothly according to user feedback. Google ARCore helped greatly in the development of the system,
as it offers many features that help developers create and express ideas for good
multimedia-based applications such as AR applications. The markerless tracking method is among the best
tracking methods, preferable to marker-based tracking when the purpose is wide tracking without restrictions on
scanning the visual images, and it prevents object occlusion. The RGB-D camera helps create a map that allows
VSLAM to understand the environment better.
ACKNOWLEDGEMENT
The authors would like to express their gratitude to the Ministry of Higher Education Malaysia and
the Research Management Center, UiTM, for the FRGS 5/3 (461/2019) research grant, and to the Faculty of
Computer and Mathematical Sciences, UiTM, for the support in product innovation development.
REFERENCES
[1] E. Dubois, L. Nigay, J. Troccaz, O. Chavanon, and L. Carrat, "Classification Space for Augmented Surgery, an Augmented
Reality Case Study," Proceedings of Interact '99, IOS Press, 1999, pp. 353-359. [Online]. Available:
https://www.academia.edu/18842330/Classification_Space_for_Augmented_Surgery._an_Augmented_Reality_Case_Study
[2] S. A. Hameed, A. Haddad, M. H. Habaebi, and A. Nirabi, “Dermatological diagnosis by mobile application,” Bulletin of
Electrical Engineering and Informatics, vol. 8, no. 3, pp. 847-854, Sep. 2019, doi: 10.11591/eei.v8i3.1502.
[3] M. J. Alam, T. Chowdhury, S. Hossain, S. Chowdhury, and T. Das, “Child tracking and hidden activities observation system
through mobile app,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 22, no. 3, pp. 1659-1666, 2021,
doi: 10.11591/ijeecs.v22.i3.pp1659-1666.
[4] A. I. Comport, E. Marchand, and F. Chaumette, “A real-time tracker for markerless augmented reality,” The Second IEEE and
ACM International Symposium on Mixed and Augmented Reality, 2003. Proceedings, 2003, pp. 36-45,
doi: 10.1109/ISMAR.2003.1240686.
[5] J. Fuentes-Pacheco, J. Ruiz-Ascencio, and J. M. Rendón-Mancha, "Visual Simultaneous Localization and Mapping: A Survey," Artificial
Intelligence Review, vol. 43, pp. 55-81, 2015, doi: 10.1007/s10462-012-9365-8.
[6] A. M. Azri, S. Abdul-Rahman, R. Hamzah, Z. A. Aziz, and N. A. Bakar, "Visual analytics of 3D LiDAR point clouds in robotics
operating systems," Bulletin of Electrical Engineering and Informatics, vol. 9, no. 2, pp. 492-499, 2020,
doi: 10.11591/eei.v9i2.2061.
[7] M. Hosseinzadeh, Y. Latif, and I. Reid, “Sparse point-plane SLAM,” University of Adelaide, Australia. Australasian Conference
on Robotics and Automation, ACRA, 2017. [Online]. Available: https://www.araa.asn.au/acra/acra2017/papers/pap170s1-file1.pdf
[8] C. Alves and J. L. Reis, “The Intention to Use E-Commerce Using Augmented Reality - The Case of IKEA Place,” International
Conference on Information Technology & Systems, 2020, pp. 114-123, doi: 10.1007/978-3-030-40690-5_12.
[9] S. G. Dacko, “Enabling smart retail settings via mobile augmented reality shopping apps,” Technological Forecasting and Social
Change, vol. 124, pp. 243–256, 2017, doi: 10.1016/j.techfore.2016.09.032.
[10] S. Ozturkcan, “Service Innovation: Using Augmented Reality in the IKEA Place App,” Journal of Information Technology
Teaching Cases, vol. 11, no. 1, pp. 8-13, 2021, doi: 10.1177/2043886920947110.
[11] T. Kiliç, “Investigation of mobile augmented reality applications used in the interior design,” The Turkish Online Journal of
Design Art and Communication, vol. 9, no. 2, pp. 303-317, 2019, doi: 10.7456/10902100/020.
[12] M. Boland, "What's Driving Houzz' AR Success?," AR Insider. [Online]. Available: https://arinsider.co/2020/06/16/whats-driving-houzz-ar-success/ (Accessed Jun. 16, 2020).
[13] Intiaro Inc., "Intiaro helps furniture retailers & brands transition their products into digital world." [Online]. Available:
https://en.intiaro.com/visualisation-platform/ (Accessed 2021).
[14] R. Swaminathan, R. Schleicher, S. Burkard, R. Agurto, and S. Koleczko, “Happy Measure: Augmented Reality for Mobile Virtual
Furnishing,” International Journal of Mobile Human Computer Interaction (IJMHCI), vol. 5, no.1, pp. 16-44, 2013,
doi: 10.4018/jmhci.2013010102.
[15] N. S. Shaeeali, A. Mohamed, and S. Mutalib, "Customer reviews analytics on food delivery services in social media: a review,"
IAES International Journal of Artificial Intelligence (IJ-AI), vol. 9, no. 4, pp. 691-699, 2020, doi: 10.11591/ijai.v9.i4.pp691-699.
[16] E. Dubois and L. Nigay, “Augmented Reality: Which Augmentation for Which Reality?,” DARE '00: Proceedings of DARE 2000
on Designing augmented reality environments, 2000, pp. 165–166, doi: 10.1145/354666.354695.
[17] K. N. Al-Mutib, E. A. Mattar, M. M. Alsulaiman, and H. Ramdane, “Stereo vision SLAM based indoor autonomous mobile robot
navigation,” 2014 IEEE International Conference on Robotics and Biomimetics (ROBIO 2014), 2014, pp. 1584-1589,
doi: 10.1109/ROBIO.2014.7090560.
[18] S. Yang, G. Fan, L. Bai, R. Li, and D. Li, “MGC-VSLAM: A Meshing-Based and Geometric Constraint VSLAM for Dynamic
Indoor Environments,” in IEEE Access, vol. 8, pp. 81007-81021, 2020, doi: 10.1109/ACCESS.2020.2990890.
[19] R. Mur-Artal, J. M. M. Montiel, and J. D. Tardós, "ORB-SLAM: A Versatile and Accurate Monocular SLAM System," IEEE
Transactions on Robotics, vol. 31, no. 5, pp. 1147-1163, Oct. 2015, doi: 10.1109/TRO.2015.2463671.
[20] F. Endres, J. Hess, J. Sturm, D. Cremers, and W. Burgard, “3-D mapping with an RGB-D camera,” IEEE Transactions on
Robotics, vol. 30, no. 1, pp. 177–187, 2014, doi: 10.1109/TRO.2013.2279412.
[21] M. Z. Zakaria, S. Mutalib, S. A. Rahman, S. J. Elias, and A. Z. Shahuddin, "Solving RFID mobile reader path problem with
optimization algorithms," Indonesian Journal of Electrical Engineering and Computer Science, vol. 13, no. 3, pp. 1110-1116,
2019, doi: 10.11591/ijeecs.v13.i3.pp1110-1116.
[22] J. Platonov, H. Heibel, P. Meier, and B. Grollmann, “A mobile markerless AR system for maintenance and repair,” 2006
IEEE/ACM International Symposium on Mixed and Augmented Reality, 2006, pp. 105-108, doi: 10.1109/ISMAR.2006.297800.
[23] I. Y. -H. Chen, B. MacDonald, and B. Wünsche, “Markerless Augmented Reality for robots in unprepared environments.”
Australasian Conference on Robotics and Automation. ACRA08, 2008. [Online]. Available:
https://www.araa.asn.au/acra/acra2008/papers/pap121s1.pdf
[24] L. Zhang, D. Chen, and W. Liu, "Point-plane SLAM based on line-based plane segmentation approach," 2016 IEEE International
Conference on Robotics and Biomimetics (ROBIO), 2016, pp. 1287-1292, doi: 10.1109/ROBIO.2016.7866503.
[25] R. Jamiruddin, A. O. Sari, J. Shabbir, and T. Anwer, “RGB-Depth SLAM Review,” ArXiv, 2018,
doi: 10.48550/arXiv.1805.07696.
BIOGRAPHIES OF AUTHORS
Mohd Alif Izhar completed his bachelor's degree at the Faculty of Computer and
Mathematical Sciences, Universiti Teknologi MARA (UiTM) Shah Alam, with a Bachelor of
Information Technology (Hons.) in Intelligent Systems Engineering. He can be contacted at
email: [email protected].