Module 4 - Introduction to Augmented Reality
Augmented Reality
History of AR
Early Foundations (1960s-1970s)
• 1968: Ivan Sutherland, often considered the father of
computer graphics, developed the first head-mounted
display system, known as "The Sword of Damocles".
• It was a primitive AR system that required the user to be
strapped into a mechanical rig.
• The graphics were simple wireframe models, but this
marked the beginning of AR technology.
• 1974: Myron Krueger introduced the concept of "artificial
reality" with his work on "Videoplace."
• While not AR by today's standards, it laid the groundwork
for interactive environments where users could manipulate
virtual objects in real time.
Cont.,
Development and Expansion (1980s-1990s)
• 1980: Steve Mann, often credited with being the first "cyborg,"
developed a wearable computer that could overlay simple text and
graphics on his vision. This was one of the earliest examples of a
personal, wearable AR system.
• 1990: The term "Augmented Reality" was coined by Tom Caudell, a
Boeing researcher, to describe a digital display used to guide workers
during the assembly of aircraft. This was a significant step toward
understanding AR as a tool for practical applications.
• 1992: The U.S. Air Force's Armstrong Laboratory developed one of
the first functional AR systems, known as the Virtual Fixtures system.
It was designed to assist with complex tasks, such as surgical
operations or aircraft maintenance.
• 1998: AR started gaining attention in the entertainment and sports
industries. The first major application was the yellow "first down" line
superimposed on the field during live football broadcasts, providing
viewers with enhanced understanding of the game.
Cont.,
Rise of AR in the 21st Century (2000s-2010s)
• 2000: The ARToolKit, an open-source library for creating AR
applications, was released. It allowed developers to easily create
AR environments, making AR more accessible.
• 2008: The first AR-enabled smartphone apps were released,
leveraging the cameras and sensors in modern smartphones to
overlay information and graphics on the real world. Wikitude and
Layar were some of the earliest examples.
• 2012: Google announced Google Glass, a wearable AR device
that aimed to bring AR to the masses. Though it faced significant
challenges and criticism, it played a crucial role in raising public
awareness of AR technology.
• 2016: The release of Pokémon GO marked a watershed moment
for AR, demonstrating its potential for mainstream entertainment.
The game used smartphones to superimpose virtual creatures
over the real world, becoming a global phenomenon.
Cont.,
Modern Developments (2020s-Present)
• 2020: AR saw increased adoption in various industries,
including retail, healthcare, and education. For example,
IKEA's AR app allows users to visualize furniture in their
homes before purchasing.
• 2022: AR glasses and headsets, such as Microsoft's
HoloLens and Magic Leap, have continued to evolve,
offering more advanced features for professional and
enterprise use. These devices are used in fields like
manufacturing, medical training, and remote collaboration.
• Present: The integration of AR into social media platforms,
like Snapchat and Instagram, has made AR a part of
everyday life for millions. Companies like Apple and
Facebook are investing heavily in AR, indicating that the
technology will play a significant role in the future.
Selection of AR Platform
Selecting an AR platform depends on several factors,
including the specific use case, the target audience, the
devices you intend to support, and your development
expertise.
Purpose and Use Case
• Entertainment (e.g., gaming, social media filters):
• ARKit (iOS): Apple's AR platform is optimized for iOS devices
and is well-suited for creating interactive, immersive
experiences.
• ARCore (Android): Google's AR platform works across a wide
range of Android devices and is comparable to ARKit in terms of
capabilities.
• Snap Lens Studio: Ideal for creating AR filters and lenses for
Snapchat.
• Facebook Spark AR: Best for creating AR effects for Facebook
and Instagram.
Cont.,
Retail (e.g., virtual try-ons, product visualization):
• Vuforia: A versatile AR platform with strong support for image and
object recognition, making it suitable for product visualization.
• 8th Wall: Works across both iOS and Android, with a focus on web-
based AR, which is ideal for retail apps where accessibility is key.
Industrial (e.g., training, remote assistance, manufacturing):
• Microsoft HoloLens/MRTK (Mixed Reality Toolkit): Ideal for
enterprise-level applications requiring spatial understanding and 3D
visualization.
• PTC Vuforia Studio: Specifically designed for industrial use cases,
offering integration with CAD data and IoT.
• Magic Leap: Suitable for complex industrial and healthcare
applications due to its advanced spatial computing capabilities.
Education (e.g., interactive learning, simulations):
• Zappar: A good option for educators looking to create simple AR
experiences without requiring extensive coding knowledge.
• Unity with AR Foundation: Offers flexibility to create a wide range of
educational content, from simple 2D overlays to complex 3D simulations.
Target Devices
Mobile Devices (smartphones, tablets):
• ARKit: Best for targeting iOS devices (iPhones, iPads).
• ARCore: Best for Android devices, but also has cross-platform
capabilities.
• 8th Wall: Ideal for web-based AR experiences that need to work across
different devices without requiring app installation.
Wearables (AR glasses, headsets):
• Microsoft HoloLens: Ideal for enterprise and industrial applications
with hands-free interaction.
• Magic Leap: Good for immersive experiences in both enterprise and
entertainment.
• Google Glass Enterprise Edition: Tailored for industrial and
professional use cases where hands-free, lightweight AR is required.
Web-based AR:
• 8th Wall: Allows you to build AR experiences that can be accessed
directly through a browser, eliminating the need for an app download.
• Zappar: Provides a balance between ease of use and functionality, also
supporting web-based AR.
Development Environment and
Expertise
Unity with AR Foundation:
• If you have experience with Unity, AR Foundation provides a
powerful framework that supports both ARKit and ARCore,
making it a great option for cross-platform development.
Unreal Engine:
• Suitable for developers looking to create high-fidelity AR
experiences, especially in gaming or high-end visualizations.
Web-based AR Tools:
• A-Frame or Three.js: For web developers familiar with HTML
and JavaScript, these frameworks offer a way to create AR
experiences that run in browsers.
Easy-to-use Platforms:
• Zappar or Lens Studio: These platforms are more user-
friendly and require less technical expertise, making them
suitable for creators without a strong programming
background.
General Information
ARKit/ARCore: Best for mobile-based AR experiences on
iOS and Android.
Unity with AR Foundation: Great for cross-platform
development with a need for high flexibility.
Microsoft HoloLens/MRTK: Ideal for enterprise and
industrial use cases with spatial computing needs.
8th Wall/Zappar: Excellent for web-based AR, with easy
access and no need for app downloads.
Vuforia: Versatile for various applications, especially in
industrial settings.
Unreal Engine: Suitable for high-fidelity, immersive AR
experiences.
Integrating Hardware and Software
Integrating hardware and software in Augmented Reality
(AR) is a complex but essential process to ensure that AR
experiences are immersive, responsive, and effective.
The integration involves aligning the capabilities of the
hardware (such as sensors, cameras, and processors) with
the software (such as AR frameworks, applications, and
algorithms) to deliver seamless AR interactions.
Understanding the Hardware
Components
Cameras:
• Cameras are crucial for capturing the real world to overlay
digital content. High-resolution cameras improve the quality of
the AR experience.
• Depth-sensing cameras (like LiDAR on iPhones) allow for more
accurate placement of virtual objects in 3D space, enhancing
realism.
Sensors:
• IMU (Inertial Measurement Unit): Combines accelerometers,
gyroscopes, and sometimes magnetometers to track the
orientation and movement of the device.
• GPS: Essential for location-based AR, enabling applications to
overlay information based on the user’s geographical position.
• Depth Sensors: Used to understand the distance between
objects, enabling more sophisticated interaction with the virtual
and real world.
Cont.,
Processors:
• The choice of processor impacts the performance of AR
applications, especially in terms of real-time rendering and
object recognition.
• Modern mobile devices often include specialized chips (like
Apple’s A-series chips with Neural Engines) designed to handle
AR tasks efficiently.
Display Technology:
• Smartphones/Tablets: Utilize the existing screen for AR overlay,
but have limitations in field of view and user immersion.
• AR Glasses/Headsets (e.g., HoloLens, Magic Leap): Provide
more immersive experiences by projecting digital content
directly into the user’s view, allowing for hands-free interaction.
Selecting the Appropriate AR Software
AR SDKs (Software Development Kits):
• ARKit (iOS) and ARCore (Android): These SDKs
leverage the device’s hardware to create AR experiences.
They handle tasks such as motion tracking, environmental
understanding, and light estimation.
• Vuforia: Offers robust image and object recognition,
suitable for industrial and enterprise applications.
• MRTK (Mixed Reality Toolkit) for HoloLens: Provides
tools for developing applications on Microsoft’s HoloLens,
integrating with its unique hardware features like hand
tracking and spatial mapping.
Cont.,
Game Engines:
• Unity with AR Foundation: Facilitates cross-platform
development by providing a unified API that works with ARKit
and ARCore.
• Unreal Engine: Known for high-quality graphics, it’s suitable for
complex and visually intensive AR applications.
Custom Software Development:
• Sometimes, off-the-shelf AR SDKs may not meet all your needs.
In such cases, custom software development, possibly using
libraries like OpenCV for computer vision, may be necessary.
Integration Process
Calibration and Synchronization:
• Ensuring that the hardware (e.g., cameras and sensors) and
software (e.g., AR frameworks) are properly calibrated is crucial.
• This involves aligning the software’s understanding of the physical
world with the data captured by the hardware.
• Calibration often includes synchronizing the camera’s field of view
with the software’s rendering system to ensure that virtual objects
appear correctly in the user’s view.
Real-Time Data Processing:
• AR experiences require real-time processing of data from cameras
and sensors to render virtual objects accurately.
• The software must efficiently process this data to maintain a high
frame rate and low latency.
• Machine learning algorithms, often run on specialized hardware
(like GPUs or AI accelerators), can be used for tasks such as object
recognition or environment understanding.
Cont.,
Environmental Understanding:
• The software needs to interpret the physical environment to
allow for realistic interaction between virtual and real-world
objects.
• This includes detecting surfaces (planes), understanding light
conditions, and recognizing objects.
• Advanced AR experiences, particularly in wearables, rely on
spatial mapping and environmental meshing to create a more
immersive experience.
Rendering and Visualization:
• The rendering engine needs to ensure that virtual objects are
displayed with correct perspective, occlusion, and lighting
relative to the real world.
• This might involve using shaders to simulate realistic lighting
effects or adjusting the transparency of virtual objects to blend
them naturally with the physical environment.
Cont.,
User Interaction:
• Integrating user input, whether through touch (on mobile
devices), gestures (on AR glasses), or voice commands, is
essential for interactive AR applications.
• The software must process these inputs in real time, ensuring
that the system responds intuitively to user actions.
Optical & Inertial Calibration
Optical and inertial calibration are crucial steps in ensuring
that an Augmented Reality (AR) system functions
accurately and reliably.
These calibrations align the optical sensors (e.g., cameras)
and inertial sensors (e.g., accelerometers, gyroscopes) so
that the AR software can correctly interpret the real world
and overlay virtual objects seamlessly.
Understanding the Components
Optical Sensors:
• Typically, these are cameras that capture the visual information
from the real world.
• They are responsible for features like object recognition, marker
detection, and environment mapping.
• In some advanced AR devices, optical sensors might include
depth sensors like LiDAR, which capture depth information to
create 3D maps of the environment.
Inertial Sensors:
• These include accelerometers, gyroscopes, and sometimes
magnetometers, which track the device's orientation,
movement, and sometimes even its absolute heading (compass
direction).
• These sensors provide data on the device's motion, which is
crucial for stabilizing the AR content relative to the user’s
movements.
Why Calibration is Important?
Alignment:
• Ensures that the data from the optical and inertial sensors are
aligned correctly.
• Without proper calibration, the virtual objects may appear to
drift, lag, or be misaligned with the real world.
Accuracy:
• Calibration improves the accuracy of pose estimation, which is
critical for placing virtual objects accurately in 3D space.
Stability:
• Proper calibration reduces jitter and improves the stability of
AR content, making the experience more comfortable and
believable for users.
Optical Calibration
Optical calibration focuses on ensuring that the camera(s)
used in the AR system accurately capture and interpret the
visual data.
• Intrinsic Calibration:
• Purpose: To determine the internal parameters of the camera,
such as focal length, optical center, and lens distortion
coefficients.
• Process:
• Capture multiple images of a known calibration pattern, such as a
checkerboard, from different angles and distances.
• Use calibration software (like OpenCV’s calibration tools) to estimate
the camera’s intrinsic parameters.
• These parameters are then used to correct lens distortions and to
ensure that the images captured by the camera accurately represent
the physical world.
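The following is a minimal sketch of this intrinsic-calibration workflow using OpenCV's Python calibration tools; the checkerboard dimensions, square size, and image paths are placeholder assumptions, not values from this module.

```python
# Minimal intrinsic calibration sketch using OpenCV (pip install opencv-python).
# Board dimensions, square size, and image paths are illustrative assumptions.
import glob
import cv2
import numpy as np

CORNERS = (9, 6)          # inner corners of the checkerboard (columns, rows)
SQUARE_SIZE = 0.025       # square edge length in metres (assumed)

# 3D coordinates of the board corners in the board's own frame (z = 0 plane)
objp = np.zeros((CORNERS[0] * CORNERS[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:CORNERS[0], 0:CORNERS[1]].T.reshape(-1, 2) * SQUARE_SIZE

obj_points, img_points = [], []
for path in glob.glob("calib_images/*.jpg"):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, CORNERS)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001))
        obj_points.append(objp)
        img_points.append(corners)

# Estimate focal length, optical centre, and distortion coefficients
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("Reprojection error:", rms)
print("Camera matrix:\n", K)

# Undistort a frame so the image better matches the physical scene
undistorted = cv2.undistort(cv2.imread("calib_images/sample.jpg"), K, dist)
```

The resulting camera matrix and distortion coefficients are exactly the intrinsic parameters the AR renderer needs to map between image pixels and 3D rays.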
Cont.,
Extrinsic Calibration:
• Purpose: To determine the position and orientation of the
camera relative to other sensors (like depth sensors or
another camera in a stereo setup).
• Process:
• In a multi-camera setup, capture images of the same scene
simultaneously from different cameras.
• Use the captured data to calculate the relative position and
orientation between the cameras.
• This step is crucial for applications that rely on stereo vision or
require accurate spatial mapping.
Cont.,
Stereo Camera Calibration:
• Purpose: Used when an AR system employs two cameras
to capture stereoscopic images, allowing for depth
perception.
• Process:
• Perform intrinsic calibration on both cameras individually.
• Use stereo calibration techniques to determine the relative
orientation and position between the two cameras.
• This calibration allows the system to calculate the depth
information by comparing the two images and finding the
disparities.
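A minimal stereo-calibration sketch with OpenCV, assuming the per-camera intrinsics (K1, d1, K2, d2) and the matched checkerboard corner lists have already been gathered as in the intrinsic example above:

```python
import cv2

def calibrate_stereo(obj_points, img_points_left, img_points_right,
                     K1, d1, K2, d2, image_size):
    # Keep the already-estimated per-camera intrinsics fixed and solve only
    # for the relative pose between the two cameras.
    flags = cv2.CALIB_FIX_INTRINSIC
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-5)
    rms, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
        obj_points, img_points_left, img_points_right,
        K1, d1, K2, d2, image_size, criteria=criteria, flags=flags)

    # R, T: rotation and translation of the right camera relative to the left.
    # stereoRectify turns these into rectification/projection maps used later
    # when computing disparity (and hence depth).
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(
        K1, d1, K2, d2, image_size, R, T)
    return R, T, Q
```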
Inertial Calibration
Inertial calibration ensures that the accelerometer,
gyroscope, and magnetometer (if present) provide accurate
and consistent data.
• Accelerometer Calibration:
• Purpose: To ensure that the accelerometer correctly measures
acceleration in all three axes.
• Process:
• Place the device on a flat surface and ensure that it is perfectly still.
• Record the accelerometer readings for all three axes and compare
them to the expected values (e.g., the gravity vector should read
approximately 9.81 m/s² along the axis pointing downward).
• Adjust the sensor’s scale and offset values to match the expected
output.
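As a rough illustration, the NumPy sketch below estimates a per-axis offset from stationary readings; the sample data is fabricated, and a full calibration would also solve for per-axis scale using readings taken in several orientations.

```python
# Accelerometer offset (bias) sketch in NumPy. `samples` is an (N, 3) array of
# raw readings in m/s^2 captured while the device lies flat and still, so the
# expected reading is (0, 0, +9.81) with the z axis vertical.
import numpy as np

GRAVITY = 9.81

def estimate_accel_bias(samples: np.ndarray) -> np.ndarray:
    expected = np.array([0.0, 0.0, GRAVITY])
    return samples.mean(axis=0) - expected   # per-axis offset

def correct_accel(reading: np.ndarray, bias: np.ndarray) -> np.ndarray:
    return reading - bias

# Example with fabricated stationary data (small noise plus a constant offset)
rng = np.random.default_rng(0)
samples = np.array([0.12, -0.05, GRAVITY + 0.3]) + rng.normal(0, 0.02, (500, 3))
bias = estimate_accel_bias(samples)
print("Estimated bias:", bias)   # roughly [0.12, -0.05, 0.30]
```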
Cont.,
Gyroscope Calibration:
• Purpose: To ensure that the gyroscope accurately
measures rotational rates without drift.
• Process:
• Keep the device stationary and record the gyroscope readings.
• Since the device is not rotating, the gyroscope should read
near-zero angular velocity. Any consistent deviation (bias) is
corrected through software by subtracting this bias from future
readings.
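A corresponding bias-estimation sketch for the gyroscope (NumPy, with fabricated stationary data):

```python
# Gyroscope bias sketch. `stationary` is an (N, 3) array of angular-rate
# readings (rad/s) logged while the device is completely still, so the true
# angular velocity is zero on every axis.
import numpy as np

def estimate_gyro_bias(stationary: np.ndarray) -> np.ndarray:
    return stationary.mean(axis=0)            # any non-zero mean is bias

def correct_gyro(reading: np.ndarray, bias: np.ndarray) -> np.ndarray:
    return reading - bias                     # subtract bias from live data

rng = np.random.default_rng(1)
stationary = np.array([0.002, -0.004, 0.001]) + rng.normal(0, 0.0005, (1000, 3))
bias = estimate_gyro_bias(stationary)
print("Gyro bias (rad/s):", bias)
```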
Cont.,
Magnetometer Calibration (if applicable):
• Purpose: To ensure that the magnetometer correctly
measures the Earth's magnetic field, which is crucial for
determining the device's orientation relative to the Earth's
magnetic north.
• Process:
• Perform a figure-eight motion with the device to recalibrate the
magnetometer.
• The software adjusts for any soft iron (distortions due to
surrounding metals) and hard iron (permanent magnetization)
effects by recalibrating the sensor’s output.
• This calibration is often done periodically or when the device
enters a new environment.
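A simplified hard-iron/soft-iron correction can be sketched as below; production systems typically fit a full ellipsoid rather than this min/max approximation. `samples` is assumed to be the readings gathered during the figure-eight motion.

```python
# Magnetometer hard/soft-iron sketch (NumPy). `samples` is an (N, 3) array of
# field readings collected while sweeping the device through a figure-eight.
import numpy as np

def calibrate_magnetometer(samples: np.ndarray):
    # Hard-iron offset: centre of the measured min/max range on each axis
    offset = (samples.max(axis=0) + samples.min(axis=0)) / 2.0
    # Crude soft-iron correction: rescale each axis so the response becomes
    # roughly spherical (ellipsoid fitting is the more accurate approach)
    radii = (samples.max(axis=0) - samples.min(axis=0)) / 2.0
    scale = radii.mean() / radii
    return offset, scale

def correct_mag(reading, offset, scale):
    return (reading - offset) * scale
```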
Tracking
Tracking in Augmented Reality (AR) refers to the
technology and techniques used to align virtual objects
with the real world in a way that makes them appear stable
and interactable.
Effective tracking ensures that virtual elements remain
consistent with real-world objects, positions, and
movements. Here are the primary types of tracking used in
AR:
• Marker-based Tracking
• Markerless Tracking (SLAM - Simultaneous Localization and
Mapping)
• GPS/Location-based Tracking
• Object Recognition-based Tracking
• Body and Face Tracking
• Sensor-based Tracking
Cont.,
Marker-based Tracking:
• Uses predefined images, patterns, or QR codes (called markers)
placed in the environment.
• The AR system recognizes these markers through the device’s
camera, and virtual objects are overlaid on or around the
marker.
• Use cases: Interactive product displays, AR games, and art
exhibitions.
Markerless Tracking (SLAM - Simultaneous Localization
and Mapping):
• Relies on the device’s sensors (camera, accelerometer, and
gyroscope) to create a map of the environment.
• SLAM maps the space in real-time, identifying surfaces and
objects where virtual content can be placed.
• Use cases: Indoor navigation, AR games like Pokémon Go, and
home design apps.
Cont.,
GPS/Location-based Tracking:
• Uses GPS data from the device to place virtual objects at
specific geographical locations.
• This is often combined with accelerometer and compass data for
better accuracy.
• Use cases: Location-based AR apps, tourism guides, and social
AR apps that show content in real-world locations.
Object Recognition-based Tracking:
• Recognizes real-world objects and places virtual objects in
relation to them.
• This can be based on specific physical objects, such as furniture
or art, or general categories of objects.
• Use cases: Retail apps (e.g., trying on virtual furniture in a
home), education apps that provide additional information when
objects are scanned.
Cont.,
Body and Face Tracking
• Tracks facial features or the entire body to overlay digital
content, such as virtual masks or animations, onto a person in
real-time.
• Use cases: Social media filters, AR makeup try-ons, and fitness
or motion-based apps.
Sensor-based Tracking
• Uses data from external sensors, like depth sensors (e.g.,
LiDAR), to understand the physical environment in 3D, allowing
more accurate placement and interaction of virtual objects.
• Use cases: Professional AR applications like architecture,
industrial maintenance, or medical training.
Challenges and Advances:
Lighting conditions: AR tracking can be sensitive to
lighting; poor lighting can affect camera-based tracking.
Occlusion: Properly occluding virtual objects behind real-
world objects remains a challenge in achieving realism.
Environmental complexity: The more detailed or dynamic
an environment is, the more sophisticated the tracking
system needs to be to avoid jitter or misplacement.
AR Computer Vision
AR Computer Vision is a subset of computer vision technology
applied to Augmented Reality (AR).
It enables devices to interpret and understand the physical
environment in real-time, allowing virtual objects to be
overlaid seamlessly onto the real world.
This integration is essential for creating immersive AR
experiences.
Image Recognition
• Identifies and classifies specific images or patterns from the camera
feed. This could include logos, QR codes, landmarks, or any
predefined objects.
• Using techniques like feature detection (SIFT, SURF, ORB), the
system matches key points in the image with a database to
recognize the object or image.
• Use cases: AR shopping (recognizing products), educational apps
(identifying objects), AR games, and content linked to brand logos.
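As an illustration of feature-based matching, the sketch below uses ORB (one of the detectors named above) with OpenCV; the image file names and the match-count threshold are placeholder assumptions.

```python
# Image-recognition sketch: ORB features + brute-force matching (OpenCV).
# In an AR app, `frame` would come from the live camera feed.
import cv2

reference = cv2.imread("reference_logo.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("camera_frame.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp_ref, desc_ref = orb.detectAndCompute(reference, None)
kp_frame, desc_frame = orb.detectAndCompute(frame, None)

# Hamming distance suits ORB's binary descriptors; crossCheck keeps only
# mutually best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(desc_ref, desc_frame), key=lambda m: m.distance)

# Crude decision rule: enough good matches means the reference image is
# present, and the matched keypoints tell us where to anchor AR content.
good = [m for m in matches if m.distance < 50]
print(f"{len(good)} good matches; recognized: {len(good) > 25}")
```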
Cont.,
Object Detection and Recognition
• Identifies and locates specific objects within a scene, such as furniture,
human faces, or even hand gestures.
• By applying machine learning models (like Convolutional Neural
Networks - CNNs) trained on specific objects, the system can classify
and locate objects in the camera’s view. This is used to place virtual
objects in relation to real-world items.
• Use cases: AR filters (detecting faces for masks), virtual try-ons
(clothes, makeup), object-based information overlays in educational
apps, and AR in robotics.
Simultaneous Localization and Mapping (SLAM)
• Allows devices to map an unknown environment while tracking their
position in it.
• SLAM combines data from cameras and inertial sensors (gyroscope,
accelerometer) to continuously update a map of the environment while
simultaneously tracking the camera's position within it. This is essential
for markerless AR.
• Use cases: AR navigation, virtual object placement in physical spaces
(e.g., furniture apps), interactive AR games, and spatial understanding.
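A full SLAM system is beyond a single snippet, but the frame-to-frame tracking step of a monocular pipeline can be sketched as below with OpenCV, assuming two consecutive grayscale frames and a calibrated camera matrix K.

```python
# One visual-odometry step of a monocular SLAM-style pipeline (OpenCV).
import cv2
import numpy as np

def relative_pose(prev_gray, curr_gray, K):
    orb = cv2.ORB_create(2000)
    kp1, d1 = orb.detectAndCompute(prev_gray, None)
    kp2, d2 = orb.detectAndCompute(curr_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Essential matrix with RANSAC rejects outlier matches
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    # Recover camera rotation R and (unit-scale) translation t between frames;
    # a complete SLAM system would fuse this with IMU data and a growing map.
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```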
Cont.,
Depth Sensing
• Determines the distance of objects from the camera, enabling
3D spatial awareness.
• Techniques like stereo vision (using two cameras), structured
light (projecting a known pattern onto a surface), or time-of-
flight (using light pulses) are used to measure depth.
• Newer AR devices like iPhones with LiDAR sensors can measure
depth more accurately, allowing virtual objects to interact more
realistically with the physical world.
• Use cases: Accurate virtual object placement, occlusion
(making virtual objects disappear behind real ones), AR
measuring tools, and immersive games.
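For the stereo-vision case, depth follows from disparity as depth = fx × baseline / disparity. A minimal sketch with OpenCV, assuming rectified left/right images plus the focal length and baseline from stereo calibration:

```python
# Stereo depth sketch: depth = fx * baseline / disparity.
import cv2
import numpy as np

def depth_from_stereo(left_gray, right_gray, fx_pixels, baseline_m):
    stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128,
                                   blockSize=5)
    # SGBM returns fixed-point disparity scaled by 16
    disparity = stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0
    disparity[disparity <= 0] = np.nan            # invalid matches
    return fx_pixels * baseline_m / disparity     # depth map in metres
```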
Cont.,
Surface Detection and Plane Detection
• Identifies horizontal and vertical surfaces where virtual objects can be
anchored, such as tables, floors, or walls.
• The system looks for patterns in pixel movement (optical flow) and
texture in the environment to detect flat surfaces. Once a surface is
detected, virtual objects can be placed with stability.
• Use cases: Furniture apps, AR art installations, and object placement in
AR games.
Pose Estimation
• Determines the orientation and position of an object or person relative
to the camera.
• Using key point detection and geometric algorithms, pose estimation
identifies how an object is positioned in 3D space relative to the camera.
• This is used for aligning virtual objects with real ones or for creating
realistic interactions with the user.
• Use cases: AR gaming (aligning weapons or tools in first-person
perspective), virtual avatars, motion tracking, and gesture-based
interactions.
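A minimal pose-estimation sketch using OpenCV's solvePnP, assuming known 3D model points, their detected 2D projections, and a calibrated camera:

```python
# Pose estimation with solvePnP. `object_points_3d` are the known 3D positions
# of features in the object's own frame; `image_points_2d` are their detected
# pixel locations; K and dist_coeffs come from camera calibration.
import cv2
import numpy as np

def estimate_pose(object_points_3d, image_points_2d, K, dist_coeffs):
    ok, rvec, tvec = cv2.solvePnP(object_points_3d, image_points_2d,
                                  K, dist_coeffs)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    # R and tvec describe where the object sits relative to the camera, which
    # is what the renderer needs to align a virtual model with it.
    return R, tvec
```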
Cont.,
Feature Detection and Tracking
• Detects and tracks key visual features (edges, corners, etc.) in the
real-world environment to maintain consistent positioning of virtual
elements.
• Algorithms like Harris Corner Detection, Shi-Tomasi, or FAST are
used to detect stable visual features, which can then be tracked over
time to anchor AR objects in place.
• Use cases: AR annotations on real-world objects, interactive AR
games, and tracking moving objects in AR experiences.
Occlusion Handling
• Ensures that virtual objects appear behind real-world objects when
appropriate, creating a more realistic and immersive experience.
• Using depth information from depth sensors or advanced algorithms,
the AR system can determine when a real object is closer to the
camera than a virtual one and thus should occlude the virtual object.
• Use cases: Realistic AR scenes where virtual characters walk behind
real objects, or AR measuring apps where virtual measurements
disappear behind physical objects.
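Conceptually, occlusion reduces to a per-pixel depth test. A NumPy sketch, assuming a real-world depth map and the renderer's virtual depth buffer are available in the same units:

```python
# Occlusion sketch: hide virtual pixels that lie behind real geometry.
# `real_depth` (from a depth sensor) and `virtual_depth` (from the renderer's
# depth buffer) are per-pixel distances in metres; `virtual_rgba` is the
# rendered virtual layer with an alpha channel.
import numpy as np

def apply_occlusion(virtual_rgba, virtual_depth, real_depth):
    occluded = real_depth < virtual_depth          # real surface is closer
    out = virtual_rgba.copy()
    out[occluded, 3] = 0                           # make those pixels transparent
    return out
```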
Cont.,
Scene Understanding and Semantic Segmentation
• Identifies and categorizes different parts of the environment
(e.g., distinguishing between the floor, walls, furniture, people,
and objects).
• Machine learning models, particularly CNNs or vision
transformers, are used to segment the camera feed into
different object classes (e.g., sky, road, person, car). This
enables more advanced interactions with the environment.
• Use cases: AR city navigation (placing virtual signs and
information on roads and buildings), interactive games, and
contextual AR overlays for educational purposes.
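Purely as an illustration of semantic segmentation (not what any AR SDK does internally), the sketch below runs a pretrained DeepLabV3 model from torchvision on a single frame; on-device AR pipelines use much lighter models, and the file name is a placeholder.

```python
# Semantic-segmentation sketch with a pretrained DeepLabV3 (torchvision >= 0.13).
import torch
import torchvision
from torchvision import transforms
from PIL import Image

model = torchvision.models.segmentation.deeplabv3_resnet50(weights="DEFAULT")
model.eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("camera_frame.jpg").convert("RGB")   # placeholder frame
with torch.no_grad():
    out = model(preprocess(img).unsqueeze(0))["out"][0]
labels = out.argmax(0)          # per-pixel class id (e.g. person, sofa, ...)
print("Classes present:", labels.unique().tolist())
```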
Mapping
Mapping in Augmented Reality (AR) refers to the process of
creating a digital representation of the physical environment,
allowing virtual objects to be overlaid, interact, and stay anchored
in the real world.
This is critical for immersive AR experiences, as it enables devices
to understand and respond to the geometry, layout, and objects in
a scene.
Plane Detection
• Plane detection is used to find flat surfaces (both horizontal and
vertical) in the environment, such as floors, tables, and walls. Once
detected, these planes serve as anchors for virtual objects.
• Using visual cues, such as parallel lines, textures, and depth
information, the AR system identifies flat surfaces in the environment.
Planes are essential for creating stable virtual content, ensuring objects
don’t “float” or move unrealistically.
• Use cases: Virtual furniture placement (e.g., IKEA Place), object
placement in AR gaming, and virtual art on walls.
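One common way to find a dominant plane in depth data is a RANSAC fit over a 3D point cloud. A NumPy sketch, assuming `points` comes from back-projecting a depth map:

```python
# Plane detection via RANSAC over an (N, 3) point cloud.
import numpy as np

def ransac_plane(points, iters=200, threshold=0.01):
    rng = np.random.default_rng(0)
    best_inliers, best_plane = None, None
    for _ in range(iters):
        p1, p2, p3 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue                             # degenerate (collinear) sample
        normal /= norm
        d = -normal.dot(p1)
        dist = np.abs(points @ normal + d)       # point-to-plane distances
        inliers = dist < threshold
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane, best_inliers              # plane n·x + d = 0 and its inliers
```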
Cont.,
Simultaneous Localization and Mapping (SLAM)
• SLAM is the foundational technology for most AR mapping
systems. It allows AR devices to build a map of the environment
in real time while simultaneously determining their own position
within it.
• This is done using data from the device’s camera and inertial
sensors (like accelerometers and gyroscopes).
• Process:
• Feature Detection: The system identifies key features in the
environment, such as edges, textures, and distinct objects.
• Tracking: These features are tracked across frames as the
device moves.
• Map Building: As the device moves through the environment,
it creates a 3D map of the surroundings, updating with new
features.
• Use cases: Markerless AR, indoor navigation, and AR gaming.
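The tracking step can be illustrated with Shi-Tomasi corners followed by pyramidal Lucas-Kanade optical flow (OpenCV); `prev_gray` and `curr_gray` are assumed to be consecutive grayscale camera frames.

```python
# Feature tracking sketch for the SLAM tracking step.
import cv2
import numpy as np

def track_features(prev_gray, curr_gray):
    corners = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                      qualityLevel=0.01, minDistance=7)
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                     corners, None)
    ok = status.ravel() == 1
    # Matched (old, new) positions of the features that survived tracking;
    # these correspondences feed pose estimation and map building.
    return corners[ok].reshape(-1, 2), next_pts[ok].reshape(-1, 2)
```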
Cont.,
Depth Mapping
• Depth mapping provides a sense of distance between the
camera and objects in the environment.
• Devices equipped with depth sensors, such as LiDAR or
structured light sensors, can create detailed 3D maps by
calculating the distance to surfaces.
• Depth is measured by projecting infrared light or other patterns
onto the environment and analyzing how the pattern deforms
when it hits surfaces. This allows for a precise understanding of
the 3D structure of the scene.
• Use cases: Accurate object placement, occlusion (where virtual
objects disappear behind real objects), AR measurement apps,
and enhancing realism in AR games.
Cont.,
Environment Understanding
• Beyond simple plane detection, some AR systems can classify
different parts of the environment (walls, ceilings, floors,
objects) and understand their semantic meaning.
• Using machine learning and computer vision, the AR system
analyzes the scene to distinguish between different objects and
surfaces.
• This can involve detecting furniture, appliances, or even
recognizing specific rooms.
• Use cases: Home design apps, where furniture is placed
intelligently, AR-based interior design, or advanced gaming
experiences that interact with the user’s surroundings.
Cont.,
3D Object Mapping and Reconstruction
• In some cases, AR needs to map not only surfaces but also complex
objects within the environment. This involves recognizing and
reconstructing 3D objects to place virtual elements correctly.
• The AR system scans the object from multiple angles, generating a
3D model. This can be done in real time, allowing for dynamic
interactions between virtual objects and real-world objects.
• Use cases: AR games where virtual objects interact with physical
objects (e.g., a virtual character jumping on a real sofa), or
educational apps where objects can be scanned and enhanced with
virtual annotations.
Cont.,
Marker-based Mapping
• Marker-based AR uses predefined images, QR codes, or fiducial
markers as anchors to position virtual objects in the physical
world.
• These markers are scanned by the camera and serve as a
reference point for positioning the virtual content.
• When the AR system detects a marker, it calculates the marker’s
position, orientation, and scale in 3D space. This allows virtual
objects to be placed accurately relative to the marker.
• Use cases: Interactive product displays, museum guides, or
simple AR games.
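A minimal marker-detection sketch using OpenCV's ArUco module (assuming opencv-contrib-python 4.7 or later and calibration values K/dist); the marker size is an illustrative assumption.

```python
# Detect an ArUco marker and recover its pose relative to the camera.
import cv2
import numpy as np

MARKER_SIZE = 0.05   # marker edge length in metres (assumed)

def detect_marker_pose(gray, K, dist):
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    corners, ids, _ = detector.detectMarkers(gray)
    if ids is None:
        return None

    # 3D corners of the marker in its own frame (z = 0 plane, centred at origin),
    # ordered to match ArUco's top-left, top-right, bottom-right, bottom-left.
    half = MARKER_SIZE / 2
    obj = np.array([[-half,  half, 0], [ half,  half, 0],
                    [ half, -half, 0], [-half, -half, 0]], dtype=np.float32)
    ok, rvec, tvec = cv2.solvePnP(obj, corners[0].reshape(4, 2), K, dist)
    return (rvec, tvec) if ok else None   # marker pose relative to the camera
```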
Cont.,
Occlusion and Collision Mapping
• Occlusion occurs when virtual objects are hidden behind real-
world objects, enhancing realism.
• Collision mapping ensures that virtual objects don’t pass
through physical ones, providing more realistic interaction.
• Using depth sensors or computer vision, the system determines
the depth of real-world objects and calculates when a virtual
object should be obscured.
• Collision mapping involves detecting when virtual and real-
world objects should not occupy the same space.
• Use cases: Games where virtual characters interact with real-
world objects, AR-based design apps, or simulations that involve
both real and virtual elements.
Cont.,
Spatial Mapping
• Spatial mapping involves creating a detailed, comprehensive 3D
model of the environment.
• This allows AR applications to place virtual objects more
accurately, accounting for walls, furniture, and other obstacles.
• Using data from cameras, depth sensors, or even external
sensors, the system creates a mesh or point cloud
representation of the environment.
• This allows for more precise interactions between virtual
objects and the physical world.
• Use cases: Industrial AR (e.g., factory maintenance),
architecture and construction apps, and immersive AR
experiences in entertainment.
Cont.,
Cloud-based AR Mapping
• Cloud-based AR mapping allows multiple users to share the
same AR experience by synchronizing mapping data through the
cloud. This makes it possible for virtual objects to be placed in
shared physical spaces, accessible to different users.
• AR data, such as maps and anchors, are uploaded to the cloud
and shared across multiple devices. This enables multi-user AR
interactions, where everyone sees the same virtual content in
the same location.
• Use cases: Multi-user AR games, collaborative design apps, or
tourism apps with shared AR experiences.
Platforms
There are several platforms available for developing
Augmented Reality (AR) applications, each with unique
tools, capabilities, and target devices. These platforms
provide software development kits (SDKs), frameworks,
and APIs to help developers create immersive AR
experiences. Here’s an overview of the most prominent AR
platforms:
Cont.,
ARKit (Apple)
• Platform: iOS and iPadOS
• Overview: ARKit is Apple’s framework for building AR
applications on iPhones and iPads. It leverages the hardware of
Apple devices, such as the camera, accelerometer, and LiDAR
sensors (on supported devices), to create immersive AR
experiences.
• Use cases: AR games, virtual product try-ons, face filters, and
AR shopping apps.
Cont.,
• Key Features:
• World Tracking: Uses visual-inertial odometry for precise
tracking of device movement.
• Plane Detection: Identifies horizontal and vertical surfaces for
object placement.
• Light Estimation: Analyzes the ambient light in the
environment for realistic lighting on virtual objects.
• Face Tracking: Tracks over 50 facial blend shapes for real-time
face interactions.
• People Occlusion: Allows virtual objects to appear in front of
or behind people in the scene, improving realism.
• Location Anchors: Places AR objects at real-world locations
using geospatial data.
• Scene Reconstruction: Uses LiDAR sensors to create
accurate 3D maps of the environment.
Cont.,
ARCore (Google)
• Platform: Android (and some iOS support)
• Overview: ARCore is Google’s AR development platform
that enables augmented reality experiences on Android
devices, and it supports some iOS devices through cross-
platform development. ARCore uses a combination of the
device’s camera and sensors to understand the physical
world.
• Use cases: AR games, interactive retail apps, and
educational apps.
Cont.,
• Key Features:
• Motion Tracking: Tracks the device’s position relative to its
surroundings using visual-inertial odometry.
• Environmental Understanding: Detects flat surfaces
(horizontal and vertical planes) for placing virtual objects.
• Light Estimation: Measures the lighting in the environment to
adjust the appearance of virtual objects.
• Augmented Faces: Provides real-time face tracking for virtual
makeup, masks, and filters.
• Cloud Anchors: Enables multi-user shared AR experiences by
allowing different devices to place virtual objects in the same
physical space.
• Depth API: Provides more accurate depth perception for
realistic placement of virtual objects, including occlusion
effects.
Cont.,
Microsoft Mixed Reality (HoloLens and Windows
Mixed Reality)
• Platform: HoloLens, Windows PCs (with Mixed Reality
headsets)
• Overview: Microsoft’s Mixed Reality platform enables
developers to create AR (augmented) and MR (mixed
reality) experiences on HoloLens and Windows Mixed
Reality headsets. It’s focused on enterprise applications but
also supports creative development.
• Use cases: Industrial training, architectural visualization,
medical simulations, and collaborative design.
Cont.,
• Key Features:
• Spatial Mapping: Creates a detailed map of the environment
using sensors and cameras.
• Hand and Eye Tracking: Supports gesture-based interaction
and eye-tracking for natural user inputs.
• World Anchors: Places virtual objects that stay in the same
real-world position over time.
• Shared Experiences: Allows multiple users to interact with
the same virtual environment simultaneously.
• Holographic Rendering: Renders high-quality 3D holograms
that can interact with the physical world.
• Azure Spatial Anchors: Provides cloud-based anchors for
persistent, cross-device AR experiences.
Cont.,
Vuforia
• Platform: Android, iOS, Windows, and Unity
• Overview: Vuforia is one of the most popular AR SDKs that supports a
wide range of devices and platforms. It’s known for its strong capabilities
in object recognition and marker-based AR.
• Key Features:
• Marker-based Tracking: Uses specific images (markers) as reference points
for placing virtual objects.
• Object Recognition: Can recognize 3D objects and track them in real time.
• Model Targets: Allows virtual objects to be placed on physical objects like
machinery, cars, or furniture.
• Ground Plane: Detects surfaces without the need for markers, enabling
markerless AR experiences.
• Multi-Target Tracking: Can track multiple images and objects simultaneously.
• Cloud Recognition: Provides large-scale image recognition using cloud
storage.
• Use cases: AR product catalogs, industrial maintenance apps, retail and
marketing experiences, and educational tools.
Cont.,
Unity MARS (Mixed and Augmented Reality Studio)
• Platform: Unity (Android, iOS, AR/VR headsets)
• Overview: Unity MARS is a development environment within the
Unity game engine that simplifies AR and MR development. It offers
robust tools for creating experiences that interact with the real world.
• Key Features:
• Simulation View: Allows developers to simulate real-world environments
for testing AR experiences.
• Real-World Conditions: Enables content to adapt dynamically to physical
surroundings (e.g., adjusting size or placement based on the environment).
• Behavioral Controls: Provides AI-driven behaviors to make virtual objects
respond intelligently to real-world data.
• AR Templates: Prebuilt templates and assets for quick AR development.
• Cross-platform Integration: Supports ARCore, ARKit, and other major
AR SDKs.
• Use cases: Cross-platform AR games, interactive educational apps,
and industrial design tools.
Cont.,
Spark AR (Meta)
• Platform: Facebook, Instagram (iOS and Android)
• Overview: Spark AR is Meta’s platform for creating AR effects and
filters, primarily for Facebook and Instagram. It’s user-friendly,
allowing developers and creators to build AR experiences without
deep coding knowledge.
• Key Features:
• Face Tracking: Provides tools to create face filters and masks.
• Hand Tracking and Gesture Recognition: Allows interaction using
hand movements.
• Plane Tracking: Detects flat surfaces for placing 3D objects.
• Customizable Shaders: Enables developers to create custom effects and
materials.
• Target Tracking: Recognizes images or objects to trigger AR effects.
• Interactive Animations: Supports animations that respond to user
inputs, such as tapping or gestures.
• Use cases: Social media filters, interactive ads, and brand
marketing campaigns.
Cont.,
8th Wall
• Platform: Web-based (iOS, Android, desktop)
• Overview: 8th Wall is a leading WebAR platform, allowing developers
to create AR experiences that can be deployed on mobile and desktop
browsers. It’s known for its accessibility and scalability.
• Key Features:
• WebAR: No app download is required; experiences run directly in
browsers.
• Markerless AR: Supports plane detection for placing virtual objects in
real-world spaces.
• Face Effects: Includes facial tracking for creating face filters.
• World Effects: Allows the creation of world-locked content that stays
anchored in physical spaces.
• Multi-user AR: Supports collaborative AR experiences across devices.
• Analytics: Tracks user interactions for marketing and campaign insights.
• Use cases: Retail and marketing campaigns, interactive experiences
in museums, and educational AR.
Cont.,
Wikitude
• Platform: Android, iOS, Windows, and Unity
• Overview: Wikitude is an AR SDK that supports both marker-based
and markerless AR, known for its flexibility and wide range of
applications.
• Key Features:
• Instant Tracking: Enables markerless AR without the need for predefined
markers.
• Image and Object Recognition: Tracks and recognizes both 2D and 3D
objects.
• Geo-location AR: Uses location data to place AR objects based on GPS
coordinates.
• Cloud Recognition: Allows for large-scale object recognition via cloud
databases.
• SDK Compatibility: Works well with other SDKs like ARKit, ARCore, and
Unity.
• Use cases: Tourism apps, location-based AR, retail applications, and
industrial tools.
Cont.,
EasyAR
• Platform: Android, iOS, Windows, Unity
• Overview: EasyAR is another popular AR SDK focused on
providing accessible and robust tools for developing AR
experiences, particularly for mobile devices.
• Key Features:
• Surface Tracking: Detects and tracks flat surfaces for object
placement.
• Image Tracking: Recognizes images to trigger AR content.
• Cloud Recognition: Offers cloud-based image recognition.
• 3D Object Tracking: Tracks complex 3D objects in real time.
Lighting
Lighting in Augmented Reality (AR) plays a crucial role in
making virtual objects blend seamlessly into the real world.
Realistic lighting is essential for enhancing immersion,
creating believable AR experiences, and ensuring that
virtual elements interact naturally with their physical
surroundings.
Cont.,
Light Estimation
• Light estimation is a process where AR systems analyze the
lighting conditions in the physical environment to adjust the
appearance of virtual objects. This ensures that virtual objects
match the real-world light levels and color temperatures.
• Process:
• The AR device captures the ambient light using the camera and
other sensors.
• Based on this information, it calculates the intensity and color
temperature of the real-world light.
• The virtual objects are rendered with adjusted lighting,
including shadows, reflections, and highlights, to match the
scene.
• Use cases: AR filters, games, and virtual product
placement where lighting consistency is crucial for realism.
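ARKit and ARCore expose light estimates directly; purely as an illustration of the idea, the sketch below derives a crude intensity value and a warm/cool colour balance from a single camera frame with OpenCV.

```python
# Light-estimation sketch: ambient intensity and a rough colour-balance cue.
import cv2
import numpy as np

def estimate_ambient(frame_bgr):
    # Intensity: mean luminance, normalised to 0..1
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    intensity = float(gray.mean()) / 255.0

    # Colour balance: ratio of mean red to mean blue as a crude warmth cue
    b, g, r = [float(c.mean()) + 1e-6 for c in cv2.split(frame_bgr)]
    warmth = r / b          # > 1 warm (incandescent), < 1 cool (daylight/shade)
    return intensity, warmth

# The renderer would scale the virtual light's intensity by `intensity` and
# tint it according to `warmth` so virtual objects match the scene.
```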
Cont.,
Directional Lighting
• Directional lighting is used to simulate light coming from a
specific direction, such as sunlight or a spotlight. In AR, virtual
objects need to reflect the direction of the real-world light
sources to appear natural.
• Process:
• The AR system estimates the direction of the primary light
source in the real world (e.g., sunlight coming through a
window).
• Virtual objects are then rendered with highlights and shadows
corresponding to the estimated light direction.
• Use cases: Outdoor AR experiences where sunlight is the
primary light source, or indoor scenarios with directional
artificial lighting.
Cont.,
Shadows in AR
• Shadows are essential for grounding virtual objects in the real world.
Without shadows, AR objects may appear to "float" unnaturally. Shadows
in AR are dynamically generated based on the lighting conditions and
object placement.
• Types of Shadows:
• Hard Shadows: Sharp shadows that occur in direct lighting, like
sunlight.
• Soft Shadows: Blurred, diffused shadows that result from ambient or
scattered light.
• Techniques:
• Projected Shadows: The AR system estimates the position of the light
source and casts shadows of virtual objects onto the detected surfaces
(like floors or tables).
• Dynamic Shadows: Shadows adjust in real time as the virtual object or
light source moves.
• Use cases: AR games where characters move through an
environment, virtual furniture apps, and AR product demos.
Cont.,
Reflections
• Reflections in AR help virtual objects blend into the environment by
mirroring real-world lighting and objects. Accurately simulating
reflections ensures that virtual objects appear shiny or reflective as
they would in the real world.
• Techniques:
• Screen-space Reflections (SSR): Reflections are computed based
on what the camera can see, making virtual objects reflect the
surrounding environment.
• Environment Mapping: Pre-generated environment maps are used
to approximate reflections, especially in real-time AR experiences.
• Challenges: Accurately capturing real-time reflections from
dynamic surroundings is computationally expensive, especially
on mobile devices.
• Use cases: Virtual mirrors, AR fashion try-ons (e.g., sunglasses
or jewelry), and shiny virtual surfaces like cars or metallic
objects.
Cont.,
HDR Lighting
• High Dynamic Range (HDR) lighting is used to capture a
broader range of light intensities in the real world, from the
darkest shadows to the brightest highlights. HDR helps virtual
objects match the real world in environments with extreme
lighting variations.
• Process: The AR system captures HDR images or data and
uses it to adjust the brightness and contrast of virtual
objects, ensuring they fit into scenes with complex lighting.
• Use cases: Outdoor AR experiences with varying sunlight
intensities, or indoor scenes with strong artificial lighting.