Abhi-report
Rendering is the process of generating an image from input data. The input data can be 3D models, scene descriptions, data sets, etc. A software application or
component that performs rendering is called a rendering engine, render engine, rendering
system, or graphics engine. Rendering can produce images of scenes or objects defined using
coordinates in 3D space, seen from a particular viewpoint. Such 3D rendering uses knowledge
and ideas from optics, the study of visual perception, mathematics, and software engineering,
and it has applications such as video games, simulators, visual effects for films and television,
design visualization, and medical diagnosis. Realistic 3D rendering requires finding
approximate solutions to the rendering equation, which describes how light propagates in an
environment.
Solving this rendering equation gives us the colour value of each pixel, which is what we need
in order to turn the 3D scene into a 2D image.
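For reference, the rendering equation is usually written as follows (standard form from the literature):

L_o(x, \omega_o) = L_e(x, \omega_o) + \int_{\Omega} f_r(x, \omega_i, \omega_o) \, L_i(x, \omega_i) \, (\omega_i \cdot n) \, d\omega_i

where L_o is the outgoing radiance at surface point x in direction \omega_o, L_e is the emitted radiance, f_r is the BRDF (how the surface reflects light), L_i is the incoming radiance from direction \omega_i, n is the surface normal, and the integral runs over the hemisphere \Omega above the surface.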
To generate a rendered 2D image, we need a 3D model, and the 3D model can be built from polygon
meshes. Polygon meshes are mesh-like structures made up of polygons of different shapes, arranged
so that every detail of the 3D model is captured. The amount of detail is directly related to the
number of polygons used to capture the model: fewer polygons mean fewer details and coarser edges,
but using a very large number of polygons to cover the whole 3D object becomes computationally
expensive.
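As a small illustration (a minimal sketch, not tied to any particular engine), a triangle mesh can be stored simply as an array of vertex positions plus an array of triangles that index into it:

import numpy as np

# Minimal triangle-mesh representation: vertex positions + triangle indices.
# This example builds a unit square (quad) out of two triangles.
vertices = np.array([
    [0.0, 0.0, 0.0],   # vertex 0
    [1.0, 0.0, 0.0],   # vertex 1
    [1.0, 1.0, 0.0],   # vertex 2
    [0.0, 1.0, 0.0],   # vertex 3
])

triangles = np.array([
    [0, 1, 2],         # first triangle of the quad
    [0, 2, 3],         # second triangle of the quad
])

# More triangles capture more detail, at a higher processing cost.
print("vertices:", len(vertices), "triangles:", len(triangles))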
Rendering a scene from such a model typically proceeds through the following stages:
1. Scene Preparation
This is the initial step where all the required components of a 3D scene are prepared and
defined. These components include:
3D Models: Objects in the scene are defined using geometry (e.g., meshes made up of
polygons).
Materials and Textures: Each object is assigned surface properties (materials) like
colour, reflectivity, or roughness. Textures are 2D images mapped to 3D surfaces to
provide detailed appearances (e.g., wood grain or metal).
Lights: Virtual light sources are positioned in the scene, mimicking real-world lighting
conditions (e.g., directional lights, point lights, or ambient lighting).
Camera: A virtual camera is set up, determining the viewpoint and perspective from
which the scene will be rendered.
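A minimal, engine-agnostic sketch of what a prepared scene might look like as data (the field and file names here are purely illustrative, not taken from any specific renderer):

# Illustrative scene description: geometry, materials, lights, and a camera.
scene = {
    "objects": [
        {
            "mesh": "teapot.obj",                # geometry (a polygon mesh file)
            "material": {
                "base_colour": (0.8, 0.1, 0.1),  # reddish, slightly rough surface
                "roughness": 0.4,
                "texture": "wood_grain.png",     # 2D image mapped onto the surface
            },
        },
    ],
    "lights": [
        {"type": "point", "position": (2.0, 5.0, -1.0), "intensity": 30.0},
        {"type": "ambient", "intensity": 0.1},
    ],
    "camera": {
        "position": (0.0, 1.0, -5.0),
        "look_at": (0.0, 0.0, 0.0),
        "fov_degrees": 60.0,
    },
}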
2. Geometry Processing
In this stage, the 3D geometry is processed to make it ready for projection onto a 2D screen.
Transformations:
o Model Transformation: Positions objects in the 3D world.
o View Transformation: Adjusts the scene based on the camera's position and
orientation.
o Projection Transformation: Converts 3D coordinates to 2D screen space
using perspective or orthographic projection.
Clipping: Objects or parts of objects outside the camera's field of view (frustum) are
removed to optimize rendering.
Tessellation: Converts higher-level shapes (like curves or surfaces) into triangles or
polygons that the renderer can process.
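The transformations listed above are usually expressed as 4x4 matrices applied to homogeneous coordinates. A minimal NumPy sketch of the model -> view -> projection chain for a single vertex (the matrix values here are illustrative):

import numpy as np

def perspective(fov_deg, aspect, near, far):
    # Standard OpenGL-style perspective projection matrix.
    f = 1.0 / np.tan(np.radians(fov_deg) / 2.0)
    return np.array([
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ])

model = np.eye(4)
model[2, 3] = -5.0                       # model transform: push the object 5 units in front of the camera
view = np.eye(4)                         # view transform: camera sits at the origin
proj = perspective(60.0, 16.0 / 9.0, 0.1, 100.0)   # projection transform

vertex = np.array([1.0, 1.0, 0.0, 1.0])  # homogeneous vertex position
clip = proj @ view @ model @ vertex      # model, then view, then projection
ndc = clip[:3] / clip[3]                 # perspective divide -> normalised device coordinates
print(ndc)                               # 2D screen position (plus depth)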
3. Rasterization or Ray-Tracing
This step involves determining how the 3D objects map onto the 2D screen and how they
interact with light. This can be done using different rendering techniques:
Rasterization:
Converts 3D geometry into pixels (rasterizes).
Determines which pixels (fragments) correspond to which objects in the scene.
Fast and widely used in real-time rendering (e.g., video games).
Ray Tracing:
Simulates the path of light rays as they interact with surfaces.
Calculates reflections, refractions, shadows, and global illumination.
Produces highly realistic images but is computationally intensive.
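As a toy illustration of the ray-tracing idea (one ray tested against one sphere, with simple diffuse shading; reflections, refractions, and shadows are left out):

import numpy as np

def ray_sphere_hit(origin, direction, centre, radius):
    # Return the distance to the nearest intersection, or None if the ray misses.
    # The ray direction is assumed to be normalised.
    oc = origin - centre
    b = 2.0 * np.dot(oc, direction)
    c = np.dot(oc, oc) - radius ** 2
    disc = b * b - 4.0 * c
    if disc < 0:
        return None
    t = (-b - np.sqrt(disc)) / 2.0
    return t if t > 0 else None

origin = np.array([0.0, 0.0, 0.0])          # camera position
direction = np.array([0.0, 0.0, 1.0])       # ray shot through one pixel
centre, radius = np.array([0.0, 0.0, 5.0]), 1.0
light_dir = np.array([0.0, -0.707, 0.707])  # direction the light travels in

t = ray_sphere_hit(origin, direction, centre, radius)
if t is not None:
    hit = origin + t * direction
    normal = (hit - centre) / radius
    brightness = max(np.dot(normal, -light_dir), 0.0)   # Lambertian (diffuse) term
    print("hit at", hit, "brightness", round(brightness, 3))
else:
    print("ray missed the sphere")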
4. Image Post-Processing
Once the basic rendering is complete, additional effects are applied to enhance the visual
quality of the image.
Anti-Aliasing: Smooths jagged edges of objects by blending pixel colours.
Motion Blur: Adds a blur effect to objects moving quickly.
Depth of Field: Simulates camera focus by blurring objects outside the focal plane.
Global Illumination: Computes how light bounces between surfaces for realistic
lighting.
Special Effects: Lens flares, bloom, color grading, and other cinematic effects.
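For instance, one simple form of anti-aliasing is supersampling: render the frame at a higher resolution, then average each block of pixels down to the target resolution. A minimal NumPy sketch:

import numpy as np

def supersample_aa(image, factor):
    # Downsample an (H*factor, W*factor, 3) image to (H, W, 3) by averaging
    # each factor x factor block, which softens jagged edges.
    h, w, c = image.shape
    blocks = image.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

# Render at twice the target resolution (random placeholder data here), then filter down.
high_res = np.random.rand(1080 * 2, 1920 * 2, 3)
final = supersample_aa(high_res, factor=2)
print(final.shape)   # (1080, 1920, 3)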
5. Final Image Output
The final step involves generating the output image or sequence of images (for animations) and
saving them in a specific format.
File Formats:
o For Images: PNG, JPEG, BMP, TIFF.
o For Videos: MP4, AVI, MOV.
Resolution: Determines the image quality and size, such as 1920x1080 (Full HD) or
4K.
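Writing the final pixel buffer to one of these formats is straightforward; for example, using the Pillow library (assuming an 8-bit RGB array):

import numpy as np
from PIL import Image

# A Full HD (1920x1080) RGB frame, filled with a flat grey placeholder here.
pixels = np.full((1080, 1920, 3), 128, dtype=np.uint8)

Image.fromarray(pixels).save("render.png")               # lossless PNG output
Image.fromarray(pixels).save("render.jpg", quality=90)   # compressed JPEG output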
However, traditional methods always leave some efficiency on the table; there is usually a newer
way to do the same work more efficiently and more easily. Rendering has such a way too: neural
rendering.
Neural rendering:
The reconstruction of a scene representation from observations, using differentiable
rendering losses, is known as inverse graphics or inverse rendering. Neural rendering is closely
related; it combines ideas from classical computer graphics and machine learning to create
algorithms for synthesizing images from real-world observations. Neural rendering is a leap
forward towards the goal of synthesizing photo-realistic image and video content.
The key concept behind neural rendering approaches is that they are differentiable. A
differentiable function is one whose derivative exists at each point in its domain. This is
important because machine learning is basically the chain rule with extra steps: a differentiable
rendering function can be learned from data, one gradient-descent step at a time. Learning a
rendering function statistically from data is fundamentally different from the classic
rendering methods described above, which calculate and extrapolate from the known laws
of physics.
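A minimal sketch of this idea: a toy differentiable "renderer" with a single brightness parameter, fitted to a target image by gradient descent (PyTorch is assumed here purely for its automatic differentiation; the "renderer" is deliberately trivial):

import torch

# Toy differentiable "renderer": one learnable brightness value scales a base image.
base_image = torch.rand(64, 64)            # stand-in for the scene content
target_image = 0.7 * base_image            # the observation we want to reproduce
brightness = torch.nn.Parameter(torch.tensor(0.1))

optimizer = torch.optim.Adam([brightness], lr=0.05)
for step in range(200):
    rendered = brightness * base_image     # differentiable rendering function
    loss = ((rendered - target_image) ** 2).mean()   # rendering loss
    optimizer.zero_grad()
    loss.backward()                        # chain rule: gradient flows back to the parameter
    optimizer.step()

print(float(brightness))                   # converges towards 0.7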
Two great papers on this topic are one from Google Research [1] and one from Facebook
Reality Labs [2]. Both of these works use a volume-rendering technique known as ray
marching. Ray marching is when you shoot a ray out from the observer (camera) through a 3D
volume in space and ask a function: what are the colour and opacity at this particular point in
space? Neural rendering takes the next step by using a neural network to approximate this
function.
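A simplified sketch of ray marching with the usual volume-rendering compositing (the colour/density function below is a hand-written placeholder; in neural rendering it would be a trained network):

import numpy as np

def field(point):
    # Placeholder scene function: returns (rgb colour, density) at a 3D point.
    # In NeRF-style neural rendering this would be a neural network evaluation.
    inside = np.linalg.norm(point - np.array([0.0, 0.0, 4.0])) < 1.0   # a fuzzy sphere
    density = 5.0 if inside else 0.0
    return np.array([1.0, 0.5, 0.2]), density

def march_ray(origin, direction, t_near=0.0, t_far=8.0, n_samples=64):
    # Accumulate colour along the ray using alpha compositing.
    ts = np.linspace(t_near, t_far, n_samples)
    dt = ts[1] - ts[0]
    colour_out = np.zeros(3)
    transmittance = 1.0                    # how much light still passes through
    for t in ts:
        rgb, sigma = field(origin + t * direction)
        alpha = 1.0 - np.exp(-sigma * dt)  # opacity of this small segment
        colour_out += transmittance * alpha * rgb
        transmittance *= (1.0 - alpha)
    return colour_out

pixel = march_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]))
print(pixel)   # approaches the sphere's colour as the ray is absorbed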
Real-time rendering is one of many applications of neural rendering. It applies concepts such
as rasterization in time-sensitive applications, such as gaming, augmented reality, and virtual
reality, where the rendering must happen within milliseconds.
How it works:
1. Scene Preparation:
o Think of the game as a virtual world made of objects, like trees, mountains, and
characters.
o Normally, the game engine would draw (render) these objects using lots of math to
calculate their appearance (shapes, lighting, shadows, etc.). This can take time,
especially for complex details like realistic lighting or reflections.
2. AI Artist Steps In:
o Instead of the engine calculating every single detail for every frame, it uses an AI
model (a trained neural network) to "guess" or "fill in" the missing details much
faster.
o For example, the AI knows how light behaves or how reflections should look, so it
quickly adds them without needing to calculate every bounce of light.
3. Training the AI:
o Before the game is released, the AI is trained on tons of data, like pictures of real-
world environments or high-quality renders. This training teaches the AI how to
recreate realistic details from simpler inputs.
4. Real-Time Rendering:
o When you're playing the game:
The game engine handles the basic scene (where objects are, their shapes, and
movement).
The AI artist enhances the scene in real time by adding realistic lighting,
shadows, and fine textures.
o For example:
If you move closer to a shiny car, the AI quickly generates accurate
reflections of nearby trees on the car's surface.
If you're in a dark cave, the AI adds realistic soft lighting that bounces off the
walls.
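A DLSS-style loop boils down to: render each frame cheaply at a low resolution, then let a trained network upscale and sharpen it. A minimal per-frame sketch (the enhancer network below is a hypothetical stand-in, not NVIDIA's actual model, and a real system also uses motion vectors and previous frames):

import torch

# Hypothetical stand-in for a trained enhancement network (a simple 2x upscaler).
enhancer = torch.nn.Sequential(
    torch.nn.Upsample(scale_factor=2, mode="bilinear"),
    torch.nn.Conv2d(3, 3, kernel_size=3, padding=1),
)

def render_low_res_frame():
    # Placeholder for the game engine producing a cheap 960x540 frame.
    return torch.rand(1, 3, 540, 960)

with torch.no_grad():                 # inference only: no training while playing
    low_res = render_low_res_frame()
    high_res = enhancer(low_res)      # the network fills in the missing detail
    print(high_res.shape)             # torch.Size([1, 3, 1080, 1920])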
NVIDIA’s DLSS (Deep Learning Super Sampling) is an example of real-time neural rendering.
It is offered as an option in the graphics settings of video games and produces highly realistic
graphics that are based on real-world examples.
Let’s consider NVIDIA’s DLSS feature. The company has a patent filed [1] with the title Image
enhancement using one or more neural networks. The following is a brief description of
the patent:
It covers optimizing video generation and display for devices with limited resources. The
technique involves breaking video processing down into multiple stages, each with lower
resource requirements, which allows higher-resolution video to be generated and displayed at
target frame rates on devices with limited resources. The stages include:
3. Anti-aliasing filtering for improved quality.
This avoids the challenge of generating high-resolution video directly with limited resources,
and it enables trade-offs between resolution, quality, and resource usage.
Another patent was filed by Microsoft [2], which ultimately does the same thing, rendering using
neural networks, but the method and the technologies it incorporates are different. The title of
the patent is High resolution neural rendering. The following is a brief summary of the patent:
Efficiently generating and rendering novel viewpoints of 3D scenes using separable neural
networks that can be cached for fast inference. The method involves training two separate
neural networks for positional encoding and directional encoding of points in a 3D scene. The
outputs of these networks are cached. When rendering a novel viewpoint, instead of re-
computing the scene, the cached outputs are looked up and combined using weighting schemes
based on the new view directions. This allows efficient viewpoint generation by replacing
network execution with cache lookups.
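A rough sketch of the caching idea described above (the split into a positional table and a directional table, and the weighting, are simplified interpretations for illustration, not the patent's actual method):

import numpy as np

# Hypothetical caches: precomputed outputs of the positional and directional
# networks, stored for a coarse grid of sample points and view directions.
positional_cache = {}    # quantised 3D point  -> feature vector
directional_cache = {}   # quantised direction -> feature vector

def precompute(points, directions, pos_net, dir_net):
    # Run the two networks once, offline, and store their outputs for lookup.
    for p in points:
        positional_cache[tuple(np.round(p, 1))] = pos_net(p)
    for d in directions:
        directional_cache[tuple(np.round(d, 1))] = dir_net(d)

def render_point(point, view_dir):
    # At render time, replace network execution with cache lookups.
    pos_feat = positional_cache[tuple(np.round(point, 1))]
    dir_feat = directional_cache[tuple(np.round(view_dir, 1))]
    # A simple fixed-weight blend stands in for the patent's view-dependent weighting.
    return 0.5 * pos_feat + 0.5 * dir_feat

# Tiny stand-in "networks" so the sketch runs end to end.
pos_net = lambda p: np.array([p.sum(), 1.0, 0.0])
dir_net = lambda d: np.array([0.0, d.sum(), 1.0])

precompute([np.array([0.0, 0.0, 4.0])], [np.array([0.0, 0.0, 1.0])], pos_net, dir_net)
print(render_point(np.array([0.0, 0.0, 4.0]), np.array([0.0, 0.0, 1.0])))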
These two patents do have similarities: both apply neural networks to cut the computation
needed to produce high-quality output, so that it can be delivered at the target speed on devices
with limited resources.
Resources:
1. https://www.mindstick.com/articles/333005/understanding-the-importance-of-rendering-in-computer-graphics-and-various-techniques
2. https://en.wikipedia.org/wiki/Rendering_(computer_graphics)
3. https://www.aimircg.com/process/
4. https://www.khanacademy.org/computing/pixar/rendering/rendering1/v/overview-rendering
5. https://arxiv.org/abs/2111.05849
6. https://archicgi.com/cgi-services/computer-3d-rendering-architecture-design/
7. https://paperswithcode.com/task/neural-rendering
~Abhi