Emerging Technology Presentation Report

The document presents an overview of DALL-E 2, an AI model developed by OpenAI that generates images from textual descriptions using advanced neural network techniques. It discusses the creation process, technical challenges, applications in various fields, and limitations of DALL-E 2, emphasizing its potential in enhancing creativity while acknowledging its current weaknesses. The conclusion highlights the importance of artists in the creative process despite the advancements in AI technology.

Uploaded by

ravenspar

Emerging Technology Presentation Report

DALL-E 2
The Future of Art?

Presented by
Tanmay R K
Davan C Reddy
Puneeth
Lakshmi
From
Department of Computer Science and Design, Dayananda
Sagar College of Engineering
Introduction

What is Artificial Intelligence?


Artificial Intelligence (AI) is a way of programming a computer,
robot, or other machine to behave as an intelligent human would.
The field draws on the study of how people think, learn, make
decisions, and solve problems, and uses those findings to build
intelligent software systems. The goal of artificial intelligence
is to improve computer functions that are linked to human
understanding, such as reasoning, learning, and problem-solving.

What is DALL-E 2?
DALL-E is a neural network created by OpenAI that generates
images from textual descriptions. It is trained on a dataset of
text-image pairs and combines natural language processing with
computer vision techniques. The images it produces can be both
creative and realistic, ranging from realistic landscapes and
animals to more fantastical and surreal scenes.
Creation of DALL-E 2

DALL-E was created by a company called OpenAI.


Before OpenAI moved into text-to-image machine learning with
DALL-E, it worked on text generation, more specifically on
language models. In 2019, OpenAI released a model called GPT-2
that could predict the next word within a text. It had 1.5 billion
parameters and was trained on a data set of 8 million web pages.
“On language tasks like question answering, reading
comprehension, summarization, and translation, GPT-2 begins to
learn these tasks from the raw text, using no task-specific training
data,” OpenAI stated. Its successor, the GPT-3 model, would
become the basis for the first DALL-E, altered to generate
images instead of additional text.
Working of DALL-E
DALL-E 2 consists of two parts: a prior, which converts a caption
into a representation of an image, and a decoder, which turns that
representation into an actual image.
The text and image representations used in DALL-E 2 are created
using another technology called CLIP.
CLIP, in simple terms, is a neural network that returns the best
caption for a given image; it matches images to their corresponding
captions. Along with this, a technique called diffusion is used. In
this technique, noise is added to an image over a series of
timesteps until it is unrecognisable, and a model is then trained
to reverse the process and recover the original image.
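The noising half of diffusion can be illustrated with a small sketch. It is not OpenAI's code: the function names, the linear noise schedule, and its values (1000 steps, betas from 1e-4 to 0.02) are assumptions borrowed from common diffusion setups, and the "image" is just four numbers. It only shows how an image is blended with Gaussian noise more and more strongly as the timestep grows.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(pixels, t, alpha_bars, rng):
    """Return the noised pixels at timestep t and the noise that was mixed in."""
    signal = math.sqrt(alpha_bars[t])        # how much of the image survives
    spread = math.sqrt(1.0 - alpha_bars[t])  # how much noise is mixed in
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal * p + spread * n for p, n in zip(pixels, noise)]
    return noisy, noise

rng = random.Random(0)
alpha_bars = make_alpha_bars(1000)
image = [0.5, -0.25, 1.0, 0.0]  # toy "image": four pixel values

slightly_noisy, _ = add_noise(image, 10, alpha_bars, rng)
mostly_noise, _ = add_noise(image, 999, alpha_bars, rng)
print(slightly_noisy)
print(mostly_noise)
```

At an early timestep the result is still close to the input, while at the last timestep almost none of the original signal remains. A real diffusion model is trained to predict the added noise so that the process can be run in reverse.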
This information is then passed on to the decoder, where an image
generation technique called GLIDE is used. GLIDE stands for Guided
Language-to-Image Diffusion for Generation and Editing. GLIDE is
similar to diffusion, but it also includes the CLIP embeddings to
support the image generation.
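The CLIP embeddings mentioned above place images and captions in a shared vector space, where a matching pair points in a similar direction. The sketch below illustrates only that matching step. The three-dimensional vectors and the function names are made up for illustration; real CLIP embeddings come from trained encoders and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_caption(image_embedding, caption_embeddings):
    """Pick the caption whose embedding is most similar to the image's."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine_similarity(
            image_embedding, caption_embeddings[caption]
        ),
    )

# Hypothetical embeddings, hand-written for this example.
caption_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a city skyline at night": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

match = best_caption(image_embedding, caption_embeddings)
print(match)
```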
After a preliminary image of 64×64 pixels is created, two upscaling
steps (to 256×256 and then 1024×1024) are used to make the image
high resolution.
For image variations, the image is run through an encoder, and
trivial aspects of the image are changed while the key subjects are
represented in different forms.
Technical Challenges
In order to share the magic of DALL-E 2 with a broad audience,
OpenAI needed to reduce the risks associated with powerful image
generation models. To this end, they had to overcome certain
challenges so that DALL-E 2 does not violate ethical values. They
put various guardrails in place to prevent generated images from
violating their content policy.
➢ Filter out sensitive images
➢ Prevent bias formed due to filtered dataset
➢ Prevent regurgitation of images

Filter out Sensitive Images


OpenAI filtered violent and sexual images out of the training
data. Without this mitigation, the model would learn to produce
graphic or explicit images when prompted for them, and might even
return such images unintentionally in response to seemingly
innocuous prompts.
Prevent bias formed due to filtered dataset
Filtering training data can amplify biases. For example, models
trained on filtered data sometimes generated more images
depicting men and fewer images depicting women compared to
models trained on the original dataset.
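A small worked example shows how a filter can shift the balance of a dataset. All numbers here are invented for illustration: a dataset with equal counts of two groups, and a filter that happens to remove more images from one group than the other. The reweighting step at the end is one common way to compensate and is shown as an assumption, not as the exact method used for DALL-E 2.

```python
def share(counts):
    """Fraction of the dataset that each group makes up."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

# Hypothetical image counts before filtering.
original = {"women": 500, "men": 500}
# Hypothetical number of images the filter removes from each group.
removed = {"women": 150, "men": 50}

filtered = {group: original[group] - removed[group] for group in original}
original_share = share(original)
filtered_share = share(filtered)

# Assumed fix: weight each group so its share matches the original data.
weights = {g: original_share[g] / filtered_share[g] for g in filtered}
reweighted = {g: filtered[g] * weights[g] for g in filtered}
reweighted_share = share(reweighted)

print(filtered_share)
print(reweighted_share)
```

In this made-up case the share of images depicting women drops from 50% to about 44% after filtering, and the weights restore the original balance during training.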
Prevent regurgitation of images
Models like DALL-E 2 can sometimes reproduce images they were
trained on rather than creating novel images. This happens most
often with images that are replicated many times in the dataset,
so OpenAI mitigated the issue by removing images that are visually
similar to other images in the dataset.
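Removing visually similar images can be sketched as a near-duplicate filter over image feature vectors. This is a simplified illustration, not the procedure OpenAI used: the feature vectors, file names, similarity threshold, and the greedy pairwise comparison are all assumptions, and a real pipeline over hundreds of millions of images would need something more scalable than comparing every pair.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def drop_near_duplicates(features, threshold=0.95):
    """Keep an image only if it is not too similar to any image already kept."""
    kept = []
    for name, vector in features.items():
        if all(cosine_similarity(vector, features[k]) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical feature vectors; in practice these would come from a vision model.
features = {
    "sunset_1.jpg": [1.0, 0.0, 0.2],
    "sunset_1_copy.jpg": [0.99, 0.01, 0.21],
    "cat.jpg": [0.0, 1.0, 0.1],
}

kept = drop_near_duplicates(features)
print(kept)
```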
Application
• 3D Models:-
DALL-E 2 can be used to generate initial ideation images for 3D
models. This opens up possibilities for designers, as the AI is
good at producing mockups that look like the real thing.

• Artists and Designers:-


If you're a visual artist or designer, DALL-E 2 could be a valuable
tool for generating ideas and inspiration. The system can be used
to create illustrations, product designs, and even 3D models.

• Any Person:-
Anyone can use DALL-E. It allows a person to express their
thoughts in the form of an image without needing strong artistic
skills. It opens a whole new space in the world of creativity.
Limitations
• Physical Quality Attributions
DALL-E 2 lacks awareness of relative positions. When asked to
create an image with a red block on top of a blue block, it
produced images that weren’t remotely close to the description.

• Inability to form words


It struggles to form proper words and cannot create
grammatical sentences. This limits its capability in
images that require text.

• Lack of Detail
Although DALL-E is trained on a wide range of data, it cannot
produce images with high detail. This weakness makes it almost
impossible to use in cinematography and several other fields.
It also struggles with low resolution.
Conclusion
DALL-E 2 is an extraordinary general-purpose AI tool for artistry
and creativity. It breaks the barriers of what AI was thought
capable of. Its creation opens a new path for our imagination and
allows our creativity to blossom.
But while it’s a great instrument for ideation, it is also limited by its
weaknesses. These weaknesses hold it back from making artists
obsolete. DALL-E’s limitations demonstrate the continued need for
actual artists and designers.
Future development of DALL-E 2 will likely focus on
improving on its weaknesses and further pushing the boundaries of
AI competence. This growth could mark a new era in the
field of art and imagination.

References

• https://openai.com/blog/dall-e-2-pre-training-mitigations

• https://www.youtube.com/watch?v=qTgPSKKjfVg

• https://screenrant.com/future-dall-e-ai-explained

• https://www.assemblyai.com/blog/how-dall-e-2-actually-works
