
GenAI definition: refers to a type of AI that can create new content, such as images, text, or music, by learning patterns and structures from input data.

Applications of genAI:
- Text: widely used for content creation, drafting emails, writing code, and creating chatbot responses. Large language models (LLMs) power text generation and have widespread applications; ChatGPT is an application of LLMs.
- Images: can create new images based on user prompts, e.g. generated using Microsoft Bing Image Creator, which is powered by the DALL-E image model. Imagen and DALL-E are two popular text-to-image models.
- Audio: can create new audio based on a user's description.
- Video: a new video can be generated by providing the right description to the genAI model.
- Code: helps developers write code based on the task.

Discriminative vs generative AI:
A discriminative model is similar to a convolutional neural network that extracts high-level features from input images in order to tell classes apart, whereas a genAI model learns the data distribution and creates new artifacts.
Discriminative AI differentiates between data points. Example: a KNN classifier that can differentiate between spam and non-spam mail.
Discriminative AI doesn't learn the patterns in the data. It just separates data points, which is also known as learning the decision boundary. Once the decision boundary is known, data can be classified accordingly.

- GenAI is computationally more expensive than discriminative AI.
- GenAI is severely impacted by outliers; discriminative AI less so.
- GenAI's output is text, images, or audio, while discriminative AI's output is a class, group, or tag.
- Discriminative AI needs labeled data; genAI can use both labeled and unlabeled data.
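
To make the contrast concrete, here is a minimal sketch, assuming scikit-learn and a made-up toy 2-D dataset: a KNN classifier only learns a decision boundary, while a Gaussian mixture learns the distribution itself and can sample new points.

```python
# Minimal contrast between the two model families (dataset is illustrative).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier  # discriminative
from sklearn.mixture import GaussianMixture         # generative

rng = np.random.default_rng(0)
# Toy 2-D data: two clusters standing in for "spam" and "non-spam" features.
X_spam = rng.normal(loc=[0, 0], scale=0.5, size=(100, 2))
X_ham = rng.normal(loc=[3, 3], scale=0.5, size=(100, 2))
X = np.vstack([X_spam, X_ham])
y = np.array([1] * 100 + [0] * 100)

# Discriminative: KNN only learns the decision boundary between classes.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.predict([[0.2, 0.1]]))  # -> class label (1 = spam)

# Generative: a Gaussian mixture learns the data distribution itself,
# so it can create new (synthetic) data points.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
new_points, _ = gmm.sample(3)  # new artifacts drawn from the learned distribution
print(new_points)
```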

Foundation Models: these form the basis of genAI.

AI: the science of developing intelligent machines. It enables a machine to sense, comprehend, act, and learn.
ML: enables a machine to learn directly from data without being explicitly programmed. It enables systems to learn patterns from data and subsequently improve from experience.
ML can be supervised or unsupervised.
Supervised learning works with labeled data (features + label). Examples: regression, decision trees.
Unsupervised learning works with unlabeled datasets. Examples: k-means clustering, principal component analysis.
DL: artificial neural networks (ANNs), which mimic the human brain. They learn complex patterns from data automatically, discovering patterns by automatically learning a hierarchical layer of features. DL works with both labeled and unlabeled data. GenAI is a subfield of DL.

GenAI: trained on vast unlabeled data, such as Wikipedia and other data repositories.
When queried, the models return responses derived from the data they have learned from.
These models are foundation models.
Foundation models can be classified into language models (such as the generative pre-trained transformer, e.g. ChatGPT) and image models (such as diffusion models, e.g. Midjourney).
Foundation models are ML models trained on huge amounts of data that can be used for many different applications. They are trained on text, images, speech, and structured data. Training uses self-supervised learning. Examples: GPT, T5, BERT.
Self-supervised learning: the model pretends some part of the training data is missing and tries to predict it. For example, in image data a region of the image is masked and the model predicts the masked region; in text, a word is masked and predicted from its context.
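
A minimal sketch of the masking idea for text, assuming a toy sentence; this is an illustration of the objective, not a real training setup:

```python
# Toy illustration of the self-supervised masking objective:
# hide one token and ask the model to predict it from context.
import random

sentence = ["the", "cat", "sat", "on", "the", "mat"]
mask_pos = random.randrange(len(sentence))
target = sentence[mask_pos]          # what the model must predict
masked = list(sentence)
masked[mask_pos] = "[MASK]"

print("input :", " ".join(masked))   # e.g. "the cat [MASK] on the mat"
print("label :", target)             # e.g. "sat"
# During pre-training, a network receives the masked sequence and is trained
# to recover the hidden token; no human labels are needed, since the labels
# come from the data itself.
```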

How ChatGPT works:


Image generation:
- Generative art: a magical painting created by a computer. The computer uses an algorithm to create the painting, just like Meta AI on Instagram.
- Generative modeling: generating new data from existing data. It has three classes: VAEs (variational autoencoders), GANs (generative adversarial networks), and diffusion models.
● Applications of generative models: image super-resolution (generating a high-resolution image), creating new art, image-to-image translation, text-to-image translation, image captioning, photo editing, 3D world creation.
- VAEs: they use the analogous concept of 'image projection', called latent space. Basically, from a projection or shadow of an image you can predict what the image is. VAEs create images from these latent spaces to learn the distribution.
● Latent space is a spatial representation of images in a compressed form, which helps the computer understand the important parts of the image. It reflects the proximity structure of the dataset.
● VAEs use an encoder-decoder architecture: the encoder compresses the input image into latent space, while the decoder does the opposite.
● Training goal of a VAE: to make the reconstructed image as similar as possible to the original image.
● Encoder: it learns the mapping from input data (X) to a condensed representation (Z) in latent space. The network learns to extract the most relevant features into latent space. The condensed representation (Z) gives enough information to reconstruct the original image.
● Decoder: it learns to map the condensed representation Z back to a reconstructed image X*.
● A VAE follows an unsupervised learning approach, as it learns from the data itself.
● VAE architecture: a minimal sketch is shown below.
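
A minimal VAE sketch, assuming PyTorch and flattened 28x28 grayscale images; the layer sizes and latent dimension are illustrative assumptions:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=16):
        super().__init__()
        # Encoder: maps input X to the parameters of a Gaussian in latent space.
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)       # mean of q(z|x)
        self.to_logvar = nn.Linear(256, latent_dim)   # log-variance of q(z|x)
        # Decoder: maps a latent code Z back to a reconstructed image X*.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample z = mu + sigma * eps.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decoder(z), mu, logvar

vae = VAE()
x = torch.rand(8, 784)                      # a fake batch of flattened images
x_recon, mu, logvar = vae(x)
# Training loss = reconstruction error + KL divergence to the prior N(0, I),
# which pushes X* to look like X while keeping the latent space well behaved.
recon = nn.functional.binary_cross_entropy(x_recon, x, reduction="sum")
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + kl
```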

- GANs: trained by playing a game between their two components. A GAN creates fake images that look real. First described in 2014.
● Goal of GANs: learn the distribution of the training data and generate samples from it.
● A GAN trains a network that learns a transformation from noise to the training data distribution.
● A GAN has two components: a generator and a discriminator.
● The generator takes random noise as input and produces a fake image.
● The discriminator takes images as input and classifies them as fake or real.
● The generator and discriminator play a zero-sum game, which means one's loss is the other's gain; basically a see-saw. First the generator makes a fake image from the given input and the discriminator has to reach the verdict that it is fake; then the generator has to make the image again, looking as real as possible, so that the discriminator can't distinguish between real and fake. A minimal training-loop sketch follows this list.
● Problems with GANs: less diversity, difficult to train, difficult to scale, difficult to apply to new domains.
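
A minimal sketch of the adversarial game, assuming PyTorch; the tiny MLP generator/discriminator and the random stand-in "real" data are illustrative assumptions:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())   # noise -> fake image
D = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid()) # image -> P(real)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(100):
    real = torch.randn(32, 784)              # stand-in for a batch of real images
    noise = torch.randn(32, 16)
    fake = G(noise)

    # Discriminator step: learn to call real images 1 and fakes 0.
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: the zero-sum move, try to fool D into calling fakes real.
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```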
- Diffusion models: a type of generative model that learns the distribution of images by modeling the way images are corrupted by noise and then denoised by a neural network. The idea is to train the neural network to reverse the diffusion process, which gradually adds Gaussian noise to an image until only pure noise remains.
● Denoising diffusion probabilistic models: the diffusion model has two processes: forward diffusion (the noising process) and reverse denoising.
● Reverse diffusion is applied once the model is trained.
● Forward diffusion: a fixed process in which noise is added slowly and iteratively to the images.
● Example of forward diffusion: the initial image X0 is diffused to noise Xt, at which point it can no longer be recognized as the X0 image. A sketch of this noising step follows this list.
● Reverse denoising process: the goal is to learn the noise in a noisy image and then subtract it out to get a clean image. This is a generative process that tries to reverse the forward process.
● Guidance: it helps in generating images from text descriptions.
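
A minimal sketch of the forward (noising) process, assuming PyTorch and the standard closed form x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps with a linear noise schedule; the schedule values and image shape are illustrative assumptions:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0) # cumulative product of (1 - beta_t)

def forward_diffuse(x0, t):
    """Jump straight from a clean image x0 to its noisy version x_t."""
    eps = torch.randn_like(x0)                       # Gaussian noise
    a_bar = alphas_bar[t]
    xt = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
    return xt, eps                                   # eps is the denoiser's target

x0 = torch.rand(1, 3, 32, 32)     # stand-in for a training image
xt, eps = forward_diffuse(x0, t=999)
# At large t, xt is close to pure noise; the network is trained to predict
# eps from (xt, t), which is then subtracted out step by step when sampling.
```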

##LLMs: a large language model is an advanced deep learning model.

- It understands and generates human-like text, revolutionizing natural language processing with its transformative capabilities.
- It leverages vast amounts of data and powerful algorithms to process and comprehend language in a highly sophisticated manner.
- Characteristics of LLMs: enormous datasets (trained on vast data); excellence at general-purpose tasks; a wide range of capabilities (machine translation, text summarization, sentiment analysis, language generation, and more); highly advanced.
- RNNs: the recurrent neural network is the fundamental component of traditional language models.
● RNNs are useful for sequential data, hence well suited to language modeling tasks.

- RNNs have a few limitations, and to overcome them the encoder-decoder RNN was introduced. This RNN has two separate parts, an encoder and a decoder. The encoder processes the input sequence and generates a fixed-length representation called the context vector, capturing the essential information. The decoder takes the context vector generated by the encoder and generates the output sequence (one element at a time).
- With the encoder-decoder RNN, translation quality increases a lot, but there is still information loss between the encoder and the decoder, so it is not an ideal model for long-range dependencies and comprehensive tasks; a minimal encoder-decoder sketch follows below.
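
A minimal encoder-decoder RNN sketch, assuming PyTorch; the GRU sizes, toy vocabulary, and the assumption that token id 0 is a start token are all illustrative:

```python
import torch
import torch.nn as nn

vocab, emb, hid = 100, 32, 64
embed = nn.Embedding(vocab, emb)
encoder = nn.GRU(emb, hid, batch_first=True)
decoder = nn.GRU(emb, hid, batch_first=True)
to_vocab = nn.Linear(hid, vocab)

src = torch.randint(0, vocab, (1, 7))        # toy source sequence of 7 token ids
_, context = encoder(embed(src))             # fixed-length context vector (1, 1, hid)

# The decoder starts from the context vector and emits one token at a time;
# everything it knows about the input is squeezed into that single vector,
# which is exactly where the information loss comes from.
token = torch.zeros(1, 1, dtype=torch.long)  # assume id 0 is a <start> token
hidden = context
for _ in range(5):
    out, hidden = decoder(embed(token), hidden)
    token = to_vocab(out).argmax(dim=-1)     # greedy pick of the next token id
    print(token.item())
```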
- To address this information loss, the attention mechanism allows models to dynamically allocate their focus to the relevant parts of the input sequences. There are a few types of attention mechanism, and a self-attention sketch follows this list:
● Generalized attention: compares input and output sequence elements, selecting areas of focus based on scores.
● Self-attention: specific input sequence parts are chosen to compute the output sequences.
● Multi-head attention: utilizes parallel heads to process input and output sequences.
● Additive attention: correlates source and target sequence words using alignment scores.
● Global attention: attends to all source words, or predicts target sentences with different context vectors.
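
A minimal single-head self-attention sketch, assuming PyTorch and the standard softmax(QK^T / sqrt(d)) V formulation; the sequence and model sizes are illustrative:

```python
import torch
import torch.nn as nn

d_model = 64
seq = torch.randn(1, 10, d_model)   # batch of 1, sequence of 10 token vectors

# Learned projections turn each token into a query, a key, and a value.
Wq, Wk, Wv = (nn.Linear(d_model, d_model) for _ in range(3))
Q, K, V = Wq(seq), Wk(seq), Wv(seq)

# Each token scores every other token; softmax turns the scores into the
# "focus" weights that decide which parts of the sequence matter most.
scores = Q @ K.transpose(-2, -1) / d_model ** 0.5   # (1, 10, 10)
weights = torch.softmax(scores, dim=-1)
out = weights @ V                                   # (1, 10, 64) attended output
```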

Transformer in LLMs:

The first step: the input sequence goes to the embedding layer, where each token is converted into an embedding vector. The next step is the positional encoding layer; a minimal sketch follows below.
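A minimal sketch of sinusoidal positional encoding being added to token embeddings, assuming PyTorch and the sin/cos formulation from the original transformer paper; the sizes are illustrative:

```python
import torch

def positional_encoding(seq_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d)); cosine for the odd dimensions."""
    pos = torch.arange(seq_len).unsqueeze(1).float()
    i = torch.arange(0, d_model, 2).float()
    angles = pos / torch.pow(10000.0, i / d_model)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(angles)
    pe[:, 1::2] = torch.cos(angles)
    return pe

embeddings = torch.randn(10, 64)                 # 10 tokens, model dim 64
x = embeddings + positional_encoding(10, 64)     # order information injected by addition
```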
An example of a transformer is GPT-3, a language model developed by OpenAI. It is an unsupervised learning model and unimodal: it can only accept text prompts.
BERT: developed by Google; semi-supervised, as it uses both labeled and unlabeled datasets.

##Consume and customize genAI

There are 2 ways to use genAI:
1. Consume: the model is used as is (off the shelf); no change is made to model weights or parameters. There are 2 ways to consume: prompt engineering and embeddings. A consume-style sketch follows this list.
2. Customize: the model is customized for a task; model parameters are changed for domain data. Fine-tuning trains the model on domain-specific data.
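
A minimal sketch of the "consume" pattern: calling a hosted model over HTTP with a prompt. The endpoint URL, key, and JSON fields here are hypothetical placeholders, not any real provider's API:

```python
import requests

API_URL = "https://example.com/v1/generate"   # hypothetical hosted-model endpoint
API_KEY = "YOUR_API_KEY"                      # placeholder credential

def generate(prompt: str) -> str:
    """Send a prompt to an off-the-shelf model; no weights are changed."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "max_tokens": 200},  # hypothetical request fields
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["text"]                # hypothetical response field

# Prompt engineering means shaping this input string, not retraining the model.
print(generate("Summarize the difference between consuming and customizing genAI."))
```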

##Embeddings: numerical representations of text. They are multi-dimensional vectors that help capture the true meaning of text.
- Embeddings are saved in databases known as vector databases. Examples of open-source vector databases: Chroma, Facebook AI Similarity Search (FAISS), Lance. Proprietary vector databases: Pinecone, Vectara.
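
A minimal sketch of embeddings and nearest-neighbor lookup; the toy 4-dimensional vectors are made up for illustration (real embedding models produce hundreds or thousands of dimensions, and a vector database does this search at scale):

```python
import numpy as np

# Toy "embeddings": each sentence mapped to a 4-D vector (made-up numbers).
vectors = {
    "the cat sat on the mat": np.array([0.9, 0.1, 0.0, 0.2]),
    "a kitten rests on a rug": np.array([0.8, 0.2, 0.1, 0.3]),
    "stock prices fell today": np.array([0.0, 0.9, 0.8, 0.1]),
}

def cosine(a, b):
    """Similarity of direction: texts with similar meaning score close to 1."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.85, 0.15, 0.05, 0.25])   # embedding of a query like "cat on a mat"
best = max(vectors, key=lambda s: cosine(vectors[s], query))  # linear scan
print(best)   # -> "the cat sat on the mat" (closest in meaning)
```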

##Fine-Tuning: retraining a generative AI model with new data for a specialized task. A minimal sketch follows.
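
This sketch assumes PyTorch; the tiny stand-in "pretrained" network, the reconstruction objective, and the toy batch are illustrative assumptions, not a real checkpoint or dataset:

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained generative model (in practice, loaded from a checkpoint).
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))
opt = torch.optim.Adam(model.parameters(), lr=1e-5)  # small LR preserves pretrained knowledge
loss_fn = nn.MSELoss()

# Toy domain-specific batch; real fine-tuning uses curated task data.
x = torch.randn(32, 128)
for _ in range(10):
    out = model(x)
    loss = loss_fn(out, x)           # illustrative reconstruction objective
    opt.zero_grad(); loss.backward(); opt.step()
# After these updates the model's parameters have shifted toward the domain data,
# which is what distinguishes customizing from merely consuming the model.
```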

##GenAI Architecture: there are 7 key components for architecting a genAI solution:
1. Foundation model selection
2. Deployment and accessibility
3. Adaptive approach
4. Enterprise readiness
5. Environmental and sustainability impact
6. Integrated applications and development
7. GenAI Ops

Types of genAI applicability in Accenture:
