Gen AI
Applications of genAI:
-Text: widely used for content creation, drafting emails, writing code, and generating chatbot
responses. Large Language Models (LLMs) power text generation and have widespread
applications. ChatGPT is an application of LLMs.
-Images: can create new images from user prompts. Example: Microsoft Bing Image
Creator, powered by the DALL-E image model. Imagen and DALL-E are two popular
text-to-image models.
-Audio: can create new audio from a user's description.
-Video: new video can be generated by providing the right description to the genAI model.
-Code: helps developers write code based on the task description.
Discriminative vs GenAI:
A genAI model learns the underlying patterns of the data and creates new artifacts from them.
Discriminative AI differentiates between data points. Examples: a convolutional neural network
that extracts high-level features of input images to classify them, or a KNN classifier that
distinguishes spam from non-spam mail.
Discriminative AI doesn't learn the full data distribution; it just separates data points, which is
also known as learning the decision boundary. Once the decision boundary is known, data can
be classified on either side of it.
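A minimal sketch of a discriminative model: a 1-nearest-neighbour "spam" classifier. The toy feature vectors (counts of hypothetical trigger words) and their labels are invented for illustration; a real spam filter would use far richer features.

```python
import math

def euclidean(a, b):
    # Distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbour_label(train, query):
    # train: list of (feature_vector, label); return the label of the
    # closest training point -- this IS the learned decision boundary.
    return min(train, key=lambda pair: euclidean(pair[0], query))[1]

training_data = [
    ([5, 3], "spam"),      # many "free"/"winner" trigger words
    ([4, 4], "spam"),
    ([0, 1], "not spam"),  # few trigger words
    ([1, 0], "not spam"),
]

print(nearest_neighbour_label(training_data, [4, 2]))  # prints "spam"
```

Note that the classifier never models what spam "looks like" as a whole; it only decides which side of the boundary a new point falls on.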
AI: the science of developing intelligent machines. It enables a machine to sense, comprehend,
act and learn.
ML: enables a machine to learn directly from data without being explicitly programmed.
Enables systems to learn patterns from data and subsequently improve from experience.
ML can be supervised or unsupervised.
Supervised learning works with labelled data (features + label). Examples: regression, decision trees.
Unsupervised learning works with unlabeled datasets. Examples: k-means clustering, principal
component analysis.
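To make the unsupervised case concrete, here is a toy run of one k-means iteration (assign each point to its nearest centroid, then recompute the centroids). The data points and starting centroids are made-up values chosen so the two clusters are obvious.

```python
def assign(points, centroids):
    # Give each point the index of its nearest centroid.
    def dist2(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            for p in points]

def update(points, labels, k):
    # Recompute each centroid as the mean of its assigned points.
    new = []
    for i in range(k):
        members = [p for p, l in zip(points, labels) if l == i]
        new.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new

points = [(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (7.8, 8.2)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
labels = assign(points, centroids)
print(labels)                     # cluster index per point: [0, 0, 1, 1]
print(update(points, labels, 2))  # centroids move toward the two clusters
```

No labels were given; the structure (two clusters) is discovered from the data alone.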
DL: artificial neural networks (ANNs) that mimic the human brain. They learn complex patterns
from data automatically, discovering patterns by learning a hierarchy of feature layers.
DL works with both labeled and unlabeled data. GenAI is a subfield of DL.
GenAI: trained on vast unlabeled data such as Wikipedia and other data repositories.
When queried, the models return responses derived from the data they have learned from.
These models are foundational models.
Foundational models can be classified into Language Models (such as the Generative
Pre-trained Transformer, e.g. ChatGPT) and Image Models (such as diffusion models, e.g. Midjourney).
Foundational models are ML models trained on huge amounts of data that can be used for
many different applications. They are trained on text, image, speech and structured data; the
training is self-supervised learning. Examples: GPT, T5, BERT.
Self-supervised learning: the model pretends that some part of the training data is missing
and tries to predict the missing data. For example:
In image data (diffusion models):
● Forward diffusion: a fixed process in which noise is added slowly and
iteratively to the images.
● Example of forward diffusion: the initial image X0 is diffused into noise Xt,
at which point it can no longer be recognised as the X0 image.
● Reverse denoising process: the goal is to learn the noise in a noisy image and then
subtract it to get a clean image. This is a generative process that tries to reverse
the forward process; reverse diffusion is applied when the model is trained.
● Guidance: helps in generating images that match text descriptions.
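The forward diffusion step above can be sketched in a few lines. A short 1-D array stands in for an image, and the constant noise schedule (a single beta value) is a simplifying assumption; real diffusion models use a scheduled sequence of betas over hundreds of steps.

```python
import random

random.seed(0)

def forward_diffusion_step(x, beta):
    # One noising step: x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * noise
    return [(1 - beta) ** 0.5 * v + beta ** 0.5 * random.gauss(0, 1) for v in x]

x = [1.0, 0.5, -0.5, -1.0]   # X0: the clean "image"
for t in range(200):          # many small, iterative noising steps
    x = forward_diffusion_step(x, beta=0.05)

print(x)  # Xt: essentially pure noise; X0 can no longer be recognised
```

After 200 steps the contribution of X0 has been scaled by (0.95)^100 (under one percent), which is why Xt looks like pure noise; reverse denoising learns to undo these steps one at a time.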
- RNNs have a few limitations, and to overcome them the encoder-decoder RNN was
introduced. This architecture has two separate parts: the encoder processes the input
sequence and generates a fixed-length representation called the context vector,
capturing the essential information; the decoder takes the context vector generated by
the encoder and generates the output sequence one element at a time.
- Using the encoder-decoder RNN improves translation a lot, but there is still
information loss between the encoder and the decoder, so it is not an ideal model for
long-range dependencies and comprehensive tasks. Hence we use the attention mechanism.
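The encoder-decoder idea can be sketched with a bare scalar RNN cell. The weights and the toy input sequence are arbitrary numbers chosen for illustration; a real model would learn the weights and use vector-valued hidden states.

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=0.8):
    # A single recurrent update (hypothetical fixed weights).
    return math.tanh(w_h * h + w_x * x)

def encode(sequence):
    # The encoder compresses the whole input sequence into ONE
    # fixed-length representation: the context vector (here, the
    # final hidden state). This bottleneck is where information is lost.
    h = 0.0
    for x in sequence:
        h = rnn_step(h, x)
    return h

def decode(context, steps):
    # The decoder starts from the context vector and emits one
    # output element at a time, feeding each back into itself.
    h, out = context, []
    for _ in range(steps):
        h = rnn_step(h, h)
        out.append(h)
    return out

context = encode([0.2, 0.7, -0.1])
print(decode(context, steps=3))
```

However long the input is, `context` stays a single fixed-size value, which is exactly the long-range-dependency bottleneck that attention removes.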
- The attention mechanism lets models dynamically allocate their focus to the relevant parts
of the input sequence. There are a few types of attention mechanism:
● Generalised attention: compares input and output sequence elements, selecting
areas of focus based on scores.
● Self-attention: specific parts of the input sequence are chosen to compute the output
sequence.
● Multi-head attention: uses parallel heads to process input and output sequences.
● Additive attention: correlates source and target sequence words using
alignment scores.
● Global attention: attends to all source words, or predicts target sentences with
different context vectors.
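The scoring idea shared by these variants can be sketched as scaled dot-product self-attention, softmax(Q·Kᵀ/√d)·V: every position scores every other position and takes a weighted mix. The tiny Q/K/V matrices below are made-up numbers; a real model computes them from token embeddings with learned projections.

```python
import math

def softmax(scores):
    # Turn raw scores into attention weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(Q, K, V):
    d = len(Q[0])
    out = []
    for q in Q:
        # Score this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)
        # Output = attention-weighted mix of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = K = V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 tokens, d = 2
for row in self_attention(Q, K, V):
    print([round(x, 3) for x in row])
```

Multi-head attention simply runs several copies of this computation in parallel with different learned projections and concatenates the results.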
LLMs:
Transformer in LLMs:
The first step: the input sequence goes to the embedding layer, where each token is mapped to
an embedding vector. The next step is the positional encoding layer, which adds information
about each token's position in the sequence.
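The positional encoding step can be sketched with the sinusoidal scheme from the original Transformer: PE(pos, 2i) = sin(pos / 10000^(2i/d)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d)). Dimensions are kept tiny here for readability.

```python
import math

def positional_encoding(seq_len, d_model):
    # Build the (seq_len x d_model) table of position signals that is
    # added to the token embeddings before the attention layers.
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((i // 2 * 2) / d_model))
            # Even dimensions use sin, odd dimensions use cos.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

for row in positional_encoding(seq_len=3, d_model=4):
    print([round(v, 3) for v in row])
```

Because each position gets a unique pattern of sinusoids, the otherwise order-blind attention layers can tell token positions apart.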
An example of a transformer is GPT-3, a language model developed by OpenAI. It is trained
with unsupervised learning and is unimodal: it can only accept text prompts.
BERT: developed by Google; semi-supervised, as it uses both labeled and unlabeled datasets.
##Embeddings: numerical representations of text. They are multi-dimensional vectors that
help capture the true meaning of text.
-Embeddings are saved in databases known as vector databases. Examples of open-source
vector databases: Chroma, Facebook AI Similarity Search (FAISS), Lance. Proprietary vector
databases: Pinecone, Vectara.
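The core operation a vector database performs is comparing embeddings, most commonly by cosine similarity. The 3-dimensional "embeddings" below are invented for illustration; real text embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1 = same direction
    # (similar meaning), 0 = unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a)) *
            math.sqrt(sum(y * y for y in b)))
    return dot / norm

cat    = [0.9, 0.1, 0.2]    # hypothetical embedding of "cat"
kitten = [0.85, 0.15, 0.25] # hypothetical embedding of "kitten"
car    = [0.1, 0.9, 0.3]    # hypothetical embedding of "car"

print(round(cosine_similarity(cat, kitten), 3))  # close to 1: similar meaning
print(round(cosine_similarity(cat, car), 3))     # much lower: different meaning
```

A vector database answers a query by embedding it the same way and returning the stored vectors with the highest similarity scores.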
##Fine-Tuning: retraining a generative AI model with new data for a specialised task.
##GenAI Architecture: there are 7 key components for architecting a genAI solution:
1. Foundational model selection
2. Deployment and accessibility
3. Adaptive approach
4. Enterprise readiness
5. Environmental and sustainability impact
6. Integrated application and developments
7. GenAI Ops