
Solution of Assignment 12

1) Discuss the factors that contribute to better domain generalization in
Large Language Models (LLMs). How do aspects such as model architecture,
training data, and fine-tuning techniques impact an LLM's ability to perform
in domains unseen during pretraining?
Ans-> LLMs generalize across domains because of three interacting factors: model
architecture, training data, and fine-tuning. Transformer architectures with large
parameter counts let them learn complex, transferable language patterns. Diverse
pretraining data exposes them to many domains, styles, and vocabularies. Finally,
techniques such as few-shot learning and prompt-tuning let them adapt to new
tasks without retraining the full model, so LLMs can tackle unseen domains with
minimal additional training.
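
As a concrete illustration of the in-context adaptation mentioned above, the sketch
below builds a few-shot prompt for a hypothetical new domain (legal-clause
classification, an assumed example not taken from this assignment). The labelled
examples steer the model toward the unseen domain without updating any weights.

```python
# Minimal sketch of few-shot prompting: adapting a pretrained LLM to an unseen
# domain by supplying in-context examples instead of retraining. The domain,
# labels, and example clauses are illustrative assumptions.

few_shot_examples = [
    ("The lessee shall pay rent on the first of each month.", "obligation"),
    ("Either party may terminate with 30 days' written notice.", "permission"),
]

def build_prompt(query: str) -> str:
    """Assemble a few-shot prompt: labelled examples followed by the new input."""
    lines = ["Classify each legal clause as 'obligation' or 'permission'.", ""]
    for clause, label in few_shot_examples:
        lines.append(f"Clause: {clause}\nLabel: {label}\n")
    lines.append(f"Clause: {query}\nLabel:")
    return "\n".join(lines)

if __name__ == "__main__":
    # The resulting prompt would be sent to any instruction-following LLM;
    # no model weights are updated, which is the point of in-context adaptation.
    print(build_prompt("The contractor must deliver the report by June 1."))
```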

2) Explain the concept of mode collapse in Generative Adversarial Networks
(GANs). What are its symptoms, and how does it affect the quality and
diversity of generated outputs?
Ans-> Mode collapse is a common failure mode in Generative Adversarial
Networks (GANs). It occurs when the generator becomes fixated on producing a
limited set of outputs, even though the training data contains a much wider
variety of examples: many different latent codes are mapped to nearly identical
samples, so the generator fails to cover all the modes of the data distribution.
Typical symptoms include:
- Lack of diversity: generated samples are very similar to each other, with
little variation in their characteristics.
- Degeneration: the generator may start producing outputs that are increasingly
unrealistic or nonsensical.
- Instability: training becomes unstable, with frequent oscillations in the
generator's and discriminator's loss functions.
Because the generator keeps reproducing a few modes, the outputs lose diversity,
and as training degrades they often lose quality as well.
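
The toy script below is not a full GAN; it is a sketch, using assumed toy data (a
three-mode 1-D Gaussian mixture), of how mode collapse shows up as lost diversity:
a "collapsed" generator that only emits samples near one mode covers far fewer
modes of the real data than a healthy one.

```python
# Toy illustration of mode collapse as a loss of diversity. We compare a
# "healthy" generator that covers all modes of a 1-D Gaussian mixture with a
# "collapsed" generator stuck on a single mode, and count the modes each hits.

import numpy as np

rng = np.random.default_rng(0)
true_modes = np.array([-4.0, 0.0, 4.0])          # the real data has three modes

def sample_healthy(n):
    """Generator that covers every mode of the data distribution."""
    centers = rng.choice(true_modes, size=n)
    return centers + 0.3 * rng.standard_normal(n)

def sample_collapsed(n):
    """Collapsed generator: always emits samples near a single mode."""
    return true_modes[1] + 0.3 * rng.standard_normal(n)

def modes_covered(samples, tol=1.0):
    """Count how many true modes have at least one nearby generated sample."""
    return sum(np.any(np.abs(samples - m) < tol) for m in true_modes)

n = 1000
print("healthy generator covers", modes_covered(sample_healthy(n)), "of 3 modes")
print("collapsed generator covers", modes_covered(sample_collapsed(n)), "of 3 modes")
```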

3) Compare and contrast Variational Autoencoders (VAEs) with standard
autoencoders. What unique capabilities do VAEs offer, particularly in terms of
latent space representation and generative abilities?
Ans-> Variational Autoencoders (VAEs) and standard autoencoders both aim to
compress data into a lower-dimensional latent space and then reconstruct the
original data from this compressed representation, but they differ fundamentally
in how they encode data and in their generative capabilities.

In standard autoencoders, the encoder learns a deterministic mapping from input
data to a fixed latent space. Each input is mapped to a single point, so the model
compresses data without explicitly capturing uncertainty or probability
distributions.

VAEs, on the other hand, introduce probabilistic modeling into the latent space.
Instead of mapping each input to a fixed point, the encoder maps inputs to a
distribution (typically Gaussian), characterized by a mean and a variance. This
allows the VAE to model uncertainty and encourages the latent space to follow a
continuous, structured distribution, usually regularized to stay close to a
standard Gaussian. Because the latent space is continuous and well structured,
new samples can be drawn from the prior and decoded into realistic data, which
is what gives VAEs their generative ability.
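
A minimal PyTorch sketch (not from the assignment) contrasting the two encoders:
the standard autoencoder maps each input to one latent point, while the VAE
encoder outputs a mean and log-variance, samples via the reparameterization
trick, and pays a KL penalty that keeps the latent distribution close to a
standard Gaussian. Layer sizes and dimensions are arbitrary assumptions chosen
for illustration.

```python
import torch
import torch.nn as nn

class DeterministicEncoder(nn.Module):
    """Standard autoencoder encoder: one fixed latent point per input."""
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        self.net = nn.Linear(in_dim, latent_dim)

    def forward(self, x):
        return self.net(x)

class VAEEncoder(nn.Module):
    """VAE encoder: maps each input to a Gaussian q(z|x) and samples from it."""
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        self.mu = nn.Linear(in_dim, latent_dim)
        self.logvar = nn.Linear(in_dim, latent_dim)

    def forward(self, x):
        mu, logvar = self.mu(x), self.logvar(x)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)    # reparameterization trick
        # KL term pulls q(z|x) toward the standard Gaussian prior N(0, I)
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
        return z, kl

x = torch.randn(8, 784)                          # dummy batch of flattened inputs
print(DeterministicEncoder()(x).shape)           # torch.Size([8, 32])
z, kl = VAEEncoder()(x)
print(z.shape, kl.shape)                         # torch.Size([8, 32]) torch.Size([8])
```
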
4) Analyze the runtime complexity of transformer-based text encoder models in
relation to input sequence length. How does the self-attention mechanism
influence this complexity, and what are the implications for processing long
sequences?
Ans-> The runtime complexity of transformer-based text encoder models is
strongly influenced by the self-attention mechanism, one of the key components
of the transformer architecture. For a sequence of n tokens with model dimension
d, every token attends to every other token, so computing the attention score
matrix takes O(n^2 * d) time and the matrix itself requires O(n^2) memory.
Doubling the sequence length therefore roughly quadruples the attention cost,
which makes very long inputs expensive in both compute and memory. In summary,
self-attention introduces quadratic complexity with respect to sequence length,
posing challenges for long-sequence processing; this has spurred the development
of more efficient transformer variants (for example, sparse or linearized
attention) to improve scalability.
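
The sketch below (NumPy, with identity query/key/value projections as a
simplifying assumption) makes the quadratic term visible: the score matrix has
shape (n, n), so the number of attention entries grows with the square of the
sequence length.

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention with identity projections."""
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)                 # (n, n): the quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ X                            # (n, d)

for n in (128, 256, 512):
    X = np.random.randn(n, 64)
    out = self_attention(X)
    # Doubling n quadruples the number of attention scores (n * n).
    print(f"seq len {n:4d} -> score matrix has {n * n:,} entries, output {out.shape}")
```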
