Solution of Assignment 12
1) Discuss the factors that contribute to better domain generalization in Large Language Models (LLMs). How do aspects such as model architecture, training data, and fine-tuning techniques impact an LLM's ability to perform in domains unseen during pretraining?
Ans-> LLMs generalize to new domains when three things come together: a strong architecture, rich training data, and effective fine-tuning. Transformer architectures with large parameter counts let them learn complex patterns. Diverse training data exposes them to many language styles and subject areas. Finally, techniques such as few-shot learning and prompt-tuning allow them to adapt to new tasks without a complete overhaul, so LLMs can tackle new areas with minimal retraining.
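As a rough illustration of the few-shot adaptation idea, here is a minimal sketch assuming the Hugging Face transformers library; the model name, the sentiment task, and the example reviews are placeholders chosen for illustration, not part of the assignment.

    # Few-shot prompting sketch (assumes the `transformers` library is installed;
    # "gpt2" and the sentiment task are illustrative placeholders).
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    # A few in-context examples steer the model toward an unseen task
    # without any gradient updates or retraining.
    prompt = (
        "Review: The battery dies in an hour. Sentiment: negative\n"
        "Review: Gorgeous screen and fast shipping. Sentiment: positive\n"
        "Review: Setup took five minutes and it just works. Sentiment:"
    )

    result = generator(prompt, max_new_tokens=3, do_sample=False)
    print(result[0]["generated_text"])

The same pattern extends to prompt-tuning, where the in-context examples are replaced by learned soft prompt embeddings while the base model stays frozen.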
2) Explain the concept of mode collapse in Generative Adversarial Networks (GANs). What are its symptoms, and how does it affect the quality and diversity of generated outputs?
Ans-> Mode collapse is a common failure mode in Generative Adversarial Networks (GANs). It occurs when the generator becomes fixated on producing a limited set of outputs even though the training data contains a much wider variety of examples, as the generator maps large parts of the latent space to the same few outputs instead of covering the full data distribution. Its main symptoms are:
- Lack of diversity: the generated samples are very similar to each other, with little variation in their characteristics.
- Degeneration: the generator may start producing outputs that are increasingly unrealistic or nonsensical.
- Instability: training becomes unstable, with frequent oscillations in the generator's and discriminator's loss functions.
Together these symptoms mean the generated outputs lose both diversity and, eventually, quality, even if individual samples initially look plausible.
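One practical, if crude, way to spot the lack-of-diversity symptom is to measure how spread out a batch of generated samples is. The sketch below assumes a PyTorch generator G that maps latent vectors to flat sample vectors; the names G, latent_dim, and the batch size are hypothetical.

    # Rough diversity check for mode collapse (PyTorch; generator G,
    # latent_dim, and batch size n are illustrative assumptions).
    import torch

    def mean_pairwise_distance(G, latent_dim=100, n=256):
        z = torch.randn(n, latent_dim)           # sample the latent space broadly
        with torch.no_grad():
            samples = G(z).flatten(start_dim=1)  # one row per generated sample
        d = torch.cdist(samples, samples)        # all pairwise L2 distances
        # Average the off-diagonal entries; a value near zero means the
        # generator emits near-identical outputs for very different z.
        return d.sum() / (n * (n - 1))

Tracking this number during training, alongside the loss curves, makes the collapse visible: it drops sharply when the generator starts reusing the same few modes.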
3) Compare and contrast Variational Autoencoders (VAEs) with standard autoencoders. What unique capabilities do VAEs offer, particularly in terms of latent space representation and generative abilities?
Ans-> Variational Autoencoders (VAEs) and standard autoencoders both aim to compress data into a lower-dimensional latent space and then reconstruct the original data from this compressed representation, but they differ fundamentally in their approach to encoding and in their generative capabilities. In a standard autoencoder, the encoder learns a deterministic mapping from input data to a fixed latent space: each input is mapped to a single point, so the model compresses data without explicitly capturing uncertainty or probability distributions. VAEs, on the other hand, introduce probabilistic modeling into the latent space. Instead of mapping each input to a fixed point, the encoder maps it to a distribution (typically Gaussian) characterized by a mean and a variance. This allows the VAE to model uncertainty and encourages the latent space to follow a continuous, structured distribution, usually regularized to stay close to a standard Gaussian. Because of this structure, new data can be generated simply by sampling latent vectors and decoding them, something a standard autoencoder does not support reliably (a sketch of the encoding step appears after question 4 below).

4) Analyze the runtime complexity of transformer-based text encoder models in relation to input sequence length. How does the self-attention mechanism influence this complexity, and what are the implications for processing long sequences?
Ans-> The runtime complexity of transformer-based text encoder models is strongly influenced by the self-attention mechanism, one of the key components of the transformer architecture. Self-attention compares every token with every other token, so for a sequence of length n it builds an n x n matrix of attention scores; both time and memory therefore grow quadratically, roughly O(n^2 * d) for hidden size d. In summary, the self-attention mechanism introduces quadratic complexity with respect to sequence length, which makes long-sequence processing expensive and has spurred the development of more efficient transformer variants (for example, sparse or linearized attention) to improve scalability.
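Returning to question 3, here is a minimal sketch of the probabilistic encoding and reparameterization step, assuming PyTorch; the class name, layer sizes, and latent dimension are illustrative placeholders.

    # VAE encoder sketch (PyTorch; layer sizes and names are illustrative).
    import torch
    import torch.nn as nn

    class VAEEncoder(nn.Module):
        def __init__(self, in_dim=784, latent_dim=16):
            super().__init__()
            self.hidden = nn.Linear(in_dim, 256)
            self.mu = nn.Linear(256, latent_dim)       # mean of q(z|x)
            self.logvar = nn.Linear(256, latent_dim)   # log-variance of q(z|x)

        def forward(self, x):
            h = torch.relu(self.hidden(x))
            mu, logvar = self.mu(h), self.logvar(h)
            # Reparameterization trick: z = mu + sigma * eps keeps the sampling
            # step differentiable with respect to mu and logvar.
            eps = torch.randn_like(mu)
            z = mu + torch.exp(0.5 * logvar) * eps
            return z, mu, logvar

    # KL term that pulls q(z|x) toward a standard Gaussian during training:
    # kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())

A standard autoencoder would replace the mu/logvar heads with a single deterministic projection, which is exactly why its latent space lacks the smooth, samplable structure described above.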
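To make the quadratic cost in question 4 concrete, the following sketch implements plain scaled dot-product self-attention for a single head; the sequence length and hidden size are arbitrary example values.

    # Scaled dot-product self-attention sketch (PyTorch; shapes illustrative).
    import math
    import torch

    def self_attention(Q, K, V):
        # Q, K, V: (n, d) for n tokens with hidden size d.
        n, d = Q.shape
        scores = Q @ K.T / math.sqrt(d)      # (n, n): every token vs. every token
        weights = torch.softmax(scores, dim=-1)
        return weights @ V                   # time and memory scale with n * n

    n, d = 1024, 64
    x = torch.randn(n, d)
    out = self_attention(x, x, x)            # doubling n roughly quadruples the cost
    print(out.shape)                         # torch.Size([1024, 64])

The (n, n) score matrix is the bottleneck: doubling the input length quadruples both the arithmetic and the memory required, which is precisely what the more efficient attention variants try to avoid.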