
Federated Transfer Learning (FTL) is a learning scheme in which knowledge is transferred from the rich feature space of one party to another party that does not have enough features or labels to train a performant model.

In other words, a powerful party lends its knowledge to a small party that is not able to train a model on its own features alone.

Suppose you have good knowledge of a certain topic; learning allied topics becomes easier, as you can always build on the fundamentals.
Federated Transfer Learning
• Transfer learning is the process of using the knowledge gained by one model for a task different from the one it was originally trained for.
• Federated transfer learning applies vertical federated learning together with a model pre-trained on a similar dataset to solve a different problem.
• One example of federated transfer learning is training a personalised model, e.g. movie recommendations based on a user's past browsing behaviour.
Examples
• Train a large CNN on the general task of image classification for the 1000 most common image classes (such a task may run for days over billions of training images, with top experts optimising the model). Reuse that model to classify new classes of images, such as special kinds of plants or animals, using very few samples, few resources, and no expertise.
• Train a generic language model like BERT and then tune it for a specific task such as question answering.
• Train a typing-prediction model on a phone's keyboard when you don't want to send all the keystrokes to the server, or learn a speech model for a smart speaker without sending all the recordings to the server. In these cases, you train and compute the parameter updates on the device and send only these updates to the central server holding the shared model (a minimal sketch of this flow follows below).
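
The on-device update flow in the last example can be sketched as follows. This is a minimal, illustrative simulation assuming plain NumPy; the clients, the linear model, and the local_step routine are hypothetical placeholders, and the server applies simple federated averaging to the weights it receives.

import numpy as np

def local_step(weights, client_data, lr=0.1):
    # Hypothetical local training: one gradient step of linear regression
    # on the client's private (X, y); only the updated weights leave the device.
    X, y = client_data
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(global_weights, clients):
    # Each client trains locally on its own data.
    local_weights = [local_step(global_weights.copy(), data) for data in clients]
    # The server never sees the raw data; it only averages the received
    # weights into the shared global model (federated averaging).
    return np.mean(local_weights, axis=0)

# Toy example: three clients, each holding private data the server never sees.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(3)]
weights = np.zeros(3)
for _ in range(10):
    weights = federated_round(weights, clients)
print(weights)
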
Methodology
• The first step is to select a pre-trained model that acts as the base model from which knowledge will be transferred to the model at hand.
• There are two ways to use the pre-trained model. The first is to freeze a few layers of the pre-trained model and train the remaining layers on the new dataset for the target model (see the sketch after this list).
• The second approach is to build a new model that incorporates some features extracted from the layers of the pre-trained model.
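
A minimal sketch of the first (layer-freezing) approach, assuming PyTorch and a torchvision ResNet-18 pre-trained on ImageNet; the target task with 10 classes is an arbitrary example, not taken from the slides.

import torch
import torch.nn as nn
import torchvision.models as models

# Base model: pre-trained on the large source task (ImageNet classification).
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the pre-trained layers so the transferred knowledge stays fixed.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a new head for the target task
# (e.g. a small dataset of special plants with 10 classes).
model.fc = nn.Linear(model.fc.in_features, 10)

# Only the new, unfrozen head is trained on the target dataset.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
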
Federated Learning vs Transfer Learning

Federated learning:
• Learns models from decentralized data.
• Ensures data privacy.
• The model is trained locally in each node.
• The locally updated model parameters (not the data) are shared and securely aggregated to build a better global model.
• Learns from a large amount of data while preserving privacy.
• Federated learning is the ability to distribute the learning, often to an edge device (e.g. a phone), such that the training data never leaves the device, yet all devices update a centralized shared model (though quite often you will also create a locally personalized model that is optimised for that service).

Transfer learning:
• The nodes share some overlapping samples but differ in data features.
• Transfer learning can work with less data and in a shorter time, saving computational resources and reducing the cost of building models.
• A powerful party lends its knowledge to a small party that is not able to train alone with its features only.
• Knowledge is transferred from the rich feature space of one party to a party without enough features or labels to train a performant model.
• Re-uses a pre-trained model on a new task to be solved.
• Transfer learning is the ability to take a complex model that was trained for some task A using a huge amount of training data and compute resources, and then, with a small amount of incremental work (little data, few resources), tune it to support a new task B. It is usually driven mostly by sample efficiency (especially when you simply don't have many samples for task B) and compute efficiency.
