
Developers Google Com Machine Learning Crash Course Multi CL

This document provides an overview of multi-class neural networks and the softmax activation function. Softmax assigns a probability to each class in a multi-class problem, with the probabilities summing to 1. It is implemented as a layer before the output layer, with the same number of nodes as the output layer. Softmax calculates probabilities for each possible class but variants like candidate sampling may calculate probabilities for a sample of negative classes to improve efficiency for problems with a large number of classes.

Uploaded by

cantahorn
Copyright
© All Rights Reserved


Multi-Class Neural Networks: Softmax

Estimated Time: 8 minutes

Recall that logistic regression produces a decimal between 0 and 1.0. For example, a logistic regression output of 0.8 from an
email classifier suggests an 80% chance of an email being spam and a 20% chance of it being not spam. Clearly, the sum of
the probabilities of an email being either spam or not spam is 1.0.

Softmax extends this idea into a multi-class world. That is, Softmax assigns decimal probabilities to each class in a multi-
class problem. Those decimal probabilities must add up to 1.0. This additional constraint helps training converge more
quickly than it otherwise would.
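
As a minimal sketch of the idea above, the following computes Softmax over a list of raw scores (logits) in plain Python. The function name `softmax` and the example scores are illustrative assumptions, not part of the course material. Subtracting the maximum score is a common numerical-stability step and does not change the result.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1.0."""
    # Subtract the largest score so exp() cannot overflow; the result is unchanged.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for five classes (illustrative values only).
scores = [-2.0, 1.5, 0.0, 4.5, -2.0]
probs = softmax(scores)
```

Whatever the input scores are, every entry of `probs` lies between 0 and 1 and the entries add up to 1.0, with the largest score receiving the largest probability.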

For example, returning to the image analysis we saw in Figure 1, Softmax might produce the following likelihoods of an image
belonging to a particular class:

Class    Probability
apple    0.001
bear     0.04
candy    0.008
dog      0.95
egg      0.001

Softmax is implemented through a neural network layer just before the output layer. The Softmax layer must have the same
number of nodes as the output layer.

Figure 2. A Softmax layer within a neural network. (Layers shown: hidden, hidden, logits, Softmax. Outputs: apple, bear, candy, dog, egg, each a yes/no question.)
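
For reference, the Softmax computation can be written as the following equation. The symbols are assumptions introduced here for illustration: $z_k$ is the logit for class $k$, and $K$ is the number of classes, which equals the number of nodes in the Softmax layer.

```latex
p(y = j \mid \mathbf{x}) = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}
```

Because every term is positive and the denominator is the sum of all numerators, the $K$ outputs are each between 0 and 1 and add up to 1.0.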


Softmax Options

Consider the following variants of Softmax:

Full Softmax is the Softmax we've been discussing; that is, Softmax calculates a probability for every possible class.

Candidate sampling means that Softmax calculates a probability for all the positive labels but only for a random sample
of negative labels. For example, if we are interested in determining whether an input image is a beagle or a bloodhound,
we don't have to provide probabilities for every non-doggy example.

Full Softmax is fairly cheap when the number of classes is small but becomes prohibitively expensive when the number of
classes climbs. Candidate sampling can improve efficiency in problems having a large number of classes.
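
The difference in cost can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the course's or any library's implementation: `sampled_softmax`, `toy_logit`, and all values are hypothetical, and the sketch omits the sampling-bias correction that production implementations typically apply.

```python
import math
import random

def sampled_softmax(logit_fn, positive, num_classes, num_sampled, rng):
    """Softmax over the positive class plus a random sample of negatives.

    logit_fn(c) returns the raw score for class c. It is called only for
    the selected classes, which is where the savings come from.
    """
    # Building this list is cheap next to scoring every class with the model.
    negatives = [c for c in range(num_classes) if c != positive]
    candidates = [positive] + rng.sample(negatives, num_sampled)
    logits = [logit_fn(c) for c in candidates]
    shift = max(logits)  # numerical-stability shift
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return {c: e / total for c, e in zip(candidates, exps)}

# Hypothetical setup: 10,000 classes and a stand-in scoring function.
rng = random.Random(0)
calls = []  # records which classes were actually scored

def toy_logit(c):
    calls.append(c)
    return 5.0 if c == 42 else (c % 7) * 0.1

probs = sampled_softmax(toy_logit, positive=42, num_classes=10_000,
                        num_sampled=20, rng=rng)
```

Here only 21 of the 10,000 classes are scored (the positive label plus 20 sampled negatives), whereas full Softmax would score all of them.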

One Label vs. Many Labels

Softmax assumes that each example is a member of exactly one class. Some examples, however, can simultaneously be a
member of multiple classes. For such examples:

You may not use Softmax.

You must rely on multiple logistic regressions.

For example, suppose your examples are images containing exactly one item—a piece of fruit. Softmax can determine the
likelihood of that one item being a pear, an orange, an apple, and so on. If your examples are images containing all sorts of
things—bowls of different kinds of fruit—then you'll have to use multiple logistic regressions instead.
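
A minimal sketch of the many-labels case, assuming one independent logistic (sigmoid) output per class. The labels, scores, and the 0.5 decision threshold are hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps a raw score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical per-class scores for one image of a fruit bowl.
logits = {"pear": 2.0, "orange": 1.5, "apple": -3.0}

# One independent logistic regression output per class.
probs = {label: sigmoid(z) for label, z in logits.items()}

# Each label is decided on its own; 0.5 is an assumed threshold.
present = [label for label, p in probs.items() if p >= 0.5]
```

Unlike Softmax outputs, these probabilities are not constrained to sum to 1.0, so several labels can be likely at the same time.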

Key Terms

candidate sampling
logistic regression
multi-class
softmax


Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under
the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2020-03-17 UTC.
