TensorFlow Workshop

The document describes TensorFlow, a machine learning framework that uses dataflow graphs to represent and execute computations. Key points:

● TensorFlow defines computations as graphs of nodes called operations or ops, with edges representing multidimensional tensors flowing between them.
● Common ops include constants, variables, math operations, neural network layers, and optimizers. Variables hold persistent state across executions.
● The core API focuses on graph construction and execution. Higher-level APIs and libraries build on it for machine learning tasks like training neural networks.
● Distributed execution is supported via remote sessions that deploy subgraphs to worker devices like CPUs and GPUs. Training utilities help with common tasks like checkpointing and summaries.


Core TF Model
Yet another dataflow system

Graph of Nodes, also called Operations or ops.

[Figure: a dataflow graph in which examples, weights, biases, and labels feed MatMul, Add, Relu, and Xent nodes.]

Yet another dataflow system, with tensors

Edges are N-dimensional arrays: Tensors.

[Figure: the same graph, with tensors flowing along the edges.]

Yet another dataflow system, with state

'Biases' is a variable. Some ops compute gradients. −= updates biases.

[Figure: the biases variable feeds Add; gradient ops feed a Mul with the learning rate, and the result feeds a −= op that updates biases.]

Yet another dataflow system, distributed

Devices: Processes, Machines, GPUs, etc.

[Figure: the same graph partitioned across Device A and Device B.]


What's not in the Core Model

● Anything about neural networks, machine learning, ...


● Anything about backpropagation, differentiation, ...
● Anything about gradient descent, parameter servers…

These are built by combining existing operations, or defining new operations.

The core system can be applied to problems other than machine learning.


Core TF API
API Families

Graph Construction
● Assemble a Graph of Operations.

Graph Execution
● Deploy and execute operations in a Graph.
Hello, world!

from google3.learning.brain.public import tensorflow as tf

# Create an operation.
hello = tf.Constant("Hello, world!")
# Create a session with the "local" TensorFlow runtime.
sess = tf.Session("local")
# Execute that operation and print its result.
print(sess.Run(hello))
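
The import above is the Google-internal package path. As a rough sketch, the same program in the open-source TensorFlow 1.x API (tf.constant, tf.Session, sess.run) would look like this:

import tensorflow as tf

# Create an operation.
hello = tf.constant("Hello, world!")
# Create a session backed by the local, in-process runtime.
sess = tf.Session()
# Execute that operation and print its result.
print(sess.run(hello))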
Graph Construction

Library of predefined Ops


● Constant, Variables, Math ops, etc.

Functions to add Ops for common needs


● Gradients: Add Ops to compute derivatives.
● Training methods: Add Ops to update variables (SGD, Adagrad, etc.)

All operations are added to a global Default Graph.


Slightly more advanced calls let you control the Graph more precisely.
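
For example, with the open-source API you can build into an explicit Graph instead of the global default (a small sketch, not from the original deck):

import tensorflow as tf

a = tf.constant(1.0)      # goes into the global default graph

g = tf.Graph()
with g.as_default():
  b = tf.constant(2.0)    # goes into g instead of the default graph

assert b.graph is g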
Constant

[Figure: a Constant op producing a Value output.]

Op that outputs a constant value when run. (Surprising?)

from google3.learning.brain.public.tensorflow import *


a = Constant([1.0, 2.0, 3.0, 4.0]) # float vector
b = Constant([[5, 6], [7, 8]]) # int32 2x2 matrix

import numpy as np
c = Constant(np.random.rand(2, 4, 6, 8)) # double 2x4x6x8 tensor
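
The same examples in the open-source API (a sketch; dtype inference behaves the same way):

import numpy as np
import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0, 4.0])        # float32 vector
b = tf.constant([[5, 6], [7, 8]])            # int32 2x2 matrix
c = tf.constant(np.random.rand(2, 4, 6, 8))  # float64 2x4x6x8 tensor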
Variable

[Figure: a Variable op holding State, with Value and Reference outputs.]

Op that holds state that persists across calls to Run()

v = Variable(shape=[4, 3], dtype=DT_FLOAT) # float 4x3 matrix


Variables

[Figure: a Variable op holding State, with Value and Reference outputs.]

Some Ops modify the Variable state: InitVariable, Assign, AssignSub, AssignAdd.

init = Assign(v, RandomParameters(shape=v.shape))

Updates the variable value when run.


[Figure: a RandomParameters op feeds Assign, which updates the Variable's State and also outputs the value for convenience.]
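
A comparable sketch with the open-source API (tf.Variable, tf.assign, and the variable's initializer):

import tensorflow as tf

v = tf.Variable(tf.random_uniform([4, 3]))   # float 4x3 matrix
assign_op = tf.assign(v, tf.zeros([4, 3]))   # overwrites v when run

with tf.Session() as sess:
  sess.run(v.initializer)   # runs the Assign op created for initialization
  sess.run(assign_op)       # updates the variable value when run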
Math Ops

A variety of Operations for linear algebra, convolutions, etc.

c = Constant(...)
w = Variable(...)
b = Variable(...)
y = Add(MatMul(c, w), b)

[Figure: c and w feed MatMul; its result and b feed Add, producing y.]

Overloaded Python operators help with construction: y = MatMul(c, w) + b


Operations, plenty of them

Documentation at http://go/tensorflow-ops

● Array ops
  ○ Concat
  ○ Slice
  ○ Reshape
  ○ ...
● Math ops
  ○ Linear algebra (MatMul, …)
  ○ Component-wise ops (Mul, ...)
  ○ Reduction ops (Sum, …)
● Neural network ops
  ○ Non-linearities (Relu, …)
  ○ Convolutions (Conv2D, …)
  ○ Pooling (AvgPool, …)
● ...and many more
  ○ Constants, Data flow, Control flow, Embedding, Initialization, I/O, Legacy Input Layers, Logging, Random, Sparse, State, Summary, Lua, etc.
Graph Construction Helpers

● Gradients
● Optimizers
● Higher-Level APIs in core TF
● Higher-Level libraries outside core TF
Gradients

Given a loss, add Ops to compute gradients for Variables.

[Figure: var0 and var1 flow through many ops to produce loss.]
Gradients

Gradients(loss, [var0, var1]) # Generate gradients

[Figure: gradient Ops are added that flow backward through the many ops, producing "Gradients for var0" and "Gradients for var1".]

Example: Gradients for MatMul

[Figure: forward, x and w feed MatMul to produce y; backward, gy and Transpose(w) feed a MatMul producing gx, while Transpose(x) and gy feed a MatMul producing gw.]
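
In the open-source API this is tf.gradients; a small sketch with made-up variables:

import tensorflow as tf

x = tf.constant([[1.0, 2.0]])
w = tf.Variable([[3.0], [4.0]])
loss = tf.reduce_sum(tf.matmul(x, w))

# Adds gradient Ops to the graph; returns one tensor per listed variable.
grads = tf.gradients(loss, [w])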
Optimizers

Apply gradients to Variables: SGD(var, grad, learning_rate)

[Figure: grad and learning_rate feed Mul; its result and var feed AssignSub.]

Note: learning_rate is just the output of an Op, so it can easily be decayed.


Easily Add Optimizers

Builtin
● SGD, Adagrad, Momentum, Adam, …

Contributed
● LazyAdam, NAdam, YellowFin, ...
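
A sketch of the open-source equivalent, with a decayed learning rate as noted above (toy loss and made-up hyperparameters):

import tensorflow as tf

w = tf.Variable(1.0)
loss = tf.square(w - 3.0)

global_step = tf.train.get_or_create_global_step()
learning_rate = tf.train.exponential_decay(
    0.1, global_step, decay_steps=1000, decay_rate=0.96)
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(
    loss, global_step=global_step)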
Putting it all together to train a Neural Net

Build a Graph by adding Operations:


● For Variables to hold the parameters of the Neural Net.
● To compute the Neural Net output: e.g. classification predictions.
● To compute a training loss: e.g. cross entropy, parameter L2 norms.
● To calculate gradients for the parameters to train.
● To apply gradients with a training function.
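
A minimal end-to-end sketch of this checklist in the open-source TF 1.x API (a toy two-layer classifier on random data; all sizes and names are made up):

import numpy as np
import tensorflow as tf

# Variables hold the parameters of the Neural Net.
x = tf.placeholder(tf.float32, [None, 4])
labels = tf.placeholder(tf.int64, [None])
w1 = tf.Variable(tf.random_normal([4, 8]))
b1 = tf.Variable(tf.zeros([8]))
w2 = tf.Variable(tf.random_normal([8, 3]))
b2 = tf.Variable(tf.zeros([3]))

# Ops that compute the Neural Net output and the training loss.
hidden = tf.nn.relu(tf.matmul(x, w1) + b1)
logits = tf.matmul(hidden, w2) + b2
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits))

# Ops that calculate and apply gradients.
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for _ in range(100):
    xs = np.random.rand(32, 4).astype(np.float32)
    ys = np.random.randint(0, 3, size=32)
    sess.run(train_op, feed_dict={x: xs, labels: ys})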
MNIST Example

tutorials/mnist/mnist.py
● Shows both training and evaluation
● Also shows InputLayers: Ops that read data from files
Distributed Execution
Graph Execution

Session API
● Stubby-based API to deploy a Graph in a TensorFlow runtime
● Can run any subset of the graph
● Can add Ops to an existing Graph (e.g. for interactive use in Colab)

Training Utilities
● Checkpoint, Recovery, Summaries, Avisu, Replicas, etc.
TensorFlow Runtimes

● "local": In address space of the Python program.


● Remote: In servers typically running on Borg.
Local Runtime

[Figure: a Python program (create graph, create session, sess.Run()) talks to a Session in the local Runtime, which manages the CPU and GPU devices.]
Remote Runtime

[Figure: the Python program (create graph, create session, sess.Run()) issues CreateGraph(), Run([ops]), and GetTensor() calls to a Session on a Master; the Master issues RunSubGraph() calls to Workers, each owning CPU and GPU devices.]
Deploying Graph, Running Ops

# ...Add ops to the graph...


sess = session.Session("local") # Deploy graph
Running and fetching output

[Figure: a Fetch on an op's output.]

# Run an Op and fetch its output.


# "values" is a numpy ndarray.
values = sess.Run(<an op output>)
Running and fetching output

[Figure: a Fetch on an op's output.]

The transitive closure of needed Ops is run.


Execution happens in parallel
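
In the open-source API, fetching is sess.run(); a short sketch:

import tensorflow as tf

y = tf.matmul(tf.constant([[1.0, 2.0]]), tf.constant([[3.0], [4.0]]))
with tf.Session() as sess:
  values = sess.run(y)   # numpy ndarray; only ops y depends on are executed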
Feeding input, Running, and Fetching

[Figure: a Feed on an op's input and a Fetch on its output.]

a_val = ...a numpy ndarray...


values = sess.Run(<an op output>,
                  feed_input({<a output>: a_val}))
Feeding input, Running, and Fetching

[Figure: a Feed on an op's input and a Fetch on its output.]

Only the required Ops are run.
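
The open-source equivalent feeds values through feed_dict; a small sketch with a placeholder:

import numpy as np
import tensorflow as tf

a = tf.placeholder(tf.float32, [2, 2])   # fed at run time
b = tf.matmul(a, a)

a_val = np.ones((2, 2), dtype=np.float32)
with tf.Session() as sess:
  values = sess.run(b, feed_dict={a: a_val})   # feed a, fetch b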


Higher-Level Core TF API
Training Utilities

A training program typically runs multiple threads:


● Execute the training op in a loop.
● Checkpoint every so often.
● Gather summaries for the Visualizer.
● Others, e.g. monitoring NaNs, costs, etc.
Training Coordinator, Training Threads

Helper objects for multithreaded training


● Thread classes to execute training op, summaries, etc
● Coordinator to start/stop them together, manage summaries

Makes it easy to train single or multiple replicas

Example:
learning/brain/models/mnist/mnist_replicas.py
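
A sketch of how the Coordinator starts and stops training threads together (open-source tf.train.Coordinator; the loop body is a stand-in):

import threading
import tensorflow as tf

coord = tf.train.Coordinator()

def train_loop(coord):
  step = 0
  while not coord.should_stop():
    # sess.run(train_op) would go here.
    step += 1
    if step >= 1000:
      coord.request_stop()

threads = [threading.Thread(target=train_loop, args=(coord,))
           for _ in range(2)]
for t in threads:
  t.start()
coord.join(threads)   # blocks until all threads have stopped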
Layers are ops that create Variables

def embedding(x, vocab_size, dense_size,
              name=None, reuse=None, multiplier=1.0):
  """Embed x of type int64 into dense vectors."""
  with tf.variable_scope(  # Use scopes like this.
      name, default_name="emb", values=[x], reuse=reuse):
    embedding_var = tf.get_variable(
        "kernel", [vocab_size, dense_size])
    return tf.gather(embedding_var, x)
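
Hypothetical usage of the layer above (token ids and sizes are made up):

ids = tf.constant([[3, 1, 4], [1, 5, 9]], dtype=tf.int64)
vectors = embedding(ids, vocab_size=1000, dense_size=64, name="tokens")
# vectors has shape [2, 3, 64]; calling again with reuse=True shares "kernel".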
Models are built from Layers
def bytenet(inputs, targets, hparams):
  final_encoder = common_layers.residual_dilated_conv(
      inputs, hparams.num_block_repeat, "SAME", "encoder", hparams)
  shifted_targets = common_layers.shift_left(targets)
  kernel = (hparams.kernel_height, hparams.kernel_width)
  decoder_start = common_layers.conv_block(
      tf.concat([final_encoder, shifted_targets], axis=3),
      hparams.hidden_size, [((1, 1), kernel)], padding="LEFT")
  return common_layers.residual_dilated_conv(
      decoder_start, hparams.num_block_repeat,
      "LEFT", "decoder", hparams)
Estimator
Estimator and Experiment

estimator = tf.contrib.learn.Estimator(
    model_fn=model_builder(model_name, hparams=hparams),
    model_dir=output_dir,
    config=tf.contrib.learn.RunConfig(master=...))
experiment = tf.contrib.learn.Experiment(
    estimator=estimator,
    train_input_fn=f1, eval_input_fn=f2,
    eval_metrics=eval_metrics, train_steps=train_steps,
    eval_steps=eval_steps, train_monitors=train_monitors)
model_fn
def model_fn(features, targets, mode):
  """Creates the prediction, loss, and train ops.

  Args:
    features: A dictionary of tensors keyed by feature name.
    targets: A tensor representing the labels (targets).
    mode: The execution mode, one of tf.contrib.learn.ModeKeys.

  Returns:
    A tuple: prediction, loss, and train_op.
  """
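
A minimal sketch of what such a model_fn might return (a toy linear model; the feature name and sizes are assumptions):

def model_fn(features, targets, mode):
  x = features["x"]                  # assumed feature name
  w = tf.get_variable("w", [4, 1])   # assumed input width of 4
  prediction = tf.matmul(x, w)
  loss = tf.reduce_mean(tf.square(prediction - targets))
  train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
      loss, global_step=tf.train.get_global_step())
  return prediction, loss, train_op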
input_fn
def input_fn():
  """Supplies input to our model.

  This function supplies input to our model, where this input is a
  function of the mode. For example, we supply different data if
  we're performing training versus evaluation.

  Returns:
    A tuple consisting of 1) a dictionary of tensors whose keys are
    the feature names, and 2) a tensor of target labels if the mode
    is not INFER (and None, otherwise).
  """
High-Level External Libraries:
Tensor2Tensor
Tensor2Tensor (github)
Define, train, and evaluate ML tasks and models (especially sequence tasks).
● Many datasets (WMT, MSCoco, LM1B, etc.) and models (Transformer,
ByteNet, NeuralGPU, LSTM) already built in - mix and match!
● Eminently extensible - add a new Problem, T2TModel, or Modality
● Easy distributed training, both sync and async (and with support for multiple
GPUs per machine)
● Easy hyperparameter tuning
Tensor2Tensor Organization
● data_generators/ : generators for datasets which subclass Problem
● models/ : layers and models, models must subclass T2TModel
● utils/ : utilities, t2t_model class, etc.
● google/ : internal stuff (organized in the same way)
● t2t_trainer.py: main binary called to train:

t2t-trainer \
  --data_dir=$DATA_DIR --problems=$PROBLEM --model=$MODEL \
  --hparams_set=$HPARAMS --output_dir=$TRAIN_DIR
Tensor2Tensor Models
@registry.register_model
class ByteNet(t2t_model.T2TModel):

  def model_fn_body(self, features):
    return bytenet_internal(
        features["inputs"], features["targets"], self._hparams)
Notes:
● T2TModels are registered in a registry and get a name (byte_net)
● They may implement just model_fn_body; a default model_fn is provided.
Adding a Problem
@registry.register_problem("wmt_ende_tokens_8k")
class WMTEnDeTokens8k(WMTProblem):
  """Problem spec for WMT En-De translation."""

  @property
  def targeted_vocab_size(self):
    return 2**13  # 8192

  def train_generator(self, data_dir, tmp_dir, train):
    yield {"inputs": [1, 2], "targets": [3, 4]}
Let the Tensors Flow!
