Notes for variable_scope, name_scope, and weight sharing #43

taehoonlee opened this issue Mar 4, 2019 · 0 comments

TensorNets integrates seamlessly with regular TensorFlow APIs. You can define any model under tf.variable_scope and tf.name_scope to couple it with your existing scripts. This document shows basic examples of tf.variable_scope, tf.name_scope, and weight sharing. First, import the two libraries:

import tensorflow as tf
import tensornets as nets

Let's get started with basic TensorFlow APIs. You can manage the prefixes of variable and tensor names with tf.variable_scope and tf.name_scope. The difference is that a tf.Variable created by tf.get_variable will be affected by only tf.variable_scope, while a tf.Tensor by both tf.variable_scope and tf.name_scope. Also, a second tf.get_variable('w', [1]) will raise a ValueError because it tries to create the same variable again under the default tf.variable_scope(reuse=None), or will return a reference to the existing variable otherwise (reuse=True, reuse=tf.AUTO_REUSE). Here is an example:

with tf.name_scope('foo'):
  with tf.variable_scope('goo'):
    with tf.name_scope('hoo'):

      # `tf.Variable` will be affected by only `tf.variable_scope`.
      w = tf.get_variable('w', [1])
      assert w.name == 'goo/w:0'

      # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
      s = tf.constant(-1.0)
      y = s * w
      assert s.name == 'foo/goo/hoo/Const:0'
      assert y.name == 'foo/goo/hoo/mul:0'

      # `tf.get_variable` will raise a ValueError because it tries to create
      # the same variable again under the default `tf.variable_scope(reuse=None)`.
      try:
        w2 = tf.get_variable('w', [1])
      except ValueError as e:
        print(e)  # Variable goo/w already exists, disallowed.
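
To retrieve the existing variable instead of raising an error, reopen the scope with reuse enabled. A minimal sketch of the success case described above:

with tf.variable_scope('goo', reuse=True):
  # With `reuse=True`, `tf.get_variable` returns the existing `goo/w`
  # instead of trying to create it again.
  w2 = tf.get_variable('w', [1])
  assert w2 is w
  assert w2.name == 'goo/w:0'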

The principle extends directly to TensorNets. The weights returned by get_weights are tf.Variable, and the outputs from get_outputs and get_middles are tf.Tensor. Thus, the weights will be affected by only tf.variable_scope, while the outputs and the middles by both tf.variable_scope and tf.name_scope. Naturally, calling the model function a second time fails without reuse=True or reuse=tf.AUTO_REUSE, because the function will try to create the same variables again.

with tf.name_scope('xoo'):
  with tf.variable_scope('yoo'):
    with tf.name_scope('zoo'):

      # The weights returned by `get_weights` are `tf.Variable`,
      # and the outputs from `get_outputs` and `get_middles` are `tf.Tensor`
      x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
      model1 = nets.ResNet50(x1)

      # `tf.Variable` will be affected by only `tf.variable_scope`.
      assert model1.get_weights()[-1].name == 'yoo/resnet50/logits/biases:0'

      # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
      assert model1.name == 'xoo/yoo/zoo/resnet50/probs:0'
      assert model1.get_outputs()[-1].name == 'xoo/yoo/zoo/resnet50/probs:0'
      assert model1.get_middles()[-1].name == 'xoo/yoo/zoo/resnet50/conv5/block3/out:0'

      # A second model call will raise a ValueError because it tries to create
      # the same variables again under the default `tf.variable_scope(reuse=None)`.
      try:
        x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
        model2 = nets.ResNet50(x2)
      except ValueError as e:
        print(e)  # Variable yoo/resnet50/conv1/conv/weights already exists, disallowed.
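
As in the plain TensorFlow example, reopening the variable scope with reuse enabled makes a second model call succeed. A minimal sketch (`x3` and `model3` are introduced here only for illustration):

with tf.variable_scope('yoo', reuse=True):
  x3 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x3')
  model3 = nets.ResNet50(x3)

# `model3` reuses the variables created by `model1`.
for (a, b) in zip(model1.get_weights(), model3.get_weights()):
  assert a == b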

We can easily implement weight sharing by using tf.variable_scope(reuse=tf.AUTO_REUSE). An example is as follows:

with tf.variable_scope('boo', reuse=tf.AUTO_REUSE):
  w1 = tf.get_variable('w', [1])
  w2 = tf.get_variable('w', [1])
  assert w1 == w2
  assert w1.name == 'boo/w:0'
  s = tf.constant(-1.0)
  y1 = s * w1
  y2 = s * w2
  assert y1 != y2
  assert y1.name == 'boo/mul:0'
  assert y2.name == 'boo/mul_1:0'

TensorNets can also be easily integrated with tf.variable_scope:

with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
  x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
  x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
  model1 = nets.ResNet50(x1)
  model2 = nets.ResNet50(x2)
  for (a, b) in zip(model1.get_weights(), model2.get_weights()):
    assert a == b
  assert model1.get_weights()[-1].name == 'koo/resnet50/logits/biases:0'
  assert model1 != model2
  assert model1.name == 'koo/resnet50/probs:0'
  assert model2.name == 'koo/resnet50_1/probs:0'

Summary

I'd like to note that there are two patterns for implementing weight sharing (the third below is equivalent to the second):

  1. with tf.variable_scope:

     with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
         model1 = nets.ResNet50(x1)
         model2 = nets.ResNet50(x2)

  2. without tf.variable_scope:

     model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
     model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

  3. equivalent to 2, using functools.partial:

     import functools
     resnet = functools.partial(nets.ResNet50, reuse=tf.AUTO_REUSE)
     model1 = resnet(x1)
     model2 = resnet(x2)

And I recommend the following pattern, which tf.slim uses when deploying multiple clones:

with tf.name_scope('clone0'):
    model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
with tf.name_scope('clone1'):
    model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

for (a, b) in zip(model1.get_weights(), model2.get_weights()):
    assert a == b

assert model1.name == 'clone0/resnet50/probs:0'
assert model2.name == 'clone1/resnet50/probs:0'
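
The same pattern extends to multi-GPU deployment by adding explicit device placement. A hedged sketch (the devices, scope names, and loop here are illustrative assumptions, not TensorNets APIs):

clones = []
for i in range(2):
  # Each clone gets its own name scope and device,
  # while `reuse=tf.AUTO_REUSE` shares all the weights.
  with tf.device('/gpu:%d' % i), tf.name_scope('gpu%d' % i):
    x = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x')
    clones.append(nets.ResNet50(x, reuse=tf.AUTO_REUSE))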

Without tf.name_scope, tf.Tensor names will be suffixed automatically (resnet50, resnet50_1, ...), and I think it may be difficult to manage tensor names in such cases.
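
A minimal sketch of the automatic suffixing (assuming a fresh graph):

tf.reset_default_graph()
x1 = tf.placeholder(tf.float32, [None, 224, 224, 3])
x2 = tf.placeholder(tf.float32, [None, 224, 224, 3])
model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

# The tensor names are suffixed automatically, while the weights are shared.
assert model1.name == 'resnet50/probs:0'
assert model2.name == 'resnet50_1/probs:0'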
