Notes for variable_scope, name_scope, and weight sharing #43

taehoonlee opened this issue Mar 4, 2019 · 0 comments

TensorNets integrates seamlessly with regular TensorFlow APIs. You can define any model under tf.variable_scope and tf.name_scope to couple it with your existing scripts. This document shows basic examples of tf.variable_scope, tf.name_scope, and weight sharing. First, import the two libraries:

import tensorflow as tf
import tensornets as nets

Let's get started with basic TensorFlow APIs. You can manage the prefixes of variable and tensor names with tf.variable_scope and tf.name_scope. The difference is that a tf.Variable created by tf.get_variable will be affected by only tf.variable_scope, while a tf.Tensor by both tf.variable_scope and tf.name_scope. Also, a second tf.get_variable('w', [1]) will raise a ValueError because it tries to create the same variable again under the default tf.variable_scope(reuse=None), or will return a reference to the existing variable otherwise (reuse=True, reuse=tf.AUTO_REUSE). Here is an example:

with tf.name_scope('foo'):
  with tf.variable_scope('goo'):
    with tf.name_scope('hoo'):

      # `tf.Variable` will be affected by only `tf.variable_scope`.
      w = tf.get_variable('w', [1])
      assert w.name == 'goo/w:0'

      # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
      s = tf.constant(-1.0)
      y = s * w
      assert s.name == 'foo/goo/hoo/Const:0'
      assert y.name == 'foo/goo/hoo/mul:0'

      # `tf.get_variable` will raise a ValueError because it tries to create
      # the same variable again under the default `tf.variable_scope(reuse=None)`.
      try:
        w2 = tf.get_variable('w', [1])
      except ValueError as e:
        print(e)  # Variable goo/w already exists, disallowed.
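
To retrieve the existing variable instead of raising an error, reopen the scope with reuse enabled. A minimal sketch of the success case described above:

with tf.variable_scope('goo', reuse=True):
  # With `reuse=True`, `tf.get_variable` returns the existing `goo/w`
  # instead of trying to create it again.
  w2 = tf.get_variable('w', [1])
  assert w2 is w
  assert w2.name == 'goo/w:0'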

The principle extends directly to TensorNets. The weights returned by get_weights are tf.Variable, and the outputs from get_outputs and get_middles are tf.Tensor. Thus, the weights will be affected by only tf.variable_scope, while the outputs and the middles by both tf.variable_scope and tf.name_scope. Naturally, calling the model function a second time fails without reuse=True or reuse=tf.AUTO_REUSE, because the function will try to create the same variables again.

with tf.name_scope('xoo'):
  with tf.variable_scope('yoo'):
    with tf.name_scope('zoo'):

      # The weights returned by `get_weights` are `tf.Variable`,
      # and the outputs from `get_outputs` and `get_middles` are `tf.Tensor`
      x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
      model1 = nets.ResNet50(x1)

      # `tf.Variable` will be affected by only `tf.variable_scope`.
      assert model1.get_weights()[-1].name == 'yoo/resnet50/logits/biases:0'

      # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
      assert model1.name == 'xoo/yoo/zoo/resnet50/probs:0'
      assert model1.get_outputs()[-1].name == 'xoo/yoo/zoo/resnet50/probs:0'
      assert model1.get_middles()[-1].name == 'xoo/yoo/zoo/resnet50/conv5/block3/out:0'

      # A second model call will raise a ValueError because it tries to create
      # the same variables again under the default `tf.variable_scope(reuse=None)`.
      try:
        x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
        model2 = nets.ResNet50(x2)
      except ValueError as e:
        print(e)  # Variable yoo/resnet50/conv1/conv/weights already exists, disallowed.
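
As in the plain TensorFlow example, reopening the variable scope with reuse enabled makes a second model call succeed. A minimal sketch (`x3` and `model3` are introduced here only for illustration):

with tf.variable_scope('yoo', reuse=True):
  x3 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x3')
  model3 = nets.ResNet50(x3)

# `model3` reuses the variables created by `model1`.
for (a, b) in zip(model1.get_weights(), model3.get_weights()):
  assert a == b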

We can easily implement weight sharing by using tf.variable_scope(reuse=tf.AUTO_REUSE). An example is as follows:

with tf.variable_scope('boo', reuse=tf.AUTO_REUSE):
  w1 = tf.get_variable('w', [1])
  w2 = tf.get_variable('w', [1])
  assert w1 == w2
  assert w1.name == 'boo/w:0'
  s = tf.constant(-1.0)
  y1 = s * w1
  y2 = s * w2
  assert y1 != y2
  assert y1.name == 'boo/mul:0'
  assert y2.name == 'boo/mul_1:0'

TensorNets can also be easily integrated with tf.variable_scope:

with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
  x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
  x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
  model1 = nets.ResNet50(x1)
  model2 = nets.ResNet50(x2)
  for (a, b) in zip(model1.get_weights(), model2.get_weights()):
    assert a == b
  assert model1.get_weights()[-1].name == 'koo/resnet50/logits/biases:0'
  assert model1 != model2
  assert model1.name == 'koo/resnet50/probs:0'
  assert model2.name == 'koo/resnet50_1/probs:0'

Summary

I'd like to note that there are two patterns for implementing weight sharing (the third below is equivalent to the second):

  1. with tf.variable_scope:

     with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
         model1 = nets.ResNet50(x1)
         model2 = nets.ResNet50(x2)

  2. without tf.variable_scope:

     model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
     model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

  3. equivalent to 2, using functools.partial:

     import functools
     resnet = functools.partial(nets.ResNet50, reuse=tf.AUTO_REUSE)
     model1 = resnet(x1)
     model2 = resnet(x2)

And I recommend the following pattern, which tf.slim uses when deploying multiple clones:

with tf.name_scope('clone0'):
    model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
with tf.name_scope('clone1'):
    model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

for (a, b) in zip(model1.get_weights(), model2.get_weights()):
    assert a == b

assert model1.name == 'clone0/resnet50/probs:0'
assert model2.name == 'clone1/resnet50/probs:0'
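
The same pattern extends to multi-GPU deployment by adding explicit device placement. A hedged sketch (the devices, scope names, and loop here are illustrative assumptions, not TensorNets APIs):

clones = []
for i in range(2):
  # Each clone gets its own name scope and device,
  # while `reuse=tf.AUTO_REUSE` shares all the weights.
  with tf.device('/gpu:%d' % i), tf.name_scope('gpu%d' % i):
    x = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x')
    clones.append(nets.ResNet50(x, reuse=tf.AUTO_REUSE))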

Without tf.name_scope, tf.Tensor names will be suffixed automatically (resnet50, resnet50_1, ...), and I think it may be difficult to manage tensor names in such cases.
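
A minimal sketch of the automatic suffixing (assuming a fresh graph):

tf.reset_default_graph()
x1 = tf.placeholder(tf.float32, [None, 224, 224, 3])
x2 = tf.placeholder(tf.float32, [None, 224, 224, 3])
model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

# The tensor names are suffixed automatically, while the weights are shared.
assert model1.name == 'resnet50/probs:0'
assert model2.name == 'resnet50_1/probs:0'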
