This document is TensorFlow reference documentation; this repost is authorized by the TensorFlow Chinese community.
Note: Functions taking Tensor arguments can also take anything accepted by tf.convert_to_tensor.
See the Variables How To for a high-level overview.
A variable maintains state in the graph across calls to run(). You add a variable to the graph by constructing an instance of the class Variable.
The Variable() constructor requires an initial value for the variable, which can be a Tensor of any type and shape. The initial value defines the type and shape of the variable. After construction, the type and shape of the variable are fixed. The value can be changed using one of the assign methods.
If you want to change the shape of a variable later you have to use an assign Op with validate_shape=False.
Just like any Tensor, variables created with Variable() can be used as inputs for other Ops in the graph. Additionally, all the operators overloaded for the Tensor class are carried over to variables, so you can also add nodes to the graph by just doing arithmetic on variables.
    import tensorflow as tf

    # Create a variable.
    w = tf.Variable(<initial-value>, name=<optional-name>)

    # Use the variable in the graph like any Tensor.
    y = tf.matmul(w, ...another variable or tensor...)

    # The overloaded operators are available too.
    z = tf.sigmoid(w + b)

    # Assign a new value to the variable with `assign()` or a related method.
    w.assign(w + 1.0)
    w.assign_add(1.0)

When you launch the graph, variables have to be explicitly initialized before you can run Ops that use their value. You can initialize a variable by running its initializer op, restoring the variable from a save file, or simply running an assign Op that assigns a value to the variable. In fact, the variable initializer op is just an assign Op that assigns the variable's initial value to the variable itself.
    # Launch the graph in a session.
    with tf.Session() as sess:
        # Run the variable initializer.
        sess.run(w.initializer)
        # ...you now can run ops that use the value of 'w'...

The most common initialization pattern is to use the convenience function initialize_all_variables() to add an Op to the graph that initializes all the variables. You then run that Op after launching the graph.
    # Add an Op to initialize all variables.
    init_op = tf.initialize_all_variables()

    # Launch the graph in a session.
    with tf.Session() as sess:
        # Run the Op that initializes all variables.
        sess.run(init_op)
        # ...you can now run any Op that uses variable values...

If you need to create a variable with an initial value dependent on another variable, use the other variable's initialized_value(). This ensures that variables are initialized in the right order.
All variables are automatically collected in the graph where they are created. By default, the constructor adds the new variable to the graph collection GraphKeys.VARIABLES. The convenience function all_variables() returns the contents of that collection.
When building a machine learning model, it is often convenient to distinguish between variables holding the trainable model parameters and other variables, such as a global step variable used to count training steps. To make this easier, the variable constructor supports a trainable=<bool> parameter. If True, the new variable is also added to the graph collection GraphKeys.TRAINABLE_VARIABLES. The convenience function trainable_variables() returns the contents of this collection. The various Optimizer classes use this collection as the default list of variables to optimize.
Creating a variable.
Creates a new variable with value initial_value.
The new variable is added to the graph collections listed in collections, which defaults to [GraphKeys.VARIABLES].
If trainable is True the variable is also added to the graph collection GraphKeys.TRAINABLE_VARIABLES.
This constructor creates both a variable Op and an assign Op to set the variable to its initial value.
A Variable.
Returns the value of the initialized variable.
You should use this instead of the variable itself to initialize another variable with a value that depends on the value of this variable.
    # Initialize 'v' with a random tensor.
    v = tf.Variable(tf.truncated_normal([10, 40]))
    # Use `initialized_value` to guarantee that `v` has been
    # initialized before its value is used to initialize `w`.
    # The random values are picked only once.
    w = tf.Variable(v.initialized_value() * 2.0)

A Tensor holding the value of this variable after its initializer has run.
Changing a variable value.
Assigns a new value to the variable.
This is essentially a shortcut for assign(self, value).
A Tensor that will hold the new value of this variable after the assignment has completed.
Adds a value to this variable.
This is essentially a shortcut for assign_add(self, delta).
A Tensor that will hold the new value of this variable after the addition has completed.
Subtracts a value from this variable.
This is essentially a shortcut for assign_sub(self, delta).
A Tensor that will hold the new value of this variable after the subtraction has completed.
Subtracts IndexedSlices from this variable.
This is essentially a shortcut for scatter_sub(self, sparse_delta.indices, sparse_delta.values).
A Tensor that will hold the new value of this variable after the scattered subtraction has completed.
Increments this variable until it reaches limit.
When that Op is run it tries to increment the variable by 1. If incrementing the variable would bring it above limit then the Op raises the exception OutOfRangeError.
If no error is raised, the Op outputs the value of the variable before the increment.
This is essentially a shortcut for count_up_to(self, limit).
A Tensor that will hold the variable value before the increment. If no other Op modifies this variable, the values produced will all be distinct.
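The counting behaviour described above can be modelled in plain Python. This is only a sketch of the semantics, not TensorFlow code: the `Counter` class and the local `OutOfRangeError` exception are hypothetical stand-ins for the variable and for the TensorFlow exception of the same name.

```python
class OutOfRangeError(Exception):
    """Hypothetical stand-in for TensorFlow's OutOfRangeError."""


class Counter(object):
    """Plain-Python model of a variable driven by count_up_to()."""

    def __init__(self, initial_value=0):
        self.value = initial_value

    def count_up_to(self, limit):
        # Incrementing must not bring the value above `limit`.
        if self.value + 1 > limit:
            raise OutOfRangeError("reached limit %d" % limit)
        before = self.value
        self.value += 1
        # The op outputs the value held *before* the increment.
        return before


counter = Counter()
produced = []
try:
    while True:
        produced.append(counter.count_up_to(3))
except OutOfRangeError:
    pass  # raised once the variable has reached the limit
```

With a limit of 3 the model yields three distinct values and then raises, which is why an Op like this is convenient for bounding the number of iterations of a loop.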
In a session, computes and returns the value of this variable.
This is not a graph construction method; it does not add ops to the graph.
This convenience method requires a session where the graph containing this variable has been launched. If no session is passed, the default session is used. See the Session class for more information on launching a graph and on sessions.
    v = tf.Variable([1, 2])
    init = tf.initialize_all_variables()

    with tf.Session() as sess:
        sess.run(init)
        # Usage passing the session explicitly.
        print(v.eval(sess))
        # Usage with the default session.  The 'with' block
        # above makes 'sess' the default session.
        print(v.eval())

A numpy ndarray with a copy of the value of this variable.
Properties.
The name of this variable.
The DType of this variable.
The TensorShape of this variable.
A TensorShape.
The device of this variable.
The initializer operation for this variable.
The Graph of this variable.
The Operation of this variable.
TensorFlow provides a set of functions to help manage the set of variables collected in the graph.
Returns all variables collected in the graph.
The Variable() constructor automatically adds new variables to the graph collection GraphKeys.VARIABLES. This convenience function returns the contents of that collection.
A list of Variable objects.
Returns all variables created with trainable=True.
When passed trainable=True, the Variable() constructor automatically adds new variables to the graph collection GraphKeys.TRAINABLE_VARIABLES. This convenience function returns the contents of that collection.
A list of Variable objects.
Returns an Op that initializes all variables.
This is just a shortcut for initialize_variables(all_variables()).
An Op that initializes all variables in the graph.
Returns an Op that initializes a list of variables.
After you launch the graph in a session, you can run the returned Op to initialize all the variables in var_list. This Op runs all the initializers of the variables in var_list in parallel.
Calling initialize_variables() is equivalent to passing the list of initializers to Group().
If var_list is empty, however, the function still returns an Op that can be run. That Op just has no effect.
An Op that runs the initializers of all the specified variables.
Returns an Op to check if variables are initialized.
When run, the returned Op will raise the exception FailedPreconditionError if any of the variables has not yet been initialized.
Note: This function is implemented by trying to fetch the values of the variables. If one of the variables is not initialized a message may be logged by the C++ runtime. This is expected.
An Op, or None if there are no variables.
Saves and restores variables.
See Variables for an overview of variables, saving and restoring.
The Saver class adds ops to save and restore variables to and from checkpoints. It also provides convenience methods to run these ops.
Checkpoints are binary files in a proprietary format which map variable names to tensor values. The best way to examine the contents of a checkpoint is to load it using a Saver.
Savers can automatically number checkpoint filenames with a provided counter. This lets you keep multiple checkpoints at different steps while training a model. For example you can number the checkpoint filenames with the training step number. To avoid filling up disks, savers manage checkpoint files automatically. For example, they can keep only the N most recent files, or one checkpoint for every N hours of training.
You number checkpoint filenames by passing a value to the optional global_step argument to save():
    saver.save(sess, 'my-model', global_step=0) ==> filename: 'my-model-0'
    ...
    saver.save(sess, 'my-model', global_step=1000) ==> filename: 'my-model-1000'

Additionally, optional arguments to the Saver() constructor let you control the proliferation of checkpoint files on disk:
max_to_keep: The maximum number of recent checkpoint files to keep. As new files are created, older files are deleted. If None or 0, all checkpoint files are kept. Defaults to 5 (that is, the 5 most recent checkpoint files are kept).
keep_checkpoint_every_n_hours: In addition to keeping the most recent max_to_keep checkpoint files, you might want to keep one checkpoint file for every N hours of training. This can be useful if you want to later analyze how a model progressed during a long training session. For example, passing keep_checkpoint_every_n_hours=2 ensures that you keep one checkpoint file for every 2 hours of training. The default value of 10,000 hours effectively disables the feature.
Note that you still have to call the save() method to save the model. Passing these arguments to the constructor will not save variables automatically for you.
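To make the interaction of the two retention arguments concrete, here is a rough plain-Python model of the policy described above. It is a sketch under stated assumptions, not the Saver's actual code: the function `simulate_retention`, its parameter `keep_every_n_hours`, and the rule that a checkpoint is preserved when it leaves the "recent" window at or after the next N-hour mark are all illustrative choices.

```python
def simulate_retention(save_times_hours, max_to_keep=5,
                       keep_every_n_hours=10000.0):
    """Return (recent, kept_forever) checkpoint names after a run.

    `save_times_hours` lists the training time, in hours, of each save.
    """
    recent = []        # (time, name) pairs still in the "recent" window
    kept_forever = []  # names preserved by the every-N-hours rule
    next_keep_time = keep_every_n_hours
    for step, t in enumerate(save_times_hours):
        # Number the file the way global_step does: 'my-model-<step>'.
        recent.append((t, "my-model-%d" % step))
        # None or 0 means "keep every checkpoint".
        while max_to_keep and len(recent) > max_to_keep:
            old_time, old_name = recent.pop(0)
            if old_time >= next_keep_time:
                # Assumed rule: keep one file per N-hour slot.
                kept_forever.append(old_name)
                next_keep_time += keep_every_n_hours
            # Otherwise the file would be deleted from disk.
    return [name for _, name in recent], kept_forever


# Ten saves, one every half hour, keeping the 2 most recent files
# plus one file for every 2 hours of training.
recent, kept = simulate_retention([0.5 * i for i in range(10)],
                                  max_to_keep=2, keep_every_n_hours=2.0)
```

In this model the two newest checkpoints always survive, and one older checkpoint is preserved each time training passes another two-hour mark.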
A training program that saves regularly looks like:
    ...
    # Create a saver.
    saver = tf.train.Saver(...variables...)
    # Launch the graph and train, saving the model every 1,000 steps.
    sess = tf.Session()
    for step in range(1000000):
        sess.run(..training_op..)
        if step % 1000 == 0:
            # Append the step number to the checkpoint name:
            saver.save(sess, 'my-model', global_step=step)

In addition to checkpoint files, savers keep a protocol buffer on disk with the list of recent checkpoints. This is used to manage numbered checkpoint files and by latest_checkpoint(), which makes it easy to discover the path to the most recent checkpoint. That protocol buffer is stored in a file named 'checkpoint' next to the checkpoint files.
If you create several savers, you can specify a different filename for the protocol buffer file in the call to save().
Creates a Saver.
The constructor adds ops to save and restore variables.
var_list specifies the variables that will be saved and restored. It can be passed as a dict or a list:
A dict of names to variables: The keys are the names that will be used to save or restore the variables in the checkpoint files.
A list of variables: The variables will be keyed with their op name in the checkpoint files.
For example:
    v1 = tf.Variable(..., name='v1')
    v2 = tf.Variable(..., name='v2')

    # Pass the variables as a dict:
    saver = tf.train.Saver({'v1': v1, 'v2': v2})

    # Or pass them as a list.
    saver = tf.train.Saver([v1, v2])
    # Passing a list is equivalent to passing a dict with the variable op names
    # as keys:
    saver = tf.train.Saver({v.op.name: v for v in [v1, v2]})

The optional reshape argument, if True, allows restoring a variable from a save file where the variable had a different shape, but the same number of elements and type. This is useful if you have reshaped a variable and want to reload it from an older checkpoint.
The optional sharded argument, if True, instructs the saver to shard checkpoints per device.
Saves variables.
This method runs the ops added by the constructor for saving variables. It requires a session in which the graph was launched. The variables to save must also have been initialized.
The method returns the path of the newly created checkpoint file. This path can be passed directly to a call to restore().
A string: path at which the variables were saved. If the saver is sharded, this string ends with: '-?????-of-nnnnn' where 'nnnnn' is the number of shards created.
Restores previously saved variables.
This method runs the ops added by the constructor for restoring variables. It requires a session in which the graph was launched. The variables to restore do not have to have been initialized, as restoring is itself a way to initialize variables.
The save_path argument is typically a value previously returned from a save() call, or a call to latest_checkpoint().
Other utility methods.
List of not-yet-deleted checkpoint filenames.
You can pass any of the returned values to restore().
A list of checkpoint filenames, sorted from oldest to newest.
Sets the list of not-yet-deleted checkpoint filenames.
Generates a SaverDef representation of this saver.
A SaverDef proto.
Finds the filename of the latest saved checkpoint file.
The full path to the latest checkpoint or None if no checkpoint was found.
Returns CheckpointState proto from the "checkpoint" file.
If the "checkpoint" file contains a valid CheckpointState proto, returns it.
A CheckpointState if the state was available, None otherwise.
Updates the content of the 'checkpoint' file.
This updates the checkpoint file containing a CheckpointState proto.
TensorFlow provides several classes and operations that you can use to create variables contingent on certain conditions.
Gets an existing variable with these parameters or creates a new one.
This function prefixes the name with the current variable scope and performs reuse checks. See the Variable Scope How To for an extensive description of how reusing works. Here is a basic example:
    with tf.variable_scope("foo"):
        v = tf.get_variable("v", [1])  # v.name == "foo/v:0"
        w = tf.get_variable("w", [1])  # w.name == "foo/w:0"
    with tf.variable_scope("foo", reuse=True):
        v1 = tf.get_variable("v")  # The same as v above.

If initializer is None (the default), the default initializer passed in the constructor is used. If that one is None too, a UniformUnitScalingInitializer will be used.
The created or existing variable.
Returns the current variable scope.
Returns a context for variable scope.
Variable scope lets you create new variables and share already created ones, while providing checks so that you do not create or share a variable by accident. For details, see the Variable Scope How To; here we present only a few basic examples.
Simple example of how to create a new variable:
    with tf.variable_scope("foo"):
        with tf.variable_scope("bar"):
            v = tf.get_variable("v", [1])
            assert v.name == "foo/bar/v:0"

Basic example of sharing a variable:
    with tf.variable_scope("foo"):
        v = tf.get_variable("v", [1])
    with tf.variable_scope("foo", reuse=True):
        v1 = tf.get_variable("v", [1])
    assert v1 == v

Sharing a variable by capturing a scope and setting reuse:
    with tf.variable_scope("foo") as scope:
        v = tf.get_variable("v", [1])
        scope.reuse_variables()
        v1 = tf.get_variable("v", [1])
    assert v1 == v

To prevent accidental sharing of variables, we raise an exception when getting an existing variable in a non-reusing scope.
    with tf.variable_scope("foo") as scope:
        v = tf.get_variable("v", [1])
        v1 = tf.get_variable("v", [1])
        #  Raises ValueError("... v already exists ...").

Similarly, we raise an exception when trying to get a variable that does not exist in reuse mode.
    with tf.variable_scope("foo", reuse=True):
        v = tf.get_variable("v", [1])
        #  Raises ValueError("... v does not exist ...").

Note that the reuse flag is inherited: if we open a reusing scope, then all its sub-scopes become reusing as well.
A scope that can be captured and reused.
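The reuse checks can be pictured as lookups in a table keyed by the scoped name. The following toy model is plain Python, not TensorFlow's implementation; `VariableStore` and its dict-based "variables" are hypothetical and exist only to show when each ValueError is raised.

```python
class VariableStore(object):
    """Toy model of name-based variable sharing."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, reuse=False):
        full_name = scope + "/" + name
        if reuse:
            # Reusing scope: the variable must already exist.
            if full_name not in self._vars:
                raise ValueError("Variable %s does not exist" % full_name)
            return self._vars[full_name]
        # Non-reusing scope: the variable must be new.
        if full_name in self._vars:
            raise ValueError("Variable %s already exists" % full_name)
        # A plain dict stands in for the real variable object.
        self._vars[full_name] = {"name": full_name + ":0"}
        return self._vars[full_name]


store = VariableStore()
v = store.get_variable("foo", "v")
v1 = store.get_variable("foo", "v", reuse=True)  # the same object as v

# Both kinds of accidental use are rejected.
errors = []
for scope, name, reuse in [("foo", "v", False), ("foo", "missing", True)]:
    try:
        store.get_variable(scope, name, reuse=reuse)
    except ValueError as e:
        errors.append(str(e))
```

The first failing call tries to create "foo/v" a second time without reuse; the second asks a reusing scope for a variable that was never created.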
Returns an initializer that generates Tensors with a single value.
An initializer that generates Tensors with a single value.
Returns an initializer that generates Tensors with a normal distribution.
An initializer that generates Tensors with a normal distribution.
Returns an initializer that generates a truncated normal distribution.
These values are similar to values from a random_normal_initializer except that values more than two standard deviations from the mean are discarded and re-drawn. This is the recommended initializer for neural network weights and filters.
An initializer that generates Tensors with a truncated normal distribution.
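The discard-and-re-draw rule can be sketched with the standard library. This is an illustration of the sampling rule only, not the initializer's actual implementation; the helper `truncated_normal` and its `rng` parameter are hypothetical.

```python
import random


def truncated_normal(mean=0.0, stddev=1.0, rng=random):
    """Draw one sample, re-drawing until it lies within 2 stddevs of mean."""
    while True:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2.0 * stddev:
            return x


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [truncated_normal(0.0, 0.5, rng) for _ in range(1000)]
```

Every sample is guaranteed to lie within two standard deviations of the mean, here in the range [-1.0, 1.0].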
Returns an initializer that generates Tensors with a uniform distribution.
An initializer that generates Tensors with a uniform distribution.
Returns an initializer that generates tensors without scaling variance.
When initializing a deep network, it is in principle advantageous to keep the scale of the input variance constant, so it does not explode or diminish by reaching the final layer. If the input is x and the operation x * W, and we want to initialize W uniformly at random, we need to pick W from
    [-sqrt(3) / sqrt(dim), sqrt(3) / sqrt(dim)]

to keep the scale intact, where dim = W.shape[0] (the size of the input). A similar calculation for convolutional networks gives an analogous result with dim equal to the product of the first 3 dimensions. When nonlinearities are present, we need to multiply this by a constant factor. See https://arxiv.org/pdf/1412.6558v3.pdf for deeper motivation, experiments and the calculation of constants. In section 2.3 there, the constants were numerically computed: for a linear layer it's 1.0, relu: ~1.43, tanh: ~1.15.
An initializer that generates tensors with unit variance.
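The bound above follows from the variance of a uniform distribution on [-a, a], which is a^2 / 3. A small numeric check in plain Python, where the helper names are hypothetical and `factor` stands for the constant factor mentioned for nonlinearities:

```python
import math


def uniform_unit_scaling_limit(dim, factor=1.0):
    """Half-width of the uniform range for an input of size `dim`."""
    return factor * math.sqrt(3.0) / math.sqrt(dim)


def uniform_variance(limit):
    """Variance of a uniform distribution on [-limit, limit]."""
    return limit ** 2 / 3.0


limit = uniform_unit_scaling_limit(dim=256)
# With factor 1.0 each weight has variance 1 / dim, so summing `dim`
# independent, zero-mean, unit-variance inputs keeps unit variance.
total_variance = 256 * uniform_variance(limit)
```

Under those independence assumptions the output variance stays at 1, which is what "keeping the scale intact" means for a linear layer.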
An adaptor for zeros() to match the Initializer spec.
The sparse update ops modify a subset of the entries in a dense Variable, either overwriting the entries or adding / subtracting a delta. These are useful for training embedding models and similar lookup-based networks, since only a small subset of embedding vectors change in any given step.
Since a sparse update of a large tensor may be generated automatically during gradient computation (as in the gradient of tf.gather), an IndexedSlices class is provided that encapsulates a set of sparse indices and values. IndexedSlices objects are detected and handled automatically by the optimizers in most cases.
Applies sparse updates to a variable reference.
This operation computes
    # Scalar indices
    ref[indices, ...] = updates[...]

    # Vector indices (for each i)
    ref[indices[i], ...] = updates[i, ...]

    # High rank indices (for each i, ..., j)
    ref[indices[i, ..., j], ...] = updates[i, ..., j, ...]

This operation outputs ref after the update is done. This makes it easier to chain operations that need to use the updated value.
If indices contains duplicate entries, lexicographically later entries override earlier entries.
Requires updates.shape = indices.shape + ref.shape[1:].
Same as ref. Returned as a convenience for operations that want to use the updated values after the update is done.
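For the vector-indices case, the overwrite rule can be modelled on nested Python lists. This is a sketch of the semantics only, not the TensorFlow op: the `scatter_update` function below is a hypothetical pure-Python helper, and it assumes that updates are applied in index order so that a later duplicate overrides an earlier one.

```python
def scatter_update(ref, indices, updates):
    """Overwrite rows of `ref` in place; later duplicates win."""
    for i, row in zip(indices, updates):
        ref[i] = list(row)
    return ref


ref = [[0, 0], [0, 0], [0, 0], [0, 0]]
# Index 2 appears twice, so its second update is the one that remains.
result = scatter_update(ref, [2, 0, 2], [[1, 1], [2, 2], [3, 3]])
```

The function returns the same list it modified, mirroring how the op outputs ref so that further operations can be chained on the updated value.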
Adds sparse updates to a variable reference.
This operation computes
    # Scalar indices
    ref[indices, ...] += updates[...]

    # Vector indices (for each i)
    ref[indices[i], ...] += updates[i, ...]

    # High rank indices (for each i, ..., j)
    ref[indices[i, ..., j], ...] += updates[i, ..., j, ...]

This operation outputs ref after the update is done. This makes it easier to chain operations that need to use the updated value.
Duplicate entries are handled correctly: if multiple indices reference the same location, their contributions add.
Requires updates.shape = indices.shape + ref.shape[1:].
Same as ref. Returned as a convenience for operations that want to use the updated values after the update is done.
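The accumulation of duplicate indices is easiest to see in a small model. As before, this is plain Python on nested lists with a hypothetical `scatter_add` helper, not the TensorFlow op itself.

```python
def scatter_add(ref, indices, updates):
    """Add each update row to the row of `ref` it points at, in place."""
    for i, row in zip(indices, updates):
        ref[i] = [a + b for a, b in zip(ref[i], row)]
    return ref


ref = [[10, 10], [20, 20], [30, 30]]
# Index 1 appears twice; both contributions are added to row 1.
scatter_add(ref, [1, 1, 0], [[1, 2], [3, 4], [5, 6]])
```

Row 1 receives [1, 2] and [3, 4], row 0 receives [5, 6], and row 2 is left untouched. Replacing the addition with a subtraction gives the corresponding model of the subtracting variant.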
Subtracts sparse updates from a variable reference.
This operation computes

    # Scalar indices
    ref[indices, ...] -= updates[...]

    # Vector indices (for each i)
    ref[indices[i], ...] -= updates[i, ...]

    # High rank indices (for each i, ..., j)
    ref[indices[i, ..., j], ...] -= updates[i, ..., j, ...]

This operation outputs ref after the update is done. This makes it easier to chain operations that need to use the updated value.
Duplicate entries are handled correctly: if multiple indices reference the same location, their (negated) contributions add.
Requires updates.shape = indices.shape + ref.shape[1:].
Same as ref. Returned as a convenience for operations that want to use the updated values after the update is done.
Masks elements of IndexedSlices.
Given an IndexedSlices instance a, returns another IndexedSlices that contains a subset of the slices of a. Only the slices at indices not specified in mask_indices are returned.
This is useful when you need to extract a subset of slices in an IndexedSlices object.
For example:
    # `a` contains slices at indices [12, 26, 37, 45] from a large tensor
    # with shape [1000, 10]
    a.indices => [12, 26, 37, 45]
    tf.shape(a.values) => [4, 10]

    # `b` will be the subset of `a` slices at its second and third indices, so
    # we want to mask out its first and last indices (which are at absolute
    # indices 12, 45)
    b = tf.sparse_mask(a, [12, 45])

    b.indices => [26, 37]
    tf.shape(b.values) => [2, 10]

The masked IndexedSlices instance.
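The same example can be traced in plain Python. This sketch follows the behaviour shown in the example, where the slices listed in the mask are the ones removed; the `sparse_mask` function here is a hypothetical list-based helper, not the TensorFlow op.

```python
def sparse_mask(indices, values, mask_indices):
    """Drop the slices whose index appears in `mask_indices`."""
    masked = set(mask_indices)
    kept = [(i, v) for i, v in zip(indices, values) if i not in masked]
    return [i for i, _ in kept], [v for _, v in kept]


a_indices = [12, 26, 37, 45]
# One row of 10 values per index, i.e. shape [4, 10].
a_values = [[float(i)] * 10 for i in a_indices]
b_indices, b_values = sparse_mask(a_indices, a_values, [12, 45])
```

The result keeps the slices at indices 26 and 37, so its values have shape [2, 10].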
A sparse representation of a set of tensor slices at given indices.
This class is a simple wrapper for a pair of Tensor objects:
values: A Tensor of any dtype with shape [D0, D1, ..., Dn].
indices: A 1-D integer Tensor with shape [D0].
An IndexedSlices is typically used to represent a subset of a larger tensor dense of shape [LARGE0, D1, ..., Dn] where LARGE0 >> D0. The values in indices are the indices in the first dimension of the slices that have been extracted from the larger tensor.
The dense tensor dense represented by an IndexedSlices slices has
    dense[slices.indices[i], :, :, :, ...] = slices.values[i, :, :, :, ...]

The IndexedSlices class is used principally in the definition of gradients for operations that have sparse gradients (e.g. tf.gather).
Contrast this representation with SparseTensor, which uses multi-dimensional indices and scalar values.
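The relation between an IndexedSlices and the dense tensor it represents can be sketched on nested lists. The `to_dense` helper is hypothetical, and filling the rows that are not listed in indices with zeros is an assumption made for illustration, since the text above only defines the listed rows.

```python
def to_dense(indices, values, dense_rows):
    """Build a dense 2-D list; rows not listed in `indices` are zero."""
    row_len = len(values[0])
    dense = [[0.0] * row_len for _ in range(dense_rows)]
    for i, row in zip(indices, values):
        # dense[indices[i], :] = values[i, :]
        dense[i] = list(row)
    return dense


dense = to_dense(indices=[1, 4],
                 values=[[1.0, 2.0], [3.0, 4.0]],
                 dense_rows=6)
```

Only two of the six rows are stored explicitly, which is the saving that makes this representation attractive for sparse gradients.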
Creates an IndexedSlices.
A Tensor containing the values of the slices.
A 1-D Tensor containing the indices of the slices.
A 1-D Tensor containing the shape of the corresponding dense tensor.
The name of this IndexedSlices.
The DType of elements in this tensor.
The name of the device on which values will be produced, or None.
The Operation that produces values as an output.