A Step-by-Step Guide to Using TensorFlow Effectively

xiaoxiao2021-04-19 329


TensorFlow basics:

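The most striking difference between TensorFlow and other numerical computation libraries such as NumPy is that operations in TensorFlow are symbolic. This is a powerful concept that lets TensorFlow do things that libraries such as NumPy cannot, for example automatic differentiation. It is probably also what makes TensorFlow more complex. This article walks through TensorFlow step by step and offers guidelines and best practices for using it more effectively.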

Let's start with a simple example: multiplying two random matrices. First, here is how to do it in NumPy:

import numpy as np

x = np.random.normal(size=[10, 10])
y = np.random.normal(size=[10, 10])
z = np.dot(x, y)

print(z)

Now we perform exactly the same computation in TensorFlow:

import tensorflow as tf

x = tf.random_normal([10, 10])
y = tf.random_normal([10, 10])
z = tf.matmul(x, y)

sess = tf.Session()
z_val = sess.run(z)

print(z_val)

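Unlike NumPy, which performs the computation immediately and assigns the result to the output variable z, TensorFlow only gives us a handle (of type Tensor) that represents the result. If we try to print z directly, we get something like this: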

Tensor("MatMul:0", shape=(10, 10), dtype=float32)

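Since both inputs have a fully defined shape, TensorFlow can infer the shape of the result tensor as well as its type. To compute the value of the tensor, we need to create a session and evaluate it with the Session.run() method.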
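To make the idea of symbolic computation concrete, here is a minimal framework-free sketch in plain Python. It is not how TensorFlow is implemented; the `Node` class and the `constant` and `run` helpers are hypothetical names invented for this illustration. Building an expression only records a graph, and a separate `run` call (playing the role of Session.run) computes the value.

```python
class Node:
    """A symbolic value: it records how to compute a result, not the result itself."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))

    def __repr__(self):
        return "Node(op=%r)" % self.op


def constant(value):
    # Hypothetical helper: wrap a plain Python number as a graph node.
    return Node("const", value=value)


def run(node):
    """Evaluate a node by recursively evaluating its inputs."""
    if node.op == "const":
        return node.value
    left, right = (run(i) for i in node.inputs)
    return left + right if node.op == "add" else left * right


x = constant(3.0)
y = constant(4.0)
z = x * y + x      # builds a graph; nothing is computed yet

print(z)           # Node(op='add')
z_val = run(z)
print(z_val)       # 15.0
```

Printing `z` shows a description of the node rather than a number, which mirrors the Tensor("MatMul:0", ...) output above.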

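To see why symbolic computation is so powerful, let's look at another example. Suppose we have samples from a curve, say f(x) = 5x^2 + 3, and we want to estimate f(x) without knowing its parameters. We define a parametric function g(x, w) = w0 x^2 + w1 x + w2, which is a function of the input x and the latent parameters w. Our goal is to find the latent parameters such that g(x, w) ≈ f(x). This can be done by minimizing the loss function L(w) = (f(x) - g(x, w))^2, accumulated over the sample points. Although this problem has a simple closed-form solution, we choose a more general approach that applies to any differentiable function: stochastic gradient descent. We simply compute the average gradient of L(w) with respect to w over a set of sample points and move in the opposite direction.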

Here is how it can be done in TensorFlow:

import numpy as np
import tensorflow as tf

# Placeholders are used to feed values from Python into the graph.
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

# The parameters we want to learn: w0, w1, w2.
w = tf.get_variable("w", shape=[3, 1])

# Feature matrix with columns [x^2, x, 1], so that yhat = g(x, w).
f = tf.stack([tf.square(x), x, tf.ones_like(x)], 1)
yhat = tf.squeeze(tf.matmul(f, w), 1)

# Squared-error loss plus a small L2 regularization term on w.
loss = tf.nn.l2_loss(yhat - y) + 0.1 * tf.nn.l2_loss(w)

# Adam optimizer with a learning rate of 0.1.
train_op = tf.train.AdamOptimizer(0.1).minimize(loss)

def generate_data():
    x_val = np.random.uniform(-10.0, 10.0, size=100)
    y_val = 5 * np.square(x_val) + 3
    return x_val, y_val

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for _ in range(1000):
    x_val, y_val = generate_data()
    _, loss_val = sess.run([train_op, loss], {x: x_val, y: y_val})
    print(loss_val)
print(sess.run([w]))

Running this code produces a result similar to this:

[4.9924135, 0.00040895029, 3.4504161]

This is a fairly close approximation of the true parameters (5, 0, 3).
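As a cross-check, the closed-form solution mentioned earlier can be sketched in NumPy. This is an illustration rather than part of the original article: it uses ordinary least squares via `np.linalg.lstsq` without the 0.1 regularization term from the TensorFlow code, and the seed and variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for reproducibility

# Same data-generating curve as in the article: f(x) = 5x^2 + 3
x_val = rng.uniform(-10.0, 10.0, size=100)
y_val = 5 * np.square(x_val) + 3

# Design matrix with columns [x^2, x, 1], mirroring the features in the TensorFlow code
features = np.stack([np.square(x_val), x_val, np.ones_like(x_val)], axis=1)

# Least-squares solution of features @ w ≈ y_val
w_closed_form, residuals, rank, _ = np.linalg.lstsq(features, y_val, rcond=None)

print(np.round(w_closed_form, 6))
```

Because the data here is noise-free, the recovered weights should match (5, 0, 3) up to floating-point error, whereas the gradient-descent result is only approximate.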

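This is just the tip of the iceberg of what TensorFlow can do. Many problems, such as optimizing large neural networks with millions of parameters, can be implemented efficiently in TensorFlow in just a few lines of code. TensorFlow also scales across multiple devices and threads and supports a variety of platforms.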

Understanding static and dynamic shapes:

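Tensors in TensorFlow have a static shape attribute that is determined during graph construction. The static shape may be only partially specified. For example, we can define a tensor of shape [None, 128]: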

import tensorflow as tf

a = tf.placeholder(tf.float32, [None, 128])

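This means the first dimension can be of any size and will be determined dynamically during Session.run(). TensorFlow has a very simple API for querying the static shape: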

static_shape = a.get_shape().as_list() # returns [None, 128]

To get the dynamic shape of a tensor, call the tf.shape op, which returns a tensor representing the shape of the given tensor:

dynamic_shape = tf.shape(a)

The static shape of a tensor can be set with the Tensor.set_shape() method:

a.set_shape([32, 128])

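In practice, tf.reshape() is the safer option, since set_shape() only updates the static shape information while tf.reshape() reshapes the tensor dynamically: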

a = tf.reshape(a, [32, 128])

It is convenient to have a function that returns the static shape when it is available and the dynamic shape when it is not:

def get_shape(tensor):
    static_shape = tensor.get_shape().as_list()
    dynamic_shape = tf.unstack(tf.shape(tensor))
    dims = [s[1] if s[0] is None else s[0]
            for s in zip(static_shape, dynamic_shape)]
    return dims

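Now imagine we want to convert a rank-3 tensor into a rank-2 tensor by collapsing the second and third dimensions into one. We can use the get_shape() function defined above to do that: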

b = tf.placeholder(tf.float32, [None, 10, 32])
shape = get_shape(b)
b = tf.reshape(b, [shape[0], shape[1] * shape[2]])

Note that this works whether or not the shapes are statically specified.

In fact, we can write a general-purpose reshape function that collapses any list of dimensions:

import tensorflow as tf
import numpy as np

def reshape(tensor, dims_list):
    shape = get_shape(tensor)
    dims_prod = []
    for dims in dims_list:
        if isinstance(dims, int):
            # Keep a single dimension as is.
            dims_prod.append(shape[dims])
        elif all([isinstance(shape[d], int) for d in dims]):
            # All sizes are known statically: multiply them in Python.
            dims_prod.append(np.prod([shape[d] for d in dims]))
        else:
            # At least one size is dynamic: multiply them in the graph.
            dims_prod.append(tf.reduce_prod([shape[d] for d in dims]))
    tensor = tf.reshape(tensor, dims_prod)
    return tensor

Converting to rank 2 then becomes very easy:

b = tf.placeholder(tf.float32, [None, 10, 32])
b = reshape(b, [0, [1, 2]])
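The dimension-collapsing idea can be tried without a TensorFlow session using a NumPy analogue. This is only a sketch: `collapse_dims` is a hypothetical name, and because NumPy shapes are always fully known, only the static branch of the logic appears here.

```python
import numpy as np

def collapse_dims(array, dims_list):
    """Reshape `array` so that each entry of `dims_list` becomes one output dimension.

    An int entry keeps that dimension; a list entry merges those dimensions.
    """
    shape = array.shape
    new_shape = []
    for dims in dims_list:
        if isinstance(dims, int):
            new_shape.append(shape[dims])
        else:
            new_shape.append(int(np.prod([shape[d] for d in dims])))
    return array.reshape(new_shape)

batch = np.zeros((4, 10, 32))
flat = collapse_dims(batch, [0, [1, 2]])
print(flat.shape)  # (4, 320)
```

The `[0, [1, 2]]` argument reads as "keep dimension 0, merge dimensions 1 and 2", which is the same convention the TensorFlow helper uses.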

The good and the bad of broadcasting:

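TensorFlow also supports broadcasting. Normally, when you perform operations such as addition and multiplication, you need to make sure the shapes of the operands match; for example, you cannot add a tensor of shape [3, 2] to a tensor of shape [3, 4]. But there is a special case: when one of the dimensions has size one. TensorFlow implicitly tiles the tensor along that dimension to match the shape of the other operand. For example: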

import tensorflow as tf

a = tf.constant([[1., 2.], [3., 4.]])
b = tf.constant([[1.], [2.]])
# c = a + tf.tile(b, [1, 2])
c = a + b
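NumPy follows the same broadcasting rule, so the equivalence between explicit tiling and implicit broadcasting can be checked directly. The snippet below is an illustrative sketch using the same values as the TensorFlow example; the names `explicit` and `implicit` are introduced only for this check.

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])   # shape (2, 2)
b = np.array([[1.], [2.]])           # shape (2, 1)

explicit = a + np.tile(b, (1, 2))    # tile b to shape (2, 2), then add
implicit = a + b                     # broadcasting does the tiling implicitly

print(implicit)
# [[2. 3.]
#  [5. 6.]]
print(np.array_equal(explicit, implicit))  # True
```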

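Broadcasting lets us perform implicit tiling, which makes the code shorter and more memory efficient, since we do not need to store the result of the tiling operation. One place where this is useful is when combining features of different lengths. To concatenate such features we commonly tile the input tensors, concatenate the result, and apply some nonlinearity. This is a common pattern across a variety of neural network architectures: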

a = tf.random_uniform([5, 3, 5])
b = tf.random_uniform([5, 1, 6])

# concat a and b and apply nonlinearity
tiled_b = tf.tile(b, [1, 3, 1])
c = tf.concat([a, tiled_b], 2)
d = tf.layers.dense(c, 10, activation=tf.nn.relu)

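But this can be done more efficiently with broadcasting. We use the fact that f(m(x + y)) is equal to f(mx + my) when m is a linear operation, so we can apply the linear operations separately and use broadcasting to do the concatenation implicitly: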

pa = tf.layers.dense(a, 10, activation=None)
pb = tf.layers.dense(b, 10, activation=None)
d = tf.nn.relu(pa + pb)

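In fact, this piece of code is quite general and can be applied to tensors of arbitrary shape, as long as broadcasting between the tensors is possible: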

def tile_concat_dense(a, b, units, activation=tf.nn.relu):
    pa = tf.layers.dense(a, units, activation=None)
    pb = tf.layers.dense(b, units, activation=None)
    c = pa + pb
    if activation is not None:
        c = activation(c)
    return c
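The equivalence behind this trick can be verified numerically. The NumPy sketch below is an illustration under stated assumptions: a single bias-free weight matrix `weights` stands in for the dense layer, and it is split by rows into `w_a` and `w_b`, the parts that multiply the features of a and b. These names and the random seed are not from the article.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed for reproducibility
a = rng.uniform(size=(5, 3, 5))
b = rng.uniform(size=(5, 1, 6))

# One weight matrix for the concatenated features (5 + 6 inputs, 10 units).
weights = rng.normal(size=(11, 10))
w_a, w_b = weights[:5], weights[5:]  # rows that multiply a's and b's features

# Tile, concatenate, then apply the linear map and ReLU.
tiled_b = np.tile(b, (1, 3, 1))
concatenated = np.concatenate([a, tiled_b], axis=2)
d_concat = np.maximum(concatenated @ weights, 0.0)

# Apply the linear maps separately and let broadcasting combine them.
d_broadcast = np.maximum(a @ w_a + b @ w_b, 0.0)

print(d_concat.shape)                      # (5, 3, 10)
print(np.allclose(d_concat, d_broadcast))  # True
```

The broadcast version never materializes the tiled copy of b, which is where the memory saving comes from.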

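So far we have discussed the good part of broadcasting. But what is the bad part, you may ask? Implicit assumptions almost always make debugging harder. Consider the following example: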

a = tf.constant([[1.], [2.]])
b = tf.constant([1., 2.])
c = tf.reduce_sum(a + b)

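What do you think the value of c is? If you guessed 6, that is wrong. The answer is 12. When the ranks of two tensors do not match, TensorFlow automatically expands the lower-rank tensor before the elementwise operation, so the result of the addition is [[2, 3], [3, 4]], and reducing over all elements gives 12.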

If we specify the dimension we want to reduce over, this bug becomes much easier to catch:

a = tf.constant([[1.], [2.]])
b = tf.constant([1., 2.])
c = tf.reduce_sum(a + b, 0)

Here the value of c is [5, 7], and the shape of the result immediately suggests that something is wrong.
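The same pitfall can be reproduced in NumPy, which applies the same rank-expansion rule. This sketch uses the values from the example above; the `intended` variable is an assumption about what the author of such code probably meant, namely an elementwise sum of two length-2 vectors.

```python
import numpy as np

a = np.array([[1.], [2.]])   # shape (2, 1)
b = np.array([1., 2.])       # shape (2,), treated as (1, 2) when broadcasting

total = a + b
print(total.shape)           # (2, 2)
print(total.sum())           # 12.0
print(total.sum(axis=0))     # [5. 7.]

# Making the shapes match explicitly gives the elementwise sum that was probably intended.
intended = a + b.reshape(2, 1)
print(intended.sum())        # 6.0
```

Checking `total.shape` before reducing is often the quickest way to notice that an unintended broadcast has happened.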

Prototyping kernels and advanced visualization with Python ops:

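For efficiency, operation kernels in TensorFlow are written entirely in C++, but writing a TensorFlow kernel in C++ can be quite painful. With tf.py_func() you can turn any piece of Python code into a TensorFlow operation.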

For example, here is how you can implement a simple ReLU nonlinearity kernel in TensorFlow as a Python op:

import numpy as np
import tensorflow as tf
import uuid

def relu(inputs):
    # Define the op in python
    def _relu(x):
        return np.maximum(x, 0.)

    # Define the op's gradient in python
    def _relu_grad(x):
        return np.float32(x > 0)

    # An adapter that defines a gradient op compatible with Tensorflow
    def _relu_grad_op(op, grad):
        x = op.inputs[0]
        x_grad = grad * tf.py_func(_relu_grad, [x], tf.float32)
        return x_grad

    # Register the gradient with a unique id
    grad_name = "MyReluGrad_" + str(uuid.uuid4())
    tf.RegisterGradient(grad_name)(_relu_grad_op)

    # Override the gradient of the custom op
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": grad_name}):
        output = tf.py_func(_relu, [inputs], tf.float32)
    return output

To verify that the gradients are correct, you can use TensorFlow's gradient checker:

x = tf.random_normal([10])
y = relu(x * x)

with tf.Session():
    diff = tf.test.compute_gradient_error(x, [10], y, [10])
    print(diff)

compute_gradient_error() computes the gradient numerically and returns the difference from the gradient you provided. What we want is a very small difference.
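The idea behind a numerical gradient check can be sketched in NumPy. This is not how compute_gradient_error() is implemented internally; it is a simplified illustration that compares a central-difference estimate against a hand-written gradient for the same relu(x * x) composition. The function names, step size, and seed are assumptions made for this example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def func(x):
    # Same composition as the test above: relu(x * x)
    return relu(x * x)

def analytic_grad(x):
    # Chain rule: d relu(u)/du = 1 where u > 0, and du/dx = 2x for u = x * x
    return (x * x > 0).astype(x.dtype) * 2.0 * x

def numeric_grad(f, x, eps=1e-6):
    # Central differences; valid elementwise because f acts on each entry independently
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

rng = np.random.default_rng(0)   # arbitrary seed for reproducibility
x = rng.normal(size=10)

max_error = np.max(np.abs(numeric_grad(func, x) - analytic_grad(x)))
print(max_error < 1e-6)
```

A large `max_error` would indicate that the hand-written gradient disagrees with the function it claims to differentiate.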

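Note that this implementation is quite inefficient and is only useful for prototyping, since the Python code is not parallelizable and cannot run on a GPU.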

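In practice, we commonly use Python ops for visualization in TensorBoard. Suppose you are building an image classification model and want to visualize your model's predictions during training. TensorFlow lets you visualize images with the tf.summary.image() function: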

image = tf.placeholder(tf.float32)
tf.summary.image("image", image)

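But this only visualizes the input image. To visualize the predictions you have to find a way to add annotations to the image, which is almost impossible with existing ops. An easier way is to do the drawing in Python and wrap it in a Python op: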

import io
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import tensorflow as tf

def visualize_labeled_images(images, labels, max_outputs=3, name='image'):
    def _visualize_image(image, label):
        # Do the actual drawing in python
        fig = plt.figure(figsize=(3, 3), dpi=80)
        ax = fig.add_subplot(111)
        ax.imshow(image[::-1,...])
        ax.text(0, 0, str(label),
                horizontalalignment='left',
                verticalalignment='top')
        fig.canvas.draw()

        # Write the plot as a memory file.
        buf = io.BytesIO()
        fig.savefig(buf, format='png')
        plt.close(fig)  # release the figure so repeated calls do not accumulate
        buf.seek(0)

        # Read the image and convert to numpy array.
        # PIL reports size as (width, height); arrays are (height, width, channels).
        img = PIL.Image.open(buf)
        return np.array(img.getdata()).reshape(img.size[1], img.size[0], -1)

    def _visualize_images(images, labels):
        # Only display the given number of examples in the batch
        outputs = []
        for i in range(max_outputs):
            output = _visualize_image(images[i], labels[i])
            outputs.append(output)
        return np.array(outputs, dtype=np.uint8)

    # Run the python op.
    figs = tf.py_func(_visualize_images, [images, labels], tf.uint8)
    return tf.summary.image(name, figs)

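Note that since summaries are usually evaluated only once in a while (not at every step), this implementation can be used in practice without worrying about efficiency.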

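This article was recommended by @爱可可-爱生活 of Beijing University of Posts and Telecommunications, and the translation was organized by the @阿里云云栖社区 (Alibaba Cloud Yunqi Community).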

Original title: "Effective Tensorflow - Guides and best practices for effective use of Tensorflow"

Author: Vahid Kazemi. The author is a software engineer at Google with a PhD in computer science, working on machine learning, NLP, and computer vision.

Translator: 袁虎 Reviewer:

This is an abridged translation; see the original article for more detail.
