Table of Contents
4.1 Building your Deep Neural Network: Step by Step
1 - Packages
2 - Outline of the Assignment
3 - Initialization
3.1 - 2-layer Neural Network
3.2 - L-layer Neural Network
4 - Forward propagation module
4.1 - Linear Forward
4.2 - Linear-Activation Forward
4.3 - L-Layer Model
5 - Cost function
6 - Backward propagation module
6.1 - Linear backward
6.2 - Linear-Activation backward
6.3 - L-Model Backward
6.4 - Update Parameters
4.2 Deep Neural Network for Image Classification: Application
1 - Packages
2 - Dataset
3 - Architecture of your model
3.1 - 2-layer neural network
3.2 - L-layer deep neural network
3.3 - General methodology
4 - Two-layer neural network
5 - L-layer Neural Network
6 - Results Analysis
7 - Test with your own image (optional/ungraded exercise)
Notation:

- Superscript $[l]$ denotes a quantity associated with the $l$-th layer. Example: $a^{[L]}$ is the $L$-th layer activation; $W^{[L]}$ and $b^{[L]}$ are the $L$-th layer parameters.
- Superscript $(i)$ denotes a quantity associated with the $i$-th example. Example: $x^{(i)}$ is the $i$-th training example.
- Subscript $i$ denotes the $i$-th entry of a vector. Example: $a^{[l]}_i$ denotes the $i$-th entry of the $l$-th layer's activations.
First, run the cell below to import all the packages you will need for this assignment:
- numpy is the fundamental package for scientific computing with Python.
- matplotlib is a library for plotting graphs in Python.
- dnn_utils provides some necessary functions.
- testCases provides some test cases to assess the correctness of your functions.
- np.random.seed(1) is used to keep all the random function calls consistent.

```python
import numpy as np
import h5py
import matplotlib.pyplot as plt
from testCases_v2 import *
from dnn_utils_v2 import sigmoid, sigmoid_backward, relu, relu_backward

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

np.random.seed(1)
```
To build the neural network, you will implement several helper functions. These helper functions will be used in the next part of the assignment to build a two-layer neural network and an L-layer neural network. Here is the outline of the assignment:
1. Initialize the parameters for a two-layer network and for an L-layer neural network.
2. Implement the forward propagation module (the purple part of the figure):
   - Complete the LINEAR part of a layer's forward propagation step (resulting in $Z^{[l]}$).
   - The ACTIVATION functions (relu/sigmoid) are provided.
   - Combine the previous two steps into a new [LINEAR→ACTIVATION] forward function.
   - Stack the [LINEAR→RELU] forward function L-1 times (for layers 1 through L-1) and add a [LINEAR→SIGMOID] at the end (for the final layer L). This gives you a new L_model_forward function.
3. Compute the loss.
4. Implement the backward propagation module (the red part of the figure):
   - Complete the LINEAR part of a layer's backward propagation step.
   - The gradients of the ACTIVATION functions (relu_backward/sigmoid_backward) are provided.
   - Combine the previous two steps into a new [LINEAR→ACTIVATION] backward function.
   - Stack the [LINEAR→RELU] backward function L-1 times and add a [LINEAR→SIGMOID] backward step, giving you a new L_model_backward function.
5. Finally, update the parameters.
Note: for every forward function there is a corresponding backward function. That is why at every step of the forward module you store some values in a cache; in the backward propagation module you will use these cached values to compute the gradients.
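To make the cache convention concrete, here is a tiny self-contained illustration (not part of the graded assignment code; the toy shapes are arbitrary):

```python
import numpy as np

# Toy shapes just for illustration: 3 inputs, 2 units in the layer, 4 examples.
A_prev = np.random.randn(3, 4)
W = np.random.randn(2, 3)
b = np.zeros((2, 1))

Z = np.dot(W, A_prev) + b                   # linear step
linear_cache = (A_prev, W, b)               # what linear_backward will need
activation_cache = Z                        # what relu_backward / sigmoid_backward will need
cache = (linear_cache, activation_cache)    # one such tuple per layer, collected in `caches`
```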
You will write two helper functions to initialize the parameters of your model. The first function initializes the parameters of a two-layer neural network; the second initializes the parameters of an L-layer neural network.
Exercise: Create and initialize the parameters of the two-layer neural network.
Instructions:

- The model's structure is LINEAR → RELU → LINEAR → SIGMOID.
- Use random initialization for the weight matrices: np.random.randn(shape) * 0.01 with the correct shape.
- Use zero initialization for the biases: np.zeros(shape).

```python
W1 = np.random.randn(n_h, n_x) * 0.01
b1 = np.zeros(shape=(n_h, 1))
W2 = np.random.randn(n_y, n_h) * 0.01
b2 = np.zeros(shape=(n_y, 1))
```

Here $n^{[l]}$ is the number of units in layer $l$. For example, if the input $X$ has shape $(12288, 209)$, i.e. 209 examples, then $W^{[1]}$ has shape $(n^{[1]}, 12288)$ and $b^{[1]}$ has shape $(n^{[1]}, 1)$; in general $W^{[l]}$ has shape $(n^{[l]}, n^{[l-1]})$ and $b^{[l]}$ has shape $(n^{[l]}, 1)$.
Exercise: Implement initialization for an L-layer neural network.
Instructions:

- The model's structure is [LINEAR → RELU] × (L-1) → LINEAR → SIGMOID, i.e. L-1 layers using a ReLU activation, followed by an output layer with a sigmoid activation.
- Use random initialization for the weight matrices: np.random.randn(shape) * 0.01.
- Use zero initialization for the biases: np.zeros(shape).
- Store the number of units of each layer in the variable layer_dims. For example, in the "Planar Data classification model", layer_dims was [2, 4, 1]: two input units, one hidden layer with four hidden units, and an output layer with one output unit. This means W1 has shape (4, 2), b1 has shape (4, 1), W2 has shape (1, 4), and b2 has shape (1, 1).

```python
np.random.seed(3)
parameters = {}
L = len(layer_dims)  # number of layers in the network

for l in range(1, L):
    parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
    parameters['b' + str(l)] = np.zeros(shape=(layer_dims[l], 1))
```
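As a quick sanity check, the loop above can be wrapped into the initialize_parameters_deep function that the model code uses later; a minimal sketch, with [5, 4, 3] chosen only as an example layer_dims:

```python
import numpy as np

def initialize_parameters_deep(layer_dims):
    """Initialize W1, b1, ..., WL, bL for the layer sizes given in layer_dims."""
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)
    for l in range(1, L):
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters

params = initialize_parameters_deep([5, 4, 3])
for name, value in params.items():
    print(name, value.shape)   # W1 (4, 5), b1 (4, 1), W2 (3, 4), b2 (3, 1)
```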
First, implement three functions:
- LINEAR
- LINEAR → ACTIVATION, where ACTIVATION is either ReLU or sigmoid
- [LINEAR → RELU] × (L-1) → LINEAR → SIGMOID (the whole model)

The linear forward module computes the following equation:

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$$

where $A^{[0]} = X$.
Exercise: Build the linear part of forward propagation.
```python
Z = np.dot(W, A) + b
```

Two activation functions, sigmoid and ReLU, are provided. Both return two values: the activation value "A" and a "cache" containing "Z" (the value that will be fed into the corresponding backward function).
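For reference, a minimal sketch of what the complete linear_forward could look like, following the cache convention described above (the docstring wording is my own):

```python
def linear_forward(A, W, b):
    """Linear part of a layer's forward propagation.

    A -- activations from the previous layer (or the input X): (size of previous layer, number of examples)
    W -- weight matrix: (size of current layer, size of previous layer)
    b -- bias vector: (size of current layer, 1)

    Returns Z and a cache of (A, W, b) for the backward pass.
    """
    Z = np.dot(W, A) + b
    cache = (A, W, b)
    return Z, cache
```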
Exercise: Implement the forward propagation of the LINEAR→ACTIVATION layer. The mathematical relation is $A^{[l]} = g(Z^{[l]}) = g(W^{[l]} A^{[l-1]} + b^{[l]})$, where the activation $g$ is sigmoid or relu. Use linear_forward() and the correct activation function.
if activation == "sigmoid": # Inputs: "A_prev, W, b". Outputs: "A, activation_cache". Z, linear_cache = linear_forward(A_prev, W, b) A, activation_cache = sigmoid(Z) elif activation == "relu": # Inputs: "A_prev, W, b". Outputs: "A, activation_cache". Z, linear_cache = linear_forward(A_prev, W, b) A, activation_cache = relu(Z)为了更方便地实现L层神经网络,需要一个函数来复制上一步(linear_activation_forward和RELU)L-1次,然后紧接着是linear_activation_forward和SIGMOID。
Exercise: Implement the forward propagation of the model described above.
Instructions: in the code below, the variable AL denotes $A^{[L]} = \sigma(Z^{[L]}) = \sigma(W^{[L]} A^{[L-1]} + b^{[L]})$, i.e. the output of the final sigmoid layer (sometimes also written $\hat{Y}$).
```python
caches = []
A = X
L = len(parameters) // 2   # number of layers in the neural network

# Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
for l in range(1, L):
    A_prev = A
    A, cache = linear_activation_forward(A_prev,
                                         parameters['W' + str(l)],
                                         parameters['b' + str(l)],
                                         "relu")
    caches.append(cache)

# Implement LINEAR -> SIGMOID. Add "cache" to the "caches" list.
AL, cache = linear_activation_forward(A,
                                      parameters['W' + str(L)],
                                      parameters['b' + str(L)],
                                      "sigmoid")
caches.append(cache)
```
Having implemented forward propagation, and before moving on to backward propagation, you need to compute the cost in order to check whether your model is actually learning.
Exercise: Compute the cross-entropy cost $J$, using the following formula:

$$J = -\frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} \log\left(a^{[L](i)}\right) + (1 - y^{(i)}) \log\left(1 - a^{[L](i)}\right) \right)$$
```python
cost = -np.mean(Y * np.log(AL) + (1 - Y) * np.log(1 - AL))
```
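A minimal sketch of how that one-liner could be wrapped into the compute_cost function used later by the models (the np.squeeze call just turns a (1, 1) array into a scalar):

```python
def compute_cost(AL, Y):
    """Cross-entropy cost J.

    AL -- vector of predicted probabilities, shape (1, number of examples)
    Y  -- vector of true labels (0 or 1), shape (1, number of examples)
    """
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    cost = np.squeeze(cost)   # e.g. turns np.array([[17.]]) into 17.
    return cost
```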
Just like forward propagation, you will build backward propagation in three steps:
- LINEAR backward
- LINEAR → ACTIVATION backward, where ACTIVATION computes the derivative of either the ReLU or the sigmoid activation
- [LINEAR → RELU] × (L-1) → LINEAR → SIGMOID backward (the whole model)

For layer $l$, the linear part is $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$ (followed by an activation).

Suppose you have already computed the derivative $dZ^{[l]} = \frac{\partial \mathcal{L}}{\partial Z^{[l]}}$. Use it to compute the three outputs $(dW^{[l]}, db^{[l]}, dA^{[l-1]})$:

$$dW^{[l]} = \frac{\partial \mathcal{L}}{\partial W^{[l]}} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}$$

$$db^{[l]} = \frac{\partial \mathcal{L}}{\partial b^{[l]}} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}$$

$$dA^{[l-1]} = \frac{\partial \mathcal{L}}{\partial A^{[l-1]}} = W^{[l]T} dZ^{[l]}$$
Exercise: Use the three formulas above to implement linear_backward().
```python
dW = np.dot(dZ, np.transpose(A_prev)) / m
db = np.sum(dZ, axis=1, keepdims=True) / m   # average over the examples, one value per unit
dA_prev = np.dot(np.transpose(W), dZ)
```

Two backward functions are provided, sigmoid_backward and relu_backward. Each returns dZ, computing

$$dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$$

where $g'$ is the derivative of the corresponding activation function.
Exercise: Implement the backward propagation of the LINEAR→ACTIVATION layer.
if activation == "relu": dZ = relu_backward(dA, activation_cache) dA_prev, dW, db = linear_backward(dZ, linear_cache) elif activation == "sigmoid": dZ = sigmoid_backward(dA, activation_cache) dA_prev, dW, db = linear_backward(dZ, linear_cache)接下来实现整个网络的反向传播,当实现L_model_forward函数时,在每一次迭代中,都储存了一个包含(X,W,b,Z)的cache,在反向传播模块会用到这些值来计算梯度,因此,在L_model_backward函数中,会从L层开始反向迭代所有隐藏层,每一步都会用到缓存值来反向传播通过第l层:
Initializing backpropagation: the output of the network is $A^{[L]} = \sigma(Z^{[L]})$, so your code needs to compute $dA^{[L]} = \frac{\partial \mathcal{L}}{\partial A^{[L]}}$. Use the following formula:
```python
dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))  # derivative of cost with respect to AL
```

You can then use this post-activation gradient dAL to keep going backward. As shown in the figure above, you can now feed dAL into the LINEAR→SIGMOID backward function. After that, use a for loop to iterate through all the other layers with the LINEAR→RELU backward function. Store each dA, dW and db in the grads dictionary, using the following convention:

$$grads["dW" + str(l)] = dW^{[l]}$$

For example, for $l = 3$ this stores $dW^{[3]}$ in grads["dW3"].
Exercise: Implement backpropagation for the [LINEAR → RELU] × (L-1) → LINEAR → SIGMOID model.
```python
grads = {}
L = len(caches)            # the number of layers
m = AL.shape[1]
Y = Y.reshape(AL.shape)    # after this line, Y is the same shape as AL

# Initializing the backpropagation
dAL = -(np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))

# Lth layer (SIGMOID -> LINEAR) gradients.
# Inputs: "AL, Y, caches". Outputs: "grads["dAL"], grads["dWL"], grads["dbL"]".
current_cache = caches[L-1]
grads["dA" + str(L)], grads["dW" + str(L)], grads["db" + str(L)] = \
    linear_activation_backward(dAL, current_cache, "sigmoid")

for l in reversed(range(L - 1)):
    # lth layer: (RELU -> LINEAR) gradients.
    # Inputs: "grads["dA" + str(l + 2)], caches".
    # Outputs: "grads["dA" + str(l + 1)], grads["dW" + str(l + 1)], grads["db" + str(l + 1)]".
    current_cache = caches[l]
    grads["dA" + str(l + 1)], grads["dW" + str(l + 1)], grads["db" + str(l + 1)] = \
        linear_activation_backward(grads["dA" + str(l + 2)], current_cache, "relu")
```

Finally, update the parameters of the model using gradient descent:
$$W^{[l]} = W^{[l]} - \alpha \, dW^{[l]}$$

$$b^{[l]} = b^{[l]} - \alpha \, db^{[l]}$$

where $\alpha$ is the learning rate. After computing the updated parameters, store them in the parameters dictionary.
Exercise: Implement update_parameters() to update your parameters using gradient descent.
```python
L = len(parameters) // 2   # number of layers in the neural network

# Update rule for each parameter. Use a for loop.
for l in range(L):
    parameters["W" + str(l+1)] = parameters["W" + str(l+1)] - learning_rate * grads["dW" + str(l+1)]
    parameters["b" + str(l+1)] = parameters["b" + str(l+1)] - learning_rate * grads["db" + str(l+1)]
```
You will use the same cat dataset as in the previous assignment, where the model reached 70% test accuracy.
Problem statement: you are given a dataset ("data.h5") containing:
- a training set of m_train images labelled as cat (1) or non-cat (0)
- a test set of m_test images labelled as cat or non-cat
- each image of shape (num_px, num_px, 3), i.e. three channels (RGB)

```python
train_x_orig, train_y, test_x_orig, test_y, classes = load_data()

# Explore your dataset
m_train = train_x_orig.shape[0]
num_px = train_x_orig.shape[1]
m_test = test_x_orig.shape[0]

print ("Number of training examples: " + str(m_train))
print ("Number of testing examples: " + str(m_test))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_x_orig shape: " + str(train_x_orig.shape))
print ("train_y shape: " + str(train_y.shape))
print ("test_x_orig shape: " + str(test_x_orig.shape))
print ("test_y shape: " + str(test_y.shape))
```

```
Number of training examples: 209
Number of testing examples: 50
Each image is of size: (64, 64, 3)
train_x_orig shape: (209, 64, 64, 3)
train_y shape: (1, 209)
test_x_orig shape: (50, 64, 64, 3)
test_y shape: (1, 50)
```

Here is one of the images:
```python
# Example of a picture
index = 50
plt.imshow(train_x_orig[index])
print ("y = " + str(train_y[0, index]) + ". It's a " + classes[train_y[0, index]].decode("utf-8") + " picture.")
```

```
y = 1. It's a cat picture.
```

First, reshape and standardize the images:
```python
# Reshape the training and test examples
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[0], -1).T   # The "-1" makes reshape flatten the remaining dimensions
test_x_flatten = test_x_orig.reshape(test_x_orig.shape[0], -1).T

# Standardize data to have feature values between 0 and 1.
train_x = train_x_flatten / 255.
test_x = test_x_flatten / 255.

print ("train_x's shape: " + str(train_x.shape))
print ("test_x's shape: " + str(test_x.shape))
```

```
train_x's shape: (12288, 209)
test_x's shape: (12288, 50)
```
You will build two different models: a two-layer neural network and an L-layer deep neural network.
The general methodology for building the models:
1. Initialize the parameters / define the hyperparameters.
2. Loop for num_iterations:
   - forward propagation
   - compute the cost function
   - backward propagation
   - update the parameters
3. Use the trained parameters to predict labels.
Question: use the helper functions implemented in the previous part to build a two-layer neural network with the structure LINEAR → RELU → LINEAR → SIGMOID. The functions you will need are:
```python
def initialize_parameters(n_x, n_h, n_y):
    ...
    return parameters
def linear_activation_forward(A_prev, W, b, activation):
    ...
    return A, cache
def compute_cost(AL, Y):
    ...
    return cost
def linear_activation_backward(dA, cache, activation):
    ...
    return dA_prev, dW, db
def update_parameters(parameters, grads, learning_rate):
    ...
    return parameters
```

Define the constants:
```python
### CONSTANTS DEFINING THE MODEL ####
n_x = 12288     # num_px * num_px * 3
n_h = 7
n_y = 1
layers_dims = (n_x, n_h, n_y)
```

The two-layer neural network:
```python
# GRADED FUNCTION: two_layer_model

def two_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):
    """
    Implements a two-layer neural network: LINEAR->RELU->LINEAR->SIGMOID.

    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (containing 0 if cat, 1 if non-cat), of shape (1, number of examples)
    layers_dims -- dimensions of the layers (n_x, n_h, n_y)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- If set to True, this will print the cost every 100 iterations

    Returns:
    parameters -- a dictionary containing W1, W2, b1, and b2
    """

    np.random.seed(1)
    grads = {}
    costs = []                  # to keep track of the cost
    m = X.shape[1]              # number of examples
    (n_x, n_h, n_y) = layers_dims

    # Initialize parameters dictionary, by calling one of the functions you'd previously implemented
    ### START CODE HERE ### (≈ 1 line of code)
    parameters = initialize_parameters(n_x, n_h, n_y)
    ### END CODE HERE ###

    # Get W1, b1, W2 and b2 from the dictionary parameters.
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: LINEAR -> RELU -> LINEAR -> SIGMOID. Inputs: "X, W1, b1, W2, b2". Output: "A1, cache1, A2, cache2".
        ### START CODE HERE ### (≈ 2 lines of code)
        A1, cache1 = linear_activation_forward(X, W1, b1, "relu")
        A2, cache2 = linear_activation_forward(A1, W2, b2, "sigmoid")
        ### END CODE HERE ###

        # Compute cost
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(A2, Y)
        ### END CODE HERE ###

        # Initializing backward propagation
        dA2 = -(np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))

        # Backward propagation. Inputs: "dA2, cache2, cache1". Outputs: "dA1, dW2, db2; also dA0 (not used), dW1, db1".
        ### START CODE HERE ### (≈ 2 lines of code)
        dA1, dW2, db2 = linear_activation_backward(dA2, cache2, "sigmoid")
        dA0, dW1, db1 = linear_activation_backward(dA1, cache1, "relu")
        ### END CODE HERE ###

        # Set grads['dW1'] to dW1, grads['db1'] to db1, grads['dW2'] to dW2, grads['db2'] to db2
        grads['dW1'] = dW1
        grads['db1'] = db1
        grads['dW2'] = dW2
        grads['db2'] = db2

        # Update parameters.
        ### START CODE HERE ### (approx. 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Retrieve W1, b1, W2, b2 from parameters
        W1 = parameters["W1"]
        b1 = parameters["b1"]
        W2 = parameters["W2"]
        b2 = parameters["b2"]

        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
        if print_cost and i % 100 == 0:
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters
```

Train the model:
```python
parameters = two_layer_model(train_x, train_y, layers_dims=(n_x, n_h, n_y), num_iterations=2500, print_cost=True)
```

Accuracy on the training set:
```python
predictions_train = predict(train_x, train_y, parameters)
```

```
Accuracy: 1.0
```

Accuracy on the test set:
```python
predictions_test = predict(test_x, test_y, parameters)
```

```
Accuracy: 0.72
```
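predict comes from the assignment's utility file and is not implemented in these notes. A minimal sketch of what such a helper could look like, assuming the L_model_forward above and a 0.5 decision threshold:

```python
def predict(X, y, parameters):
    """Predict 0/1 labels with the trained model and print the accuracy (a sketch)."""
    m = X.shape[1]
    probas, _ = L_model_forward(X, parameters)   # forward pass through the whole network
    p = (probas > 0.5).astype(int)               # threshold the sigmoid output
    print("Accuracy: " + str(np.sum(p == y) / m))
    return p
```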
Question: use the helper functions implemented previously to build an L-layer neural network with the structure [LINEAR → RELU] × (L-1) → LINEAR → SIGMOID. The functions you will need are:
```python
def initialize_parameters_deep(layer_dims):
    ...
    return parameters
def L_model_forward(X, parameters):
    ...
    return AL, caches
def compute_cost(AL, Y):
    ...
    return cost
def L_model_backward(AL, Y, caches):
    ...
    return grads
def update_parameters(parameters, grads, learning_rate):
    ...
    return parameters
```

Define the constants:
```python
### CONSTANTS ###
layers_dims = [12288, 20, 7, 5, 1]   # 5-layer model
```

The L-layer neural network:
```python
# GRADED FUNCTION: L_layer_model

def L_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):  # lr was 0.009
    """
    Implements an L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.

    Arguments:
    X -- input data, numpy array of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if cat, 1 if non-cat), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """

    np.random.seed(1)
    costs = []   # keep track of cost

    # Parameters initialization.
    ### START CODE HERE ###
    parameters = initialize_parameters_deep(layers_dims)
    ### END CODE HERE ###

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        ### START CODE HERE ### (≈ 1 line of code)
        AL, caches = L_model_forward(X, parameters)
        ### END CODE HERE ###

        # Compute cost.
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(AL, Y)
        ### END CODE HERE ###

        # Backward propagation.
        ### START CODE HERE ### (≈ 1 line of code)
        grads = L_model_backward(AL, Y, caches)
        ### END CODE HERE ###

        # Update parameters.
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" % (i, cost))
        if print_cost and i % 100 == 0:
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters
```

Train the model:
```python
parameters = L_layer_model(train_x, train_y, layers_dims, num_iterations=2500, print_cost=True)
```

Accuracy on the training set:
```python
pred_train = predict(train_x, train_y, parameters)
```

```
Accuracy: 0.9856459330143541
```

Accuracy on the test set:
```python
pred_test = predict(test_x, test_y, parameters)
```

```
Accuracy: 0.8
```
Some of the misclassified images:
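The assignment's utility file provides print_mislabeled_images(classes, test_x, test_y, pred_test) for this; below is a rough sketch of how such a helper could be written (the layout details are assumptions):

```python
def print_mislabeled_images(classes, X, y, p):
    """Plot the images where the prediction p disagrees with the true label y (a sketch)."""
    a = p + y
    mislabeled_indices = np.asarray(np.where(a == 1))   # entries where exactly one of p, y equals 1
    num_images = len(mislabeled_indices[0])
    plt.rcParams['figure.figsize'] = (40.0, 40.0)       # enlarge the figure
    for i in range(num_images):
        index = mislabeled_indices[1][i]
        plt.subplot(2, num_images, i + 1)
        plt.imshow(X[:, index].reshape(64, 64, 3), interpolation='nearest')
        plt.axis('off')
        plt.title("Prediction: " + classes[int(p[0, index])].decode("utf-8")
                  + " \n Class: " + classes[int(y[0, index])].decode("utf-8"))
    plt.show()

print_mislabeled_images(classes, test_x, test_y, pred_test)
```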
