Upsampling Overview + Bilinear PyTorch Code Walkthrough

xiaoxiao2024-11-11 67

Upsampling

Upsampling refers to any technique that increases the resolution of an image.

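The simplest approach is resampling with interpolation: rescale the input image to the desired size, then compute the value of each pixel in the new grid with an interpolation method such as bilinear interpolation.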

Unpooling

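In CNNs, unpooling usually denotes the inverse of max pooling. Because max pooling is not invertible, unpooling only approximates the input that existed before pooling: the position of each maximum is recorded during pooling, the pooled values are written back to those positions, and every other position is filled with 0.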

Deconvolution

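Also known as fractionally strided convolution (convolution with fractional strides), transposed convolution, or backwards strided convolution. Unlike unpooling, upsampling with deconvolution is learnable. It is typically used to upsample the output of convolutional layers back to the resolution of the original image.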
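A minimal sketch of learnable upsampling with torch.nn.ConvTranspose2d. The channel counts, kernel size and stride are illustrative assumptions and do not come from the original post; with kernel_size=2 and stride=2 the spatial size doubles.

```python
import torch
import torch.nn as nn

# Illustrative layer: 1 input channel, 1 output channel (assumed values).
# With padding=0: H_out = (H_in - 1) * stride + kernel_size = 2 * H_in here.
deconv = nn.ConvTranspose2d(in_channels=1, out_channels=1, kernel_size=2, stride=2)

x = torch.randn(1, 1, 4, 4)   # (N, C, H, W)
y = deconv(x)
print(y.shape)                # spatial size goes from 4x4 to 8x8

# Unlike unpooling, the layer has trainable parameters:
# a 2x2 kernel plus one bias term.
num_params = sum(p.numel() for p in deconv.parameters())
print(num_params)
```

The weights start out random, so the upsampled values only become meaningful after training.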

PyTorch

In PyTorch, the upsampling layers are grouped under Vision Layers in torch.nn. There are four of them:

① PixelShuffle ② Upsample ③ UpsamplingNearest2d ④ UpsamplingBilinear2d
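Of these four, PixelShuffle is the only one that the rest of this post does not demonstrate, so here is a small sketch. It rearranges a tensor of shape (N, C*r*r, H, W) into (N, C, H*r, W*r); the input values and the factor r=2 are chosen only for illustration.

```python
import torch
import torch.nn as nn

# 4 channels of size 2x2; channel k holds the values 4k .. 4k+3.
x = torch.arange(16, dtype=torch.float32).reshape(1, 4, 2, 2)

pixel_shuffle = nn.PixelShuffle(upscale_factor=2)
y = pixel_shuffle(x)

print(y.shape)   # the 4 channels are folded into one 4x4 map
print(y[0, 0])   # each 2x2 block takes one value from each input channel
```

No interpolation happens here: every output value is copied from the input, so the layer is normally placed after a convolution that produces the C*r*r channels.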

upsample

torch.nn.functional.upsample(input, size=None, scale_factor=None, mode='nearest', align_corners=None)

This function is deprecated; interpolate is the recommended replacement.

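scale_factor: the multiplier applied to the height, width and depth.
mode: the upsampling algorithm, one of nearest neighbor (nearest), linear interpolation (linear), bilinear interpolation (bilinear) and trilinear interpolation (trilinear). The default is nearest.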

interpolate

torch.nn.functional.interpolate(input, size=None, scale_factor=None, mode='nearest', align_corners=None)

Example:

# -*- coding: utf-8 -*-
import numpy as np
import torch
import torch.nn as nn

# Input of shape (N, C, H, W) = (1, 1, 8, 8) holding the values 0..63
data = np.arange(64).reshape((8, 8))
A = torch.tensor(data, dtype=torch.float32).reshape(1, 1, 8, 8)
print('A=', A)

# Max pooling; return_indices=True keeps the argmax positions for unpooling later
maxpool = nn.MaxPool2d((2, 2), stride=(2, 2), return_indices=True)
B, indices = maxpool(A)
print('B=', B)

# Bilinear upsampling with the nn.Upsample module
# (align_corners=False is the default, written out here to make it explicit)
upsample = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
C = upsample(B)
print('C=', C)

# The functional form gives the same result
D = nn.functional.interpolate(B, scale_factor=2, mode='bilinear', align_corners=False)
print('D=', D)

# Max unpooling: put each maximum back at its recorded position, zeros elsewhere
maxunpool = nn.MaxUnpool2d(kernel_size=(2, 2), stride=(2, 2))
E = maxunpool(B, indices)
print('E=', E)

Output

A= tensor([[[[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11., 12., 13., 14., 15.],
          [16., 17., 18., 19., 20., 21., 22., 23.],
          [24., 25., 26., 27., 28., 29., 30., 31.],
          [32., 33., 34., 35., 36., 37., 38., 39.],
          [40., 41., 42., 43., 44., 45., 46., 47.],
          [48., 49., 50., 51., 52., 53., 54., 55.],
          [56., 57., 58., 59., 60., 61., 62., 63.]]]])
B= tensor([[[[ 9., 11., 13., 15.],
          [25., 27., 29., 31.],
          [41., 43., 45., 47.],
          [57., 59., 61., 63.]]]])
C= tensor([[[[ 9.0000,  9.5000, 10.5000, 11.5000, 12.5000, 13.5000, 14.5000, 15.0000],
          [13.0000, 13.5000, 14.5000, 15.5000, 16.5000, 17.5000, 18.5000, 19.0000],
          [21.0000, 21.5000, 22.5000, 23.5000, 24.5000, 25.5000, 26.5000, 27.0000],
          [29.0000, 29.5000, 30.5000, 31.5000, 32.5000, 33.5000, 34.5000, 35.0000],
          [37.0000, 37.5000, 38.5000, 39.5000, 40.5000, 41.5000, 42.5000, 43.0000],
          [45.0000, 45.5000, 46.5000, 47.5000, 48.5000, 49.5000, 50.5000, 51.0000],
          [53.0000, 53.5000, 54.5000, 55.5000, 56.5000, 57.5000, 58.5000, 59.0000],
          [57.0000, 57.5000, 58.5000, 59.5000, 60.5000, 61.5000, 62.5000, 63.0000]]]])
D= tensor([[[[ 9.0000,  9.5000, 10.5000, 11.5000, 12.5000, 13.5000, 14.5000, 15.0000],
          [13.0000, 13.5000, 14.5000, 15.5000, 16.5000, 17.5000, 18.5000, 19.0000],
          [21.0000, 21.5000, 22.5000, 23.5000, 24.5000, 25.5000, 26.5000, 27.0000],
          [29.0000, 29.5000, 30.5000, 31.5000, 32.5000, 33.5000, 34.5000, 35.0000],
          [37.0000, 37.5000, 38.5000, 39.5000, 40.5000, 41.5000, 42.5000, 43.0000],
          [45.0000, 45.5000, 46.5000, 47.5000, 48.5000, 49.5000, 50.5000, 51.0000],
          [53.0000, 53.5000, 54.5000, 55.5000, 56.5000, 57.5000, 58.5000, 59.0000],
          [57.0000, 57.5000, 58.5000, 59.5000, 60.5000, 61.5000, 62.5000, 63.0000]]]])
E= tensor([[[[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
          [ 0.,  9.,  0., 11.,  0., 13.,  0., 15.],
          [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
          [ 0., 25.,  0., 27.,  0., 29.,  0., 31.],
          [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
          [ 0., 41.,  0., 43.,  0., 45.,  0., 47.],
          [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
          [ 0., 57.,  0., 59.,  0., 61.,  0., 63.]]]])
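The C and D tensors above appear to correspond to the default align_corners=False behaviour. The sketch below compares the two align_corners settings on a small 2x2 input, and also shows that size can be passed instead of scale_factor. The input values are made up for illustration.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]]).reshape(1, 1, 2, 2)   # (N, C, H, W)

# scale_factor multiplies the spatial size; size sets it directly.
by_scale = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
by_size = F.interpolate(x, size=(4, 4), mode='bilinear', align_corners=False)
print(by_scale.shape, by_size.shape)

# align_corners=True maps the corner pixels of input and output onto each
# other, so the samples in between are spaced evenly from corner to corner.
aligned = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=True)

print(by_scale[0, 0, 0])   # first row with align_corners=False
print(aligned[0, 0, 0])    # first row with align_corners=True
```

With align_corners=False the first row is 1, 1.25, 1.75, 2 (border values are repeated, as in C above); with align_corners=True it is 1, 4/3, 5/3, 2.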

mode

nearest

Image matrix
234  38  22
 67  44  12
 89  65  63

Coordinates
----------------------> X
|
|
|
|
|
Y

Enlarge to (4, 4)
?  ?  ?  ?
?  ?  ?  ?
?  ?  ?  ?
?  ?  ?  ?

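Using the formulas
srcX = dstX * (srcWidth / dstWidth)
srcY = dstY * (srcHeight / dstHeight)
fractional results are either rounded or truncated. The result below uses rounding: for example, dstX = 2 gives srcX = 2 * 3/4 = 1.5, which rounds to 2.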

Filling in each pixel in turn gives
234  38  22  22
 67  44  12  12
 89  65  63  63
 89  65  63  63
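The worked example above can be reproduced with a few lines of plain Python. This is only a sketch of the rounding variant of the formula; nearest_resize is a hypothetical helper written for this post, not a library function.

```python
def nearest_resize(src, dst_h, dst_w):
    """Nearest-neighbor resize: srcX = round(dstX * srcWidth / dstWidth)."""
    src_h, src_w = len(src), len(src[0])
    out = []
    for dst_y in range(dst_h):
        # int(v + 0.5) rounds half up; clamp so the index stays inside the image.
        src_y = min(int(dst_y * src_h / dst_h + 0.5), src_h - 1)
        row = []
        for dst_x in range(dst_w):
            src_x = min(int(dst_x * src_w / dst_w + 0.5), src_w - 1)
            row.append(src[src_y][src_x])
        out.append(row)
    return out


image = [[234, 38, 22],
         [67, 44, 12],
         [89, 65, 63]]

resized = nearest_resize(image, 4, 4)
for row in resized:
    print(row)
```

An implementation that truncates instead of rounding would map destination columns 0..3 to source columns 0, 0, 1, 2, so the first row would become 234, 234, 38, 22.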

Bilinear Interpolation

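Bilinear interpolation computes the value at a target point P from the values of the four source pixels nearest to P, weighting each one by how close it is to P along the x and y axes.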
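A plain Python sketch of this idea, applied to the pooled tensor B from the example earlier. It assumes the half-pixel coordinate mapping src = (dst + 0.5) * srcSize / dstSize - 0.5, clamped at the borders, which is my understanding of the align_corners=False convention; bilinear_resize is a hypothetical helper, not PyTorch's actual implementation.

```python
def bilinear_resize(src, dst_h, dst_w):
    src_h, src_w = len(src), len(src[0])

    def source_coord(dst_i, src_size, dst_size):
        # Half-pixel mapping (assumed align_corners=False convention), clamped at 0.
        s = max((dst_i + 0.5) * src_size / dst_size - 0.5, 0.0)
        i0 = min(int(s), src_size - 1)   # left / top neighbor
        i1 = min(i0 + 1, src_size - 1)   # right / bottom neighbor (clamped at the edge)
        return i0, i1, s - i0            # s - i0 is the fractional distance from i0

    out = []
    for dst_y in range(dst_h):
        y0, y1, wy = source_coord(dst_y, src_h, dst_h)
        row = []
        for dst_x in range(dst_w):
            x0, x1, wx = source_coord(dst_x, src_w, dst_w)
            # Interpolate along x on the two rows, then along y between them.
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


B = [[9, 11, 13, 15],
     [25, 27, 29, 31],
     [41, 43, 45, 47],
     [57, 59, 61, 63]]

out = bilinear_resize(B, 8, 8)
for row in out:
    print(row)
```

Under that assumption the result agrees with the tensors C and D printed in the output section, including the repeated border values such as 9.0 and 15.0 in the first row.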
