Accelerating Physically Based Rendering by CNN
Part 1: Abstract
To obtain photorealistic images, it is necessary to solve the rendering equation, which can be estimated by Monte Carlo integration. A Monte Carlo rendering system estimates the scene function by shooting rays that sample world space. A small number of samples yields an estimate quickly, but that estimate deviates substantially from the true value, which shows up in the image as heavy noise. Because the variance of the Monte Carlo estimator only falls off as 1/N in the number of samples N, a faithful estimate requires very many samples, and tracing that many rays takes long render times, which has greatly limited the adoption of Monte Carlo rendering in modern industry. To address this, the paper proposes a deep-learning framework that takes a low-sample image as input and produces an approximation of the corresponding high-sample result. Pairs of noisy low-sample Monte Carlo images and their high-sample counterparts serve as training data; the network is a convolutional neural network with residual connections.
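For reference, the rendering equation and the Monte Carlo estimator used to approximate it, in their standard textbook forms (not spelled out in the original):

% Rendering equation over the hemisphere Omega at surface point x
L_o(x, \omega_o) = L_e(x, \omega_o) + \int_{\Omega} f_r(x, \omega_i, \omega_o) \, L_i(x, \omega_i) \, (\omega_i \cdot n) \, \mathrm{d}\omega_i

% N-sample Monte Carlo estimator with sampling density p; its variance
% falls off as 1/N, so the pixel error (standard deviation) only as 1/sqrt(N)
\hat{L}_N(x, \omega_o) = L_e(x, \omega_o) + \frac{1}{N} \sum_{k=1}^{N} \frac{f_r(x, \omega_k, \omega_o) \, L_i(x, \omega_k) \, (\omega_k \cdot n)}{p(\omega_k)}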
Part 2: Related Work (Background)
Traditional work
Learning-based work
Part 3: Method
Problem definition: the task is to produce a high-sample image from a low-sample input; a convolutional neural network with residual connections can dramatically reduce the time needed to obtain a high-quality image.
Data generation: the training data is rendered with Blender; the paper generates 100 pairs of low-sample images and corresponding high-sample images, most of size 1280x960.
Network design: VGG16 serves as the base model, with residual blocks added on top.
Part 4: Training and Comparison
Training: patches of size (224, 224) are randomly cropped from the images and used as network input. Because physically based renderings have statistics very similar to natural images, the paper uses VGG16 as the base layers. The input is a (224, 224, 3) patch, and VGG16's output has shape (7, 7, 512); this is then fed through residual blocks that upsample it, and the final output has shape (224, 224, 3). The loss function is MSE, measuring the error between the network's output patch and the ground-truth patch.
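In symbols, with network f_theta, low-sample patch x_i, and high-sample target y_i, the loss described above is the standard per-batch MSE (my notation, consistent with the text):

\mathcal{L}(\theta) = \frac{1}{B} \sum_{i=1}^{B} \left\| f_\theta(x_i) - y_i \right\|_2^2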
Comparison:
1. Fully connected network vs. fully convolutional network.
2. The images generated by the fully convolutional network are duller than the ground-truth images. A likely cause is that the patch size was too small; after increasing the patch size, the fully convolutional network's output came closer to the ground truth.
3. To further improve image quality, the paper uses a pretrained VGG16 as the base model, with VGG16's fully connected layers removed. The mean squared error between the residual network's output image and the ground-truth image serves as the loss function.
Part 5: Implementation Details (Code Walkthrough)
# Package imports
import sys

import numpy as np
import cv2
import matplotlib.pyplot as plt

from keras.models import Model
from keras.layers import Input, Conv2D, UpSampling2D, Add, concatenate
from keras.callbacks import ModelCheckpoint

sys.path.append("../common")
Creating the network
Load the pretrained VGG16 network.
def create_network():
    from keras.applications.vgg16 import VGG16

    input_tensor = Input(shape=(224, 224, 3))
    base_model = VGG16(input_tensor=input_tensor, weights='imagenet', include_top=False)

    # Freeze the pretrained encoder for the first training stage.
    for layer in base_model.layers:
        layer.trainable = False

    x = base_model.output  # shape (7, 7, 512)

    # Upsampling blocks with residual connections: each block projects to the
    # target channel count, applies a residual convolution, and doubles the
    # spatial resolution (7 -> 14 -> 28 -> 56 -> 112 -> 224).
    for filters in (256, 128, 64, 32, 32):
        a = x = Conv2D(filters, (3, 3), activation='relu', padding='same')(x)
        x = Conv2D(filters, (3, 3), activation='relu', padding='same')(x)
        x = Add()([a, x])
        x = UpSampling2D(size=(2, 2))(x)

    # Skip connection from the noisy input image into the decoder output.
    x = concatenate([x, input_tensor])
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    out = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)

    model = Model(inputs=base_model.input, outputs=out)
    model.compile(optimizer='adadelta', loss='mean_squared_error')
    return model
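As a quick sanity check of the decoder shapes (a usage sketch, not in the original):

model = create_network()
model.summary()  # the last layer should report output shape (None, 224, 224, 3)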
Cropping patches from the images
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

datagen = ImageDataGenerator(rescale=1./255)
lowsample_data_dir = '../processed_data/train/low/'
highsample_data_dir = '../processed_data/train/high/'

def random_generate_from_data(sample_size, patch_size):
    # Load the paired low-sample / high-sample renderings of three scenes.
    low_imgs = []
    high_imgs = []
    low_imgs.append(cv2.imread('../raw_data/low/blenderman.png'))
    high_imgs.append(cv2.imread('../raw_data/high/blenderman.png'))
    low_imgs.append(cv2.imread('../raw_data/low/classroom_low.png'))
    high_imgs.append(cv2.imread('../raw_data/high/classroom_high.png'))
    low_imgs.append(cv2.imread('../raw_data/low/pa_low.png'))
    high_imgs.append(cv2.imread('../raw_data/high/pa_high.png'))

    lows = []
    highs = []
    for i in range(sample_size):
        # Pick a random scene, then crop the same random patch from the
        # low-sample image and its high-sample counterpart.
        num = np.random.randint(len(low_imgs))
        low = low_imgs[num]
        high = high_imgs[num]
        x_max = low.shape[0] - patch_size
        y_max = low.shape[1] - patch_size
        x = np.random.randint(x_max)
        y = np.random.randint(y_max)
        low_sample = low[x:x + patch_size, y:y + patch_size, :]
        high_sample = high[x:x + patch_size, y:y + patch_size, :]
        lows.append(low_sample)
        highs.append(high_sample)
    return np.array(lows), np.array(highs)
Patch generation and normalization
lowsampleimgs, highsampleimgs = random_generate_from_data(10000, 224)
lowsampleimgs = lowsampleimgs.astype('float32') / 255.
highsampleimgs = highsampleimgs.astype('float32') / 255.
print(np.max(highsampleimgs))
print("train low sample img shape:", lowsampleimgs.shape)
print("train high sample img shape:", highsampleimgs.shape)
Plot a few low-sample patches next to their high-sample counterparts
# Note: cv2.imread returns BGR, so colors appear channel-swapped unless converted.
for i in range(5):
    plt.figure(i)
    plt.grid(b=False)
    plt.subplot(221)
    plt.imshow(lowsampleimgs[i, :, :, :])
    plt.subplot(222)
    plt.imshow(highsampleimgs[i, :, :, :])
    plt.show()
Training the network
checkpointer = ModelCheckpoint(filepath="./Models/model_x.hdf5", verbose=0)
m_model = create_network()  # build the model defined above
m_model.fit(lowsampleimgs, highsampleimgs,
            epochs=100,
            batch_size=30,
            shuffle=True,
            callbacks=[checkpointer])
Does the author mean something like the adversarial-network approach here?
Then we unfreeze the VGG layers and train for another 10 epochs.
for layer in m_model.layers:
    layer.trainable = True
# Recompile so the change of trainable flags takes effect.
m_model.compile(optimizer='adadelta', loss='mean_squared_error')
m_model.fit(lowsampleimgs, highsampleimgs,
            epochs=10,
            batch_size=30,
            shuffle=True,
            callbacks=[checkpointer])
The loss here is 0.4082 (in my experience, this is not a particularly good result).
Result evaluation
from keras.models import load_model

m_model = load_model('dl/GAN/Models/model_x.hdf5')
result = m_model.predict(testimgs[:10])  # testimgs: held-out low-sample patches

for i in range(10):
    plt.figure(i)
    plt.subplot(131)
    plt.imshow(lowsampleimgs[i])
    plt.subplot(132)
    plt.imshow(highsampleimgs[i])
    plt.subplot(133)
    plt.imshow(result[i])
    plt.show()
Here the author finally combines the individual patches into full images, but no code is given for that step; only some final denoising results are shown.
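Since the stitching code is missing, here is a minimal sketch of one way it could work, assuming m_model from above and a low-sample image already normalized to [0, 1] like the training patches; denoise_full_image is a hypothetical helper, not the author's code:

def denoise_full_image(model, low_img, patch_size=224):
    # Pad so height and width are multiples of patch_size.
    h, w = low_img.shape[:2]
    ph = (patch_size - h % patch_size) % patch_size
    pw = (patch_size - w % patch_size) % patch_size
    padded = np.pad(low_img, ((0, ph), (0, pw), (0, 0)), mode='reflect')
    out = np.zeros_like(padded, dtype='float32')
    # Denoise each non-overlapping tile and write it back in place.
    for r in range(0, padded.shape[0], patch_size):
        for c in range(0, padded.shape[1], patch_size):
            patch = padded[r:r + patch_size, c:c + patch_size, :]
            pred = model.predict(patch[np.newaxis].astype('float32'))
            out[r:r + patch_size, c:c + patch_size, :] = pred[0]
    return out[:h, :w, :]

Non-overlapping tiles are the simplest version; overlapping patches with blended seams would reduce visible borders between tiles.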
Extension: one-shot learning for video rendering
To extend the network further, a one-shot learning scheme for video rendering is introduced, because consecutive frames of a video share very similar scene structure and lighting.
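The original gives no code for this scheme. A minimal sketch, under my assumption that it means "fine-tune on one supervised frame, then denoise the rest", might look like this (one_shot_video and its arguments are hypothetical, and it reuses the denoise_full_image helper sketched above):

def one_shot_video(model, low_frames, high_frame0, patch_size=224, n_patches=64):
    # Adapt the pretrained model on random patch pairs cropped from the
    # single supervised pair (low_frames[0], high_frame0).
    lows, highs = [], []
    for _ in range(n_patches):
        r = np.random.randint(low_frames[0].shape[0] - patch_size)
        c = np.random.randint(low_frames[0].shape[1] - patch_size)
        lows.append(low_frames[0][r:r + patch_size, c:c + patch_size])
        highs.append(high_frame0[r:r + patch_size, c:c + patch_size])
    model.fit(np.array(lows), np.array(highs), epochs=5, batch_size=8)
    # Denoise every frame of the video with the adapted model.
    return [denoise_full_image(model, f, patch_size) for f in low_frames]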
Summary and reflections
What is one-shot learning?
The reference is Andrew Ng's Deep Learning course.
Start from face recognition, which splits into face verification and face identification. Verification takes an ID plus an image and checks whether the image really shows that person (a 1:1 problem). Identification takes an input image and must output which of the K people in a database it shows (a 1:K problem). One-shot learning means the system has to learn from just a single example of each person.
Instead of classifying directly, learn a "similarity" function d(image1, image2) = degree of difference between the two images.
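As a toy illustration of the verification rule from the course (my own sketch; embed is a hypothetical learned embedding function and tau a hypothetical threshold):

import numpy as np

def verify(embed, img1, img2, tau=0.7):
    # d(img1, img2) = squared distance between the learned embeddings;
    # declare "same person" when the distance is below the threshold.
    d = np.sum((embed(img1) - embed(img2)) ** 2)
    return d < tau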