This document is part of the TensorFlow reference documentation; this repost has been authorized by the TensorFlow Chinese community (TensorFlow中文社区).
Note: Functions taking Tensor arguments can also take anything accepted by tf.convert_to_tensor.
TensorFlow provides Ops to decode and encode JPEG and PNG formats. Encoded images are represented by scalar string Tensors, decoded images by 3-D uint8 tensors of shape [height, width, channels].
The encode and decode Ops apply to one image at a time. Their input and output are all of variable size. If you need fixed size images, pass the output of the decode Ops to one of the cropping and resizing Ops.
Note: The PNG encode and decode Ops support RGBA, but the conversion Ops presently only support RGB, HSV, and GrayScale.
Decode a JPEG-encoded image to a uint8 tensor.
The attr channels indicates the desired number of color channels for the decoded image.
Accepted values are:
0: Use the number of channels in the JPEG-encoded image.
1: Output a grayscale image.
3: Output an RGB image.
If needed, the JPEG-encoded image is transformed to match the requested number of color channels.
The attr ratio allows downscaling the image by an integer factor during decoding. Allowed values are: 1, 2, 4, and 8. This is much faster than downscaling the image later.
A Tensor of type uint8. 3-D with shape [height, width, channels].
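For illustration, a minimal sketch of decoding a JPEG file; the file path and the use of tf.read_file to obtain the encoded bytes are assumptions, not part of the reference above.

import tensorflow as tf

# Hypothetical path to a JPEG file on disk.
jpeg_data = tf.read_file('/tmp/example.jpg')
# Decode to a 3-D uint8 tensor; channels=3 requests an RGB image and
# ratio=2 downscales by a factor of 2 during decoding.
image = tf.image.decode_jpeg(jpeg_data, channels=3, ratio=2)
# `image` has shape [height, width, 3] after decoding.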
JPEG-encode an image.
image is a 3-D uint8 Tensor of shape [height, width, channels].
The attr format can be used to override the color format of the encoded output. Values can be:
'': Use a default format based on the number of channels in the image.
grayscale: Output a grayscale JPEG image. The channels dimension of image must be 1.
rgb: Output an RGB JPEG image. The channels dimension of image must be 3.
If format is not specified or is the empty string, a default format is picked based on the number of channels in image:
1: Output a grayscale image.
3: Output an RGB image.
A Tensor of type string. 0-D. JPEG-encoded image.
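A minimal sketch of the reverse direction; the placeholder image tensor created with tf.zeros stands in for a real image and is only there to make the example self-contained.

import tensorflow as tf

# Assume `image` is a 3-D uint8 tensor of shape [height, width, 3].
image = tf.zeros([64, 64, 3], dtype=tf.uint8)  # placeholder content for the sketch
# format='rgb' forces an RGB JPEG; with format='' the format would be
# picked from the number of channels in `image`.
encoded = tf.image.encode_jpeg(image, format='rgb', quality=95)
# `encoded` is a 0-D string tensor holding the JPEG-encoded bytes.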
Decode a PNG-encoded image to a uint8 tensor.
The attr channels indicates the desired number of color channels for the decoded image.
Accepted values are:
0: Use the number of channels in the PNG-encoded image.
1: Output a grayscale image.
3: Output an RGB image.
4: Output an RGBA image.
If needed, the PNG-encoded image is transformed to match the requested number of color channels.
A Tensor of type uint8. 3-D with shape [height, width, channels].
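A corresponding sketch for PNG decoding; again the file path and the use of tf.read_file are assumptions for the example only.

import tensorflow as tf

# Hypothetical path to a PNG file; channels=4 requests an RGBA image.
png_data = tf.read_file('/tmp/example.png')
image = tf.image.decode_png(png_data, channels=4)
# `image` is a 3-D uint8 tensor of shape [height, width, 4].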
PNG-encode an image.
image is a 3-D uint8 Tensor of shape [height, width, channels] where channels is:
1: for grayscale.
3: for RGB.
4: for RGBA.
The ZLIB compression level, compression, can be -1 for the PNG-encoder default or a value from 0 to 9. 9 is the highest compression level, generating the smallest output, but is slower.
A Tensor of type string. 0-D. PNG-encoded image.
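A minimal PNG-encoding sketch, using a placeholder RGBA tensor in place of real image data.

import tensorflow as tf

# Assume `image` is a 3-D uint8 tensor; channels must be 1, 3 or 4.
image = tf.zeros([32, 32, 4], dtype=tf.uint8)  # placeholder RGBA content
# compression=-1 keeps the PNG encoder's default ZLIB level; 9 would give
# the smallest but slowest-to-produce output.
encoded = tf.image.encode_png(image, compression=-1)
# `encoded` is a 0-D string tensor holding the PNG-encoded bytes.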
The resizing Ops accept input images as tensors of several types. They always output resized images as float32 tensors.
The convenience function resize_images() supports both 4-D and 3-D tensors as input and output. 4-D tensors are for batches of images, 3-D tensors for individual images.
Other resizing Ops only support 3-D individual images as input: resize_area, resize_bicubic, resize_bilinear, resize_nearest_neighbor.
Example:
# Decode a JPG image and resize it to 299 by 299.
image = tf.image.decode_jpeg(...)
resized_image = tf.image.resize_bilinear(image, [299, 299])

Maybe refer to the Queue examples that show how to add images to a Queue after resizing them to a fixed size, and how to dequeue batches of resized images from the Queue.
Resize images to new_width, new_height using the specified method.
Resized images will be distorted if their original aspect ratio is not the same as new_width, new_height. To avoid distortions see resize_image_with_crop_or_pad.
method can be one of:
ResizeMethod.BILINEAR: Bilinear interpolation (https://en.wikipedia.org/wiki/Bilinear_interpolation)
ResizeMethod.NEAREST_NEIGHBOR: Nearest neighbor interpolation (https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation)
ResizeMethod.BICUBIC: Bicubic interpolation (https://en.wikipedia.org/wiki/Bicubic_interpolation)
ResizeMethod.AREA: Area interpolation.
If images was 4-D, a 4-D float Tensor of shape [batch, new_height, new_width, channels]. If images was 3-D, a 3-D float Tensor of shape [new_height, new_width, channels].
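A sketch of the convenience function, assuming the signature described in this version of the docs, where new_height and new_width are passed as separate arguments; the placeholder image tensor is an assumption.

import tensorflow as tf

image = tf.zeros([480, 640, 3], dtype=tf.uint8)  # placeholder 3-D image
# Height and width are passed separately; the method defaults to bilinear.
resized = tf.image.resize_images(image, 299, 299,
                                 method=tf.image.ResizeMethod.BILINEAR)
# Per the note above, `resized` is a float32 tensor of shape [299, 299, 3].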
Resize images to size using area interpolation.
Input images can be of different types but output images are always float.
A Tensor of type float32. 4-D with shape [batch, new_height, new_width, channels].
Resize images to size using bicubic interpolation.
Input images can be of different types but output images are always float.
A Tensor of type float32. 4-D with shape [batch, new_height, new_width, channels].
Resize images to size using bilinear interpolation.
Input images can be of different types but output images are always float.
A Tensor of type float32. 4-D with shape [batch, new_height, new_width, channels].
Resize images to size using nearest neighbor interpolation.
Input images can be of different types but output images are always float.
A Tensor. Has the same type as images. 4-D with shape [batch, new_height, new_width, channels].
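As a hedged sketch of one of the fixed-method Ops: the returned shapes above include a batch dimension, so this example adds one with tf.expand_dims; the size argument is assumed to be a 1-D [new_height, new_width] tensor.

import tensorflow as tf

image = tf.zeros([480, 640, 3], dtype=tf.uint8)   # placeholder single image
batch = tf.expand_dims(image, 0)                  # shape [1, 480, 640, 3]
resized = tf.image.resize_nearest_neighbor(batch, [299, 299])
# Unlike the other resize Ops, nearest neighbor keeps the input type
# (uint8 here); the result has shape [1, 299, 299, 3].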
Crops and/or pads an image to a target width and height.
Resizes an image to a target width and height by either centrally cropping the image or padding it evenly with zeros.
If width or height is greater than the specified target_width or target_height respectively, this op centrally crops along that dimension. If width or height is smaller than the specified target_width or target_height respectively, this op centrally pads with 0 along that dimension.
Cropped and/or padded image of shape [target_height, target_width, channels]
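A short sketch showing both behaviors at once, on a placeholder image that is taller than the target in one dimension and shorter in the other.

import tensorflow as tf

image = tf.zeros([480, 640, 3], dtype=tf.uint8)  # placeholder image
padded_or_cropped = tf.image.resize_image_with_crop_or_pad(image, 500, 300)
# Height 480 < 500, so the image is padded evenly with zeros along height;
# width 640 > 300, so it is centrally cropped along width.
# Result shape: [500, 300, 3].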
Pad image with zeros to the specified height and width.
Adds offset_height rows of zeros on top, offset_width columns of zeros on the left, and then pads the image on the bottom and right with zeros until it has dimensions target_height, target_width.
This op does nothing if offset_* is zero and the image already has size target_height by target_width.
3-D tensor of shape [target_height, target_width, channels]
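A minimal sketch with a placeholder image; the offsets and target size are arbitrary example values.

import tensorflow as tf

image = tf.zeros([100, 100, 3], dtype=tf.uint8)  # placeholder image
# Adds 10 rows of zeros on top and 20 columns of zeros on the left, then
# pads the bottom and right until the result is 200 x 200.
padded = tf.image.pad_to_bounding_box(image, 10, 20, 200, 200)
# Result shape: [200, 200, 3].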
Crops an image to a specified bounding box.
This op cuts a rectangular part out of image. The top-left corner of the returned image is at offset_height, offset_width in image, and its lower-right corner is at offset_height + target_height, offset_width + target_width.
3-D tensor of image with shape [target_height, target_width, channels]
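The matching cropping sketch, again with a placeholder image and arbitrary example offsets.

import tensorflow as tf

image = tf.zeros([200, 200, 3], dtype=tf.uint8)  # placeholder image
# Cuts out the rectangle whose top-left corner is at (50, 60) in `image`
# and whose size is 100 x 120.
cropped = tf.image.crop_to_bounding_box(image, 50, 60, 100, 120)
# Result shape: [100, 120, 3].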
Randomly crops image to size [target_height, target_width].
The offset of the output within image is uniformly random. image always fully contains the result.
A cropped 3-D tensor of shape [target_height, target_width, channels].
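A minimal sketch, assuming the tf.image.random_crop signature from this generation of the API, which takes a 1-D [target_height, target_width] size; the placeholder image is an assumption.

import tensorflow as tf

image = tf.zeros([200, 200, 3], dtype=tf.uint8)  # placeholder image
# The crop offset is chosen uniformly at random; the crop always lies
# entirely inside `image`.
cropped = tf.image.random_crop(image, [150, 150])
# Result shape: [150, 150, 3].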
Extracts a glimpse from the input tensor.
Returns a set of windows called glimpses extracted at location offsets from the input tensor. If the windows only partially overlap the input, the non-overlapping areas will be filled with random noise.
The result is a 4-D tensor of shape [batch_size, glimpse_height, glimpse_width, channels]. The channels and batch dimensions are the same as that of the input tensor. The height and width of the output windows are specified in the size parameter.
The arguments normalized and centered control how the windows are built:
If the coordinates are normalized but not centered, 0.0 and 1.0 correspond to the minimum and maximum of each height and width dimension.
If the coordinates are both normalized and centered, they range from -1.0 to 1.0. The coordinates (-1.0, -1.0) correspond to the upper left corner, the lower right corner is located at (1.0, 1.0) and the center is at (0, 0).
If the coordinates are not normalized, they are interpreted as numbers of pixels.
A Tensor of type float32. A tensor representing the glimpses [batch_size, glimpse_height, glimpse_width, channels].
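A sketch, assuming the op takes a 4-D float input, a 1-D [glimpse_height, glimpse_width] size, and per-image offsets of shape [batch_size, 2]; the placeholder tensors are assumptions.

import tensorflow as tf

images = tf.zeros([4, 100, 100, 3], dtype=tf.float32)  # placeholder batch
# One (y, x) offset per image in the batch; with centered=True and
# normalized=True, (0.0, 0.0) is the center of each input image.
offsets = tf.zeros([4, 2], dtype=tf.float32)
glimpses = tf.image.extract_glimpse(images, [20, 20], offsets,
                                    centered=True, normalized=True)
# Result shape: [4, 20, 20, 3].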
Flip an image vertically (upside down).
Outputs the contents of image flipped along the first dimension, which is height.
See also reverse().
A 3-D tensor of the same type and shape as image.
Randomly flips an image vertically (upside down).
With a 1 in 2 chance, outputs the contents of image flipped along the first dimension, which is height. Otherwise output the image as-is.
A 3-D tensor of the same type and shape as image.
Flip an image horizontally (left to right).
Outputs the contents of image flipped along the second dimension, which is width.
See also reverse().
A 3-D tensor of the same type and shape as image.
Randomly flip an image horizontally (left to right).
With a 1 in 2 chance, outputs the contents of image flipped along the second dimension, which is width. Otherwise output the image as-is.
A 3-D tensor of the same type and shape as image.
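As an illustration of how the flip Ops above are typically combined, for example for data augmentation (a sketch with a placeholder image, not prescribed by the reference):

import tensorflow as tf

image = tf.zeros([100, 100, 3], dtype=tf.uint8)  # placeholder image
flipped_up_down = tf.image.flip_up_down(image)        # deterministic vertical flip
flipped_left_right = tf.image.flip_left_right(image)  # deterministic horizontal flip
# Random variants flip with probability 1/2, which is useful for augmentation.
augmented = tf.image.random_flip_left_right(tf.image.random_flip_up_down(image))
# All results have the same type and shape as `image`.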
Transpose an image by swapping the first and second dimension.
See also transpose().
A 3-D tensor of shape [width, height, channels]
TensorFlow provides functions to adjust images in various ways: brightness, contrast, hue, and saturation. Each adjustment can be done with predefined parameters or with random parameters picked from predefined intervals. Random adjustments are often useful to expand a training set and reduce overfitting.
Adjust the brightness of RGB or Grayscale images.
The value delta is added to all components of the tensor image. image and delta are cast to float before adding, and the resulting values are clamped to [min_value, max_value]. Finally, the result is cast back to image.dtype.
If min_value or max_value are not given, they are set to the minimum and maximum allowed values for image.dtype respectively.
A tensor of the same shape and type as image.
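A minimal sketch following the behaviour described above, with min_value and max_value left at their defaults; the placeholder image and the chosen delta are assumptions.

import tensorflow as tf

image = tf.zeros([100, 100, 3], dtype=tf.uint8)  # placeholder image
# Adds 32 to every component; values are clamped to the valid range for
# uint8 (the default min_value/max_value) and cast back to uint8.
brighter = tf.image.adjust_brightness(image, delta=32)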
Adjust the brightness of images by a random factor.
Equivalent to adjust_brightness() using a delta randomly picked in the interval [-max_delta, max_delta).
Note that delta is picked as a float. Because the brightness-adjusted result is rounded before casting for integer-type images, integer images may have modifications in the range [-max_delta, max_delta].
3-D tensor of images of shape [height, width, channels]
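The random counterpart, as a short sketch on a placeholder image.

import tensorflow as tf

image = tf.zeros([100, 100, 3], dtype=tf.uint8)  # placeholder image
# delta is drawn uniformly from [-max_delta, max_delta); following the
# description above, it is applied in the image's own value range.
jittered = tf.image.random_brightness(image, max_delta=25)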
Adjust contrast of RGB or grayscale images.
images is a tensor of at least 3 dimensions. The last 3 dimensions are interpreted as [height, width, channels]. The other dimensions only represent a collection of images, such as [batch, height, width, channels].
Contrast is adjusted independently for each channel of each image.
For each channel, this Op first computes the mean of the image pixels in the channel and then adjusts each component x of each pixel to (x - mean) * contrast_factor + mean.
The adjusted values are then clipped to fit in the [min_value, max_value] interval. If min_value or max_value is not given, it is replaced with the minimum and maximum values for the data type of images respectively.
The contrast-adjusted image is always computed as float, and it is cast back to its original type after clipping.
The contrast-adjusted image or images.
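A minimal sketch with a placeholder image; the contrast factor is an arbitrary example value.

import tensorflow as tf

image = tf.zeros([100, 100, 3], dtype=tf.uint8)  # placeholder image
# For each channel, pixel values move away from (factor > 1) or towards
# (factor < 1) the channel mean: (x - mean) * contrast_factor + mean.
higher_contrast = tf.image.adjust_contrast(image, contrast_factor=2.0)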
Adjust the contrast of an image by a random factor.
Equivalent to adjust_contrast() but uses a contrast_factor randomly picked in the interval [lower, upper].
3-D tensor of shape [height, width, channels].
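The random counterpart, again as a sketch on a placeholder image; lower and upper are example values.

import tensorflow as tf

image = tf.zeros([100, 100, 3], dtype=tf.uint8)  # placeholder image
# contrast_factor is drawn uniformly from [lower, upper].
jittered = tf.image.random_contrast(image, lower=0.5, upper=1.5)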
Linearly scales image to have zero mean and unit norm.
This op computes (x - mean) / adjusted_stddev, where mean is the average of all values in image, and adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements())).
stddev is the standard deviation of all values in image. It is capped away from zero to protect against division by 0 when handling uniform images.
Note that this implementation is limited:
It only whitens based on the statistics of an individual image.
It does not take into account the covariance structure.
The whitened image with the same shape as image.
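A minimal sketch, assuming the op is exposed as tf.image.per_image_whitening in this version of the API; the placeholder image is an assumption.

import tensorflow as tf

image = tf.zeros([100, 100, 3], dtype=tf.uint8)  # placeholder image
# Scales the image to zero mean and approximately unit norm:
# (x - mean) / max(stddev, 1.0 / sqrt(num_elements)).
whitened = tf.image.per_image_whitening(image)
# `whitened` is a float tensor with the same shape as `image`.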