# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
16"""Image ops.
18The `tf.image` module contains various functions for image
19processing and decoding-encoding Ops.
21Many of the encoding/decoding functions are also available in the
22core `tf.io` module.
24## Image processing
26### Resizing
28The resizing Ops accept input images as tensors of several types. They always
29output resized images as float32 tensors.
31The convenience function `tf.image.resize` supports both 4-D
32and 3-D tensors as input and output. 4-D tensors are for batches of images,
333-D tensors for individual images.
35Resized images will be distorted if their original aspect ratio is not the
36same as size. To avoid distortions see tf.image.resize_with_pad.
38* `tf.image.resize`
39* `tf.image.resize_with_pad`
40* `tf.image.resize_with_crop_or_pad`
42The Class `tf.image.ResizeMethod` provides various resize methods like
43`bilinear`, `nearest_neighbor`.
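
For example, a minimal sketch (assuming `tf` has been imported and `image` is
a 3-D `uint8` image tensor; the 224x224 target size is arbitrary):

    resized = tf.image.resize(image, [224, 224])         # float32, may distort
    padded = tf.image.resize_with_pad(image, 224, 224)   # pads to keep aspect ratio
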
### Converting Between Colorspaces

Image ops work either on individual images or on batches of images, depending
on the shape of their input Tensor.

If 3-D, the shape is `[height, width, channels]`, and the Tensor represents one
image. If 4-D, the shape is `[batch_size, height, width, channels]`, and the
Tensor represents `batch_size` images.

Currently, `channels` can usefully be 1, 2, 3, or 4. Single-channel images are
grayscale, while images with 3 channels are encoded as either RGB or HSV.
Images with 2 or 4 channels include an alpha channel, which has to be stripped
from the image before passing the image to most image processing functions
(and can be re-attached later).

Internally, images are stored either as one `float32` per channel per pixel
(implicitly, values are assumed to lie in `[0,1)`) or as one `uint8` per
channel per pixel (values are assumed to lie in `[0,255]`).

TensorFlow can convert images between RGB, HSV, YIQ, and YUV.

* `tf.image.rgb_to_grayscale`, `tf.image.grayscale_to_rgb`
* `tf.image.rgb_to_hsv`, `tf.image.hsv_to_rgb`
* `tf.image.rgb_to_yiq`, `tf.image.yiq_to_rgb`
* `tf.image.rgb_to_yuv`, `tf.image.yuv_to_rgb`
* `tf.image.image_gradients`
* `tf.image.convert_image_dtype`
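
As an illustrative sketch (assuming `tf` has been imported and `image` is an
RGB tensor of dtype `uint8`):

    # Colorspace conversions expect float images with values in [0, 1).
    image = tf.image.convert_image_dtype(image, tf.float32)
    hsv = tf.image.rgb_to_hsv(image)
    gray = tf.image.rgb_to_grayscale(image)  # channels: 3 -> 1
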
### Image Adjustments

TensorFlow provides functions to adjust images in various ways: brightness,
contrast, hue, and saturation. Each adjustment can be done with predefined
parameters or with random parameters picked from predefined intervals. Random
adjustments are often useful to expand a training set and reduce overfitting.

If several adjustments are chained, it is advisable to minimize the number of
redundant conversions by first converting the images to the most natural data
type and representation.

* `tf.image.adjust_brightness`
* `tf.image.adjust_contrast`
* `tf.image.adjust_gamma`
* `tf.image.adjust_hue`
* `tf.image.adjust_jpeg_quality`
* `tf.image.adjust_saturation`
* `tf.image.random_brightness`
* `tf.image.random_contrast`
* `tf.image.random_hue`
* `tf.image.random_saturation`
* `tf.image.per_image_standardization`
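
For example, a minimal sketch of chained adjustments (assuming `tf` has been
imported and `image` is a 3-D image tensor; the adjustment values are
arbitrary):

    # Convert once up front, then chain adjustments in float32.
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.adjust_brightness(image, delta=0.1)
    image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
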
### Working with Bounding Boxes

* `tf.image.draw_bounding_boxes`
* `tf.image.combined_non_max_suppression`
* `tf.image.generate_bounding_box_proposals`
* `tf.image.non_max_suppression`
* `tf.image.non_max_suppression_overlaps`
* `tf.image.non_max_suppression_padded`
* `tf.image.non_max_suppression_with_scores`
* `tf.image.pad_to_bounding_box`
* `tf.image.sample_distorted_bounding_box`

### Cropping

* `tf.image.central_crop`
* `tf.image.crop_and_resize`
* `tf.image.crop_to_bounding_box`
* `tf.io.decode_and_crop_jpeg`
* `tf.image.extract_glimpse`
* `tf.image.random_crop`
* `tf.image.resize_with_crop_or_pad`

### Flipping, Rotating and Transposing

* `tf.image.flip_left_right`
* `tf.image.flip_up_down`
* `tf.image.random_flip_left_right`
* `tf.image.random_flip_up_down`
* `tf.image.rot90`
* `tf.image.transpose`

## Image decoding and encoding

TensorFlow provides Ops to decode and encode JPEG and PNG formats. Encoded
images are represented by scalar string Tensors, decoded images by 3-D uint8
tensors of shape `[height, width, channels]`. (PNG also supports uint16.)

Note: `decode_gif` returns a 4-D array `[num_frames, height, width, 3]`.

The encode and decode Ops apply to one image at a time. Their inputs and
outputs are of variable size. If you need fixed-size images, pass the output
of the decode Ops to one of the cropping and resizing Ops (see the sketch
after the list below).

* `tf.io.decode_bmp`
* `tf.io.decode_gif`
* `tf.io.decode_image`
* `tf.io.decode_jpeg`
* `tf.io.decode_and_crop_jpeg`
* `tf.io.decode_png`
* `tf.io.encode_jpeg`
* `tf.io.encode_png`
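
A minimal end-to-end sketch (assuming `tf` has been imported; the file names
and the 256x256 target size are hypothetical):

    contents = tf.io.read_file("input.jpg")
    image = tf.io.decode_jpeg(contents, channels=3)             # uint8, HWC
    image = tf.image.resize_with_crop_or_pad(image, 256, 256)   # fixed size
    tf.io.write_file("output.png", tf.io.encode_png(image))
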
148"""
from tensorflow.python.framework import constant_op
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import gen_image_ops
from tensorflow.python.ops import linalg_ops
# go/tf-wildcard-import
# pylint: disable=wildcard-import
from tensorflow.python.ops.gen_image_ops import *
from tensorflow.python.ops.image_ops_impl import *
# pylint: enable=wildcard-import

# TODO(drpng): remove these once internal use has discontinued.
# pylint: disable=unused-import
from tensorflow.python.ops.image_ops_impl import _Check3DImage
from tensorflow.python.ops.image_ops_impl import _ImageDimensions
# pylint: enable=unused-import

_IMAGE_DTYPES = frozenset([
    dtypes.uint8, dtypes.int32, dtypes.int64, dtypes.float16, dtypes.float32,
    dtypes.float64
])


def flat_transforms_to_matrices(transforms):
  """Converts `tf.contrib.image` projective transforms to affine matrices.

  Note that the output matrices map output coordinates to input coordinates.
  For the forward transformation matrix, call `tf.linalg.inv` on the result.

  Args:
    transforms: Vector of length 8, or batches of transforms with shape
      `(N, 8)`.

  Returns:
    3D tensor of matrices with shape `(N, 3, 3)`. The output matrices map the
    *output coordinates* (in homogeneous coordinates) of each transform to the
    corresponding *input coordinates*.

  Raises:
    ValueError: If `transforms` have an invalid shape.
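
  Example (an illustrative sketch; the flat transform
  `[1, 0, 0, 0, 1, 0, 0, 0]` is the identity mapping, so the result is a
  single 3x3 identity matrix):

    matrices = flat_transforms_to_matrices([1., 0., 0., 0., 1., 0., 0., 0.])
    # matrices has shape (1, 3, 3).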
190 """
191 with ops.name_scope("flat_transforms_to_matrices"):
192 transforms = ops.convert_to_tensor(transforms, name="transforms")
193 if transforms.shape.ndims not in (1, 2):
194 raise ValueError("Transforms should be 1D or 2D, got: %s" % transforms)
195 # Make the transform(s) 2D in case the input is a single transform.
196 transforms = array_ops.reshape(transforms, constant_op.constant([-1, 8]))
197 num_transforms = array_ops.shape(transforms)[0]
198 # Add a column of ones for the implicit last entry in the matrix.
199 return array_ops.reshape(
200 array_ops.concat(
201 [transforms, array_ops.ones([num_transforms, 1])], axis=1),
202 constant_op.constant([-1, 3, 3]))


def matrices_to_flat_transforms(transform_matrices):
  """Converts affine matrices to `tf.contrib.image` projective transforms.

  Note that we expect matrices that map output coordinates to input
  coordinates. To convert forward transformation matrices, call
  `tf.linalg.inv` on the matrices and use the result here.

  Args:
    transform_matrices: One or more affine transformation matrices, for the
      reverse transformation in homogeneous coordinates. Shape `(3, 3)` or
      `(N, 3, 3)`.

  Returns:
    2D tensor of flat transforms with shape `(N, 8)`, which may be passed into
    `tf.contrib.image.transform`.

  Raises:
    ValueError: If `transform_matrices` have an invalid shape.
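
  Example (an illustrative sketch; the 3x3 identity matrix maps to the
  identity flat transform `[1, 0, 0, 0, 1, 0, 0, 0]`):

    transforms = matrices_to_flat_transforms([[1., 0., 0.],
                                              [0., 1., 0.],
                                              [0., 0., 1.]])
    # transforms has shape (1, 8).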
223 """
224 with ops.name_scope("matrices_to_flat_transforms"):
225 transform_matrices = ops.convert_to_tensor(
226 transform_matrices, name="transform_matrices")
227 if transform_matrices.shape.ndims not in (2, 3):
228 raise ValueError("Matrices should be 2D or 3D, got: %s" %
229 transform_matrices)
230 # Flatten each matrix.
231 transforms = array_ops.reshape(transform_matrices,
232 constant_op.constant([-1, 9]))
233 # Divide each matrix by the last entry (normally 1).
234 transforms /= transforms[:, 8:9]
235 return transforms[:, :8]


@ops.RegisterGradient("ImageProjectiveTransformV2")
def _image_projective_transform_grad(op, grad):
  """Computes the gradient for ImageProjectiveTransformV2."""
  images = op.inputs[0]
  transforms = op.inputs[1]
  interpolation = op.get_attr("interpolation")
  fill_mode = op.get_attr("fill_mode")

  image_or_images = ops.convert_to_tensor(images, name="images")
  transform_or_transforms = ops.convert_to_tensor(
      transforms, name="transforms", dtype=dtypes.float32)

  if image_or_images.dtype.base_dtype not in _IMAGE_DTYPES:
    raise TypeError("Invalid dtype %s." % image_or_images.dtype)
  if len(transform_or_transforms.get_shape()) == 1:
    transforms = transform_or_transforms[None]
  elif len(transform_or_transforms.get_shape()) == 2:
    transforms = transform_or_transforms
  else:
    raise TypeError("Transforms should have rank 1 or 2.")

  # Invert the transformations: the gradient with respect to the input image
  # is approximated by warping the incoming gradient back through the inverse
  # of each transform.
  transforms = flat_transforms_to_matrices(transforms=transforms)
  inverse = linalg_ops.matrix_inverse(transforms)
  transforms = matrices_to_flat_transforms(inverse)
  output = gen_image_ops.image_projective_transform_v2(
      images=grad,
      transforms=transforms,
      output_shape=array_ops.shape(image_or_images)[1:3],
      interpolation=interpolation,
      fill_mode=fill_mode)
  return [output, None, None]


@ops.RegisterGradient("ImageProjectiveTransformV3")
def _image_projective_transform_v3_grad(op, grad):
  """Computes the gradient for ImageProjectiveTransformV3."""
  images = op.inputs[0]
  transforms = op.inputs[1]
  interpolation = op.get_attr("interpolation")
  fill_mode = op.get_attr("fill_mode")

  image_or_images = ops.convert_to_tensor(images, name="images")
  transform_or_transforms = ops.convert_to_tensor(
      transforms, name="transforms", dtype=dtypes.float32)

  if image_or_images.dtype.base_dtype not in _IMAGE_DTYPES:
    raise TypeError("Invalid dtype %s." % image_or_images.dtype)
  if len(transform_or_transforms.get_shape()) == 1:
    transforms = transform_or_transforms[None]
  elif len(transform_or_transforms.get_shape()) == 2:
    transforms = transform_or_transforms
  else:
    raise TypeError("Transforms should have rank 1 or 2.")

  # Invert the transformations, as in the V2 gradient above: warp the incoming
  # gradient back through the inverse of each transform.
  transforms = flat_transforms_to_matrices(transforms=transforms)
  inverse = linalg_ops.matrix_inverse(transforms)
  transforms = matrices_to_flat_transforms(inverse)
  output = gen_image_ops.image_projective_transform_v3(
      images=grad,
      transforms=transforms,
      output_shape=array_ops.shape(image_or_images)[1:3],
      interpolation=interpolation,
      fill_mode=fill_mode,
      fill_value=0.0)
  return [output, None, None, None]