Coverage for /pythoncovmergedfiles/medio/medio/usr/local/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py: 24%
1348 statements
« prev ^ index » next coverage.py v7.4.0, created at 2024-01-03 07:57 +0000
« prev ^ index » next coverage.py v7.4.0, created at 2024-01-03 07:57 +0000
1# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
2#
3# Licensed under the Apache License, Version 2.0 (the "License");
4# you may not use this file except in compliance with the License.
5# You may obtain a copy of the License at
6#
7# http://www.apache.org/licenses/LICENSE-2.0
8#
9# Unless required by applicable law or agreed to in writing, software
10# distributed under the License is distributed on an "AS IS" BASIS,
11# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12# See the License for the specific language governing permissions and
13# limitations under the License.
14# ==============================================================================
15"""Primitive Neural Net (NN) Operations.
17## Notes on padding
19Several neural network operations, such as `tf.nn.conv2d` and
20`tf.nn.max_pool2d`, take a `padding` parameter, which controls how the input is
21padded before running the operation. The input is padded by inserting values
22(typically zeros) before and after the tensor in each spatial dimension. The
23`padding` parameter can either be the string `'VALID'`, which means use no
24padding, or `'SAME'` which adds padding according to a formula which is
25described below. Certain ops also allow the amount of padding per dimension to
26be explicitly specified by passing a list to `padding`.
28In the case of convolutions, the input is padded with zeros. In case of pools,
29the padded input values are ignored. For example, in a max pool, the sliding
30window ignores padded values, which is equivalent to the padded values being
31`-infinity`.
33### `'VALID'` padding
35Passing `padding='VALID'` to an op causes no padding to be used. This causes the
36output size to typically be smaller than the input size, even when the stride is
37one. In the 2D case, the output size is computed as:
39```python
40out_height = ceil((in_height - filter_height + 1) / stride_height)
41out_width = ceil((in_width - filter_width + 1) / stride_width)
42```
44The 1D and 3D cases are similar. Note `filter_height` and `filter_width` refer
45to the filter size after dilations (if any) for convolutions, and refer to the
46window size for pools.
48### `'SAME'` padding
50With `'SAME'` padding, padding is applied to each spatial dimension. When the
51strides are 1, the input is padded such that the output size is the same as the
52input size. In the 2D case, the output size is computed as:
54```python
55out_height = ceil(in_height / stride_height)
56out_width = ceil(in_width / stride_width)
57```
59The amount of padding used is the smallest amount that results in the output
60size. The formula for the total amount of padding per dimension is:
62```python
63if (in_height % strides[1] == 0):
64 pad_along_height = max(filter_height - stride_height, 0)
65else:
66 pad_along_height = max(filter_height - (in_height % stride_height), 0)
67if (in_width % strides[2] == 0):
68 pad_along_width = max(filter_width - stride_width, 0)
69else:
70 pad_along_width = max(filter_width - (in_width % stride_width), 0)
71```
73Finally, the padding on the top, bottom, left and right are:
75```python
76pad_top = pad_along_height // 2
77pad_bottom = pad_along_height - pad_top
78pad_left = pad_along_width // 2
79pad_right = pad_along_width - pad_left
80```
82Note that the division by 2 means that there might be cases when the padding on
83both sides (top vs bottom, right vs left) are off by one. In this case, the
84bottom and right sides always get the one additional padded pixel. For example,
85when pad_along_height is 5, we pad 2 pixels at the top and 3 pixels at the
86bottom. Note that this is different from existing libraries such as PyTorch and
87Caffe, which explicitly specify the number of padded pixels and always pad the
88same number of pixels on both sides.
90Here is an example of `'SAME'` padding:
92>>> in_height = 5
93>>> filter_height = 3
94>>> stride_height = 2
95>>>
96>>> in_width = 2
97>>> filter_width = 2
98>>> stride_width = 1
99>>>
100>>> inp = tf.ones((2, in_height, in_width, 2))
101>>> filter = tf.ones((filter_height, filter_width, 2, 2))
102>>> strides = [stride_height, stride_width]
103>>> output = tf.nn.conv2d(inp, filter, strides, padding='SAME')
104>>> output.shape[1] # output_height: ceil(5 / 2)
1053
106>>> output.shape[2] # output_width: ceil(2 / 1)
1072
109### Explicit padding
111Certain ops, like `tf.nn.conv2d`, also allow a list of explicit padding amounts
112to be passed to the `padding` parameter. This list is in the same format as what
113is passed to `tf.pad`, except the padding must be a nested list, not a tensor.
114For example, in the 2D case, the list is in the format `[[0, 0], [pad_top,
115pad_bottom], [pad_left, pad_right], [0, 0]]` when `data_format` is its default
116value of `'NHWC'`. The two `[0, 0]` pairs indicate the batch and channel
117dimensions have no padding, which is required, as only spatial dimensions can
118have padding.
120For example:
122>>> inp = tf.ones((1, 3, 3, 1))
123>>> filter = tf.ones((2, 2, 1, 1))
124>>> strides = [1, 1]
125>>> padding = [[0, 0], [1, 2], [0, 1], [0, 0]]
126>>> output = tf.nn.conv2d(inp, filter, strides, padding=padding)
127>>> tuple(output.shape)
128(1, 5, 3, 1)
129>>> # Equivalently, tf.pad can be used, since convolutions pad with zeros.
130>>> inp = tf.pad(inp, padding)
131>>> # 'VALID' means to use no padding in conv2d (we already padded inp)
132>>> output2 = tf.nn.conv2d(inp, filter, strides, padding='VALID')
133>>> tf.debugging.assert_equal(output, output2)
135### Difference between convolution and pooling layers
136How padding is used in convolution layers and pooling layers is different. For
137convolution layers, padding is filled with values of zero, and padding is
138multiplied with kernels. For pooling layers, padding is excluded from the
139computation. For example when applying average pooling to a 4x4 grid, how much
140padding is added will not impact the output. Here is an example that
141demonstrates the difference.
143>>> x_in = np.array([[
144... [[2], [2]],
145... [[1], [1]],
146... [[1], [1]]]])
147>>> kernel_in = np.array([ # simulate the avg_pool with conv2d
148... [ [[0.25]], [[0.25]] ],
149... [ [[0.25]], [[0.25]] ]])
150>>> x = tf.constant(x_in, dtype=tf.float32)
151>>> kernel = tf.constant(kernel_in, dtype=tf.float32)
152>>> conv_out = tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding='SAME')
153>>> pool_out = tf.nn.avg_pool(x, [2, 2], strides=[1, 1, 1, 1], padding='SAME')
154>>> print(conv_out.shape, pool_out.shape)
155(1, 3, 2, 1) (1, 3, 2, 1)
156>>> tf.reshape(conv_out, [3, 2]).numpy() # conv2d takes account of padding
157array([[1.5 , 0.75],
158 [1. , 0.5 ],
159 [0.5 , 0.25]], dtype=float32)
160>>> tf.reshape(pool_out, [3, 2]).numpy() # avg_pool excludes padding
161array([[1.5, 1.5],
162 [1. , 1. ],
163 [1. , 1. ]], dtype=float32)
165"""
167import functools
168import numbers
170import numpy as np
172from tensorflow.python.eager import context
173from tensorflow.python.framework import config
174from tensorflow.python.framework import constant_op
175from tensorflow.python.framework import dtypes
176from tensorflow.python.framework import errors_impl
177from tensorflow.python.framework import graph_util
178from tensorflow.python.framework import ops
179from tensorflow.python.framework import random_seed
180from tensorflow.python.framework import tensor_shape
181from tensorflow.python.framework import tensor_util
182from tensorflow.python.ops import array_ops
183from tensorflow.python.ops import array_ops_stack
184from tensorflow.python.ops import check_ops
185from tensorflow.python.ops import gen_math_ops
186from tensorflow.python.ops import gen_nn_ops
187from tensorflow.python.ops import math_ops
188from tensorflow.python.ops import random_ops
189from tensorflow.python.ops import stateless_random_ops
190from tensorflow.python.ops import variables as variables_lib
191# go/tf-wildcard-import
192# pylint: disable=wildcard-import
193from tensorflow.python.ops.gen_nn_ops import *
194# pylint: enable=wildcard-import
195from tensorflow.python.platform import device_context
196from tensorflow.python.util import deprecation
197from tensorflow.python.util import dispatch
198from tensorflow.python.util.compat import collections_abc
199from tensorflow.python.util.deprecation import deprecated_args
200from tensorflow.python.util.deprecation import deprecated_argument_lookup
202from tensorflow.python.util.tf_export import tf_export
204# Aliases for some automatically-generated names.
205local_response_normalization = gen_nn_ops.lrn
207# pylint: disable=protected-access
208# pylint: disable=g-classes-have-attributes
210# Acceptable channels last formats (robust to H, W, D order).
211_CHANNELS_LAST_FORMATS = frozenset({
212 "NWC", "NHC", "NHWC", "NWHC", "NDHWC", "NDWHC", "NHDWC", "NHWDC", "NWDHC",
213 "NWHDC"
214})
217def _get_sequence(value, n, channel_index, name):
218 """Formats a value input for gen_nn_ops."""
219 # Performance is fast-pathed for common cases:
220 # `None`, `list`, `tuple` and `int`.
221 if value is None:
222 return [1] * (n + 2)
224 # Always convert `value` to a `list`.
225 if isinstance(value, list):
226 pass
227 elif isinstance(value, tuple):
228 value = list(value)
229 elif isinstance(value, int):
230 value = [value]
231 elif not isinstance(value, collections_abc.Sized):
232 value = [value]
233 else:
234 value = list(value) # Try casting to a list.
236 len_value = len(value)
238 # Fully specified, including batch and channel dims.
239 if len_value == n + 2:
240 return value
242 # Apply value to spatial dims only.
243 if len_value == 1:
244 value = value * n # Broadcast to spatial dimensions.
245 elif len_value != n:
246 raise ValueError(f"{name} should be of length 1, {n} or {n + 2}. "
247 f"Received: {name}={value} of length {len_value}")
249 # Add batch and channel dims (always 1).
250 if channel_index == 1:
251 return [1, 1] + value
252 else:
253 return [1] + value + [1]
256def _non_atrous_convolution(
257 input, # pylint: disable=redefined-builtin
258 filter, # pylint: disable=redefined-builtin
259 padding,
260 data_format=None, # pylint: disable=redefined-builtin
261 strides=None,
262 name=None):
263 """Computes sums of N-D convolutions (actually cross correlation).
265 It is required that 1 <= N <= 3.
267 This is used to implement the more generic `convolution` function, which
268 extends the interface of this function with a `dilation_rate` parameter.
270 Args:
272 input: Rank N+2 tensor of type T of shape
273 `[batch_size] + input_spatial_shape + [in_channels]` if `data_format`
274 does not start with `"NC"`, or
275 `[batch_size, in_channels] + input_spatial_shape` if `data_format` starts
276 with `"NC"`.
277 filter: Rank N+2 tensor of type T of shape
278 `filter_spatial_shape + [in_channels, out_channels]`. Rank of either
279 `input` or `filter` must be known.
280 padding: Padding method to use, must be either "VALID" or "SAME".
281 data_format: A string or None. Specifies whether the channel dimension of
282 the `input` and output is the last dimension (default, or if `data_format`
283 does not start with "NC"), or the second dimension (if `data_format`
284 starts with "NC"). For N=1, the valid values are "NWC" (default) and
285 "NCW". For N=2, the valid values are "NHWC" (default) and "NCHW".
286 For N=3, the valid values are "NDHWC" (default) and "NCDHW".
287 strides: Sequence of N positive integers, defaults to `[1] * N`.
288 name: Name prefix to use.
290 Returns:
291 Rank N+2 tensor of type T of shape
292 `[batch_size] + output_spatial_shape + [out_channels]`, where
293 if padding == "SAME":
294 output_spatial_shape = input_spatial_shape
295 if padding == "VALID":
296 output_spatial_shape = input_spatial_shape - filter_spatial_shape + 1.
298 Raises:
299 ValueError: if ranks are incompatible.
301 """
302 with ops.name_scope(name, "non_atrous_convolution", [input, filter]) as scope:
303 input = ops.convert_to_tensor(input, name="input") # pylint: disable=redefined-builtin
304 input_shape = input.shape
305 filter = ops.convert_to_tensor(filter, name="filter") # pylint: disable=redefined-builtin
306 filter_shape = filter.shape
307 op = _NonAtrousConvolution(
308 input_shape,
309 filter_shape=filter_shape,
310 padding=padding,
311 data_format=data_format,
312 strides=strides,
313 name=scope)
314 return op(input, filter)
317class _NonAtrousConvolution:
318 """Helper class for _non_atrous_convolution.
320 Note that this class assumes that shapes of input and filter passed to
321 `__call__` are compatible with `input_shape` and filter_shape passed to the
322 constructor.
324 Args:
325 input_shape: static input shape, i.e. input.shape.
326 filter_shape: static filter shape, i.e. filter.shape.
327 padding: see _non_atrous_convolution.
328 data_format: see _non_atrous_convolution.
329 strides: see _non_atrous_convolution.
330 name: see _non_atrous_convolution.
331 num_batch_dims: (Optional.) The number of batch dimensions in the input;
332 if not provided, the default of `1` is used.
333 """
335 def __init__(
336 self,
337 input_shape,
338 filter_shape,
339 padding,
340 data_format=None,
341 strides=None,
342 name=None,
343 num_batch_dims=1):
344 # filter shape is always rank num_spatial_dims + 2
345 # and num_spatial_dims == input_shape.ndims - num_batch_dims - 1
346 if input_shape.ndims is not None:
347 filter_shape = filter_shape.with_rank(
348 input_shape.ndims - num_batch_dims + 1)
349 self.padding = padding
350 self.name = name
351 # input shape is == num_spatial_dims + num_batch_dims + 1
352 # and filter_shape is always rank num_spatial_dims + 2
353 if filter_shape.ndims is not None:
354 input_shape = input_shape.with_rank(
355 filter_shape.ndims + num_batch_dims - 1)
356 if input_shape.ndims is None:
357 raise ValueError(
358 "Rank of convolution must be known. "
359 f"Received: input_shape={input_shape} of rank {input_shape.rank}")
360 if input_shape.ndims < 3 or input_shape.ndims - num_batch_dims + 1 > 5:
361 raise ValueError(
362 "`input_shape.rank - num_batch_dims + 1` must be at least 3 and at "
363 f"most 5. Received: input_shape.rank={input_shape.rank} and "
364 f"num_batch_dims={num_batch_dims}")
365 conv_dims = input_shape.ndims - num_batch_dims - 1
366 if strides is None:
367 strides = [1] * conv_dims
368 elif len(strides) != conv_dims:
369 raise ValueError(
370 f"`len(strides)` should be {conv_dims}. "
371 f"Received: strides={strides} of length {len(strides)}")
372 if conv_dims == 1:
373 # conv1d uses the 2-d data format names
374 if data_format is None:
375 data_format = "NWC"
376 elif data_format not in {"NCW", "NWC", "NCHW", "NHWC"}:
377 raise ValueError("`data_format` must be 'NWC' or 'NCW'. "
378 f"Received: data_format={data_format}")
379 self.strides = strides[0]
380 self.data_format = data_format
381 self.conv_op = self._conv1d
382 elif conv_dims == 2:
383 if data_format is None or data_format == "NHWC":
384 data_format = "NHWC"
385 strides = [1] + list(strides) + [1]
386 elif data_format == "NCHW":
387 strides = [1, 1] + list(strides)
388 else:
389 raise ValueError("`data_format` must be 'NHWC' or 'NCHW'. "
390 f"Received: data_format={data_format}")
391 self.strides = strides
392 self.data_format = data_format
393 self.conv_op = conv2d
394 elif conv_dims == 3:
395 if data_format is None or data_format == "NDHWC":
396 strides = [1] + list(strides) + [1]
397 elif data_format == "NCDHW":
398 strides = [1, 1] + list(strides)
399 else:
400 raise ValueError("`data_format` must be 'NDHWC' or 'NCDHW'. "
401 f"Received: data_format={data_format}")
402 self.strides = strides
403 self.data_format = data_format
404 self.conv_op = _conv3d_expanded_batch
406 # Note that we need this adapter since argument names for conv1d don't match
407 # those for gen_nn_ops.conv2d and gen_nn_ops.conv3d.
408 # pylint: disable=redefined-builtin
409 def _conv1d(self, input, filter, strides, padding, data_format, name):
410 return conv1d(
411 value=input,
412 filters=filter,
413 stride=strides,
414 padding=padding,
415 data_format=data_format,
416 name=name)
417 # pylint: enable=redefined-builtin
419 def __call__(self, inp, filter): # pylint: disable=redefined-builtin
420 return self.conv_op(
421 input=inp,
422 filter=filter,
423 strides=self.strides,
424 padding=self.padding,
425 data_format=self.data_format,
426 name=self.name)
429def squeeze_batch_dims(inp, op, inner_rank, name=None):
430 """Returns `unsqueeze_batch(op(squeeze_batch(inp)))`.
432 Where `squeeze_batch` reshapes `inp` to shape
433 `[prod(inp.shape[:-inner_rank])] + inp.shape[-inner_rank:]`
434 and `unsqueeze_batch` does the reverse reshape but on the output.
436 Args:
437 inp: A tensor with dims `batch_shape + inner_shape` where `inner_shape`
438 is length `inner_rank`.
439 op: A callable that takes a single input tensor and returns a single.
440 output tensor.
441 inner_rank: A python integer.
442 name: A string.
444 Returns:
445 `unsqueeze_batch_op(squeeze_batch(inp))`.
446 """
447 with ops.name_scope(name, "squeeze_batch_dims", [inp]):
448 inp = ops.convert_to_tensor(inp, name="input")
449 shape = inp.shape
451 inner_shape = shape[-inner_rank:]
452 if not inner_shape.is_fully_defined():
453 inner_shape = array_ops.shape(inp)[-inner_rank:]
455 batch_shape = shape[:-inner_rank]
456 if not batch_shape.is_fully_defined():
457 batch_shape = array_ops.shape(inp)[:-inner_rank]
459 if isinstance(inner_shape, tensor_shape.TensorShape):
460 inp_reshaped = array_ops.reshape(inp, [-1] + inner_shape.as_list())
461 else:
462 inp_reshaped = array_ops.reshape(
463 inp, array_ops.concat(([-1], inner_shape), axis=-1))
465 out_reshaped = op(inp_reshaped)
467 out_inner_shape = out_reshaped.shape[-inner_rank:]
468 if not out_inner_shape.is_fully_defined():
469 out_inner_shape = array_ops.shape(out_reshaped)[-inner_rank:]
471 out = array_ops.reshape(
472 out_reshaped, array_ops.concat((batch_shape, out_inner_shape), axis=-1))
474 out.set_shape(inp.shape[:-inner_rank] + out.shape[-inner_rank:])
475 return out
478@tf_export("nn.dilation2d", v1=[])
479@dispatch.add_dispatch_support
480def dilation2d_v2(
481 input, # pylint: disable=redefined-builtin
482 filters, # pylint: disable=redefined-builtin
483 strides,
484 padding,
485 data_format,
486 dilations,
487 name=None):
488 """Computes the grayscale dilation of 4-D `input` and 3-D `filters` tensors.
490 The `input` tensor has shape `[batch, in_height, in_width, depth]` and the
491 `filters` tensor has shape `[filter_height, filter_width, depth]`, i.e., each
492 input channel is processed independently of the others with its own
493 structuring function. The `output` tensor has shape
494 `[batch, out_height, out_width, depth]`. The spatial dimensions of the output
495 tensor depend on the `padding` algorithm. We currently only support the
496 default "NHWC" `data_format`.
498 In detail, the grayscale morphological 2-D dilation is the max-sum correlation
499 (for consistency with `conv2d`, we use unmirrored filters):
501 output[b, y, x, c] =
502 max_{dy, dx} input[b,
503 strides[1] * y + rates[1] * dy,
504 strides[2] * x + rates[2] * dx,
505 c] +
506 filters[dy, dx, c]
508 Max-pooling is a special case when the filter has size equal to the pooling
509 kernel size and contains all zeros.
511 Note on duality: The dilation of `input` by the `filters` is equal to the
512 negation of the erosion of `-input` by the reflected `filters`.
514 Args:
515 input: A `Tensor`. Must be one of the following types: `float32`, `float64`,
516 `int32`, `uint8`, `int16`, `int8`, `int64`, `bfloat16`, `uint16`, `half`,
517 `uint32`, `uint64`.
518 4-D with shape `[batch, in_height, in_width, depth]`.
519 filters: A `Tensor`. Must have the same type as `input`.
520 3-D with shape `[filter_height, filter_width, depth]`.
521 strides: A list of `ints` that has length `>= 4`.
522 The stride of the sliding window for each dimension of the input
523 tensor. Must be: `[1, stride_height, stride_width, 1]`.
524 padding: A `string` from: `"SAME", "VALID"`.
525 The type of padding algorithm to use. See
526 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
527 for more information.
528 data_format: A `string`, only `"NHWC"` is currently supported.
529 dilations: A list of `ints` that has length `>= 4`.
530 The input stride for atrous morphological dilation. Must be:
531 `[1, rate_height, rate_width, 1]`.
532 name: A name for the operation (optional).
534 Returns:
535 A `Tensor`. Has the same type as `input`.
536 """
537 if data_format != "NHWC":
538 raise ValueError("`data_format` values other than 'NHWC' are not "
539 f"supported. Received: data_format={data_format}")
541 return gen_nn_ops.dilation2d(input=input,
542 filter=filters,
543 strides=strides,
544 rates=dilations,
545 padding=padding,
546 name=name)
549@tf_export(v1=["nn.dilation2d"])
550@dispatch.add_dispatch_support
551def dilation2d_v1( # pylint: disable=missing-docstring
552 input, # pylint: disable=redefined-builtin
553 filter=None, # pylint: disable=redefined-builtin
554 strides=None,
555 rates=None,
556 padding=None,
557 name=None,
558 filters=None,
559 dilations=None):
560 filter = deprecated_argument_lookup("filters", filters, "filter", filter)
561 rates = deprecated_argument_lookup("dilations", dilations, "rates", rates)
562 return gen_nn_ops.dilation2d(input, filter, strides, rates, padding, name)
565dilation2d_v1.__doc__ = gen_nn_ops.dilation2d.__doc__
568@tf_export("nn.with_space_to_batch")
569@dispatch.add_dispatch_support
570def with_space_to_batch(
571 input, # pylint: disable=redefined-builtin
572 dilation_rate,
573 padding,
574 op,
575 filter_shape=None,
576 spatial_dims=None,
577 data_format=None):
578 """Performs `op` on the space-to-batch representation of `input`.
580 This has the effect of transforming sliding window operations into the
581 corresponding "atrous" operation in which the input is sampled at the
582 specified `dilation_rate`.
584 In the special case that `dilation_rate` is uniformly 1, this simply returns:
586 op(input, num_spatial_dims, padding)
588 Otherwise, it returns:
590 batch_to_space_nd(
591 op(space_to_batch_nd(input, adjusted_dilation_rate, adjusted_paddings),
592 num_spatial_dims,
593 "VALID")
594 adjusted_dilation_rate,
595 adjusted_crops),
597 where:
599 adjusted_dilation_rate is an int64 tensor of shape [max(spatial_dims)],
600 adjusted_{paddings,crops} are int64 tensors of shape [max(spatial_dims), 2]
602 defined as follows:
604 We first define two int64 tensors `paddings` and `crops` of shape
605 `[num_spatial_dims, 2]` based on the value of `padding` and the spatial
606 dimensions of the `input`:
608 If `padding = "VALID"`, then:
610 paddings, crops = required_space_to_batch_paddings(
611 input_shape[spatial_dims],
612 dilation_rate)
614 If `padding = "SAME"`, then:
616 dilated_filter_shape =
617 filter_shape + (filter_shape - 1) * (dilation_rate - 1)
619 paddings, crops = required_space_to_batch_paddings(
620 input_shape[spatial_dims],
621 dilation_rate,
622 [(dilated_filter_shape - 1) // 2,
623 dilated_filter_shape - 1 - (dilated_filter_shape - 1) // 2])
625 Because `space_to_batch_nd` and `batch_to_space_nd` assume that the spatial
626 dimensions are contiguous starting at the second dimension, but the specified
627 `spatial_dims` may not be, we must adjust `dilation_rate`, `paddings` and
628 `crops` in order to be usable with these operations. For a given dimension,
629 if the block size is 1, and both the starting and ending padding and crop
630 amounts are 0, then space_to_batch_nd effectively leaves that dimension alone,
631 which is what is needed for dimensions not part of `spatial_dims`.
632 Furthermore, `space_to_batch_nd` and `batch_to_space_nd` handle this case
633 efficiently for any number of leading and trailing dimensions.
635 For 0 <= i < len(spatial_dims), we assign:
637 adjusted_dilation_rate[spatial_dims[i] - 1] = dilation_rate[i]
638 adjusted_paddings[spatial_dims[i] - 1, :] = paddings[i, :]
639 adjusted_crops[spatial_dims[i] - 1, :] = crops[i, :]
641 All unassigned values of `adjusted_dilation_rate` default to 1, while all
642 unassigned values of `adjusted_paddings` and `adjusted_crops` default to 0.
644 Note in the case that `dilation_rate` is not uniformly 1, specifying "VALID"
645 padding is equivalent to specifying `padding = "SAME"` with a filter_shape of
646 `[1]*N`.
648 Advanced usage. Note the following optimization: A sequence of
649 `with_space_to_batch` operations with identical (not uniformly 1)
650 `dilation_rate` parameters and "VALID" padding
652 net = with_space_to_batch(net, dilation_rate, "VALID", op_1)
653 ...
654 net = with_space_to_batch(net, dilation_rate, "VALID", op_k)
656 can be combined into a single `with_space_to_batch` operation as follows:
658 def combined_op(converted_input, num_spatial_dims, _):
659 result = op_1(converted_input, num_spatial_dims, "VALID")
660 ...
661 result = op_k(result, num_spatial_dims, "VALID")
663 net = with_space_to_batch(net, dilation_rate, "VALID", combined_op)
665 This eliminates the overhead of `k-1` calls to `space_to_batch_nd` and
666 `batch_to_space_nd`.
668 Similarly, a sequence of `with_space_to_batch` operations with identical (not
669 uniformly 1) `dilation_rate` parameters, "SAME" padding, and odd filter
670 dimensions
672 net = with_space_to_batch(net, dilation_rate, "SAME", op_1, filter_shape_1)
673 ...
674 net = with_space_to_batch(net, dilation_rate, "SAME", op_k, filter_shape_k)
676 can be combined into a single `with_space_to_batch` operation as follows:
678 def combined_op(converted_input, num_spatial_dims, _):
679 result = op_1(converted_input, num_spatial_dims, "SAME")
680 ...
681 result = op_k(result, num_spatial_dims, "SAME")
683 net = with_space_to_batch(net, dilation_rate, "VALID", combined_op)
685 Args:
686 input: Tensor of rank > max(spatial_dims).
687 dilation_rate: int32 Tensor of *known* shape [num_spatial_dims].
688 padding: str constant equal to "VALID" or "SAME"
689 op: Function that maps (input, num_spatial_dims, padding) -> output
690 filter_shape: If padding = "SAME", specifies the shape of the convolution
691 kernel/pooling window as an integer Tensor of shape [>=num_spatial_dims].
692 If padding = "VALID", filter_shape is ignored and need not be specified.
693 spatial_dims: Monotonically increasing sequence of `num_spatial_dims`
694 integers (which are >= 1) specifying the spatial dimensions of `input`
695 and output. Defaults to: `range(1, num_spatial_dims+1)`.
696 data_format: A string or None. Specifies whether the channel dimension of
697 the `input` and output is the last dimension (default, or if `data_format`
698 does not start with "NC"), or the second dimension (if `data_format`
699 starts with "NC"). For N=1, the valid values are "NWC" (default) and
700 "NCW". For N=2, the valid values are "NHWC" (default) and "NCHW".
701 For N=3, the valid values are "NDHWC" (default) and "NCDHW".
703 Returns:
704 The output Tensor as described above, dimensions will vary based on the op
705 provided.
707 Raises:
708 ValueError: if `padding` is invalid or the arguments are incompatible.
709 ValueError: if `spatial_dims` are invalid.
710 """
711 input = ops.convert_to_tensor(input, name="input") # pylint: disable=redefined-builtin
712 input_shape = input.shape
714 def build_op(num_spatial_dims, padding):
715 return lambda inp, _: op(inp, num_spatial_dims, padding)
717 new_op = _WithSpaceToBatch(
718 input_shape,
719 dilation_rate,
720 padding,
721 build_op,
722 filter_shape=filter_shape,
723 spatial_dims=spatial_dims,
724 data_format=data_format)
725 return new_op(input, None)
728class _WithSpaceToBatch:
729 """Helper class for with_space_to_batch.
731 Note that this class assumes that shapes of input and filter passed to
732 `__call__` are compatible with `input_shape`, `filter_shape`, and
733 `spatial_dims` passed to the constructor.
735 Arguments
736 input_shape: static shape of input. i.e. input.shape.
737 dilation_rate: see `with_space_to_batch`.
738 padding: see `with_space_to_batch`.
739 build_op: Function that maps (num_spatial_dims, paddings) -> (function that
740 maps (input, filter) -> output).
741 filter_shape: see `with_space_to_batch`.
742 spatial_dims: `see with_space_to_batch`.
743 data_format: see `with_space_to_batch`.
744 num_batch_dims: (Optional). Number of batch dims in `input_shape`.
745 """
747 def __init__(self,
748 input_shape,
749 dilation_rate,
750 padding,
751 build_op,
752 filter_shape=None,
753 spatial_dims=None,
754 data_format=None,
755 num_batch_dims=1):
756 """Helper class for _with_space_to_batch."""
757 dilation_rate = ops.convert_to_tensor(
758 dilation_rate, dtypes.int32, name="dilation_rate")
759 if dilation_rate.shape.ndims not in (None, 1):
760 raise ValueError(
761 "`dilation_rate.shape.rank` must be 1. Received: "
762 f"dilation_rate={dilation_rate} of rank {dilation_rate.shape.rank}")
764 if not dilation_rate.shape.is_fully_defined():
765 raise ValueError(
766 "`dilation_rate.shape` must be fully defined. Received: "
767 f"dilation_rate={dilation_rate} with shape "
768 f"{dilation_rate.shape}")
770 num_spatial_dims = dilation_rate.shape.dims[0].value
772 if data_format is not None and data_format.startswith("NC"):
773 starting_spatial_dim = num_batch_dims + 1
774 else:
775 starting_spatial_dim = num_batch_dims
777 if spatial_dims is None:
778 spatial_dims = range(starting_spatial_dim,
779 num_spatial_dims + starting_spatial_dim)
780 orig_spatial_dims = list(spatial_dims)
781 spatial_dims = sorted(set(int(x) for x in orig_spatial_dims))
782 if spatial_dims != orig_spatial_dims or any(x < 1 for x in spatial_dims):
783 raise ValueError(
784 "`spatial_dims` must be a monotonically increasing sequence of "
785 f"positive integers. Received: spatial_dims={orig_spatial_dims}")
787 if data_format is not None and data_format.startswith("NC"):
788 expected_input_rank = spatial_dims[-1]
789 else:
790 expected_input_rank = spatial_dims[-1] + 1
792 try:
793 input_shape.with_rank_at_least(expected_input_rank)
794 except ValueError:
795 raise ValueError(
796 f"`input.shape.rank` must be at least {expected_input_rank}. "
797 f"Received: input.shape={input_shape} with rank {input_shape.rank}")
799 const_rate = tensor_util.constant_value(dilation_rate)
800 rate_or_const_rate = dilation_rate
801 if const_rate is not None:
802 rate_or_const_rate = const_rate
803 if np.any(const_rate < 1):
804 raise ValueError(
805 "`dilation_rate` must be positive. "
806 f"Received: dilation_rate={const_rate}")
807 if np.all(const_rate == 1):
808 self.call = build_op(num_spatial_dims, padding)
809 return
811 padding, explicit_paddings = convert_padding(padding)
813 # We have two padding contributions. The first is used for converting "SAME"
814 # to "VALID". The second is required so that the height and width of the
815 # zero-padded value tensor are multiples of rate.
817 # Padding required to reduce to "VALID" convolution
818 if padding == "SAME":
819 if filter_shape is None:
820 raise ValueError(
821 "`filter_shape` must be specified for `padding='SAME'`. "
822 f"Received: filter_shape={filter_shape} and padding={padding}")
823 filter_shape = ops.convert_to_tensor(filter_shape, name="filter_shape")
824 const_filter_shape = tensor_util.constant_value(filter_shape)
825 if const_filter_shape is not None:
826 filter_shape = const_filter_shape
827 self.base_paddings = _with_space_to_batch_base_paddings(
828 const_filter_shape, num_spatial_dims, rate_or_const_rate)
829 else:
830 self.num_spatial_dims = num_spatial_dims
831 self.rate_or_const_rate = rate_or_const_rate
832 self.base_paddings = None
833 elif padding == "VALID":
834 self.base_paddings = np.zeros([num_spatial_dims, 2], np.int32)
835 elif padding == "EXPLICIT":
836 base_paddings = (np.array(explicit_paddings)
837 .reshape([num_spatial_dims + 2, 2]))
838 # Remove batch and channel dimensions
839 if data_format is not None and data_format.startswith("NC"):
840 self.base_paddings = base_paddings[2:]
841 else:
842 self.base_paddings = base_paddings[1:-1]
843 else:
844 raise ValueError("`padding` must be one of 'SAME' or 'VALID'. "
845 f"Received: padding={padding}")
847 self.input_shape = input_shape
848 self.spatial_dims = spatial_dims
849 self.dilation_rate = dilation_rate
850 self.data_format = data_format
851 self.op = build_op(num_spatial_dims, "VALID")
852 self.call = self._with_space_to_batch_call
854 def _with_space_to_batch_call(self, inp, filter): # pylint: disable=redefined-builtin
855 """Call functionality for with_space_to_batch."""
856 # Handle input whose shape is unknown during graph creation.
857 input_spatial_shape = None
858 input_shape = self.input_shape
859 spatial_dims = self.spatial_dims
860 if input_shape.ndims is not None:
861 input_shape_list = input_shape.as_list()
862 input_spatial_shape = [input_shape_list[i] for i in spatial_dims]
863 if input_spatial_shape is None or None in input_spatial_shape:
864 input_shape_tensor = array_ops.shape(inp)
865 input_spatial_shape = array_ops_stack.stack(
866 [input_shape_tensor[i] for i in spatial_dims])
868 base_paddings = self.base_paddings
869 if base_paddings is None:
870 # base_paddings could not be computed at build time since static filter
871 # shape was not fully defined.
872 filter_shape = array_ops.shape(filter)
873 base_paddings = _with_space_to_batch_base_paddings(
874 filter_shape, self.num_spatial_dims, self.rate_or_const_rate)
876 paddings, crops = array_ops.required_space_to_batch_paddings(
877 input_shape=input_spatial_shape,
878 base_paddings=base_paddings,
879 block_shape=self.dilation_rate)
881 dilation_rate = _with_space_to_batch_adjust(self.dilation_rate, 1,
882 spatial_dims)
883 paddings = _with_space_to_batch_adjust(paddings, 0, spatial_dims)
884 crops = _with_space_to_batch_adjust(crops, 0, spatial_dims)
885 input_converted = array_ops.space_to_batch_nd(
886 input=inp, block_shape=dilation_rate, paddings=paddings)
888 result = self.op(input_converted, filter)
890 result_converted = array_ops.batch_to_space_nd(
891 input=result, block_shape=dilation_rate, crops=crops)
893 # Recover channel information for output shape if channels are not last.
894 if self.data_format is not None and self.data_format.startswith("NC"):
895 if not result_converted.shape.dims[1].value and filter is not None:
896 output_shape = result_converted.shape.as_list()
897 output_shape[1] = filter.shape[-1]
898 result_converted.set_shape(output_shape)
900 return result_converted
902 def __call__(self, inp, filter): # pylint: disable=redefined-builtin
903 return self.call(inp, filter)
906def _with_space_to_batch_base_paddings(filter_shape, num_spatial_dims,
907 rate_or_const_rate):
908 """Helper function to compute base_paddings."""
909 # Spatial dimensions of the filters and the upsampled filters in which we
910 # introduce (rate - 1) zeros between consecutive filter values.
911 filter_spatial_shape = filter_shape[:num_spatial_dims]
912 pad_extra_shape = (filter_spatial_shape - 1) * rate_or_const_rate
914 # When full_padding_shape is odd, we pad more at end, following the same
915 # convention as conv2d.
916 pad_extra_start = pad_extra_shape // 2
917 pad_extra_end = pad_extra_shape - pad_extra_start
918 base_paddings = array_ops_stack.stack(
919 [[pad_extra_start[i], pad_extra_end[i]] for i in range(num_spatial_dims)])
920 return base_paddings
923def _with_space_to_batch_adjust(orig, fill_value, spatial_dims):
924 """Returns an `adjusted` version of `orig` based on `spatial_dims`.
926 Tensor of the same type as `orig` and with shape
927 `[max(spatial_dims), ...]` where:
929 adjusted[spatial_dims[i] - 1, ...] = orig[i, ...]
931 for 0 <= i < len(spatial_dims), and
933 adjusted[j, ...] = fill_value
935 for j != spatial_dims[i] - 1 for some i.
937 If `orig` is a constant value, then the result will be a constant value.
939 Args:
940 orig: Tensor of rank > max(spatial_dims).
941 fill_value: Numpy scalar (of same data type as `orig) specifying the fill
942 value for non-spatial dimensions.
943 spatial_dims: See with_space_to_batch.
945 Returns:
946 `adjusted` tensor.
947 """
948 fill_dims = orig.get_shape().as_list()[1:]
949 dtype = orig.dtype.as_numpy_dtype
950 parts = []
951 const_orig = tensor_util.constant_value(orig)
952 const_or_orig = const_orig if const_orig is not None else orig
953 prev_spatial_dim = 0
954 i = 0
955 while i < len(spatial_dims):
956 start_i = i
957 start_spatial_dim = spatial_dims[i]
958 if start_spatial_dim > 1:
959 # Fill in any gap from the previous spatial dimension (or dimension 1 if
960 # this is the first spatial dimension) with `fill_value`.
961 parts.append(
962 np.full(
963 [start_spatial_dim - 1 - prev_spatial_dim] + fill_dims,
964 fill_value,
965 dtype=dtype))
966 # Find the largest value of i such that:
967 # [spatial_dims[start_i], ..., spatial_dims[i]]
968 # == [start_spatial_dim, ..., start_spatial_dim + i - start_i],
969 # i.e. the end of a contiguous group of spatial dimensions.
970 while (i + 1 < len(spatial_dims) and
971 spatial_dims[i + 1] == spatial_dims[i] + 1):
972 i += 1
973 parts.append(const_or_orig[start_i:i + 1])
974 prev_spatial_dim = spatial_dims[i]
975 i += 1
976 if const_orig is not None:
977 return np.concatenate(parts)
978 else:
979 return array_ops.concat(parts, 0)
982def _get_strides_and_dilation_rate(num_spatial_dims, strides, dilation_rate):
983 """Helper function for verifying strides and dilation_rate arguments.
985 This is used by `convolution` and `pool`.
987 Args:
988 num_spatial_dims: int
989 strides: Optional. List of N ints >= 1. Defaults to `[1]*N`. If any value
990 of strides is > 1, then all values of dilation_rate must be 1.
991 dilation_rate: Optional. List of N ints >= 1. Defaults to `[1]*N`. If any
992 value of dilation_rate is > 1, then all values of strides must be 1.
994 Returns:
995 Normalized (strides, dilation_rate) as int32 numpy arrays of shape
996 [num_spatial_dims].
998 Raises:
999 ValueError: if the parameters are invalid.
1000 """
1001 if dilation_rate is None:
1002 dilation_rate = [1] * num_spatial_dims
1003 elif len(dilation_rate) != num_spatial_dims:
1004 raise ValueError(f"`len(dilation_rate)` should be {num_spatial_dims}. "
1005 f"Received: dilation_rate={dilation_rate} of length "
1006 f"{len(dilation_rate)}")
1007 dilation_rate = np.array(dilation_rate, dtype=np.int32)
1008 if np.any(dilation_rate < 1):
1009 raise ValueError("all values of `dilation_rate` must be positive. "
1010 f"Received: dilation_rate={dilation_rate}")
1012 if strides is None:
1013 strides = [1] * num_spatial_dims
1014 elif len(strides) != num_spatial_dims:
1015 raise ValueError(f"`len(strides)` should be {num_spatial_dims}. "
1016 f"Received: strides={strides} of length {len(strides)}")
1017 strides = np.array(strides, dtype=np.int32)
1018 if np.any(strides < 1):
1019 raise ValueError("all values of `strides` must be positive. "
1020 f"Received: strides={strides}")
1022 if np.any(strides > 1) and np.any(dilation_rate > 1):
1023 raise ValueError(
1024 "`strides > 1` not supported in conjunction with `dilation_rate > 1`. "
1025 f"Received: strides={strides} and dilation_rate={dilation_rate}")
1026 return strides, dilation_rate
1029@tf_export(v1=["nn.convolution"])
1030@dispatch.add_dispatch_support
1031def convolution(
1032 input, # pylint: disable=redefined-builtin
1033 filter, # pylint: disable=redefined-builtin
1034 padding,
1035 strides=None,
1036 dilation_rate=None,
1037 name=None,
1038 data_format=None,
1039 filters=None,
1040 dilations=None): # pylint: disable=g-doc-args
1041 """Computes sums of N-D convolutions (actually cross-correlation).
1043 This also supports either output striding via the optional `strides` parameter
1044 or atrous convolution (also known as convolution with holes or dilated
1045 convolution, based on the French word "trous" meaning holes in English) via
1046 the optional `dilation_rate` parameter. Currently, however, output striding
1047 is not supported for atrous convolutions.
1049 Specifically, in the case that `data_format` does not start with "NC", given
1050 a rank (N+2) `input` Tensor of shape
1052 [num_batches,
1053 input_spatial_shape[0],
1054 ...,
1055 input_spatial_shape[N-1],
1056 num_input_channels],
1058 a rank (N+2) `filter` Tensor of shape
1060 [spatial_filter_shape[0],
1061 ...,
1062 spatial_filter_shape[N-1],
1063 num_input_channels,
1064 num_output_channels],
1066 an optional `dilation_rate` tensor of shape N (defaults to `[1]*N`) specifying
1067 the filter upsampling/input downsampling rate, and an optional list of N
1068 `strides` (defaults to `[1]*N`), this computes for each N-D spatial output
1069 position `(x[0], ..., x[N-1])`:
1071 ```
1072 output[b, x[0], ..., x[N-1], k] =
1073 sum_{z[0], ..., z[N-1], q}
1074 filter[z[0], ..., z[N-1], q, k] *
1075 padded_input[b,
1076 x[0]*strides[0] + dilation_rate[0]*z[0],
1077 ...,
1078 x[N-1]*strides[N-1] + dilation_rate[N-1]*z[N-1],
1079 q]
1080 ```
1082 where b is the index into the batch, k is the output channel number, q is the
1083 input channel number, and z is the N-D spatial offset within the filter. Here,
1084 `padded_input` is obtained by zero padding the input using an effective
1085 spatial filter shape of `(spatial_filter_shape-1) * dilation_rate + 1` and
1086 output striding `strides`.
1088 In the case that `data_format` does start with `"NC"`, the `input` and output
1089 (but not the `filter`) are simply transposed as follows:
1091 ```python
1092 convolution(input, data_format, **kwargs) =
1093 tf.transpose(convolution(tf.transpose(input, [0] + range(2,N+2) + [1]),
1094 **kwargs),
1095 [0, N+1] + range(1, N+1))
1096 ```
1098 It is required that 1 <= N <= 3.
1100 Args:
1101 input: An (N+2)-D `Tensor` of type `T`, of shape
1102 `[batch_size] + input_spatial_shape + [in_channels]` if data_format does
1103 not start with "NC" (default), or
1104 `[batch_size, in_channels] + input_spatial_shape` if data_format starts
1105 with "NC".
1106 filter: An (N+2)-D `Tensor` with the same type as `input` and shape
1107 `spatial_filter_shape + [in_channels, out_channels]`.
1108 padding: A string, either `"VALID"` or `"SAME"`. The padding algorithm.
1109 `"valid"` means no padding. `"same"` results in padding evenly to
1110 the left/right or up/down of the input such that output has the same
1111 height/width dimension as the input when the strides are 1. See
1112 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
1113 for more information.
1114 strides: Optional. Sequence of N ints >= 1. Specifies the output stride.
1115 Defaults to `[1]*N`. If any value of strides is > 1, then all values of
1116 dilation_rate must be 1.
1117 dilation_rate: Optional. Sequence of N ints >= 1. Specifies the filter
1118 upsampling/input downsampling rate. In the literature, the same parameter
1119 is sometimes called `input stride` or `dilation`. The effective filter
1120 size used for the convolution will be `spatial_filter_shape +
1121 (spatial_filter_shape - 1) * (rate - 1)`, obtained by inserting
1122 (dilation_rate[i]-1) zeros between consecutive elements of the original
1123 filter in each spatial dimension i. If any value of dilation_rate is > 1,
1124 then all values of strides must be 1.
1125 name: Optional name for the returned tensor.
1126 data_format: A string or None. Specifies whether the channel dimension of
1127 the `input` and output is the last dimension (default, or if `data_format`
1128 does not start with "NC"), or the second dimension (if `data_format`
1129 starts with "NC"). For N=1, the valid values are "NWC" (default) and
1130 "NCW". For N=2, the valid values are "NHWC" (default) and "NCHW".
1131 For N=3, the valid values are "NDHWC" (default) and "NCDHW".
1133 Returns:
1134 A `Tensor` with the same type as `input` of shape
1136 `[batch_size] + output_spatial_shape + [out_channels]`
1138 if data_format is None or does not start with "NC", or
1140 `[batch_size, out_channels] + output_spatial_shape`
1142 if data_format starts with "NC",
1143 where `output_spatial_shape` depends on the value of `padding`.
1145 If padding == "SAME":
1146 output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides[i])
1148 If padding == "VALID":
1149 output_spatial_shape[i] =
1150 ceil((input_spatial_shape[i] -
1151 (spatial_filter_shape[i]-1) * dilation_rate[i])
1152 / strides[i]).
1154 Raises:
1155 ValueError: If input/output depth does not match `filter` shape, if padding
1156 is other than `"VALID"` or `"SAME"`, or if data_format is invalid.
1158 """
1159 filter = deprecated_argument_lookup("filters", filters, "filter", filter)
1160 dilation_rate = deprecated_argument_lookup(
1161 "dilations", dilations, "dilation_rate", dilation_rate)
1162 return convolution_internal(
1163 input,
1164 filter,
1165 strides=strides,
1166 padding=padding,
1167 data_format=data_format,
1168 dilations=dilation_rate,
1169 name=name)
1172@tf_export("nn.convolution", v1=[])
1173@dispatch.add_dispatch_support
1174def convolution_v2( # pylint: disable=missing-docstring
1175 input, # pylint: disable=redefined-builtin
1176 filters,
1177 strides=None,
1178 padding="VALID",
1179 data_format=None,
1180 dilations=None,
1181 name=None):
1182 return convolution_internal(
1183 input, # pylint: disable=redefined-builtin
1184 filters,
1185 strides=strides,
1186 padding=padding,
1187 data_format=data_format,
1188 dilations=dilations,
1189 name=name)
1192convolution_v2.__doc__ = deprecation.rewrite_argument_docstring(
1193 deprecation.rewrite_argument_docstring(
1194 convolution.__doc__, "dilation_rate", "dilations"),
1195 "filter", "filters")
1198def convolution_internal(
1199 input, # pylint: disable=redefined-builtin
1200 filters,
1201 strides=None,
1202 padding="VALID",
1203 data_format=None,
1204 dilations=None,
1205 name=None,
1206 call_from_convolution=True,
1207 num_spatial_dims=None):
1208 """Internal function which performs rank agnostic convolution.
1210 Args:
1211 input: See `convolution`.
1212 filters: See `convolution`.
1213 strides: See `convolution`.
1214 padding: See `convolution`.
1215 data_format: See `convolution`.
1216 dilations: See `convolution`.
1217 name: See `convolution`.
1218 call_from_convolution: See `convolution`.
1219 num_spatial_dims: (Optional.). It is a integer describing the
1220 rank of the spatial dimensions. For `1-D`, `2-D` and `3-D` convolutions,
1221 the value of `num_spatial_dims` is `1`, `2`, and `3`, respectively.
1222 This argument is only required to disambiguate the rank of `batch_shape`
1223 when `filter_shape.ndims is None` and `len(batch_shape) > 1`. For
1224 backwards compatibility, if `num_spatial_dims is None` and
1225 `filter_shape.ndims is None`, then `len(batch_shape)` is assumed to be
1226 `1` (i.e., the input is expected to be
1227 `[batch_size, num_channels] + input_spatial_shape`
1228 or `[batch_size] + input_spatial_shape + [num_channels]`.
1230 Returns:
1231 A tensor of shape and dtype matching that of `input`.
1233 Raises:
1234 ValueError: If input and filter both have unknown shapes, or if
1235 `num_spatial_dims` is provided and incompatible with the value
1236 estimated from `filters.shape`.
1237 """
1238 if (not isinstance(filters, variables_lib.Variable) and
1239 not tensor_util.is_tf_type(filters)):
1240 with ops.name_scope("convolution_internal", None, [filters, input]):
1241 filters = ops.convert_to_tensor(filters, name='filters')
1242 if (not isinstance(input, ops.Tensor) and not tensor_util.is_tf_type(input)):
1243 with ops.name_scope("convolution_internal", None, [filters, input]):
1244 input = ops.convert_to_tensor(input, name="input")
1246 filters_rank = filters.shape.rank
1247 inputs_rank = input.shape.rank
1248 if num_spatial_dims is None:
1249 if filters_rank:
1250 num_spatial_dims = filters_rank - 2
1251 elif inputs_rank:
1252 num_spatial_dims = inputs_rank - 2
1253 else:
1254 raise ValueError(
1255 "When `num_spatial_dims` is not set, one of `input.shape.rank` or "
1256 "`filters.shape.rank` must be known. "
1257 f"Received: input.shape={input.shape} of rank {inputs_rank} and "
1258 f"filters.shape={filters.shape} of rank {filters_rank}")
1259 elif filters_rank and filters_rank - 2 != num_spatial_dims:
1260 raise ValueError(
1261 "`filters.shape.rank - 2` should equal `num_spatial_dims`. Received: "
1262 f"filters.shape={filters.shape} of rank {filters_rank} and "
1263 f"num_spatial_dims={num_spatial_dims}")
1265 if inputs_rank:
1266 num_batch_dims = inputs_rank - num_spatial_dims - 1 # Channel dimension.
1267 else:
1268 num_batch_dims = 1 # By default, assume single batch dimension.
1270 if num_spatial_dims not in {1, 2, 3}:
1271 raise ValueError(
1272 "`num_spatial_dims` must be 1, 2, or 3. "
1273 f"Received: num_spatial_dims={num_spatial_dims}.")
1275 if data_format is None or data_format in _CHANNELS_LAST_FORMATS:
1276 channel_index = num_batch_dims + num_spatial_dims
1277 else:
1278 channel_index = num_batch_dims
1280 if dilations is None:
1281 dilations = _get_sequence(dilations, num_spatial_dims, channel_index,
1282 "dilations")
1283 is_dilated_conv = False
1284 else:
1285 dilations = _get_sequence(dilations, num_spatial_dims, channel_index,
1286 "dilations")
1287 is_dilated_conv = any(i != 1 for i in dilations)
1289 strides = _get_sequence(strides, num_spatial_dims, channel_index, "strides")
1290 has_tpu_context = device_context.enclosing_tpu_context() is not None
1292 if name:
1293 default_name = None
1294 elif not has_tpu_context or call_from_convolution:
1295 default_name = "convolution"
1296 elif num_spatial_dims == 2: # Most common case.
1297 default_name = "Conv2D"
1298 elif num_spatial_dims == 3:
1299 default_name = "Conv3D"
1300 else:
1301 default_name = "conv1d"
1303 with ops.name_scope(name, default_name, [input, filters]) as name:
1304 # Fast path for TPU or if no dilation, as gradient only supported on TPU
1305 # for dilations.
1306 if not is_dilated_conv or has_tpu_context:
1307 if num_spatial_dims == 2: # Most common case.
1308 op = _conv2d_expanded_batch
1309 elif num_spatial_dims == 3:
1310 op = _conv3d_expanded_batch
1311 else:
1312 op = conv1d
1314 return op(
1315 input,
1316 filters,
1317 strides,
1318 padding=padding,
1319 data_format=data_format,
1320 dilations=dilations,
1321 name=name)
1322 else:
1323 if channel_index == 1:
1324 strides = strides[2:]
1325 dilations = dilations[2:]
1326 else:
1327 strides = strides[1:-1]
1328 dilations = dilations[1:-1]
1330 op = Convolution(
1331 tensor_shape.as_shape(input.shape),
1332 tensor_shape.as_shape(filters.shape),
1333 padding,
1334 strides=strides,
1335 dilation_rate=dilations,
1336 name=name,
1337 data_format=data_format,
1338 num_spatial_dims=num_spatial_dims)
1339 return op(input, filters)
1342class Convolution:
1343 """Helper class for convolution.
1345 Note that this class assumes that shapes of input and filter passed to
1346 `__call__` are compatible with `input_shape`, `filter_shape`, and
1347 `num_spatial_dims` passed to the constructor.
1349 Arguments
1350 input_shape: static shape of input. i.e. input.shape. Its length is
1351 `batch_shape + input_spatial_shape + [num_channels]` if `data_format`
1352 does not start with `NC`, or
1353 `batch_shape + [num_channels] + input_spatial_shape` if `data_format`
1354 starts with `NC`.
1355 filter_shape: static shape of the filter. i.e. filter.shape.
1356 padding: The padding algorithm, must be "SAME" or "VALID".
1357 strides: see convolution.
1358 dilation_rate: see convolution.
1359 name: see convolution.
1360 data_format: A string or `None`. Specifies whether the channel dimension of
1361 the `input` and output is the last dimension (if `data_format` is `None`
1362 or does not start with `NC`), or the first post-batch dimension (i.e. if
1363 `data_format` starts with `NC`).
1364 num_spatial_dims: (Usually optional.) Python integer, the rank of the
1365 spatial and channel dimensions. For `1-D`, `2-D` and `3-D` convolutions,
1366 the value of `num_spatial_dims` is `1`, `2`, and `3`, respectively.
1367 This argument is only required to disambiguate the rank of `batch_shape`
1368 when `filter_shape.ndims is None` and `len(batch_shape) > 1`. For
1369 backwards compatibility, if `num_spatial_dims is None` and
1370 `filter_shape.ndims is None`, then `len(batch_shape)` is assumed to be
1371 `1` (i.e., the input is expected to be
1372 `[batch_size, num_channels] + input_spatial_shape`
1373 or `[batch_size] + input_spatial_shape + [num_channels]`.
1374 """
1376 def __init__(self,
1377 input_shape,
1378 filter_shape,
1379 padding,
1380 strides=None,
1381 dilation_rate=None,
1382 name=None,
1383 data_format=None,
1384 num_spatial_dims=None):
1385 """Helper function for convolution."""
1386 num_batch_dims = None
1387 filter_shape = tensor_shape.as_shape(filter_shape)
1388 input_shape = tensor_shape.as_shape(input_shape)
1390 if filter_shape.ndims is not None:
1391 if (num_spatial_dims is not None and
1392 filter_shape.ndims != num_spatial_dims + 2):
1393 raise ValueError(
1394 "`filters.shape.rank` must be `num_spatial_dims + 2`. Received: "
1395 f"filters.shape={filter_shape} of rank {filter_shape.rank} and "
1396 f"num_spatial_dims={num_spatial_dims}")
1397 else:
1398 num_spatial_dims = filter_shape.ndims - 2
1400 if input_shape.ndims is not None and num_spatial_dims is not None:
1401 num_batch_dims = input_shape.ndims - num_spatial_dims - 1
1403 if num_spatial_dims is None:
1404 num_spatial_dims = input_shape.ndims - 2
1405 else:
1406 if input_shape.ndims is not None:
1407 if input_shape.ndims < num_spatial_dims + 2:
1408 raise ValueError(
1409 "`input.shape.rank` must be >= than `num_spatial_dims + 2`. "
1410 f"Received: input.shape={input_shape} of rank {input_shape.rank} "
1411 f"and num_spatial_dims={num_spatial_dims}")
1412 else:
1413 if num_batch_dims is None:
1414 num_batch_dims = input_shape.ndims - num_spatial_dims - 1
1416 if num_spatial_dims is None:
1417 raise ValueError(
1418 "When `num_spatial_dims` is not set, one of `input.shape.rank` or "
1419 "`filters.shape.rank` must be known. "
1420 f"Received: input.shape={input_shape} of rank {input_shape.rank} and "
1421 f"`filters.shape={filter_shape}` of rank {filter_shape.rank}")
1423 if num_batch_dims is None:
1424 num_batch_dims = 1
1426 if num_batch_dims < 1:
1427 raise ValueError(
1428 f"Batch dims should be >= 1, but found {num_batch_dims}. "
1429 "Batch dims was estimated as "
1430 "`input.shape.rank - num_spatial_dims - 1` and `num_spatial_dims` "
1431 "was either provided or estimated as `filters.shape.rank - 2`. "
1432 f"Received: input.shape={input_shape} of rank {input_shape.rank}, "
1433 f"filters.shape={filter_shape} of rank {filter_shape.rank}, and "
1434 f"num_spatial_dims={num_spatial_dims}")
1436 if data_format is None or not data_format.startswith("NC"):
1437 input_channels_dim = tensor_shape.dimension_at_index(
1438 input_shape, num_spatial_dims + num_batch_dims)
1439 spatial_dims = range(num_batch_dims, num_spatial_dims + num_batch_dims)
1440 else:
1441 input_channels_dim = tensor_shape.dimension_at_index(
1442 input_shape, num_batch_dims)
1443 spatial_dims = range(
1444 num_batch_dims + 1, num_spatial_dims + num_batch_dims + 1)
1446 filter_dim = tensor_shape.dimension_at_index(filter_shape, num_spatial_dims)
1447 if not (input_channels_dim % filter_dim).is_compatible_with(0):
1448 raise ValueError(
1449 "The number of input channels is not divisible by the corresponding "
1450 f"number of output filters. Received: input.shape={input_shape} with "
1451 f"{input_channels_dim} channels and filters.shape={filter_shape} "
1452 f"with {filter_dim} output filters.")
1454 strides, dilation_rate = _get_strides_and_dilation_rate(
1455 num_spatial_dims, strides, dilation_rate)
1457 self.input_shape = input_shape
1458 self.filter_shape = filter_shape
1459 self.data_format = data_format
1460 self.strides = strides
1461 self.padding = padding
1462 self.name = name
1463 self.dilation_rate = dilation_rate
1464 self.num_batch_dims = num_batch_dims
1465 self.num_spatial_dims = num_spatial_dims
1466 self.conv_op = _WithSpaceToBatch(
1467 input_shape,
1468 dilation_rate=dilation_rate,
1469 padding=padding,
1470 build_op=self._build_op,
1471 filter_shape=filter_shape,
1472 spatial_dims=spatial_dims,
1473 data_format=data_format,
1474 num_batch_dims=num_batch_dims)
1476 def _build_op(self, _, padding):
1477 return _NonAtrousConvolution(
1478 self.input_shape,
1479 filter_shape=self.filter_shape,
1480 padding=padding,
1481 data_format=self.data_format,
1482 strides=self.strides,
1483 name=self.name,
1484 num_batch_dims=self.num_batch_dims)
1486 def __call__(self, inp, filter): # pylint: disable=redefined-builtin
1487 # TPU convolution supports dilations greater than 1.
1488 if device_context.enclosing_tpu_context() is not None:
1489 return convolution_internal(
1490 inp,
1491 filter,
1492 strides=self.strides,
1493 padding=self.padding,
1494 data_format=self.data_format,
1495 dilations=self.dilation_rate,
1496 name=self.name,
1497 call_from_convolution=False,
1498 num_spatial_dims=self.num_spatial_dims)
1499 else:
1500 return self.conv_op(inp, filter)
1503@tf_export(v1=["nn.pool"])
1504@dispatch.add_dispatch_support
1505def pool(
1506 input, # pylint: disable=redefined-builtin
1507 window_shape,
1508 pooling_type,
1509 padding,
1510 dilation_rate=None,
1511 strides=None,
1512 name=None,
1513 data_format=None,
1514 dilations=None):
1515 """Performs an N-D pooling operation.
1517 In the case that `data_format` does not start with "NC", computes for
1518 0 <= b < batch_size,
1519 0 <= x[i] < output_spatial_shape[i],
1520 0 <= c < num_channels:
1522 ```
1523 output[b, x[0], ..., x[N-1], c] =
1524 REDUCE_{z[0], ..., z[N-1]}
1525 input[b,
1526 x[0] * strides[0] - pad_before[0] + dilation_rate[0]*z[0],
1527 ...
1528 x[N-1]*strides[N-1] - pad_before[N-1] + dilation_rate[N-1]*z[N-1],
1529 c],
1530 ```
1532 where the reduction function REDUCE depends on the value of `pooling_type`,
1533 and pad_before is defined based on the value of `padding` as described in
1534 the "returns" section of `tf.nn.convolution` for details.
1535 The reduction never includes out-of-bounds positions.
1537 In the case that `data_format` starts with `"NC"`, the `input` and output are
1538 simply transposed as follows:
1540 ```python
1541 pool(input, data_format, **kwargs) =
1542 tf.transpose(pool(tf.transpose(input, [0] + range(2,N+2) + [1]),
1543 **kwargs),
1544 [0, N+1] + range(1, N+1))
1545 ```
1547 Args:
1548 input: Tensor of rank N+2, of shape
1549 `[batch_size] + input_spatial_shape + [num_channels]` if data_format does
1550 not start with "NC" (default), or
1551 `[batch_size, num_channels] + input_spatial_shape` if data_format starts
1552 with "NC". Pooling happens over the spatial dimensions only.
1553 window_shape: Sequence of N ints >= 1.
1554 pooling_type: Specifies pooling operation, must be "AVG" or "MAX".
1555 padding: The padding algorithm, must be "SAME" or "VALID".
1556 See the "returns" section of `tf.nn.convolution` for details.
1557 dilation_rate: Optional. Dilation rate. List of N ints >= 1.
1558 Defaults to `[1]*N`. If any value of dilation_rate is > 1, then all
1559 values of strides must be 1.
1560 strides: Optional. Sequence of N ints >= 1. Defaults to `[1]*N`.
1561 If any value of strides is > 1, then all values of dilation_rate must be
1562 1.
1563 name: Optional. Name of the op.
1564 data_format: A string or None. Specifies whether the channel dimension of
1565 the `input` and output is the last dimension (default, or if `data_format`
1566 does not start with "NC"), or the second dimension (if `data_format`
1567 starts with "NC"). For N=1, the valid values are "NWC" (default) and
1568 "NCW". For N=2, the valid values are "NHWC" (default) and "NCHW".
1569 For N=3, the valid values are "NDHWC" (default) and "NCDHW".
1570 dilations: Alias for dilation_rate
1572 Returns:
1573 Tensor of rank N+2, of shape
1574 [batch_size] + output_spatial_shape + [num_channels]
1576 if data_format is None or does not start with "NC", or
1578 [batch_size, num_channels] + output_spatial_shape
1580 if data_format starts with "NC",
1581 where `output_spatial_shape` depends on the value of padding:
1583 If padding = "SAME":
1584 output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides[i])
1586 If padding = "VALID":
1587 output_spatial_shape[i] =
1588 ceil((input_spatial_shape[i] - (window_shape[i] - 1) * dilation_rate[i])
1589 / strides[i]).
1591 Raises:
1592 ValueError: if arguments are invalid.
1594 """
1595 dilation_rate = deprecated_argument_lookup(
1596 "dilations", dilations, "dilation_rate", dilation_rate)
1597 # pylint: enable=line-too-long
1598 with ops.name_scope(name, "%s_pool" % (pooling_type.lower()),
1599 [input]) as scope:
1600 input = ops.convert_to_tensor(input, name="input") # pylint: disable=redefined-builtin
1602 num_spatial_dims = len(window_shape)
1603 if num_spatial_dims < 1 or num_spatial_dims > 3:
1604 raise ValueError("`len(window_shape)` must be 1, 2, or 3. Received: "
1605 f"window_shape={window_shape} of length "
1606 f"{len(window_shape)}")
1608 input.get_shape().with_rank(num_spatial_dims + 2)
1610 strides, dilation_rate = _get_strides_and_dilation_rate(
1611 num_spatial_dims, strides, dilation_rate)
1613 if padding == "SAME" and np.any(dilation_rate > 1):
1614 raise ValueError(
1615 "pooling with 'SAME' padding is not implemented for "
1616 f"`dilation_rate` > 1. Received: padding={padding} and "
1617 f"dilation_rate={dilation_rate}")
1619 if np.any(strides > window_shape):
1620 raise ValueError(
1621 "`strides` > `window_shape` not supported due to inconsistency "
1622 f"between CPU and GPU implementations. Received: strides={strides} "
1623 f"and window_shape={window_shape}")
1625 pooling_ops = {
1626 ("MAX", 1): max_pool,
1627 ("MAX", 2): max_pool,
1628 ("MAX", 3): max_pool3d, # pylint: disable=undefined-variable
1629 ("AVG", 1): avg_pool,
1630 ("AVG", 2): avg_pool,
1631 ("AVG", 3): avg_pool3d, # pylint: disable=undefined-variable
1632 }
1633 op_key = (pooling_type, num_spatial_dims)
1634 if op_key not in pooling_ops:
1635 raise ValueError(
1636 f"{num_spatial_dims}-D {pooling_type} pooling is not supported.")
1638 if data_format is None or not data_format.startswith("NC"):
1639 adjusted_window_shape = [1] + list(window_shape) + [1]
1640 adjusted_strides = [1] + list(strides) + [1]
1641 spatial_dims = range(1, num_spatial_dims + 1)
1642 else:
1643 adjusted_window_shape = [1, 1] + list(window_shape)
1644 adjusted_strides = [1, 1] + list(strides)
1645 spatial_dims = range(2, num_spatial_dims + 2)
1647 if num_spatial_dims == 1:
1648 if data_format is None or data_format == "NWC":
1649 data_format_kwargs = dict(data_format="NHWC")
1650 elif data_format == "NCW":
1651 data_format_kwargs = dict(data_format="NCHW")
1652 else:
1653 raise ValueError("data_format must be either 'NWC' or 'NCW'. "
1654 f"Received: data_format={data_format}")
1655 adjusted_window_shape = [1] + adjusted_window_shape
1656 adjusted_strides = [1] + adjusted_strides
1657 else:
1658 data_format_kwargs = dict(data_format=data_format)
1660 def op(converted_input, _, converted_padding): # pylint: disable=missing-docstring
1661 if num_spatial_dims == 1:
1662 converted_input = array_ops.expand_dims(converted_input,
1663 spatial_dims[0])
1664 result = pooling_ops[op_key](
1665 converted_input,
1666 adjusted_window_shape,
1667 adjusted_strides,
1668 converted_padding,
1669 name=scope,
1670 **data_format_kwargs)
1671 if num_spatial_dims == 1:
1672 result = array_ops.squeeze(result, [spatial_dims[0]])
1673 return result
1675 return with_space_to_batch(
1676 input=input,
1677 dilation_rate=dilation_rate,
1678 padding=padding,
1679 op=op,
1680 spatial_dims=spatial_dims,
1681 filter_shape=window_shape)
1684@tf_export("nn.pool", v1=[])
1685@dispatch.add_dispatch_support
1686def pool_v2(
1687 input, # pylint: disable=redefined-builtin
1688 window_shape,
1689 pooling_type,
1690 strides=None,
1691 padding="VALID",
1692 data_format=None,
1693 dilations=None,
1694 name=None):
1695 # pylint: disable=line-too-long
1696 """Performs an N-D pooling operation.
1698 In the case that `data_format` does not start with "NC", computes for
1699 0 <= b < batch_size,
1700 0 <= x[i] < output_spatial_shape[i],
1701 0 <= c < num_channels:
1703 ```
1704 output[b, x[0], ..., x[N-1], c] =
1705 REDUCE_{z[0], ..., z[N-1]}
1706 input[b,
1707 x[0] * strides[0] - pad_before[0] + dilation_rate[0]*z[0],
1708 ...
1709 x[N-1]*strides[N-1] - pad_before[N-1] + dilation_rate[N-1]*z[N-1],
1710 c],
1711 ```
1713 where the reduction function REDUCE depends on the value of `pooling_type`,
1714 and pad_before is defined based on the value of `padding` as described in
1715 the "returns" section of `tf.nn.convolution` for details.
1716 The reduction never includes out-of-bounds positions.
1718 In the case that `data_format` starts with `"NC"`, the `input` and output are
1719 simply transposed as follows:
1721 ```python
1722 pool(input, data_format, **kwargs) =
1723 tf.transpose(pool(tf.transpose(input, [0] + range(2,N+2) + [1]),
1724 **kwargs),
1725 [0, N+1] + range(1, N+1))
1726 ```
1728 Args:
1729 input: Tensor of rank N+2, of shape `[batch_size] + input_spatial_shape +
1730 [num_channels]` if data_format does not start with "NC" (default), or
1731 `[batch_size, num_channels] + input_spatial_shape` if data_format starts
1732 with "NC". Pooling happens over the spatial dimensions only.
1733 window_shape: Sequence of N ints >= 1.
1734 pooling_type: Specifies pooling operation, must be "AVG" or "MAX".
1735 strides: Optional. Sequence of N ints >= 1. Defaults to `[1]*N`. If any value of
1736 strides is > 1, then all values of dilation_rate must be 1.
1737 padding: The padding algorithm, must be "SAME" or "VALID". Defaults to "SAME".
1738 See
1739 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
1740 for more information.
1741 data_format: A string or None. Specifies whether the channel dimension of
1742 the `input` and output is the last dimension (default, or if `data_format`
1743 does not start with "NC"), or the second dimension (if `data_format`
1744 starts with "NC"). For N=1, the valid values are "NWC" (default) and
1745 "NCW". For N=2, the valid values are "NHWC" (default) and "NCHW". For
1746 N=3, the valid values are "NDHWC" (default) and "NCDHW".
1747 dilations: Optional. Dilation rate. List of N ints >= 1. Defaults to
1748 `[1]*N`. If any value of dilation_rate is > 1, then all values of strides
1749 must be 1.
1750 name: Optional. Name of the op.
1752 Returns:
1753 Tensor of rank N+2, of shape
1754 [batch_size] + output_spatial_shape + [num_channels]
1756 if data_format is None or does not start with "NC", or
1758 [batch_size, num_channels] + output_spatial_shape
1760 if data_format starts with "NC",
1761 where `output_spatial_shape` depends on the value of padding:
1763 If padding = "SAME":
1764 output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides[i])
1766 If padding = "VALID":
1767 output_spatial_shape[i] =
1768 ceil((input_spatial_shape[i] - (window_shape[i] - 1) * dilation_rate[i])
1769 / strides[i]).
1771 Raises:
1772 ValueError: if arguments are invalid.
1773 """
1774 return pool(
1775 input=input,
1776 window_shape=window_shape,
1777 pooling_type=pooling_type,
1778 padding=padding,
1779 dilation_rate=dilations,
1780 strides=strides,
1781 name=name,
1782 data_format=data_format)
1785@tf_export("nn.atrous_conv2d")
1786@dispatch.add_dispatch_support
1787def atrous_conv2d(value, filters, rate, padding, name=None):
1788 """Atrous convolution (a.k.a. convolution with holes or dilated convolution).
1790 This function is a simpler wrapper around the more general
1791 `tf.nn.convolution`, and exists only for backwards compatibility. You can
1792 use `tf.nn.convolution` to perform 1-D, 2-D, or 3-D atrous convolution.
1794 Computes a 2-D atrous convolution, also known as convolution with holes or
1795 dilated convolution, given 4-D `value` and `filters` tensors. If the `rate`
1796 parameter is equal to one, it performs regular 2-D convolution. If the `rate`
1797 parameter is greater than one, it performs convolution with holes, sampling
1798 the input values every `rate` pixels in the `height` and `width` dimensions.
1799 This is equivalent to convolving the input with a set of upsampled filters,
1800 produced by inserting `rate - 1` zeros between two consecutive values of the
1801 filters along the `height` and `width` dimensions, hence the name atrous
1802 convolution or convolution with holes (the French word trous means holes in
1803 English).
1805 More specifically:
1807 ```
1808 output[batch, height, width, out_channel] =
1809 sum_{dheight, dwidth, in_channel} (
1810 filters[dheight, dwidth, in_channel, out_channel] *
1811 value[batch, height + rate*dheight, width + rate*dwidth, in_channel]
1812 )
1813 ```
1815 Atrous convolution allows us to explicitly control how densely to compute
1816 feature responses in fully convolutional networks. Used in conjunction with
1817 bilinear interpolation, it offers an alternative to `conv2d_transpose` in
1818 dense prediction tasks such as semantic image segmentation, optical flow
1819 computation, or depth estimation. It also allows us to effectively enlarge
1820 the field of view of filters without increasing the number of parameters or
1821 the amount of computation.
1823 For a description of atrous convolution and how it can be used for dense
1824 feature extraction, please see: (Chen et al., 2015). The same operation is
1825 investigated further in (Yu et al., 2016). Previous works that effectively
1826 use atrous convolution in different ways are, among others,
1827 (Sermanet et al., 2014) and (Giusti et al., 2013).
1828 Atrous convolution is also closely related to the so-called noble identities
1829 in multi-rate signal processing.
1831 There are many different ways to implement atrous convolution (see the refs
1832 above). The implementation here reduces
1834 ```python
1835 atrous_conv2d(value, filters, rate, padding=padding)
1836 ```
1838 to the following three operations:
1840 ```python
1841 paddings = ...
1842 net = space_to_batch(value, paddings, block_size=rate)
1843 net = conv2d(net, filters, strides=[1, 1, 1, 1], padding="VALID")
1844 crops = ...
1845 net = batch_to_space(net, crops, block_size=rate)
1846 ```
1848 Advanced usage. Note the following optimization: A sequence of `atrous_conv2d`
1849 operations with identical `rate` parameters, 'SAME' `padding`, and filters
1850 with odd heights/ widths:
1852 ```python
1853 net = atrous_conv2d(net, filters1, rate, padding="SAME")
1854 net = atrous_conv2d(net, filters2, rate, padding="SAME")
1855 ...
1856 net = atrous_conv2d(net, filtersK, rate, padding="SAME")
1857 ```
1859 can be equivalently performed cheaper in terms of computation and memory as:
1861 ```python
1862 pad = ... # padding so that the input dims are multiples of rate
1863 net = space_to_batch(net, paddings=pad, block_size=rate)
1864 net = conv2d(net, filters1, strides=[1, 1, 1, 1], padding="SAME")
1865 net = conv2d(net, filters2, strides=[1, 1, 1, 1], padding="SAME")
1866 ...
1867 net = conv2d(net, filtersK, strides=[1, 1, 1, 1], padding="SAME")
1868 net = batch_to_space(net, crops=pad, block_size=rate)
1869 ```
1871 because a pair of consecutive `space_to_batch` and `batch_to_space` ops with
1872 the same `block_size` cancel out when their respective `paddings` and `crops`
1873 inputs are identical.
1875 Args:
1876 value: A 4-D `Tensor` of type `float`. It needs to be in the default "NHWC"
1877 format. Its shape is `[batch, in_height, in_width, in_channels]`.
1878 filters: A 4-D `Tensor` with the same type as `value` and shape
1879 `[filter_height, filter_width, in_channels, out_channels]`. `filters`'
1880 `in_channels` dimension must match that of `value`. Atrous convolution is
1881 equivalent to standard convolution with upsampled filters with effective
1882 height `filter_height + (filter_height - 1) * (rate - 1)` and effective
1883 width `filter_width + (filter_width - 1) * (rate - 1)`, produced by
1884 inserting `rate - 1` zeros along consecutive elements across the
1885 `filters`' spatial dimensions.
1886 rate: A positive int32. The stride with which we sample input values across
1887 the `height` and `width` dimensions. Equivalently, the rate by which we
1888 upsample the filter values by inserting zeros across the `height` and
1889 `width` dimensions. In the literature, the same parameter is sometimes
1890 called `input stride` or `dilation`.
1891 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
1892 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
1893 for more information.
1894 name: Optional name for the returned tensor.
1896 Returns:
1897 A `Tensor` with the same type as `value`.
1898 Output shape with `'VALID'` padding is:
1900 [batch, height - rate * (filter_width - 1),
1901 width - rate * (filter_height - 1), out_channels].
1903 Output shape with `'SAME'` padding is:
1905 [batch, height, width, out_channels].
1907 Raises:
1908 ValueError: If input/output depth does not match `filters`' shape, or if
1909 padding is other than `'VALID'` or `'SAME'`.
1911 References:
1912 Multi-Scale Context Aggregation by Dilated Convolutions:
1913 [Yu et al., 2016](https://arxiv.org/abs/1511.07122)
1914 ([pdf](https://arxiv.org/pdf/1511.07122.pdf))
1915 Semantic Image Segmentation with Deep Convolutional Nets and Fully
1916 Connected CRFs:
1917 [Chen et al., 2015](http://arxiv.org/abs/1412.7062)
1918 ([pdf](https://arxiv.org/pdf/1412.7062))
1919 OverFeat - Integrated Recognition, Localization and Detection using
1920 Convolutional Networks:
1921 [Sermanet et al., 2014](https://arxiv.org/abs/1312.6229)
1922 ([pdf](https://arxiv.org/pdf/1312.6229.pdf))
1923 Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks:
1924 [Giusti et al., 2013]
1925 (https://ieeexplore.ieee.org/abstract/document/6738831)
1926 ([pdf](https://arxiv.org/pdf/1302.1700.pdf))
1927 """
1928 return convolution(
1929 input=value,
1930 filter=filters,
1931 padding=padding,
1932 dilation_rate=np.broadcast_to(rate, (2,)),
1933 name=name)
1936def convert_padding(padding, expected_length=4):
1937 """Converts Python padding to C++ padding for ops which take EXPLICIT padding.
1939 Args:
1940 padding: the `padding` argument for a Python op which supports EXPLICIT
1941 padding.
1942 expected_length: Expected number of entries in the padding list when
1943 explicit padding is used.
1945 Returns:
1946 (padding, explicit_paddings) pair, which should be passed as attributes to a
1947 C++ op.
1949 Raises:
1950 ValueError: If padding is invalid.
1951 """
1952 explicit_paddings = []
1953 if padding == "EXPLICIT":
1954 raise ValueError("'EXPLICIT' is not a valid value for `padding`. To use "
1955 "explicit padding, `padding` must be a list.")
1956 if isinstance(padding, (list, tuple)):
1957 for i, dim_paddings in enumerate(padding):
1958 if not isinstance(dim_paddings, (list, tuple)):
1959 raise ValueError("When `padding` is a list, each element of `padding` "
1960 "must be a list/tuple of size 2. Received: "
1961 f"padding={padding} with element at index {i} of type "
1962 f"{type(dim_paddings)}")
1963 if len(dim_paddings) != 2:
1964 raise ValueError("When `padding` is a list, each element of `padding` "
1965 "must be a list/tuple of size 2. Received: "
1966 f"padding={padding} with element at index {i} of size "
1967 f"{len(dim_paddings)}")
1968 explicit_paddings.extend(dim_paddings)
1969 if len(padding) != expected_length:
1970 raise ValueError(
1971 f"When padding is a list, it must be of size {expected_length}. "
1972 f"Received: padding={padding} of size {len(padding)}")
1973 padding = "EXPLICIT"
1974 return padding, explicit_paddings
1977@tf_export(v1=["nn.conv1d"])
1978@dispatch.add_dispatch_support
1979@deprecation.deprecated_arg_values(
1980 None,
1981 "`NCHW` for data_format is deprecated, use `NCW` instead",
1982 warn_once=True,
1983 data_format="NCHW")
1984@deprecation.deprecated_arg_values(
1985 None,
1986 "`NHWC` for data_format is deprecated, use `NWC` instead",
1987 warn_once=True,
1988 data_format="NHWC")
1989def conv1d(
1990 value=None,
1991 filters=None,
1992 stride=None,
1993 padding=None,
1994 use_cudnn_on_gpu=None,
1995 data_format=None,
1996 name=None,
1997 input=None, # pylint: disable=redefined-builtin
1998 dilations=None):
1999 r"""Computes a 1-D convolution of input with rank `>=3` and a `3-D` filter.
2001 Given an input tensor of shape
2002 `batch_shape + [in_width, in_channels]`
2003 if `data_format` is `"NWC"`, or
2004 `batch_shape + [in_channels, in_width]`
2005 if `data_format` is `"NCW"`,
2006 and a filter / kernel tensor of shape
2007 `[filter_width, in_channels, out_channels]`, this op reshapes
2008 the arguments to pass them to `conv2d` to perform the equivalent
2009 convolution operation.
2011 Internally, this op reshapes the input tensors and invokes `tf.nn.conv2d`.
2012 For example, if `data_format` does not start with "NC", a tensor of shape
2013 `batch_shape + [in_width, in_channels]`
2014 is reshaped to
2015 `batch_shape + [1, in_width, in_channels]`,
2016 and the filter is reshaped to
2017 `[1, filter_width, in_channels, out_channels]`.
2018 The result is then reshaped back to
2019 `batch_shape + [out_width, out_channels]`
2020 \(where out_width is a function of the stride and padding as in conv2d\) and
2021 returned to the caller.
2023 Args:
2024 value: A Tensor of rank at least 3. Must be of type `float16`, `float32`, or
2025 `float64`.
2026 filters: A Tensor of rank at least 3. Must have the same type as `value`.
2027 stride: An int or list of `ints` that has length `1` or `3`. The number of
2028 entries by which the filter is moved right at each step.
2029 padding: 'SAME' or 'VALID'
2030 use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
2031 data_format: An optional `string` from `"NWC", "NCW"`. Defaults to `"NWC"`,
2032 the data is stored in the order of `batch_shape + [in_width,
2033 in_channels]`. The `"NCW"` format stores data as `batch_shape +
2034 [in_channels, in_width]`.
2035 name: A name for the operation (optional).
2036 input: Alias for value.
2037 dilations: An int or list of `ints` that has length `1` or `3` which
2038 defaults to 1. The dilation factor for each dimension of input. If set to
2039 k > 1, there will be k-1 skipped cells between each filter element on that
2040 dimension. Dilations in the batch and depth dimensions must be 1.
2042 Returns:
2043 A `Tensor`. Has the same type as input.
2045 Raises:
2046 ValueError: if `data_format` is invalid.
2047 """
2048 value = deprecation.deprecated_argument_lookup("input", input, "value", value)
2049 with ops.name_scope(name, "conv1d", [value, filters]) as name:
2050 # Reshape the input tensor to batch_shape + [1, in_width, in_channels]
2051 if data_format is None or data_format == "NHWC" or data_format == "NWC":
2052 data_format = "NHWC"
2053 spatial_start_dim = -3
2054 channel_index = 2
2055 elif data_format == "NCHW" or data_format == "NCW":
2056 data_format = "NCHW"
2057 spatial_start_dim = -2
2058 channel_index = 1
2059 else:
2060 raise ValueError("`data_format` must be 'NWC' or 'NCW'. "
2061 f"Received: data_format={data_format}")
2062 strides = [1] + _get_sequence(stride, 1, channel_index, "stride")
2063 dilations = [1] + _get_sequence(dilations, 1, channel_index, "dilations")
2065 value = array_ops.expand_dims(value, spatial_start_dim)
2066 filters = array_ops.expand_dims(filters, 0)
2067 if value.shape.ndims in (4, 3, 2, 1, 0, None):
2068 result = gen_nn_ops.conv2d(
2069 value,
2070 filters,
2071 strides,
2072 padding,
2073 use_cudnn_on_gpu=use_cudnn_on_gpu,
2074 data_format=data_format,
2075 dilations=dilations,
2076 name=name)
2077 else:
2078 result = squeeze_batch_dims(
2079 value,
2080 functools.partial(
2081 gen_nn_ops.conv2d,
2082 filter=filters,
2083 strides=strides,
2084 padding=padding,
2085 use_cudnn_on_gpu=use_cudnn_on_gpu,
2086 data_format=data_format,
2087 dilations=dilations,
2088 ),
2089 inner_rank=3,
2090 name=name)
2091 return array_ops.squeeze(result, [spatial_start_dim])
2094@tf_export("nn.conv1d", v1=[])
2095@dispatch.add_dispatch_support
2096def conv1d_v2(
2097 input, # pylint: disable=redefined-builtin
2098 filters,
2099 stride,
2100 padding,
2101 data_format="NWC",
2102 dilations=None,
2103 name=None):
2104 r"""Computes a 1-D convolution given 3-D input and filter tensors.
2106 Given an input tensor of shape
2107 `batch_shape + [in_width, in_channels]`
2108 if `data_format` is `"NWC"`, or
2109 `batch_shape + [in_channels, in_width]`
2110 if `data_format` is `"NCW"`,
2111 and a filter / kernel tensor of shape
2112 `[filter_width, in_channels, out_channels]`, this op reshapes
2113 the arguments to pass them to `conv2d` to perform the equivalent
2114 convolution operation.
2116 Internally, this op reshapes the input tensors and invokes `tf.nn.conv2d`.
2117 For example, if `data_format` does not start with `"NC"`, a tensor of shape
2118 `batch_shape + [in_width, in_channels]`
2119 is reshaped to
2120 `batch_shape + [1, in_width, in_channels]`,
2121 and the filter is reshaped to
2122 `[1, filter_width, in_channels, out_channels]`.
2123 The result is then reshaped back to
2124 `batch_shape + [out_width, out_channels]`
2125 \(where out_width is a function of the stride and padding as in conv2d\) and
2126 returned to the caller.
2128 Args:
2129 input: A Tensor of rank at least 3. Must be of type `float16`, `float32`, or
2130 `float64`.
2131 filters: A Tensor of rank at least 3. Must have the same type as `input`.
2132 stride: An int or list of `ints` that has length `1` or `3`. The number of
2133 entries by which the filter is moved right at each step.
2134 padding: 'SAME' or 'VALID'. See
2135 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
2136 for more information.
2137 data_format: An optional `string` from `"NWC", "NCW"`. Defaults to `"NWC"`,
2138 the data is stored in the order of
2139 `batch_shape + [in_width, in_channels]`. The `"NCW"` format stores data
2140 as `batch_shape + [in_channels, in_width]`.
2141 dilations: An int or list of `ints` that has length `1` or `3` which
2142 defaults to 1. The dilation factor for each dimension of input. If set to
2143 k > 1, there will be k-1 skipped cells between each filter element on that
2144 dimension. Dilations in the batch and depth dimensions must be 1.
2145 name: A name for the operation (optional).
2147 Returns:
2148 A `Tensor`. Has the same type as input.
2150 Raises:
2151 ValueError: if `data_format` is invalid.
2152 """
2153 return conv1d(
2154 input, # pylint: disable=redefined-builtin
2155 filters,
2156 stride,
2157 padding,
2158 use_cudnn_on_gpu=True,
2159 data_format=data_format,
2160 name=name,
2161 dilations=dilations)
2164@tf_export("nn.conv1d_transpose")
2165@dispatch.add_dispatch_support
2166def conv1d_transpose(
2167 input, # pylint: disable=redefined-builtin
2168 filters,
2169 output_shape,
2170 strides,
2171 padding="SAME",
2172 data_format="NWC",
2173 dilations=None,
2174 name=None):
2175 """The transpose of `conv1d`.
2177 This operation is sometimes called "deconvolution" after
2178 (Zeiler et al., 2010), but is actually the transpose (gradient) of `conv1d`
2179 rather than an actual deconvolution.
2181 Args:
2182 input: A 3-D `Tensor` of type `float` and shape
2183 `[batch, in_width, in_channels]` for `NWC` data format or
2184 `[batch, in_channels, in_width]` for `NCW` data format.
2185 filters: A 3-D `Tensor` with the same type as `input` and shape
2186 `[filter_width, output_channels, in_channels]`. `filter`'s
2187 `in_channels` dimension must match that of `input`.
2188 output_shape: A 1-D `Tensor`, containing three elements, representing the
2189 output shape of the deconvolution op.
2190 strides: An int or list of `ints` that has length `1` or `3`. The number of
2191 entries by which the filter is moved right at each step.
2192 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
2193 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
2194 for more information.
2195 data_format: A string. `'NWC'` and `'NCW'` are supported.
2196 dilations: An int or list of `ints` that has length `1` or `3` which
2197 defaults to 1. The dilation factor for each dimension of input. If set to
2198 k > 1, there will be k-1 skipped cells between each filter element on that
2199 dimension. Dilations in the batch and depth dimensions must be 1.
2200 name: Optional name for the returned tensor.
2202 Returns:
2203 A `Tensor` with the same type as `input`.
2205 Raises:
2206 ValueError: If input/output depth does not match `filter`'s shape, if
2207 `output_shape` is not at 3-element vector, if `padding` is other than
2208 `'VALID'` or `'SAME'`, or if `data_format` is invalid.
2210 References:
2211 Deconvolutional Networks:
2212 [Zeiler et al., 2010]
2213 (https://ieeexplore.ieee.org/abstract/document/5539957)
2214 ([pdf]
2215 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.4023&rep=rep1&type=pdf))
2216 """
2217 with ops.name_scope(name, "conv1d_transpose",
2218 [input, filters, output_shape]) as name:
2219 # The format could be either NWC or NCW, map to NHWC or NCHW
2220 if data_format is None or data_format == "NWC":
2221 data_format = "NHWC"
2222 spatial_start_dim = 1
2223 channel_index = 2
2224 elif data_format == "NCW":
2225 data_format = "NCHW"
2226 spatial_start_dim = 2
2227 channel_index = 1
2228 else:
2229 raise ValueError("`data_format` must be 'NWC' or 'NCW'. "
2230 f"Received: data_format={data_format}")
2232 # Reshape the input tensor to [batch, 1, in_width, in_channels]
2233 strides = [1] + _get_sequence(strides, 1, channel_index, "stride")
2234 dilations = [1] + _get_sequence(dilations, 1, channel_index, "dilations")
2236 input = array_ops.expand_dims(input, spatial_start_dim)
2237 filters = array_ops.expand_dims(filters, 0)
2238 output_shape = list(output_shape) if not isinstance(
2239 output_shape, ops.Tensor) else output_shape
2240 output_shape = array_ops.concat([output_shape[: spatial_start_dim], [1],
2241 output_shape[spatial_start_dim:]], 0)
2243 result = gen_nn_ops.conv2d_backprop_input(
2244 input_sizes=output_shape,
2245 filter=filters,
2246 out_backprop=input,
2247 strides=strides,
2248 padding=padding,
2249 data_format=data_format,
2250 dilations=dilations,
2251 name=name)
2252 return array_ops.squeeze(result, spatial_start_dim)
2255@tf_export("nn.conv2d", v1=[])
2256@dispatch.add_dispatch_support
2257def conv2d_v2(input, # pylint: disable=redefined-builtin
2258 filters,
2259 strides,
2260 padding,
2261 data_format="NHWC",
2262 dilations=None,
2263 name=None):
2264 # pylint: disable=line-too-long
2265 r"""Computes a 2-D convolution given `input` and 4-D `filters` tensors.
2267 The `input` tensor may have rank `4` or higher, where shape dimensions `[:-3]`
2268 are considered batch dimensions (`batch_shape`).
2270 Given an input tensor of shape
2271 `batch_shape + [in_height, in_width, in_channels]` and a filter / kernel
2272 tensor of shape `[filter_height, filter_width, in_channels, out_channels]`,
2273 this op performs the following:
2275 1. Flattens the filter to a 2-D matrix with shape
2276 `[filter_height * filter_width * in_channels, output_channels]`.
2277 2. Extracts image patches from the input tensor to form a *virtual*
2278 tensor of shape `[batch, out_height, out_width,
2279 filter_height * filter_width * in_channels]`.
2280 3. For each patch, right-multiplies the filter matrix and the image patch
2281 vector.
2283 In detail, with the default NHWC format,
2285 output[b, i, j, k] =
2286 sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
2287 filter[di, dj, q, k]
2289 Must have `strides[0] = strides[3] = 1`. For the most common case of the same
2290 horizontal and vertical strides, `strides = [1, stride, stride, 1]`.
2292 Usage Example:
2294 >>> x_in = np.array([[
2295 ... [[2], [1], [2], [0], [1]],
2296 ... [[1], [3], [2], [2], [3]],
2297 ... [[1], [1], [3], [3], [0]],
2298 ... [[2], [2], [0], [1], [1]],
2299 ... [[0], [0], [3], [1], [2]], ]])
2300 >>> kernel_in = np.array([
2301 ... [ [[2, 0.1]], [[3, 0.2]] ],
2302 ... [ [[0, 0.3]], [[1, 0.4]] ], ])
2303 >>> x = tf.constant(x_in, dtype=tf.float32)
2304 >>> kernel = tf.constant(kernel_in, dtype=tf.float32)
2305 >>> tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding='VALID')
2306 <tf.Tensor: shape=(1, 4, 4, 2), dtype=float32, numpy=..., dtype=float32)>
2308 Args:
2309 input: A `Tensor`. Must be one of the following types:
2310 `half`, `bfloat16`, `float32`, `float64`.
2311 A Tensor of rank at least 4. The dimension order is interpreted according
2312 to the value of `data_format`; with the all-but-inner-3 dimensions acting
2313 as batch dimensions. See below for details.
2314 filters: A `Tensor`. Must have the same type as `input`.
2315 A 4-D tensor of shape
2316 `[filter_height, filter_width, in_channels, out_channels]`
2317 strides: An int or list of `ints` that has length `1`, `2` or `4`. The
2318 stride of the sliding window for each dimension of `input`. If a single
2319 value is given it is replicated in the `H` and `W` dimension. By default
2320 the `N` and `C` dimensions are set to 1. The dimension order is determined
2321 by the value of `data_format`, see below for details.
2322 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
2323 padding algorithm to use, or a list indicating the explicit paddings at
2324 the start and end of each dimension. See
2325 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
2326 for more information. When explicit padding is used and data_format is
2327 `"NHWC"`, this should be in the form `[[0, 0], [pad_top, pad_bottom],
2328 [pad_left, pad_right], [0, 0]]`. When explicit padding used and
2329 data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
2330 [pad_top, pad_bottom], [pad_left, pad_right]]`.
2331 data_format: An optional `string` from: `"NHWC", "NCHW"`.
2332 Defaults to `"NHWC"`.
2333 Specify the data format of the input and output data. With the
2334 default format "NHWC", the data is stored in the order of:
2335 `batch_shape + [height, width, channels]`.
2336 Alternatively, the format could be "NCHW", the data storage order of:
2337 `batch_shape + [channels, height, width]`.
2338 dilations: An int or list of `ints` that has length `1`, `2` or `4`,
2339 defaults to 1. The dilation factor for each dimension of`input`. If a
2340 single value is given it is replicated in the `H` and `W` dimension. By
2341 default the `N` and `C` dimensions are set to 1. If set to k > 1, there
2342 will be k-1 skipped cells between each filter element on that dimension.
2343 The dimension order is determined by the value of `data_format`, see above
2344 for details. Dilations in the batch and depth dimensions if a 4-d tensor
2345 must be 1.
2346 name: A name for the operation (optional).
2348 Returns:
2349 A `Tensor`. Has the same type as `input` and the same outer batch shape.
2350 """
2351 # pylint: enable=line-too-long
2352 return conv2d(input, # pylint: disable=redefined-builtin
2353 filters,
2354 strides,
2355 padding,
2356 use_cudnn_on_gpu=True,
2357 data_format=data_format,
2358 dilations=dilations,
2359 name=name)
2362@tf_export(v1=["nn.conv2d"])
2363@dispatch.add_dispatch_support
2364def conv2d( # pylint: disable=redefined-builtin,dangerous-default-value
2365 input,
2366 filter=None,
2367 strides=None,
2368 padding=None,
2369 use_cudnn_on_gpu=True,
2370 data_format="NHWC",
2371 dilations=[1, 1, 1, 1],
2372 name=None,
2373 filters=None):
2374 r"""Computes a 2-D convolution given 4-D `input` and `filter` tensors.
2376 Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
2377 and a filter / kernel tensor of shape
2378 `[filter_height, filter_width, in_channels, out_channels]`, this op
2379 performs the following:
2381 1. Flattens the filter to a 2-D matrix with shape
2382 `[filter_height * filter_width * in_channels, output_channels]`.
2383 2. Extracts image patches from the input tensor to form a *virtual*
2384 tensor of shape `[batch, out_height, out_width,
2385 filter_height * filter_width * in_channels]`.
2386 3. For each patch, right-multiplies the filter matrix and the image patch
2387 vector.
2389 In detail, with the default NHWC format,
2391 output[b, i, j, k] =
2392 sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q]
2393 * filter[di, dj, q, k]
2395 Must have `strides[0] = strides[3] = 1`. For the most common case of the same
2396 horizontal and vertical strides, `strides = [1, stride, stride, 1]`.
2398 Args:
2399 input: A `Tensor`. Must be one of the following types:
2400 `half`, `bfloat16`, `float32`, `float64`.
2401 A 4-D tensor. The dimension order is interpreted according to the value
2402 of `data_format`, see below for details.
2403 filter: A `Tensor`. Must have the same type as `input`.
2404 A 4-D tensor of shape
2405 `[filter_height, filter_width, in_channels, out_channels]`
2406 strides: An int or list of `ints` that has length `1`, `2` or `4`. The
2407 stride of the sliding window for each dimension of `input`. If a single
2408 value is given it is replicated in the `H` and `W` dimension. By default
2409 the `N` and `C` dimensions are set to 1. The dimension order is determined
2410 by the value of `data_format`, see below for details.
2411 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
2412 padding algorithm to use, or a list indicating the explicit paddings at
2413 the start and end of each dimension. When explicit padding is used and
2414 data_format is `"NHWC"`, this should be in the form `[[0, 0], [pad_top,
2415 pad_bottom], [pad_left, pad_right], [0, 0]]`. When explicit padding used
2416 and data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
2417 [pad_top, pad_bottom], [pad_left, pad_right]]`.
2418 use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
2419 data_format: An optional `string` from: `"NHWC", "NCHW"`.
2420 Defaults to `"NHWC"`.
2421 Specify the data format of the input and output data. With the
2422 default format "NHWC", the data is stored in the order of:
2423 [batch, height, width, channels].
2424 Alternatively, the format could be "NCHW", the data storage order of:
2425 [batch, channels, height, width].
2426 dilations: An int or list of `ints` that has length `1`, `2` or `4`,
2427 defaults to 1. The dilation factor for each dimension of`input`. If a
2428 single value is given it is replicated in the `H` and `W` dimension. By
2429 default the `N` and `C` dimensions are set to 1. If set to k > 1, there
2430 will be k-1 skipped cells between each filter element on that dimension.
2431 The dimension order is determined by the value of `data_format`, see above
2432 for details. Dilations in the batch and depth dimensions if a 4-d tensor
2433 must be 1.
2434 name: A name for the operation (optional).
2435 filters: Alias for filter.
2437 Returns:
2438 A `Tensor`. Has the same type as `input`.
2439 """
2440 filter = deprecation.deprecated_argument_lookup(
2441 "filters", filters, "filter", filter)
2442 padding, explicit_paddings = convert_padding(padding)
2443 if data_format is None:
2444 data_format = "NHWC"
2445 channel_index = 1 if data_format.startswith("NC") else 3
2447 strides = _get_sequence(strides, 2, channel_index, "strides")
2448 dilations = _get_sequence(dilations, 2, channel_index, "dilations")
2450 shape = input.shape
2451 # shape object may lack ndims, e.g., if input is an np.ndarray. In that case,
2452 # we fall back to len(shape).
2453 ndims = getattr(shape, "ndims", -1)
2454 if ndims == -1:
2455 ndims = len(shape)
2456 if ndims in (4, 3, 2, 1, 0, None):
2457 # We avoid calling squeeze_batch_dims to reduce extra python function
2458 # call slowdown in eager mode. This branch doesn't require reshapes.
2459 return gen_nn_ops.conv2d(
2460 input,
2461 filter=filter,
2462 strides=strides,
2463 padding=padding,
2464 use_cudnn_on_gpu=use_cudnn_on_gpu,
2465 explicit_paddings=explicit_paddings,
2466 data_format=data_format,
2467 dilations=dilations,
2468 name=name)
2469 return squeeze_batch_dims(
2470 input,
2471 functools.partial(
2472 gen_nn_ops.conv2d,
2473 filter=filter,
2474 strides=strides,
2475 padding=padding,
2476 use_cudnn_on_gpu=use_cudnn_on_gpu,
2477 explicit_paddings=explicit_paddings,
2478 data_format=data_format,
2479 dilations=dilations),
2480 inner_rank=3,
2481 name=name)
2484@tf_export(v1=["nn.conv2d_backprop_filter"])
2485@dispatch.add_dispatch_support
2486def conv2d_backprop_filter( # pylint: disable=redefined-builtin,dangerous-default-value
2487 input,
2488 filter_sizes,
2489 out_backprop,
2490 strides,
2491 padding,
2492 use_cudnn_on_gpu=True,
2493 data_format="NHWC",
2494 dilations=[1, 1, 1, 1],
2495 name=None):
2496 r"""Computes the gradients of convolution with respect to the filter.
2498 Args:
2499 input: A `Tensor`. Must be one of the following types:
2500 `half`, `bfloat16`, `float32`, `float64`.
2501 4-D with shape `[batch, in_height, in_width, in_channels]`.
2502 filter_sizes: A `Tensor` of type `int32`.
2503 An integer vector representing the tensor shape of `filter`,
2504 where `filter` is a 4-D
2505 `[filter_height, filter_width, in_channels, out_channels]` tensor.
2506 out_backprop: A `Tensor`. Must have the same type as `input`.
2507 4-D with shape `[batch, out_height, out_width, out_channels]`.
2508 Gradients w.r.t. the output of the convolution.
2509 strides: A list of `ints`.
2510 The stride of the sliding window for each dimension of the input
2511 of the convolution. Must be in the same order as the dimension specified
2512 with format.
2513 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
2514 padding algorithm to use, or a list indicating the explicit paddings at
2515 the start and end of each dimension. When explicit padding is used and
2516 data_format is `"NHWC"`, this should be in the form `[[0, 0], [pad_top,
2517 pad_bottom], [pad_left, pad_right], [0, 0]]`. When explicit padding used
2518 and data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
2519 [pad_top, pad_bottom], [pad_left, pad_right]]`.
2520 use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
2521 data_format: An optional `string` from: `"NHWC", "NCHW"`.
2522 Defaults to `"NHWC"`.
2523 Specify the data format of the input and output data. With the
2524 default format "NHWC", the data is stored in the order of:
2525 [batch, in_height, in_width, in_channels].
2526 Alternatively, the format could be "NCHW", the data storage order of:
2527 [batch, in_channels, in_height, in_width].
2528 dilations: An optional list of `ints`. Defaults to `[1, 1, 1, 1]`.
2529 1-D tensor of length 4. The dilation factor for each dimension of
2530 `input`. If set to k > 1, there will be k-1 skipped cells between each
2531 filter element on that dimension. The dimension order is determined by
2532 the value of `data_format`, see above for details. Dilations in the batch
2533 and depth dimensions must be 1.
2534 name: A name for the operation (optional).
2536 Returns:
2537 A `Tensor`. Has the same type as `input`.
2538 """
2539 padding, explicit_paddings = convert_padding(padding)
2540 return gen_nn_ops.conv2d_backprop_filter(
2541 input, filter_sizes, out_backprop, strides, padding, use_cudnn_on_gpu,
2542 explicit_paddings, data_format, dilations, name)
2545@tf_export(v1=["nn.conv2d_backprop_input"])
2546@dispatch.add_dispatch_support
2547def conv2d_backprop_input( # pylint: disable=redefined-builtin,dangerous-default-value
2548 input_sizes,
2549 filter=None,
2550 out_backprop=None,
2551 strides=None,
2552 padding=None,
2553 use_cudnn_on_gpu=True,
2554 data_format="NHWC",
2555 dilations=[1, 1, 1, 1],
2556 name=None,
2557 filters=None):
2558 r"""Computes the gradients of convolution with respect to the input.
2560 Args:
2561 input_sizes: A `Tensor` of type `int32`.
2562 An integer vector representing the shape of `input`,
2563 where `input` is a 4-D `[batch, height, width, channels]` tensor.
2564 filter: A `Tensor`. Must be one of the following types:
2565 `half`, `bfloat16`, `float32`, `float64`.
2566 4-D with shape
2567 `[filter_height, filter_width, in_channels, out_channels]`.
2568 out_backprop: A `Tensor`. Must have the same type as `filter`.
2569 4-D with shape `[batch, out_height, out_width, out_channels]`.
2570 Gradients w.r.t. the output of the convolution.
2571 strides: A list of `ints`.
2572 The stride of the sliding window for each dimension of the input
2573 of the convolution. Must be in the same order as the dimension specified
2574 with format.
2575 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
2576 padding algorithm to use, or a list indicating the explicit paddings at
2577 the start and end of each dimension. When explicit padding is used and
2578 data_format is `"NHWC"`, this should be in the form `[[0, 0], [pad_top,
2579 pad_bottom], [pad_left, pad_right], [0, 0]]`. When explicit padding used
2580 and data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
2581 [pad_top, pad_bottom], [pad_left, pad_right]]`.
2582 use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
2583 data_format: An optional `string` from: `"NHWC", "NCHW"`.
2584 Defaults to `"NHWC"`.
2585 Specify the data format of the input and output data. With the
2586 default format "NHWC", the data is stored in the order of:
2587 [batch, in_height, in_width, in_channels].
2588 Alternatively, the format could be "NCHW", the data storage order of:
2589 [batch, in_channels, in_height, in_width].
2590 dilations: An optional list of `ints`. Defaults to `[1, 1, 1, 1]`.
2591 1-D tensor of length 4. The dilation factor for each dimension of
2592 `input`. If set to k > 1, there will be k-1 skipped cells between each
2593 filter element on that dimension. The dimension order is determined by
2594 the value of `data_format`, see above for details. Dilations in the batch
2595 and depth dimensions must be 1.
2596 name: A name for the operation (optional).
2597 filters: Alias for filter.
2599 Returns:
2600 A `Tensor`. Has the same type as `filter`.
2601 """
2602 filter = deprecation.deprecated_argument_lookup(
2603 "filters", filters, "filter", filter)
2604 padding, explicit_paddings = convert_padding(padding)
2605 return gen_nn_ops.conv2d_backprop_input(
2606 input_sizes, filter, out_backprop, strides, padding, use_cudnn_on_gpu,
2607 explicit_paddings, data_format, dilations, name)
2610@tf_export(v1=["nn.conv2d_transpose"])
2611@dispatch.add_dispatch_support
2612def conv2d_transpose(
2613 value=None,
2614 filter=None, # pylint: disable=redefined-builtin
2615 output_shape=None,
2616 strides=None,
2617 padding="SAME",
2618 data_format="NHWC",
2619 name=None,
2620 input=None, # pylint: disable=redefined-builtin
2621 filters=None,
2622 dilations=None):
2623 """The transpose of `conv2d`.
2625 This operation is sometimes called "deconvolution" after
2626 (Zeiler et al., 2010), but is really the transpose (gradient) of `conv2d`
2627 rather than an actual deconvolution.
2629 Args:
2630 value: A 4-D `Tensor` of type `float` and shape
2631 `[batch, height, width, in_channels]` for `NHWC` data format or
2632 `[batch, in_channels, height, width]` for `NCHW` data format.
2633 filter: A 4-D `Tensor` with the same type as `value` and shape
2634 `[height, width, output_channels, in_channels]`. `filter`'s
2635 `in_channels` dimension must match that of `value`.
2636 output_shape: A 1-D `Tensor` representing the output shape of the
2637 deconvolution op.
2638 strides: An int or list of `ints` that has length `1`, `2` or `4`. The
2639 stride of the sliding window for each dimension of `input`. If a single
2640 value is given it is replicated in the `H` and `W` dimension. By default
2641 the `N` and `C` dimensions are set to 0. The dimension order is determined
2642 by the value of `data_format`, see below for details.
2643 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
2644 See the "returns" section of `tf.nn.convolution` for details.
2645 data_format: A string. 'NHWC' and 'NCHW' are supported.
2646 name: Optional name for the returned tensor.
2647 input: Alias for value.
2648 filters: Alias for filter.
2649 dilations: An int or list of `ints` that has length `1`, `2` or `4`,
2650 defaults to 1. The dilation factor for each dimension of`input`. If a
2651 single value is given it is replicated in the `H` and `W` dimension. By
2652 default the `N` and `C` dimensions are set to 1. If set to k > 1, there
2653 will be k-1 skipped cells between each filter element on that dimension.
2654 The dimension order is determined by the value of `data_format`, see above
2655 for details. Dilations in the batch and depth dimensions if a 4-d tensor
2656 must be 1.
2658 Returns:
2659 A `Tensor` with the same type as `value`.
2661 Raises:
2662 ValueError: If input/output depth does not match `filter`'s shape, or if
2663 padding is other than `'VALID'` or `'SAME'`.
2665 References:
2666 Deconvolutional Networks:
2667 [Zeiler et al., 2010]
2668 (https://ieeexplore.ieee.org/abstract/document/5539957)
2669 ([pdf]
2670 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.4023&rep=rep1&type=pdf))
2671 """
2672 value = deprecated_argument_lookup("input", input, "value", value)
2673 filter = deprecated_argument_lookup("filters", filters, "filter", filter)
2674 with ops.name_scope(name, "conv2d_transpose",
2675 [value, filter, output_shape]) as name:
2676 return conv2d_transpose_v2(
2677 value,
2678 filter,
2679 output_shape,
2680 strides,
2681 padding=padding,
2682 data_format=data_format,
2683 dilations=dilations,
2684 name=name)
2687@tf_export("nn.conv2d_transpose", v1=[])
2688@dispatch.add_dispatch_support
2689def conv2d_transpose_v2(
2690 input, # pylint: disable=redefined-builtin
2691 filters, # pylint: disable=redefined-builtin
2692 output_shape,
2693 strides,
2694 padding="SAME",
2695 data_format="NHWC",
2696 dilations=None,
2697 name=None):
2698 """The transpose of `conv2d`.
2700 This operation is sometimes called "deconvolution" after
2701 (Zeiler et al., 2010), but is really the transpose (gradient) of
2702 `atrous_conv2d` rather than an actual deconvolution.
2704 Args:
2705 input: A 4-D `Tensor` of type `float` and shape `[batch, height, width,
2706 in_channels]` for `NHWC` data format or `[batch, in_channels, height,
2707 width]` for `NCHW` data format.
2708 filters: A 4-D `Tensor` with the same type as `input` and shape `[height,
2709 width, output_channels, in_channels]`. `filter`'s `in_channels` dimension
2710 must match that of `input`.
2711 output_shape: A 1-D `Tensor` representing the output shape of the
2712 deconvolution op.
2713 strides: An int or list of `ints` that has length `1`, `2` or `4`. The
2714 stride of the sliding window for each dimension of `input`. If a single
2715 value is given it is replicated in the `H` and `W` dimension. By default
2716 the `N` and `C` dimensions are set to 0. The dimension order is determined
2717 by the value of `data_format`, see below for details.
2718 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
2719 padding algorithm to use, or a list indicating the explicit paddings at
2720 the start and end of each dimension. See
2721 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
2722 for more information. When explicit padding is used and data_format is
2723 `"NHWC"`, this should be in the form `[[0, 0], [pad_top, pad_bottom],
2724 [pad_left, pad_right], [0, 0]]`. When explicit padding used and
2725 data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
2726 [pad_top, pad_bottom], [pad_left, pad_right]]`.
2727 data_format: A string. 'NHWC' and 'NCHW' are supported.
2728 dilations: An int or list of `ints` that has length `1`, `2` or `4`,
2729 defaults to 1. The dilation factor for each dimension of`input`. If a
2730 single value is given it is replicated in the `H` and `W` dimension. By
2731 default the `N` and `C` dimensions are set to 1. If set to k > 1, there
2732 will be k-1 skipped cells between each filter element on that dimension.
2733 The dimension order is determined by the value of `data_format`, see above
2734 for details. Dilations in the batch and depth dimensions if a 4-d tensor
2735 must be 1.
2736 name: Optional name for the returned tensor.
2738 Returns:
2739 A `Tensor` with the same type as `input`.
2741 Raises:
2742 ValueError: If input/output depth does not match `filter`'s shape, or if
2743 padding is other than `'VALID'` or `'SAME'`.
2745 References:
2746 Deconvolutional Networks:
2747 [Zeiler et al., 2010]
2748 (https://ieeexplore.ieee.org/abstract/document/5539957)
2749 ([pdf]
2750 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.4023&rep=rep1&type=pdf))
2751 """
2752 with ops.name_scope(name, "conv2d_transpose",
2753 [input, filter, output_shape]) as name:
2754 if data_format is None:
2755 data_format = "NHWC"
2756 channel_index = 1 if data_format.startswith("NC") else 3
2758 strides = _get_sequence(strides, 2, channel_index, "strides")
2759 dilations = _get_sequence(dilations, 2, channel_index, "dilations")
2760 padding, explicit_paddings = convert_padding(padding)
2762 return gen_nn_ops.conv2d_backprop_input(
2763 input_sizes=output_shape,
2764 filter=filters,
2765 out_backprop=input,
2766 strides=strides,
2767 padding=padding,
2768 explicit_paddings=explicit_paddings,
2769 data_format=data_format,
2770 dilations=dilations,
2771 name=name)
2774def _conv2d_expanded_batch(
2775 input, # pylint: disable=redefined-builtin
2776 filters,
2777 strides,
2778 padding,
2779 data_format,
2780 dilations,
2781 name):
2782 """Helper function for `convolution_internal`; handles expanded batches."""
2783 # Try really hard to avoid modifying the legacy name scopes - return early.
2784 input_rank = input.shape.rank
2785 if input_rank is None or input_rank < 5:
2786 # We avoid calling squeeze_batch_dims to reduce extra python function
2787 # call slowdown in eager mode. This branch doesn't require reshapes.
2788 return gen_nn_ops.conv2d(
2789 input,
2790 filter=filters,
2791 strides=strides,
2792 padding=padding,
2793 data_format=data_format,
2794 dilations=dilations,
2795 name=name)
2796 return squeeze_batch_dims(
2797 input,
2798 functools.partial(
2799 gen_nn_ops.conv2d,
2800 filter=filters,
2801 strides=strides,
2802 padding=padding,
2803 data_format=data_format,
2804 dilations=dilations),
2805 inner_rank=3,
2806 name=name)
2809@tf_export("nn.atrous_conv2d_transpose")
2810@dispatch.add_dispatch_support
2811def atrous_conv2d_transpose(value,
2812 filters,
2813 output_shape,
2814 rate,
2815 padding,
2816 name=None):
2817 """The transpose of `atrous_conv2d`.
2819 This operation is sometimes called "deconvolution" after
2820 (Zeiler et al., 2010), but is really the transpose (gradient) of
2821 `atrous_conv2d` rather than an actual deconvolution.
2823 Args:
2824 value: A 4-D `Tensor` of type `float`. It needs to be in the default `NHWC`
2825 format. Its shape is `[batch, in_height, in_width, in_channels]`.
2826 filters: A 4-D `Tensor` with the same type as `value` and shape
2827 `[filter_height, filter_width, out_channels, in_channels]`. `filters`'
2828 `in_channels` dimension must match that of `value`. Atrous convolution is
2829 equivalent to standard convolution with upsampled filters with effective
2830 height `filter_height + (filter_height - 1) * (rate - 1)` and effective
2831 width `filter_width + (filter_width - 1) * (rate - 1)`, produced by
2832 inserting `rate - 1` zeros along consecutive elements across the
2833 `filters`' spatial dimensions.
2834 output_shape: A 1-D `Tensor` of shape representing the output shape of the
2835 deconvolution op, of form `[batch, out_height, out_width, out_channels]`.
2836 rate: A positive int32. The stride with which we sample input values across
2837 the `height` and `width` dimensions. Equivalently, the rate by which we
2838 upsample the filter values by inserting zeros across the `height` and
2839 `width` dimensions. In the literature, the same parameter is sometimes
2840 called `input stride` or `dilation`.
2841 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
2842 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
2843 for more information.
2844 name: Optional name for the returned tensor.
2846 Returns:
2847 A `Tensor` with the same type as `value`.
2849 Raises:
2850 ValueError: If input/output depth does not match `filters`' shape, or if
2851 padding is other than `'VALID'` or `'SAME'`, or if the `rate` is less
2852 than one, or if the output_shape is not a tensor with 4 elements.
2854 References:
2855 Deconvolutional Networks:
2856 [Zeiler et al., 2010]
2857 (https://ieeexplore.ieee.org/abstract/document/5539957)
2858 ([pdf]
2859 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.4023&rep=rep1&type=pdf))
2860 """
2861 with ops.name_scope(name, "atrous_conv2d_transpose",
2862 [value, filters, output_shape]) as name:
2863 value = ops.convert_to_tensor(value, name="value")
2864 filters = ops.convert_to_tensor(filters, name="filters")
2865 if not value.get_shape().dims[3].is_compatible_with(filters.get_shape()[3]):
2866 raise ValueError(
2867 "`value` channel count must be compatible with `filters` input "
2868 f"channel count. Received: value.shape={value.get_shape()} with "
2869 f"channel count {value.get_shape()[3]} and "
2870 f"filters.shape={filters.get_shape()} with input channel count "
2871 f"{filters.get_shape()[3]}.")
2872 if rate < 1:
2873 raise ValueError(f"`rate` cannot be less than one. Received: rate={rate}")
2875 if rate == 1:
2876 return conv2d_transpose(
2877 value,
2878 filters,
2879 output_shape,
2880 strides=[1, 1, 1, 1],
2881 padding=padding,
2882 data_format="NHWC")
2884 output_shape_ = ops.convert_to_tensor(output_shape, name="output_shape")
2885 if not output_shape_.get_shape().is_compatible_with(
2886 tensor_shape.TensorShape([4])):
2887 raise ValueError("`output_shape` must have shape (4,). "
2888 f"Received: output_shape={output_shape_.get_shape()}")
2890 if isinstance(output_shape, tuple):
2891 output_shape = list(output_shape)
2893 if isinstance(output_shape, (list, np.ndarray)):
2894 # output_shape's shape should be == [4] if reached this point.
2895 if not filters.get_shape().dims[2].is_compatible_with(output_shape[3]):
2896 raise ValueError(
2897 "`output_shape` channel count must be compatible with `filters` "
2898 f"output channel count. Received: output_shape={output_shape} with "
2899 f"channel count {output_shape[3]} and "
2900 f"filters.shape={filters.get_shape()} with output channel count "
2901 f"{filters.get_shape()[3]}.")
2903 # We have two padding contributions. The first is used for converting "SAME"
2904 # to "VALID". The second is required so that the height and width of the
2905 # zero-padded value tensor are multiples of rate.
2907 # Padding required to reduce to "VALID" convolution
2908 if padding == "SAME":
2909 # Handle filters whose shape is unknown during graph creation.
2910 if filters.get_shape().is_fully_defined():
2911 filter_shape = filters.get_shape().as_list()
2912 else:
2913 filter_shape = array_ops.shape(filters)
2914 filter_height, filter_width = filter_shape[0], filter_shape[1]
2916 # Spatial dimensions of the filters and the upsampled filters in which we
2917 # introduce (rate - 1) zeros between consecutive filter values.
2918 filter_height_up = filter_height + (filter_height - 1) * (rate - 1)
2919 filter_width_up = filter_width + (filter_width - 1) * (rate - 1)
2921 pad_height = filter_height_up - 1
2922 pad_width = filter_width_up - 1
2924 # When pad_height (pad_width) is odd, we pad more to bottom (right),
2925 # following the same convention as conv2d().
2926 pad_top = pad_height // 2
2927 pad_bottom = pad_height - pad_top
2928 pad_left = pad_width // 2
2929 pad_right = pad_width - pad_left
2930 elif padding == "VALID":
2931 pad_top = 0
2932 pad_bottom = 0
2933 pad_left = 0
2934 pad_right = 0
2935 else:
2936 raise ValueError("`padding` must be either 'VALID' or 'SAME'. "
2937 f"Received: padding={padding}")
2939 in_height = output_shape[1] + pad_top + pad_bottom
2940 in_width = output_shape[2] + pad_left + pad_right
2942 # More padding so that rate divides the height and width of the input.
2943 pad_bottom_extra = (rate - in_height % rate) % rate
2944 pad_right_extra = (rate - in_width % rate) % rate
2946 # The paddings argument to space_to_batch is just the extra padding
2947 # component.
2948 space_to_batch_pad = [[0, pad_bottom_extra], [0, pad_right_extra]]
2950 value = array_ops.space_to_batch(
2951 input=value, paddings=space_to_batch_pad, block_size=rate)
2953 input_sizes = [
2954 rate * rate * output_shape[0], (in_height + pad_bottom_extra) // rate,
2955 (in_width + pad_right_extra) // rate, output_shape[3]
2956 ]
2958 value = gen_nn_ops.conv2d_backprop_input(
2959 input_sizes=input_sizes,
2960 filter=filters,
2961 out_backprop=value,
2962 strides=[1, 1, 1, 1],
2963 padding="VALID",
2964 data_format="NHWC")
2966 # The crops argument to batch_to_space includes both padding components.
2967 batch_to_space_crop = [[pad_top, pad_bottom + pad_bottom_extra],
2968 [pad_left, pad_right + pad_right_extra]]
2970 return array_ops.batch_to_space(
2971 input=value, crops=batch_to_space_crop, block_size=rate)
2974@tf_export(v1=["nn.depthwise_conv2d_native"])
2975@dispatch.add_dispatch_support
2976@deprecation.deprecated_endpoints("nn.depthwise_conv2d_native")
2977def depthwise_conv2d_native( # pylint: disable=redefined-builtin,dangerous-default-value
2978 input,
2979 filter,
2980 strides,
2981 padding,
2982 data_format="NHWC",
2983 dilations=[1, 1, 1, 1],
2984 name=None):
2985 r"""Computes a 2-D depthwise convolution.
2987 Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
2988 and a filter / kernel tensor of shape
2989 `[filter_height, filter_width, in_channels, channel_multiplier]`, containing
2990 `in_channels` convolutional filters of depth 1, `depthwise_conv2d` applies
2991 a different filter to each input channel (expanding from 1 channel to
2992 `channel_multiplier` channels for each), then concatenates the results
2993 together. Thus, the output has `in_channels * channel_multiplier` channels.
2995 ```
2996 for k in 0..in_channels-1
2997 for q in 0..channel_multiplier-1
2998 output[b, i, j, k * channel_multiplier + q] =
2999 sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, k] *
3000 filter[di, dj, k, q]
3001 ```
3003 Must have `strides[0] = strides[3] = 1`. For the most common case of the same
3004 horizontal and vertices strides, `strides = [1, stride, stride, 1]`.
3006 Args:
3007 input: A `Tensor`. Must be one of the following types: `half`, `bfloat16`,
3008 `float32`, `float64`.
3009 filter: A `Tensor`. Must have the same type as `input`.
3010 strides: A list of `ints`. 1-D of length 4. The stride of the sliding
3011 window for each dimension of `input`.
3012 padding: Controls how to pad the image before applying the convolution. Can
3013 be the string `"SAME"` or `"VALID"` indicating the type of padding
3014 algorithm to use, or a list indicating the explicit paddings at the start
3015 and end of each dimension. When explicit padding is used and data_format
3016 is `"NHWC"`, this should be in the form `[[0, 0], [pad_top, pad_bottom],
3017 [pad_left, pad_right], [0, 0]]`. When explicit padding used and
3018 data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
3019 [pad_top, pad_bottom], [pad_left, pad_right]]`.
3020 data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to
3021 `"NHWC"`. Specify the data format of the input and output data. With the
3022 default format "NHWC", the data is stored in the order of: [batch, height,
3023 width, channels].
3024 Alternatively, the format could be "NCHW", the data storage order of:
3025 [batch, channels, height, width].
3026 dilations: An optional list of `ints`. Defaults to `[1, 1, 1, 1]`. 1-D
3027 tensor of length 4. The dilation factor for each dimension of `input`. If
3028 set to k > 1, there will be k-1 skipped cells between each filter element
3029 on that dimension. The dimension order is determined by the value of
3030 `data_format`, see above for details. Dilations in the batch and depth
3031 dimensions must be 1.
3032 name: A name for the operation (optional).
3034 Returns:
3035 A `Tensor`. Has the same type as `input`.
3036 """
3037 padding, explicit_paddings = convert_padding(padding)
3038 return gen_nn_ops.depthwise_conv2d_native(
3039 input,
3040 filter,
3041 strides,
3042 padding,
3043 explicit_paddings=explicit_paddings,
3044 data_format=data_format,
3045 dilations=dilations,
3046 name=name)
3049@tf_export(
3050 "nn.depthwise_conv2d_backprop_input",
3051 v1=[
3052 "nn.depthwise_conv2d_native_backprop_input",
3053 "nn.depthwise_conv2d_backprop_input"
3054 ])
3055@dispatch.add_dispatch_support
3056@deprecation.deprecated_endpoints("nn.depthwise_conv2d_native_backprop_input")
3057def depthwise_conv2d_native_backprop_input( # pylint: disable=redefined-builtin,dangerous-default-value
3058 input_sizes,
3059 filter,
3060 out_backprop,
3061 strides,
3062 padding,
3063 data_format="NHWC",
3064 dilations=[1, 1, 1, 1],
3065 name=None):
3066 r"""Computes the gradients of depthwise convolution with respect to the input.
3068 Args:
3069 input_sizes: A `Tensor` of type `int32`. An integer vector representing the
3070 shape of `input`, based on `data_format`. For example, if `data_format`
3071 is 'NHWC' then `input` is a 4-D `[batch, height, width, channels]` tensor.
3072 filter: A `Tensor`. Must be one of the following types: `half`, `bfloat16`,
3073 `float32`, `float64`. 4-D with shape `[filter_height, filter_width,
3074 in_channels, depthwise_multiplier]`.
3075 out_backprop: A `Tensor`. Must have the same type as `filter`. 4-D with
3076 shape based on `data_format`. For example, if `data_format` is 'NHWC'
3077 then out_backprop shape is `[batch, out_height, out_width, out_channels]`.
3078 Gradients w.r.t. the output of the convolution.
3079 strides: A list of `ints`. The stride of the sliding window for each
3080 dimension of the input of the convolution.
3081 padding: Controls how to pad the image before applying the convolution. Can
3082 be the string `"SAME"` or `"VALID"` indicating the type of padding
3083 algorithm to use, or a list indicating the explicit paddings at the start
3084 and end of each dimension. See
3085 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
3086 for more information. When explicit padding is used and data_format is
3087 `"NHWC"`, this should be in the form `[[0, 0], [pad_top, pad_bottom],
3088 [pad_left, pad_right], [0, 0]]`. When explicit padding used and
3089 data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
3090 [pad_top, pad_bottom], [pad_left, pad_right]]`.
3091 data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to
3092 `"NHWC"`. Specify the data format of the input and output data. With the
3093 default format "NHWC", the data is stored in the order of: [batch, height,
3094 width, channels].
3095 Alternatively, the format could be "NCHW", the data storage order of:
3096 [batch, channels, height, width].
3097 dilations: An optional list of `ints`. Defaults to `[1, 1, 1, 1]`. 1-D
3098 tensor of length 4. The dilation factor for each dimension of `input`. If
3099 set to k > 1, there will be k-1 skipped cells between each filter element
3100 on that dimension. The dimension order is determined by the value of
3101 `data_format`, see above for details. Dilations in the batch and depth
3102 dimensions must be 1.
3103 name: A name for the operation (optional).
3105 Returns:
3106 A `Tensor`. Has the same type as `filter`.
3107 """
3108 padding, explicit_paddings = convert_padding(padding)
3109 return gen_nn_ops.depthwise_conv2d_native_backprop_input(
3110 input_sizes,
3111 filter,
3112 out_backprop,
3113 strides,
3114 padding,
3115 explicit_paddings=explicit_paddings,
3116 data_format=data_format,
3117 dilations=dilations,
3118 name=name)
3121@tf_export(
3122 "nn.depthwise_conv2d_backprop_filter",
3123 v1=[
3124 "nn.depthwise_conv2d_native_backprop_filter",
3125 "nn.depthwise_conv2d_backprop_filter"
3126 ])
3127@dispatch.add_dispatch_support
3128@deprecation.deprecated_endpoints("nn.depthwise_conv2d_native_backprop_filter")
3129def depthwise_conv2d_native_backprop_filter( # pylint: disable=redefined-builtin,dangerous-default-value
3130 input,
3131 filter_sizes,
3132 out_backprop,
3133 strides,
3134 padding,
3135 data_format="NHWC",
3136 dilations=[1, 1, 1, 1],
3137 name=None):
3138 r"""Computes the gradients of depthwise convolution with respect to the filter.
3140 Args:
3141 input: A `Tensor`. Must be one of the following types: `half`, `bfloat16`,
3142 `float32`, `float64`. 4-D with shape based on `data_format`. For example,
3143 if `data_format` is 'NHWC' then `input` is a 4-D `[batch, in_height,
3144 in_width, in_channels]` tensor.
3145 filter_sizes: A `Tensor` of type `int32`. An integer vector representing the
3146 tensor shape of `filter`, where `filter` is a 4-D `[filter_height,
3147 filter_width, in_channels, depthwise_multiplier]` tensor.
3148 out_backprop: A `Tensor`. Must have the same type as `input`. 4-D with shape
3149 based on `data_format`. For example, if `data_format` is 'NHWC' then
3150 out_backprop shape is `[batch, out_height, out_width, out_channels]`.
3151 Gradients w.r.t. the output of the convolution.
3152 strides: A list of `ints`. The stride of the sliding window for each
3153 dimension of the input of the convolution.
3154 padding: Controls how to pad the image before applying the convolution. Can
3155 be the string `"SAME"` or `"VALID"` indicating the type of padding
3156 algorithm to use, or a list indicating the explicit paddings at the start
3157 and end of each dimension. See
3158 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
3159 for more information. When explicit padding is used and data_format is
3160 `"NHWC"`, this should be in the form `[[0, 0], [pad_top, pad_bottom],
3161 [pad_left, pad_right], [0, 0]]`. When explicit padding used and
3162 data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
3163 [pad_top, pad_bottom], [pad_left, pad_right]]`.
3164 data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to
3165 `"NHWC"`. Specify the data format of the input and output data. With the
3166 default format "NHWC", the data is stored in the order of: [batch, height,
3167 width, channels].
3168 Alternatively, the format could be "NCHW", the data storage order of:
3169 [batch, channels, height, width].
3170 dilations: An optional list of `ints`. Defaults to `[1, 1, 1, 1]`. 1-D
3171 tensor of length 4. The dilation factor for each dimension of `input`. If
3172 set to k > 1, there will be k-1 skipped cells between each filter element
3173 on that dimension. The dimension order is determined by the value of
3174 `data_format`, see above for details. Dilations in the batch and depth
3175 dimensions must be 1.
3176 name: A name for the operation (optional).
3178 Returns:
3179 A `Tensor`. Has the same type as `input`.
3180 """
3181 padding, explicit_paddings = convert_padding(padding)
3182 return gen_nn_ops.depthwise_conv2d_native_backprop_filter(
3183 input,
3184 filter_sizes,
3185 out_backprop,
3186 strides,
3187 padding,
3188 explicit_paddings=explicit_paddings,
3189 data_format=data_format,
3190 dilations=dilations,
3191 name=name)
3194def _conv3d_expanded_batch(
3195 input, # pylint: disable=redefined-builtin
3196 filter, # pylint: disable=redefined-builtin
3197 strides,
3198 padding,
3199 data_format,
3200 dilations=None,
3201 name=None):
3202 """Helper function for `conv3d`; handles expanded batches."""
3203 shape = input.shape
3204 # shape object may lack ndims, e.g., if input is an np.ndarray. In that case,
3205 # we fall back to len(shape).
3206 ndims = getattr(shape, "ndims", -1)
3207 if ndims == -1:
3208 ndims = len(shape)
3209 if ndims in (5, 4, 3, 2, 1, 0, None):
3210 # We avoid calling squeeze_batch_dims to reduce extra python function
3211 # call slowdown in eager mode. This branch doesn't require reshapes.
3212 return gen_nn_ops.conv3d(
3213 input,
3214 filter,
3215 strides,
3216 padding,
3217 data_format=data_format,
3218 dilations=dilations,
3219 name=name)
3220 else:
3221 return squeeze_batch_dims(
3222 input,
3223 functools.partial(
3224 gen_nn_ops.conv3d,
3225 filter=filter,
3226 strides=strides,
3227 padding=padding,
3228 data_format=data_format,
3229 dilations=dilations),
3230 inner_rank=4,
3231 name=name)
3234@tf_export("nn.conv3d", v1=[])
3235@dispatch.add_dispatch_support
3236def conv3d_v2(input, # pylint: disable=redefined-builtin,missing-docstring
3237 filters,
3238 strides,
3239 padding,
3240 data_format="NDHWC",
3241 dilations=None,
3242 name=None):
3243 if dilations is None:
3244 dilations = [1, 1, 1, 1, 1]
3245 return _conv3d_expanded_batch(input, filters, strides, padding, data_format,
3246 dilations, name)
3249@tf_export(v1=["nn.conv3d"])
3250@dispatch.add_dispatch_support
3251def conv3d_v1( # pylint: disable=missing-docstring,dangerous-default-value
3252 input, # pylint: disable=redefined-builtin
3253 filter=None, # pylint: disable=redefined-builtin
3254 strides=None,
3255 padding=None,
3256 data_format="NDHWC",
3257 dilations=[1, 1, 1, 1, 1],
3258 name=None,
3259 filters=None):
3260 filter = deprecated_argument_lookup("filters", filters, "filter", filter)
3261 return gen_nn_ops.conv3d(
3262 input, filter, strides, padding, data_format, dilations, name)
3265conv3d_v2.__doc__ = deprecation.rewrite_argument_docstring(
3266 gen_nn_ops.conv3d.__doc__, "filter", "filters")
3267conv3d_v1.__doc__ = gen_nn_ops.conv3d.__doc__
3270@tf_export(v1=["nn.conv3d_transpose"])
3271@dispatch.add_dispatch_support
3272def conv3d_transpose(
3273 value,
3274 filter=None, # pylint: disable=redefined-builtin
3275 output_shape=None,
3276 strides=None,
3277 padding="SAME",
3278 data_format="NDHWC",
3279 name=None,
3280 input=None, # pylint: disable=redefined-builtin
3281 filters=None,
3282 dilations=None):
3283 """The transpose of `conv3d`.
3285 This operation is sometimes called "deconvolution" after
3286 (Zeiler et al., 2010), but is really the transpose (gradient) of `conv3d`
3287 rather than an actual deconvolution.
3289 Args:
3290 value: A 5-D `Tensor` of type `float` and shape
3291 `[batch, depth, height, width, in_channels]`.
3292 filter: A 5-D `Tensor` with the same type as `value` and shape
3293 `[depth, height, width, output_channels, in_channels]`. `filter`'s
3294 `in_channels` dimension must match that of `value`.
3295 output_shape: A 1-D `Tensor` representing the output shape of the
3296 deconvolution op.
3297 strides: A list of ints. The stride of the sliding window for each
3298 dimension of the input tensor.
3299 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
3300 See the "returns" section of `tf.nn.convolution` for details.
3301 data_format: A string, either `'NDHWC'` or `'NCDHW`' specifying the layout
3302 of the input and output tensors. Defaults to `'NDHWC'`.
3303 name: Optional name for the returned tensor.
3304 input: Alias of value.
3305 filters: Alias of filter.
3306 dilations: An int or list of `ints` that has length `1`, `3` or `5`,
3307 defaults to 1. The dilation factor for each dimension of`input`. If a
3308 single value is given it is replicated in the `D`, `H` and `W` dimension.
3309 By default the `N` and `C` dimensions are set to 1. If set to k > 1, there
3310 will be k-1 skipped cells between each filter element on that dimension.
3311 The dimension order is determined by the value of `data_format`, see above
3312 for details. Dilations in the batch and depth dimensions if a 5-d tensor
3313 must be 1.
3315 Returns:
3316 A `Tensor` with the same type as `value`.
3318 Raises:
3319 ValueError: If input/output depth does not match `filter`'s shape, or if
3320 padding is other than `'VALID'` or `'SAME'`.
3322 References:
3323 Deconvolutional Networks:
3324 [Zeiler et al., 2010]
3325 (https://ieeexplore.ieee.org/abstract/document/5539957)
3326 ([pdf]
3327 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.4023&rep=rep1&type=pdf))
3328 """
3329 filter = deprecated_argument_lookup("filters", filters, "filter", filter)
3330 value = deprecated_argument_lookup("input", input, "value", value)
3331 return conv3d_transpose_v2(
3332 value,
3333 filter,
3334 output_shape,
3335 strides,
3336 padding=padding,
3337 data_format=data_format,
3338 dilations=dilations,
3339 name=name)
3342@tf_export("nn.conv3d_transpose", v1=[])
3343@dispatch.add_dispatch_support
3344def conv3d_transpose_v2(input, # pylint: disable=redefined-builtin
3345 filters,
3346 output_shape,
3347 strides,
3348 padding="SAME",
3349 data_format="NDHWC",
3350 dilations=None,
3351 name=None):
3352 """The transpose of `conv3d`.
3354 This operation is sometimes called "deconvolution" after
3355 (Zeiler et al., 2010), but is really the transpose (gradient) of `conv3d`
3356 rather than an actual deconvolution.
3358 Args:
3359 input: A 5-D `Tensor` of type `float` and shape `[batch, depth, height,
3360 width, in_channels]` for `NDHWC` data format or `[batch, in_channels,
3361 depth, height, width]` for `NCDHW` data format.
3362 filters: A 5-D `Tensor` with the same type as `input` and shape `[depth,
3363 height, width, output_channels, in_channels]`. `filter`'s `in_channels`
3364 dimension must match that of `input`.
3365 output_shape: A 1-D `Tensor` representing the output shape of the
3366 deconvolution op.
3367 strides: An int or list of `ints` that has length `1`, `3` or `5`. The
3368 stride of the sliding window for each dimension of `input`. If a single
3369 value is given it is replicated in the `D`, `H` and `W` dimension. By
3370 default the `N` and `C` dimensions are set to 0. The dimension order is
3371 determined by the value of `data_format`, see below for details.
3372 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
3373 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
3374 for more information.
3375 data_format: A string. 'NDHWC' and 'NCDHW' are supported.
3376 dilations: An int or list of `ints` that has length `1`, `3` or `5`,
3377 defaults to 1. The dilation factor for each dimension of`input`. If a
3378 single value is given it is replicated in the `D`, `H` and `W` dimension.
3379 By default the `N` and `C` dimensions are set to 1. If set to k > 1, there
3380 will be k-1 skipped cells between each filter element on that dimension.
3381 The dimension order is determined by the value of `data_format`, see above
3382 for details. Dilations in the batch and depth dimensions if a 5-d tensor
3383 must be 1.
3384 name: Optional name for the returned tensor.
3386 Returns:
3387 A `Tensor` with the same type as `input`.
3389 References:
3390 Deconvolutional Networks:
3391 [Zeiler et al., 2010]
3392 (https://ieeexplore.ieee.org/abstract/document/5539957)
3393 ([pdf]
3394 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.4023&rep=rep1&type=pdf))
3395 """
3396 with ops.name_scope(name, "conv3d_transpose",
3397 [input, filter, output_shape]) as name:
3398 if data_format is None:
3399 data_format = "NDHWC"
3400 channel_index = 1 if data_format.startswith("NC") else 4
3402 strides = _get_sequence(strides, 3, channel_index, "strides")
3403 dilations = _get_sequence(dilations, 3, channel_index, "dilations")
3405 return gen_nn_ops.conv3d_backprop_input_v2(
3406 input_sizes=output_shape,
3407 filter=filters,
3408 out_backprop=input,
3409 strides=strides,
3410 padding=padding,
3411 data_format=data_format,
3412 dilations=dilations,
3413 name=name)
3416CONV_TRANSPOSE_OPS = (
3417 conv1d_transpose,
3418 conv2d_transpose_v2,
3419 conv3d_transpose_v2,
3420)
3423@tf_export("nn.conv_transpose")
3424@dispatch.add_dispatch_support
3425def conv_transpose(input, # pylint: disable=redefined-builtin
3426 filters,
3427 output_shape,
3428 strides,
3429 padding="SAME",
3430 data_format=None,
3431 dilations=None,
3432 name=None):
3433 """The transpose of `convolution`.
3435 This operation is sometimes called "deconvolution" after
3436 (Zeiler et al., 2010), but is really the transpose (gradient) of `conv3d`
3437 rather than an actual deconvolution.
3439 Args:
3440 input: An N+2 dimensional `Tensor` of shape
3441 `[batch_size] + input_spatial_shape + [in_channels]` if data_format does
3442 not start with "NC" (default), or
3443 `[batch_size, in_channels] + input_spatial_shape` if data_format starts
3444 with "NC". It must be one of the following types:
3445 `half`, `bfloat16`, `float32`, `float64`.
3446 filters: An N+2 dimensional `Tensor` with the same type as `input` and
3447 shape `spatial_filter_shape + [in_channels, out_channels]`.
3448 output_shape: A 1-D `Tensor` representing the output shape of the
3449 deconvolution op.
3450 strides: An int or list of `ints` that has length `1`, `N` or `N+2`. The
3451 stride of the sliding window for each dimension of `input`. If a single
3452 value is given it is replicated in the spatial dimensions. By default
3453 the `N` and `C` dimensions are set to 0. The dimension order is determined
3454 by the value of `data_format`, see below for details.
3455 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
3456 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
3457 for more information.
3458 data_format: A string or None. Specifies whether the channel dimension of
3459 the `input` and output is the last dimension (default, or if `data_format`
3460 does not start with "NC"), or the second dimension (if `data_format`
3461 starts with "NC"). For N=1, the valid values are "NWC" (default) and
3462 "NCW". For N=2, the valid values are "NHWC" (default) and "NCHW".
3463 For N=3, the valid values are "NDHWC" (default) and "NCDHW".
3464 dilations: An int or list of `ints` that has length `1`, `N` or `N+2`,
3465 defaults to 1. The dilation factor for each dimension of`input`. If a
3466 single value is given it is replicated in the spatial dimensions. By
3467 default the `N` and `C` dimensions are set to 1. If set to k > 1, there
3468 will be k-1 skipped cells between each filter element on that dimension.
3469 The dimension order is determined by the value of `data_format`, see above
3470 for details.
3471 name: A name for the operation (optional). If not specified "conv_transpose"
3472 is used.
3474 Returns:
3475 A `Tensor` with the same type as `value`.
3477 References:
3478 Deconvolutional Networks:
3479 [Zeiler et al., 2010]
3480 (https://ieeexplore.ieee.org/abstract/document/5539957)
3481 ([pdf]
3482 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.4023&rep=rep1&type=pdf))
3483 """
3484 with ops.name_scope(name, "conv_transpose",
3485 [input, filter, output_shape]) as name:
3486 if tensor_util.is_tf_type(output_shape):
3487 n = output_shape.shape[0] - 2
3488 elif isinstance(output_shape, collections_abc.Sized):
3489 n = len(output_shape) - 2
3490 else:
3491 raise ValueError("`output_shape` must be a tensor or sized collection. "
3492 f"Received: output_shape={output_shape}")
3494 if not 1 <= n <= 3:
3495 raise ValueError(
3496 f"`output_shape` must be of length 3, 4 or 5. "
3497 f"Received: output_shape={output_shape} of length {n + 2}.")
3499 op = CONV_TRANSPOSE_OPS[n-1]
3500 return op(
3501 input,
3502 filters,
3503 output_shape,
3504 strides,
3505 padding=padding,
3506 data_format=data_format,
3507 dilations=dilations,
3508 name=name)
3511@tf_export("nn.bias_add")
3512@dispatch.add_dispatch_support
3513def bias_add(value, bias, data_format=None, name=None):
3514 """Adds `bias` to `value`.
3516 This is (mostly) a special case of `tf.add` where `bias` is restricted to 1-D.
3517 Broadcasting is supported, so `value` may have any number of dimensions.
3518 Unlike `tf.add`, the type of `bias` is allowed to differ from `value` in the
3519 case where both types are quantized.
3521 Args:
3522 value: A `Tensor` with type `float`, `double`, `int64`, `int32`, `uint8`,
3523 `int16`, `int8`, `complex64`, or `complex128`.
3524 bias: A 1-D `Tensor` with size matching the channel dimension of `value`.
3525 Must be the same type as `value` unless `value` is a quantized type,
3526 in which case a different quantized type may be used.
3527 data_format: A string. 'N...C' and 'NC...' are supported. If `None` (the
3528 default) is specified then 'N..C' is assumed.
3529 name: A name for the operation (optional).
3531 Returns:
3532 A `Tensor` with the same type as `value`.
3534 Raises:
3535 ValueError if data format is unrecognized, if `value` has less than two
3536 dimensions when `data_format` is 'N..C'/`None` or `value` has less
3537 then three dimensions when `data_format` is `NC..`, if `bias` does not
3538 have exactly one dimension (is a vector), or if the size of `bias`
3539 does not match the size of the channel dimension of `value`.
3540 """
3541 with ops.name_scope(name, "BiasAdd", [value, bias]) as name:
3542 if data_format is not None:
3543 if data_format.startswith("NC"):
3544 data_format = "NCHW"
3545 elif data_format.startswith("N") and data_format.endswith("C"):
3546 data_format = "NHWC"
3547 else:
3548 raise ValueError("`data_format` must be of the form `N...C` or "
3549 f"`NC...`. Received: data_format={data_format}")
3551 if not context.executing_eagerly():
3552 value = ops.convert_to_tensor(value, name="input")
3553 bias = ops.convert_to_tensor(bias, dtype=value.dtype, name="bias")
3555 return gen_nn_ops.bias_add(value, bias, data_format=data_format, name=name)
3558def bias_add_v1(value, bias, name=None):
3559 """Adds `bias` to `value`.
3561 This is a deprecated version of bias_add and will soon to be removed.
3563 This is (mostly) a special case of `tf.add` where `bias` is restricted to 1-D.
3564 Broadcasting is supported, so `value` may have any number of dimensions.
3565 Unlike `tf.add`, the type of `bias` is allowed to differ from `value` in the
3566 case where both types are quantized.
3568 Args:
3569 value: A `Tensor` with type `float`, `double`, `int64`, `int32`, `uint8`,
3570 `int16`, `int8`, `complex64`, or `complex128`.
3571 bias: A 1-D `Tensor` with size matching the last dimension of `value`.
3572 Must be the same type as `value` unless `value` is a quantized type,
3573 in which case a different quantized type may be used.
3574 name: A name for the operation (optional).
3576 Returns:
3577 A `Tensor` with the same type as `value`.
3578 """
3579 with ops.name_scope(name, "BiasAddV1", [value, bias]) as name:
3580 value = ops.convert_to_tensor(value, name="input")
3581 bias = ops.convert_to_tensor(bias, dtype=value.dtype, name="bias")
3582 return gen_nn_ops.bias_add_v1(value, bias, name=name)
3585@tf_export(v1=["nn.crelu"])
3586@dispatch.add_dispatch_support
3587def crelu(features, name=None, axis=-1):
3588 """Computes Concatenated ReLU.
3590 Concatenates a ReLU which selects only the positive part of the activation
3591 with a ReLU which selects only the *negative* part of the activation.
3592 Note that as a result this non-linearity doubles the depth of the activations.
3593 Source: [Understanding and Improving Convolutional Neural Networks via
3594 Concatenated Rectified Linear Units. W. Shang, et
3595 al.](https://arxiv.org/abs/1603.05201)
3597 Args:
3598 features: A `Tensor` with type `float`, `double`, `int32`, `int64`, `uint8`,
3599 `int16`, or `int8`.
3600 name: A name for the operation (optional).
3601 axis: The axis that the output values are concatenated along. Default is -1.
3603 Returns:
3604 A `Tensor` with the same type as `features`.
3606 References:
3607 Understanding and Improving Convolutional Neural Networks via Concatenated
3608 Rectified Linear Units:
3609 [Shang et al., 2016](http://proceedings.mlr.press/v48/shang16)
3610 ([pdf](http://proceedings.mlr.press/v48/shang16.pdf))
3611 """
3612 with ops.name_scope(name, "CRelu", [features]) as name:
3613 features = ops.convert_to_tensor(features, name="features")
3614 c = array_ops.concat([features, -features], axis, name=name) # pylint: disable=invalid-unary-operand-type
3615 return gen_nn_ops.relu(c)
3618@tf_export("nn.crelu", v1=[])
3619@dispatch.add_dispatch_support
3620def crelu_v2(features, axis=-1, name=None):
3621 return crelu(features, name=name, axis=axis)
3622crelu_v2.__doc__ = crelu.__doc__
3625@tf_export("nn.relu6")
3626@dispatch.register_unary_elementwise_api
3627@dispatch.add_dispatch_support
3628def relu6(features, name=None):
3629 """Computes Rectified Linear 6: `min(max(features, 0), 6)`.
3631 In comparison with `tf.nn.relu`, relu6 activation functions have shown to
3632 empirically perform better under low-precision conditions (e.g. fixed point
3633 inference) by encouraging the model to learn sparse features earlier.
3634 Source: [Convolutional Deep Belief Networks on CIFAR-10: Krizhevsky et al.,
3635 2010](http://www.cs.utoronto.ca/~kriz/conv-cifar10-aug2010.pdf).
3637 For example:
3639 >>> x = tf.constant([-3.0, -1.0, 0.0, 6.0, 10.0], dtype=tf.float32)
3640 >>> y = tf.nn.relu6(x)
3641 >>> y.numpy()
3642 array([0., 0., 0., 6., 6.], dtype=float32)
3644 Args:
3645 features: A `Tensor` with type `float`, `double`, `int32`, `int64`, `uint8`,
3646 `int16`, or `int8`.
3647 name: A name for the operation (optional).
3649 Returns:
3650 A `Tensor` with the same type as `features`.
3652 References:
3653 Convolutional Deep Belief Networks on CIFAR-10:
3654 Krizhevsky et al., 2010
3655 ([pdf](http://www.cs.utoronto.ca/~kriz/conv-cifar10-aug2010.pdf))
3656 """
3657 with ops.name_scope(name, "Relu6", [features]) as name:
3658 features = ops.convert_to_tensor(features, name="features")
3659 return gen_nn_ops.relu6(features, name=name)
3662@tf_export("nn.leaky_relu")
3663@dispatch.register_unary_elementwise_api
3664@dispatch.add_dispatch_support
3665def leaky_relu(features, alpha=0.2, name=None):
3666 """Compute the Leaky ReLU activation function.
3668 Source: [Rectifier Nonlinearities Improve Neural Network Acoustic Models.
3669 AL Maas, AY Hannun, AY Ng - Proc. ICML, 2013]
3670 (https://ai.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf).
3672 Args:
3673 features: A `Tensor` representing preactivation values. Must be one of
3674 the following types: `float16`, `float32`, `float64`, `int32`, `int64`.
3675 alpha: Slope of the activation function at x < 0.
3676 name: A name for the operation (optional).
3678 Returns:
3679 The activation value.
3681 References:
3682 Rectifier Nonlinearities Improve Neural Network Acoustic Models:
3683 [Maas et al., 2013]
3684 (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.693.1422)
3685 ([pdf]
3686 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.693.1422&rep=rep1&type=pdf))
3687 """
3688 with ops.name_scope(name, "LeakyRelu", [features, alpha]) as name:
3689 features = ops.convert_to_tensor(features, name="features")
3690 if features.dtype.is_integer:
3691 features = math_ops.cast(features, dtypes.float32)
3692 if isinstance(alpha, np.ndarray):
3693 alpha = alpha.item()
3694 return gen_nn_ops.leaky_relu(features, alpha=alpha, name=name)
3697@tf_export("nn.gelu", v1=[])
3698@dispatch.register_unary_elementwise_api
3699@dispatch.add_dispatch_support
3700def gelu(features, approximate=False, name=None):
3701 """Compute the Gaussian Error Linear Unit (GELU) activation function.
3703 Gaussian error linear unit (GELU) computes
3704 `x * P(X <= x)`, where `P(X) ~ N(0, 1)`.
3705 The (GELU) nonlinearity weights inputs by their value, rather than gates
3706 inputs by their sign as in ReLU.
3708 For example:
3710 >>> x = tf.constant([-3.0, -1.0, 0.0, 1.0, 3.0], dtype=tf.float32)
3711 >>> y = tf.nn.gelu(x)
3712 >>> y.numpy()
3713 array([-0.00404951, -0.15865529, 0. , 0.8413447 , 2.9959507 ],
3714 dtype=float32)
3715 >>> y = tf.nn.gelu(x, approximate=True)
3716 >>> y.numpy()
3717 array([-0.00363752, -0.15880796, 0. , 0.841192 , 2.9963627 ],
3718 dtype=float32)
3720 Args:
3721 features: A `float Tensor` representing preactivation values.
3722 approximate: An optional `bool`. Defaults to `False`. Whether to enable
3723 approximation.
3724 name: A name for the operation (optional).
3726 Returns:
3727 A `Tensor` with the same type as `features`.
3729 Raises:
3730 ValueError: if `features` is not a floating point `Tensor`.
3732 References:
3733 [Gaussian Error Linear Units (GELUs)](https://arxiv.org/abs/1606.08415).
3734 """
3735 with ops.name_scope(name, "Gelu", [features]):
3736 features = ops.convert_to_tensor(features, name="features")
3737 if not features.dtype.is_floating:
3738 raise ValueError(
3739 "`features.dtype` must be a floating point tensor."
3740 f"Received:features.dtype={features.dtype}")
3741 if approximate:
3742 coeff = math_ops.cast(0.044715, features.dtype)
3743 return 0.5 * features * (
3744 1.0 + math_ops.tanh(0.7978845608028654 *
3745 (features + coeff * math_ops.pow(features, 3))))
3746 else:
3747 return 0.5 * features * (1.0 + math_ops.erf(
3748 features / math_ops.cast(1.4142135623730951, features.dtype)))
3751def _flatten_outer_dims(logits):
3752 """Flattens logits' outer dimensions and keep its last dimension."""
3753 rank = array_ops.rank(logits)
3754 last_dim_size = array_ops.slice(
3755 array_ops.shape(logits), [math_ops.subtract(rank, 1)], [1])
3756 output = array_ops.reshape(logits, array_ops.concat([[-1], last_dim_size], 0))
3758 # Set output shape if known.
3759 if not context.executing_eagerly():
3760 shape = logits.get_shape()
3761 if shape is not None and shape.dims is not None:
3762 shape = shape.as_list()
3763 product = 1
3764 product_valid = True
3765 for d in shape[:-1]:
3766 if d is None:
3767 product_valid = False
3768 break
3769 else:
3770 product *= d
3771 if product_valid:
3772 output_shape = [product, shape[-1]]
3773 output.set_shape(output_shape)
3775 return output
3778def _wrap_2d_function(inputs, compute_op, dim=-1, name=None):
3779 """Helper function for ops that accept and return 2d inputs of same shape.
3781 It reshapes and transposes the inputs into a 2-D Tensor and then invokes
3782 the given function. The output would be transposed and reshaped back.
3783 If the given function returns a tuple of tensors, each of them will be
3784 transposed and reshaped.
3786 Args:
3787 inputs: A non-empty `Tensor`. Must be one of the following types: `half`,
3788 `float32`, `float64`.
3789 compute_op: The function to wrap. Must accept the input tensor as its first
3790 arugment, and a second keyword argument `name`.
3791 dim: The dimension softmax would be performed on. The default is -1 which
3792 indicates the last dimension.
3793 name: A name for the operation (optional).
3795 Returns:
3796 A `Tensor`. Has the same shape as inputs. If compute_op returns multiple
3797 tensors, each of them have the same shape as the input.
3798 Raises:
3799 InvalidArgumentError: if `inputs` is empty or `dim` is beyond the last
3800 dimension of `inputs`.
3801 """
3803 def _swap_axis(input_tensor, dim_index, last_index, name=None):
3804 """Swaps logits's dim_index and last_index."""
3805 return array_ops.transpose(
3806 input_tensor,
3807 array_ops.concat([
3808 math_ops.range(dim_index), [last_index],
3809 math_ops.range(dim_index + 1, last_index), [dim_index]
3810 ], 0),
3811 name=name)
3813 inputs = ops.convert_to_tensor(inputs)
3815 # We need its original shape for shape inference.
3816 shape = inputs.get_shape()
3817 is_last_dim = (dim == -1) or (dim == shape.ndims - 1)
3819 if is_last_dim:
3820 return compute_op(inputs, name=name)
3822 dim_val = dim
3823 if isinstance(dim, ops.Tensor):
3824 dim_val = tensor_util.constant_value(dim)
3825 if dim_val is not None and not -shape.ndims <= dim_val < shape.ndims:
3826 raise errors_impl.InvalidArgumentError(
3827 None, None,
3828 f"`dim` must be in the range [{-shape.ndims}, {shape.ndims}) where "
3829 f"{shape.ndims} is the number of dimensions in the input. "
3830 f"Received: dim={dim_val}")
3832 # If dim is not the last dimension, we have to do a transpose so that we can
3833 # still perform the op on its last dimension.
3835 # In case dim is negative (and is not last dimension -1), add shape.ndims
3836 ndims = array_ops.rank(inputs)
3837 if not isinstance(dim, ops.Tensor):
3838 if dim < 0:
3839 dim += ndims
3840 else:
3841 dim = array_ops.where(math_ops.less(dim, 0), dim + ndims, dim)
3843 # Swap logits' dimension of dim and its last dimension.
3844 input_rank = array_ops.rank(inputs)
3845 dim_axis = dim % shape.ndims
3846 inputs = _swap_axis(inputs, dim_axis, math_ops.subtract(input_rank, 1))
3848 # Do the actual call on its last dimension.
3849 def fix_output(output):
3850 output = _swap_axis(
3851 output, dim_axis, math_ops.subtract(input_rank, 1), name=name)
3853 # Make shape inference work since transpose may erase its static shape.
3854 output.set_shape(shape)
3855 return output
3857 outputs = compute_op(inputs)
3858 if isinstance(outputs, tuple):
3859 return tuple(fix_output(output) for output in outputs)
3860 else:
3861 return fix_output(outputs)
3864@tf_export("nn.softmax", "math.softmax", v1=[])
3865@dispatch.add_dispatch_support
3866def softmax_v2(logits, axis=None, name=None):
3867 """Computes softmax activations.
3869 Used for multi-class predictions. The sum of all outputs generated by softmax
3870 is 1.
3872 This function performs the equivalent of
3874 ```python
3875 softmax = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis, keepdims=True)
3876 ```
3877 Example usage:
3879 >>> softmax = tf.nn.softmax([-1, 0., 1.])
3880 >>> softmax
3881 <tf.Tensor: shape=(3,), dtype=float32,
3882 numpy=array([0.09003057, 0.24472848, 0.66524094], dtype=float32)>
3883 >>> sum(softmax)
3884 <tf.Tensor: shape=(), dtype=float32, numpy=1.0>
3886 Args:
3887 logits: A non-empty `Tensor`. Must be one of the following types: `half`,
3888 `float32`, `float64`.
3889 axis: The dimension softmax would be performed on. The default is -1 which
3890 indicates the last dimension.
3891 name: A name for the operation (optional).
3893 Returns:
3894 A `Tensor`. Has the same type and shape as `logits`.
3896 Raises:
3897 InvalidArgumentError: if `logits` is empty or `axis` is beyond the last
3898 dimension of `logits`.
3899 """
3900 if axis is None:
3901 axis = -1
3902 return _wrap_2d_function(logits, gen_nn_ops.softmax, axis, name)
3905@tf_export(v1=["nn.softmax", "math.softmax"])
3906@dispatch.add_dispatch_support
3907@deprecation.deprecated_args(None, "dim is deprecated, use axis instead", "dim")
3908def softmax(logits, axis=None, name=None, dim=None):
3909 axis = deprecation.deprecated_argument_lookup("axis", axis, "dim", dim)
3910 if axis is None:
3911 axis = -1
3912 return _wrap_2d_function(logits, gen_nn_ops.softmax, axis, name)
3915softmax.__doc__ = softmax_v2.__doc__
3918@tf_export(v1=["nn.log_softmax", "math.log_softmax"])
3919@dispatch.register_unary_elementwise_api
3920@dispatch.add_dispatch_support
3921@deprecation.deprecated_args(None, "dim is deprecated, use axis instead", "dim")
3922def log_softmax(logits, axis=None, name=None, dim=None):
3923 """Computes log softmax activations.
3925 For each batch `i` and class `j` we have
3927 logsoftmax = logits - log(reduce_sum(exp(logits), axis))
3929 Args:
3930 logits: A non-empty `Tensor`. Must be one of the following types: `half`,
3931 `float32`, `float64`.
3932 axis: The dimension softmax would be performed on. The default is -1 which
3933 indicates the last dimension.
3934 name: A name for the operation (optional).
3935 dim: Deprecated alias for `axis`.
3937 Returns:
3938 A `Tensor`. Has the same type as `logits`. Same shape as `logits`.
3940 Raises:
3941 InvalidArgumentError: if `logits` is empty or `axis` is beyond the last
3942 dimension of `logits`.
3943 """
3944 axis = deprecation.deprecated_argument_lookup("axis", axis, "dim", dim)
3945 if axis is None:
3946 axis = -1
3947 return _wrap_2d_function(logits, gen_nn_ops.log_softmax, axis, name)
3950@tf_export("nn.log_softmax", "math.log_softmax", v1=[])
3951@dispatch.add_dispatch_support
3952def log_softmax_v2(logits, axis=None, name=None):
3953 """Computes log softmax activations.
3955 For each batch `i` and class `j` we have
3957 logsoftmax = logits - log(reduce_sum(exp(logits), axis))
3959 Args:
3960 logits: A non-empty `Tensor`. Must be one of the following types: `half`,
3961 `float32`, `float64`.
3962 axis: The dimension softmax would be performed on. The default is -1 which
3963 indicates the last dimension.
3964 name: A name for the operation (optional).
3966 Returns:
3967 A `Tensor`. Has the same type as `logits`. Same shape as `logits`.
3969 Raises:
3970 InvalidArgumentError: if `logits` is empty or `axis` is beyond the last
3971 dimension of `logits`.
3972 """
3973 if axis is None:
3974 axis = -1
3975 return _wrap_2d_function(logits, gen_nn_ops.log_softmax, axis, name)
3978def _ensure_xent_args(name, labels, logits):
3979 if labels is None or logits is None:
3980 raise ValueError(f"Both `labels` and `logits` must be provided for {name}"
3981 f"Received: labels={labels} and logits={logits}")
3984@tf_export("nn.softmax_cross_entropy_with_logits", v1=[])
3985@dispatch.add_dispatch_support
3986def softmax_cross_entropy_with_logits_v2(labels, logits, axis=-1, name=None):
3987 """Computes softmax cross entropy between `logits` and `labels`.
3989 Measures the probability error in discrete classification tasks in which the
3990 classes are mutually exclusive (each entry is in exactly one class). For
3991 example, each CIFAR-10 image is labeled with one and only one label: an image
3992 can be a dog or a truck, but not both.
3994 **NOTE:** While the classes are mutually exclusive, their probabilities
3995 need not be. All that is required is that each row of `labels` is
3996 a valid probability distribution. If they are not, the computation of the
3997 gradient will be incorrect.
3999 If using exclusive `labels` (wherein one and only
4000 one class is true at a time), see `sparse_softmax_cross_entropy_with_logits`.
4002 Usage:
4004 >>> logits = [[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]]
4005 >>> labels = [[1.0, 0.0, 0.0], [0.0, 0.8, 0.2]]
4006 >>> tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
4007 <tf.Tensor: shape=(2,), dtype=float32,
4008 numpy=array([0.16984604, 0.82474494], dtype=float32)>
4010 **WARNING:** This op expects unscaled logits, since it performs a `softmax`
4011 on `logits` internally for efficiency. Do not call this op with the
4012 output of `softmax`, as it will produce incorrect results.
4014 A common use case is to have logits and labels of shape
4015 `[batch_size, num_classes]`, but higher dimensions are supported, with
4016 the `axis` argument specifying the class dimension.
4018 `logits` and `labels` must have the same dtype (either `float16`, `float32`,
4019 or `float64`).
4021 Backpropagation will happen into both `logits` and `labels`. To disallow
4022 backpropagation into `labels`, pass label tensors through `tf.stop_gradient`
4023 before feeding it to this function.
4025 **Note that to avoid confusion, it is required to pass only named arguments to
4026 this function.**
4028 Args:
4029 labels: Each vector along the class dimension should hold a valid
4030 probability distribution e.g. for the case in which labels are of shape
4031 `[batch_size, num_classes]`, each row of `labels[i]` must be a valid
4032 probability distribution.
4033 logits: Per-label activations, typically a linear output. These activation
4034 energies are interpreted as unnormalized log probabilities.
4035 axis: The class dimension. Defaulted to -1 which is the last dimension.
4036 name: A name for the operation (optional).
4038 Returns:
4039 A `Tensor` that contains the softmax cross entropy loss. Its type is the
4040 same as `logits` and its shape is the same as `labels` except that it does
4041 not have the last dimension of `labels`.
4042 """
4043 return softmax_cross_entropy_with_logits_v2_helper(
4044 labels=labels, logits=logits, axis=axis, name=name)
4047@tf_export(v1=["nn.softmax_cross_entropy_with_logits_v2"])
4048@dispatch.add_dispatch_support
4049@deprecated_args(None, "dim is deprecated, use axis instead", "dim")
4050def softmax_cross_entropy_with_logits_v2_helper(
4051 labels, logits, axis=None, name=None, dim=None):
4052 """Computes softmax cross entropy between `logits` and `labels`.
4054 Measures the probability error in discrete classification tasks in which the
4055 classes are mutually exclusive (each entry is in exactly one class). For
4056 example, each CIFAR-10 image is labeled with one and only one label: an image
4057 can be a dog or a truck, but not both.
4059 **NOTE:** While the classes are mutually exclusive, their probabilities
4060 need not be. All that is required is that each row of `labels` is
4061 a valid probability distribution. If they are not, the computation of the
4062 gradient will be incorrect.
4064 If using exclusive `labels` (wherein one and only
4065 one class is true at a time), see `sparse_softmax_cross_entropy_with_logits`.
4067 **WARNING:** This op expects unscaled logits, since it performs a `softmax`
4068 on `logits` internally for efficiency. Do not call this op with the
4069 output of `softmax`, as it will produce incorrect results.
4071 A common use case is to have logits and labels of shape
4072 `[batch_size, num_classes]`, but higher dimensions are supported, with
4073 the `axis` argument specifying the class dimension.
4075 `logits` and `labels` must have the same dtype (either `float16`, `float32`,
4076 or `float64`).
4078 Backpropagation will happen into both `logits` and `labels`. To disallow
4079 backpropagation into `labels`, pass label tensors through `tf.stop_gradient`
4080 before feeding it to this function.
4082 **Note that to avoid confusion, it is required to pass only named arguments to
4083 this function.**
4085 Args:
4086 labels: Each vector along the class dimension should hold a valid
4087 probability distribution e.g. for the case in which labels are of shape
4088 `[batch_size, num_classes]`, each row of `labels[i]` must be a valid
4089 probability distribution.
4090 logits: Unscaled log probabilities.
4091 axis: The class dimension. Defaulted to -1 which is the last dimension.
4092 name: A name for the operation (optional).
4093 dim: Deprecated alias for axis.
4095 Returns:
4096 A `Tensor` that contains the softmax cross entropy loss. Its type is the
4097 same as `logits` and its shape is the same as `labels` except that it does
4098 not have the last dimension of `labels`.
4099 """
4100 # TODO(pcmurray) Raise an error when the labels do not sum to 1. Note: This
4101 # could break users who call this with bad labels, but disregard the bad
4102 # results.
4103 axis = deprecated_argument_lookup("axis", axis, "dim", dim)
4104 del dim
4105 if axis is None:
4106 axis = -1
4108 with ops.name_scope(name, "softmax_cross_entropy_with_logits",
4109 [logits, labels]) as name:
4110 logits = ops.convert_to_tensor(logits, name="logits")
4111 labels = ops.convert_to_tensor(labels, name="labels")
4112 convert_to_float32 = (
4113 logits.dtype == dtypes.float16 or logits.dtype == dtypes.bfloat16)
4114 precise_logits = math_ops.cast(
4115 logits, dtypes.float32) if convert_to_float32 else logits
4116 # labels and logits must be of the same type
4117 labels = math_ops.cast(labels, precise_logits.dtype)
4118 input_rank = array_ops.rank(precise_logits)
4119 # For shape inference.
4120 shape = logits.get_shape()
4122 # Move the dim to the end if dim is not the last dimension.
4123 if axis != -1:
4125 def _move_dim_to_end(tensor, dim_index, rank):
4126 return array_ops.transpose(
4127 tensor,
4128 array_ops.concat([
4129 math_ops.range(dim_index),
4130 math_ops.range(dim_index + 1, rank), [dim_index]
4131 ], 0))
4133 precise_logits = _move_dim_to_end(precise_logits, axis, input_rank)
4134 labels = _move_dim_to_end(labels, axis, input_rank)
4136 input_shape = array_ops.shape(precise_logits)
4138 # Make precise_logits and labels into matrices.
4139 precise_logits = _flatten_outer_dims(precise_logits)
4140 labels = _flatten_outer_dims(labels)
4142 # Do the actual op computation.
4143 if config.is_op_determinism_enabled():
4144 log_probs = log_softmax_v2(precise_logits)
4145 cost = -math_ops.reduce_sum(labels * log_probs, axis=1)
4146 else:
4147 # The second output tensor contains the gradients. We use it in
4148 # CrossEntropyGrad() in nn_grad but not here.
4149 cost, unused_backprop = gen_nn_ops.softmax_cross_entropy_with_logits(
4150 precise_logits, labels, name=name)
4152 # The output cost shape should be the input minus axis.
4153 output_shape = array_ops.slice(input_shape, [0],
4154 [math_ops.subtract(input_rank, 1)])
4155 cost = array_ops.reshape(cost, output_shape)
4157 # Make shape inference work since reshape and transpose may erase its static
4158 # shape.
4159 if not context.executing_eagerly(
4160 ) and shape is not None and shape.dims is not None:
4161 shape = shape.as_list()
4162 del shape[axis]
4163 cost.set_shape(shape)
4165 if convert_to_float32:
4166 return math_ops.cast(cost, logits.dtype)
4167 else:
4168 return cost
4171_XENT_DEPRECATION = """
4172Future major versions of TensorFlow will allow gradients to flow
4173into the labels input on backprop by default.
4175See `tf.nn.softmax_cross_entropy_with_logits_v2`.
4176"""
4179@tf_export(v1=["nn.softmax_cross_entropy_with_logits"])
4180@dispatch.add_dispatch_support
4181@deprecation.deprecated(date=None, instructions=_XENT_DEPRECATION)
4182def softmax_cross_entropy_with_logits(
4183 labels=None,
4184 logits=None,
4185 dim=-1,
4186 name=None,
4187 axis=None):
4188 """Computes softmax cross entropy between `logits` and `labels`.
4190 Measures the probability error in discrete classification tasks in which the
4191 classes are mutually exclusive (each entry is in exactly one class). For
4192 example, each CIFAR-10 image is labeled with one and only one label: an image
4193 can be a dog or a truck, but not both.
4195 **NOTE:** While the classes are mutually exclusive, their probabilities
4196 need not be. All that is required is that each row of `labels` is
4197 a valid probability distribution. If they are not, the computation of the
4198 gradient will be incorrect.
4200 If using exclusive `labels` (wherein one and only
4201 one class is true at a time), see `sparse_softmax_cross_entropy_with_logits`.
4203 **WARNING:** This op expects unscaled logits, since it performs a `softmax`
4204 on `logits` internally for efficiency. Do not call this op with the
4205 output of `softmax`, as it will produce incorrect results.
4207 A common use case is to have logits and labels of shape
4208 `[batch_size, num_classes]`, but higher dimensions are supported, with
4209 the `dim` argument specifying the class dimension.
4211 Backpropagation will happen only into `logits`. To calculate a cross entropy
4212 loss that allows backpropagation into both `logits` and `labels`, see
4213 `tf.nn.softmax_cross_entropy_with_logits_v2`.
4215 **Note that to avoid confusion, it is required to pass only named arguments to
4216 this function.**
4218 Args:
4219 labels: Each vector along the class dimension should hold a valid
4220 probability distribution e.g. for the case in which labels are of shape
4221 `[batch_size, num_classes]`, each row of `labels[i]` must be a valid
4222 probability distribution.
4223 logits: Per-label activations, typically a linear output. These activation
4224 energies are interpreted as unnormalized log probabilities.
4225 dim: The class dimension. Defaulted to -1 which is the last dimension.
4226 name: A name for the operation (optional).
4227 axis: Alias for dim.
4229 Returns:
4230 A `Tensor` that contains the softmax cross entropy loss. Its type is the
4231 same as `logits` and its shape is the same as `labels` except that it does
4232 not have the last dimension of `labels`.
4233 """
4234 dim = deprecated_argument_lookup("axis", axis, "dim", dim)
4235 _ensure_xent_args("softmax_cross_entropy_with_logits", labels, logits)
4237 with ops.name_scope(name, "softmax_cross_entropy_with_logits_sg",
4238 [logits, labels]) as name:
4239 labels = array_ops.stop_gradient(labels, name="labels_stop_gradient")
4241 return softmax_cross_entropy_with_logits_v2(
4242 labels=labels, logits=logits, axis=dim, name=name)
4245def _sparse_softmax_cross_entropy_with_rank_2_logits(logits, labels, name):
4246 if config.is_op_determinism_enabled():
4247 # TODO(duncanriach): Implement a GPU-deterministic version of this op at
4248 # the C++/CUDA level.
4250 # The actual op functionality
4251 log_probs = log_softmax_v2(logits)
4252 cost = math_ops.negative(array_ops.gather(log_probs, labels, batch_dims=1))
4254 # Force the output to be NaN when the corresponding label is invalid.
4255 # Without the selective gradient gating provided by the following code,
4256 # backprop into the actual op functionality above, when there are invalid
4257 # labels, leads to corruption of the gradients associated with valid labels.
4258 # TODO(duncanriach): Uncover the source of the aforementioned corruption.
4259 nan_tensor = constant_op.constant(float("Nan"), dtype=logits.dtype)
4260 cost_all_nans = array_ops.broadcast_to(nan_tensor, array_ops.shape(cost))
4261 class_count = math_ops.cast(array_ops.shape(logits)[-1], labels.dtype)
4262 cost = array_ops.where(
4263 math_ops.logical_or(
4264 math_ops.less(labels, 0),
4265 math_ops.greater_equal(labels, class_count)), cost_all_nans, cost)
4266 else:
4267 # The second output tensor contains the gradients. We use it in
4268 # _CrossEntropyGrad() in nn_grad but not here.
4269 cost, _ = gen_nn_ops.sparse_softmax_cross_entropy_with_logits(
4270 logits, labels, name=name)
4271 return cost
4274@tf_export(v1=["nn.sparse_softmax_cross_entropy_with_logits"])
4275@dispatch.add_dispatch_support
4276def sparse_softmax_cross_entropy_with_logits(
4277 labels=None,
4278 logits=None,
4279 name=None):
4280 """Computes sparse softmax cross entropy between `logits` and `labels`.
4282 Measures the probability error in discrete classification tasks in which the
4283 classes are mutually exclusive (each entry is in exactly one class). For
4284 example, each CIFAR-10 image is labeled with one and only one label: an image
4285 can be a dog or a truck, but not both.
4287 **NOTE:** For this operation, the probability of a given label is considered
4288 exclusive. That is, soft classes are not allowed, and the `labels` vector
4289 must provide a single specific index for the true class for each row of
4290 `logits` (each minibatch entry). For soft softmax classification with
4291 a probability distribution for each entry, see
4292 `softmax_cross_entropy_with_logits_v2`.
4294 **WARNING:** This op expects unscaled logits, since it performs a `softmax`
4295 on `logits` internally for efficiency. Do not call this op with the
4296 output of `softmax`, as it will produce incorrect results.
4298 A common use case is to have logits of shape
4299 `[batch_size, num_classes]` and have labels of shape
4300 `[batch_size]`, but higher dimensions are supported, in which
4301 case the `dim`-th dimension is assumed to be of size `num_classes`.
4302 `logits` must have the dtype of `float16`, `float32`, or `float64`, and
4303 `labels` must have the dtype of `int32` or `int64`.
4305 **Note that to avoid confusion, it is required to pass only named arguments to
4306 this function.**
4308 Args:
4309 labels: `Tensor` of shape `[d_0, d_1, ..., d_{r-1}]` (where `r` is rank of
4310 `labels` and result) and dtype `int32` or `int64`. Each entry in `labels`
4311 must be an index in `[0, num_classes)`. Other values will raise an
4312 exception when this op is run on CPU, and return `NaN` for corresponding
4313 loss and gradient rows on GPU.
4314 logits: Per-label activations (typically a linear output) of shape
4315 `[d_0, d_1, ..., d_{r-1}, num_classes]` and dtype `float16`, `float32`, or
4316 `float64`. These activation energies are interpreted as unnormalized log
4317 probabilities.
4318 name: A name for the operation (optional).
4320 Returns:
4321 A `Tensor` of the same shape as `labels` and of the same type as `logits`
4322 with the softmax cross entropy loss.
4324 Raises:
4325 ValueError: If logits are scalars (need to have rank >= 1) or if the rank
4326 of the labels is not equal to the rank of the logits minus one.
4327 """
4328 _ensure_xent_args("sparse_softmax_cross_entropy_with_logits", labels, logits)
4330 # TODO(pcmurray) Raise an error when the label is not an index in
4331 # [0, num_classes). Note: This could break users who call this with bad
4332 # labels, but disregard the bad results.
4334 # Reshape logits and labels to rank 2.
4335 with ops.name_scope(name, "SparseSoftmaxCrossEntropyWithLogits",
4336 [labels, logits]):
4337 labels = ops.convert_to_tensor(labels)
4338 logits = ops.convert_to_tensor(logits)
4339 precise_logits = math_ops.cast(logits, dtypes.float32) if (dtypes.as_dtype(
4340 logits.dtype) == dtypes.float16) else logits
4342 # Store label shape for result later.
4343 labels_static_shape = labels.get_shape()
4344 labels_shape = array_ops.shape(labels)
4345 static_shapes_fully_defined = (
4346 labels_static_shape.is_fully_defined() and
4347 logits.get_shape()[:-1].is_fully_defined())
4348 if logits.get_shape().ndims is not None and logits.get_shape().ndims == 0:
4349 raise ValueError(
4350 f"`logits` cannot be a scalar. Received logits={logits}`")
4351 if logits.get_shape().ndims is not None and (
4352 labels_static_shape.ndims is not None and
4353 labels_static_shape.ndims != logits.get_shape().ndims - 1):
4354 raise ValueError(
4355 "`labels.shape.rank` must equal `logits.shape.rank - 1`. "
4356 f"Received: labels.shape={labels_static_shape} of rank "
4357 f"{labels_static_shape.rank} and logits.shape={logits.get_shape()} "
4358 f"of rank {logits.get_shape().rank}")
4359 if (static_shapes_fully_defined and
4360 labels_static_shape != logits.get_shape()[:-1]):
4361 raise ValueError(
4362 "`labels.shape` must equal `logits.shape` except for "
4363 f"the last dimension. Received: labels.shape={labels_static_shape} "
4364 f"and logits.shape={logits.get_shape()}")
4365 # Check if no reshapes are required.
4366 if logits.get_shape().ndims == 2:
4367 cost = _sparse_softmax_cross_entropy_with_rank_2_logits(
4368 precise_logits, labels, name=name)
4369 if logits.dtype == dtypes.float16:
4370 return math_ops.cast(cost, dtypes.float16)
4371 else:
4372 return cost
4374 # Perform a check of the dynamic shapes if the static shapes are not fully
4375 # defined.
4376 shape_checks = []
4377 if not static_shapes_fully_defined:
4378 shape_checks.append(
4379 check_ops.assert_equal(
4380 array_ops.shape(labels),
4381 array_ops.shape(logits)[:-1]))
4382 with ops.control_dependencies(shape_checks):
4383 # Reshape logits to 2 dim, labels to 1 dim.
4384 num_classes = array_ops.shape(logits)[array_ops.rank(logits) - 1]
4385 precise_logits = array_ops.reshape(precise_logits, [-1, num_classes])
4386 labels = array_ops.reshape(labels, [-1])
4387 cost = _sparse_softmax_cross_entropy_with_rank_2_logits(
4388 precise_logits, labels, name=name)
4389 cost = array_ops.reshape(cost, labels_shape)
4390 cost.set_shape(labels_static_shape)
4391 if logits.dtype == dtypes.float16:
4392 return math_ops.cast(cost, dtypes.float16)
4393 else:
4394 return cost
4397@tf_export("nn.sparse_softmax_cross_entropy_with_logits", v1=[])
4398@dispatch.add_dispatch_support
4399def sparse_softmax_cross_entropy_with_logits_v2(labels, logits, name=None):
4400 """Computes sparse softmax cross entropy between `logits` and `labels`.
4402 Measures the probability error in discrete classification tasks in which the
4403 classes are mutually exclusive (each entry is in exactly one class). For
4404 example, each CIFAR-10 image is labeled with one and only one label: an image
4405 can be a dog or a truck, but not both.
4407 Note: For this operation, the probability of a given label is considered
4408 exclusive. That is, soft classes are not allowed, and the `labels` vector
4409 must provide a single specific index for the true class for each row of
4410 `logits` (each minibatch entry). For soft softmax classification with
4411 a probability distribution for each entry, see
4412 `softmax_cross_entropy_with_logits_v2`.
4414 Warning: This op expects unscaled logits, since it performs a `softmax`
4415 on `logits` internally for efficiency. Do not call this op with the
4416 output of `softmax`, as it will produce incorrect results.
4418 A common use case is to have logits of shape
4419 `[batch_size, num_classes]` and have labels of shape
4420 `[batch_size]`, but higher dimensions are supported, in which
4421 case the `dim`-th dimension is assumed to be of size `num_classes`.
4422 `logits` must have the dtype of `float16`, `float32`, or `float64`, and
4423 `labels` must have the dtype of `int32` or `int64`.
4425 >>> logits = tf.constant([[2., -5., .5, -.1],
4426 ... [0., 0., 1.9, 1.4],
4427 ... [-100., 100., -100., -100.]])
4428 >>> labels = tf.constant([0, 3, 1])
4429 >>> tf.nn.sparse_softmax_cross_entropy_with_logits(
4430 ... labels=labels, logits=logits).numpy()
4431 array([0.29750752, 1.1448325 , 0. ], dtype=float32)
4433 To avoid confusion, passing only named arguments to this function is
4434 recommended.
4436 Args:
4437 labels: `Tensor` of shape `[d_0, d_1, ..., d_{r-1}]` (where `r` is rank of
4438 `labels` and result) and dtype `int32` or `int64`. Each entry in `labels`
4439 must be an index in `[0, num_classes)`. Other values will raise an
4440 exception when this op is run on CPU, and return `NaN` for corresponding
4441 loss and gradient rows on GPU.
4442 logits: Unscaled log probabilities of shape `[d_0, d_1, ..., d_{r-1},
4443 num_classes]` and dtype `float16`, `float32`, or `float64`.
4444 name: A name for the operation (optional).
4446 Returns:
4447 A `Tensor` of the same shape as `labels` and of the same type as `logits`
4448 with the softmax cross entropy loss.
4450 Raises:
4451 ValueError: If logits are scalars (need to have rank >= 1) or if the rank
4452 of the labels is not equal to the rank of the logits minus one.
4453 """
4454 return sparse_softmax_cross_entropy_with_logits(
4455 labels=labels, logits=logits, name=name)
4458@tf_export("nn.avg_pool", v1=["nn.avg_pool_v2"])
4459@dispatch.add_dispatch_support
4460def avg_pool_v2(input, ksize, strides, padding, data_format=None, name=None): # pylint: disable=redefined-builtin
4461 """Performs the avg pooling on the input.
4463 Each entry in `output` is the mean of the corresponding size `ksize`
4464 window in `value`.
4466 Args:
4467 input: Tensor of rank N+2, of shape `[batch_size] + input_spatial_shape +
4468 [num_channels]` if `data_format` does not start with "NC" (default), or
4469 `[batch_size, num_channels] + input_spatial_shape` if data_format starts
4470 with "NC". Pooling happens over the spatial dimensions only.
4471 ksize: An int or list of `ints` that has length `1`, `N` or `N+2`. The size
4472 of the window for each dimension of the input tensor.
4473 strides: An int or list of `ints` that has length `1`, `N` or `N+2`. The
4474 stride of the sliding window for each dimension of the input tensor.
4475 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
4476 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
4477 for more information.
4478 data_format: A string. Specifies the channel dimension. For N=1 it can be
4479 either "NWC" (default) or "NCW", for N=2 it can be either "NHWC" (default)
4480 or "NCHW" and for N=3 either "NDHWC" (default) or "NCDHW".
4481 name: Optional name for the operation.
4483 Returns:
4484 A `Tensor` of format specified by `data_format`.
4485 The average pooled output tensor.
4486 """
4487 if input.shape is not None:
4488 n = len(input.shape) - 2
4489 elif data_format is not None:
4490 n = len(data_format) - 2
4491 else:
4492 raise ValueError(
4493 "`input` must have a static shape or `data_format` must be given. "
4494 f"Received: input.shape={input.shape} and "
4495 f"data_format={data_format}")
4496 if not 1 <= n <= 3:
4497 raise ValueError(
4498 f"`input.shape.rank` must be 3, 4 or 5. Received: "
4499 f"input.shape={input.shape} of rank {n + 2}.")
4501 if data_format is None:
4502 channel_index = n + 1
4503 else:
4504 channel_index = 1 if data_format.startswith("NC") else n + 1
4506 ksize = _get_sequence(ksize, n, channel_index, "ksize")
4507 strides = _get_sequence(strides, n, channel_index, "strides")
4509 avg_pooling_ops = {
4510 1: avg_pool1d,
4511 2: gen_nn_ops.avg_pool,
4512 3: gen_nn_ops.avg_pool3d
4513 }
4515 op = avg_pooling_ops[n]
4516 return op(
4517 input,
4518 ksize=ksize,
4519 strides=strides,
4520 padding=padding,
4521 data_format=data_format,
4522 name=name)
4525@tf_export(v1=["nn.avg_pool", "nn.avg_pool2d"])
4526@dispatch.add_dispatch_support
4527def avg_pool(value, ksize, strides, padding, data_format="NHWC",
4528 name=None, input=None): # pylint: disable=redefined-builtin
4529 """Performs the average pooling on the input.
4531 Each entry in `output` is the mean of the corresponding size `ksize`
4532 window in `value`.
4534 Args:
4535 value: A 4-D `Tensor` of shape `[batch, height, width, channels]` and type
4536 `float32`, `float64`, `qint8`, `quint8`, or `qint32`.
4537 ksize: An int or list of `ints` that has length `1`, `2` or `4`. The size of
4538 the window for each dimension of the input tensor.
4539 strides: An int or list of `ints` that has length `1`, `2` or `4`. The
4540 stride of the sliding window for each dimension of the input tensor.
4541 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
4542 See the "returns" section of `tf.nn.convolution` for details.
4543 data_format: A string. 'NHWC' and 'NCHW' are supported.
4544 name: Optional name for the operation.
4545 input: Alias for value.
4547 Returns:
4548 A `Tensor` with the same type as `value`. The average pooled output tensor.
4549 """
4550 with ops.name_scope(name, "AvgPool", [value]) as name:
4551 value = deprecation.deprecated_argument_lookup(
4552 "input", input, "value", value)
4554 if data_format is None:
4555 data_format = "NHWC"
4556 channel_index = 1 if data_format.startswith("NC") else 3
4558 ksize = _get_sequence(ksize, 2, channel_index, "ksize")
4559 strides = _get_sequence(strides, 2, channel_index, "strides")
4561 return gen_nn_ops.avg_pool(
4562 value,
4563 ksize=ksize,
4564 strides=strides,
4565 padding=padding,
4566 data_format=data_format,
4567 name=name)
4570@tf_export("nn.avg_pool2d", v1=[])
4571@dispatch.add_dispatch_support
4572def avg_pool2d(input, ksize, strides, padding, data_format="NHWC", name=None): # pylint: disable=redefined-builtin
4573 """Performs the average pooling on the input.
4575 Each entry in `output` is the mean of the corresponding size `ksize`
4576 window in `value`.
4578 Args:
4579 input: A 4-D `Tensor` of shape `[batch, height, width, channels]` and type
4580 `float32`, `float64`, `qint8`, `quint8`, or `qint32`.
4581 ksize: An int or list of `ints` that has length `1`, `2` or `4`. The size of
4582 the window for each dimension of the input tensor.
4583 strides: An int or list of `ints` that has length `1`, `2` or `4`. The
4584 stride of the sliding window for each dimension of the input tensor.
4585 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
4586 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
4587 for more information.
4588 data_format: A string. 'NHWC' and 'NCHW' are supported.
4589 name: Optional name for the operation.
4591 Returns:
4592 A `Tensor` with the same type as `value`. The average pooled output tensor.
4593 """
4594 with ops.name_scope(name, "AvgPool2D", [input]) as name:
4595 if data_format is None:
4596 data_format = "NHWC"
4597 channel_index = 1 if data_format.startswith("NC") else 3
4599 ksize = _get_sequence(ksize, 2, channel_index, "ksize")
4600 strides = _get_sequence(strides, 2, channel_index, "strides")
4602 return gen_nn_ops.avg_pool(
4603 input,
4604 ksize=ksize,
4605 strides=strides,
4606 padding=padding,
4607 data_format=data_format,
4608 name=name)
4611@tf_export("nn.avg_pool1d")
4612@dispatch.add_dispatch_support
4613def avg_pool1d(input, ksize, strides, padding, data_format="NWC", name=None): # pylint: disable=redefined-builtin
4614 """Performs the average pooling on the input.
4616 Each entry in `output` is the mean of the corresponding size `ksize`
4617 window in `value`.
4619 Note internally this op reshapes and uses the underlying 2d operation.
4621 Args:
4622 input: A 3-D `Tensor` of the format specified by `data_format`.
4623 ksize: An int or list of `ints` that has length `1` or `3`. The size of the
4624 window for each dimension of the input tensor.
4625 strides: An int or list of `ints` that has length `1` or `3`. The stride of
4626 the sliding window for each dimension of the input tensor.
4627 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
4628 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
4629 for more information.
4630 data_format: An optional string from: "NWC", "NCW". Defaults to "NWC".
4631 name: A name for the operation (optional).
4633 Returns:
4634 A `Tensor` of format specified by `data_format`.
4635 The max pooled output tensor.
4636 """
4637 with ops.name_scope(name, "AvgPool1D", [input]) as name:
4638 if data_format is None:
4639 data_format = "NWC"
4640 channel_index = 1 if data_format.startswith("NC") else 2
4641 ksize = [1] + _get_sequence(ksize, 1, channel_index, "ksize")
4642 strides = [1] + _get_sequence(strides, 1, channel_index, "strides")
4644 expanding_dim = 1 if data_format == "NWC" else 2
4645 data_format = "NHWC" if data_format == "NWC" else "NCHW"
4647 input = array_ops.expand_dims_v2(input, expanding_dim)
4648 result = gen_nn_ops.avg_pool(
4649 input,
4650 ksize=ksize,
4651 strides=strides,
4652 padding=padding,
4653 data_format=data_format,
4654 name=name)
4655 return array_ops.squeeze(result, expanding_dim)
4658@tf_export("nn.avg_pool3d")
4659@dispatch.add_dispatch_support
4660def avg_pool3d(input, ksize, strides, padding, data_format="NDHWC", name=None): # pylint: disable=redefined-builtin
4661 """Performs the average pooling on the input.
4663 Each entry in `output` is the mean of the corresponding size `ksize`
4664 window in `value`.
4666 Args:
4667 input: A 5-D `Tensor` of shape `[batch, depth, height, width, channels]`
4668 and type `float32`, `float64`, `qint8`, `quint8`, or `qint32`.
4669 ksize: An int or list of `ints` that has length `1`, `3` or `5`. The size of
4670 the window for each dimension of the input tensor.
4671 strides: An int or list of `ints` that has length `1`, `3` or `5`. The
4672 stride of the sliding window for each dimension of the input tensor.
4673 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
4674 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
4675 for more information.
4676 data_format: A string. 'NDHWC' and 'NCDHW' are supported.
4677 name: Optional name for the operation.
4679 Returns:
4680 A `Tensor` with the same type as `value`. The average pooled output tensor.
4681 """
4682 with ops.name_scope(name, "AvgPool3D", [input]) as name:
4683 if data_format is None:
4684 data_format = "NDHWC"
4685 channel_index = 1 if data_format.startswith("NC") else 3
4687 ksize = _get_sequence(ksize, 3, channel_index, "ksize")
4688 strides = _get_sequence(strides, 3, channel_index, "strides")
4690 return gen_nn_ops.avg_pool3d(
4691 input,
4692 ksize=ksize,
4693 strides=strides,
4694 padding=padding,
4695 data_format=data_format,
4696 name=name)
4699# pylint: disable=redefined-builtin
4700@tf_export("nn.max_pool", v1=["nn.max_pool_v2"])
4701@dispatch.add_dispatch_support
4702def max_pool_v2(input, ksize, strides, padding, data_format=None, name=None):
4703 """Performs max pooling on the input.
4705 For a given window of `ksize`, takes the maximum value within that window.
4706 Used for reducing computation and preventing overfitting.
4708 Consider an example of pooling with 2x2, non-overlapping windows:
4710 >>> matrix = tf.constant([
4711 ... [0, 0, 1, 7],
4712 ... [0, 2, 0, 0],
4713 ... [5, 2, 0, 0],
4714 ... [0, 0, 9, 8],
4715 ... ])
4716 >>> reshaped = tf.reshape(matrix, (1, 4, 4, 1))
4717 >>> tf.nn.max_pool(reshaped, ksize=2, strides=2, padding="SAME")
4718 <tf.Tensor: shape=(1, 2, 2, 1), dtype=int32, numpy=
4719 array([[[[2],
4720 [7]],
4721 [[5],
4722 [9]]]], dtype=int32)>
4724 We can adjust the window size using the `ksize` parameter. For example, if we
4725 were to expand the window to 3:
4727 >>> tf.nn.max_pool(reshaped, ksize=3, strides=2, padding="SAME")
4728 <tf.Tensor: shape=(1, 2, 2, 1), dtype=int32, numpy=
4729 array([[[[5],
4730 [7]],
4731 [[9],
4732 [9]]]], dtype=int32)>
4734 We've now picked up two additional large numbers (5 and 9) in two of the
4735 pooled spots.
4737 Note that our windows are now overlapping, since we're still moving by 2 units
4738 on each iteration. This is causing us to see the same 9 repeated twice, since
4739 it is part of two overlapping windows.
4741 We can adjust how far we move our window with each iteration using the
4742 `strides` parameter. Updating this to the same value as our window size
4743 eliminates the overlap:
4745 >>> tf.nn.max_pool(reshaped, ksize=3, strides=3, padding="SAME")
4746 <tf.Tensor: shape=(1, 2, 2, 1), dtype=int32, numpy=
4747 array([[[[2],
4748 [7]],
4749 [[5],
4750 [9]]]], dtype=int32)>
4752 Because the window does not neatly fit into our input, padding is added around
4753 the edges, giving us the same result as when we used a 2x2 window. We can skip
4754 padding altogether and simply drop the windows that do not fully fit into our
4755 input by instead passing `"VALID"` to the `padding` argument:
4757 >>> tf.nn.max_pool(reshaped, ksize=3, strides=3, padding="VALID")
4758 <tf.Tensor: shape=(1, 1, 1, 1), dtype=int32, numpy=array([[[[5]]]],
4759 dtype=int32)>
4761 Now we've grabbed the largest value in the 3x3 window starting from the upper-
4762 left corner. Since no other windows fit in our input, they are dropped.
4764 Args:
4765 input: Tensor of rank N+2, of shape `[batch_size] + input_spatial_shape +
4766 [num_channels]` if `data_format` does not start with "NC" (default), or
4767 `[batch_size, num_channels] + input_spatial_shape` if data_format starts
4768 with "NC". Pooling happens over the spatial dimensions only.
4769 ksize: An int or list of `ints` that has length `1`, `N` or `N+2`. The size
4770 of the window for each dimension of the input tensor.
4771 strides: An int or list of `ints` that has length `1`, `N` or `N+2`. The
4772 stride of the sliding window for each dimension of the input tensor.
4773 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
4774 padding algorithm to use, or a list indicating the explicit paddings at
4775 the start and end of each dimension. See
4776 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
4777 for more information. When explicit padding is used and data_format is
4778 `"NHWC"`, this should be in the form `[[0, 0], [pad_top, pad_bottom],
4779 [pad_left, pad_right], [0, 0]]`. When explicit padding used and
4780 data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
4781 [pad_top, pad_bottom], [pad_left, pad_right]]`. When using explicit
4782 padding, the size of the paddings cannot be greater than the sliding
4783 window size.
4784 data_format: A string. Specifies the channel dimension. For N=1 it can be
4785 either "NWC" (default) or "NCW", for N=2 it can be either "NHWC" (default)
4786 or "NCHW" and for N=3 either "NDHWC" (default) or "NCDHW".
4787 name: Optional name for the operation.
4789 Returns:
4790 A `Tensor` of format specified by `data_format`.
4791 The max pooled output tensor.
4793 Raises:
4794 ValueError: If
4795 - explicit padding is used with an input tensor of rank 5.
4796 - explicit padding is used with data_format='NCHW_VECT_C'.
4797 """
4798 if input.shape is not None:
4799 n = len(input.shape) - 2
4800 elif data_format is not None:
4801 n = len(data_format) - 2
4802 else:
4803 raise ValueError(
4804 "`input` must have a static shape or a data format must be given. "
4805 f"Received: input.shape={input.shape} and "
4806 f"data_format={data_format}")
4807 if not 1 <= n <= 3:
4808 raise ValueError(
4809 f"`input.shape.rank` must be 3, 4 or 5. Received: "
4810 f"input.shape={input.shape} of rank {n + 2}.")
4811 if data_format is None:
4812 channel_index = n + 1
4813 else:
4814 channel_index = 1 if data_format.startswith("NC") else n + 1
4816 if isinstance(padding, (list, tuple)) and data_format == "NCHW_VECT_C":
4817 raise ValueError("`data_format='NCHW_VECT_C'` is not supported with "
4818 f"explicit padding. Received: padding={padding}")
4820 ksize = _get_sequence(ksize, n, channel_index, "ksize")
4821 strides = _get_sequence(strides, n, channel_index, "strides")
4823 if (isinstance(padding, (list, tuple)) and n == 3):
4824 raise ValueError("Explicit padding is not supported with an input "
4825 f"tensor of rank 5. Received: padding={padding}")
4827 max_pooling_ops = {
4828 1: max_pool1d,
4829 2: max_pool2d,
4830 3: gen_nn_ops.max_pool3d
4831 }
4833 op = max_pooling_ops[n]
4834 return op(
4835 input,
4836 ksize=ksize,
4837 strides=strides,
4838 padding=padding,
4839 data_format=data_format,
4840 name=name)
4841# pylint: enable=redefined-builtin
4844@tf_export(v1=["nn.max_pool"])
4845@dispatch.add_dispatch_support
4846def max_pool(value,
4847 ksize,
4848 strides,
4849 padding,
4850 data_format="NHWC",
4851 name=None,
4852 input=None): # pylint: disable=redefined-builtin
4853 """Performs the max pooling on the input.
4855 Args:
4856 value: A 4-D `Tensor` of the format specified by `data_format`.
4857 ksize: An int or list of `ints` that has length `1`, `2` or `4`.
4858 The size of the window for each dimension of the input tensor.
4859 strides: An int or list of `ints` that has length `1`, `2` or `4`.
4860 The stride of the sliding window for each dimension of the input tensor.
4861 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
4862 padding algorithm to use, or a list indicating the explicit paddings at
4863 the start and end of each dimension. When explicit padding is used and
4864 data_format is `"NHWC"`, this should be in the form `[[0, 0], [pad_top,
4865 pad_bottom], [pad_left, pad_right], [0, 0]]`. When explicit padding used
4866 and data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
4867 [pad_top, pad_bottom], [pad_left, pad_right]]`. When using explicit
4868 padding, the size of the paddings cannot be greater than the sliding
4869 window size.
4870 data_format: A string. 'NHWC', 'NCHW' and 'NCHW_VECT_C' are supported.
4871 name: Optional name for the operation.
4872 input: Alias for value.
4874 Returns:
4875 A `Tensor` of format specified by `data_format`.
4876 The max pooled output tensor.
4877 """
4878 value = deprecation.deprecated_argument_lookup("input", input, "value", value)
4879 with ops.name_scope(name, "MaxPool", [value]) as name:
4880 if data_format is None:
4881 data_format = "NHWC"
4882 channel_index = 1 if data_format.startswith("NC") else 3
4884 ksize = _get_sequence(ksize, 2, channel_index, "ksize")
4885 strides = _get_sequence(strides, 2, channel_index, "strides")
4886 if isinstance(padding, (list, tuple)) and data_format == "NCHW_VECT_C":
4887 raise ValueError("`data_format='NCHW_VECT_C'` is not supported with "
4888 f"explicit padding. Received: padding={padding}")
4889 padding, explicit_paddings = convert_padding(padding)
4890 if ((np.isscalar(ksize) and ksize == 0) or
4891 (isinstance(ksize,
4892 (list, tuple, np.ndarray)) and any(v == 0 for v in ksize))):
4893 raise ValueError(f"`ksize` cannot be zero. Received: ksize={ksize}")
4895 return gen_nn_ops.max_pool(
4896 value,
4897 ksize=ksize,
4898 strides=strides,
4899 padding=padding,
4900 explicit_paddings=explicit_paddings,
4901 data_format=data_format,
4902 name=name)
4905# pylint: disable=redefined-builtin
4906@tf_export("nn.max_pool1d")
4907@dispatch.add_dispatch_support
4908def max_pool1d(input, ksize, strides, padding, data_format="NWC", name=None):
4909 """Performs the max pooling on the input.
4911 Note internally this op reshapes and uses the underlying 2d operation.
4913 Args:
4914 input: A 3-D `Tensor` of the format specified by `data_format`.
4915 ksize: An int or list of `ints` that has length `1` or `3`. The size of the
4916 window for each dimension of the input tensor.
4917 strides: An int or list of `ints` that has length `1` or `3`. The stride of
4918 the sliding window for each dimension of the input tensor.
4919 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
4920 padding algorithm to use, or a list indicating the explicit paddings at
4921 the start and end of each dimension. See
4922 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
4923 for more information. When explicit padding is used and data_format is
4924 `"NWC"`, this should be in the form `[[0, 0], [pad_left, pad_right], [0,
4925 0]]`. When explicit padding used and data_format is `"NCW"`, this should
4926 be in the form `[[0, 0], [0, 0], [pad_left, pad_right]]`. When using
4927 explicit padding, the size of the paddings cannot be greater than the
4928 sliding window size.
4929 data_format: An optional string from: "NWC", "NCW". Defaults to "NWC".
4930 name: A name for the operation (optional).
4932 Returns:
4933 A `Tensor` of format specified by `data_format`.
4934 The max pooled output tensor.
4935 """
4936 with ops.name_scope(name, "MaxPool1d", [input]) as name:
4937 if isinstance(padding, (list, tuple)) and data_format == "NCHW_VECT_C":
4938 raise ValueError("`data_format='NCHW_VECT_C'` is not supported with "
4939 f"explicit padding. Received: padding={padding}")
4940 if data_format is None:
4941 data_format = "NWC"
4942 channel_index = 1 if data_format.startswith("NC") else 2
4943 ksize = [1] + _get_sequence(ksize, 1, channel_index, "ksize")
4944 strides = [1] + _get_sequence(strides, 1, channel_index, "strides")
4945 padding, explicit_paddings = convert_padding(padding, 3)
4946 if padding == "EXPLICIT":
4947 explicit_paddings = [0, 0] + explicit_paddings
4949 expanding_dim = 1 if data_format == "NWC" else 2
4950 data_format = "NHWC" if data_format == "NWC" else "NCHW"
4952 input = array_ops.expand_dims_v2(input, expanding_dim)
4953 result = gen_nn_ops.max_pool(
4954 input,
4955 ksize=ksize,
4956 strides=strides,
4957 padding=padding,
4958 explicit_paddings=explicit_paddings,
4959 data_format=data_format,
4960 name=name)
4961 return array_ops.squeeze(result, expanding_dim)
4962# pylint: enable=redefined-builtin
4965# pylint: disable=redefined-builtin
4966@tf_export("nn.max_pool2d")
4967@dispatch.add_dispatch_support
4968def max_pool2d(input, ksize, strides, padding, data_format="NHWC", name=None):
4969 """Performs max pooling on 2D spatial data such as images.
4971 This is a more specific version of `tf.nn.max_pool` where the input tensor
4972 is 4D, representing 2D spatial data such as images. Using these APIs are
4973 equivalent
4975 Downsamples the input images along theirs spatial dimensions (height and
4976 width) by taking its maximum over an input window defined by `ksize`.
4977 The window is shifted by `strides` along each dimension.
4979 For example, for `strides=(2, 2)` and `padding=VALID` windows that extend
4980 outside of the input are not included in the output:
4982 >>> x = tf.constant([[1., 2., 3., 4.],
4983 ... [5., 6., 7., 8.],
4984 ... [9., 10., 11., 12.]])
4985 >>> # Add the `batch` and `channels` dimensions.
4986 >>> x = x[tf.newaxis, :, :, tf.newaxis]
4987 >>> result = tf.nn.max_pool2d(x, ksize=(2, 2), strides=(2, 2),
4988 ... padding="VALID")
4989 >>> result[0, :, :, 0]
4990 <tf.Tensor: shape=(1, 2), dtype=float32, numpy=
4991 array([[6., 8.]], dtype=float32)>
4993 With `padding=SAME`, we get:
4995 >>> x = tf.constant([[1., 2., 3., 4.],
4996 ... [5., 6., 7., 8.],
4997 ... [9., 10., 11., 12.]])
4998 >>> x = x[tf.newaxis, :, :, tf.newaxis]
4999 >>> result = tf.nn.max_pool2d(x, ksize=(2, 2), strides=(2, 2),
5000 ... padding='SAME')
5001 >>> result[0, :, :, 0]
5002 <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
5003 array([[ 6., 8.],
5004 [10.,12.]], dtype=float32)>
5006 We can also specify padding explicitly. The following example adds width-1
5007 padding on all sides (top, bottom, left, right):
5009 >>> x = tf.constant([[1., 2., 3., 4.],
5010 ... [5., 6., 7., 8.],
5011 ... [9., 10., 11., 12.]])
5012 >>> x = x[tf.newaxis, :, :, tf.newaxis]
5013 >>> result = tf.nn.max_pool2d(x, ksize=(2, 2), strides=(2, 2),
5014 ... padding=[[0, 0], [1, 1], [1, 1], [0, 0]])
5015 >>> result[0, :, :, 0]
5016 <tf.Tensor: shape=(2, 3), dtype=float32, numpy=
5017 array([[ 1., 3., 4.],
5018 [ 9., 11., 12.]], dtype=float32)>
5020 For more examples and detail, see `tf.nn.max_pool`.
5022 Args:
5023 input: A 4-D `Tensor` of the format specified by `data_format`.
5024 ksize: An int or list of `ints` that has length `1`, `2` or `4`. The size of
5025 the window for each dimension of the input tensor. If only one integer is
5026 specified, then we apply the same window for all 4 dims. If two are
5027 provided then we use those for H, W dimensions and keep N, C dimension
5028 window size = 1.
5029 strides: An int or list of `ints` that has length `1`, `2` or `4`. The
5030 stride of the sliding window for each dimension of the input tensor. If
5031 only one integer is specified, we apply the same stride to all 4 dims. If
5032 two are provided we use those for the H, W dimensions and keep N, C of
5033 stride = 1.
5034 padding: Either the `string` `"SAME"` or `"VALID"` indicating the type of
5035 padding algorithm to use, or a list indicating the explicit paddings at
5036 the start and end of each dimension. See
5037 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
5038 for more information. When explicit padding is used and data_format is
5039 `"NHWC"`, this should be in the form `[[0, 0], [pad_top, pad_bottom],
5040 [pad_left, pad_right], [0, 0]]`. When explicit padding used and
5041 data_format is `"NCHW"`, this should be in the form `[[0, 0], [0, 0],
5042 [pad_top, pad_bottom], [pad_left, pad_right]]`. When using explicit
5043 padding, the size of the paddings cannot be greater than the sliding
5044 window size.
5045 data_format: A string. 'NHWC', 'NCHW' and 'NCHW_VECT_C' are supported.
5046 name: Optional name for the operation.
5048 Returns:
5049 A `Tensor` of format specified by `data_format`.
5050 The max pooled output tensor.
5052 Raises:
5053 ValueError: If explicit padding is used with data_format='NCHW_VECT_C'.
5054 """
5055 with ops.name_scope(name, "MaxPool2d", [input]) as name:
5056 if data_format is None:
5057 data_format = "NHWC"
5058 channel_index = 1 if data_format.startswith("NC") else 3
5060 ksize = _get_sequence(ksize, 2, channel_index, "ksize")
5061 strides = _get_sequence(strides, 2, channel_index, "strides")
5062 if isinstance(padding, (list, tuple)) and data_format == "NCHW_VECT_C":
5063 raise ValueError("`data_format='NCHW_VECT_C'` is not supported with "
5064 f"explicit padding. Received: padding={padding}")
5065 padding, explicit_paddings = convert_padding(padding)
5067 return gen_nn_ops.max_pool(
5068 input,
5069 ksize=ksize,
5070 strides=strides,
5071 padding=padding,
5072 explicit_paddings=explicit_paddings,
5073 data_format=data_format,
5074 name=name)
5075# pylint: enable=redefined-builtin
5078# pylint: disable=redefined-builtin
5079@tf_export("nn.max_pool3d")
5080@dispatch.add_dispatch_support
5081def max_pool3d(input, ksize, strides, padding, data_format="NDHWC", name=None):
5082 """Performs the max pooling on the input.
5084 Args:
5085 input: A 5-D `Tensor` of the format specified by `data_format`.
5086 ksize: An int or list of `ints` that has length `1`, `3` or `5`. The size of
5087 the window for each dimension of the input tensor.
5088 strides: An int or list of `ints` that has length `1`, `3` or `5`. The
5089 stride of the sliding window for each dimension of the input tensor.
5090 padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm. See
5091 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
5092 for more information.
5093 data_format: An optional string from: "NDHWC", "NCDHW". Defaults to "NDHWC".
5094 The data format of the input and output data. With the default format
5095 "NDHWC", the data is stored in the order of: [batch, in_depth, in_height,
5096 in_width, in_channels]. Alternatively, the format could be "NCDHW", the
5097 data storage order is: [batch, in_channels, in_depth, in_height,
5098 in_width].
5099 name: A name for the operation (optional).
5101 Returns:
5102 A `Tensor` of format specified by `data_format`.
5103 The max pooled output tensor.
5104 """
5105 with ops.name_scope(name, "MaxPool3D", [input]) as name:
5106 if data_format is None:
5107 data_format = "NDHWC"
5108 channel_index = 1 if data_format.startswith("NC") else 4
5110 ksize = _get_sequence(ksize, 3, channel_index, "ksize")
5111 strides = _get_sequence(strides, 3, channel_index, "strides")
5113 return gen_nn_ops.max_pool3d(
5114 input,
5115 ksize=ksize,
5116 strides=strides,
5117 padding=padding,
5118 data_format=data_format,
5119 name=name)
5120# pylint: enable=redefined-builtin
5123@tf_export("nn.max_pool_with_argmax", v1=[])
5124@dispatch.add_dispatch_support
5125def max_pool_with_argmax_v2(
5126 input, # pylint: disable=redefined-builtin
5127 ksize,
5128 strides,
5129 padding,
5130 data_format="NHWC",
5131 output_dtype=dtypes.int64,
5132 include_batch_in_index=False,
5133 name=None):
5134 """Performs max pooling on the input and outputs both max values and indices.
5136 The indices in `argmax` are flattened, so that a maximum value at position
5137 `[b, y, x, c]` becomes flattened index: `(y * width + x) * channels + c` if
5138 `include_batch_in_index` is False;
5139 `((b * height + y) * width + x) * channels + c`
5140 if `include_batch_in_index` is True.
5142 The indices returned are always in `[0, height) x [0, width)` before
5143 flattening, even if padding is involved and the mathematically correct answer
5144 is outside (either negative or too large). This is a bug, but fixing it is
5145 difficult to do in a safe backwards compatible way, especially due to
5146 flattening.
5148 Args:
5149 input: A `Tensor`. Must be one of the following types: `float32`, `float64`,
5150 `int32`, `uint8`, `int16`, `int8`, `int64`, `bfloat16`, `uint16`, `half`,
5151 `uint32`, `uint64`.
5152 4-D with shape `[batch, height, width, channels]`. Input to pool over.
5153 ksize: An int or list of `ints` that has length `1`, `2` or `4`.
5154 The size of the window for each dimension of the input tensor.
5155 strides: An int or list of `ints` that has length `1`, `2` or `4`.
5156 The stride of the sliding window for each dimension of the
5157 input tensor.
5158 padding: A `string` from: `"SAME", "VALID"`.
5159 The type of padding algorithm to use. See
5160 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
5161 for more information.
5162 data_format: An optional `string`, must be set to `"NHWC"`. Defaults to
5163 `"NHWC"`.
5164 Specify the data format of the input and output data.
5165 output_dtype: An optional `tf.DType` from: `tf.int32, tf.int64`.
5166 Defaults to `tf.int64`.
5167 The dtype of the returned argmax tensor.
5168 include_batch_in_index: An optional `boolean`. Defaults to `False`.
5169 Whether to include batch dimension in flattened index of `argmax`.
5170 name: A name for the operation (optional).
5172 Returns:
5173 A tuple of `Tensor` objects (output, argmax).
5175 output: A `Tensor`. Has the same type as `input`.
5176 argmax: A `Tensor` of type `output_dtype`.
5177 """
5179 if data_format != "NHWC":
5180 raise ValueError("`data_format` values other than 'NHWC' are not "
5181 f"supported. Received: data_format={data_format}")
5183 ksize = _get_sequence(ksize, 2, 3, "ksize")
5184 strides = _get_sequence(strides, 2, 3, "strides")
5186 return gen_nn_ops.max_pool_with_argmax(
5187 input=input,
5188 ksize=ksize,
5189 strides=strides,
5190 padding=padding,
5191 Targmax=output_dtype,
5192 include_batch_in_index=include_batch_in_index,
5193 name=name)
5196@tf_export(v1=["nn.max_pool_with_argmax"])
5197@dispatch.add_dispatch_support
5198def max_pool_with_argmax_v1( # pylint: disable=missing-docstring,invalid-name
5199 input, # pylint: disable=redefined-builtin
5200 ksize,
5201 strides,
5202 padding,
5203 data_format="NHWC",
5204 Targmax=None,
5205 name=None,
5206 output_dtype=None,
5207 include_batch_in_index=False):
5208 if data_format != "NHWC":
5209 raise ValueError("`data_format` values other than 'NHWC' are not "
5210 f"supported. Received: data_format={data_format}")
5212 Targmax = deprecated_argument_lookup(
5213 "output_dtype", output_dtype, "Targmax", Targmax)
5214 if Targmax is None:
5215 Targmax = dtypes.int64
5216 return gen_nn_ops.max_pool_with_argmax(
5217 input=input,
5218 ksize=ksize,
5219 strides=strides,
5220 padding=padding,
5221 Targmax=Targmax,
5222 include_batch_in_index=include_batch_in_index,
5223 name=name)
5226max_pool_with_argmax_v1.__doc__ = gen_nn_ops.max_pool_with_argmax.__doc__
5229@ops.RegisterStatistics("Conv3D", "flops")
5230def _calc_conv3d_flops(graph, node):
5231 """Calculates the compute resources needed for Conv3D."""
5232 input_shape = graph_util.tensor_shape_from_node_def_name(graph, node.input[0])
5233 input_shape.assert_is_fully_defined()
5234 filter_shape = graph_util.tensor_shape_from_node_def_name(
5235 graph, node.input[1])
5236 filter_shape.assert_is_fully_defined()
5237 output_shape = graph_util.tensor_shape_from_node_def_name(graph, node.name)
5238 output_shape.assert_is_fully_defined()
5239 filter_time = int(filter_shape[0])
5240 filter_height = int(filter_shape[1])
5241 filter_width = int(filter_shape[2])
5242 filter_in_depth = int(filter_shape[3])
5243 output_count = np.prod(output_shape.as_list(), dtype=np.int64)
5244 return ops.OpStats("flops", (output_count * filter_in_depth * filter_time *
5245 filter_height * filter_width * 2))
5248@ops.RegisterStatistics("Conv2D", "flops")
5249def _calc_conv_flops(graph, node):
5250 """Calculates the compute resources needed for Conv2D."""
5251 input_shape = graph_util.tensor_shape_from_node_def_name(graph, node.input[0])
5252 input_shape.assert_is_fully_defined()
5253 filter_shape = graph_util.tensor_shape_from_node_def_name(
5254 graph, node.input[1])
5255 filter_shape.assert_is_fully_defined()
5256 output_shape = graph_util.tensor_shape_from_node_def_name(graph, node.name)
5257 output_shape.assert_is_fully_defined()
5258 filter_height = int(filter_shape[0])
5259 filter_width = int(filter_shape[1])
5260 filter_in_depth = int(filter_shape[2])
5261 output_count = np.prod(output_shape.as_list(), dtype=np.int64)
5262 return ops.OpStats(
5263 "flops",
5264 (output_count * filter_in_depth * filter_height * filter_width * 2))
5267@ops.RegisterStatistics("DepthwiseConv2dNative", "flops")
5268def _calc_depthwise_conv_flops(graph, node):
5269 """Calculates the compute resources needed for DepthwiseConv2dNative."""
5270 input_shape = graph_util.tensor_shape_from_node_def_name(graph, node.input[0])
5271 input_shape.assert_is_fully_defined()
5272 filter_shape = graph_util.tensor_shape_from_node_def_name(
5273 graph, node.input[1])
5274 filter_shape.assert_is_fully_defined()
5275 output_shape = graph_util.tensor_shape_from_node_def_name(graph, node.name)
5276 output_shape.assert_is_fully_defined()
5277 filter_height = int(filter_shape[0])
5278 filter_width = int(filter_shape[1])
5279 output_count = np.prod(output_shape.as_list(), dtype=np.int64)
5280 return ops.OpStats("flops", (output_count * filter_height * filter_width * 2))
5283@ops.RegisterStatistics("BiasAdd", "flops")
5284def _calc_bias_add_flops(graph, node):
5285 """Calculates the computing needed for BiasAdd."""
5286 input_shape = graph_util.tensor_shape_from_node_def_name(graph, node.input[0])
5287 input_shape.assert_is_fully_defined()
5288 input_count = np.prod(input_shape.as_list())
5289 return ops.OpStats("flops", input_count)
5292@tf_export(v1=["nn.xw_plus_b"])
5293@dispatch.add_dispatch_support
5294def xw_plus_b(x, weights, biases, name=None): # pylint: disable=invalid-name
5295 """Computes matmul(x, weights) + biases.
5297 Args:
5298 x: a 2D tensor. Dimensions typically: batch, in_units
5299 weights: a 2D tensor. Dimensions typically: in_units, out_units
5300 biases: a 1D tensor. Dimensions: out_units
5301 name: A name for the operation (optional). If not specified
5302 "xw_plus_b" is used.
5304 Returns:
5305 A 2-D Tensor computing matmul(x, weights) + biases.
5306 Dimensions typically: batch, out_units.
5307 """
5308 with ops.name_scope(name, "xw_plus_b", [x, weights, biases]) as name:
5309 x = ops.convert_to_tensor(x, name="x")
5310 weights = ops.convert_to_tensor(weights, name="weights")
5311 biases = ops.convert_to_tensor(biases, name="biases")
5312 mm = math_ops.matmul(x, weights)
5313 return bias_add(mm, biases, name=name)
5316def xw_plus_b_v1(x, weights, biases, name=None):
5317 """Computes matmul(x, weights) + biases.
5319 This is a deprecated version of that will soon be removed.
5321 Args:
5322 x: a 2D tensor. Dimensions typically: batch, in_units
5323 weights: a 2D tensor. Dimensions typically: in_units, out_units
5324 biases: a 1D tensor. Dimensions: out_units
5325 name: A name for the operation (optional). If not specified
5326 "xw_plus_b_v1" is used.
5328 Returns:
5329 A 2-D Tensor computing matmul(x, weights) + biases.
5330 Dimensions typically: batch, out_units.
5331 """
5332 with ops.name_scope(name, "xw_plus_b_v1", [x, weights, biases]) as name:
5333 x = ops.convert_to_tensor(x, name="x")
5334 weights = ops.convert_to_tensor(weights, name="weights")
5335 biases = ops.convert_to_tensor(biases, name="biases")
5336 mm = math_ops.matmul(x, weights)
5337 return bias_add_v1(mm, biases, name=name)
5340def _get_noise_shape(x, noise_shape):
5341 # If noise_shape is none return immediately.
5342 if noise_shape is None:
5343 return array_ops.shape(x)
5345 try:
5346 # Best effort to figure out the intended shape.
5347 # If not possible, let the op to handle it.
5348 # In eager mode exception will show up.
5349 noise_shape_ = tensor_shape.as_shape(noise_shape)
5350 except (TypeError, ValueError):
5351 return noise_shape
5353 if x.shape.dims is not None and len(x.shape.dims) == len(noise_shape_.dims):
5354 new_dims = []
5355 for i, dim in enumerate(x.shape.dims):
5356 if noise_shape_.dims[i].value is None and dim.value is not None:
5357 new_dims.append(dim.value)
5358 else:
5359 new_dims.append(noise_shape_.dims[i].value)
5360 return tensor_shape.TensorShape(new_dims)
5362 return noise_shape
5365@tf_export(v1=["nn.dropout"])
5366@dispatch.add_dispatch_support
5367@deprecation.deprecated_args(None, "Please use `rate` instead of `keep_prob`. "
5368 "Rate should be set to `rate = 1 - keep_prob`.",
5369 "keep_prob")
5370def dropout(x, keep_prob=None, noise_shape=None, seed=None, name=None,
5371 rate=None):
5372 """Computes dropout.
5374 For each element of `x`, with probability `rate`, outputs `0`, and otherwise
5375 scales up the input by `1 / (1-rate)`. The scaling is such that the expected
5376 sum is unchanged.
5378 By default, each element is kept or dropped independently. If `noise_shape`
5379 is specified, it must be
5380 [broadcastable](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
5381 to the shape of `x`, and only dimensions with `noise_shape[i] == shape(x)[i]`
5382 will make independent decisions. For example, if `shape(x) = [k, l, m, n]`
5383 and `noise_shape = [k, 1, 1, n]`, each batch and channel component will be
5384 kept independently and each row and column will be kept or not kept together.
5386 Args:
5387 x: A floating point tensor.
5388 keep_prob: (deprecated) A deprecated alias for `(1-rate)`.
5389 noise_shape: A 1-D integer `Tensor`, representing the
5390 shape for randomly generated keep/drop flags.
5391 seed: A Python integer. Used to create random seeds. See
5392 `tf.random.set_seed` for behavior.
5393 name: A name for this operation (optional).
5394 rate: A scalar `Tensor` with the same type as `x`. The probability that each
5395 element of `x` is discarded.
5397 Returns:
5398 A Tensor of the same shape of `x`.
5400 Raises:
5401 ValueError: If `rate` is not in `[0, 1)` or if `x` is not a floating
5402 point tensor.
5403 """
5404 try:
5405 rate_from_keep_prob = 1. - keep_prob if keep_prob is not None else None
5406 except TypeError:
5407 raise ValueError("`keep_prob` must be a floating point number or Tensor. "
5408 f"Received: keep_prob={keep_prob}")
5410 rate = deprecation.deprecated_argument_lookup(
5411 "rate", rate,
5412 "keep_prob", rate_from_keep_prob)
5414 if rate is None:
5415 raise ValueError(f"`rate` must be provided. Received: rate={rate}")
5417 return dropout_v2(x, rate, noise_shape=noise_shape, seed=seed, name=name)
5420@tf_export("nn.dropout", v1=[])
5421@dispatch.add_dispatch_support
5422def dropout_v2(x, rate, noise_shape=None, seed=None, name=None):
5423 """Computes dropout: randomly sets elements to zero to prevent overfitting.
5425 Warning: You should consider using
5426 `tf.nn.experimental.stateless_dropout` instead of this function. The
5427 difference between `tf.nn.experimental.stateless_dropout` and this
5428 function is analogous to the difference between
5429 `tf.random.stateless_uniform` and `tf.random.uniform`. Please see
5430 [Random number
5431 generation](https://www.tensorflow.org/guide/random_numbers) guide
5432 for a detailed description of the various RNG systems in TF. As the
5433 guide states, legacy stateful RNG ops like `tf.random.uniform` and
5434 `tf.nn.dropout` are not deprecated yet but highly discouraged,
5435 because their states are hard to control.
5437 Note: The behavior of dropout has changed between TensorFlow 1.x and 2.x.
5438 When converting 1.x code, please use named arguments to ensure behavior stays
5439 consistent.
5441 See also: `tf.keras.layers.Dropout` for a dropout layer.
5443 [Dropout](https://arxiv.org/abs/1207.0580) is useful for regularizing DNN
5444 models. Inputs elements are randomly set to zero (and the other elements are
5445 rescaled). This encourages each node to be independently useful, as it cannot
5446 rely on the output of other nodes.
5448 More precisely: With probability `rate` elements of `x` are set to `0`.
5449 The remaining elements are scaled up by `1.0 / (1 - rate)`, so that the
5450 expected value is preserved.
5452 >>> tf.random.set_seed(0)
5453 >>> x = tf.ones([3,5])
5454 >>> tf.nn.dropout(x, rate = 0.5, seed = 1).numpy()
5455 array([[2., 0., 0., 2., 2.],
5456 [2., 2., 2., 2., 2.],
5457 [2., 0., 2., 0., 2.]], dtype=float32)
5459 >>> tf.random.set_seed(0)
5460 >>> x = tf.ones([3,5])
5461 >>> tf.nn.dropout(x, rate = 0.8, seed = 1).numpy()
5462 array([[0., 0., 0., 5., 5.],
5463 [0., 5., 0., 5., 0.],
5464 [5., 0., 5., 0., 5.]], dtype=float32)
5466 >>> tf.nn.dropout(x, rate = 0.0) == x
5467 <tf.Tensor: shape=(3, 5), dtype=bool, numpy=
5468 array([[ True, True, True, True, True],
5469 [ True, True, True, True, True],
5470 [ True, True, True, True, True]])>
5473 By default, each element is kept or dropped independently. If `noise_shape`
5474 is specified, it must be
5475 [broadcastable](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
5476 to the shape of `x`, and only dimensions with `noise_shape[i] == shape(x)[i]`
5477 will make independent decisions. This is useful for dropping whole
5478 channels from an image or sequence. For example:
5480 >>> tf.random.set_seed(0)
5481 >>> x = tf.ones([3,10])
5482 >>> tf.nn.dropout(x, rate = 2/3, noise_shape=[1,10], seed=1).numpy()
5483 array([[0., 0., 0., 3., 3., 0., 3., 3., 3., 0.],
5484 [0., 0., 0., 3., 3., 0., 3., 3., 3., 0.],
5485 [0., 0., 0., 3., 3., 0., 3., 3., 3., 0.]], dtype=float32)
5487 Args:
5488 x: A floating point tensor.
5489 rate: A scalar `Tensor` with the same type as x. The probability
5490 that each element is dropped. For example, setting rate=0.1 would drop
5491 10% of input elements.
5492 noise_shape: A 1-D integer `Tensor`, representing the
5493 shape for randomly generated keep/drop flags.
5494 seed: A Python integer. Used to create random seeds. See
5495 `tf.random.set_seed` for behavior.
5496 name: A name for this operation (optional).
5498 Returns:
5499 A Tensor of the same shape of `x`.
5501 Raises:
5502 ValueError: If `rate` is not in `[0, 1)` or if `x` is not a floating point
5503 tensor. `rate=1` is disallowed, because the output would be all zeros,
5504 which is likely not what was intended.
5505 """
5506 uniform_sampler = functools.partial(random_ops.random_uniform, seed=seed)
5507 def dummy_rng_step():
5508 random_seed.get_seed(seed)
5509 return _dropout(x=x, rate=rate, noise_shape=noise_shape,
5510 uniform_sampler=uniform_sampler,
5511 dummy_rng_step=dummy_rng_step, name=name,
5512 default_name="dropout")
5515@tf_export("nn.experimental.stateless_dropout")
5516@dispatch.add_dispatch_support
5517def stateless_dropout(x, rate, seed, rng_alg=None, noise_shape=None, name=None):
5518 """Computes dropout: randomly sets elements to zero to prevent overfitting.
5520 [Dropout](https://arxiv.org/abs/1207.0580) is useful for regularizing DNN
5521 models. Inputs elements are randomly set to zero (and the other elements are
5522 rescaled). This encourages each node to be independently useful, as it cannot
5523 rely on the output of other nodes.
5525 More precisely: With probability `rate` elements of `x` are set to `0`.
5526 The remaining elements are scaled up by `1.0 / (1 - rate)`, so that the
5527 expected value is preserved.
5529 >>> x = tf.ones([3,5])
5530 >>> tf.nn.experimental.stateless_dropout(x, rate=0.5, seed=[1, 0])
5531 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5532 array([[2., 0., 2., 0., 0.],
5533 [0., 0., 2., 0., 2.],
5534 [0., 0., 0., 0., 2.]], dtype=float32)>
5536 >>> x = tf.ones([3,5])
5537 >>> tf.nn.experimental.stateless_dropout(x, rate=0.8, seed=[1, 0])
5538 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5539 array([[5., 0., 0., 0., 0.],
5540 [0., 0., 0., 0., 5.],
5541 [0., 0., 0., 0., 5.]], dtype=float32)>
5543 >>> tf.nn.experimental.stateless_dropout(x, rate=0.0, seed=[1, 0]) == x
5544 <tf.Tensor: shape=(3, 5), dtype=bool, numpy=
5545 array([[ True, True, True, True, True],
5546 [ True, True, True, True, True],
5547 [ True, True, True, True, True]])>
5550 This function is a stateless version of `tf.nn.dropout`, in the
5551 sense that no matter how many times you call this function, the same
5552 `seed` will lead to the same results, and different `seed` will lead
5553 to different results.
5555 >>> x = tf.ones([3,5])
5556 >>> tf.nn.experimental.stateless_dropout(x, rate=0.8, seed=[1, 0])
5557 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5558 array([[5., 0., 0., 0., 0.],
5559 [0., 0., 0., 0., 5.],
5560 [0., 0., 0., 0., 5.]], dtype=float32)>
5561 >>> tf.nn.experimental.stateless_dropout(x, rate=0.8, seed=[1, 0])
5562 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5563 array([[5., 0., 0., 0., 0.],
5564 [0., 0., 0., 0., 5.],
5565 [0., 0., 0., 0., 5.]], dtype=float32)>
5566 >>> tf.nn.experimental.stateless_dropout(x, rate=0.8, seed=[2, 0])
5567 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5568 array([[5., 0., 0., 0., 0.],
5569 [0., 0., 0., 5., 0.],
5570 [0., 0., 0., 0., 0.]], dtype=float32)>
5571 >>> tf.nn.experimental.stateless_dropout(x, rate=0.8, seed=[2, 0])
5572 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5573 array([[5., 0., 0., 0., 0.],
5574 [0., 0., 0., 5., 0.],
5575 [0., 0., 0., 0., 0.]], dtype=float32)>
5577 Compare the above results to those of `tf.nn.dropout` below. The
5578 second time `tf.nn.dropout` is called with the same seed, it will
5579 give a different output.
5581 >>> tf.random.set_seed(0)
5582 >>> x = tf.ones([3,5])
5583 >>> tf.nn.dropout(x, rate=0.8, seed=1)
5584 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5585 array([[0., 0., 0., 5., 5.],
5586 [0., 5., 0., 5., 0.],
5587 [5., 0., 5., 0., 5.]], dtype=float32)>
5588 >>> tf.nn.dropout(x, rate=0.8, seed=1)
5589 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5590 array([[0., 0., 0., 0., 0.],
5591 [0., 0., 0., 5., 0.],
5592 [0., 0., 0., 0., 0.]], dtype=float32)>
5593 >>> tf.nn.dropout(x, rate=0.8, seed=2)
5594 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5595 array([[0., 0., 0., 0., 0.],
5596 [0., 5., 0., 5., 0.],
5597 [0., 0., 0., 0., 0.]], dtype=float32)>
5598 >>> tf.nn.dropout(x, rate=0.8, seed=2)
5599 <tf.Tensor: shape=(3, 5), dtype=float32, numpy=
5600 array([[0., 0., 0., 0., 0.],
5601 [5., 0., 5., 0., 5.],
5602 [0., 5., 0., 0., 5.]], dtype=float32)>
5604 The difference between this function and `tf.nn.dropout` is
5605 analogous to the difference between `tf.random.stateless_uniform`
5606 and `tf.random.uniform`. Please see [Random number
5607 generation](https://www.tensorflow.org/guide/random_numbers) guide
5608 for a detailed description of the various RNG systems in TF. As the
5609 guide states, legacy stateful RNG ops like `tf.random.uniform` and
5610 `tf.nn.dropout` are not deprecated yet but highly discouraged,
5611 because their states are hard to control.
5613 By default, each element is kept or dropped independently. If `noise_shape`
5614 is specified, it must be
5615 [broadcastable](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
5616 to the shape of `x`, and only dimensions with `noise_shape[i] == shape(x)[i]`
5617 will make independent decisions. This is useful for dropping whole
5618 channels from an image or sequence. For example:
5620 >>> x = tf.ones([3,10])
5621 >>> tf.nn.experimental.stateless_dropout(x, rate=2/3, noise_shape=[1,10],
5622 ... seed=[1, 0])
5623 <tf.Tensor: shape=(3, 10), dtype=float32, numpy=
5624 array([[3., 0., 0., 0., 0., 0., 0., 3., 0., 3.],
5625 [3., 0., 0., 0., 0., 0., 0., 3., 0., 3.],
5626 [3., 0., 0., 0., 0., 0., 0., 3., 0., 3.]], dtype=float32)>
5628 Args:
5629 x: A floating point tensor.
5630 rate: A scalar `Tensor` with the same type as x. The probability
5631 that each element is dropped. For example, setting rate=0.1 would drop
5632 10% of input elements.
5633 seed: An integer tensor of shape `[2]`. The seed of the random numbers.
5634 rng_alg: The algorithm used to generate the random numbers
5635 (default to `"auto_select"`). See the `alg` argument of
5636 `tf.random.stateless_uniform` for the supported values.
5637 noise_shape: A 1-D integer `Tensor`, representing the
5638 shape for randomly generated keep/drop flags.
5639 name: A name for this operation.
5641 Returns:
5642 A Tensor of the same shape and dtype of `x`.
5644 Raises:
5645 ValueError: If `rate` is not in `[0, 1)` or if `x` is not a floating point
5646 tensor. `rate=1` is disallowed, because the output would be all zeros,
5647 which is likely not what was intended.
5648 """
5649 uniform_sampler = functools.partial(
5650 stateless_random_ops.stateless_random_uniform, seed=seed, alg=rng_alg)
5651 def dummy_rng_step():
5652 pass
5653 return _dropout(x=x, rate=rate, noise_shape=noise_shape,
5654 uniform_sampler=uniform_sampler,
5655 dummy_rng_step=dummy_rng_step, name=name,
5656 default_name="stateless_dropout")
5659@tf_export("nn.experimental.general_dropout")
5660@dispatch.add_dispatch_support
5661def general_dropout(x, rate, uniform_sampler, noise_shape=None, name=None):
5662 """Computes dropout: randomly sets elements to zero to prevent overfitting.
5664 Please see `tf.nn.experimental.stateless_dropout` for an overview
5665 of dropout.
5667 Unlike `tf.nn.experimental.stateless_dropout`, here you can supply a
5668 custom sampler function `uniform_sampler` that (given a shape and a
5669 dtype) generates a random, `Uniform[0, 1)`-distributed tensor (of
5670 that shape and dtype). `uniform_sampler` can be
5671 e.g. `tf.random.stateless_random_uniform` or
5672 `tf.random.Generator.uniform`.
5674 For example, if you are using `tf.random.Generator` to generate
5675 random numbers, you can use this code to do dropouts:
5677 >>> g = tf.random.Generator.from_seed(7)
5678 >>> sampler = g.uniform
5679 >>> x = tf.constant([1.1, 2.2, 3.3, 4.4, 5.5])
5680 >>> rate = 0.5
5681 >>> tf.nn.experimental.general_dropout(x, rate, sampler)
5682 <tf.Tensor: shape=(5,), ..., numpy=array([ 0. , 4.4, 6.6, 8.8, 11. ], ...)>
5683 >>> tf.nn.experimental.general_dropout(x, rate, sampler)
5684 <tf.Tensor: shape=(5,), ..., numpy=array([2.2, 0. , 0. , 8.8, 0. ], ...)>
5686 It has better performance than using
5687 `tf.nn.experimental.stateless_dropout` and
5688 `tf.random.Generator.make_seeds`:
5690 >>> g = tf.random.Generator.from_seed(7)
5691 >>> x = tf.constant([1.1, 2.2, 3.3, 4.4, 5.5])
5692 >>> rate = 0.5
5693 >>> tf.nn.experimental.stateless_dropout(x, rate, g.make_seeds(1)[:, 0])
5694 <tf.Tensor: shape=(5,), ..., numpy=array([ 2.2, 4.4, 6.6, 0. , 11. ], ...)>
5695 >>> tf.nn.experimental.stateless_dropout(x, rate, g.make_seeds(1)[:, 0])
5696 <tf.Tensor: shape=(5,), ..., numpy=array([2.2, 0. , 6.6, 8.8, 0. ], ...>
5698 because generating and consuming seeds cost extra
5699 computation. `tf.nn.experimental.general_dropout` can let you avoid
5700 them.
5702 Args:
5703 x: A floating point tensor.
5704 rate: A scalar `Tensor` with the same type as x. The probability
5705 that each element is dropped. For example, setting rate=0.1 would drop
5706 10% of input elements.
5707 uniform_sampler: a callable of signature `(shape, dtype) ->
5708 Tensor[shape, dtype]`, used to generate a tensor of uniformly-distributed
5709 random numbers in the range `[0, 1)`, of the given shape and dtype.
5710 noise_shape: A 1-D integer `Tensor`, representing the
5711 shape for randomly generated keep/drop flags.
5712 name: A name for this operation.
5714 Returns:
5715 A Tensor of the same shape and dtype of `x`.
5717 Raises:
5718 ValueError: If `rate` is not in `[0, 1)` or if `x` is not a floating point
5719 tensor. `rate=1` is disallowed, because the output would be all zeros,
5720 which is likely not what was intended.
5721 """
5722 def dummy_rng_step():
5723 pass
5724 return _dropout(x=x, rate=rate, noise_shape=noise_shape,
5725 uniform_sampler=uniform_sampler,
5726 dummy_rng_step=dummy_rng_step, name=name,
5727 default_name="general_dropout")
5730def _dropout(x, rate, noise_shape, uniform_sampler, dummy_rng_step, name,
5731 default_name):
5732 """Shared implementation of the various dropout functions.
5734 Args:
5735 x: same as the namesake in `dropout_v2`.
5736 rate: same as the namesake in `dropout_v2`.
5737 noise_shape: same as the namesake in `dropout_v2`.
5738 uniform_sampler: a callable of signature `(shape, dtype) ->
5739 Tensor`, used to generate a tensor of uniformly-distributed
5740 random numbers in the range `[0, 1)`, of the given shape and dtype.
5741 dummy_rng_step: a callable of signature `() -> None`, to make a
5742 dummy RNG call in the fast path. In the fast path where rate is
5743 0, we don't need to generate random numbers, but some samplers
5744 still require you to make an RNG call, to make sure that RNG
5745 states won't depend on whether the fast path is taken.
5746 name: same as the namesake in `dropout_v2`.
5747 default_name: a default name in case `name` is `None`.
5749 Returns:
5750 A Tensor of the same shape and dtype of `x`.
5751 """
5752 with ops.name_scope(name, default_name, [x]) as name:
5753 is_rate_number = isinstance(rate, numbers.Real)
5754 if is_rate_number and (rate < 0 or rate >= 1):
5755 raise ValueError("`rate` must be a scalar tensor or a float in the "
5756 f"range [0, 1). Received: rate={rate}")
5757 x = ops.convert_to_tensor(x, name="x")
5758 x_dtype = x.dtype
5759 if not x_dtype.is_floating:
5760 raise ValueError(
5761 "`x.dtype` must be a floating point tensor as `x` will be "
5762 f"scaled. Received: x_dtype={x_dtype}")
5763 if is_rate_number and rate == 0:
5764 # Fast-path: Return the input immediately if rate is non-tensor & is `0`.
5765 # We trigger this after all error checking
5766 # and after `x` has been converted to a tensor, to prevent inconsistent
5767 # tensor conversions/error raising if rate is changed to/from 0.
5768 #
5769 # We also explicitly call `dummy_rng_step` to make sure
5770 # we don't change the random number generation behavior of
5771 # stateful random ops by entering a fastpath,
5772 # despite not generating a random tensor in the fastpath
5773 dummy_rng_step()
5774 return x
5776 is_executing_eagerly = context.executing_eagerly()
5777 if not tensor_util.is_tf_type(rate):
5778 if is_rate_number:
5779 keep_prob = 1 - rate
5780 scale = 1 / keep_prob
5781 scale = ops.convert_to_tensor(scale, dtype=x_dtype)
5782 ret = gen_math_ops.mul(x, scale)
5783 else:
5784 raise ValueError(
5785 f"`rate` must be a scalar or scalar tensor. Received: rate={rate}")
5786 else:
5787 rate.get_shape().assert_has_rank(0)
5788 rate_dtype = rate.dtype
5789 if rate_dtype != x_dtype:
5790 if not rate_dtype.is_compatible_with(x_dtype):
5791 raise ValueError(
5792 "`x.dtype` must be compatible with `rate.dtype`. "
5793 f"Received: x.dtype={x_dtype} and rate.dtype={rate_dtype}")
5794 rate = gen_math_ops.cast(rate, x_dtype, name="rate")
5795 one_tensor = constant_op.constant(1, dtype=x_dtype)
5796 ret = gen_math_ops.real_div(x, gen_math_ops.sub(one_tensor, rate))
5798 noise_shape = _get_noise_shape(x, noise_shape)
5799 # Sample a uniform distribution on [0.0, 1.0) and select values larger
5800 # than or equal to `rate`.
5801 random_tensor = uniform_sampler(shape=noise_shape, dtype=x_dtype)
5802 keep_mask = random_tensor >= rate
5803 zero_tensor = constant_op.constant(0, dtype=x_dtype)
5804 ret = array_ops.where_v2(keep_mask, ret, zero_tensor)
5805 if not is_executing_eagerly:
5806 ret.set_shape(x.get_shape())
5807 return ret
5810@tf_export("math.top_k", "nn.top_k")
5811@dispatch.add_dispatch_support
5812def top_k(input, k=1, sorted=True, index_type=dtypes.int32, name=None): # pylint: disable=redefined-builtin
5813 """Finds values and indices of the `k` largest entries for the last dimension.
5815 If the input is a vector (rank=1), finds the `k` largest entries in the vector
5816 and outputs their values and indices as vectors. Thus `values[j]` is the
5817 `j`-th largest entry in `input`, and its index is `indices[j]`.
5819 >>> result = tf.math.top_k([1, 2, 98, 1, 1, 99, 3, 1, 3, 96, 4, 1],
5820 ... k=3)
5821 >>> result.values.numpy()
5822 array([99, 98, 96], dtype=int32)
5823 >>> result.indices.numpy()
5824 array([5, 2, 9], dtype=int32)
5826 For matrices (resp. higher rank input), computes the top `k` entries in each
5827 row (resp. vector along the last dimension). Thus,
5829 >>> input = tf.random.normal(shape=(3,4,5,6))
5830 >>> k = 2
5831 >>> values, indices = tf.math.top_k(input, k=k)
5832 >>> values.shape.as_list()
5833 [3, 4, 5, 2]
5834 >>>
5835 >>> values.shape == indices.shape == input.shape[:-1] + [k]
5836 True
5838 The indices can be used to `gather` from a tensor who's shape matches `input`.
5840 >>> gathered_values = tf.gather(input, indices, batch_dims=-1)
5841 >>> assert tf.reduce_all(gathered_values == values)
5843 If two elements are equal, the lower-index element appears first.
5845 >>> result = tf.math.top_k([1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0],
5846 ... k=3)
5847 >>> result.indices.numpy()
5848 array([0, 1, 3], dtype=int32)
5850 By default, indices are returned as type `int32`, however, this can be changed
5851 by specifying the `index_type`.
5853 >>> result = tf.math.top_k([1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0],
5854 ... k=3, index_type=tf.int16)
5855 >>> result.indices.numpy()
5856 array([0, 1, 3], dtype=int16)
5858 Args:
5859 input: 1-D or higher `Tensor` with last dimension at least `k`.
5860 k: 0-D `Tensor` of type `int16`, `int32` or `int64`. Number of top element
5861 to look for along the last dimension (along each row for matrices).
5862 sorted: If true the resulting `k` elements will be sorted by the values in
5863 descending order.
5864 index_type: Optional dtype for output indices.
5865 name: Optional name for the operation.
5867 Returns:
5868 A tuple with two named fields:
5869 values: The `k` largest elements along each last dimensional slice.
5870 indices: The indices of `values` within the last dimension of `input`.
5871 """
5872 return gen_nn_ops.top_kv2(
5873 input, k=k, sorted=sorted, index_type=index_type, name=name
5874 )
5877@tf_export("math.approx_max_k", "nn.approx_max_k")
5878@dispatch.add_dispatch_support
5879def approx_max_k(operand,
5880 k,
5881 reduction_dimension=-1,
5882 recall_target=0.95,
5883 reduction_input_size_override=-1,
5884 aggregate_to_topk=True,
5885 name=None):
5886 """Returns max `k` values and their indices of the input `operand` in an approximate manner.
5888 See https://arxiv.org/abs/2206.14286 for the algorithm details. This op is
5889 only optimized on TPU currently.
5891 Args:
5892 operand : Array to search for max-k. Must be a floating number type.
5893 k : Specifies the number of max-k.
5894 reduction_dimension : Integer dimension along which to search. Default: -1.
5895 recall_target : Recall target for the approximation.
5896 reduction_input_size_override : When set to a positive value, it overrides
5897 the size determined by `operand[reduction_dim]` for evaluating the recall.
5898 This option is useful when the given `operand` is only a subset of the
5899 overall computation in SPMD or distributed pipelines, where the true input
5900 size cannot be deferred by the `operand` shape.
5901 aggregate_to_topk : When true, aggregates approximate results to top-k. When
5902 false, returns the approximate results. The number of the approximate
5903 results is implementation defined and is greater equals to the specified
5904 `k`.
5905 name: Optional name for the operation.
5907 Returns:
5908 Tuple of two arrays. The arrays are the max `k` values and the
5909 corresponding indices along the `reduction_dimension` of the input
5910 `operand`. The arrays' dimensions are the same as the input `operand`
5911 except for the `reduction_dimension`: when `aggregate_to_topk` is true,
5912 the reduction dimension is `k`; otherwise, it is greater equals to `k`
5913 where the size is implementation-defined.
5915 We encourage users to wrap `approx_max_k` with jit. See the following
5916 example for maximal inner production search (MIPS):
5918 >>> import tensorflow as tf
5919 >>> @tf.function(jit_compile=True)
5920 ... def mips(qy, db, k=10, recall_target=0.95):
5921 ... dists = tf.einsum('ik,jk->ij', qy, db)
5922 ... # returns (f32[qy_size, k], i32[qy_size, k])
5923 ... return tf.nn.approx_max_k(dists, k=k, recall_target=recall_target)
5924 >>>
5925 >>> qy = tf.random.uniform((256,128))
5926 >>> db = tf.random.uniform((2048,128))
5927 >>> dot_products, neighbors = mips(qy, db, k=20)
5928 """
5929 return gen_nn_ops.approx_top_k(
5930 operand,
5931 k=k,
5932 reduction_dimension=reduction_dimension,
5933 recall_target=recall_target,
5934 is_max_k=True,
5935 reduction_input_size_override=reduction_input_size_override,
5936 aggregate_to_topk=aggregate_to_topk,
5937 name=name)
5940@tf_export("math.approx_min_k", "nn.approx_min_k")
5941@dispatch.add_dispatch_support
5942def approx_min_k(operand,
5943 k,
5944 reduction_dimension=-1,
5945 recall_target=0.95,
5946 reduction_input_size_override=-1,
5947 aggregate_to_topk=True,
5948 name=None):
5949 """Returns min `k` values and their indices of the input `operand` in an approximate manner.
5951 See https://arxiv.org/abs/2206.14286 for the algorithm details. This op is
5952 only optimized on TPU currently.
5954 Args:
5955 operand : Array to search for min-k. Must be a floating number type.
5956 k : Specifies the number of min-k.
5957 reduction_dimension: Integer dimension along which to search. Default: -1.
5958 recall_target: Recall target for the approximation.
5959 reduction_input_size_override : When set to a positive value, it overrides
5960 the size determined by `operand[reduction_dim]` for evaluating the recall.
5961 This option is useful when the given `operand` is only a subset of the
5962 overall computation in SPMD or distributed pipelines, where the true input
5963 size cannot be deferred by the `operand` shape.
5964 aggregate_to_topk: When true, aggregates approximate results to top-k. When
5965 false, returns the approximate results. The number of the approximate
5966 results is implementation defined and is greater equals to the specified
5967 `k`.
5968 name: Optional name for the operation.
5970 Returns:
5971 Tuple of two arrays. The arrays are the least `k` values and the
5972 corresponding indices along the `reduction_dimension` of the input
5973 `operand`. The arrays' dimensions are the same as the input `operand`
5974 except for the `reduction_dimension`: when `aggregate_to_topk` is true,
5975 the reduction dimension is `k`; otherwise, it is greater equals to `k`
5976 where the size is implementation-defined.
5978 We encourage users to wrap `approx_min_k` with jit. See the following example
5979 for nearest neighbor search over the squared l2 distance:
5981 >>> import tensorflow as tf
5982 >>> @tf.function(jit_compile=True)
5983 ... def l2_ann(qy, db, half_db_norms, k=10, recall_target=0.95):
5984 ... dists = half_db_norms - tf.einsum('ik,jk->ij', qy, db)
5985 ... return tf.nn.approx_min_k(dists, k=k, recall_target=recall_target)
5986 >>>
5987 >>> qy = tf.random.uniform((256,128))
5988 >>> db = tf.random.uniform((2048,128))
5989 >>> half_db_norms = tf.norm(db, axis=1) / 2
5990 >>> dists, neighbors = l2_ann(qy, db, half_db_norms)
5992 In the example above, we compute `db_norms/2 - dot(qy, db^T)` instead of
5993 `qy^2 - 2 dot(qy, db^T) + db^2` for performance reason. The former uses less
5994 arithmetics and produces the same set of neighbors.
5995 """
5996 return gen_nn_ops.approx_top_k(
5997 operand,
5998 k=k,
5999 reduction_dimension=reduction_dimension,
6000 recall_target=recall_target,
6001 is_max_k=False,
6002 reduction_input_size_override=reduction_input_size_override,
6003 aggregate_to_topk=aggregate_to_topk,
6004 name=name)
6007def nth_element(input, n, reverse=False, name=None): # pylint: disable=redefined-builtin
6008 r"""Finds values of the `n`-th smallest value for the last dimension.
6010 Note that n is zero-indexed.
6012 If the input is a vector (rank-1), finds the entries which is the nth-smallest
6013 value in the vector and outputs their values as scalar tensor.
6015 For matrices (resp. higher rank input), computes the entries which is the
6016 nth-smallest value in each row (resp. vector along the last dimension). Thus,
6018 values.shape = input.shape[:-1]
6020 Args:
6021 input: 1-D or higher `Tensor` with last dimension at least `n+1`.
6022 n: A `Tensor` of type `int32`.
6023 0-D. Position of sorted vector to select along the last dimension (along
6024 each row for matrices). Valid range of n is `[0, input.shape[:-1])`
6025 reverse: An optional `bool`. Defaults to `False`.
6026 When set to True, find the nth-largest value in the vector and vice
6027 versa.
6028 name: A name for the operation (optional).
6030 Returns:
6031 A `Tensor`. Has the same type as `input`.
6032 The `n`-th order statistic along each last dimensional slice.
6033 """
6034 return gen_nn_ops.nth_element(input, n, reverse=reverse, name=name)
6037@tf_export(v1=["nn.fractional_max_pool"])
6038@dispatch.add_dispatch_support
6039@deprecation.deprecated(date=None, instructions="`seed2` and `deterministic` "
6040 "args are deprecated. Use fractional_max_pool_v2.")
6041def fractional_max_pool(value,
6042 pooling_ratio,
6043 pseudo_random=False,
6044 overlapping=False,
6045 deterministic=False,
6046 seed=0,
6047 seed2=0,
6048 name=None): # pylint: disable=redefined-builtin
6049 r"""Performs fractional max pooling on the input.
6051 This is a deprecated version of `fractional_max_pool`.
6053 Fractional max pooling is slightly different than regular max pooling. In
6054 regular max pooling, you downsize an input set by taking the maximum value of
6055 smaller N x N subsections of the set (often 2x2), and try to reduce the set by
6056 a factor of N, where N is an integer. Fractional max pooling, as you might
6057 expect from the word "fractional", means that the overall reduction ratio N
6058 does not have to be an integer.
6060 The sizes of the pooling regions are generated randomly but are fairly
6061 uniform. For example, let's look at the height dimension, and the constraints
6062 on the list of rows that will be pool boundaries.
6064 First we define the following:
6066 1. input_row_length : the number of rows from the input set
6067 2. output_row_length : which will be smaller than the input
6068 3. alpha = input_row_length / output_row_length : our reduction ratio
6069 4. K = floor(alpha)
6070 5. row_pooling_sequence : this is the result list of pool boundary rows
6072 Then, row_pooling_sequence should satisfy:
6074 1. a[0] = 0 : the first value of the sequence is 0
6075 2. a[end] = input_row_length : the last value of the sequence is the size
6076 3. K <= (a[i+1] - a[i]) <= K+1 : all intervals are K or K+1 size
6077 4. length(row_pooling_sequence) = output_row_length+1
6079 Args:
6080 value: A `Tensor`. 4-D with shape `[batch, height, width, channels]`.
6081 pooling_ratio: A list of `floats` that has length >= 4. Pooling ratio for
6082 each dimension of `value`, currently only supports row and col dimension
6083 and should be >= 1.0. For example, a valid pooling ratio looks like [1.0,
6084 1.44, 1.73, 1.0]. The first and last elements must be 1.0 because we don't
6085 allow pooling on batch and channels dimensions. 1.44 and 1.73 are pooling
6086 ratio on height and width dimensions respectively.
6087 pseudo_random: An optional `bool`. Defaults to `False`. When set to `True`,
6088 generates the pooling sequence in a pseudorandom fashion, otherwise, in a
6089 random fashion. Check (Graham, 2015) for difference between
6090 pseudorandom and random.
6091 overlapping: An optional `bool`. Defaults to `False`. When set to `True`,
6092 it means when pooling, the values at the boundary of adjacent pooling
6093 cells are used by both cells. For example:
6094 `index 0 1 2 3 4`
6095 `value 20 5 16 3 7`
6096 If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used
6097 twice. The result would be [20, 16] for fractional max pooling.
6098 deterministic: An optional `bool`. Deprecated; use `fractional_max_pool_v2`
6099 instead.
6100 seed: An optional `int`. Defaults to `0`. If set to be non-zero, the
6101 random number generator is seeded by the given seed. Otherwise it is
6102 seeded by a random seed.
6103 seed2: An optional `int`. Deprecated; use `fractional_max_pool_v2` instead.
6104 name: A name for the operation (optional).
6106 Returns:
6107 A tuple of `Tensor` objects (`output`, `row_pooling_sequence`,
6108 `col_pooling_sequence`).
6109 output: Output `Tensor` after fractional max pooling. Has the same type as
6110 `value`.
6111 row_pooling_sequence: A `Tensor` of type `int64`.
6112 col_pooling_sequence: A `Tensor` of type `int64`.
6114 Raises:
6115 ValueError: If op determinism is enabled and either the seeds are not set or
6116 the "deterministic" argument is False.
6118 References:
6119 Fractional Max-Pooling:
6120 [Graham, 2015](https://arxiv.org/abs/1412.6071)
6121 ([pdf](https://arxiv.org/pdf/1412.6071.pdf))
6122 """
6123 if config.is_op_determinism_enabled() and (not seed or not seed2 or
6124 not deterministic):
6125 raise ValueError(
6126 f'tf.compat.v1.nn.fractional_max_pool requires "seed" and '
6127 f'"seed2" to be non-zero and "deterministic" to be true when op '
6128 f"determinism is enabled. Please pass in such values, e.g. by passing"
6129 f'"seed=1, seed2=1, deterministic=True". Got: seed={seed}, '
6130 f'seed2={seed2}, deterministic={deterministic}')
6131 return gen_nn_ops.fractional_max_pool(value, pooling_ratio, pseudo_random,
6132 overlapping, deterministic, seed, seed2,
6133 name)
6136@tf_export("nn.fractional_max_pool", v1=[])
6137@dispatch.add_dispatch_support
6138def fractional_max_pool_v2(value,
6139 pooling_ratio,
6140 pseudo_random=False,
6141 overlapping=False,
6142 seed=0,
6143 name=None): # pylint: disable=redefined-builtin
6144 r"""Performs fractional max pooling on the input.
6146 Fractional max pooling is slightly different than regular max pooling. In
6147 regular max pooling, you downsize an input set by taking the maximum value of
6148 smaller N x N subsections of the set (often 2x2), and try to reduce the set by
6149 a factor of N, where N is an integer. Fractional max pooling, as you might
6150 expect from the word "fractional", means that the overall reduction ratio N
6151 does not have to be an integer.
6153 The sizes of the pooling regions are generated randomly but are fairly
6154 uniform. For example, let's look at the height dimension, and the constraints
6155 on the list of rows that will be pool boundaries.
6157 First we define the following:
6159 1. input_row_length : the number of rows from the input set
6160 2. output_row_length : which will be smaller than the input
6161 3. alpha = input_row_length / output_row_length : our reduction ratio
6162 4. K = floor(alpha)
6163 5. row_pooling_sequence : this is the result list of pool boundary rows
6165 Then, row_pooling_sequence should satisfy:
6167 1. a[0] = 0 : the first value of the sequence is 0
6168 2. a[end] = input_row_length : the last value of the sequence is the size
6169 3. K <= (a[i+1] - a[i]) <= K+1 : all intervals are K or K+1 size
6170 4. length(row_pooling_sequence) = output_row_length+1
6172 Args:
6173 value: A `Tensor`. 4-D with shape `[batch, height, width, channels]`.
6174 pooling_ratio: An int or list of `ints` that has length `1`, `2` or `4`.
6175 Pooling ratio for each dimension of `value`, currently only supports row
6176 and col dimension and should be >= 1.0. For example, a valid pooling ratio
6177 looks like [1.0, 1.44, 1.73, 1.0]. The first and last elements must be 1.0
6178 because we don't allow pooling on batch and channels dimensions. 1.44 and
6179 1.73 are pooling ratio on height and width dimensions respectively.
6180 pseudo_random: An optional `bool`. Defaults to `False`. When set to `True`,
6181 generates the pooling sequence in a pseudorandom fashion, otherwise, in a
6182 random fashion. Check paper (Graham, 2015) for difference between
6183 pseudorandom and random.
6184 overlapping: An optional `bool`. Defaults to `False`. When set to `True`,
6185 it means when pooling, the values at the boundary of adjacent pooling
6186 cells are used by both cells. For example:
6187 `index 0 1 2 3 4`
6188 `value 20 5 16 3 7`
6189 If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used
6190 twice. The result would be [20, 16] for fractional max pooling.
6191 seed: An optional `int`. Defaults to `0`. If set to be non-zero, the
6192 random number generator is seeded by the given seed. Otherwise it is
6193 seeded by a random seed.
6194 name: A name for the operation (optional).
6196 Returns:
6197 A tuple of `Tensor` objects (`output`, `row_pooling_sequence`,
6198 `col_pooling_sequence`).
6199 output: Output `Tensor` after fractional max pooling. Has the same type as
6200 `value`.
6201 row_pooling_sequence: A `Tensor` of type `int64`.
6202 col_pooling_sequence: A `Tensor` of type `int64`.
6204 Raises:
6205 ValueError: If no seed is specified and op determinism is enabled.
6207 References:
6208 Fractional Max-Pooling:
6209 [Graham, 2015](https://arxiv.org/abs/1412.6071)
6210 ([pdf](https://arxiv.org/pdf/1412.6071.pdf))
6211 """
6212 if (isinstance(pooling_ratio, (list, tuple))):
6213 if (pooling_ratio[0] != 1.0 or pooling_ratio[-1] != 1.0):
6214 raise ValueError(
6215 "`pooling_ratio` should have first and last elements with value 1.0. "
6216 f"Received: pooling_ratio={pooling_ratio}")
6217 for element in pooling_ratio:
6218 if element < 1.0:
6219 raise ValueError(
6220 f"`pooling_ratio` elements should be >= 1.0. "
6221 f"Received: pooling_ratio={pooling_ratio}")
6222 elif (isinstance(pooling_ratio, (int, float))):
6223 if pooling_ratio < 1.0:
6224 raise ValueError(
6225 "`pooling_ratio` should be >= 1.0. "
6226 f"Received: pooling_ratio={pooling_ratio}")
6227 else:
6228 raise ValueError(
6229 "`pooling_ratio` should be an int or a list of ints. "
6230 f"Received: pooling_ratio={pooling_ratio}")
6232 pooling_ratio = _get_sequence(pooling_ratio, 2, 3, "pooling_ratio")
6234 if seed == 0:
6235 if config.is_op_determinism_enabled():
6236 raise ValueError(
6237 f"tf.nn.fractional_max_pool requires a non-zero seed to be passed in "
6238 f"when determinism is enabled, but got seed={seed}. Please pass in a "
6239 f'non-zero seed, e.g. by passing "seed=1".')
6240 return gen_nn_ops.fractional_max_pool(value, pooling_ratio, pseudo_random,
6241 overlapping, deterministic=False,
6242 seed=0, seed2=0, name=name)
6243 else:
6244 seed1, seed2 = random_seed.get_seed(seed)
6245 return gen_nn_ops.fractional_max_pool(value, pooling_ratio, pseudo_random,
6246 overlapping, deterministic=True,
6247 seed=seed1, seed2=seed2, name=name)
6250@tf_export(v1=["nn.fractional_avg_pool"])
6251@dispatch.add_dispatch_support
6252@deprecation.deprecated(date=None, instructions="`seed2` and `deterministic` "
6253 "args are deprecated. Use fractional_avg_pool_v2.")
6254def fractional_avg_pool(value,
6255 pooling_ratio,
6256 pseudo_random=False,
6257 overlapping=False,
6258 deterministic=False,
6259 seed=0,
6260 seed2=0,
6261 name=None): # pylint: disable=redefined-builtin
6262 r"""Performs fractional average pooling on the input.
6264 This is a deprecated version of `fractional_avg_pool`.
6266 Fractional average pooling is similar to Fractional max pooling in the pooling
6267 region generation step. The only difference is that after pooling regions are
6268 generated, a mean operation is performed instead of a max operation in each
6269 pooling region.
6271 Args:
6272 value: A `Tensor`. 4-D with shape `[batch, height, width, channels]`.
6273 pooling_ratio: A list of `floats` that has length >= 4. Pooling ratio for
6274 each dimension of `value`, currently only supports row and col dimension
6275 and should be >= 1.0. For example, a valid pooling ratio looks like [1.0,
6276 1.44, 1.73, 1.0]. The first and last elements must be 1.0 because we don't
6277 allow pooling on batch and channels dimensions. 1.44 and 1.73 are pooling
6278 ratio on height and width dimensions respectively.
6279 pseudo_random: An optional `bool`. Defaults to `False`. When set to `True`,
6280 generates the pooling sequence in a pseudorandom fashion, otherwise, in a
6281 random fashion. Check paper (Graham, 2015) for difference between
6282 pseudorandom and random.
6283 overlapping: An optional `bool`. Defaults to `False`. When set to `True`,
6284 it means when pooling, the values at the boundary of adjacent pooling
6285 cells are used by both cells. For example:
6286 `index 0 1 2 3 4`
6287 `value 20 5 16 3 7`
6288 If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used
6289 twice. The result would be [20, 16] for fractional avg pooling.
6290 deterministic: An optional `bool`. Deprecated; use `fractional_avg_pool_v2`
6291 instead.
6292 seed: An optional `int`. Defaults to `0`. If set to be non-zero, the
6293 random number generator is seeded by the given seed. Otherwise it is
6294 seeded by a random seed.
6295 seed2: An optional `int`. Deprecated; use `fractional_avg_pool_v2` instead.
6296 name: A name for the operation (optional).
6298 Returns:
6299 A tuple of `Tensor` objects (`output`, `row_pooling_sequence`,
6300 `col_pooling_sequence`).
6301 output: Output `Tensor` after fractional avg pooling. Has the same type as
6302 `value`.
6303 row_pooling_sequence: A `Tensor` of type `int64`.
6304 col_pooling_sequence: A `Tensor` of type `int64`.
6306 References:
6307 Fractional Max-Pooling:
6308 [Graham, 2015](https://arxiv.org/abs/1412.6071)
6309 ([pdf](https://arxiv.org/pdf/1412.6071.pdf))
6310 """
6311 return gen_nn_ops.fractional_avg_pool(value, pooling_ratio, pseudo_random,
6312 overlapping, deterministic, seed, seed2,
6313 name=name)
6316@tf_export("nn.fractional_avg_pool", v1=[])
6317@dispatch.add_dispatch_support
6318def fractional_avg_pool_v2(value,
6319 pooling_ratio,
6320 pseudo_random=False,
6321 overlapping=False,
6322 seed=0,
6323 name=None): # pylint: disable=redefined-builtin
6324 r"""Performs fractional average pooling on the input.
6326 Fractional average pooling is similar to Fractional max pooling in the pooling
6327 region generation step. The only difference is that after pooling regions are
6328 generated, a mean operation is performed instead of a max operation in each
6329 pooling region.
6331 Args:
6332 value: A `Tensor`. 4-D with shape `[batch, height, width, channels]`.
6333 pooling_ratio: A list of `floats` that has length >= 4. Pooling ratio for
6334 each dimension of `value`, currently only supports row and col dimension
6335 and should be >= 1.0. For example, a valid pooling ratio looks like [1.0,
6336 1.44, 1.73, 1.0]. The first and last elements must be 1.0 because we don't
6337 allow pooling on batch and channels dimensions. 1.44 and 1.73 are pooling
6338 ratio on height and width dimensions respectively.
6339 pseudo_random: An optional `bool`. Defaults to `False`. When set to `True`,
6340 generates the pooling sequence in a pseudorandom fashion, otherwise, in a
6341 random fashion. Check paper (Graham, 2015) for difference between
6342 pseudorandom and random.
6343 overlapping: An optional `bool`. Defaults to `False`. When set to `True`,
6344 it means when pooling, the values at the boundary of adjacent pooling
6345 cells are used by both cells. For example:
6346 `index 0 1 2 3 4`
6347 `value 20 5 16 3 7`
6348 If the pooling sequence is [0, 2, 4], then 16, at index 2 will be used
6349 twice. The result would be [20, 16] for fractional avg pooling.
6350 seed: An optional `int`. Defaults to `0`. If set to be non-zero, the
6351 random number generator is seeded by the given seed. Otherwise it is
6352 seeded by a random seed.
6353 name: A name for the operation (optional).
6355 Returns:
6356 A tuple of `Tensor` objects (`output`, `row_pooling_sequence`,
6357 `col_pooling_sequence`).
6358 output: Output `Tensor` after fractional avg pooling. Has the same type as
6359 `value`.
6360 row_pooling_sequence: A `Tensor` of type `int64`.
6361 col_pooling_sequence: A `Tensor` of type `int64`.
6363 References:
6364 Fractional Max-Pooling:
6365 [Graham, 2015](https://arxiv.org/abs/1412.6071)
6366 ([pdf](https://arxiv.org/pdf/1412.6071.pdf))
6367 """
6368 if seed == 0:
6369 return gen_nn_ops.fractional_avg_pool(value, pooling_ratio, pseudo_random,
6370 overlapping, deterministic=False,
6371 seed=0, seed2=0, name=name)
6372 else:
6373 seed1, seed2 = random_seed.get_seed(seed)
6374 return gen_nn_ops.fractional_avg_pool(value, pooling_ratio, pseudo_random,
6375 overlapping, deterministic=True,
6376 seed=seed1, seed2=seed2, name=name)
6379@ops.RegisterStatistics("Dilation2D", "flops")
6380def _calc_dilation2d_flops(graph, node):
6381 """Calculates the compute resources needed for Dilation2D."""
6382 input_shape = graph_util.tensor_shape_from_node_def_name(graph, node.input[0])
6383 input_shape.assert_is_fully_defined()
6384 filter_shape = graph_util.tensor_shape_from_node_def_name(
6385 graph, node.input[1])
6386 filter_shape.assert_is_fully_defined()
6387 output_shape = graph_util.tensor_shape_from_node_def_name(graph, node.name)
6388 output_shape.assert_is_fully_defined()
6389 filter_height = int(filter_shape[0])
6390 filter_width = int(filter_shape[1])
6391 output_count = np.prod(output_shape.as_list(), dtype=np.int64)
6392 return ops.OpStats("flops", (output_count * filter_height * filter_width * 2))
6395@tf_export(v1=["nn.erosion2d"])
6396@dispatch.add_dispatch_support
6397def erosion2d(value, kernel, strides, rates, padding, name=None):
6398 """Computes the grayscale erosion of 4-D `value` and 3-D `kernel` tensors.
6400 The `value` tensor has shape `[batch, in_height, in_width, depth]` and the
6401 `kernel` tensor has shape `[kernel_height, kernel_width, depth]`, i.e.,
6402 each input channel is processed independently of the others with its own
6403 structuring function. The `output` tensor has shape
6404 `[batch, out_height, out_width, depth]`. The spatial dimensions of the
6405 output tensor depend on the `padding` algorithm. We currently only support the
6406 default "NHWC" `data_format`.
6408 In detail, the grayscale morphological 2-D erosion is given by:
6410 output[b, y, x, c] =
6411 min_{dy, dx} value[b,
6412 strides[1] * y - rates[1] * dy,
6413 strides[2] * x - rates[2] * dx,
6414 c] -
6415 kernel[dy, dx, c]
6417 Duality: The erosion of `value` by the `kernel` is equal to the negation of
6418 the dilation of `-value` by the reflected `kernel`.
6420 Args:
6421 value: A `Tensor`. 4-D with shape `[batch, in_height, in_width, depth]`.
6422 kernel: A `Tensor`. Must have the same type as `value`.
6423 3-D with shape `[kernel_height, kernel_width, depth]`.
6424 strides: A list of `ints` that has length `>= 4`.
6425 1-D of length 4. The stride of the sliding window for each dimension of
6426 the input tensor. Must be: `[1, stride_height, stride_width, 1]`.
6427 rates: A list of `ints` that has length `>= 4`.
6428 1-D of length 4. The input stride for atrous morphological dilation.
6429 Must be: `[1, rate_height, rate_width, 1]`.
6430 padding: A `string` from: `"SAME", "VALID"`.
6431 The type of padding algorithm to use.
6432 name: A name for the operation (optional). If not specified "erosion2d"
6433 is used.
6435 Returns:
6436 A `Tensor`. Has the same type as `value`.
6437 4-D with shape `[batch, out_height, out_width, depth]`.
6438 Raises:
6439 ValueError: If the `value` depth does not match `kernel`' shape, or if
6440 padding is other than `'VALID'` or `'SAME'`.
6441 """
6442 with ops.name_scope(name, "erosion2d", [value, kernel]) as name:
6443 # Reduce erosion to dilation by duality.
6444 return math_ops.negative(
6445 gen_nn_ops.dilation2d(
6446 input=math_ops.negative(value),
6447 filter=array_ops.reverse_v2(kernel, [0, 1]),
6448 strides=strides,
6449 rates=rates,
6450 padding=padding,
6451 name=name))
6454@tf_export("nn.erosion2d", v1=[])
6455@dispatch.add_dispatch_support
6456def erosion2d_v2(value,
6457 filters,
6458 strides,
6459 padding,
6460 data_format,
6461 dilations,
6462 name=None):
6463 """Computes the grayscale erosion of 4-D `value` and 3-D `filters` tensors.
6465 The `value` tensor has shape `[batch, in_height, in_width, depth]` and the
6466 `filters` tensor has shape `[filters_height, filters_width, depth]`, i.e.,
6467 each input channel is processed independently of the others with its own
6468 structuring function. The `output` tensor has shape
6469 `[batch, out_height, out_width, depth]`. The spatial dimensions of the
6470 output tensor depend on the `padding` algorithm. We currently only support the
6471 default "NHWC" `data_format`.
6473 In detail, the grayscale morphological 2-D erosion is given by:
6475 output[b, y, x, c] =
6476 min_{dy, dx} value[b,
6477 strides[1] * y - dilations[1] * dy,
6478 strides[2] * x - dilations[2] * dx,
6479 c] -
6480 filters[dy, dx, c]
6482 Duality: The erosion of `value` by the `filters` is equal to the negation of
6483 the dilation of `-value` by the reflected `filters`.
6485 Args:
6486 value: A `Tensor`. 4-D with shape `[batch, in_height, in_width, depth]`.
6487 filters: A `Tensor`. Must have the same type as `value`.
6488 3-D with shape `[filters_height, filters_width, depth]`.
6489 strides: A list of `ints` that has length `>= 4`.
6490 1-D of length 4. The stride of the sliding window for each dimension of
6491 the input tensor. Must be: `[1, stride_height, stride_width, 1]`.
6492 padding: A `string` from: `"SAME", "VALID"`.
6493 The type of padding algorithm to use. See
6494 [here](https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2)
6495 for more information.
6496 data_format: A `string`, only `"NHWC"` is currently supported.
6497 dilations: A list of `ints` that has length `>= 4`.
6498 1-D of length 4. The input stride for atrous morphological dilation.
6499 Must be: `[1, rate_height, rate_width, 1]`.
6500 name: A name for the operation (optional). If not specified "erosion2d"
6501 is used.
6503 Returns:
6504 A `Tensor`. Has the same type as `value`.
6505 4-D with shape `[batch, out_height, out_width, depth]`.
6507 Raises:
6508 ValueError: If the `value` depth does not match `filters`' shape, or if
6509 padding is other than `'VALID'` or `'SAME'`.
6510 """
6511 if data_format != "NHWC":
6512 raise ValueError("`data_format` values other than 'NHWC' are not "
6513 f"supported. Received: data_format={data_format}")
6515 with ops.name_scope(name, "erosion2d", [value, filters]) as name:
6516 # Reduce erosion to dilation by duality.
6517 return math_ops.negative(
6518 gen_nn_ops.dilation2d(
6519 input=math_ops.negative(value),
6520 filter=array_ops.reverse_v2(filters, [0, 1]),
6521 strides=strides,
6522 rates=dilations,
6523 padding=padding,
6524 name=name))
6527@tf_export(v1=["math.in_top_k", "nn.in_top_k"])
6528@dispatch.add_dispatch_support
6529def in_top_k(predictions, targets, k, name=None):
6530 r"""Says whether the targets are in the top `K` predictions.
6532 This outputs a `batch_size` bool array, an entry `out[i]` is `true` if the
6533 prediction for the target class is finite (not inf, -inf, or nan) and among
6534 the top `k` predictions among all predictions for example `i`. Note that the
6535 behavior of `InTopK` differs from the `TopK` op in its handling of ties; if
6536 multiple classes have the same prediction value and straddle the top-`k`
6537 boundary, all of those classes are considered to be in the top `k`.
6539 More formally, let
6541 \\(predictions_i\\) be the predictions for all classes for example `i`,
6542 \\(targets_i\\) be the target class for example `i`,
6543 \\(out_i\\) be the output for example `i`,
6545 $$out_i = predictions_{i, targets_i} \in TopKIncludingTies(predictions_i)$$
6547 Args:
6548 predictions: A `Tensor` of type `float32`.
6549 A `batch_size` x `classes` tensor.
6550 targets: A `Tensor`. Must be one of the following types: `int32`, `int64`.
6551 A `batch_size` vector of class ids.
6552 k: An `int`. Number of top elements to look at for computing precision.
6553 name: A name for the operation (optional).
6555 Returns:
6556 A `Tensor` of type `bool`. Computed Precision at `k` as a `bool Tensor`.
6557 """
6558 with ops.name_scope(name, "in_top_k"):
6559 return gen_nn_ops.in_top_kv2(predictions, targets, k, name=name)
6562@tf_export("math.in_top_k", "nn.in_top_k", v1=[])
6563@dispatch.add_dispatch_support
6564def in_top_k_v2(targets, predictions, k, name=None):
6565 """Outputs whether the targets are in the top `K` predictions.
6567 This outputs a `batch_size` bool array, an entry `out[i]` is `true` if the
6568 prediction for the target class is finite (not inf, -inf, or nan) and among
6569 the top `k` predictions among all predictions for example `i`.
6570 `predictions` does not have to be normalized.
6572 Note that the behavior of `InTopK` differs from the `TopK` op in its handling
6573 of ties; if multiple classes have the same prediction value and straddle the
6574 top-`k` boundary, all of those classes are considered to be in the top `k`.
6576 >>> target = tf.constant([0, 1, 3])
6577 >>> pred = tf.constant([
6578 ... [1.2, -0.3, 2.8, 5.2],
6579 ... [0.1, 0.0, 0.0, 0.0],
6580 ... [0.0, 0.5, 0.3, 0.3]],
6581 ... dtype=tf.float32)
6582 >>> print(tf.math.in_top_k(target, pred, 2))
6583 tf.Tensor([False True True], shape=(3,), dtype=bool)
6585 Args:
6586 targets: A `batch_size` vector of class ids. Must be `int32` or `int64`.
6587 predictions: A `batch_size` x `classes` tensor of type `float32`.
6588 k: An `int`. The parameter to specify search space.
6589 name: A name for the operation (optional).
6591 Returns:
6592 A `Tensor` with the same shape of `targets` with type of `bool`. Each
6593 element specifies if the target falls into top-k predictions.
6594 """
6595 return in_top_k(predictions, targets, k, name)
6598tf_export(v1=["nn.quantized_avg_pool"])(
6599 dispatch.add_dispatch_support(gen_nn_ops.quantized_avg_pool))
6600tf_export(v1=["nn.quantized_conv2d"])(
6601 dispatch.add_dispatch_support(gen_nn_ops.quantized_conv2d))
6602tf_export(v1=["nn.quantized_relu_x"])(
6603 dispatch.add_dispatch_support(gen_nn_ops.quantized_relu_x))
6604tf_export(v1=["nn.quantized_max_pool"])(
6605 dispatch.add_dispatch_support(gen_nn_ops.quantized_max_pool))
6608@tf_export("nn.isotonic_regression", v1=[])
6609@dispatch.add_dispatch_support
6610def isotonic_regression(inputs, decreasing=True, axis=-1):
6611 r"""Solves isotonic regression problems along the given axis.
6613 For each vector x, the problem solved is
6615 $$\argmin_{y_1 >= y_2 >= ... >= y_n} \sum_i (x_i - y_i)^2.$$
6617 As the solution is component-wise constant, a second tensor is returned that
6618 encodes the segments. The problems are solved over the given axis.
6620 Consider the following example, where we solve a batch of two problems. The
6621 first input is [3, 1, 2], while the second [1, 3, 4] (as the axis is 1).
6622 >>> x = tf.constant([[3, 1, 2], [1, 3, 4]], dtype=tf.float32)
6623 >>> y, segments = tf.nn.isotonic_regression(x, axis=1)
6624 >>> y # The solution.
6625 <tf.Tensor: shape=(2, 3), dtype=float32, numpy=
6626 array([[3. , 1.5 , 1.5 ],
6627 [2.6666667, 2.6666667, 2.6666667]], dtype=float32)>
6629 Note that the first solution has two blocks [2] and [1.5, 1.5]. The second
6630 solution is constant, and thus has a single segment. These segments are
6631 exactly what the second returned tensor encodes:
6633 >>> segments
6634 <tf.Tensor: shape=(2, 3), dtype=int32, numpy=
6635 array([[0, 1, 1],
6636 [0, 0, 0]], dtype=int32)>
6639 Args:
6640 inputs: A tensor holding the inputs.
6641 decreasing: If set to False, the inequalities in the optimizing constrained
6642 are flipped.
6643 axis: The axis along which the problems should be solved.
6645 Returns:
6646 output: The solutions, same shape as type as the input.
6647 segments: An int32 tensor, same shape as the input indicating the segments
6648 that have the same value. Specifically, those positions that have the same
6649 value correspond to the same segment. These values start at zero, and are
6650 monotonously increasing for each solution.
6651 """
6652 type_promotions = {
6653 # Float types get mapped to themselves, int8/16 to float32, rest to double
6654 dtypes.float32:
6655 dtypes.float32,
6656 dtypes.half:
6657 dtypes.half,
6658 dtypes.bfloat16:
6659 dtypes.bfloat16,
6660 dtypes.int8:
6661 dtypes.float32,
6662 dtypes.int16:
6663 dtypes.float32,
6664 }
6665 inputs = ops.convert_to_tensor(inputs)
6666 try:
6667 output_dtype = type_promotions[inputs.dtype]
6668 except KeyError:
6669 output_dtype = dtypes.float64
6671 def compute_on_matrix(matrix, name=None):
6672 iso_fn = functools.partial(
6673 gen_nn_ops.isotonic_regression, output_dtype=output_dtype, name=name)
6674 if decreasing:
6675 return iso_fn(matrix)
6676 else:
6677 output, segments = iso_fn(-matrix)
6678 return -output, segments
6680 return _wrap_2d_function(inputs, compute_on_matrix, axis)
6683# Register elementwise ops that don't have Python wrappers.
6684# Unary elementwise ops.
6685dispatch.register_unary_elementwise_api(gen_nn_ops.elu)
6686dispatch.register_unary_elementwise_api(gen_nn_ops.relu)
6687dispatch.register_unary_elementwise_api(gen_nn_ops.selu)
6688dispatch.register_unary_elementwise_api(gen_nn_ops.softsign)