{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Tce3stUlHN0L" }, "source": [ "##### Copyright 2018 The TensorFlow Authors.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "cellView": "form", "execution": { "iopub.execute_input": "2024-07-19T02:13:38.336961Z", "iopub.status.busy": "2024-07-19T02:13:38.336311Z", "iopub.status.idle": "2024-07-19T02:13:38.340275Z", "shell.execute_reply": "2024-07-19T02:13:38.339640Z" }, "id": "tuOe1ymfHZPu" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "MfBg1C5NB3X0" }, "source": [ "# Use a GPU\n" ] }, { "cell_type": "markdown", "metadata": { "id": "SoYIwe40vEPI" }, "source": [ "TensorFlow code and `tf.keras` models will transparently run on a single GPU with no code changes required.\n", "\n", "Note: Use `tf.config.list_physical_devices('GPU')` to confirm that TensorFlow is using the GPU.\n", "\n", "The simplest way to run on multiple GPUs, on one or many machines, is using [Distribution Strategies](distributed_training.ipynb).\n", "\n", "This guide is for users who have tried these approaches and found that they need fine-grained control of how TensorFlow uses the GPU. To learn how to debug performance issues for single and multi-GPU scenarios, see the [Optimize TensorFlow GPU Performance](gpu_performance_analysis.md) guide." ] }, { "cell_type": "markdown", "metadata": { "id": "MUXex9ctTuDB" }, "source": [ "## Setup\n", "\n", "Ensure you have the latest TensorFlow GPU release installed." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:38.343489Z", "iopub.status.busy": "2024-07-19T02:13:38.343261Z", "iopub.status.idle": "2024-07-19T02:13:41.300192Z", "shell.execute_reply": "2024-07-19T02:13:41.299327Z" }, "id": "IqR2PQG4ZaZ0" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-07-19 02:13:38.598047: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n", "2024-07-19 02:13:38.619458: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n", "2024-07-19 02:13:38.625956: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n" ] }, { "name": "stdout", "output_type": "stream",
"text": [ "Num GPUs Available: 4\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING: All log messages before absl::InitializeLog() is called are written to STDERR\n", "I0000 00:00:1721355221.243317 76763 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355\n" ] } ], "source": [ "import tensorflow as tf\n", "print(\"Num GPUs Available: \", len(tf.config.list_physical_devices('GPU')))" ] }, { "cell_type": "markdown", "metadata": { "id": "ZELutYNetv-v" }, "source": [ "## Overview\n" ] }, { "cell_type": "markdown", "metadata": { "id": "xHxb-dlhMIzW" }, "source": [ "TensorFlow supports running computations on a variety of device types, including CPU and GPU. They are represented with string identifiers, for example:\n", "\n", "* `\"/device:CPU:0\"`: The CPU of your machine.\n", "* `\"/GPU:0\"`: Short-hand notation for the first GPU of your machine that is visible to TensorFlow.\n", "* `\"/job:localhost/replica:0/task:0/device:GPU:1\"`: Fully qualified name of the second GPU of your machine that is visible to TensorFlow.\n", "\n", "If a TensorFlow operation has both CPU and GPU implementations, by default, the GPU device is prioritized when the operation is assigned. For example, `tf.matmul` has both CPU and GPU kernels, and on a system with devices `CPU:0` and `GPU:0`, the `GPU:0` device is selected to run `tf.matmul` unless you explicitly request to run it on another device.\n", "\n", "If a TensorFlow operation has no corresponding GPU implementation, then the operation falls back to the CPU device. For example, since `tf.cast` only has a CPU kernel, on a system with devices `CPU:0` and `GPU:0`, the `CPU:0` device is selected to run `tf.cast`, even if it is requested to run on the `GPU:0` device." ] }, { "cell_type": "markdown", "metadata": { "id": "UhNtHfuxCGVy" }, "source": [ "## Logging device placement\n", "\n", "To find out which devices your operations and tensors are assigned to, put\n", "`tf.debugging.set_log_device_placement(True)` as the first statement of your\n", "program. Enabling device placement logging causes any Tensor allocations or operations to be printed."
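, "\n", "\n", "As an illustrative aside (not part of the original text), you can also check where a single eager tensor has been placed through its `device` attribute:\n", "\n", "```python\n", "import tensorflow as tf\n", "\n", "t = tf.constant([1.0, 2.0, 3.0])\n", "# The fully qualified device name of the tensor, e.g. ending in CPU:0 or GPU:0.\n", "print(t.device)\n", "```"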
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:41.304982Z", "iopub.status.busy": "2024-07-19T02:13:41.304534Z", "iopub.status.idle": "2024-07-19T02:13:43.040739Z", "shell.execute_reply": "2024-07-19T02:13:43.040040Z" }, "id": "2Dbw0tpEirCd" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "I0000 00:00:1721355222.552621 76763 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "tf.Tensor(\n", "[[22. 28.]\n", " [49. 64.]], shape=(2, 2), dtype=float32)\n" ] } ], "source": [ "tf.debugging.set_log_device_placement(True)\n", "\n", "# Create some tensors\n", "a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n", "b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])\n", "c = tf.matmul(a, b)\n", "\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": { "id": "kKhmFeraTdEI" }, "source": [ "The above code will print an indication that the `MatMul` op was executed on `GPU:0`." ] }, { "cell_type": "markdown", "metadata": { "id": "U88FspwGjB7W" }, "source": [ "## Manual device placement\n", "\n", "If you would like a particular operation to run on a device of your choice\n", "instead of what's automatically selected for you, you can use `with tf.device`\n", "to create a device context, and all the operations within that context will\n", "run on the same designated device."
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:43.044383Z", "iopub.status.busy": "2024-07-19T02:13:43.044142Z", "iopub.status.idle": "2024-07-19T02:13:43.050059Z", "shell.execute_reply": "2024-07-19T02:13:43.049466Z" }, "id": "8wqaQfEhjHit" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "tf.Tensor(\n", "[[22. 28.]\n", " [49. 64.]], shape=(2, 2), dtype=float32)\n" ] } ], "source": [ "tf.debugging.set_log_device_placement(True)\n", "\n", "# Place tensors on the CPU\n", "with tf.device('/CPU:0'):\n", "  a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n", "  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])\n", "\n", "# Run on the GPU\n", "c = tf.matmul(a, b)\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": { "id": "8ixO89gRjJUu" }, "source": [ "You will see that now `a` and `b` are assigned to `CPU:0`. Since a device was\n", "not explicitly specified for the `MatMul` operation, the TensorFlow runtime will\n", "choose one based on the operation and available devices (`GPU:0` in this\n", "example) and automatically copy tensors between devices if required." ] }, { "cell_type": "markdown", "metadata": { "id": "ARrRhwqijPzN" }, "source": [ "## Limiting GPU memory growth\n", "\n", "By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to\n", "[`CUDA_VISIBLE_DEVICES`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars)) visible to the process. This is done to use the relatively precious GPU memory resources on the devices more efficiently by reducing memory fragmentation. To limit TensorFlow to a specific set of GPUs, use the `tf.config.set_visible_devices` method."
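, "\n", "\n", "As an alternative sketch (not part of the original text), you can hide GPUs from the process before TensorFlow is first imported by setting the `CUDA_VISIBLE_DEVICES` environment variable from within Python:\n", "\n", "```python\n", "import os\n", "\n", "# Must be set before TensorFlow is imported for the first time.\n", "os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # expose only the first physical GPU\n", "\n", "import tensorflow as tf\n", "print(tf.config.list_physical_devices('GPU'))\n", "```"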
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:43.053008Z", "iopub.status.busy": "2024-07-19T02:13:43.052789Z", "iopub.status.idle": "2024-07-19T02:13:43.057083Z", "shell.execute_reply": "2024-07-19T02:13:43.056527Z" }, "id": "hPI--n_jhZhv" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Visible devices cannot be modified after being initialized\n" ] } ], "source": [ "gpus = tf.config.list_physical_devices('GPU')\n", "if gpus:\n", "  # Restrict TensorFlow to only use the first GPU\n", "  try:\n", "    tf.config.set_visible_devices(gpus[0], 'GPU')\n", "    logical_gpus = tf.config.list_logical_devices('GPU')\n", "    print(len(gpus), \"Physical GPUs,\", len(logical_gpus), \"Logical GPU\")\n", "  except RuntimeError as e:\n", "    # Visible devices must be set before GPUs have been initialized\n", "    print(e)" ] }, { "cell_type": "markdown", "metadata": { "id": "N3x4M55DhYk9" }, "source": [ "In some cases it is desirable for the process to allocate only a subset of the available memory, or to grow the memory usage only as needed by the process. TensorFlow provides two methods to control this.\n", "\n", "The first option is to turn on memory growth by calling `tf.config.experimental.set_memory_growth`, which attempts to allocate only as much GPU memory as needed for the runtime allocations: it starts out allocating very little memory, and as the program runs and more GPU memory is needed, the GPU memory region is extended for the TensorFlow process. Memory is not released, since releasing it can lead to memory fragmentation. To turn on memory growth for a specific GPU, use the following code prior to allocating any tensors or executing any ops."
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:43.060003Z", "iopub.status.busy": "2024-07-19T02:13:43.059786Z", "iopub.status.idle": "2024-07-19T02:13:43.063863Z", "shell.execute_reply": "2024-07-19T02:13:43.063300Z" }, "id": "jr3Kf1boFnCO" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Physical devices cannot be modified after being initialized\n" ] } ], "source": [ "gpus = tf.config.list_physical_devices('GPU')\n", "if gpus:\n", "  try:\n", "    # Currently, memory growth needs to be the same across GPUs\n", "    for gpu in gpus:\n", "      tf.config.experimental.set_memory_growth(gpu, True)\n", "    logical_gpus = tf.config.list_logical_devices('GPU')\n", "    print(len(gpus), \"Physical GPUs,\", len(logical_gpus), \"Logical GPUs\")\n", "  except RuntimeError as e:\n", "    # Memory growth must be set before GPUs have been initialized\n", "    print(e)" ] }, { "cell_type": "markdown", "metadata": { "id": "I1o8t51QFnmv" }, "source": [ "Another way to enable this option is to set the environment variable `TF_FORCE_GPU_ALLOW_GROWTH` to `true`. This configuration is platform-specific.\n", "\n", "The second method is to configure a virtual GPU device with `tf.config.set_logical_device_configuration` and set a hard limit on the total memory to allocate on the GPU."
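, "\n", "\n", "Note: For the `TF_FORCE_GPU_ALLOW_GROWTH` option mentioned above, one illustrative way (not from the original text) to set it is from Python, before TensorFlow initializes the GPUs:\n", "\n", "```python\n", "import os\n", "\n", "# Must be set before TensorFlow is imported / the GPUs are initialized.\n", "os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'\n", "\n", "import tensorflow as tf  # GPU memory allocations will now grow on demand\n", "```"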
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:43.066928Z", "iopub.status.busy": "2024-07-19T02:13:43.066711Z", "iopub.status.idle": "2024-07-19T02:13:43.071182Z", "shell.execute_reply": "2024-07-19T02:13:43.070543Z" }, "id": "2qO2cS9QFn42" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Virtual devices cannot be modified after being initialized\n" ] } ], "source": [ "gpus = tf.config.list_physical_devices('GPU')\n", "if gpus:\n", "  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU\n", "  try:\n", "    tf.config.set_logical_device_configuration(\n", "        gpus[0],\n", "        [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])\n", "    logical_gpus = tf.config.list_logical_devices('GPU')\n", "    print(len(gpus), \"Physical GPUs,\", len(logical_gpus), \"Logical GPUs\")\n", "  except RuntimeError as e:\n", "    # Virtual devices must be set before GPUs have been initialized\n", "    print(e)" ] }, { "cell_type": "markdown", "metadata": { "id": "Bsg1iLuHFoLW" }, "source": [ "This is useful if you want to truly bound the amount of GPU memory available to the TensorFlow process. This is common practice for local development when the GPU is shared with other applications, such as a workstation GUI." ] }, { "cell_type": "markdown", "metadata": { "id": "B27_-1gyjf-t" }, "source": [ "## Using a single GPU on a multi-GPU system\n", "\n", "If you have more than one GPU in your system, the GPU with the lowest ID will be\n", "selected by default. 
If you would like to run on a different GPU, you will need\n", "to specify the preference explicitly:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:43.074227Z", "iopub.status.busy": "2024-07-19T02:13:43.074011Z", "iopub.status.idle": "2024-07-19T02:13:43.386422Z", "shell.execute_reply": "2024-07-19T02:13:43.385763Z" }, "id": "wep4iteljjG1" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:2\n" ] } ], "source": [ "tf.debugging.set_log_device_placement(True)\n", "\n", "try:\n", "  # Specify an invalid GPU device\n", "  with tf.device('/device:GPU:2'):\n", "    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n", "    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])\n", "    c = tf.matmul(a, b)\n", "except RuntimeError as e:\n", "  print(e)" ] }, { "cell_type": "markdown", "metadata": { "id": "jy-4cCO_jn4G" }, "source": [ "If the device you have specified does not exist, you will get a `RuntimeError`: `.../device:GPU:2 unknown device`.\n", "\n", "If you would like TensorFlow to automatically choose an existing and supported device to run the operations in case the specified one doesn't exist, you can call `tf.config.set_soft_device_placement(True)`."
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:43.389955Z", "iopub.status.busy": "2024-07-19T02:13:43.389670Z", "iopub.status.idle": "2024-07-19T02:13:43.397797Z", "shell.execute_reply": "2024-07-19T02:13:43.397188Z" }, "id": "sut_UHlkjvWd" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "tf.Tensor(\n", "[[22. 28.]\n", " [49. 64.]], shape=(2, 2), dtype=float32)\n" ] } ], "source": [ "tf.config.set_soft_device_placement(True)\n", "tf.debugging.set_log_device_placement(True)\n", "\n", "# Creates some tensors\n", "a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n", "b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])\n", "c = tf.matmul(a, b)\n", "\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": { "id": "sYTYPrQZj2d9" }, "source": [ "## Using multiple GPUs\n", "\n", "Developing for multiple GPUs will allow a model to scale with the additional resources. If developing on a system with a single GPU, you can simulate multiple GPUs with virtual devices. This enables easy testing of multi-GPU setups without requiring additional resources." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:43.400747Z", "iopub.status.busy": "2024-07-19T02:13:43.400516Z", "iopub.status.idle": "2024-07-19T02:13:43.405202Z", "shell.execute_reply": "2024-07-19T02:13:43.404496Z" }, "id": "8EMGuGKbNkc6" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Virtual devices cannot be modified after being initialized\n" ] } ], "source": [ "gpus = tf.config.list_physical_devices('GPU')\n", "if gpus:\n", "  # Create 2 virtual GPUs with 1GB memory each\n", "  try:\n", "    tf.config.set_logical_device_configuration(\n", "        gpus[0],\n", "        [tf.config.LogicalDeviceConfiguration(memory_limit=1024),\n", "         tf.config.LogicalDeviceConfiguration(memory_limit=1024)])\n", "    logical_gpus = tf.config.list_logical_devices('GPU')\n", "    print(len(gpus), \"Physical GPU,\", len(logical_gpus), \"Logical GPUs\")\n", "  except RuntimeError as e:\n", "    # Virtual devices must be set before GPUs have been initialized\n", "    print(e)" ] }, { "cell_type": "markdown", "metadata": { "id": "xmNzO0FxNkol" }, "source": [ "Once there are multiple logical GPUs available to the runtime, you can utilize the multiple GPUs with `tf.distribute.Strategy` or with manual placement."
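, "\n", "\n", "For the manual-placement route, a minimal sketch (not part of the original text; it simply pins one matrix multiplication to each logical GPU reported by `tf.config.list_logical_devices('GPU')`) might look like:\n", "\n", "```python\n", "import tensorflow as tf\n", "\n", "gpus = tf.config.list_logical_devices('GPU')\n", "results = []\n", "for gpu in gpus:\n", "  # Pin one matrix multiplication to each logical GPU.\n", "  with tf.device(gpu.name):\n", "    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n", "    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])\n", "    results.append(tf.matmul(a, b))\n", "\n", "if results:\n", "  # Combine the per-GPU results on the CPU.\n", "  with tf.device('/CPU:0'):\n", "    print(tf.add_n(results))\n", "```"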
] }, { "cell_type": "markdown", "metadata": { "id": "IDZmEGq4j6kG" }, "source": [ "#### With `tf.distribute.Strategy`\n", "\n", "The best practice for using multiple GPUs is to use `tf.distribute.Strategy`.\n", "Here is a simple example:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:43.408917Z", "iopub.status.busy": "2024-07-19T02:13:43.408266Z", "iopub.status.idle": "2024-07-19T02:13:44.153386Z", "shell.execute_reply": "2024-07-19T02:13:44.152759Z" }, "id": "1KgzY8V2AvRv" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1', '/job:localhost/replica:0/task:0/device:GPU:2', '/job:localhost/replica:0/task:0/device:GPU:3')\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op VarHandleOp in device 
/job:localhost/replica:0/task:0/device:GPU:2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ 
"Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] } ], "source": [ "tf.debugging.set_log_device_placement(True)\n", "gpus = tf.config.list_logical_devices('GPU')\n", "strategy = tf.distribute.MirroredStrategy(gpus)\n", "with strategy.scope():\n", " inputs = tf.keras.layers.Input(shape=(1,))\n", " predictions = tf.keras.layers.Dense(1)(inputs)\n", " model = tf.keras.models.Model(inputs=inputs, outputs=predictions)\n", " model.compile(loss='mse',\n", " optimizer=tf.keras.optimizers.SGD(learning_rate=0.2))" ] }, { "cell_type": "markdown", "metadata": { "id": "Dy7nxlKsAxkK" }, "source": [ "This program will run a copy of your model on each GPU and split the input data\n", "between them, an approach known as \"[data parallelism](https://en.wikipedia.org/wiki/Data_parallelism)\".\n", "\n", "For more information about distribution strategies, check out the [Distributed training](./distributed_training.ipynb) guide." ] }, { "cell_type": "markdown", "metadata": { "id": "8phxM5TVkAY_" }, "source": [ "#### Manual placement\n", "\n", "`tf.distribute.Strategy` works under the hood by replicating computation across devices. You can manually implement replication by constructing your model on each GPU. 
For example:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T02:13:44.156807Z", "iopub.status.busy": "2024-07-19T02:13:44.156566Z", "iopub.status.idle": "2024-07-19T02:13:44.212086Z", "shell.execute_reply": "2024-07-19T02:13:44.211437Z" }, "id": "AqPo9ltUA_EY" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:1\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op MatMul in device 
/job:localhost/replica:0/task:0/device:GPU:3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Executing op AddN in device /job:localhost/replica:0/task:0/device:CPU:0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "tf.Tensor(\n", "[[ 88. 112.]\n", " [196. 256.]], shape=(2, 2), dtype=float32)\n" ] } ], "source": [ "tf.debugging.set_log_device_placement(True)\n", "\n", "gpus = tf.config.list_logical_devices('GPU')\n", "if gpus:\n", " # Replicate your computation on multiple GPUs\n", " c = []\n", " for gpu in gpus:\n", " with tf.device(gpu.name):\n", " a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n", " b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])\n", " c.append(tf.matmul(a, b))\n", "\n", " with tf.device('/CPU:0'):\n", " matmul_sum = tf.add_n(c)\n", "\n", " print(matmul_sum)" ] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [], "name": "gpu.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.19" } }, "nbformat": 4, "nbformat_minor": 0 }