{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Tce3stUlHN0L" }, "source": [ "##### Copyright 2020 The TensorFlow Authors." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "cellView": "form", "execution": { "iopub.execute_input": "2022-12-14T20:16:23.656364Z", "iopub.status.busy": "2022-12-14T20:16:23.656138Z", "iopub.status.idle": "2022-12-14T20:16:23.660169Z", "shell.execute_reply": "2022-12-14T20:16:23.659592Z" }, "id": "tuOe1ymfHZPu" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "qFdPvlXBOdUN" }, "source": [ "# 高效的 TensorFlow 2" ] }, { "cell_type": "markdown", "metadata": { "id": "MfBg1C5NB3X0" }, "source": [ "\n", " \n", " \n", " \n", " \n", "
View on TensorFlow.org Run in Google Colab View source on GitHub Download notebook
" ] }, { "cell_type": "markdown", "metadata": { "id": "xHxb-dlhMIzW" }, "source": [ "## 概述\n", "\n", "本指南提供了使用 TensorFlow 2 (TF2) 编写代码的最佳做法列表,此列表专为最近从 TensorFlow 1 (TF1) 切换过来的用户编写。有关将 TF1 代码迁移到 TF2 的更多信息,请参阅[指南的迁移部分](https://tensorflow.org/guide/migrate)。" ] }, { "cell_type": "markdown", "metadata": { "id": "MUXex9ctTuDB" }, "source": [ "## 设置\n", "\n", "为本指南中的示例导入 TensorFlow 和其他依赖项。" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:23.664278Z", "iopub.status.busy": "2022-12-14T20:16:23.663658Z", "iopub.status.idle": "2022-12-14T20:16:25.967376Z", "shell.execute_reply": "2022-12-14T20:16:25.966659Z" }, "id": "IqR2PQG4ZaZ0" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:24.587741: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory\n", "2022-12-14 20:16:24.587848: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory\n", "2022-12-14 20:16:24.587858: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. 
If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n" ] } ], "source": [ "import tensorflow as tf\n", "import tensorflow_datasets as tfds" ] }, { "cell_type": "markdown", "metadata": { "id": "Ngds9zateIY8" }, "source": [ "## Recommendations for idiomatic TensorFlow 2" ] }, { "cell_type": "markdown", "metadata": { "id": "B3RdHaroMAi4" }, "source": [ "### Refactor your code into smaller modules\n", "\n", "A good practice is to refactor your code into smaller functions that are called as needed. For best performance, you should try to decorate the largest blocks of computation that you can in a `tf.function` (note that the nested Python functions called by a `tf.function` do not require their own separate decorations, unless you want to use different `jit_compile` settings for the `tf.function`). Depending on your use case, this could be multiple training steps or even your whole training loop. For inference use cases, it might be a single model forward pass." ] }, { "cell_type": "markdown", "metadata": { "id": "Rua1l8et3Evd" }, "source": [ "### Adjust the default learning rate for some `tf.keras.optimizers`\n", "\n", "\n", "\n", "Some Keras optimizers have different learning rates in TF2. If you see a change in convergence behavior for your models, check the default learning rates.\n", "\n", "There are no changes for `optimizers.SGD`, `optimizers.Adam`, or `optimizers.RMSprop`.\n", "\n", "The following default learning rates have changed:\n", "\n", "- `optimizers.Adagrad` from `0.01` to `0.001`\n", "- `optimizers.Adadelta` from `1.0` to `0.001`\n", "- `optimizers.Adamax` from `0.002` to `0.001`\n", "- `optimizers.Nadam` from `0.002` to `0.001`" ] }, { "cell_type": "markdown", "metadata": { "id": "z6LfkpsEldEV" }, "source": [ "### Use `tf.Module`s and Keras layers to manage variables\n", "\n", "`tf.Module`s and `tf.keras.layers.Layer`s offer the convenient `variables` and `trainable_variables` properties, which recursively gather up all dependent variables. This makes it easy to manage variables locally to where they are being used." ] }, { "cell_type": "markdown", "metadata": { "id": "WQ2U0rj1oBlc" }, "source": [ "Keras layers/models inherit from `tf.train.Checkpointable` and are integrated with `@tf.function`, which makes it possible to directly checkpoint or export SavedModels from Keras objects. You do not necessarily have to use Keras' `Model.fit` API to take advantage of these integrations.\n", "\n", "Read the section on [transfer learning and fine-tuning](https://tensorflow.google.cn/guide/keras/transfer_learning#transfer_learning_fine-tuning_with_a_custom_training_loop) in the Keras guide to learn how to collect a subset of relevant variables using Keras." ] }, { "cell_type": "markdown", "metadata": { "id": "j34MsfxWodG6" }, "source": [ "### Combine `tf.data.Dataset`s and `tf.function`\n", "\n", "The [TensorFlow Datasets](https://tensorflow.org/datasets) package 
(`tfds`) contains utilities for loading predefined datasets as `tf.data.Dataset` objects. For this example, you can load the MNIST dataset using `tfds`:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:25.972367Z", "iopub.status.busy": "2022-12-14T20:16:25.971466Z", "iopub.status.idle": "2022-12-14T20:16:29.915170Z", "shell.execute_reply": "2022-12-14T20:16:29.914484Z" }, "id": "BMgxaLH74_s-" }, "outputs": [], "source": [ "datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True)\n", "mnist_train, mnist_test = datasets['train'], datasets['test']" ] }, { "cell_type": "markdown", "metadata": { "id": "hPJhEuvj5VfR" }, "source": [ "Then prepare the data for training:\n", "\n", "- Re-scale each image.\n", "- Shuffle the order of the examples.\n", "- Collect batches of images and labels.\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:29.919658Z", "iopub.status.busy": "2022-12-14T20:16:29.918992Z", "iopub.status.idle": "2022-12-14T20:16:29.922961Z", "shell.execute_reply": "2022-12-14T20:16:29.922317Z" }, "id": "StBRHtJM2S7o" }, "outputs": [], "source": [ "BUFFER_SIZE = 10 # Use a much larger value for real code\n", "BATCH_SIZE = 64\n", "NUM_EPOCHS = 5\n", "\n", "\n", "def scale(image, label):\n", " image = tf.cast(image, tf.float32)\n", " image /= 255\n", "\n", " return image, label" ] }, { "cell_type": "markdown", "metadata": { "id": "SKq14zKKFAdv" }, "source": [ "To keep the example short, trim the dataset to only return 5 batches:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:29.926429Z", "iopub.status.busy": "2022-12-14T20:16:29.925884Z", "iopub.status.idle": "2022-12-14T20:16:29.973511Z", "shell.execute_reply": "2022-12-14T20:16:29.972968Z" }, "id": "_J-o4YjG2mkM" }, "outputs": [], "source": [ "train_data = mnist_train.map(scale).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)\n", "test_data = mnist_test.map(scale).batch(BATCH_SIZE)\n", "\n", "STEPS_PER_EPOCH = 5\n", "\n", "train_data = train_data.take(STEPS_PER_EPOCH)\n", "test_data = 
test_data.take(STEPS_PER_EPOCH)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:29.976918Z", "iopub.status.busy": "2022-12-14T20:16:29.976375Z", "iopub.status.idle": "2022-12-14T20:16:30.260199Z", "shell.execute_reply": "2022-12-14T20:16:30.259349Z" }, "id": "XEqdkH54VM6c" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:30.248760: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] } ], "source": [ "image_batch, label_batch = next(iter(train_data))" ] }, { "cell_type": "markdown", "metadata": { "id": "loTPH2Pz4_Oj" }, "source": [ "Use regular Python iteration to iterate over training data that fits in memory. Otherwise, `tf.data.Dataset` is the best way to stream training data from disk. Datasets are [iterables (not iterators)](https://docs.python.org/3/glossary.html#term-iterable), and work just like other Python iterables in eager execution. You can fully utilize dataset async prefetching/streaming features by wrapping your code in `tf.function`, which replaces Python iteration with the equivalent graph operations using AutoGraph.\n", "\n", "```python\n", "@tf.function\n", "def train(model, dataset, optimizer):\n", " for x, y in dataset:\n", " with tf.GradientTape() as tape:\n", " # training=True is only needed if there are layers with different\n", " # behavior during training versus inference (e.g. 
Dropout).\n", " prediction = model(x, training=True)\n", " loss = loss_fn(prediction, y)\n", " gradients = tape.gradient(loss, model.trainable_variables)\n", " optimizer.apply_gradients(zip(gradients, model.trainable_variables))\n", "```\n", "\n", "If you use the Keras `Model.fit` API, you won't have to worry about dataset iteration.\n", "\n", "```python\n", "model.compile(optimizer=optimizer, loss=loss_fn)\n", "model.fit(dataset)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "mSev7vZC5GJB" }, "source": [ "\n", "\n", "### Use Keras training loops\n", "\n", "If you don't need low-level control of your training process, using Keras's built-in `fit`, `evaluate`, and `predict` methods is recommended. These methods provide a uniform interface to train the model regardless of the implementation (sequential, functional, or sub-classed).\n", "\n", "The advantages of these methods include:\n", "\n", "- They accept Numpy arrays, Python generators and `tf.data.Datasets`.\n", "- They apply regularization and activation losses automatically.\n", "- They support `tf.distribute`, where the training code remains the same [regardless of the hardware configuration](distributed_training.ipynb).\n", "- They support arbitrary callables as losses and metrics.\n", "- They support callbacks like `tf.keras.callbacks.TensorBoard`, and custom callbacks.\n", "- They are performant, automatically using TensorFlow graphs.\n", "\n", "Here is an example of training a model using a `Dataset`. For details on how this works, check out the [tutorials](https://tensorflow.org/tutorials)." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:30.264039Z", "iopub.status.busy": "2022-12-14T20:16:30.263300Z", "iopub.status.idle": "2022-12-14T20:16:38.964773Z", "shell.execute_reply": "2022-12-14T20:16:38.964058Z" }, "id": "uzHFCzd45Rae" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/5\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/5 [=====>........................] 
- ETA: 28s - loss: 2.6460 - accuracy: 0.1250" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "5/5 [==============================] - 7s 6ms/step - loss: 1.6298 - accuracy: 0.4750\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 2/5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:37.438231: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/5 [=====>........................] - ETA: 1s - loss: 0.5281 - accuracy: 0.8906" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "5/5 [==============================] - 0s 5ms/step - loss: 0.4652 - accuracy: 0.9031\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 3/5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:37.745850: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. 
You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/5 [=====>........................] - ETA: 0s - loss: 0.3215 - accuracy: 0.9688" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "5/5 [==============================] - 0s 4ms/step - loss: 0.2902 - accuracy: 0.9688\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 4/5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:38.020249: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/5 [=====>........................] - ETA: 1s - loss: 0.2478 - accuracy: 0.9844" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "5/5 [==============================] - 0s 5ms/step - loss: 0.2238 - accuracy: 0.9750\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 5/5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:38.301951: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. 
This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/5 [=====>........................] - ETA: 1s - loss: 0.1902 - accuracy: 0.9844" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "5/5 [==============================] - 0s 5ms/step - loss: 0.1776 - accuracy: 0.9875\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:38.608839: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/5 [=====>........................] - ETA: 1s - loss: 1.6016 - accuracy: 0.6562" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "5/5 [==============================] - 0s 3ms/step - loss: 1.6134 - accuracy: 0.5906\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Loss 1.6133979558944702, Accuracy 0.590624988079071\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:38.952507: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. 
In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] } ], "source": [ "model = tf.keras.Sequential([\n", " tf.keras.layers.Conv2D(32, 3, activation='relu',\n", " kernel_regularizer=tf.keras.regularizers.l2(0.02),\n", " input_shape=(28, 28, 1)),\n", " tf.keras.layers.MaxPooling2D(),\n", " tf.keras.layers.Flatten(),\n", " tf.keras.layers.Dropout(0.1),\n", " tf.keras.layers.Dense(64, activation='relu'),\n", " tf.keras.layers.BatchNormalization(),\n", " tf.keras.layers.Dense(10)\n", "])\n", "\n", "# Model is the full model w/o custom layers\n", "model.compile(optimizer='adam',\n", " loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", " metrics=['accuracy'])\n", "\n", "model.fit(train_data, epochs=NUM_EPOCHS)\n", "loss, acc = model.evaluate(test_data)\n", "\n", "print(\"Loss {}, Accuracy {}\".format(loss, acc))" ] }, { "cell_type": "markdown", "metadata": { "id": "LQTaHTuK5S5A" }, "source": [ "\n", "\n", "### Customize training and write your own loop\n", "\n", "If Keras models work for you, but you need more flexibility and control of the training step or the outer training loops, you can implement your own training steps or even entire training loops. See the Keras guide on [customizing `fit`](https://tensorflow.google.cn/guide/keras/customizing_what_happens_in_fit) to learn more.\n", "\n", "You can also implement many things as a `tf.keras.callbacks.Callback`.\n", "\n", "This method has many of the advantages [mentioned previously](#keras_training_loops), but gives you control of the train step and even the outer loop.\n", "\n", "There are three steps to a standard training loop:\n", "\n", "1. Iterate over a Python generator or `tf.data.Dataset` to get batches of examples.\n", "2. Use `tf.GradientTape` to collect gradients.\n", "3. 
Use one of the `tf.keras.optimizers` to apply weight updates to the model's variables.\n", "\n", "Remember:\n", "\n", "- Always include a `training` argument on the `call` method of subclassed layers and models.\n", "- Make sure to call the model with the `training` argument set correctly.\n", "- Depending on usage, model variables may not exist until the model is run on a batch of data.\n", "- You need to manually handle things like regularization losses for the model.\n", "\n", "There is no need to run variable initializers or to add manual control dependencies. `tf.function` handles automatic control dependencies and variable initialization on creation for you." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:38.968443Z", "iopub.status.busy": "2022-12-14T20:16:38.967788Z", "iopub.status.idle": "2022-12-14T20:16:41.815304Z", "shell.execute_reply": "2022-12-14T20:16:41.814410Z" }, "id": "gQooejfYlQeF" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:40.664234: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Finished epoch 0\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:40.916298: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Finished epoch 1\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:41.225330: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. 
In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Finished epoch 2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:41.484274: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Finished epoch 3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Finished epoch 4\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:41.802437: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. 
You should use `dataset.take(k).cache().repeat()` instead.\n" ] } ], "source": [ "model = tf.keras.Sequential([\n", " tf.keras.layers.Conv2D(32, 3, activation='relu',\n", " kernel_regularizer=tf.keras.regularizers.l2(0.02),\n", " input_shape=(28, 28, 1)),\n", " tf.keras.layers.MaxPooling2D(),\n", " tf.keras.layers.Flatten(),\n", " tf.keras.layers.Dropout(0.1),\n", " tf.keras.layers.Dense(64, activation='relu'),\n", " tf.keras.layers.BatchNormalization(),\n", " tf.keras.layers.Dense(10)\n", "])\n", "\n", "optimizer = tf.keras.optimizers.Adam(0.001)\n", "loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)\n", "\n", "@tf.function\n", "def train_step(inputs, labels):\n", " with tf.GradientTape() as tape:\n", " predictions = model(inputs, training=True)\n", " regularization_loss=tf.math.add_n(model.losses)\n", " pred_loss=loss_fn(labels, predictions)\n", " total_loss=pred_loss + regularization_loss\n", "\n", " gradients = tape.gradient(total_loss, model.trainable_variables)\n", " optimizer.apply_gradients(zip(gradients, model.trainable_variables))\n", "\n", "for epoch in range(NUM_EPOCHS):\n", " for inputs, labels in train_data:\n", " train_step(inputs, labels)\n", " print(\"Finished epoch\", epoch)\n" ] }, { "cell_type": "markdown", "metadata": { "id": "WikxMFGgo3oZ" }, "source": [ "### Take advantage of `tf.function` with Python control flow\n", "\n", "`tf.function` provides a way to convert data-dependent control flow into graph-mode equivalents like `tf.cond` and `tf.while_loop`.\n", "\n", "One common place where data-dependent control flow appears is in sequence models. `tf.keras.layers.RNN` wraps an RNN cell, allowing you to either statically or dynamically unroll the recurrence. As an example, you could reimplement dynamic unroll as follows." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:41.819151Z", "iopub.status.busy": "2022-12-14T20:16:41.818482Z", "iopub.status.idle": "2022-12-14T20:16:41.825449Z", "shell.execute_reply": "2022-12-14T20:16:41.824661Z" }, "id": "n5UebfChRu4T" }, "outputs": [], "source": [ "class DynamicRNN(tf.keras.Model):\n", "\n", " def __init__(self, rnn_cell):\n", " super(DynamicRNN, 
self).__init__()\n", " self.cell = rnn_cell\n", "\n", " @tf.function(input_signature=[tf.TensorSpec(dtype=tf.float32, shape=[None, None, 3])])\n", " def call(self, input_data):\n", "\n", " # [batch, time, features] -> [time, batch, features]\n", " input_data = tf.transpose(input_data, [1, 0, 2])\n", " timesteps = tf.shape(input_data)[0]\n", " batch_size = tf.shape(input_data)[1]\n", " outputs = tf.TensorArray(tf.float32, timesteps)\n", " state = self.cell.get_initial_state(batch_size = batch_size, dtype=tf.float32)\n", " for i in tf.range(timesteps):\n", " output, state = self.cell(input_data[i], state)\n", " outputs = outputs.write(i, output)\n", " return tf.transpose(outputs.stack(), [1, 0, 2]), state" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:41.829125Z", "iopub.status.busy": "2022-12-14T20:16:41.828401Z", "iopub.status.idle": "2022-12-14T20:16:42.092698Z", "shell.execute_reply": "2022-12-14T20:16:42.091784Z" }, "id": "NhBI_SGKQVIB" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(10, 20, 13)\n" ] } ], "source": [ "lstm_cell = tf.keras.layers.LSTMCell(units = 13)\n", "\n", "my_rnn = DynamicRNN(lstm_cell)\n", "outputs, state = my_rnn(tf.random.normal(shape=[10,20,3]))\n", "print(outputs.shape)" ] }, { "cell_type": "markdown", "metadata": { "id": "du7bn3NX7iIr" }, "source": [ "Read the [`tf.function` guide](https://tensorflow.google.cn/guide/function) for more information." ] }, { "cell_type": "markdown", "metadata": { "id": "SUAYhgL_NomT" }, "source": [ "### New-style metrics and losses\n", "\n", "Metrics and losses are both objects that work eagerly and in `tf.function`s.\n", "\n", "A loss object is callable, and expects (`y_true`, `y_pred`) as arguments:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:42.096575Z", "iopub.status.busy": "2022-12-14T20:16:42.096055Z", "iopub.status.idle": "2022-12-14T20:16:42.118392Z", "shell.execute_reply": "2022-12-14T20:16:42.117620Z" }, "id": 
"Pf5gcwMzNs8F" }, "outputs": [ { "data": { "text/plain": [ "4.01815" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)\n", "cce([[1, 0]], [[-1.0,3.0]]).numpy()" ] }, { "cell_type": "markdown", "metadata": { "id": "a89m-wRfxyfV" }, "source": [ "#### 使用指标收集和显示数据\n", "\n", "您可以使用 `tf.metrics` 聚合数据,使用 `tf.summary` 记录摘要并使用上下文管理器将其重定向到编写器。摘要会直接发送到编写器,这意味着您必须在调用点提供 `step` 值。\n", "\n", "```python\n", "summary_writer = tf.summary.create_file_writer('/tmp/summaries')\n", "with summary_writer.as_default():\n", " tf.summary.scalar('loss', 0.1, step=42)\n", "```\n", "\n", "要在将数据记录为摘要之前对其进行聚合,请使用 `tf.metrics`。指标是有状态的;它们积累值并在您调用 `result` 方法(例如 `Mean.result`)时返回累积结果。可以使用 `Model.reset_states` 清除累积值。\n", "\n", "```python\n", "def train(model, optimizer, dataset, log_freq=10):\n", " avg_loss = tf.keras.metrics.Mean(name='loss', dtype=tf.float32)\n", " for images, labels in dataset:\n", " loss = train_step(model, optimizer, images, labels)\n", " avg_loss.update_state(loss)\n", " if tf.equal(optimizer.iterations % log_freq, 0):\n", " tf.summary.scalar('loss', avg_loss.result(), step=optimizer.iterations)\n", " avg_loss.reset_states()\n", "\n", "def test(model, test_x, test_y, step_num):\n", " # training=False is only needed if there are layers with different\n", " # behavior during training versus inference (e.g. 
Dropout).\n", " loss = loss_fn(model(test_x, training=False), test_y)\n", " tf.summary.scalar('loss', loss, step=step_num)\n", "\n", "train_summary_writer = tf.summary.create_file_writer('/tmp/summaries/train')\n", "test_summary_writer = tf.summary.create_file_writer('/tmp/summaries/test')\n", "\n", "with train_summary_writer.as_default():\n", " train(model, optimizer, dataset)\n", "\n", "with test_summary_writer.as_default():\n", " test(model, test_x, test_y, optimizer.iterations)\n", "```\n", "\n", "Visualize the generated summaries by pointing TensorBoard at the summary log directory:\n", "\n", "```shell\n", "tensorboard --logdir /tmp/summaries\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "0tx7FyM_RHwJ" }, "source": [ "Use the `tf.summary` API to write summary data for visualization in TensorBoard. For more info, read the `tf.summary` guide." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:42.122068Z", "iopub.status.busy": "2022-12-14T20:16:42.121596Z", "iopub.status.idle": "2022-12-14T20:16:43.841043Z", "shell.execute_reply": "2022-12-14T20:16:43.840263Z" }, "id": "HAbA0fKW58CH" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:42.710276: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 0\n", " loss: 0.158\n", " accuracy: 0.991\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:42.986717: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. 
In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 1\n", " loss: 0.134\n", " accuracy: 0.997\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:43.273487: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 2\n", " loss: 0.113\n", " accuracy: 0.997\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:43.559555: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 3\n", " loss: 0.099\n", " accuracy: 1.000\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 4\n", " loss: 0.091\n", " accuracy: 1.000\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:43.827528: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. 
In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.\n" ] } ], "source": [ "# Create the metrics\n", "loss_metric = tf.keras.metrics.Mean(name='train_loss')\n", "accuracy_metric = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')\n", "\n", "@tf.function\n", "def train_step(inputs, labels):\n", " with tf.GradientTape() as tape:\n", " predictions = model(inputs, training=True)\n", " regularization_loss=tf.math.add_n(model.losses)\n", " pred_loss=loss_fn(labels, predictions)\n", " total_loss=pred_loss + regularization_loss\n", "\n", " gradients = tape.gradient(total_loss, model.trainable_variables)\n", " optimizer.apply_gradients(zip(gradients, model.trainable_variables))\n", " # Update the metrics\n", " loss_metric.update_state(total_loss)\n", " accuracy_metric.update_state(labels, predictions)\n", "\n", "\n", "for epoch in range(NUM_EPOCHS):\n", " # Reset the metrics\n", " loss_metric.reset_states()\n", " accuracy_metric.reset_states()\n", "\n", " for inputs, labels in train_data:\n", " train_step(inputs, labels)\n", " # Get the metric results\n", " mean_loss=loss_metric.result()\n", " mean_accuracy = accuracy_metric.result()\n", "\n", " print('Epoch: ', epoch)\n", " print(' loss: {:.3f}'.format(mean_loss))\n", " print(' accuracy: {:.3f}'.format(mean_accuracy))\n" ] }, { "cell_type": "markdown", "metadata": { "id": "bG9AaMzih3eh" }, "source": [ "#### Keras metric names\n", "\n", "\n", "\n", "Keras models are consistent about handling metric names. When you pass a string in the list of metrics, that *exact* string is used as the metric's `name`. These names are visible in the history object returned by `model.fit`, and in the logs passed to `keras.callbacks`, which are set to the string you passed in the metric list. " ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:43.844854Z", "iopub.status.busy": "2022-12-14T20:16:43.844207Z", "iopub.status.idle": 
"2022-12-14T20:16:45.500798Z", "shell.execute_reply": "2022-12-14T20:16:45.499871Z" }, "id": "1iODIsGDgyYd" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/5 [=====>........................] - ETA: 6s - loss: 0.0799 - acc: 1.0000 - accuracy: 1.0000 - my_accuracy: 1.0000" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "5/5 [==============================] - 2s 5ms/step - loss: 0.1055 - acc: 0.9937 - accuracy: 0.9937 - my_accuracy: 0.9937\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:16:45.487647: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. 
You should use `dataset.take(k).cache().repeat()` instead.\n" ] } ], "source": [ "model.compile(\n", "    optimizer = tf.keras.optimizers.Adam(0.001),\n", "    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", "    metrics = ['acc', 'accuracy', tf.keras.metrics.SparseCategoricalAccuracy(name=\"my_accuracy\")])\n", "history = model.fit(train_data)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:16:45.504390Z", "iopub.status.busy": "2022-12-14T20:16:45.503784Z", "iopub.status.idle": "2022-12-14T20:16:45.509060Z", "shell.execute_reply": "2022-12-14T20:16:45.508250Z" }, "id": "8oGzs_TlisKJ" }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['loss', 'acc', 'accuracy', 'my_accuracy'])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "history.history.keys()" ] }, { "cell_type": "markdown", "metadata": { "id": "JaB2z2XIyhcr" }, "source": [ "### Debugging\n", "\n", "Use eager execution to run your code step-by-step to inspect shapes, data types and values. Certain APIs, like `tf.function` and `tf.keras`, are designed to use graph execution for performance and portability. When debugging, use `tf.config.run_functions_eagerly(True)` to use eager execution inside this code.\n", "\n", "For example:\n", "\n", "```python\n", "@tf.function\n", "def f(x):\n", "  if x > 0:\n", "    import pdb\n", "    pdb.set_trace()\n", "    x = x + 1\n", "  return x\n", "\n", "tf.config.run_functions_eagerly(True)\n", "f(tf.constant(1))\n", "```\n", "\n", "```\n", ">>> f()\n", "-> x = x + 1\n", "(Pdb) l\n", "  6     @tf.function\n", "  7     def f(x):\n", "  8       if x > 0:\n", "  9         import pdb\n", " 10         pdb.set_trace()\n", " 11  ->     x = x + 1\n", " 12       return x\n", " 13\n", " 14     tf.config.run_functions_eagerly(True)\n", " 15     f(tf.constant(1))\n", "[EOF]\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "gdvGF2FvbBXZ" }, "source": [ "This can also be used inside Keras models and other APIs that support eager execution:\n", "\n", "```python\n", "class CustomModel(tf.keras.models.Model):\n", "\n", "  @tf.function\n", "  def call(self, input_data):\n", "    if tf.reduce_mean(input_data) > 0:\n", "      return input_data\n", "    else:\n", "      import pdb\n", "      pdb.set_trace()\n", "      return input_data // 2\n", "\n", "\n", "tf.config.run_functions_eagerly(True)\n", "model = CustomModel()\n", "model(tf.constant([-2, -4]))\n", "```\n", "\n", "```\n", ">>> call()\n", "-> return input_data // 2\n", "(Pdb) l\n", " 10         if tf.reduce_mean(input_data) > 0:\n", " 11           return input_data\n", " 12         else:\n", " 13           import pdb\n", " 14           pdb.set_trace()\n", " 15  ->       return input_data // 2\n", " 16\n", " 17\n", " 18     tf.config.run_functions_eagerly(True)\n", " 19     model = CustomModel()\n", " 20     model(tf.constant([-2, -4]))\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "S0-F-bvJXKD8" }, "source": [ "Notes:\n", "\n", "- `tf.keras.Model` methods such as `fit`, `evaluate`, and `predict` execute as [graphs](https://tensorflow.google.cn/guide/intro_to_graphs) with `tf.function` under the hood.\n", "\n", "- When using `tf.keras.Model.compile`, set `run_eagerly = True` to prevent the `Model` logic from being wrapped in a `tf.function`.\n", "\n", "- Use `tf.data.experimental.enable_debug_mode` to enable the debug mode for `tf.data`. Read the [API docs](https://tensorflow.google.cn/api_docs/python/tf/data/experimental/enable_debug_mode) for more details.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "wxa5yKK7bym0" }, "source": [ "### Do not keep `tf.Tensors` in your objects\n", "\n", "These tensor objects might get created either in a `tf.function` or in the eager context, and they behave differently in each. Always use `tf.Tensor`s only for intermediate values.\n", "\n", "To track state, use `tf.Variable`s, as they are always usable from both contexts. Read the `tf.Variable` guide to learn more.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "FdXLLYa2uAyx" }, "source": [ "## Resources and further reading\n", "\n", "- Read the TF2 [guides](https://tensorflow.org/guide) and [tutorials](https://tensorflow.org/tutorials) to learn more about how to use TF2.\n", "\n", "- If you previously used TF1.x, it is highly recommended you migrate your code to TF2. Read the [migration guides](https://tensorflow.org/guide/migrate) to learn more." ] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "effective_tf2.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 
3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 0 }