{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Tce3stUlHN0L" }, "source": [ "##### Copyright 2019 The TensorFlow Authors.\n" ] },
{ "cell_type": "code", "execution_count": 1, "metadata": { "cellView": "form", "execution": { "iopub.execute_input": "2022-12-14T20:00:03.143830Z", "iopub.status.busy": "2022-12-14T20:00:03.143405Z", "iopub.status.idle": "2022-12-14T20:00:03.147088Z", "shell.execute_reply": "2022-12-14T20:00:03.146590Z" }, "id": "tuOe1ymfHZPu" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] },
{ "cell_type": "markdown", "metadata": { "id": "MfBg1C5NB3X0" }, "source": [ "# Multi-worker training with Keras\n", "\n", "
" ] },
{ "cell_type": "markdown", "metadata": { "id": "xHxb-dlhMIzW" }, "source": [ "## Overview\n", "\n", "This tutorial demonstrates multi-worker distributed training with a Keras model and the `Model.fit` API, using the `tf.distribute.Strategy` API. With this strategy, a Keras model that was designed to run on a single worker can work seamlessly on multiple workers with minimal code changes.\n", "\n", "To learn how to use the `MultiWorkerMirroredStrategy` with Keras and a custom training loop, refer to [Custom training loop with Keras and MultiWorkerMirroredStrategy](multi_worker_with_ctl.ipynb).\n", "\n", "This tutorial contains a minimal multi-worker example with two workers for demonstration purposes." ] },
{ "cell_type": "markdown", "metadata": { "id": "JUdRerXg6yz3" }, "source": [ "### Choose the right strategy" ] },
{ "cell_type": "markdown", "metadata": { "id": "YAiCV_oL63GM" }, "source": [ "Before you start, make sure that `tf.distribute.MultiWorkerMirroredStrategy` is the right choice for your accelerators and training. There are two common ways of distributing training with data parallelism:\n", "\n", "- *Synchronous training*, where the training steps are synced across workers and replicas, as in `tf.distribute.MirroredStrategy`, `tf.distribute.TPUStrategy`, and `tf.distribute.MultiWorkerMirroredStrategy`. All workers train on different slices of the input data in sync and aggregate gradients at each step.\n", "- *Asynchronous training*, where the training steps are not strictly synced, as in `tf.distribute.experimental.ParameterServerStrategy`. All workers train on the input data independently and update variables asynchronously.\n", "\n", "For multi-worker synchronous training without TPUs, use `tf.distribute.MultiWorkerMirroredStrategy`. It creates copies of all variables in the model's layers on each device across all workers. It uses `CollectiveOps`, a TensorFlow op for collective communication, to aggregate gradients and keep the variables in sync. For the collective implementation options, check the `tf.distribute.experimental.CommunicationOptions` parameter.\n", "\n", "For an overview of the `tf.distribute.Strategy` APIs, refer to [Distributed training in TensorFlow](../../guide/distributed_training.ipynb)." ] },
{ "cell_type": "markdown", "metadata": { "id": "MUXex9ctTuDB" }, "source": [ "## Setup\n", "\n", "Start with the necessary imports:" ] },
{ "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:03.150605Z", "iopub.status.busy": "2022-12-14T20:00:03.150155Z", "iopub.status.idle": "2022-12-14T20:00:03.156960Z", "shell.execute_reply": "2022-12-14T20:00:03.156339Z" }, "id": "bnYxvfLD-LW-" }, "outputs": [], "source": [ "import json\n", "import os\n", "import sys" ] },
{ "cell_type": "markdown", "metadata": { "id": "Zz0EY91y3mxy" }, "source": [ "Before importing TensorFlow, make a few changes to the environment:\n", "\n", "- In a real-world application, each worker would be on a different machine. In this tutorial, all workers run on **this** machine. Therefore, disable all GPUs to prevent errors caused by all workers trying to use the same GPU." ] },
{ "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:03.160306Z", "iopub.status.busy": "2022-12-14T20:00:03.159739Z", "iopub.status.idle": "2022-12-14T20:00:03.162864Z", "shell.execute_reply": "2022-12-14T20:00:03.162219Z" }, "id": "rpEIVI5upIzM" }, "outputs": [], "source": [ "os.environ[\"CUDA_VISIBLE_DEVICES\"] = \"-1\"" ] },
{ "cell_type": "markdown", "metadata": { "id": "7X1MS6385BWi" }, "source": [ "- Reset the `TF_CONFIG` environment variable (more on this later)." ] },
{ "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:03.166028Z", "iopub.status.busy": "2022-12-14T20:00:03.165585Z", "iopub.status.idle": "2022-12-14T20:00:03.168435Z", "shell.execute_reply": "2022-12-14T20:00:03.167905Z" }, "id": "WEJLYa2_7OZF" }, "outputs": [], "source": [ "os.environ.pop('TF_CONFIG', None)" ] },
{ "cell_type": "markdown", "metadata": { "id": "Rd4L9Ii77SS8" }, "source": [ "- Make sure that the current directory is on Python's path. This allows the notebook to import the files written by `%%writefile` later.\n" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:03.171657Z", "iopub.status.busy": "2022-12-14T20:00:03.171137Z", "iopub.status.idle": "2022-12-14T20:00:03.174209Z", "shell.execute_reply": "2022-12-14T20:00:03.173672Z" }, "id": "hPBuZUNSZmrQ" }, "outputs": [], "source": [ "if '.' 
not in sys.path:\n", "  sys.path.insert(0, '.')" ] },
{ "cell_type": "markdown", "metadata": { "id": "9hLpDZhAz2q-" }, "source": [ "Install `tf-nightly`. The `save_freq` argument of `tf.keras.callbacks.BackupAndRestore`, which sets how often checkpoints are saved at a particular step, was introduced in TensorFlow 2.10." ] },
{ "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:03.177440Z", "iopub.status.busy": "2022-12-14T20:00:03.177054Z", "iopub.status.idle": "2022-12-14T20:00:31.425335Z", "shell.execute_reply": "2022-12-14T20:00:31.424469Z" }, "id": "-XqozLfzz30N" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Successfully installed jax-0.4.1 keras-nightly-2.12.0.dev2022121408 tb-nightly-2.12.0a20221214 tf-estimator-nightly-2.12.0.dev2022121409 tf-nightly-2.12.0.dev20221214\n" ] } ], "source": [ "!pip install tf-nightly" ] },
{ "cell_type": "markdown", "metadata": { "id": "524e38dab658" }, "source": [ "Finally, import TensorFlow:" ] },
{ "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:31.429648Z", "iopub.status.busy": "2022-12-14T20:00:31.429341Z", "iopub.status.idle": "2022-12-14T20:00:33.890439Z", "shell.execute_reply": "2022-12-14T20:00:33.889770Z" }, "id": "vHNvttzV43sA" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:00:31.684068: E tensorflow/tsl/lib/monitoring/collection_registry.cc:81] Cannot register 2 metrics with the same name: /tensorflow/core/bfc_allocator_delay\n" ] } ], "source": [ "import tensorflow as tf" ] },
{ "cell_type": "markdown", "metadata": { "id": "0S2jpf6Sx50i" }, "source": [ "### Dataset and model definition" ] },
{ "cell_type": "markdown", "metadata": { "id": "fLW6D2TzvC-4" }, "source": [ "Next, create an `mnist_setup.py` file with a simple model and dataset setup. The worker processes in this tutorial use this Python file." ] },
{ "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:33.894960Z", "iopub.status.busy": "2022-12-14T20:00:33.894589Z", "iopub.status.idle": "2022-12-14T20:00:33.899960Z", "shell.execute_reply": "2022-12-14T20:00:33.899337Z" }, "id": "dma_wUAxZqo2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Writing mnist_setup.py\n" ] } ], "source": [ "%%writefile mnist_setup.py\n", "\n", "import os\n", "import tensorflow as tf\n", "import numpy as np\n", "\n", "def mnist_dataset(batch_size):\n", "  (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()\n", "  # The `x` arrays are in uint8 and have values in the [0, 255] range.\n", "  # You need to convert them to float32 with values in the [0, 1] range.\n", "  x_train = x_train / np.float32(255)\n", "  y_train = y_train.astype(np.int64)\n", "  train_dataset = tf.data.Dataset.from_tensor_slices(\n", "      (x_train, y_train)).shuffle(60000).repeat().batch(batch_size)\n", "  return train_dataset\n", "\n", "def build_and_compile_cnn_model():\n", "  model = tf.keras.Sequential([\n", "      tf.keras.layers.InputLayer(input_shape=(28, 28)),\n", "      tf.keras.layers.Reshape(target_shape=(28, 28, 1)),\n", "      tf.keras.layers.Conv2D(32, 3, activation='relu'),\n", "      tf.keras.layers.Flatten(),\n", "      tf.keras.layers.Dense(128, activation='relu'),\n", "      tf.keras.layers.Dense(10)\n", "  ])\n", "  model.compile(\n", "      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", "      optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),\n", "      metrics=['accuracy'])\n", "  return model" ] },
{ "cell_type": "markdown", "metadata": { "id": "2UL3kisMO90X" }, "source": [ "### Model training on a single worker\n", "\n", "First, train the model for a small number of epochs and observe the results on a single worker to make sure everything works correctly. As the epochs progress, the loss should drop and the accuracy should rise toward 1.0." ] },
{ "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:33.903561Z", "iopub.status.busy": "2022-12-14T20:00:33.903029Z", "iopub.status.idle": "2022-12-14T20:00:36.897242Z", "shell.execute_reply": "2022-12-14T20:00:36.896543Z" }, "id": "6Qe6iAf5O8iJ" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz\n", "11490434/11490434 [==============================] - 0s 0us/step\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:00:34.473671: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/3\n", "70/70 [==============================] - 1s 7ms/step - loss: 2.2720 - accuracy: 0.2181\n", "Epoch 2/3\n", "70/70 [==============================] - 0s 6ms/step - loss: 2.1978 - accuracy: 0.4471\n", "Epoch 3/3\n", "70/70 [==============================] - 0s 6ms/step - loss: 2.1061 - accuracy: 0.5902\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import mnist_setup\n", "\n", "batch_size = 64\n", "single_worker_dataset = mnist_setup.mnist_dataset(batch_size)\n", "single_worker_model = mnist_setup.build_and_compile_cnn_model()\n", "single_worker_model.fit(single_worker_dataset, epochs=3, steps_per_epoch=70)" ] },
{ "cell_type": "markdown", "metadata": { "id": "JmgZwwymxqt5" }, "source": [ "## Multi-worker configuration\n", "\n", "Now let's enter the world of multi-worker training.\n", "\n", "### A cluster with jobs and tasks\n", "\n", "In TensorFlow, distributed training involves a `'cluster'` with several jobs, and each job may have one or more `'task'`s.\n", "\n", "To train on multiple machines, each of which may have a different role, you need the `TF_CONFIG` environment variable. `TF_CONFIG` is a JSON string that specifies the cluster configuration for each worker that is part of the cluster.\n", "\n", "A `TF_CONFIG` variable has two components: `'cluster'` and `'task'`.\n", "\n", "- A `'cluster'` is the same for all workers and provides information about the training cluster as a dict consisting of different types of jobs, such as `'worker'` or `'chief'`.\n", "\n", "  - In multi-worker training with `tf.distribute.MultiWorkerMirroredStrategy`, there is usually one `'worker'` that takes on extra responsibilities, such as saving checkpoints and writing summary files for TensorBoard, in addition to what a regular `'worker'` does. Such a `'worker'` is referred to as the chief worker (with the job name `'chief'`).\n", "  - It is customary for the worker with `'index'` `0` to be the `'chief'`.\n", "\n", "- A `'task'` provides information about the current task and is different for each worker. It specifies the `'type'` and `'index'` of that worker.\n", "\n", "Below is an example configuration:" ] },
{ "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:36.900833Z", "iopub.status.busy": "2022-12-14T20:00:36.900289Z", "iopub.status.idle": "2022-12-14T20:00:36.903697Z", "shell.execute_reply": "2022-12-14T20:00:36.903132Z" }, "id": "XK1eTYvSZiX7" }, "outputs": [], "source": [ "tf_config = {\n", "    'cluster': {\n", "        'worker': ['localhost:12345', 'localhost:23456']\n", "    },\n", "    'task': {'type': 'worker', 'index': 0}\n", "}" ] },
{ "cell_type": "markdown", "metadata": { "id": "JjgwJbPKZkJL" }, "source": [ "Note that `tf_config` is just a local variable in Python. To use it for the training configuration, serialize it as JSON and place it in the `'TF_CONFIG'` environment variable." ] },
{ "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:36.906726Z", "iopub.status.busy": "2022-12-14T20:00:36.906513Z", "iopub.status.idle": "2022-12-14T20:00:36.910536Z", "shell.execute_reply": "2022-12-14T20:00:36.909991Z" }, "id": "yY-T0YDQZjbu" }, "outputs": [ { "data": { "text/plain": [ "'{\"cluster\": {\"worker\": [\"localhost:12345\", \"localhost:23456\"]}, \"task\": {\"type\": \"worker\", \"index\": 0}}'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "json.dumps(tf_config)" ] },
{ "cell_type": "markdown", "metadata": { "id": "8YFpxrcsZ2xG" }, "source": [ "The example configuration above sets the task `'type'` to `'worker'` and the task `'index'` to `0`. Therefore, this machine is the *first* worker and is appointed as the `'chief'` worker.\n", "\n", "Note: Other machines also need the `TF_CONFIG` environment variable set. It should contain the same `'cluster'` dict, but a different task `'type'` or task `'index'`, depending on the roles of those machines." ] },
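{ "cell_type": "markdown", "metadata": {}, "source": [ "As a minimal sketch of the note above (not part of the original tutorial), the configuration for the second worker might look like the following. The variable name `tf_config_worker_1` is hypothetical and used only for illustration. The `'cluster'` dict is the same as in the example above, and only the task `'index'` differs." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "# Hypothetical name, used only for illustration. The second worker shares the\n", "# same 'cluster' dict as the first one but sets its task index to 1.\n", "tf_config_worker_1 = {\n", "    'cluster': {\n", "        'worker': ['localhost:12345', 'localhost:23456']\n", "    },\n", "    'task': {'type': 'worker', 'index': 1}\n", "}\n", "\n", "# Serialize to the JSON string form expected in the TF_CONFIG variable.\n", "json.dumps(tf_config_worker_1)" ] },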
{ "cell_type": "markdown", "metadata": { "id": "aogb74kHxynz" }, "source": [ "In practice, you would create multiple workers on external IP addresses/ports and set a `TF_CONFIG` variable on each worker accordingly. For illustration, this tutorial shows how to set up a `TF_CONFIG` variable with two workers on `localhost`:\n", "\n", "- The `TF_CONFIG` of the first (`'chief'`) worker is as shown above.\n", "- For the second worker, set `tf_config['task']['index']=1`." ] },
{ "cell_type": "markdown", "metadata": { "id": "cIlkfWmjz1PG" }, "source": [ "### Environment variables and subprocesses in notebooks" ] },
{ "cell_type": "markdown", "metadata": { "id": "FcjAbuGY1ACJ" }, "source": [ "Subprocesses inherit environment variables from their parent. So if you set an environment variable in this Jupyter Notebook process:" ] },
{ "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:36.914310Z", "iopub.status.busy": "2022-12-14T20:00:36.913774Z", "iopub.status.idle": "2022-12-14T20:00:36.916782Z", "shell.execute_reply": "2022-12-14T20:00:36.916185Z" }, "id": "PH2gHn2_0_U8" }, "outputs": [], "source": [ "os.environ['GREETINGS'] = 'Hello TensorFlow!'" ] },
{ "cell_type": "markdown", "metadata": { "id": "gQkIX-cg18md" }, "source": [ "...then you can access that environment variable from a subprocess:" ] },
{ "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:36.919997Z", "iopub.status.busy": "2022-12-14T20:00:36.919484Z", "iopub.status.idle": "2022-12-14T20:00:36.980843Z", "shell.execute_reply": "2022-12-14T20:00:36.979812Z" }, "id": "pquKO6IA18G5" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello TensorFlow!\n" ] } ], "source": [ "%%bash\n", "echo ${GREETINGS}" ] },
{ "cell_type": "markdown", "metadata": { "id": "af6BCA-Y2fpz" }, "source": [ "The next section uses this mechanism to pass `TF_CONFIG` to the worker subprocesses. You would never launch real jobs this way, but it is sufficient for showing a minimal multi-worker example in this tutorial." ] },
{ "cell_type": "markdown", "metadata": { "id": "dnDJmaRA9qnf" }, "source": [ "## Train the model" ] },
{ "cell_type": "markdown", "metadata": { "id": "UhNtHfuxCGVy" }, "source": [ "To train the model, first create an instance of `tf.distribute.MultiWorkerMirroredStrategy`:" ] },
{ "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:36.985049Z", "iopub.status.busy": "2022-12-14T20:00:36.984741Z", "iopub.status.idle": "2022-12-14T20:00:36.993147Z", "shell.execute_reply": "2022-12-14T20:00:36.992494Z" }, "id": "1uFSHCJXMrQ-" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "WARNING:tensorflow:Collective ops is not configured at program startup. Some performance features may not be enabled.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Single-worker MultiWorkerMirroredStrategy with local_devices = ('/device:CPU:0',), communication = CommunicationImplementation.AUTO\n" ] } ], "source": [ "strategy = tf.distribute.MultiWorkerMirroredStrategy()" ] },
{ "cell_type": "markdown", "metadata": { "id": "N0iv7SyyAohc" }, "source": [ "Note: `TF_CONFIG` is parsed and TensorFlow's GRPC servers are started when `MultiWorkerMirroredStrategy` is called, so the `TF_CONFIG` environment variable must be set before a `tf.distribute.Strategy` instance is created. Since `TF_CONFIG` is not set yet, the strategy above is effectively single-worker training." ] },
{ "cell_type": "markdown", "metadata": { "id": "H47DDcOgfzm7" }, "source": [ "Because the `tf.distribute.Strategy` API is integrated into `tf.keras`, the only change needed to distribute training to multiple workers is to enclose the model building and the `model.compile()` call inside `strategy.scope()`. The distribution strategy's scope dictates how and where the variables are created. In the case of `MultiWorkerMirroredStrategy`, the variables created are `MirroredVariable`s, and they are replicated on each of the workers.\n" ] },
{ "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:36.996395Z", "iopub.status.busy": "2022-12-14T20:00:36.996153Z", "iopub.status.idle": "2022-12-14T20:00:37.058153Z", "shell.execute_reply": "2022-12-14T20:00:37.057453Z" }, "id": "wo6b9wX65glL" }, "outputs": [], "source": [ "with strategy.scope():\n", "  # Model building/compiling need to be within `strategy.scope()`.\n", "  multi_worker_model = 
mnist_setup.build_and_compile_cnn_model()" ] }, { "cell_type": "markdown", "metadata": { "id": "Mhq3fzyR5hTw" }, "source": [ "Note: Currently, `MultiWorkerMirroredStrategy` has a limitation: TensorFlow ops must be created after the strategy instance is created. If you encounter `RuntimeError: Collective ops must be configured at program startup`, try creating the `MultiWorkerMirroredStrategy` instance at the beginning of the program and placing any code that may create ops after the strategy is instantiated." ] }, { "cell_type": "markdown", "metadata": { "id": "jfYpmIxO6Jck" }, "source": [ "To actually run with `MultiWorkerMirroredStrategy`, you need to run worker processes and pass a `TF_CONFIG` to them.\n", "\n", "Like the `mnist_setup.py` file written earlier, here is the `main.py` that each of the workers will run:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:37.062066Z", "iopub.status.busy": "2022-12-14T20:00:37.061825Z", "iopub.status.idle": "2022-12-14T20:00:37.066253Z", "shell.execute_reply": "2022-12-14T20:00:37.065633Z" }, "id": "BcsuBYrpgnlS" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Writing main.py\n" ] } ], "source": [ "%%writefile main.py\n", "\n", "import os\n", "import json\n", "\n", "import tensorflow as tf\n", "import mnist_setup\n", "\n", "per_worker_batch_size = 64\n", "tf_config = json.loads(os.environ['TF_CONFIG'])\n", "num_workers = len(tf_config['cluster']['worker'])\n", "\n", "strategy = tf.distribute.MultiWorkerMirroredStrategy()\n", "\n", "global_batch_size = per_worker_batch_size * num_workers\n", "multi_worker_dataset = mnist_setup.mnist_dataset(global_batch_size)\n", "\n", "with strategy.scope():\n", " # Model building/compiling need to be within `strategy.scope()`.\n", " multi_worker_model = mnist_setup.build_and_compile_cnn_model()\n", "\n", "\n", "multi_worker_model.fit(multi_worker_dataset, epochs=3, steps_per_epoch=70)" ] }, { "cell_type": "markdown", "metadata": { "id": "Aom9xelvJQ_6" }, "source": [ "In the code snippet above, note that the `global_batch_size`, which gets passed to `Dataset.batch`, is set to `per_worker_batch_size * 
num_workers`. This ensures that each worker processes batches of `per_worker_batch_size` examples regardless of the number of workers." ] }, { "cell_type": "markdown", "metadata": { "id": "lHLhOii67Saa" }, "source": [ "The current directory now contains both Python files:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:37.069729Z", "iopub.status.busy": "2022-12-14T20:00:37.069236Z", "iopub.status.idle": "2022-12-14T20:00:37.112066Z", "shell.execute_reply": "2022-12-14T20:00:37.111288Z" }, "id": "bi6x05Sr60O9" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "main.py\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "mnist_setup.py\n" ] } ], "source": [ "%%bash\n", "ls *.py" ] }, { "cell_type": "markdown", "metadata": { "id": "qmEEStPS6vR_" }, "source": [ "Serialize the `TF_CONFIG` to JSON and add it to the environment variables:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:37.115551Z", "iopub.status.busy": "2022-12-14T20:00:37.115261Z", "iopub.status.idle": "2022-12-14T20:00:37.119097Z", "shell.execute_reply": "2022-12-14T20:00:37.118537Z" }, "id": "9uu3g7vV7Bbt" }, "outputs": [], "source": [ "os.environ['TF_CONFIG'] = json.dumps(tf_config)" ] }, { "cell_type": "markdown", "metadata": { "id": "MsY3dQLK7jdf" }, "source": [ "Now you can launch a worker process that runs `main.py` and uses the `TF_CONFIG`:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:37.122334Z", "iopub.status.busy": "2022-12-14T20:00:37.121856Z", "iopub.status.idle": "2022-12-14T20:00:37.125252Z", "shell.execute_reply": "2022-12-14T20:00:37.124701Z" }, "id": "txMXaq8d8N_S" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "All background processes were killed.\n" ] } ], "source": [ "# first kill any previous runs\n", "%killbgscripts" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": 
"2022-12-14T20:00:37.128563Z", "iopub.status.busy": "2022-12-14T20:00:37.128073Z", "iopub.status.idle": "2022-12-14T20:00:37.163108Z", "shell.execute_reply": "2022-12-14T20:00:37.162079Z" }, "id": "qnSma_Ck7r-r" }, "outputs": [], "source": [ "%%bash --bg\n", "python main.py &> job_0.log" ] }, { "cell_type": "markdown", "metadata": { "id": "ZChyazqS7v0P" }, "source": [ "There are a few things to note about the above command:\n", "\n", "1. It uses `%%bash`, a [notebook \"magic\"](https://ipython.readthedocs.io/en/stable/interactive/magics.html), to run some bash commands.\n", "2. It uses the `--bg` flag to run the `bash` process in the background, because this worker will not terminate. It waits for all the workers before it starts.\n", "\n", "The backgrounded worker process won't print output to this notebook, so the `&>` redirects its output to a file, where you can inspect what happened.\n", "\n", "Wait a few seconds for the process to start up:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:37.167111Z", "iopub.status.busy": "2022-12-14T20:00:37.166850Z", "iopub.status.idle": "2022-12-14T20:00:47.180629Z", "shell.execute_reply": "2022-12-14T20:00:47.179904Z" }, "id": "Hm2yrULE9281" }, "outputs": [], "source": [ "import time\n", "time.sleep(10)" ] }, { "cell_type": "markdown", "metadata": { "id": "ZFPoNxg_9_Mx" }, "source": [ "Now inspect what has been written to the worker's log file so far:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:47.184828Z", "iopub.status.busy": "2022-12-14T20:00:47.184583Z", "iopub.status.idle": "2022-12-14T20:00:47.227954Z", "shell.execute_reply": "2022-12-14T20:00:47.227099Z" }, "id": "vZEOuVgQ9-hn" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2022-12-14 20:00:37.557684: E tensorflow/tsl/lib/monitoring/collection_registry.cc:81] Cannot register 2 metrics with the same name: /tensorflow/core/bfc_allocator_delay\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "2022-12-14 20:00:39.352865: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: 
CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected\n" ] } ], "source": [ "%%bash\n", "cat job_0.log" ] }, { "cell_type": "markdown", "metadata": { "id": "RqZhVF7L_KOy" }, "source": [ "The last line of the log file should say `Started server with target: grpc://localhost:12345`. The first worker is now ready and is waiting for all the other workers to be ready to proceed." ] }, { "cell_type": "markdown", "metadata": { "id": "Pi8vPNNA_l4a" }, "source": [ "Update the `tf_config` so that the second worker's process can pick it up:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:47.231661Z", "iopub.status.busy": "2022-12-14T20:00:47.231135Z", "iopub.status.idle": "2022-12-14T20:00:47.234991Z", "shell.execute_reply": "2022-12-14T20:00:47.234432Z" }, "id": "lAiYkkPu_Jqd" }, "outputs": [], "source": [ "tf_config['task']['index'] = 1\n", "os.environ['TF_CONFIG'] = json.dumps(tf_config)" ] }, { "cell_type": "markdown", "metadata": { "id": "0AshGVO0_x0w" }, "source": [ "Launch the second worker. This starts the training, since all the workers are now active (so there is no need to run this process in the background):" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:00:47.238414Z", "iopub.status.busy": "2022-12-14T20:00:47.237952Z", "iopub.status.idle": "2022-12-14T20:01:02.898795Z", "shell.execute_reply": "2022-12-14T20:01:02.897678Z" }, "id": "_ESVtyQ9_xjx" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:00:47.666158: E tensorflow/tsl/lib/monitoring/collection_registry.cc:81] Cannot register 2 metrics with the same name: /tensorflow/core/bfc_allocator_delay\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:00:49.465809: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:00:50.424484: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:784] AUTO 
sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Found an unshardable source dataset: name: \"TensorSliceDataset/_2\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "op: \"TensorSliceDataset\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "input: \"Placeholder/_0\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "input: \"Placeholder/_1\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " key: \"Toutput_types\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " list {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type: DT_FLOAT\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type: DT_INT64\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " key: \"_cardinality\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " i: 60000\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " key: \"is_files\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " b: false\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ 
"}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " key: \"metadata\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " s: \"\\n\\024TensorSliceDataset:0\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " key: \"output_shapes\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " list {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " shape {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " dim {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " size: 28\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " dim {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " size: 28\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " shape {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " key: \"replicate_on_split\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " b: false\n" ] }, { 
"name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "experimental_type {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type_id: TFT_PRODUCT\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type_id: TFT_DATASET\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type_id: TFT_PRODUCT\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type_id: TFT_TENSOR\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type_id: TFT_FLOAT\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type_id: TFT_TENSOR\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " type_id: TFT_INT64\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:00:50.702584: W tensorflow/core/framework/dataset.cc:807] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not 
implement the AsGraphDefInternal() method needed to apply optimizations.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "70/70 [==============================] - 6s 42ms/step - loss: 2.2913 - accuracy: 0.1316\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 2/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "70/70 [==============================] - 3s 41ms/step - loss: 2.2551 - accuracy: 0.2313\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 3/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "11/70 [===>..........................] 
- ETA: 2s - loss: 4.4606 - accuracy: 0.6236\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "13/70 [====>.........................] - ETA: 2s - loss: 4.4629 - accuracy: 0.6094\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "15/70 [=====>........................] - ETA: 2s - loss: 4.4646 - accuracy: 0.5875\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "17/70 [======>.......................] - ETA: 2s - loss: 4.4627 - accuracy: 0.5928\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "19/70 [=======>......................] - ETA: 2s - loss: 4.4621 - accuracy: 0.5929\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "21/70 [========>.....................] - ETA: 2s - loss: 4.4620 - accuracy: 0.5804\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "23/70 [========>.....................] - ETA: 1s - loss: 4.4601 - accuracy: 0.5849\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "25/70 [=========>....................] 
- ETA: 1s - loss: 4.4599 - accuracy: 0.5906\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "27/70 [==========>...................] - ETA: 1s - loss: 4.4586 - accuracy: 0.5961\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "29/70 [===========>..................] - ETA: 1s - loss: 4.4579 - accuracy: 0.5954\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "31/70 [============>.................] - ETA: 1s - loss: 4.4564 - accuracy: 0.5973\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "33/70 [=============>................] - ETA: 1s - loss: 4.4542 - accuracy: 0.6075\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "35/70 [==============>...............] - ETA: 1s - loss: 4.4523 - accuracy: 0.6121\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "37/70 [==============>...............] - ETA: 1s - loss: 4.4512 - accuracy: 0.6195\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "39/70 [===============>..............] 
- ETA: 1s - loss: 4.4500 - accuracy: 0.6222\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "41/70 [================>.............] - ETA: 1s - loss: 4.4488 - accuracy: 0.6235\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "43/70 [=================>............] - ETA: 1s - loss: 4.4473 - accuracy: 0.6283\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "45/70 [==================>...........] - ETA: 1s - loss: 4.4464 - accuracy: 0.6326\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "47/70 [===================>..........] - ETA: 0s - loss: 4.4450 - accuracy: 0.6386\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "49/70 [====================>.........] - ETA: 0s - loss: 4.4436 - accuracy: 0.6397\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "51/70 [====================>.........] - ETA: 0s - loss: 4.4419 - accuracy: 0.6428\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "53/70 [=====================>........] 
- ETA: 0s - loss: 4.4404 - accuracy: 0.6477\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "55/70 [======================>.......] - ETA: 0s - loss: 4.4389 - accuracy: 0.6511\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "57/70 [=======================>......] - ETA: 0s - loss: 4.4376 - accuracy: 0.6524\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "59/70 [========================>.....] - ETA: 0s - loss: 4.4356 - accuracy: 0.6573\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "61/70 [=========================>....] - ETA: 0s - loss: 4.4344 - accuracy: 0.6542\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "63/70 [==========================>...] - ETA: 0s - loss: 4.4334 - accuracy: 0.6558\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "65/70 [==========================>...] - ETA: 0s - loss: 4.4323 - accuracy: 0.6558\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "67/70 [===========================>..] 
- ETA: 0s - loss: 4.4315 - accuracy: 0.6565\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "69/70 [============================>.] - ETA: 0s - loss: 4.4296 - accuracy: 0.6603\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 3s 40ms/step - loss: 2.2144 - accuracy: 0.3311\n" ] } ], "source": [ "%%bash\n", "python main.py" ] }, { "cell_type": "markdown", "metadata": { "id": "hX4FA2O2AuAn" }, "source": [ "Rechecking the logs written by the first worker shows that it took part in training the model." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:02.902953Z", "iopub.status.busy": "2022-12-14T20:01:02.902674Z", "iopub.status.idle": "2022-12-14T20:01:02.984952Z", "shell.execute_reply": "2022-12-14T20:01:02.984046Z" }, "id": "rc6hw3yTBKXX" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2022-12-14 20:00:37.557684: E tensorflow/tsl/lib/monitoring/collection_registry.cc:81] Cannot register 2 metrics with the same name: /tensorflow/core/bfc_allocator_delay\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "2022-12-14 20:00:39.352865: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "2022-12-14 20:00:50.422449: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:784] AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Found an unshardable source dataset: name: \"TensorSliceDataset/_2\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "op: \"TensorSliceDataset\"\n" ] }, { "name": 
"stdout", "output_type": "stream", "text": [ "input: \"Placeholder/_0\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "input: \"Placeholder/_1\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " key: \"Toutput_types\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " list {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type: DT_FLOAT\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type: DT_INT64\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " key: \"_cardinality\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " i: 60000\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " key: \"is_files\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " b: false\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " key: \"metadata\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " s: 
\"\\n\\024TensorSliceDataset:0\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " key: \"output_shapes\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " list {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " shape {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dim {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " size: 28\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dim {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " size: 28\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " shape {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "attr {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " key: \"replicate_on_split\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " value {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " b: false\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "experimental_type {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type_id: TFT_PRODUCT\n" ] }, { 
"name": "stdout", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type_id: TFT_DATASET\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type_id: TFT_PRODUCT\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type_id: TFT_TENSOR\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type_id: TFT_FLOAT\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type_id: TFT_TENSOR\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " args {\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " type_id: TFT_INT64\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " }\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "}\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "2022-12-14 20:00:50.700553: W tensorflow/core/framework/dataset.cc:807] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] 
- ETA: 3:18 - loss: 4.6259 - accuracy: 0.1406\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 2/70 [..............................] - ETA: 4s - loss: 4.6227 - accuracy: 0.1406 \b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 4/70 [>.............................] - ETA: 3s - loss: 4.6283 - accuracy: 0.1367\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 6/70 [=>............................] - ETA: 3s - loss: 4.6214 - accuracy: 0.1719\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 2s - loss: 4.6199 - accuracy: 0.1699\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "10/70 [===>..........................] - ETA: 2s - loss: 4.6188 - accuracy: 0.1734\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "12/70 [====>.........................] - ETA: 2s - loss: 4.6139 - accuracy: 0.1810\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "14/70 [=====>........................] 
- ETA: 2s - loss: 4.6109 - accuracy: 0.1975\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "16/70 [=====>........................] - ETA: 2s - loss: 4.6093 - accuracy: 0.1963\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "18/70 [======>.......................] - ETA: 2s - loss: 4.6089 - accuracy: 0.2014\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "20/70 [=======>......................] - ETA: 2s - loss: 4.6070 - accuracy: 0.2039\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "22/70 [========>.....................] - ETA: 2s - loss: 4.6039 - accuracy: 0.2067\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "24/70 [=========>....................] - ETA: 1s - loss: 4.6040 - accuracy: 0.2090\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "26/70 [==========>...................] - ETA: 1s - loss: 4.6033 - accuracy: 0.2109\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "28/70 [===========>..................] 
- ETA: 1s - loss: 4.6027 - accuracy: 0.2109\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "30/70 [===========>..................] - ETA: 1s - loss: 4.6027 - accuracy: 0.2141\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "32/70 [============>.................] - ETA: 1s - loss: 4.6025 - accuracy: 0.2139\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "34/70 [=============>................] - ETA: 1s - loss: 4.6016 - accuracy: 0.2146\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "36/70 [==============>...............] - ETA: 1s - loss: 4.6006 - accuracy: 0.2170\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "38/70 [===============>..............] - ETA: 1s - loss: 4.5990 - accuracy: 0.2241\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "40/70 [================>.............] - ETA: 1s - loss: 4.5974 - accuracy: 0.2277\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "42/70 [=================>............] 
- ETA: 1s - loss: 4.5964 - accuracy: 0.2314\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "44/70 [=================>............] - ETA: 1s - loss: 4.5952 - accuracy: 0.2351\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "46/70 [==================>...........] - ETA: 1s - loss: 4.5944 - accuracy: 0.2381\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "48/70 [===================>..........] - ETA: 0s - loss: 4.5929 - accuracy: 0.2435\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "50/70 [====================>.........] - ETA: 0s - loss: 4.5921 - accuracy: 0.2444\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "52/70 [=====================>........] - ETA: 0s - loss: 4.5916 - accuracy: 0.2473\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "54/70 [======================>.......] - ETA: 0s - loss: 4.5903 - accuracy: 0.2497\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "56/70 [=======================>......] 
- ETA: 0s - loss: 4.5896 - accuracy: 0.2494\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "58/70 [=======================>......] - ETA: 0s - loss: 4.5888 - accuracy: 0.2513\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "60/70 [========================>.....] - ETA: 0s - loss: 4.5877 - accuracy: 0.2552\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "62/70 [=========================>....] - ETA: 0s - loss: 4.5869 - accuracy: 0.2566\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "64/70 [==========================>...] - ETA: 0s - loss: 4.5861 - accuracy: 0.2559\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "66/70 [===========================>..] - ETA: 0s - loss: 4.5852 - accuracy: 0.2562\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "68/70 [============================>.] 
- ETA: 0s - loss: 4.5838 - accuracy: 0.2606\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - ETA: 0s - loss: 4.5826 - accuracy: 0.2632\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 6s 42ms/step - loss: 2.2913 - accuracy: 0.1316\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 2/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 2s - loss: 4.5393 - accuracy: 0.3750\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 3/70 [>.............................] - ETA: 2s - loss: 4.5318 - accuracy: 0.4219\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 5/70 [=>............................] - ETA: 2s - loss: 4.5348 - accuracy: 0.4062\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 7/70 [==>...........................] - ETA: 2s - loss: 4.5369 - accuracy: 0.4040\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 9/70 [==>...........................] - ETA: 2s - loss: 4.5379 - accuracy: 0.4080\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "11/70 [===>..........................] 
- ETA: 2s - loss: 4.5366 - accuracy: 0.4119\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "13/70 [====>.........................] - ETA: 2s - loss: 4.5383 - accuracy: 0.4135\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "15/70 [=====>........................] - ETA: 2s - loss: 4.5378 - accuracy: 0.4062\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "17/70 [======>.......................] - ETA: 2s - loss: 4.5382 - accuracy: 0.4007\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "19/70 [=======>......................] - ETA: 2s - loss: 4.5371 - accuracy: 0.4038\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "21/70 [========>.....................] - ETA: 2s - loss: 4.5357 - accuracy: 0.4062\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "23/70 [========>.....................] - ETA: 1s - loss: 4.5343 - accuracy: 0.4062\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "25/70 [=========>....................] 
- ETA: 1s - loss: 4.5341 - accuracy: 0.4100\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "27/70 [==========>...................] - ETA: 1s - loss: 4.5334 - accuracy: 0.4126\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "29/70 [===========>..................] - ETA: 1s - loss: 4.5328 - accuracy: 0.4181\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "31/70 [============>.................] - ETA: 1s - loss: 4.5312 - accuracy: 0.4269\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "33/70 [=============>................] - ETA: 1s - loss: 4.5304 - accuracy: 0.4261\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "35/70 [==============>...............] - ETA: 1s - loss: 4.5288 - accuracy: 0.4317\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "37/70 [==============>...............] - ETA: 1s - loss: 4.5282 - accuracy: 0.4295\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "39/70 [===============>..............] 
- ETA: 1s - loss: 4.5277 - accuracy: 0.4263\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "41/70 [================>.............] - ETA: 1s - loss: 4.5270 - accuracy: 0.4303\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "43/70 [=================>............] - ETA: 1s - loss: 4.5261 - accuracy: 0.4310\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "45/70 [==================>...........] - ETA: 1s - loss: 4.5243 - accuracy: 0.4347\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "47/70 [===================>..........] - ETA: 0s - loss: 4.5230 - accuracy: 0.4372\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "49/70 [====================>.........] - ETA: 0s - loss: 4.5223 - accuracy: 0.4391\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "51/70 [====================>.........] - ETA: 0s - loss: 4.5214 - accuracy: 0.4427\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "53/70 [=====================>........] 
- ETA: 0s - loss: 4.5196 - accuracy: 0.4463\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "55/70 [======================>.......] - ETA: 0s - loss: 4.5188 - accuracy: 0.4463\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "57/70 [=======================>......] - ETA: 0s - loss: 4.5178 - accuracy: 0.4498\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "59/70 [========================>.....] - ETA: 0s - loss: 4.5168 - accuracy: 0.4518\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "61/70 [=========================>....] - ETA: 0s - loss: 4.5157 - accuracy: 0.4534\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "63/70 [==========================>...] - ETA: 0s - loss: 4.5149 - accuracy: 0.4566\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "65/70 [==========================>...] - ETA: 0s - loss: 4.5135 - accuracy: 0.4567\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "67/70 [===========================>..] 
- ETA: 0s - loss: 4.5125 - accuracy: 0.4557\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "69/70 [============================>.] - ETA: 0s - loss: 4.5109 - accuracy: 0.4617\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 3s 41ms/step - loss: 2.2551 - accuracy: 0.2313\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 3/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 2s - loss: 4.4577 - accuracy: 0.7656\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 3/70 [>.............................] - ETA: 2s - loss: 4.4632 - accuracy: 0.6458\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 5/70 [=>............................] - ETA: 2s - loss: 4.4584 - accuracy: 0.6531\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 7/70 [==>...........................] - ETA: 2s - loss: 4.4597 - accuracy: 0.6339\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 9/70 [==>...........................] - ETA: 2s - loss: 4.4619 - accuracy: 0.6163\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "11/70 [===>..........................] 
- ETA: 2s - loss: 4.4606 - accuracy: 0.6236\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "13/70 [====>.........................] - ETA: 2s - loss: 4.4629 - accuracy: 0.6094\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "15/70 [=====>........................] - ETA: 2s - loss: 4.4646 - accuracy: 0.5875\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "17/70 [======>.......................] - ETA: 2s - loss: 4.4627 - accuracy: 0.5928\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "19/70 [=======>......................] - ETA: 2s - loss: 4.4621 - accuracy: 0.5929\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "21/70 [========>.....................] - ETA: 2s - loss: 4.4620 - accuracy: 0.5804\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "23/70 [========>.....................] - ETA: 1s - loss: 4.4601 - accuracy: 0.5849\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "25/70 [=========>....................] 
- ETA: 1s - loss: 4.4599 - accuracy: 0.5906\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "27/70 [==========>...................] - ETA: 1s - loss: 4.4586 - accuracy: 0.5961\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "29/70 [===========>..................] - ETA: 1s - loss: 4.4579 - accuracy: 0.5954\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "31/70 [============>.................] - ETA: 1s - loss: 4.4564 - accuracy: 0.5973\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "33/70 [=============>................] - ETA: 1s - loss: 4.4542 - accuracy: 0.6075\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "35/70 [==============>...............] - ETA: 1s - loss: 4.4523 - accuracy: 0.6121\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "37/70 [==============>...............] - ETA: 1s - loss: 4.4512 - accuracy: 0.6195\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "39/70 [===============>..............] 
- ETA: 1s - loss: 4.4500 - accuracy: 0.6222\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "41/70 [================>.............] - ETA: 1s - loss: 4.4488 - accuracy: 0.6235\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "43/70 [=================>............] - ETA: 1s - loss: 4.4473 - accuracy: 0.6283\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "45/70 [==================>...........] - ETA: 1s - loss: 4.4464 - accuracy: 0.6326\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "47/70 [===================>..........] - ETA: 0s - loss: 4.4450 - accuracy: 0.6386\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "49/70 [====================>.........] - ETA: 0s - loss: 4.4436 - accuracy: 0.6397\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "51/70 [====================>.........] - ETA: 0s - loss: 4.4419 - accuracy: 0.6428\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "53/70 [=====================>........] 
- ETA: 0s - loss: 4.4404 - accuracy: 0.6477\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "55/70 [======================>.......] - ETA: 0s - loss: 4.4389 - accuracy: 0.6511\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "57/70 [=======================>......] - ETA: 0s - loss: 4.4376 - accuracy: 0.6524\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "59/70 [========================>.....] - ETA: 0s - loss: 4.4356 - accuracy: 0.6573\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "61/70 [=========================>....] - ETA: 0s - loss: 4.4344 - accuracy: 0.6542\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "63/70 [==========================>...] - ETA: 0s - loss: 4.4334 - accuracy: 0.6558\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "65/70 [==========================>...] - ETA: 0s - loss: 4.4323 - accuracy: 0.6558\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "67/70 [===========================>..] 
- ETA: 0s - loss: 4.4315 - accuracy: 0.6565\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "69/70 [============================>.] - ETA: 0s - loss: 4.4296 - accuracy: 0.6603\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 3s 40ms/step - loss: 2.2144 - accuracy: 0.3311\n" ] } ], "source": [ "%%bash\n", "cat job_0.log" ] }, { "cell_type": "markdown", "metadata": { "id": "zL79ak5PMzEg" }, "source": [ "Note: This may run slower than the test run at the beginning of this tutorial, because running multiple workers on a single machine only adds overhead. The goal here is not to improve training time but to give an example of multi-worker training.\n" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:03.014713Z", "iopub.status.busy": "2022-12-14T20:01:03.014437Z", "iopub.status.idle": "2022-12-14T20:01:03.019263Z", "shell.execute_reply": "2022-12-14T20:01:03.018645Z" }, "id": "sG5_1UgrgniF" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "All background processes were killed.\n" ] } ], "source": [ "# Delete the `TF_CONFIG`, and kill any background tasks so they don't affect the next section.\n", "os.environ.pop('TF_CONFIG', None)\n", "%killbgscripts" ] }, { "cell_type": "markdown", "metadata": { "id": "9j2FJVHoUIrE" }, "source": [ "## Multi-worker training in depth\n" ] }, { "cell_type": "markdown", "metadata": { "id": "C1hBks_dAZmT" }, "source": [ "So far, this tutorial has covered a basic multi-worker setup. The rest of the document looks in detail at other factors that are useful in real use cases." ] }, { "cell_type": "markdown", "metadata": { "id": "Rr14Vl9GR4zq" }, "source": [ "### Dataset sharding\n", "\n", "In multi-worker training, dataset sharding is needed to ensure convergence and performance.\n", "\n", "The example in the previous section relies on the default auto-sharding provided by the `tf.distribute.Strategy` API. You can control the sharding by setting the `tf.data.experimental.AutoShardPolicy` of the `tf.data.experimental.DistributeOptions`.\n", "\n", "To learn more about *auto-sharding*, refer to the [Distributed input guide](https://www.tensorflow.org/tutorials/distribute/input#sharding).\n", "\n", "Here is a quick example of how to turn off auto-sharding so that each replica processes every example (*not recommended*):\n" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:03.022699Z", "iopub.status.busy": "2022-12-14T20:01:03.022123Z", "iopub.status.idle": "2022-12-14T20:01:03.516674Z", "shell.execute_reply": "2022-12-14T20:01:03.515995Z" }, "id": "JxEtdh1vH-TF" }, "outputs": [], "source": [ "# Turn off auto-sharding: every replica then processes every example.\n", "options = tf.data.Options()\n", "options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.OFF\n", "\n", "global_batch_size = 64\n", "multi_worker_dataset = mnist_setup.mnist_dataset(batch_size=global_batch_size)\n", "dataset_no_auto_shard = multi_worker_dataset.with_options(options)" ] }, { "cell_type": "markdown", "metadata": { "id": "z85hElxsBQsT" }, "source": [ "### Evaluation" ] }, { "cell_type": "markdown", "metadata": { "id": "gmqvlh5LhAoU" }, "source": [ "If you also pass `validation_data` to `Model.fit`, it alternates between training and evaluation for each epoch. The evaluation is distributed across the same set of workers, and its results are aggregated and made available to all workers.\n", "\n", "As with training, the validation dataset is automatically sharded at the file level. You need to set a global batch size on the validation dataset and set `validation_steps`.\n", "\n", "A repeated dataset (created by calling `tf.data.Dataset.repeat`) is recommended for evaluation.\n", "\n", "Alternatively, you can create another task that periodically reads checkpoints and runs the evaluation. This is what Estimator does, but it is not a recommended way to perform evaluation, so its details are omitted here." ] }, { "cell_type": "markdown", "metadata": { "id": "FNkoxUPJBNTb" }, "source": [ "### Performance" ] }, { "cell_type": "markdown", "metadata": { "id": "XVk4ftYx6JAO" }, "source": [ "To tune the performance of multi-worker training, you can try the following:\n", "\n", "- `tf.distribute.MultiWorkerMirroredStrategy` provides multiple [collective communication implementations](https://www.tensorflow.org/api_docs/python/tf/distribute/experimental/CommunicationImplementation):\n", "\n", "  - 
`RING` implements ring-based collectives using gRPC as the cross-host communication layer.\n", "  - `NCCL` uses the [NVIDIA Collective Communication Library](https://developer.nvidia.com/nccl) to implement collectives.\n", "  - `AUTO` defers the choice to the runtime.\n", "\n", "  The best choice of collective implementation depends on the number of GPUs, the type of GPUs, and the network interconnect in the cluster. To override the automatic choice, specify the `communication_options` parameter of the `MultiWorkerMirroredStrategy` constructor, for example:\n", "\n", "  ```python\n", "  communication_options=tf.distribute.experimental.CommunicationOptions(implementation=tf.distribute.experimental.CommunicationImplementation.NCCL)\n", "  ```\n", "\n", "- Cast the variables to `tf.float` if possible:\n", "\n", "  - The official ResNet model includes [an example](https://github.com/tensorflow/models/blob/8367cf6dabe11adf7628541706b660821f397dce/official/resnet/resnet_model.py#L466) of how to do this." ] }, { "cell_type": "markdown", "metadata": { "id": "97WhAu8uKw3j" }, "source": [ "### Fault tolerance\n", "\n", "In synchronous training, the cluster fails if even one worker fails and no failure-recovery mechanism exists.\n", "\n", "Using Keras with `tf.distribute.Strategy` comes with the advantage of fault tolerance in cases where workers die or are otherwise unstable. This works by preserving the training state in a distributed file system of your choice, so that when an instance that previously failed or was preempted restarts, the training state is recovered.\n", "\n", "When a worker becomes unavailable, the other workers will fail (possibly after a timeout). In such cases, the unavailable worker needs to be restarted, as well as the other workers that have failed.\n", "\n", "Note: Previously, the `ModelCheckpoint` callback provided a mechanism to restore the training state when a failed multi-worker training job was restarted. The newly introduced [`BackupAndRestore`](#scrollTo=kmH8uCUhfn4w) callback also adds this support to single-worker training for a consistent experience, and the fault tolerance functionality has been removed from the existing `ModelCheckpoint` callback. From now on, applications that rely on this behavior should migrate to the new `BackupAndRestore` callback." ] }, { "cell_type": "markdown", "metadata": { "id": "KvHPjGlyyFt6" }, "source": [ "#### The `ModelCheckpoint` callback\n", "\n", "The `ModelCheckpoint` callback no longer provides fault tolerance functionality. Use the [`BackupAndRestore`](#scrollTo=kmH8uCUhfn4w) callback instead.\n", "\n", "The `ModelCheckpoint` callback can still be used to save checkpoints. With it, however, if training was interrupted or finished successfully, you are responsible for loading the model manually in order to continue training from the checkpoint.\n", "\n", "Optionally, users can choose to save and restore the model/weights outside the `ModelCheckpoint` callback." ] }, { "cell_type": "markdown", "metadata": { "id": "EUNV5Utc1d0s" }, "source": [ "### Model saving and loading\n", "\n", "To save your model using `model.save` or `tf.saved_model.save`, the saving destination needs to be different for each worker.\n", "\n", "- For non-chief workers, save the model to a temporary directory.\n", "- For the chief worker, save to the provided model directory.\n", "\n", "The temporary directories on the workers need to be unique to prevent errors caused by multiple workers trying to write to the same location.\n", "\n", "The model saved in all the directories is identical, and typically only the model saved by the chief should be referenced for restoring or serving.\n", "\n", "You should have some cleanup logic that deletes the temporary directories created by the workers once training has completed.\n", "\n", "The chief and the workers need to save at the same time because variables may be aggregated during checkpointing, which requires both the chief and the workers to participate in the allreduce communication protocol. However, letting the chief and the workers save to the same model directory results in errors due to contention.\n", "\n", "With `MultiWorkerMirroredStrategy`, the program runs on every worker. To know whether the current worker is the chief, it uses the cluster resolver object, which has the attributes `task_type` and `task_id`:\n", "\n", "- `task_type` tells you what the current job is (for example, `'worker'`).\n", "- `task_id` tells you the identifier of the worker.\n", "- The worker with `task_id == 0` is the chief worker.\n", "\n", "In the code snippet below, the `write_filepath` function provides the file path to write to, which depends on the worker's `task_id`:\n", "\n", "- For the chief worker (with `task_id == 0`), it writes to the original file path.\n", "- For other workers, it creates a temporary directory (`temp_dir`) with the `task_id` in the directory path to write to." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:03.521240Z", "iopub.status.busy": "2022-12-14T20:01:03.520990Z", "iopub.status.idle": "2022-12-14T20:01:03.526887Z", "shell.execute_reply": "2022-12-14T20:01:03.526332Z" }, "id": "XQfGkmg-pfCY" }, "outputs": [], "source": [ "model_path = '/tmp/keras-model'\n", "\n", "def _is_chief(task_type, task_id):\n", "  # Note: there are two possible `TF_CONFIG` configurations.\n", "  #   1) In addition to `worker` tasks, a `chief` task type is used;\n", "  #      in this case, this function should be modified to\n", "  #      `return task_type == 'chief'`.\n", "  #   2) Only `worker` task type is used; in this case, worker 0 is\n", "  #      regarded as the chief. 
The implementation demonstrated here\n", "  #      is for this case.\n", "  # For the purpose of this Colab section, a `task_type` of `None` is also\n", "  # treated as chief, because the notebook effectively runs with a single worker.\n", "  return (task_type == 'worker' and task_id == 0) or task_type is None\n", "\n", "def _get_temp_dir(dirpath, task_id):\n", "  # Each non-chief worker gets its own directory, keyed by `task_id`.\n", "  base_dirpath = 'workertemp_' + str(task_id)\n", "  temp_dir = os.path.join(dirpath, base_dirpath)\n", "  tf.io.gfile.makedirs(temp_dir)\n", "  return temp_dir\n", "\n", "def write_filepath(filepath, task_type, task_id):\n", "  dirpath = os.path.dirname(filepath)\n", "  base = os.path.basename(filepath)\n", "  if not _is_chief(task_type, task_id):\n", "    dirpath = _get_temp_dir(dirpath, task_id)\n", "  return os.path.join(dirpath, base)\n", "\n", "task_type, task_id = (strategy.cluster_resolver.task_type,\n", "                      strategy.cluster_resolver.task_id)\n", "write_model_path = write_filepath(model_path, task_type, task_id)" ] }, { "cell_type": "markdown", "metadata": { "id": "hs0_agYR_qKm" }, "source": [ "With that, you are ready to save:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:03.529890Z", "iopub.status.busy": "2022-12-14T20:01:03.529674Z", "iopub.status.idle": "2022-12-14T20:01:04.173885Z", "shell.execute_reply": "2022-12-14T20:01:04.173282Z" }, "id": "J-yA3BYG_vTs" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _update_step_xla while saving (showing 2 of 2). 
These functions will not be directly callable after loading.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: /tmp/keras-model/assets\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: /tmp/keras-model/assets\n" ] } ], "source": [ "multi_worker_model.save(write_model_path)" ] }, { "cell_type": "markdown", "metadata": { "id": "8LXUVVl9_v5x" }, "source": [ "As described above, the model should later be loaded only from the path the chief saved to, so remove the temporary copies saved by the non-chief workers:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:04.178032Z", "iopub.status.busy": "2022-12-14T20:01:04.177760Z", "iopub.status.idle": "2022-12-14T20:01:04.181378Z", "shell.execute_reply": "2022-12-14T20:01:04.180727Z" }, "id": "aJTyu-97ABpY" }, "outputs": [], "source": [ "if not _is_chief(task_type, task_id):\n", "  tf.io.gfile.rmtree(os.path.dirname(write_model_path))" ] }, { "cell_type": "markdown", "metadata": { "id": "Nr-2PKlHAPBT" }, "source": [ "When it is time to load, use the convenient `tf.keras.models.load_model` API and continue with further work.\n", "\n", "Here, assume that only a single worker is used to load the model and continue training. In that case, you do not call `tf.keras.models.load_model` within another `strategy.scope()` (note that `strategy = tf.distribute.MultiWorkerMirroredStrategy()`, as defined earlier):" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:04.184452Z", "iopub.status.busy": "2022-12-14T20:01:04.183936Z", "iopub.status.idle": "2022-12-14T20:01:05.143225Z", "shell.execute_reply": "2022-12-14T20:01:05.142530Z" }, "id": "iUZna-JKAOrX" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/20 [>.............................] 
- ETA: 8s - loss: 2.2967 - accuracy: 0.1562" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 9/20 [============>.................] - ETA: 0s - loss: 2.2984 - accuracy: 0.2066" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "17/20 [========================>.....] - ETA: 0s - loss: 2.2930 - accuracy: 0.2013" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "20/20 [==============================] - 1s 7ms/step - loss: 2.2914 - accuracy: 0.2000\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 2/2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/20 [>.............................] - ETA: 0s - loss: 2.2763 - accuracy: 0.1875" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 9/20 [============>.................] - ETA: 0s - loss: 2.2764 - accuracy: 0.2274" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "17/20 [========================>.....] 
- ETA: 0s - loss: 2.2770 - accuracy: 0.2417" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "20/20 [==============================] - 0s 7ms/step - loss: 2.2754 - accuracy: 0.2422\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loaded_model = tf.keras.models.load_model(model_path)\n", "\n", "# Now that the model is restored, you can continue with the training.\n", "loaded_model.fit(single_worker_dataset, epochs=2, steps_per_epoch=20)" ] }, { "cell_type": "markdown", "metadata": { "id": "YJ1fmxmTpocS" }, "source": [ "### Checkpoint saving and restoring\n", "\n", "Checkpointing, on the other hand, lets you save your model's weights and restore them without having to save the whole model.\n", "\n", "Here, you create one `tf.train.Checkpoint` that tracks the model. It is managed by a `tf.train.CheckpointManager`, so that only the latest checkpoint is preserved:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:05.146954Z", "iopub.status.busy": "2022-12-14T20:01:05.146339Z", "iopub.status.idle": "2022-12-14T20:01:05.150732Z", "shell.execute_reply": "2022-12-14T20:01:05.150146Z" }, "id": "_1-RYaB5xnNH" }, "outputs": [], "source": [ "checkpoint_dir = '/tmp/ckpt'\n", "\n", "checkpoint = tf.train.Checkpoint(model=multi_worker_model)\n", "write_checkpoint_dir = write_filepath(checkpoint_dir, task_type, task_id)\n", "checkpoint_manager = tf.train.CheckpointManager(\n", "    checkpoint, directory=write_checkpoint_dir, max_to_keep=1)" ] }, { "cell_type": "markdown", "metadata": { "id": "7oBpPCRsW1MF" }, "source": [ "Once the `CheckpointManager` is set up, save a checkpoint and remove the checkpoints saved by the non-chief workers:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:05.153824Z", "iopub.status.busy": "2022-12-14T20:01:05.153360Z", 
"iopub.status.idle": "2022-12-14T20:01:05.181450Z", "shell.execute_reply": "2022-12-14T20:01:05.180882Z" }, "id": "l1ZXG_GbWzLp" }, "outputs": [], "source": [ "checkpoint_manager.save()\n", "if not _is_chief(task_type, task_id):\n", "  tf.io.gfile.rmtree(write_checkpoint_dir)" ] }, { "cell_type": "markdown", "metadata": { "id": "RO7cbN40XD5v" }, "source": [ "Now, when you need to restore, you can find the latest saved checkpoint with the convenient `tf.train.latest_checkpoint` function. After the checkpoint is restored, you can continue with training." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:05.184773Z", "iopub.status.busy": "2022-12-14T20:01:05.184542Z", "iopub.status.idle": "2022-12-14T20:01:08.436055Z", "shell.execute_reply": "2022-12-14T20:01:08.435239Z" }, "id": "NJW7vtknXFEH" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:01:05.395997: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:784] AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Found an unshardable source dataset: name: \"TensorSliceDataset/_2\"\n", "op: \"TensorSliceDataset\"\n", "input: \"Placeholder/_0\"\n", "input: \"Placeholder/_1\"\n", "attr {\n", " key: \"Toutput_types\"\n", " value {\n", " list {\n", " type: DT_FLOAT\n", " type: DT_INT64\n", " }\n", " }\n", "}\n", "attr {\n", " key: \"_cardinality\"\n", " value {\n", " i: 60000\n", " }\n", "}\n", "attr {\n", " key: \"is_files\"\n", " value {\n", " b: false\n", " }\n", "}\n", "attr {\n", " key: \"metadata\"\n", " value {\n", " s: \"\\n\\024TensorSliceDataset:5\"\n", " }\n", "}\n", "attr {\n", " key: \"output_shapes\"\n", " value {\n", " list {\n", " shape {\n", " dim {\n", " size: 28\n", " }\n", " dim {\n", " size: 28\n", " }\n", " }\n", " shape {\n", " }\n", " }\n", " }\n", "}\n", "attr {\n", " key: \"replicate_on_split\"\n", " value {\n", " b: false\n", " }\n", "}\n", "experimental_type {\n", " type_id: 
TFT_PRODUCT\n", " args {\n", " type_id: TFT_DATASET\n", " args {\n", " type_id: TFT_PRODUCT\n", " args {\n", " type_id: TFT_TENSOR\n", " args {\n", " type_id: TFT_FLOAT\n", " }\n", " }\n", " args {\n", " type_id: TFT_TENSOR\n", " args {\n", " type_id: TFT_INT64\n", " }\n", " }\n", " }\n", " }\n", "}\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:01:05.654507: W tensorflow/core/framework/dataset.cc:807] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/20 [>.............................] - ETA: 46s - loss: 2.3147 - accuracy: 0.2969" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 9/20 [============>.................] - ETA: 0s - loss: 2.2893 - accuracy: 0.2101 " ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "16/20 [=======================>......] - ETA: 0s - loss: 2.2933 - accuracy: 0.2236" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "20/20 [==============================] - 3s 7ms/step - loss: 2.2941 - accuracy: 0.2281\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 2/2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/20 [>.............................] 
- ETA: 0s - loss: 2.2859 - accuracy: 0.2969" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 9/20 [============>.................] - ETA: 0s - loss: 2.2849 - accuracy: 0.2552" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "17/20 [========================>.....] - ETA: 0s - loss: 2.2811 - accuracy: 0.2629" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "20/20 [==============================] - 0s 7ms/step - loss: 2.2783 - accuracy: 0.2648\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "latest_checkpoint = tf.train.latest_checkpoint(checkpoint_dir)\n", "checkpoint.restore(latest_checkpoint)\n", "multi_worker_model.fit(multi_worker_dataset, epochs=2, steps_per_epoch=20)" ] }, { "cell_type": "markdown", "metadata": { "id": "kmH8uCUhfn4w" }, "source": [ "#### The `BackupAndRestore` callback\n", "\n", "The `tf.keras.callbacks.BackupAndRestore` callback provides the fault tolerance functionality by backing up the model and the current epoch number in a temporary checkpoint file under the directory given by the `backup_dir` argument of `BackupAndRestore`.\n", "\n", "Note: In TensorFlow 2.9, the current model and the training state are backed up at epoch boundaries. In the `tf-nightly` version and from TensorFlow 2.10 onward, the `BackupAndRestore` callback can back up the model and the training state at epoch or step boundaries. `BackupAndRestore` accepts an optional `save_freq` argument, which takes either `'epoch'` or an `int` value. If `save_freq` is set to `'epoch'`, the model is backed up after every epoch. If `save_freq` is set to an integer value greater than `0`, the model is backed up after every `save_freq` batches.\n", 
"\n", "Once a job is interrupted and restarted, the `BackupAndRestore` callback restores the last checkpoint, and training can continue from the beginning of the epoch and step at which the training state was last saved.\n", "\n", "To use it, provide an instance of `tf.keras.callbacks.BackupAndRestore` at the `Model.fit` call.\n", "\n", "With `MultiWorkerMirroredStrategy`, if a worker is interrupted, the whole cluster pauses until the interrupted worker is restarted. The other workers also restart, and the interrupted worker rejoins the cluster. Every worker then reads the previously saved checkpoint file and picks up its former state, which brings the cluster back in sync. Training then continues. The distributed dataset iterator state is re-initialized and not restored.\n", "\n", "The `BackupAndRestore` callback uses the `CheckpointManager` to save and restore the training state, which generates a file called `checkpoint` that tracks existing checkpoints together with the latest one. For this reason, `backup_dir` should not be reused to store other checkpoints, in order to avoid name collisions.\n", "\n", "Currently, the `BackupAndRestore` callback supports single-worker training with no strategy or with `MirroredStrategy`, and multi-worker training with `MultiWorkerMirroredStrategy`.\n", "\n", "Below are two examples, one for multi-worker training and one for single-worker training:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:08.440023Z", "iopub.status.busy": "2022-12-14T20:01:08.439301Z", "iopub.status.idle": "2022-12-14T20:01:13.203377Z", "shell.execute_reply": "2022-12-14T20:01:13.202719Z" }, "id": "CYdzZi4Qs1jz" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:01:08.689450: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:784] AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Found an unshardable source dataset: name: \"TensorSliceDataset/_2\"\n", "op: \"TensorSliceDataset\"\n", "input: \"Placeholder/_0\"\n", "input: \"Placeholder/_1\"\n", "attr {\n", " key: \"Toutput_types\"\n", " value {\n", " list {\n", " type: DT_FLOAT\n", " type: DT_INT64\n", " }\n", " }\n", "}\n", "attr {\n", " key: \"_cardinality\"\n", " value {\n", " i: 60000\n", " }\n", "}\n", "attr {\n", " key: 
\"is_files\"\n", " value {\n", " b: false\n", " }\n", "}\n", "attr {\n", " key: \"metadata\"\n", " value {\n", " s: \"\\n\\024TensorSliceDataset:5\"\n", " }\n", "}\n", "attr {\n", " key: \"output_shapes\"\n", " value {\n", " list {\n", " shape {\n", " dim {\n", " size: 28\n", " }\n", " dim {\n", " size: 28\n", " }\n", " }\n", " shape {\n", " }\n", " }\n", " }\n", "}\n", "attr {\n", " key: \"replicate_on_split\"\n", " value {\n", " b: false\n", " }\n", "}\n", "experimental_type {\n", " type_id: TFT_PRODUCT\n", " args {\n", " type_id: TFT_DATASET\n", " args {\n", " type_id: TFT_PRODUCT\n", " args {\n", " type_id: TFT_TENSOR\n", " args {\n", " type_id: TFT_FLOAT\n", " }\n", " }\n", " args {\n", " type_id: TFT_TENSOR\n", " args {\n", " type_id: TFT_INT64\n", " }\n", " }\n", " }\n", " }\n", "}\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 2:50 - loss: 2.2816 - accuracy: 0.1875" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 0s - loss: 2.2997 - accuracy: 0.1426 " ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "16/70 [=====>........................] - ETA: 0s - loss: 2.2916 - accuracy: 0.1836" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "23/70 [========>.....................] 
- ETA: 0s - loss: 2.2913 - accuracy: 0.1787" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "31/70 [============>.................] - ETA: 0s - loss: 2.2872 - accuracy: 0.1870" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "38/70 [===============>..............] - ETA: 0s - loss: 2.2843 - accuracy: 0.1924" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "45/70 [==================>...........] - ETA: 0s - loss: 2.2807 - accuracy: 0.2010" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "53/70 [=====================>........] - ETA: 0s - loss: 2.2774 - accuracy: 0.2114" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "60/70 [========================>.....] - ETA: 0s - loss: 2.2744 - accuracy: 0.2164" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "67/70 [===========================>..] 
- ETA: 0s - loss: 2.2713 - accuracy: 0.2225" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 3s 8ms/step - loss: 2.2696 - accuracy: 0.2268\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 2/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 0s - loss: 2.2675 - accuracy: 0.2344" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 0s - loss: 2.2326 - accuracy: 0.3008" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "15/70 [=====>........................] - ETA: 0s - loss: 2.2340 - accuracy: 0.2865" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "23/70 [========>.....................] - ETA: 0s - loss: 2.2318 - accuracy: 0.2779" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "30/70 [===========>..................] 
- ETA: 0s - loss: 2.2283 - accuracy: 0.2818" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "37/70 [==============>...............] - ETA: 0s - loss: 2.2243 - accuracy: 0.2884" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "44/70 [=================>............] - ETA: 0s - loss: 2.2215 - accuracy: 0.2884" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "51/70 [====================>.........] - ETA: 0s - loss: 2.2192 - accuracy: 0.2920" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "59/70 [========================>.....] - ETA: 0s - loss: 2.2151 - accuracy: 0.2974" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "66/70 [===========================>..] 
- ETA: 0s - loss: 2.2118 - accuracy: 0.3004" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 1s 8ms/step - loss: 2.2096 - accuracy: 0.3013\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 3/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 0s - loss: 2.1476 - accuracy: 0.4062" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 0s - loss: 2.1671 - accuracy: 0.3555" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "15/70 [=====>........................] - ETA: 0s - loss: 2.1607 - accuracy: 0.3865" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "21/70 [========>.....................] - ETA: 0s - loss: 2.1584 - accuracy: 0.3914" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "28/70 [===========>..................] 
- ETA: 0s - loss: 2.1527 - accuracy: 0.4035" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "35/70 [==============>...............] - ETA: 0s - loss: 2.1528 - accuracy: 0.4009" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "42/70 [=================>............] - ETA: 0s - loss: 2.1497 - accuracy: 0.4077" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "50/70 [====================>.........] - ETA: 0s - loss: 2.1426 - accuracy: 0.4150" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "57/70 [=======================>......] - ETA: 0s - loss: 2.1375 - accuracy: 0.4238" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "64/70 [==========================>...] 
- ETA: 0s - loss: 2.1329 - accuracy: 0.4302" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 1s 8ms/step - loss: 2.1317 - accuracy: 0.4317\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Multi-worker training with `MultiWorkerMirroredStrategy`\n", "# and the `BackupAndRestore` callback. The training state\n", "# is backed up at epoch boundaries by default.\n", "\n", "callbacks = [tf.keras.callbacks.BackupAndRestore(backup_dir='/tmp/backup')]\n", "with strategy.scope():\n", "  multi_worker_model = mnist_setup.build_and_compile_cnn_model()\n", "multi_worker_model.fit(multi_worker_dataset,\n", "                       epochs=3,\n", "                       steps_per_epoch=70,\n", "                       callbacks=callbacks)" ] }, { "cell_type": "markdown", "metadata": { "id": "f8e86TAp0Rsl" }, "source": [ "If the `save_freq` argument of the `BackupAndRestore` callback is set to `'epoch'`, the model is backed up after every epoch." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:13.207204Z", "iopub.status.busy": "2022-12-14T20:01:13.206497Z", "iopub.status.idle": "2022-12-14T20:01:17.954330Z", "shell.execute_reply": "2022-12-14T20:01:17.953660Z" }, "id": "rZjQGPsF0aEI" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:01:13.452703: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:784] AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Found an unshardable source dataset: name: \"TensorSliceDataset/_2\"\n", "op: \"TensorSliceDataset\"\n", "input: \"Placeholder/_0\"\n", "input: \"Placeholder/_1\"\n", "attr {\n", " key: \"Toutput_types\"\n", " value {\n", " list {\n", " type: 
DT_FLOAT\n", " type: DT_INT64\n", " }\n", " }\n", "}\n", "attr {\n", " key: \"_cardinality\"\n", " value {\n", " i: 60000\n", " }\n", "}\n", "attr {\n", " key: \"is_files\"\n", " value {\n", " b: false\n", " }\n", "}\n", "attr {\n", " key: \"metadata\"\n", " value {\n", " s: \"\\n\\024TensorSliceDataset:5\"\n", " }\n", "}\n", "attr {\n", " key: \"output_shapes\"\n", " value {\n", " list {\n", " shape {\n", " dim {\n", " size: 28\n", " }\n", " dim {\n", " size: 28\n", " }\n", " }\n", " shape {\n", " }\n", " }\n", " }\n", "}\n", "attr {\n", " key: \"replicate_on_split\"\n", " value {\n", " b: false\n", " }\n", "}\n", "experimental_type {\n", " type_id: TFT_PRODUCT\n", " args {\n", " type_id: TFT_DATASET\n", " args {\n", " type_id: TFT_PRODUCT\n", " args {\n", " type_id: TFT_TENSOR\n", " args {\n", " type_id: TFT_FLOAT\n", " }\n", " }\n", " args {\n", " type_id: TFT_TENSOR\n", " args {\n", " type_id: TFT_INT64\n", " }\n", " }\n", " }\n", " }\n", "}\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 2:51 - loss: 2.3035 - accuracy: 0.0312" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 0s - loss: 2.2948 - accuracy: 0.0957 " ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "15/70 [=====>........................] 
- ETA: 0s - loss: 2.2944 - accuracy: 0.1104" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "22/70 [========>.....................] - ETA: 0s - loss: 2.2933 - accuracy: 0.1207" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "29/70 [===========>..................] - ETA: 0s - loss: 2.2901 - accuracy: 0.1347" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "36/70 [==============>...............] - ETA: 0s - loss: 2.2879 - accuracy: 0.1376" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "44/70 [=================>............] - ETA: 0s - loss: 2.2855 - accuracy: 0.1442" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "51/70 [====================>.........] - ETA: 0s - loss: 2.2834 - accuracy: 0.1517" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "58/70 [=======================>......] 
- ETA: 0s - loss: 2.2817 - accuracy: 0.1571" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "65/70 [==========================>...] - ETA: 0s - loss: 2.2799 - accuracy: 0.1654" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 3s 8ms/step - loss: 2.2786 - accuracy: 0.1719\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 2/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 0s - loss: 2.2618 - accuracy: 0.2812" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 0s - loss: 2.2543 - accuracy: 0.2734" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "15/70 [=====>........................] - ETA: 0s - loss: 2.2508 - accuracy: 0.2990" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "22/70 [========>.....................] 
- ETA: 0s - loss: 2.2485 - accuracy: 0.3047" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "30/70 [===========>..................] - ETA: 0s - loss: 2.2460 - accuracy: 0.3276" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "37/70 [==============>...............] - ETA: 0s - loss: 2.2438 - accuracy: 0.3399" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "45/70 [==================>...........] - ETA: 0s - loss: 2.2409 - accuracy: 0.3497" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "52/70 [=====================>........] - ETA: 0s - loss: 2.2384 - accuracy: 0.3645" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "59/70 [========================>.....] - ETA: 0s - loss: 2.2358 - accuracy: 0.3761" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "67/70 [===========================>..] 
- ETA: 0s - loss: 2.2326 - accuracy: 0.3906" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 1s 8ms/step - loss: 2.2316 - accuracy: 0.3955\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 3/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 0s - loss: 2.2178 - accuracy: 0.4688" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 0s - loss: 2.2081 - accuracy: 0.4844" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "16/70 [=====>........................] - ETA: 0s - loss: 2.2045 - accuracy: 0.4980" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "24/70 [=========>....................] - ETA: 0s - loss: 2.2011 - accuracy: 0.4974" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "32/70 [============>.................] 
- ETA: 0s - loss: 2.1991 - accuracy: 0.5039" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "39/70 [===============>..............] - ETA: 0s - loss: 2.1970 - accuracy: 0.5096" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "47/70 [===================>..........] - ETA: 0s - loss: 2.1931 - accuracy: 0.5246" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "53/70 [=====================>........] - ETA: 0s - loss: 2.1909 - accuracy: 0.5283" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "60/70 [========================>.....] - ETA: 0s - loss: 2.1870 - accuracy: 0.5333" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "68/70 [============================>.] 
- ETA: 0s - loss: 2.1838 - accuracy: 0.5384" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 1s 8ms/step - loss: 2.1827 - accuracy: 0.5406\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The training state is backed up at epoch boundaries because `save_freq`\n", "# is not passed here and defaults to `'epoch'`.\n", "\n", "callbacks = [tf.keras.callbacks.BackupAndRestore(backup_dir='/tmp/backup')]\n", "with strategy.scope():\n", "  multi_worker_model = mnist_setup.build_and_compile_cnn_model()\n", "multi_worker_model.fit(multi_worker_dataset,\n", "                       epochs=3,\n", "                       steps_per_epoch=70,\n", "                       callbacks=callbacks)\n" ] }, { "cell_type": "markdown", "metadata": { "id": "p-r44kCM0jc6" }, "source": [ "Note: The next code block uses features that are only available in `tf-nightly` until TensorFlow 2.10 is released.\n", "\n", "If the `save_freq` argument of the `BackupAndRestore` callback is set to an integer greater than `0`, the model is backed up after every `save_freq` batches." ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "execution": { "iopub.execute_input": "2022-12-14T20:01:17.958338Z", "iopub.status.busy": "2022-12-14T20:01:17.957679Z", "iopub.status.idle": "2022-12-14T20:01:22.885253Z", "shell.execute_reply": "2022-12-14T20:01:22.884536Z" }, "id": "bSJUyLSF0moC" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-12-14 20:01:18.217652: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:784] AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Found an unshardable source dataset: name: \"TensorSliceDataset/_2\"\n", "op: \"TensorSliceDataset\"\n", "input: \"Placeholder/_0\"\n", "input: \"Placeholder/_1\"\n", "attr {\n", " key: \"Toutput_types\"\n", " value 
{\n", " list {\n", " type: DT_FLOAT\n", " type: DT_INT64\n", " }\n", " }\n", "}\n", "attr {\n", " key: \"_cardinality\"\n", " value {\n", " i: 60000\n", " }\n", "}\n", "attr {\n", " key: \"is_files\"\n", " value {\n", " b: false\n", " }\n", "}\n", "attr {\n", " key: \"metadata\"\n", " value {\n", " s: \"\\n\\024TensorSliceDataset:5\"\n", " }\n", "}\n", "attr {\n", " key: \"output_shapes\"\n", " value {\n", " list {\n", " shape {\n", " dim {\n", " size: 28\n", " }\n", " dim {\n", " size: 28\n", " }\n", " }\n", " shape {\n", " }\n", " }\n", " }\n", "}\n", "attr {\n", " key: \"replicate_on_split\"\n", " value {\n", " b: false\n", " }\n", "}\n", "experimental_type {\n", " type_id: TFT_PRODUCT\n", " args {\n", " type_id: TFT_DATASET\n", " args {\n", " type_id: TFT_PRODUCT\n", " args {\n", " type_id: TFT_TENSOR\n", " args {\n", " type_id: TFT_FLOAT\n", " }\n", " }\n", " args {\n", " type_id: TFT_TENSOR\n", " args {\n", " type_id: TFT_INT64\n", " }\n", " }\n", " }\n", " }\n", "}\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 2:50 - loss: 2.3173 - accuracy: 0.0469" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 0s - loss: 2.2996 - accuracy: 0.1055 " ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "15/70 [=====>........................] 
- ETA: 0s - loss: 2.3000 - accuracy: 0.0958" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "23/70 [========>.....................] - ETA: 0s - loss: 2.2959 - accuracy: 0.1101" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "30/70 [===========>..................] - ETA: 0s - loss: 2.2928 - accuracy: 0.1146" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "38/70 [===============>..............] - ETA: 0s - loss: 2.2882 - accuracy: 0.1271" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "45/70 [==================>...........] - ETA: 0s - loss: 2.2842 - accuracy: 0.1417" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "52/70 [=====================>........] - ETA: 0s - loss: 2.2799 - accuracy: 0.1599" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "60/70 [========================>.....] 
- ETA: 0s - loss: 2.2760 - accuracy: 0.1784" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "68/70 [============================>.] - ETA: 0s - loss: 2.2719 - accuracy: 0.1923" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 3s 9ms/step - loss: 2.2709 - accuracy: 0.1958\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 2/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 0s - loss: 2.2521 - accuracy: 0.2188" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 8/70 [==>...........................] - ETA: 0s - loss: 2.2333 - accuracy: 0.3555" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "16/70 [=====>........................] - ETA: 0s - loss: 2.2245 - accuracy: 0.3994" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "20/70 [=======>......................] 
- ETA: 0s - loss: 2.2238 - accuracy: 0.4008" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "27/70 [==========>...................] - ETA: 0s - loss: 2.2196 - accuracy: 0.4144" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "35/70 [==============>...............] - ETA: 0s - loss: 2.2144 - accuracy: 0.4281" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "43/70 [=================>............] - ETA: 0s - loss: 2.2089 - accuracy: 0.4488" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "50/70 [====================>.........] - ETA: 0s - loss: 2.2049 - accuracy: 0.4594" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "58/70 [=======================>......] - ETA: 0s - loss: 2.2003 - accuracy: 0.4731" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "66/70 [===========================>..] 
- ETA: 0s - loss: 2.1957 - accuracy: 0.4863" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 1s 9ms/step - loss: 2.1935 - accuracy: 0.4922\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 3/3\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/70 [..............................] - ETA: 0s - loss: 2.1483 - accuracy: 0.5781" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 9/70 [==>...........................] - ETA: 0s - loss: 2.1457 - accuracy: 0.5938" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "10/70 [===>..........................] - ETA: 0s - loss: 2.1435 - accuracy: 0.6031" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "17/70 [======>.......................] - ETA: 0s - loss: 2.1372 - accuracy: 0.6176" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "25/70 [=========>....................] 
- ETA: 0s - loss: 2.1327 - accuracy: 0.6144" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "33/70 [=============>................] - ETA: 0s - loss: 2.1262 - accuracy: 0.6226" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "40/70 [================>.............] - ETA: 0s - loss: 2.1220 - accuracy: 0.6285" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "48/70 [===================>..........] - ETA: 0s - loss: 2.1151 - accuracy: 0.6423" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "55/70 [======================>.......] - ETA: 0s - loss: 2.1101 - accuracy: 0.6460" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "62/70 [=========================>....] 
- ETA: 0s - loss: 2.1064 - accuracy: 0.6454" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - ETA: 0s - loss: 2.1001 - accuracy: 0.6498" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "70/70 [==============================] - 1s 10ms/step - loss: 2.1001 - accuracy: 0.6498\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The training state is backed up every 30 steps because `save_freq` is set\n", "# to the integer value `30`.\n", "\n", "callbacks = [tf.keras.callbacks.BackupAndRestore(backup_dir='/tmp/backup', save_freq=30)]\n", "with strategy.scope():\n", "  multi_worker_model = mnist_setup.build_and_compile_cnn_model()\n", "multi_worker_model.fit(multi_worker_dataset,\n", "                       epochs=3,\n", "                       steps_per_epoch=70,\n", "                       callbacks=callbacks)" ] }, { "cell_type": "markdown", "metadata": { "id": "rIV5_3ebzXmB" }, "source": [ "If you inspect the `backup_dir` directory that you specified in `BackupAndRestore`, you may notice some temporarily generated checkpoint files. These files are needed to recover previously lost instances, and the library removes them at the end of `Model.fit` once training exits successfully.\n", "\n", "Note: Currently, the `BackupAndRestore` callback only supports eager mode. In graph mode, consider using `Model.save`/`tf.saved_model.save` and `tf.keras.models.load_model` to save and restore the model, respectively, as described in the *Model saving and loading* section above, and provide `initial_epoch` to `Model.fit` when you resume training." ] }, { "cell_type": "markdown", "metadata": { "id": "ega2hdOQEmy_" }, "source": [ "## Additional resources\n", "\n", "1. The [Distributed training in TensorFlow](../../guide/distributed_training.ipynb) guide provides an overview of the available distribution strategies.\n",
"2. The [Custom training loop with Keras and MultiWorkerMirroredStrategy](multi_worker_with_ctl.ipynb) tutorial shows how to use `MultiWorkerMirroredStrategy` with Keras and a custom training loop.\n", "3. Check out the [official models](https://github.com/tensorflow/models/tree/master/official), many of which can be configured to run multiple distribution strategies.\n", "4. The [Better performance with tf.function](../../guide/function.ipynb) guide provides information about other strategies and tools, such as the [TensorFlow Profiler](../../guide/profiler.md), that you can use to optimize the performance of your TensorFlow models." ] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "multi_worker_with_keras.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 0 }