{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "rK1pP01MMuU1" }, "source": [ "##### Copyright 2020 The TensorFlow Authors." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "cellView": "form", "execution": { "iopub.execute_input": "2022-08-09T05:28:41.908176Z", "iopub.status.busy": "2022-08-09T05:28:41.907753Z", "iopub.status.idle": "2022-08-09T05:28:41.911494Z", "shell.execute_reply": "2022-08-09T05:28:41.910950Z" }, "id": "gtl722MvjuSf" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "F9AnjBfz22gq" }, "source": [ "# Save and load Keras models" ] }, { "cell_type": "markdown", "metadata": { "id": "TrNGttwSFElt" }, "source": [ "\n", " \n", " \n", " \n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "PlYTaLGGOlmx" }, "source": [ "## はじめに\n", "\n", "Keras モデルは以下の複数のコンポーネントで構成されています。\n", "\n", "- アーキテクチャー/構成(モデルに含まれるレイヤーとそれらの接続方法を指定する)\n", "- 重み値のセット(「モデルの状態」)\n", "- オプティマイザ(モデルのコンパイルで定義する)\n", "- 損失とメトリックのセット(モデルのコンパイルで定義するか、`add_loss()`または`add_metric()`を呼び出して定義する)\n", "\n", "Keras API を使用すると、これらを一度にディスクに保存したり、一部のみを選択して保存できます。\n", "\n", "- すべてを TensorFlow SavedModel 形式(または古い Keras H5 形式)で1つのアーカイブに保存。これは標準的な方法です。\n", "- アーキテクチャ/構成のみを(通常、JSON ファイルとして)保存。\n", "- 重み値のみを保存。(通常、モデルのトレーニング時に使用)。\n", "\n", "では、次にこれらのオプションの用途と機能をそれぞれ見ていきましょう。" ] }, { "cell_type": "markdown", "metadata": { "id": "EKhPbck9E82N" }, "source": [ "## 保存と読み込みに関する簡単な説明\n", "\n", "このガイドを読む時間が 10 秒しかない場合は、次のことを知っておく必要があります。\n", "\n", "**Keras モデルの保存**\n", "\n", "```python\n", "model = ... # Get model (Sequential, Functional Model, or Model subclass) model.save('path/to/location')\n", "```\n", "\n", "**モデルの再読み込み**\n", "\n", "```python\n", "from tensorflow import keras model = keras.models.load_model('path/to/location')\n", "```\n", "\n", "では、詳細を見てみましょう。" ] }, { "cell_type": "markdown", "metadata": { "id": "bT80eTSUngCU" }, "source": [ "## セットアップ" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:41.915284Z", "iopub.status.busy": "2022-08-09T05:28:41.914736Z", "iopub.status.idle": "2022-08-09T05:28:43.730803Z", "shell.execute_reply": "2022-08-09T05:28:43.730124Z" }, "id": "BallmpGiEbXD" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-08-09 05:28:42.265258: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-08-09 05:28:42.817786: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: 
libnvrtc.so.11.1: cannot open shared object file: No such file or directory\n", "2022-08-09 05:28:42.818021: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvrtc.so.11.1: cannot open shared object file: No such file or directory\n", "2022-08-09 05:28:42.818033: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n" ] } ], "source": [ "import numpy as np\n", "import tensorflow as tf\n", "from tensorflow import keras" ] }, { "cell_type": "markdown", "metadata": { "id": "rZ6eEK8ekthu" }, "source": [ "## Whole-model saving and loading\n", "\n", "You can save an entire model as a single artifact. It will include:\n", "\n", "- The model's architecture/configuration\n", "- The model's weight values (which were learned during training)\n", "- The model's compilation information (if `compile()` was called)\n", "- The optimizer and its state, if any (this lets you resume training where you left off)\n", "\n", "#### APIs\n", "\n", "- `model.save()` or `tf.keras.models.save_model()`\n", "- `tf.keras.models.load_model()`\n", "\n", "There are two formats you can use to save an entire model to disk: **the TensorFlow SavedModel format** and **the older Keras H5 format**. The recommended format is SavedModel. It is the default when you use `model.save()`.\n", "\n", "You can switch to the H5 format by:\n", "\n", "- Passing `save_format='h5'` to `save()`.\n", "- Passing a filename that ends in `.h5` or `.keras` to `save()`." ] }, { "cell_type": "markdown", "metadata": { "id": "HUg5WkAAObZn" }, "source": [ "### SavedModel format\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:43.734955Z", "iopub.status.busy": "2022-08-09T05:28:43.734584Z", "iopub.status.idle": "2022-08-09T05:28:48.339927Z", "shell.execute_reply": "2022-08-09T05:28:48.339238Z" }, "id": "MsqSBTGkkGma" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/4 [======>.......................] 
- ETA: 1s - loss: 0.2713" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "4/4 [==============================] - 0s 2ms/step - loss: 0.2990\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: my_model/assets\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/4 [======>.......................] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "4/4 [==============================] - 0s 2ms/step\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/4 [======>.......................] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "4/4 [==============================] - 0s 2ms/step\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/4 [======>.......................] 
- ETA: 0s - loss: 0.3890" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "4/4 [==============================] - 0s 2ms/step - loss: 0.2952\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def get_model():\n", "    # Create a simple model.\n", "    inputs = keras.Input(shape=(32,))\n", "    outputs = keras.layers.Dense(1)(inputs)\n", "    model = keras.Model(inputs, outputs)\n", "    model.compile(optimizer=\"adam\", loss=\"mean_squared_error\")\n", "    return model\n", "\n", "\n", "model = get_model()\n", "\n", "# Train the model.\n", "test_input = np.random.random((128, 32))\n", "test_target = np.random.random((128, 1))\n", "model.fit(test_input, test_target)\n", "\n", "# Calling `save('my_model')` creates a SavedModel folder `my_model`.\n", "model.save(\"my_model\")\n", "\n", "# It can be used to reconstruct the model identically.\n", "reconstructed_model = keras.models.load_model(\"my_model\")\n", "\n", "# Let's check:\n", "np.testing.assert_allclose(\n", "    model.predict(test_input), reconstructed_model.predict(test_input)\n", ")\n", "\n", "# The reconstructed model is already compiled and has retained the optimizer\n", "# state, so training can resume:\n", "reconstructed_model.fit(test_input, test_target)" ] }, { "cell_type": "markdown", "metadata": { "id": "onibKMsFZ4Bk" }, "source": [ "#### What the SavedModel contains\n", "\n", "Calling `model.save('my_model')` creates a folder named `my_model`, containing the following:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:48.343527Z", "iopub.status.busy": "2022-08-09T05:28:48.342914Z", "iopub.status.idle": "2022-08-09T05:28:48.497316Z", "shell.execute_reply": "2022-08-09T05:28:48.496603Z" }, "id": "o0kniAGdvEmH" }, "outputs": [ { "name": "stdout", "output_type": "stream", 
"text": [ "assets\tkeras_metadata.pb saved_model.pb variables\r\n" ] } ], "source": [ "!ls my_model" ] }, { "cell_type": "markdown", "metadata": { "id": "9gZk3nwEHKCt" }, "source": [ "モデルアーキテクチャとトレーニング構成(オプティマイザ、損失、メトリックを含む)は、`saved_model.pb`に格納されます。重みは`variables/`ディレクトリに保存されます。\n", "\n", "SavedModel 形式についての詳細は[「SavedModel ガイド(*ディスク上の SavedModel 形式」*)](https://www.tensorflow.org/guide/saved_model#the_savedmodel_format_on_disk)をご覧ください。\n", "\n", "#### SavedModel によるカスタムオブジェクトの処理\n", "\n", "モデルとそのレイヤーを保存する場合、SavedModel 形式はクラス名、**呼び出し関数**、損失、および重み(および実装されている場合は構成)を保存します。呼び出し関数は、モデル/レイヤーの計算グラフを定義します。\n", "\n", "モデル/レイヤーの構成がない場合、トレーニング、評価、および推論に使用できる元のモデルのようなモデルを作成するために呼び出し関数が使用されます。\n", "\n", "しかしながら、カスタムモデルまたはレイヤークラスを作成する場合は、常に`get_config`および`from_config`メソッドを使用して定義することをお勧めします。これにより、必要に応じて後で計算を簡単に更新できます。詳細については[「カスタムオブジェクト」](save_and_serialize.ipynb#custom-objects)をご覧ください。\n", "\n", "以下は、**config メソッドを上書きせずに **SavedModel 形式からカスタムレイヤーを読み込んだ場合の例です。" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:48.501667Z", "iopub.status.busy": "2022-08-09T05:28:48.501068Z", "iopub.status.idle": "2022-08-09T05:28:48.961516Z", "shell.execute_reply": "2022-08-09T05:28:48.960926Z" }, "id": "PPIAXT8BFSf9" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: my_model/assets\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. 
Compile it manually.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Original model: <__main__.CustomModel object at 0x7f1ac4ab47f0>\n", "Loaded model: \n" ] } ], "source": [ "class CustomModel(keras.Model):\n", "    def __init__(self, hidden_units):\n", "        super(CustomModel, self).__init__()\n", "        self.dense_layers = [keras.layers.Dense(u) for u in hidden_units]\n", "\n", "    def call(self, inputs):\n", "        x = inputs\n", "        for layer in self.dense_layers:\n", "            x = layer(x)\n", "        return x\n", "\n", "\n", "model = CustomModel([16, 16, 10])\n", "# Build the model by calling it\n", "input_arr = tf.random.uniform((1, 5))\n", "outputs = model(input_arr)\n", "model.save(\"my_model\")\n", "\n", "# Delete the custom-defined model class to ensure that the loader does not have\n", "# access to it.\n", "del CustomModel\n", "\n", "loaded = keras.models.load_model(\"my_model\")\n", "np.testing.assert_allclose(loaded(input_arr), outputs)\n", "\n", "print(\"Original model:\", model)\n", "print(\"Loaded model:\", loaded)" ] }, { "cell_type": "markdown", "metadata": { "id": "WnESi1jRVLHz" }, "source": [ "As the example above shows, the loader dynamically creates a new model class that acts like the original model." ] }, { "cell_type": "markdown", "metadata": { "id": "STywDB8VW8gu" }, "source": [ "### Keras H5 format\n", "\n", "Keras also supports saving a single HDF5 file containing the model's architecture, weight values, and `compile()` information. It is a lightweight alternative to SavedModel.\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:48.964727Z", "iopub.status.busy": "2022-08-09T05:28:48.964472Z", "iopub.status.idle": "2022-08-09T05:28:49.575269Z", "shell.execute_reply": "2022-08-09T05:28:49.574579Z" }, "id": "gRIvOIfqWhQJ" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/4 [======>.......................] 
- ETA: 0s - loss: 0.6862" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "4/4 [==============================] - 0s 2ms/step - loss: 0.6498\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/4 [======>.......................] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "4/4 [==============================] - 0s 1ms/step\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/4 [======>.......................] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "4/4 [==============================] - 0s 2ms/step\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/4 [======>.......................] 
- ETA: 0s - loss: 0.3540" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "4/4 [==============================] - 0s 2ms/step - loss: 0.5706\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = get_model()\n", "\n", "# Train the model.\n", "test_input = np.random.random((128, 32))\n", "test_target = np.random.random((128, 1))\n", "model.fit(test_input, test_target)\n", "\n", "# Calling `save('my_h5_model.h5')` creates an H5 file `my_h5_model.h5`.\n", "model.save(\"my_h5_model.h5\")\n", "\n", "# It can be used to reconstruct the model identically.\n", "reconstructed_model = keras.models.load_model(\"my_h5_model.h5\")\n", "\n", "# Let's check:\n", "np.testing.assert_allclose(\n", "    model.predict(test_input), reconstructed_model.predict(test_input)\n", ")\n", "\n", "# The reconstructed model is already compiled and has retained the optimizer\n", "# state, so training can resume:\n", "reconstructed_model.fit(test_input, test_target)" ] }, { "cell_type": "markdown", "metadata": { "id": "bjxsX8XdS4Oj" }, "source": [ "#### Limitations\n", "\n", "Compared to the SavedModel format, there are two things that are not included in the H5 file:\n", "\n", "- **External losses and metrics** added via `model.add_loss()` and `model.add_metric()` are not saved (unlike SavedModel). If your model has such losses and metrics and you want to resume training, you need to add them back yourself after loading the model. Note that this does not apply to losses/metrics created *inside layers* via `self.add_loss()` and `self.add_metric()`. As long as the layer gets loaded, these losses and metrics are kept, since they are part of the layer's `call` method.\n", "- The **computation graph of custom objects**, such as custom layers, is not included in the saved file. At loading time, Keras needs access to the Python classes/functions of these objects in order to reconstruct the model. See [Custom objects](save_and_serialize.ipynb#custom-objects) for details.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "cY3tXyZyk4Ws" }, "source": [ "## Saving the architecture\n", "\n", 
"モデルの構成(アーキテクチャ)は、モデルに含まれるレイヤー、およびこれらのレイヤーの接続方法を指定します*。モデルの構成がある場合、コンパイル情報なしで、重みが新しく初期化された状態でモデルを作成することができます。\n", "\n", "*これは、サブクラス化されたモデルではなく、Functional または Sequential API を使用して定義されたモデルにのみ適用されることに注意してください。" ] }, { "cell_type": "markdown", "metadata": { "id": "rIjTX1Z0ljoo" }, "source": [ "### Sequential モデルまたは Functional API モデルの構成\n", "\n", "これらのタイプのモデルは、レイヤーの明示的なグラフです。それらの構成は常に構造化された形式で提供されます。\n", "\n", "#### API\n", "\n", "- `get_config()`および`from_config()`\n", "- `tf.keras.models.model_to_json()`および`tf.keras.models.model_from_json()`" ] }, { "cell_type": "markdown", "metadata": { "id": "F7V8jN9nt9hB" }, "source": [ "#### `get_config()`および`from_config()`\n", "\n", "`config = model.get_config()`を呼び出すと、モデルの構成を含むPython dictが返されます。その後、同じモデルを`Sequential.from_config(config)`(
`Sequential`モデルの場合)または`Model.from_config(config)`(Functional API モデルの場合) で再度構築できます。\n", "\n", "同じワークフローは、シリアル化可能なレイヤーでも使用できます。\n", "\n", "**レイヤーの例:**" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:49.578942Z", "iopub.status.busy": "2022-08-09T05:28:49.578387Z", "iopub.status.idle": "2022-08-09T05:28:49.583596Z", "shell.execute_reply": "2022-08-09T05:28:49.582960Z" }, "id": "E4H3XIDY91oy" }, "outputs": [], "source": [ "layer = keras.layers.Dense(3, activation=\"relu\")\n", "layer_config = layer.get_config()\n", "new_layer = keras.layers.Dense.from_config(layer_config)" ] }, { "cell_type": "markdown", "metadata": { "id": "2orPhGTaHZRX" }, "source": [ "**Sequential モデルの例:**" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:49.586700Z", "iopub.status.busy": "2022-08-09T05:28:49.586213Z", "iopub.status.idle": "2022-08-09T05:28:49.608848Z", "shell.execute_reply": "2022-08-09T05:28:49.608321Z" }, "id": "F09I6yvGV2uf" }, "outputs": [], "source": [ "model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])\n", "config = model.get_config()\n", "new_model = keras.Sequential.from_config(config)" ] }, { "cell_type": "markdown", "metadata": { "id": "Q9SuxM15lEUr" }, "source": [ "**Functional モデルの例:**" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:49.611948Z", "iopub.status.busy": "2022-08-09T05:28:49.611551Z", "iopub.status.idle": "2022-08-09T05:28:49.633608Z", "shell.execute_reply": "2022-08-09T05:28:49.633058Z" }, "id": "HHIVpEKSsT8o" }, "outputs": [], "source": [ "inputs = keras.Input((32,))\n", "outputs = keras.layers.Dense(1)(inputs)\n", "model = keras.Model(inputs, outputs)\n", "config = model.get_config()\n", "new_model = keras.Model.from_config(config)" ] }, { "cell_type": "markdown", "metadata": { "id": "NDjRR6fO4GS6" }, "source": [ "#### 
`to_json()` and `tf.keras.models.model_from_json()`\n", "\n", "This is similar to `get_config` / `from_config`, except that it turns the model into a JSON string, which can then be loaded without the original model class. It is also specific to models and is not meant for layers.\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:49.636543Z", "iopub.status.busy": "2022-08-09T05:28:49.636209Z", "iopub.status.idle": "2022-08-09T05:28:49.657795Z", "shell.execute_reply": "2022-08-09T05:28:49.657249Z" }, "id": "J7jcVOpdPRie" }, "outputs": [], "source": [ "model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])\n", "json_config = model.to_json()\n", "new_model = keras.models.model_from_json(json_config)" ] }, { "cell_type": "markdown", "metadata": { "id": "WE6kPB1B8Xy5" }, "source": [ "### Custom objects\n", "\n", "**Models and layers**\n", "\n", "The architecture of subclassed models and layers is defined in the methods `__init__` and `call`. They are considered Python bytecode, which cannot be serialized into a JSON-compatible config. You could try serializing the bytecode (for example via `pickle`), but this is unsafe and means the model cannot be loaded on a different system.\n", "\n", "In order to save/load a model with custom-defined layers, or a subclassed model, you should overwrite the `get_config` and, optionally, `from_config` methods. Additionally, you should register the custom object so that Keras is aware of it.\n", "\n", "**Custom functions**\n", "\n", "Custom-defined functions (for example an activation, loss, or initializer) do not need a `get_config` method. The function name is sufficient for loading as long as it is registered as a custom object.\n", "\n", "**Loading the TensorFlow graph only**\n", "\n", "You can load the TensorFlow graph generated by Keras as shown below. In that case, you do not need to provide any `custom_objects`." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:49.660794Z", "iopub.status.busy": "2022-08-09T05:28:49.660575Z", "iopub.status.idle": "2022-08-09T05:28:49.925797Z", "shell.execute_reply": "2022-08-09T05:28:49.925193Z" }, "id": "znOcN8keiaaD" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. 
`model.compile_metrics` will be empty until you train or evaluate the model.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: my_model/assets\n" ] } ], "source": [ "model.save(\"my_model\")\n", "tensorflow_graph = tf.saved_model.load(\"my_model\")\n", "x = np.random.uniform(size=(4, 32)).astype(np.float32)\n", "predicted = tensorflow_graph(x).numpy()" ] }, { "cell_type": "markdown", "metadata": { "id": "Ovu5chswcHzn" }, "source": [ "Note that this method has several drawbacks:\n", "\n", "- For traceability reasons, you should always have access to the custom objects that were used. You would not want to put a model into production that you cannot re-create.\n", "- The object returned by `tf.saved_model.load` is not a Keras model, so it is not as easy to use. For example, you will not have access to `.predict()` or `.fit()`.\n", "\n", "Although its use is discouraged, this method can help in a tight spot, for example if you lost the code of your custom objects or have issues loading the model with `tf.keras.models.load_model()`.\n", "\n", "You can find out more on [the page about `tf.saved_model.load`](https://www.tensorflow.org/api_docs/python/tf/saved_model/load)." ] }, { "cell_type": "markdown", "metadata": { "id": "B5p8XgNCi0Sm" }, "source": [ "#### Defining the config methods\n", "\n", "Specifications:\n", "\n", "- `get_config` should return a JSON-serializable dictionary in order to be compatible with the Keras architecture-saving and model-saving APIs.\n", "- `from_config(config)` (`classmethod`) should return a new layer or model object created from the config. The default implementation returns `cls(**config)`.\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:49.929415Z", "iopub.status.busy": "2022-08-09T05:28:49.928976Z", "iopub.status.idle": "2022-08-09T05:28:49.939729Z", "shell.execute_reply": "2022-08-09T05:28:49.939191Z" }, "id": "YeVMs9Rs5ojC" }, "outputs": [], "source": [ "class CustomLayer(keras.layers.Layer):\n", "    def __init__(self, a):\n", "        super().__init__()\n", "        self.var = tf.Variable(a, name=\"var_a\")\n", "\n", "    def call(self, inputs, training=False):\n", "        if training:\n", "            return inputs * self.var\n", "        else:\n", "            return inputs\n", "\n", "    def get_config(self):\n", "        return {\"a\": self.var.numpy()}\n", "\n", "    # There's actually no need to 
define `from_config` here, since returning\n", "    # `cls(**config)` is the default behavior.\n", "    @classmethod\n", "    def from_config(cls, config):\n", "        return cls(**config)\n", "\n", "\n", "layer = CustomLayer(5)\n", "layer.var.assign(2)\n", "\n", "serialized_layer = keras.layers.serialize(layer)\n", "new_layer = keras.layers.deserialize(\n", "    serialized_layer, custom_objects={\"CustomLayer\": CustomLayer}\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "OlbIz9cmWDsr" }, "source": [ "#### Registering the custom object\n", "\n", "Keras keeps a note of which class generated the config. In the example above, `tf.keras.layers.serialize` generates a serialized form of the custom layer:\n", "\n", "```\n", "{'class_name': 'CustomLayer', 'config': {'a': 2}}\n", "```\n", "\n", "Keras keeps a master list of all built-in layer, model, optimizer, and metric classes, which is used to find the correct class on which to call `from_config`. If the class cannot be found, an error is raised (`ValueError: Unknown layer`). There are a few ways to register custom classes in this list:\n", "\n", "1. Setting the `custom_objects` argument in the loading function (see the example in the section above, \"Defining the config methods\").\n", "2. `tf.keras.utils.custom_object_scope` or `tf.keras.utils.CustomObjectScope`\n", "3. 
`tf.keras.utils.register_keras_serializable`" ] }, { "cell_type": "markdown", "metadata": { "id": "5X5chZaxYpC2" }, "source": [ "#### Custom layer and function example" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:49.943007Z", "iopub.status.busy": "2022-08-09T05:28:49.942488Z", "iopub.status.idle": "2022-08-09T05:28:49.996410Z", "shell.execute_reply": "2022-08-09T05:28:49.995830Z" }, "id": "MdYdOM5u4NJ9" }, "outputs": [], "source": [ "class CustomLayer(keras.layers.Layer):\n", "    def __init__(self, units=32, **kwargs):\n", "        super(CustomLayer, self).__init__(**kwargs)\n", "        self.units = units\n", "\n", "    def build(self, input_shape):\n", "        self.w = self.add_weight(\n", "            shape=(input_shape[-1], self.units),\n", "            initializer=\"random_normal\",\n", "            trainable=True,\n", "        )\n", "        self.b = self.add_weight(\n", "            shape=(self.units,), initializer=\"random_normal\", trainable=True\n", "        )\n", "\n", "    def call(self, inputs):\n", "        return tf.matmul(inputs, self.w) + self.b\n", "\n", "    def get_config(self):\n", "        config = super(CustomLayer, self).get_config()\n", "        config.update({\"units\": self.units})\n", "        return config\n", "\n", "\n", "def custom_activation(x):\n", "    return tf.nn.tanh(x) ** 2\n", "\n", "\n", "# Make a model with the CustomLayer and custom_activation\n", "inputs = keras.Input((32,))\n", "x = CustomLayer(32)(inputs)\n", "outputs = keras.layers.Activation(custom_activation)(x)\n", "model = keras.Model(inputs, outputs)\n", "\n", "# Retrieve the config\n", "config = model.get_config()\n", "\n", "# At loading time, register the custom objects with a `custom_object_scope`:\n", "custom_objects = {\"CustomLayer\": CustomLayer, \"custom_activation\": custom_activation}\n", "with keras.utils.custom_object_scope(custom_objects):\n", "    new_model = keras.Model.from_config(config)" ] }, { "cell_type": "markdown", "metadata": { "id": "Ia1JUuCjy70o" }, "source": [ "### In-memory model cloning\n", "\n", 
"また、`tf.keras.models.clone_model()`を通じて、メモリ内でモデルのクローンを作成できます。これは、構成を取得し、その構成からモデルを再作成する方法と同じです (したがって、コンパイル情報やレイヤーの重み値は保持されません)。\n", "\n", "**例:**" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:49.999677Z", "iopub.status.busy": "2022-08-09T05:28:49.999150Z", "iopub.status.idle": "2022-08-09T05:28:50.013778Z", "shell.execute_reply": "2022-08-09T05:28:50.013211Z" }, "id": "16KQFlItCZf2" }, "outputs": [], "source": [ "with keras.utils.custom_object_scope(custom_objects):\n", " new_model = keras.models.clone_model(model)" ] }, { "cell_type": "markdown", "metadata": { "id": "wq1Dgi9eZUrR" }, "source": [ "## モデルの重み値のみを保存および読み込む\n", "\n", "モデルの重みのみを保存および読み込むように選択できます。これは次の場合に役立ちます。\n", "\n", "- 推論のためのモデルだけが必要とされる場合。この場合、トレーニングを再開する必要がないため、コンパイル情報やオプティマイザの状態は必要ありません。\n", "- 転移学習を行う場合。以前のモデルの状態を再利用して新しいモデルをトレーニングするため、以前のモデルのコンパイル情報は必要ありません。" ] }, { "cell_type": "markdown", "metadata": { "id": "dRJgbG8Zq7WB" }, "source": [ "### インメモリの重みの移動のための API\n", "\n", "異なるオブジェクト間で重みをコピーするには`get_weights`および`set_weights`を使用します。\n", "\n", "- `tf.keras.layers.Layer.get_weights()`: numpy配列のリストを返す。\n", "- `tf.keras.layers.Layer.set_weights()`: モデルの重みを`weights`引数の値に設定する。\n", "\n", "以下に例を示します。\n", "\n", "***インメモリで、1 つのレイヤーから別のレイヤーに重みを転送する***" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.017091Z", "iopub.status.busy": "2022-08-09T05:28:50.016551Z", "iopub.status.idle": "2022-08-09T05:28:50.034137Z", "shell.execute_reply": "2022-08-09T05:28:50.033534Z" }, "id": "xXT0h7yxAA4e" }, "outputs": [], "source": [ "def create_layer():\n", " layer = keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")\n", " layer.build((None, 784))\n", " return layer\n", "\n", "\n", "layer_1 = create_layer()\n", "layer_2 = create_layer()\n", "\n", "# Copy weights from layer 2 to layer 1\n", "layer_2.set_weights(layer_1.get_weights())" ] }, { "cell_type": 
"markdown", "metadata": { "id": "IvCxdjmy6eKA" }, "source": [ "***インメモリで 1 つのモデルから互換性のあるアーキテクチャを備えた別のモデルに重みを転送する***" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.037357Z", "iopub.status.busy": "2022-08-09T05:28:50.036997Z", "iopub.status.idle": "2022-08-09T05:28:50.095930Z", "shell.execute_reply": "2022-08-09T05:28:50.095377Z" }, "id": "CleccO1um5WU" }, "outputs": [], "source": [ "# Create a simple functional model\n", "inputs = keras.Input(shape=(784,), name=\"digits\")\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\")(inputs)\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")(x)\n", "outputs = keras.layers.Dense(10, name=\"predictions\")(x)\n", "functional_model = keras.Model(inputs=inputs, outputs=outputs, name=\"3_layer_mlp\")\n", "\n", "# Define a subclassed model with the same architecture\n", "class SubclassedModel(keras.Model):\n", " def __init__(self, output_dim, name=None):\n", " super(SubclassedModel, self).__init__(name=name)\n", " self.output_dim = output_dim\n", " self.dense_1 = keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\")\n", " self.dense_2 = keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")\n", " self.dense_3 = keras.layers.Dense(output_dim, name=\"predictions\")\n", "\n", " def call(self, inputs):\n", " x = self.dense_1(inputs)\n", " x = self.dense_2(x)\n", " x = self.dense_3(x)\n", " return x\n", "\n", " def get_config(self):\n", " return {\"output_dim\": self.output_dim, \"name\": self.name}\n", "\n", "\n", "subclassed_model = SubclassedModel(10)\n", "# Call the subclassed model once to create the weights.\n", "subclassed_model(tf.ones((1, 784)))\n", "\n", "# Copy weights from functional_model to subclassed_model.\n", "subclassed_model.set_weights(functional_model.get_weights())\n", "\n", "assert len(functional_model.weights) == len(subclassed_model.weights)\n", "for a, b in 
zip(functional_model.weights, subclassed_model.weights):\n", "    np.testing.assert_allclose(a.numpy(), b.numpy())" ] }, { "cell_type": "markdown", "metadata": { "id": "V42tpJDicL4v" }, "source": [ "***The case of stateless layers***\n", "\n", "Because stateless layers do not change the order or number of weights, models can have compatible architectures even if there are extra or missing stateless layers." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.099207Z", "iopub.status.busy": "2022-08-09T05:28:50.098758Z", "iopub.status.idle": "2022-08-09T05:28:50.155568Z", "shell.execute_reply": "2022-08-09T05:28:50.155027Z" }, "id": "TWVjoCuVP6to" }, "outputs": [], "source": [ "inputs = keras.Input(shape=(784,), name=\"digits\")\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\")(inputs)\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")(x)\n", "outputs = keras.layers.Dense(10, name=\"predictions\")(x)\n", "functional_model = keras.Model(inputs=inputs, outputs=outputs, name=\"3_layer_mlp\")\n", "\n", "inputs = keras.Input(shape=(784,), name=\"digits\")\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\")(inputs)\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")(x)\n", "\n", "# Add a dropout layer, which does not contain any weights.\n", "x = keras.layers.Dropout(0.5)(x)\n", "outputs = keras.layers.Dense(10, name=\"predictions\")(x)\n", "functional_model_with_dropout = keras.Model(\n", "    inputs=inputs, outputs=outputs, name=\"3_layer_mlp\"\n", ")\n", "\n", "functional_model_with_dropout.set_weights(functional_model.get_weights())" ] }, { "cell_type": "markdown", "metadata": { "id": "tUrgZcDAYaML" }, "source": [ "### APIs for saving weights to disk and loading them back\n", "\n", "Weights can be saved to disk by calling `model.save_weights` in the following formats:\n", "\n", "- TensorFlow Checkpoint\n", "- HDF5\n", "\n", "The default format for `model.save_weights` is TensorFlow Checkpoint. There are two ways to specify the save format:\n", "\n", "1. 
`save_format` argument: Set the value to `save_format=\"tf\"` or `save_format=\"h5\"`.\n", "2. `path` argument: If the path ends with `.h5` or `.hdf5`, the HDF5 format is used. Any other suffix results in a TensorFlow Checkpoint unless `save_format` is set.\n", "\n", "There is also the option of retrieving the weights as in-memory NumPy arrays. Each API has the pros and cons described below." ] }, { "cell_type": "markdown", "metadata": { "id": "de8G1QVux2za" }, "source": [ "### TF Checkpoint format\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.158944Z", "iopub.status.busy": "2022-08-09T05:28:50.158484Z", "iopub.status.idle": "2022-08-09T05:28:50.207833Z", "shell.execute_reply": "2022-08-09T05:28:50.207295Z" }, "id": "1W82BZuskILz" }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Runnable example\n", "sequential_model = keras.Sequential(\n", "    [\n", "        keras.Input(shape=(784,), name=\"digits\"),\n", "        keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\"),\n", "        keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\"),\n", "        keras.layers.Dense(10, name=\"predictions\"),\n", "    ]\n", ")\n", "sequential_model.save_weights(\"ckpt\")\n", "load_status = sequential_model.load_weights(\"ckpt\")\n", "\n", "# `assert_consumed` can be used as validation that all variable values have been\n", "# restored from the checkpoint. 
See `tf.train.Checkpoint.restore` for other\n", "# methods in the Status object.\n", "load_status.assert_consumed()" ] }, { "cell_type": "markdown", "metadata": { "id": "CUDB1dkiecxZ" }, "source": [ "#### Format details\n", "\n", "The TensorFlow Checkpoint format saves and restores the weights using object attribute names. For instance, consider the `tf.keras.layers.Dense` layer. The layer contains two weights: `dense.kernel` and `dense.bias`. When the layer is saved in the `tf` format, the resulting checkpoint contains the keys `\"kernel\"` and `\"bias\"` and their corresponding weight values. For more information, see [\"Loading mechanics\" in the TF Checkpoint guide](https://www.tensorflow.org/guide/checkpoint#loading_mechanics).\n", "\n", "Note that the attribute/graph edge is named after **the name used in the parent object, not the name of the variable**. In the `CustomLayer` example below, the variable `CustomLayer.var` is saved with `\"var\"` as part of the key, not `\"var_a\"`." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.211007Z", "iopub.status.busy": "2022-08-09T05:28:50.210484Z", "iopub.status.idle": "2022-08-09T05:28:50.228550Z", "shell.execute_reply": "2022-08-09T05:28:50.228029Z" }, "id": "wwjjEg7zQ29O" }, "outputs": [ { "data": { "text/plain": [ "{'save_counter/.ATTRIBUTES/VARIABLE_VALUE': tf.int64,\n", " 'layer/var/.ATTRIBUTES/VARIABLE_VALUE': tf.int32,\n", " '_CHECKPOINTABLE_OBJECT_GRAPH': tf.string}" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "class CustomLayer(keras.layers.Layer):\n", "    def __init__(self, a):\n", "        super().__init__()\n", "        self.var = tf.Variable(a, name=\"var_a\")\n", "\n", "\n", "layer = CustomLayer(5)\n", "layer_ckpt = tf.train.Checkpoint(layer=layer).save(\"custom_layer\")\n", "\n", "ckpt_reader = tf.train.load_checkpoint(layer_ckpt)\n", "\n", "ckpt_reader.get_variable_to_dtype_map()" ] }, { "cell_type": "markdown", "metadata": { "id": "tfdbha2TvYWH" }, "source": [ "#### Transfer learning example\n", "\n", "Essentially, as long as two models have the same architecture, they can share the same checkpoint.\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.231633Z", 
"iopub.status.busy": "2022-08-09T05:28:50.231185Z", "iopub.status.idle": "2022-08-09T05:28:50.367404Z", "shell.execute_reply": "2022-08-09T05:28:50.366839Z" }, "id": "6Xqhxo35q0qj" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"pretrained_model\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Layer (type) Output Shape Param # \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "=================================================================\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " digits (InputLayer) [(None, 784)] 0 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dense_1 (Dense) (None, 64) 50240 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dense_2 (Dense) (None, 64) 4160 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "=================================================================\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Total params: 54,400\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Trainable params: 54,400\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Non-trainable params: 0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", " --------------------------------------------------\n", "Model: \"new_model\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n" ] }, { "name": "stdout", "output_type": "stream", "text": 
[ " Layer (type) Output Shape Param # \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "=================================================================\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " digits (InputLayer) [(None, 784)] 0 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dense_1 (Dense) (None, 64) 50240 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dense_2 (Dense) (None, 64) 4160 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " predictions (Dense) (None, 5) 325 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "=================================================================\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Total params: 54,725\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Trainable params: 54,725\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Non-trainable params: 0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_3\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Layer (type) Output Shape Param # \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "=================================================================\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " pretrained (Functional) (None, 64) 54400 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": 
"stdout", "output_type": "stream", "text": [ " predictions (Dense) (None, 5) 325 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "=================================================================\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Total params: 54,725\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Trainable params: 54,725\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Non-trainable params: 0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inputs = keras.Input(shape=(784,), name=\"digits\")\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\")(inputs)\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")(x)\n", "outputs = keras.layers.Dense(10, name=\"predictions\")(x)\n", "functional_model = keras.Model(inputs=inputs, outputs=outputs, name=\"3_layer_mlp\")\n", "\n", "# Extract a portion of the functional model defined in the Setup section.\n", "# The following lines produce a new model that excludes the final output\n", "# layer of the functional model.\n", "pretrained = keras.Model(\n", " functional_model.inputs, functional_model.layers[-1].input, name=\"pretrained_model\"\n", ")\n", "# Randomly assign \"trained\" weights.\n", "for w in pretrained.weights:\n", " w.assign(tf.random.normal(w.shape))\n", "pretrained.save_weights(\"pretrained_ckpt\")\n", "pretrained.summary()\n", "\n", "# Assume this is a separate program where only 'pretrained_ckpt' exists.\n", "# Create a new functional model with a different output dimension.\n", "inputs = keras.Input(shape=(784,), name=\"digits\")\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\")(inputs)\n", "x = 
keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")(x)\n", "outputs = keras.layers.Dense(5, name=\"predictions\")(x)\n", "model = keras.Model(inputs=inputs, outputs=outputs, name=\"new_model\")\n", "\n", "# Load the weights from pretrained_ckpt into model.\n", "model.load_weights(\"pretrained_ckpt\")\n", "\n", "# Check that all of the pretrained weights have been loaded.\n", "for a, b in zip(pretrained.weights, model.weights):\n", " np.testing.assert_allclose(a.numpy(), b.numpy())\n", "\n", "print(\"\\n\", \"-\" * 50)\n", "model.summary()\n", "\n", "# Example 2: Sequential model\n", "# Recreate the pretrained model, and load the saved weights.\n", "inputs = keras.Input(shape=(784,), name=\"digits\")\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\")(inputs)\n", "x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")(x)\n", "pretrained_model = keras.Model(inputs=inputs, outputs=x, name=\"pretrained\")\n", "\n", "# Sequential example:\n", "model = keras.Sequential([pretrained_model, keras.layers.Dense(5, name=\"predictions\")])\n", "model.summary()\n", "\n", "pretrained_model.load_weights(\"pretrained_ckpt\")\n", "\n", "# Warning! Calling `model.load_weights('pretrained_ckpt')` won't throw an error,\n", "# but will *not* work as expected. If you inspect the weights, you'll see that\n", "# none of the weights will have loaded. `pretrained_model.load_weights()` is the\n", "# correct method to call." 
] }, { "cell_type": "markdown", "metadata": { "id": "eCsRvSzqMJ0s" }, "source": [ "It is generally recommended to stick to the same API for building models. If you switch between Sequential and Functional, or between Functional and subclassed, always rebuild the pre-trained model and load the pre-trained weights into that model." ] }, { "cell_type": "markdown", "metadata": { "id": "a9EmwUaZBTeW" }, "source": [ "How can weights be saved and loaded into a different model when the architectures are quite different? With `tf.train.Checkpoint`, you can save and restore the exact layers/variables.\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.370478Z", "iopub.status.busy": "2022-08-09T05:28:50.370022Z", "iopub.status.idle": "2022-08-09T05:28:50.399859Z", "shell.execute_reply": "2022-08-09T05:28:50.399316Z" }, "id": "j6jE9sz7yQ9b" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/tmpfs/tmp/ipykernel_42242/1562824211.py:15: UserWarning: `layer.add_variable` is deprecated and will be removed in a future version. Please use the `layer.add_weight()` method instead.\n", " self.kernel = self.add_variable(\"kernel\", shape=(64, 10))\n", "/tmpfs/tmp/ipykernel_42242/1562824211.py:16: UserWarning: `layer.add_variable` is deprecated and will be removed in a future version. 
Please use the `layer.add_weight()` method instead.\n", " self.bias = self.add_variable(\"bias\", shape=(10,))\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create a subclassed model that essentially uses functional_model's first\n", "# and last layers.\n", "# First, save the weights of functional_model's first and last dense layers.\n", "first_dense = functional_model.layers[1]\n", "last_dense = functional_model.layers[-1]\n", "ckpt_path = tf.train.Checkpoint(\n", "    dense=first_dense, kernel=last_dense.kernel, bias=last_dense.bias\n", ").save(\"ckpt\")\n", "\n", "# Define the subclassed model.\n", "class ContrivedModel(keras.Model):\n", "    def __init__(self):\n", "        super(ContrivedModel, self).__init__()\n", "        self.first_dense = keras.layers.Dense(64)\n", "        self.kernel = self.add_variable(\"kernel\", shape=(64, 10))\n", "        self.bias = self.add_variable(\"bias\", shape=(10,))\n", "\n", "    def call(self, inputs):\n", "        x = self.first_dense(inputs)\n", "        return tf.matmul(x, self.kernel) + self.bias\n", "\n", "\n", "model = ContrivedModel()\n", "# Call model on inputs to create the variables of the dense layer.\n", "_ = model(tf.ones((1, 784)))\n", "\n", "# Create a Checkpoint with the same structure as before, and load the weights.\n", "tf.train.Checkpoint(\n", "    dense=model.first_dense, kernel=model.kernel, bias=model.bias\n", ").restore(ckpt_path).assert_consumed()" ] }, { "cell_type": "markdown", "metadata": { "id": "1R9zCAelVexH" }, "source": [ "### HDF5 format\n", "\n", "The HDF5 format contains weights grouped by layer names. The weights are lists ordered by concatenating the list of trainable weights to the list of non-trainable weights (same as `layer.weights`). Thus, a model can use an HDF5 checkpoint if it has the same layers and trainable statuses as those saved in the checkpoint.\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.402814Z", "iopub.status.busy": "2022-08-09T05:28:50.402480Z", "iopub.status.idle": 
"2022-08-09T05:28:50.442199Z", "shell.execute_reply": "2022-08-09T05:28:50.441662Z" }, "id": "J2LictZSclDh" }, "outputs": [], "source": [ "# Runnable example\n", "sequential_model = keras.Sequential(\n", "    [\n", "        keras.Input(shape=(784,), name=\"digits\"),\n", "        keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\"),\n", "        keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\"),\n", "        keras.layers.Dense(10, name=\"predictions\"),\n", "    ]\n", ")\n", "sequential_model.save_weights(\"weights.h5\")\n", "sequential_model.load_weights(\"weights.h5\")" ] }, { "cell_type": "markdown", "metadata": { "id": "rCy09yfqXQT8" }, "source": [ "Note that changing `layer.trainable` may result in a different `layer.weights` ordering when the model contains nested layers." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.445258Z", "iopub.status.busy": "2022-08-09T05:28:50.445046Z", "iopub.status.idle": "2022-08-09T05:28:50.488964Z", "shell.execute_reply": "2022-08-09T05:28:50.488425Z" }, "id": "VX8hFyI9HgYT" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "variables: ['nested/dense_1/kernel:0', 'nested/dense_1/bias:0', 'nested/dense_2/kernel:0', 'nested/dense_2/bias:0']\n", "\n", "Changing trainable status of one of the nested layers...\n", "\n", "variables: ['nested/dense_2/kernel:0', 'nested/dense_2/bias:0', 'nested/dense_1/kernel:0', 'nested/dense_1/bias:0']\n", "variable ordering changed: True\n" ] } ], "source": [ "class NestedDenseLayer(keras.layers.Layer):\n", "    def __init__(self, units, name=None):\n", "        super(NestedDenseLayer, self).__init__(name=name)\n", "        self.dense_1 = keras.layers.Dense(units, name=\"dense_1\")\n", "        self.dense_2 = keras.layers.Dense(units, name=\"dense_2\")\n", "\n", "    def call(self, inputs):\n", "        return self.dense_2(self.dense_1(inputs))\n", "\n", "\n", "nested_model = keras.Sequential([keras.Input((784,)), NestedDenseLayer(10, \"nested\")])\n", "variable_names = [v.name for v in 
nested_model.weights]\n", "print(\"variables: {}\".format(variable_names))\n", "\n", "print(\"\\nChanging trainable status of one of the nested layers...\")\n", "nested_model.get_layer(\"nested\").dense_1.trainable = False\n", "\n", "variable_names_2 = [v.name for v in nested_model.weights]\n", "print(\"\\nvariables: {}\".format(variable_names_2))\n", "print(\"variable ordering changed:\", variable_names != variable_names_2)" ] }, { "cell_type": "markdown", "metadata": { "id": "V4GHHReOFEGq" }, "source": [ "#### Transfer learning example\n", "\n", "When loading pretrained weights from HDF5, it is recommended to load the weights into the original checkpointed model, and then extract the desired weights/layers into a new model.\n", "\n", "**Example:**" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2022-08-09T05:28:50.492030Z", "iopub.status.busy": "2022-08-09T05:28:50.491564Z", "iopub.status.idle": "2022-08-09T05:28:50.576136Z", "shell.execute_reply": "2022-08-09T05:28:50.575561Z" }, "id": "YcgjA7yYG49d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_6\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Layer (type) Output Shape Param # \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "=================================================================\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dense_1 (Dense) (None, 64) 50240 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dense_2 (Dense) (None, 64) 4160 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dense_3 (Dense) (None, 5) 325 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ 
"=================================================================\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Total params: 54,725\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Trainable params: 54,725\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Non-trainable params: 0\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n" ] } ], "source": [ "def create_functional_model():\n", " inputs = keras.Input(shape=(784,), name=\"digits\")\n", " x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_1\")(inputs)\n", " x = keras.layers.Dense(64, activation=\"relu\", name=\"dense_2\")(x)\n", " outputs = keras.layers.Dense(10, name=\"predictions\")(x)\n", " return keras.Model(inputs=inputs, outputs=outputs, name=\"3_layer_mlp\")\n", "\n", "\n", "functional_model = create_functional_model()\n", "functional_model.save_weights(\"pretrained_weights.h5\")\n", "\n", "# In a separate program:\n", "pretrained_model = create_functional_model()\n", "pretrained_model.load_weights(\"pretrained_weights.h5\")\n", "\n", "# Create a new model by extracting layers from the original model:\n", "extracted_layers = pretrained_model.layers[:-1]\n", "extracted_layers.append(keras.layers.Dense(5, name=\"dense_3\"))\n", "model = keras.Sequential(extracted_layers)\n", "model.summary()" ] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "save_and_serialize.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 0 }