{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "wFPyjGqMQ82Q" }, "source": [ "##### Copyright 2020 The TensorFlow Authors.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "cellView": "form", "execution": { "iopub.execute_input": "2020-11-12T07:32:26.256022Z", "iopub.status.busy": "2020-11-12T07:32:26.255308Z", "iopub.status.idle": "2020-11-12T07:32:26.257931Z", "shell.execute_reply": "2020-11-12T07:32:26.257434Z" }, "id": "aNZ7aEDyQIYU" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "uMOmzhPEQh7b" }, "source": [ "# Normalizations\n", "\n", "
\n" ] },
{ "cell_type": "markdown", "metadata": { "id": "cthm5dovQMJl" }, "source": [ "## Overview\n", "\n", "This notebook gives a brief introduction to the [normalization layers](https://github.com/tensorflow/addons/blob/master/tensorflow_addons/layers/normalizations.py) of TensorFlow. The currently supported layers are:\n", "\n", "- **Group Normalization** (TensorFlow Addons)\n", "- **Instance Normalization** (TensorFlow Addons)\n", "- **Layer Normalization** (TensorFlow Core)\n", "\n", "The basic idea behind these layers is to normalize the output of an activation layer to improve convergence during training. In contrast to [batch normalization](https://keras.io/layers/normalization/), these normalizations do not operate across the batch; they normalize the activations of a single sample, which makes them suitable for recurrent neural networks as well.\n", "\n", "Typically, the normalization is performed by calculating the mean and the standard deviation of a subgroup of the input tensor. A scale factor and an offset factor can also be applied to the result.\n", "\n", "$y_{i} = \\frac{\\gamma ( x_{i} - \\mu )}{\\sigma }+ \\beta$\n", "\n", "$y$: output\n", "\n", "$x$: input\n", "\n", "$\\gamma$: scale factor\n", "\n", "$\\mu$: mean\n", "\n", "$\\sigma$: standard deviation\n", "\n", "$\\beta$: offset factor\n", "\n", "The following image demonstrates the difference between these techniques. Each subplot shows an input tensor, with N as the batch axis, C as the channel axis, and (H, W) as the spatial axes (for example, the height and width of a picture). The pixels in blue are normalized by the same mean and variance, which are computed by aggregating the values of these pixels.\n", "\n", "![](https://github.com/shaohua0116/Group-Normalization-Tensorflow/raw/master/figure/gn.png)\n", "\n", "Source: (https://arxiv.org/pdf/1803.08494.pdf)\n", "\n", "The weights γ and β are trainable in all normalization layers to compensate for a possible loss of representational ability. You can activate these factors by setting the `center` flag (for β) or the `scale` flag (for γ) to `True`. You can also use `initializers`, `constraints` and `regularizer` for `beta` and `gamma` to tune these values during training." ] },
{ "cell_type": "markdown", "metadata": { "id": "I2XlcXf5WBHb" }, "source": [ "## Setup" ] },
{ "cell_type": "markdown", "metadata": { "id": "kTlbneoEUKrD" }, "source": [ "### Install TensorFlow 2.0 and TensorFlow Addons" ] },
{ "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2020-11-12T07:32:26.264173Z", "iopub.status.busy": "2020-11-12T07:32:26.261638Z", "iopub.status.idle": "2020-11-12T07:32:28.011883Z", "shell.execute_reply": "2020-11-12T07:32:28.011314Z" }, "id": "_ZQGY_ALnirQ" }, "outputs": [], "source": [ "!pip install -q -U tensorflow-addons" ] },
{ "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2020-11-12T07:32:28.016298Z", "iopub.status.busy": "2020-11-12T07:32:28.015643Z", "iopub.status.idle": "2020-11-12T07:32:34.452989Z", "shell.execute_reply": "2020-11-12T07:32:34.452415Z" }, "id": "7aGgPZG_WBHg" }, "outputs": [], "source": [ "import tensorflow as tf\n", "import tensorflow_addons as tfa" ] },
{ "cell_type": "markdown", "metadata": { "id": "u82Gz_gOUPDZ" }, "source": [ "### Prepare the dataset" ] },
{ "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2020-11-12T07:32:34.458789Z", "iopub.status.busy": "2020-11-12T07:32:34.458215Z", "iopub.status.idle": "2020-11-12T07:32:35.170850Z", "shell.execute_reply": "2020-11-12T07:32:35.170196Z" }, "id": "3wso9oidUZZQ" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz\n", "11493376/11490434 [==============================] - 0s 0us/step\n" ] } ], "source": [ "mnist = tf.keras.datasets.mnist\n", "\n", "(x_train, y_train),(x_test, y_test) = mnist.load_data()\n", "x_train, x_test = x_train / 255.0, x_test / 255.0" ] },
{ "cell_type": "markdown", "metadata": { "id": "UTQH56j89POZ" }, "source": [ "## Group Normalization Tutorial\n", "\n", "### Introduction\n", "\n", "Group Normalization (GN) divides the channels of the input into smaller subgroups and normalizes the values in each subgroup by that subgroup's mean and variance. Because GN works on a single sample, the technique is independent of the batch size.\n", "\n", "In image classification tasks, GN scored experimentally close to batch normalization. Using GN instead of batch normalization can be beneficial when your overall batch size is small, since a small batch size degrades the performance of batch normalization.\n", "\n", "### Example\n", "\n", "The following example splits the 10 channels after a Conv2D layer into 5 subgroups in a standard \"channels last\" setting:" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2020-11-12T07:32:35.179049Z", "iopub.status.busy": "2020-11-12T07:32:35.177194Z", "iopub.status.idle": "2020-11-12T07:32:47.039482Z", "shell.execute_reply": "2020-11-12T07:32:47.039928Z" }, "id": "aIGjLwYWAm0v" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "313/313 [==============================] - 1s 2ms/step - loss: 0.4978 - accuracy: 0.8549\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = tf.keras.models.Sequential([\n", "  # Reshape into \"channels last\" setup.\n", "  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),\n", "  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format=\"channels_last\"),\n", "  # GroupNorm layer: 10 channels split into 5 groups of 2.\n", "  tfa.layers.GroupNormalization(groups=5, axis=3),\n", "  tf.keras.layers.Flatten(),\n", "  tf.keras.layers.Dense(128, activation='relu'),\n", "  tf.keras.layers.Dropout(0.2),\n", "  tf.keras.layers.Dense(10, activation='softmax')\n", "])\n", "\n", "model.compile(optimizer='adam',\n", "              loss='sparse_categorical_crossentropy',\n", "              metrics=['accuracy'])\n", "# Note: this fits on the 10,000-image test split (313 batches of 32).\n", "model.fit(x_test, y_test)" ] },
{ "cell_type": "markdown", "metadata": { "id": "QMwUfJUib3ka" }, "source": [ "## Instance Normalization Tutorial\n", "\n", "### Introduction\n", "\n", "Instance Normalization is a special case of group normalization in which the number of groups equals the number of channels (the size of the normalized axis), so each channel of a sample is normalized on its own.\n", "\n", "Experimental results show that instance normalization performs well on style transfer when it replaces batch normalization. More recently, instance normalization has also been used as a replacement for batch normalization in GANs.\n", "\n", "### Example\n", "\n", "Apply InstanceNormalization after a Conv2D layer, using uniformly initialized scale and offset factors." ] },
{ "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2020-11-12T07:32:47.057281Z", "iopub.status.busy": "2020-11-12T07:32:47.056587Z", "iopub.status.idle": "2020-11-12T07:32:48.370818Z", "shell.execute_reply": "2020-11-12T07:32:48.370310Z" }, "id": "6sLVv-C8f6Kf" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "313/313 [==============================] - 1s 2ms/step - loss: 0.5717 - accuracy: 0.8241\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = tf.keras.models.Sequential([\n", "  # Reshape into \"channels last\" setup.\n", "  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),\n", "  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format=\"channels_last\"),\n", "  # InstanceNorm Layer\n", "  tfa.layers.InstanceNormalization(axis=3,\n", "                                   center=True,\n", "                                   scale=True,\n", "                                   beta_initializer=\"random_uniform\",\n", "                                   gamma_initializer=\"random_uniform\"),\n", "  tf.keras.layers.Flatten(),\n", "  tf.keras.layers.Dense(128, activation='relu'),\n", "  tf.keras.layers.Dropout(0.2),\n", "  tf.keras.layers.Dense(10, activation='softmax')\n", "])\n", "\n", "model.compile(optimizer='adam',\n", "              loss='sparse_categorical_crossentropy',\n", "              metrics=['accuracy'])\n", "model.fit(x_test, y_test)" ] },
{ "cell_type": "markdown", "metadata": { "id": "qYdnEocRUCll" }, "source": [ "## 
Layer Normalization Tutorial\n", "\n", "### Introduction\n", "\n", "Layer Normalization is a special case of group normalization in which the number of groups is 1, so all channels fall into a single group. The mean and standard deviation are calculated from all activations of a single sample.\n", "\n", "Experimental results show that layer normalization is well suited for recurrent neural networks, since it works independently of the batch size.\n", "\n", "### Example\n", "\n", "Apply LayerNormalization after a Conv2D layer, using scale and offset factors." ] },
{ "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2020-11-12T07:32:48.383117Z", "iopub.status.busy": "2020-11-12T07:32:48.382479Z", "iopub.status.idle": "2020-11-12T07:32:49.664865Z", "shell.execute_reply": "2020-11-12T07:32:49.665274Z" }, "id": "Fh-Pp_e5UB54" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "313/313 [==============================] - 1s 2ms/step - loss: 0.4242 - accuracy: 0.8794\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = tf.keras.models.Sequential([\n", "  # Reshape into \"channels last\" setup.\n", "  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),\n", "  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format=\"channels_last\"),\n", "  # LayerNorm Layer\n", "  # Note: axis=1 normalizes along the height axis only;\n", "  # axis=[1, 2, 3] would cover all activations of a sample.\n", "  tf.keras.layers.LayerNormalization(axis=1, center=True, scale=True),\n", "  tf.keras.layers.Flatten(),\n", "  tf.keras.layers.Dense(128, activation='relu'),\n", "  tf.keras.layers.Dropout(0.2),\n", "  tf.keras.layers.Dense(10, activation='softmax')\n", "])\n", "\n", "model.compile(optimizer='adam',\n", "              loss='sparse_categorical_crossentropy',\n", "              metrics=['accuracy'])\n", "model.fit(x_test, y_test)" ] },
{ "cell_type": "markdown", "metadata": { "id": "shvGfnB0WpQQ" }, "source": [ "## Literature\n", "\n", "[Layer norm](https://arxiv.org/pdf/1607.06450.pdf)\n", "\n", "[Instance norm](https://arxiv.org/pdf/1607.08022.pdf)\n", "\n", "[Group Norm](https://arxiv.org/pdf/1803.08494.pdf)\n", "\n", "[Complete Normalizations Overview](http://mlexplained.com/2018/11/30/an-overview-of-normalization-methods-in-deep-learning/)" ] } ],
"metadata": { "colab": { "collapsed_sections": [], "name": "layers_normalizations.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" 
} }, "nbformat": 4, "nbformat_minor": 0 }