{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hX4n9TsbGw-f"
   },
   "source": [
    "##### Copyright 2020 The TensorFlow Authors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "cellView": "form",
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:03.485516Z",
     "iopub.status.busy": "2022-12-14T21:20:03.484968Z",
     "iopub.status.idle": "2022-12-14T21:20:03.489189Z",
     "shell.execute_reply": "2022-12-14T21:20:03.488581Z"
    },
    "id": "0nbI5DtDGw-i"
   },
   "outputs": [],
   "source": [
    "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "# you may not use this file except in compliance with the License.\n",
    "# You may obtain a copy of the License at\n",
    "#\n",
    "# https://www.apache.org/licenses/LICENSE-2.0\n",
    "#\n",
    "# Unless required by applicable law or agreed to in writing, software\n",
    "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "# See the License for the specific language governing permissions and\n",
    "# limitations under the License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AOpGoE2T-YXS"
   },
   "source": [
    "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
    "  <td>     <a target=\"_blank\" href=\"https://www.tensorflow.org/tutorials/text/word2vec\">     <img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\">TensorFlow.org에서 보기</a>\n",
    "</td>\n",
    "  <td>     <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/ko/tutorials/text/word2vec.ipynb\">     <img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\">Google Colab에서 실행하기</a>\n",
    "</td>\n",
    "  <td>     <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/ko/tutorials/text/word2vec.ipynb\">     <img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\">GitHub에서 소스 보기</a>\n",
    "</td>\n",
    "  <td>     <a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/ko/tutorials/text/word2vec.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\">노트북 다운로드</a>\n",
    "</td>\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "haJUNjSB60Kh"
   },
   "source": [
    "# word2vec"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "99d4ky2lWFvn"
   },
   "source": [
    "word2vec은 단일 알고리즘이 아니며 그보다는 대규모 데이터세트에서 단어 임베딩을 학습하는 데 사용할 수 있는 모델 아키텍처 및 최적화 제품군입니다. word2vec을 통해 학습한 임베딩은 여러 다운스트림 자연어 처리 작업에서 성공적인 것으로 입증되었습니다.\n",
    "\n",
    "참고: 이 튜토리얼은 [벡터 공간의 단어 표현 효율적인 평가](https://arxiv.org/pdf/1301.3781.pdf) 및 [단어 및 구문의 분산된 표현 및 구성성](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf)에 기반합니다. 이것은 논문에 대한 정확한 구현은 아닙니다. 그보다는 주요 아이디어를 설명하기 위한 것입니다.\n",
    "\n",
    "이러한 논문들은 단어 표현을 학습하는 데 다음과 같은 두 가지 메서드를 제안합니다.\n",
    "\n",
    "- **지속적인 bag-of-words 모델**: 주변의 콘텍스트 단어를 바탕으로 중간 단어를 예측합니다. 콘텍스트는 현재(중간) 단어의 앞과 뒤의 몇몇 단어로 구성되어 있습니다. 이 아키텍처는 콘텍스트의 단어 순서가 중요하지 않기 때문에 bag-of-words 모델이라고 불립니다.\n",
    "- **지속적인 skip-gram 모델**: 동일한 문장 내의 현재 단어의 앞과 뒤 일정 범위 내의 단어를 예측합니다. 이에 대한 작업 예제는 아래와 같습니다.\n",
    "\n",
    "이 튜토리얼에서는 skip-gram 접근 방식을 사용할 것입니다. 우선, 묘사를 위한 단일 문장을 사용해 skip-gram과 다른 개념을 살펴보겠습니다. 다음으로, 작은 데이터세트에서 자신의 word2vec 모델을 훈련합니다. 이 튜토리얼은 또한 훈련된 임베딩을 내보내기하고 [TensorFlow Embedding Projector](http://projector.tensorflow.org/)에서 시각화하는 코드를 포함합니다.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "xP00WlaMWBZC"
   },
   "source": [
    "## Skip-gram 및 네거티브 샘플링 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Zr2wjv0bW236"
   },
   "source": [
    "bag-of-words 모델이 주변의 콘텍스트가 주어지면 단어를 예측하는 한편, skip-gram 모델은 단어 자체가 주어지면 단어의 콘텍스트(또는 주변)을 예측합니다. 모델은 토큰을 생략할 수 있는 n-grams인 skip-grams에서 훈련됩니다(예제는 아래 다이어그램 참조). 단어의 콘텍스트는 `context_word`가 `target_word`의 주변 콘텍스트에서 나타나는 `(target_word, context_word)`의 일련의 skip-gram 쌍을 통해 표시될 수 있습니다. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ICjc-McbaVTd"
   },
   "source": [
    "여덟 단어의 다음 문장을 고려해 보세요.\n",
    "\n",
    "> The wide road shimmered in the hot sun.\n",
    "\n",
    "이 문장의 여덟 단어에 대한 각각의 콘텍스트 단어는 윈도 사이즈로 정의됩니다. 윈도 사이즈는 `context word`로 간주할 수 있는 `target_word`의 각 측면의 단어 범위로 결정됩니다. 아래는 다른 윈도 사이즈를 바탕으로 한 대상 단어의 skip-grams에 대한 표입니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "YKE87IKT_YT8"
   },
   "source": [
    "참고: 이 튜토리얼의 경우, `n`의 윈도 사이즈는 한 단어에 2*n+1개의 단어인 총 윈도 범위와 함께 각 측면에 n개의 단어를 내포합니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "RsCwQ07E8mqU"
   },
   "source": [
    "![word2vec_skipgrams](https://tensorflow.org/tutorials/text/images/word2vec_skipgram.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gK1gN1jwkMpU"
   },
   "source": [
    "skip-gram 모델의 훈련 오브젝티브는 주어진 대상 단어의 콘텍스트 단어를 예측하는 확률을 최대화하는 것입니다. 단어 *w<sub>1</sub>, w<sub>2</sub>, ... w<sub>T</sub>* 시퀀스의 경우, 오브젝티브는 평균 로그 확률대로 작성될 수 있습니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pILO_iAc84e-"
   },
   "source": [
    "![word2vec_skipgram_objective](https://tensorflow.org/tutorials/text/images/word2vec_skipgram_objective.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Gsy6TUbtnz_K"
   },
   "source": [
    "여기에서 `c`는 훈련 콘텍스트의 사이즈입니다. 기본 skip-gram 공식은 softmax 함수를 사용해 이 확률을 정의합니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "P81Qavbb9APd"
   },
   "source": [
    "![word2vec_full_softmax](https://tensorflow.org/tutorials/text/images/word2vec_full_softmax.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "axZvd-hhotVB"
   },
   "source": [
    "여기에서 *v* 및 *v<sup>'</sup>*<sup></sup>는 단어의 대상 및 콘텍스트 벡터 표현이며 *W*는 어휘 사이즈입니다. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SoLzxbqSpT6_"
   },
   "source": [
    "이 공식에 대한 분모를 계산하는 것은 종종 큰 (10<sup>5</sup>-10<sup>7</sup>) 항인 전체 어휘 단어에 대한 전체 softmax를 수행하는 것을 포함합니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Y5VWYtmFzHkU"
   },
   "source": [
    "[잡음 대조 예측](https://www.tensorflow.org/api_docs/python/tf/nn/nce_loss)(NCE) 손실 함수는 전체 softmax에 대한 효율적인 예측입니다. 단어 분포를 모델링하는 대신 단어 임베딩을 학습하기 위한 오브젝티브를 통해 NCE 손실은 [단순화](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf)되어 네거티브 샘플링을 사용할 수 있습니다. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WTZBPf1RsOsg"
   },
   "source": [
    "대상 단어에 대한 단순화된 네거티브 샘플링 오브젝티브는 단어의 잡음 분포 *P<sub>n</sub>(w)*에서 가져온 `num_ns` 네거티브 샘플의 콘텍스트 단어를 구별하는 것입니다. 더 명확하게 말하자면, 어휘에 대한 전체 softmax의 효율적인 근사치는 skip-gram 쌍의 경우 콘텍스트 단어 및 `num_ns` 네거티브 샘플 사이의 분류 문제로 대상 단어에 대한 손실을 제기하는 것입니다. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Cl0rSfHjt6Mf"
   },
   "source": [
    "네거티브 샘플은 `(target_word, context_word)` 쌍으로 정의되어 `target_word`의 `window_size` 주변에 `context_word`가 표시되지 않습니다. 예제 문장의 경우, 몇몇 잠재적인 네거티브 샘플이 있습니다(`window_size`가 `2`인 경우).\n",
    "\n",
    "```\n",
    "(hot, shimmered)\n",
    "(wide, hot)\n",
    "(wide, sun)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kq0q2uqbucFg"
   },
   "source": [
    "다음 섹션에서는, 단일 문장에 대한 skip-grams 및 네거티브 샘플을 생성합니다. 또한 하위 샘플링 기술에 대해 배우고 이 튜토리얼에서 추후에 포지티브 및 테거티브 훈련 예제에 대한 분류 모델을 훈련합니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "mk4-Hpe1CH16"
   },
   "source": [
    "## 설치"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:03.494518Z",
     "iopub.status.busy": "2022-12-14T21:20:03.494104Z",
     "iopub.status.idle": "2022-12-14T21:20:05.603360Z",
     "shell.execute_reply": "2022-12-14T21:20:05.602624Z"
    },
    "id": "RutaI-Tpev3T"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2022-12-14 21:20:04.536976: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory\n",
      "2022-12-14 21:20:04.537078: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory\n",
      "2022-12-14 21:20:04.537088: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n"
     ]
    }
   ],
   "source": [
    "import io\n",
    "import re\n",
    "import string\n",
    "import tqdm\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "import tensorflow as tf\n",
    "from tensorflow.keras import layers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.607486Z",
     "iopub.status.busy": "2022-12-14T21:20:05.607072Z",
     "iopub.status.idle": "2022-12-14T21:20:05.612882Z",
     "shell.execute_reply": "2022-12-14T21:20:05.612232Z"
    },
    "id": "10pyUMFkGKVQ"
   },
   "outputs": [],
   "source": [
    "# Load the TensorBoard notebook extension\n",
    "%load_ext tensorboard"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.616595Z",
     "iopub.status.busy": "2022-12-14T21:20:05.615933Z",
     "iopub.status.idle": "2022-12-14T21:20:05.619248Z",
     "shell.execute_reply": "2022-12-14T21:20:05.618658Z"
    },
    "id": "XkJ5299Tek6B"
   },
   "outputs": [],
   "source": [
    "SEED = 42\n",
    "AUTOTUNE = tf.data.AUTOTUNE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "RW-g5buCHwh3"
   },
   "source": [
    "### 예제 문장 벡터화"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "y8TfZIgoQrcP"
   },
   "source": [
    "다음 문장을 고려해 보세요.\n",
    "\n",
    "> The wide road shimmered in the hot sun.\n",
    "\n",
    "문장을 토큰화합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.622790Z",
     "iopub.status.busy": "2022-12-14T21:20:05.622248Z",
     "iopub.status.idle": "2022-12-14T21:20:05.625990Z",
     "shell.execute_reply": "2022-12-14T21:20:05.625410Z"
    },
    "id": "bsl7jBzV6_KK"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "8\n"
     ]
    }
   ],
   "source": [
    "sentence = \"The wide road shimmered in the hot sun\"\n",
    "tokens = list(sentence.lower().split())\n",
    "print(len(tokens))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PU-bs1XtThEw"
   },
   "source": [
    "어휘를 생성하여 토큰에서 정수 인덱스로 매핑을 저장합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.629487Z",
     "iopub.status.busy": "2022-12-14T21:20:05.628854Z",
     "iopub.status.idle": "2022-12-14T21:20:05.633754Z",
     "shell.execute_reply": "2022-12-14T21:20:05.633069Z"
    },
    "id": "UdYv1HJUQ8XA"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'<pad>': 0, 'the': 1, 'wide': 2, 'road': 3, 'shimmered': 4, 'in': 5, 'hot': 6, 'sun': 7}\n"
     ]
    }
   ],
   "source": [
    "vocab, index = {}, 1  # start indexing from 1\n",
    "vocab['<pad>'] = 0  # add a padding token\n",
    "for token in tokens:\n",
    "  if token not in vocab:\n",
    "    vocab[token] = index\n",
    "    index += 1\n",
    "vocab_size = len(vocab)\n",
    "print(vocab)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZpuP43Dddasr"
   },
   "source": [
    "정반대의 어휘를 생성하여 정수 인덱스에서 토큰으로 매핑을 저장합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.636749Z",
     "iopub.status.busy": "2022-12-14T21:20:05.636490Z",
     "iopub.status.idle": "2022-12-14T21:20:05.640430Z",
     "shell.execute_reply": "2022-12-14T21:20:05.639765Z"
    },
    "id": "o9ULAJYtEvKl"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{0: '<pad>', 1: 'the', 2: 'wide', 3: 'road', 4: 'shimmered', 5: 'in', 6: 'hot', 7: 'sun'}\n"
     ]
    }
   ],
   "source": [
    "inverse_vocab = {index: token for token, index in vocab.items()}\n",
    "print(inverse_vocab)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "n3qtuyxIRyii"
   },
   "source": [
    "문장을 벡터화합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.643830Z",
     "iopub.status.busy": "2022-12-14T21:20:05.643288Z",
     "iopub.status.idle": "2022-12-14T21:20:05.646938Z",
     "shell.execute_reply": "2022-12-14T21:20:05.646269Z"
    },
    "id": "CsB3-9uQQYyl"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 2, 3, 4, 5, 1, 6, 7]\n"
     ]
    }
   ],
   "source": [
    "example_sequence = [vocab[word] for word in tokens]\n",
    "print(example_sequence)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ox1I28JRIOdM"
   },
   "source": [
    "### 한 문장에서 skip-grams 생성하기"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "t7NNKAmSiHvy"
   },
   "source": [
    "`tf.keras.preprocessing.sequence` 모듈은 word2vec에 대한 데이터 준비를 단순화하는 유용한 함수를 제공합니다. `tf.keras.preprocessing.sequence.skipgrams`를 사용해 범위 `[0, vocab_size)`의 토큰에서 주어진 `window_size`를 통해 `example_sequence`에서 skip-gram 쌍을 생성할 수 있습니다.\n",
    "\n",
    "참고: 이 함수로 생성된 네거티브 샘플을 배칭하려면 약간의 코드가 필요하기 때문에 `negative_samples`가 여기 `0`에 설정되었습니다. 다음 섹션에서 네거티브 샘플링 수행을 위해 다른 함수를 사용할 것입니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.650494Z",
     "iopub.status.busy": "2022-12-14T21:20:05.649923Z",
     "iopub.status.idle": "2022-12-14T21:20:05.654211Z",
     "shell.execute_reply": "2022-12-14T21:20:05.653369Z"
    },
    "id": "USAJxW4RD7pn"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "26\n"
     ]
    }
   ],
   "source": [
    "window_size = 2\n",
    "positive_skip_grams, _ = tf.keras.preprocessing.sequence.skipgrams(\n",
    "      example_sequence,\n",
    "      vocabulary_size=vocab_size,\n",
    "      window_size=window_size,\n",
    "      negative_samples=0)\n",
    "print(len(positive_skip_grams))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "uc9uhiMwY-AQ"
   },
   "source": [
    "몇몇 네거티브 skip-grams을 프린트합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.657803Z",
     "iopub.status.busy": "2022-12-14T21:20:05.657264Z",
     "iopub.status.idle": "2022-12-14T21:20:05.661320Z",
     "shell.execute_reply": "2022-12-14T21:20:05.660496Z"
    },
    "id": "SCnqEukIE9pt"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(2, 1): (wide, the)\n",
      "(1, 7): (the, sun)\n",
      "(1, 3): (the, road)\n",
      "(6, 7): (hot, sun)\n",
      "(3, 5): (road, in)\n"
     ]
    }
   ],
   "source": [
    "for target, context in positive_skip_grams[:5]:\n",
    "  print(f\"({target}, {context}): ({inverse_vocab[target]}, {inverse_vocab[context]})\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "_ua9PkMTISF0"
   },
   "source": [
    "### 하나의 skip-gram에 대한 네거티브 샘플링 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Esqn8WBfZnEK"
   },
   "source": [
    "`skipgrams` 함수는 주어진 윈도 범위를 슬라이딩하여 모든 포지티브 skip-gram 쌍을 반환합니다. 훈련을 위한 네거티브 샘플 역할을 할 추가 skip-gram 쌍을 생성하려면 어휘에서 랜덤 단어를 샘플링해야 합니다. `tf.random.log_uniform_candidate_sampler` 함수를 사용해 윈도의 주어진 대상 단어에 대한 네거티브 샘플 `num_ns`개를 샘플링합니다. 하나의 skip-gram의 대상 단어에서 함수를 호출하고 true 클래스로 콘텍스트 단어를 전달해 샘플링에서 제외할 수 있습니다.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AgH3aSvw3xTD"
   },
   "source": [
    "주요 포인트: `[2, 5]` 범위에는 `num_ns`가 더 큰 규모의 데이터세트에 충분한 반면 `[5, 20]` 범위의 `num_ns` (포지티브 콘텍스트 단어당 네거티브 샘플의 수)는 더 작은 규모의 데이터세트에 최적으로 [작동하는 것으로 보입니다](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:05.665171Z",
     "iopub.status.busy": "2022-12-14T21:20:05.664562Z",
     "iopub.status.idle": "2022-12-14T21:20:09.320623Z",
     "shell.execute_reply": "2022-12-14T21:20:09.319824Z"
    },
    "id": "m_LmdzqIGr5L"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor([2 1 4 3], shape=(4,), dtype=int64)\n",
      "['wide', 'the', 'shimmered', 'road']\n"
     ]
    }
   ],
   "source": [
    "# Get target and context words for one positive skip-gram.\n",
    "target_word, context_word = positive_skip_grams[0]\n",
    "\n",
    "# Set the number of negative samples per positive context.\n",
    "num_ns = 4\n",
    "\n",
    "context_class = tf.reshape(tf.constant(context_word, dtype=\"int64\"), (1, 1))\n",
    "negative_sampling_candidates, _, _ = tf.random.log_uniform_candidate_sampler(\n",
    "    true_classes=context_class,  # class that should be sampled as 'positive'\n",
    "    num_true=1,  # each positive skip-gram has 1 positive context class\n",
    "    num_sampled=num_ns,  # number of negative context words to sample\n",
    "    unique=True,  # all the negative samples should be unique\n",
    "    range_max=vocab_size,  # pick index of the samples from [0, vocab_size]\n",
    "    seed=SEED,  # seed for reproducibility\n",
    "    name=\"negative_sampling\"  # name of this operation\n",
    ")\n",
    "print(negative_sampling_candidates)\n",
    "print([inverse_vocab[index.numpy()] for index in negative_sampling_candidates])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8MSxWCrLIalp"
   },
   "source": [
    "### 하나의 훈련 예제 구성하기"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Q6uEWdj8vKKv"
   },
   "source": [
    "주어진 포지티브 `(target_word, context_word)` skip-gram의 경우, 이제 또한 `target_word`의 윈도 사이즈 주변에 표시되지 않는 `num_ns` 네거티브 샘플링된 콘텍스트 단어도 있습니다. `1` 포지티브 `context_word` 및 `num_ns` 네거티브 콘텍스트 단어를 하나의 텐서로 배치합니다. 이는 각 대상 단어에 대한 일련의 포지티브 skip-grams(`1`로 레이블링 됨) 및 네거티브 샘플(`0`으로 레이블링 됨)을 생성합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.324274Z",
     "iopub.status.busy": "2022-12-14T21:20:09.323998Z",
     "iopub.status.idle": "2022-12-14T21:20:09.330788Z",
     "shell.execute_reply": "2022-12-14T21:20:09.330100Z"
    },
    "id": "zSiZwifuLvHf"
   },
   "outputs": [],
   "source": [
    "# Reduce a dimension so you can use concatenation (in the next step).\n",
    "squeezed_context_class = tf.squeeze(context_class, 1)\n",
    "\n",
    "# Concatenate a positive context word with negative sampled words.\n",
    "context = tf.concat([squeezed_context_class, negative_sampling_candidates], 0)\n",
    "\n",
    "# Label the first context word as `1` (positive) followed by `num_ns` `0`s (negative).\n",
    "label = tf.constant([1] + [0]*num_ns, dtype=\"int64\")\n",
    "target = target_word\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "OIJeoFCAwtXJ"
   },
   "source": [
    "위의 skip-gram 예제의 대상 단어에 대한 콘텍스트와 해당 레이블을 확인합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.334068Z",
     "iopub.status.busy": "2022-12-14T21:20:09.333808Z",
     "iopub.status.idle": "2022-12-14T21:20:09.341299Z",
     "shell.execute_reply": "2022-12-14T21:20:09.340568Z"
    },
    "id": "tzyCPCuZwmdL"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "target_index    : 2\n",
      "target_word     : wide\n",
      "context_indices : [1 2 1 4 3]\n",
      "context_words   : ['the', 'wide', 'the', 'shimmered', 'road']\n",
      "label           : [1 0 0 0 0]\n"
     ]
    }
   ],
   "source": [
    "print(f\"target_index    : {target}\")\n",
    "print(f\"target_word     : {inverse_vocab[target_word]}\")\n",
    "print(f\"context_indices : {context}\")\n",
    "print(f\"context_words   : {[inverse_vocab[c.numpy()] for c in context]}\")\n",
    "print(f\"label           : {label}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gBtTcUVQr8EO"
   },
   "source": [
    "`(target, context, label)` 텐서의 튜플은 skip-gram 네거티브 샘플링 word2vec 모델 훈련을 위한 하나의 훈련 예제로 구성되어 있습니다. 콘텍스트 및 레이블의 형태는 `(1+num_ns,)`인 반면 대상의 형태는 `(1,)`인 점에 주의하세요."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.344668Z",
     "iopub.status.busy": "2022-12-14T21:20:09.344390Z",
     "iopub.status.idle": "2022-12-14T21:20:09.348704Z",
     "shell.execute_reply": "2022-12-14T21:20:09.348035Z"
    },
    "id": "x-FwkR8jx9-Z"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "target  : 2\n",
      "context : tf.Tensor([1 2 1 4 3], shape=(5,), dtype=int64)\n",
      "label   : tf.Tensor([1 0 0 0 0], shape=(5,), dtype=int64)\n"
     ]
    }
   ],
   "source": [
    "print(\"target  :\", target)\n",
    "print(\"context :\", context)\n",
    "print(\"label   :\", label)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "4bRJIlow4Dlv"
   },
   "source": [
    "### 요약"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pWkuha0oykG5"
   },
   "source": [
    "이 다이어그램은 문장에서 훈련 예제를 생성하는 절차를 요약합니다.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "_KlwdiAa9crJ"
   },
   "source": [
    "![word2vec_negative_sampling](https://tensorflow.org/tutorials/text/images/word2vec_negative_sampling.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "37e53f07f67c"
   },
   "source": [
    "단어 `temperature` 및 `code`는 입력 문장의 일부가 아닌 점에 주의하세요. 이 단어들은 위의 다이어그램에서 사용된 특정 다른 인덱스처럼 어휘에 속합니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9wmdO_MEIpaM"
   },
   "source": [
    "## 모든 단계를 하나의 함수로 컴파일\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "iLKwNAczHsKg"
   },
   "source": [
    "### Skip-gram 샘플링 표 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TUUK3uDtFNFE"
   },
   "source": [
    "대규모 데이터세트는 불용어와 같은 빈도가 더 높은 단어의 수가 더 많은 더 큰 규모의 어휘를 의미합니다. 흔히 발생하는 단어(예: `the`, `is`, `on`) 샘플링에서 얻은 예제를 훈련하는 것은 모델이 학습할 유용한 정보를 더해주지 않습니다. [Mikolov 등](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf)은 임베딩 품질을 개선하기 위해 유용한 방법으로 자주 사용하는 단어의 하위 샘플링을 제안합니다. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "bPtbv7zNP7Dx"
   },
   "source": [
    "`tf.keras.preprocessing.sequence.skipgrams` 함수는 샘플링 표 인수를 허용하여 모든 토큰을 샘플링 하는 확률을 인코딩합니다. `tf.keras.preprocessing.sequence.make_sampling_table`을 사용해 확률적 샘플링 표를 기반으로 한 단어 빈도 순위를 생성하고 이를 `skipgrams` 함수에 전달할 수 있습니다. 10의 `vocab_size`에 대한 샘플링 확률을 검사합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.352704Z",
     "iopub.status.busy": "2022-12-14T21:20:09.352436Z",
     "iopub.status.idle": "2022-12-14T21:20:09.356820Z",
     "shell.execute_reply": "2022-12-14T21:20:09.356178Z"
    },
    "id": "Rn9zAnDccyRg"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0.00315225 0.00315225 0.00547597 0.00741556 0.00912817 0.01068435\n",
      " 0.01212381 0.01347162 0.01474487 0.0159558 ]\n"
     ]
    }
   ],
   "source": [
    "sampling_table = tf.keras.preprocessing.sequence.make_sampling_table(size=10)\n",
    "print(sampling_table)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EHvSptcPk5fp"
   },
   "source": [
    "`sampling_table[i]`은 데이터세트의 i번째로 가장 흔한 단어를 샘플링 할 확률을 의미합니다. 함수는 샘플링을 위한 단어 빈도의 [Zipf 분포](https://en.wikipedia.org/wiki/Zipf%27s_law)를 추정합니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "mRHMssMmHgH-"
   },
   "source": [
    "주요 포인트: `tf.random.log_uniform_candidate_sampler`는 이미 어휘 빈도가 로그 균일(Zipf) 분포를 따른다고 가정합니다. 이러한 분포 가중 샘플링을 사용하는 것은 또한 네거티브 샘플링 오브젝티브를 훈련하는 데 더 단순한 손실 함수로 잡음 대조 추정(NCE) 손실의 근사치를 계산하는 데 도움이 됩니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "aj--8RFK6fgW"
   },
   "source": [
    "### 훈련 데이터 생성하기"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "dy5hl4lQ0B2M"
   },
   "source": [
    "위에 설명된 모든 단계를 모든 텍스트 데이터세트에서 획득한 벡터화된 문장의 목록에 호출할 수 있는 함수로 컴파일합니다. 샘플링 표가 skip-gram 단어 쌍을 샘플링 하기 전에 빌드되었다는 점에 주의하세요. 이 함수는 다음 섹션에서 사용합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.360294Z",
     "iopub.status.busy": "2022-12-14T21:20:09.360039Z",
     "iopub.status.idle": "2022-12-14T21:20:09.366807Z",
     "shell.execute_reply": "2022-12-14T21:20:09.366173Z"
    },
    "id": "63INISDEX1Hu"
   },
   "outputs": [],
   "source": [
    "# Generates skip-gram pairs with negative sampling for a list of sequences\n",
    "# (int-encoded sentences) based on window size, number of negative samples\n",
    "# and vocabulary size.\n",
    "def generate_training_data(sequences, window_size, num_ns, vocab_size, seed):\n",
    "  # Elements of each training example are appended to these lists.\n",
    "  targets, contexts, labels = [], [], []\n",
    "\n",
    "  # Build the sampling table for `vocab_size` tokens.\n",
    "  sampling_table = tf.keras.preprocessing.sequence.make_sampling_table(vocab_size)\n",
    "\n",
    "  # Iterate over all sequences (sentences) in the dataset.\n",
    "  for sequence in tqdm.tqdm(sequences):\n",
    "\n",
    "    # Generate positive skip-gram pairs for a sequence (sentence).\n",
    "    positive_skip_grams, _ = tf.keras.preprocessing.sequence.skipgrams(\n",
    "          sequence,\n",
    "          vocabulary_size=vocab_size,\n",
    "          sampling_table=sampling_table,\n",
    "          window_size=window_size,\n",
    "          negative_samples=0)\n",
    "\n",
    "    # Iterate over each positive skip-gram pair to produce training examples\n",
    "    # with a positive context word and negative samples.\n",
    "    for target_word, context_word in positive_skip_grams:\n",
    "      context_class = tf.expand_dims(\n",
    "          tf.constant([context_word], dtype=\"int64\"), 1)\n",
    "      negative_sampling_candidates, _, _ = tf.random.log_uniform_candidate_sampler(\n",
    "          true_classes=context_class,\n",
    "          num_true=1,\n",
    "          num_sampled=num_ns,\n",
    "          unique=True,\n",
    "          range_max=vocab_size,\n",
    "          seed=seed,\n",
    "          name=\"negative_sampling\")\n",
    "\n",
    "      # Build context and label vectors (for one target word)\n",
    "      context = tf.concat([tf.squeeze(context_class,1), negative_sampling_candidates], 0)\n",
    "      label = tf.constant([1] + [0]*num_ns, dtype=\"int64\")\n",
    "\n",
    "      # Append each element from the training example to global lists.\n",
    "      targets.append(target_word)\n",
    "      contexts.append(context)\n",
    "      labels.append(label)\n",
    "\n",
    "  return targets, contexts, labels"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "shvPC8Ji2cMK"
   },
   "source": [
    "## word2vec에 대한 훈련 데이터 준비하기"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "j5mbZsZu6uKg"
   },
   "source": [
    "word2vec 모델에 기반한 skip-gram 네거티브 샘플링을 위한 하나의 문장으로 작업하는 방법을 이해하여 더 큰 규모의 문장 목록에서 훈련 예제를 생성할 수 있습니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "OFlikI6L26nh"
   },
   "source": [
    "### 텍스트 말뭉치 다운로드\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "rEFavOgN98al"
   },
   "source": [
    "이 튜토리얼에서는 Shakespeare가 작성한 텍스트 파일을 사용합니다. 다음 라인을 변경하여 자신의 데이터에 대해 이 코드를 실행하세요."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.370129Z",
     "iopub.status.busy": "2022-12-14T21:20:09.369878Z",
     "iopub.status.idle": "2022-12-14T21:20:09.419502Z",
     "shell.execute_reply": "2022-12-14T21:20:09.418840Z"
    },
    "id": "QFkitxzVVaAi"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      "   8192/1115394 [..............................] - ETA: 0s"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "1115394/1115394 [==============================] - 0s 0us/step\n"
     ]
    }
   ],
   "source": [
    "path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "sOsbLq8a37dr"
   },
   "source": [
    "파일에서 텍스트를 읽고 처음 몇 개의 라인을 프린트합니다. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.422884Z",
     "iopub.status.busy": "2022-12-14T21:20:09.422568Z",
     "iopub.status.idle": "2022-12-14T21:20:09.432473Z",
     "shell.execute_reply": "2022-12-14T21:20:09.431861Z"
    },
    "id": "lfgnsUw3ofMD"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "First Citizen:\n",
      "Before we proceed any further, hear me speak.\n",
      "\n",
      "All:\n",
      "Speak, speak.\n",
      "\n",
      "First Citizen:\n",
      "You are all resolved rather to die than to famish?\n",
      "\n",
      "All:\n",
      "Resolved. resolved.\n",
      "\n",
      "First Citizen:\n",
      "First, you know Caius Marcius is chief enemy to the people.\n",
      "\n",
      "All:\n",
      "We know't, we know't.\n",
      "\n",
      "First Citizen:\n",
      "Let us kill him, and we'll have corn at our own price.\n"
     ]
    }
   ],
   "source": [
    "with open(path_to_file) as f:\n",
    "  lines = f.read().splitlines()\n",
    "for line in lines[:20]:\n",
    "  print(line)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gTNZYqUs5C2V"
   },
   "source": [
    "공백이 없는 라인을 사용해 다음 단계를 위해 `tf.data.TextLineDataset` 객체를 구성합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.435742Z",
     "iopub.status.busy": "2022-12-14T21:20:09.435485Z",
     "iopub.status.idle": "2022-12-14T21:20:09.476372Z",
     "shell.execute_reply": "2022-12-14T21:20:09.475579Z"
    },
    "id": "ViDrwy-HjAs9"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.\n",
      "Instructions for updating:\n",
      "Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089\n"
     ]
    }
   ],
   "source": [
    "text_ds = tf.data.TextLineDataset(path_to_file).filter(lambda x: tf.cast(tf.strings.length(x), bool))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vfsc88zE9upk"
   },
   "source": [
    "### 말뭉치에서 문장 벡터화"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XfgZo8zR94KK"
   },
   "source": [
    "`TextVectorization` 레이어를 사용하여 말뭉치의 문장을 벡터화할 수 있습니다. [텍스트 분류](https://www.tensorflow.org/tutorials/keras/text_classification) 튜토리얼에서 이 레이어 사용 방법에 대해 더 자세히 알아보세요. 위의 처음 몇몇 문장에서 텍스트는 한 경우에 사용해야 하고 구두점은 없어야 한다는 점을 알 수 있습니다. 이렇게 하려면 TextVectorization 레이어에서 사용할 수 있는 `custom_standardization function`을 정의합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.480208Z",
     "iopub.status.busy": "2022-12-14T21:20:09.479939Z",
     "iopub.status.idle": "2022-12-14T21:20:09.495062Z",
     "shell.execute_reply": "2022-12-14T21:20:09.494415Z"
    },
    "id": "2MlsXzo-ZlfK"
   },
   "outputs": [],
   "source": [
    "# Now, create a custom standardization function to lowercase the text and\n",
    "# remove punctuation.\n",
    "def custom_standardization(input_data):\n",
    "  lowercase = tf.strings.lower(input_data)\n",
    "  return tf.strings.regex_replace(lowercase,\n",
    "                                  '[%s]' % re.escape(string.punctuation), '')\n",
    "\n",
    "\n",
    "# Define the vocabulary size and the number of words in a sequence.\n",
    "vocab_size = 4096\n",
    "sequence_length = 10\n",
    "\n",
    "# Use the `TextVectorization` layer to normalize, split, and map strings to\n",
    "# integers. Set the `output_sequence_length` length to pad all samples to the\n",
    "# same length.\n",
    "vectorize_layer = layers.TextVectorization(\n",
    "    standardize=custom_standardization,\n",
    "    max_tokens=vocab_size,\n",
    "    output_mode='int',\n",
    "    output_sequence_length=sequence_length)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "g92LuvnyBmz1"
   },
   "source": [
    "텍스트 데이터세트에서 `TextVectorization.adapt`를 호출하여 어휘를 생성합니다.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:09.498735Z",
     "iopub.status.busy": "2022-12-14T21:20:09.498471Z",
     "iopub.status.idle": "2022-12-14T21:20:10.824155Z",
     "shell.execute_reply": "2022-12-14T21:20:10.823165Z"
    },
    "id": "seZau_iYMPFT"
   },
   "outputs": [],
   "source": [
    "vectorize_layer.adapt(text_ds.batch(1024))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jg2z7eeHMnH-"
   },
   "source": [
    "레이어의 상태가 텍스트 말뭉치를 나타내기 위해 조정되면 어휘는 `TextVectorization.get_vocabulary`로 액세스할 수 있습니다. 이 함수는 빈도로 정렬된(내림차순) 모든 어휘 토큰의 목록을 반환합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:10.829048Z",
     "iopub.status.busy": "2022-12-14T21:20:10.828284Z",
     "iopub.status.idle": "2022-12-14T21:20:10.839534Z",
     "shell.execute_reply": "2022-12-14T21:20:10.838646Z"
    },
    "id": "jgw9pTA7MRaU"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['', '[UNK]', 'the', 'and', 'to', 'i', 'of', 'you', 'my', 'a', 'that', 'in', 'is', 'not', 'for', 'with', 'me', 'it', 'be', 'your']\n"
     ]
    }
   ],
   "source": [
    "# Save the created vocabulary for reference.\n",
    "inverse_vocab = vectorize_layer.get_vocabulary()\n",
    "print(inverse_vocab[:20])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "DOQ30Tx6KA2G"
   },
   "source": [
    "`vectorize_layer`는 이제 `text_ds`( `tf.data.Dataset`)의 각 요소에 대한 벡터를 생성하는 데 사용할 수 있습니다. `Dataset.batch`, `Dataset.prefetch`, `Dataset.map` 및 `Dataset.unbatch`를 적용합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:10.843175Z",
     "iopub.status.busy": "2022-12-14T21:20:10.842517Z",
     "iopub.status.idle": "2022-12-14T21:20:10.890312Z",
     "shell.execute_reply": "2022-12-14T21:20:10.889395Z"
    },
    "id": "yUVYrDp0araQ"
   },
   "outputs": [],
   "source": [
    "# Vectorize the data in text_ds.\n",
    "text_vector_ds = text_ds.batch(1024).prefetch(AUTOTUNE).map(vectorize_layer).unbatch()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7YyH_SYzB72p"
   },
   "source": [
    "### 데이터세트에서 시퀀스 획득하기"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "NFUQLX0_KaRC"
   },
   "source": [
    "이제 정수 인코딩된 문장의 `tf.data.Dataset`가 있습니다. word2vec 모델을 훈련하기 위해 데이터세트를 준비하려면 데이터세트를 문장 벡터 시퀀스 목록으로 평면화합니다. 이 단계는 데이터세트의 각 문장을 반복하여 포지티브 및 네거티브 예제를 생성하기 때문에 필요합니다.\n",
    "\n",
    "참고: 이전에 정의된 `generate_training_data()`가 TensorFlow가 아닌 Python/NumPy 함수를 사용하기 때문에, `tf.data.Dataset.map`과 함께 `tf.py_function` 또는 `tf.numpy_function`도 사용할 수 있습니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:10.894781Z",
     "iopub.status.busy": "2022-12-14T21:20:10.894023Z",
     "iopub.status.idle": "2022-12-14T21:20:15.558210Z",
     "shell.execute_reply": "2022-12-14T21:20:15.557448Z"
    },
    "id": "sGXoOh9y11pM"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "32777\n"
     ]
    }
   ],
   "source": [
    "sequences = list(text_vector_ds.as_numpy_iterator())\n",
    "print(len(sequences))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "tDc4riukLTqg"
   },
   "source": [
    "`sequences`에서 몇몇 예제를 검사합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:15.561757Z",
     "iopub.status.busy": "2022-12-14T21:20:15.561443Z",
     "iopub.status.idle": "2022-12-14T21:20:15.566522Z",
     "shell.execute_reply": "2022-12-14T21:20:15.565807Z"
    },
    "id": "WZf1RIbB2Dfb"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 89 270   0   0   0   0   0   0   0   0] => ['first', 'citizen', '', '', '', '', '', '', '', '']\n",
      "[138  36 982 144 673 125  16 106   0   0] => ['before', 'we', 'proceed', 'any', 'further', 'hear', 'me', 'speak', '', '']\n",
      "[34  0  0  0  0  0  0  0  0  0] => ['all', '', '', '', '', '', '', '', '', '']\n",
      "[106 106   0   0   0   0   0   0   0   0] => ['speak', 'speak', '', '', '', '', '', '', '', '']\n",
      "[ 89 270   0   0   0   0   0   0   0   0] => ['first', 'citizen', '', '', '', '', '', '', '', '']\n"
     ]
    }
   ],
   "source": [
    "for seq in sequences[:5]:\n",
    "  print(f\"{seq} => {[inverse_vocab[i] for i in seq]}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "yDzSOjNwCWNh"
   },
   "source": [
    "### 시퀀스에서 훈련 예제 생성하기"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BehvYr-nEKyY"
   },
   "source": [
    "`sequences`는 이제 int로 인코딩된 문장의 목록입니다. 이전에 정의된 `generate_training_data` 함수를 호출하여 word2vec 모델에 대한 훈련 예제를 생성합니다. 요약하자면 함수는 각 시퀀스의 각 단어를 다시 반복하여 포지티브 및 네거티브 콘텍스트 단어를 수집합니다. 대상, 콘텍스트 및 레이블의 길이는 동일해야 하며 훈련 예제의 총수를 나타냅니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:20:15.570018Z",
     "iopub.status.busy": "2022-12-14T21:20:15.569754Z",
     "iopub.status.idle": "2022-12-14T21:21:12.602418Z",
     "shell.execute_reply": "2022-12-14T21:21:12.601634Z"
    },
    "id": "44DJ22M6nX5o"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  0%|          | 0/32777 [00:00<?, ?it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  0%|          | 41/32777 [00:00<02:21, 231.68it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  0%|          | 90/32777 [00:00<01:35, 342.96it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  0%|          | 148/32777 [00:00<01:16, 425.35it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  1%|          | 220/32777 [00:00<01:02, 524.09it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  1%|          | 276/32777 [00:00<01:02, 518.14it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  1%|          | 357/32777 [00:00<00:53, 604.47it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  1%|▏         | 433/32777 [00:00<00:49, 648.62it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  2%|▏         | 500/32777 [00:00<00:57, 565.78it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  2%|▏         | 590/32777 [00:01<00:49, 655.24it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  2%|▏         | 659/32777 [00:01<00:48, 660.40it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  2%|▏         | 728/32777 [00:01<00:50, 640.52it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  2%|▏         | 794/32777 [00:01<00:49, 640.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  3%|▎         | 865/32777 [00:01<00:48, 658.98it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  3%|▎         | 941/32777 [00:01<00:46, 687.82it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  3%|▎         | 1011/32777 [00:01<00:49, 646.77it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  3%|▎         | 1089/32777 [00:01<00:46, 682.90it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  4%|▎         | 1176/32777 [00:01<00:43, 724.53it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  4%|▍         | 1250/32777 [00:02<00:43, 717.53it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  4%|▍         | 1328/32777 [00:02<00:42, 732.67it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  4%|▍         | 1402/32777 [00:02<00:42, 731.75it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  5%|▍         | 1493/32777 [00:02<00:40, 780.41it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  5%|▍         | 1572/32777 [00:02<00:40, 773.03it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  5%|▌         | 1650/32777 [00:02<00:42, 729.94it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  5%|▌         | 1730/32777 [00:02<00:41, 743.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  6%|▌         | 1817/32777 [00:02<00:40, 773.57it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  6%|▌         | 1895/32777 [00:02<00:40, 762.00it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  6%|▌         | 1972/32777 [00:02<00:42, 732.97it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  6%|▌         | 2046/32777 [00:03<00:43, 711.18it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  7%|▋         | 2143/32777 [00:03<00:39, 779.79it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  7%|▋         | 2222/32777 [00:03<00:40, 761.53it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  7%|▋         | 2299/32777 [00:03<00:40, 756.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  7%|▋         | 2403/32777 [00:03<00:36, 837.60it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  8%|▊         | 2511/32777 [00:03<00:33, 905.27it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  8%|▊         | 2603/32777 [00:03<00:34, 880.74it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  8%|▊         | 2723/32777 [00:03<00:31, 969.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  9%|▊         | 2821/32777 [00:03<00:38, 772.21it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  9%|▉         | 2923/32777 [00:04<00:35, 832.45it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  9%|▉         | 3013/32777 [00:04<00:37, 800.12it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "  9%|▉         | 3098/32777 [00:04<00:37, 792.94it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 10%|▉         | 3181/32777 [00:04<00:38, 766.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 10%|▉         | 3275/32777 [00:04<00:36, 811.61it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 10%|█         | 3359/32777 [00:04<00:40, 724.66it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 11%|█         | 3459/32777 [00:04<00:36, 794.97it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 11%|█         | 3542/32777 [00:04<00:43, 672.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 11%|█         | 3615/32777 [00:05<00:43, 670.60it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 11%|█         | 3686/32777 [00:05<00:43, 671.63it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 12%|█▏        | 3814/32777 [00:05<00:35, 820.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 12%|█▏        | 3900/32777 [00:05<00:36, 783.98it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 12%|█▏        | 3981/32777 [00:05<00:37, 769.77it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 12%|█▏        | 4060/32777 [00:05<00:39, 724.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 13%|█▎        | 4143/32777 [00:05<00:38, 747.73it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 13%|█▎        | 4219/32777 [00:05<00:41, 684.08it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 13%|█▎        | 4289/32777 [00:06<00:46, 610.07it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 13%|█▎        | 4353/32777 [00:06<00:47, 602.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 13%|█▎        | 4415/32777 [00:06<00:49, 571.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 14%|█▎        | 4474/32777 [00:06<00:53, 526.94it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 14%|█▍        | 4548/32777 [00:06<00:48, 579.81it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 14%|█▍        | 4635/32777 [00:06<00:43, 649.80it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 14%|█▍        | 4702/32777 [00:06<00:45, 613.97it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 15%|█▍        | 4780/32777 [00:06<00:42, 652.51it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 15%|█▍        | 4861/32777 [00:06<00:40, 695.40it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 15%|█▌        | 4932/32777 [00:07<00:44, 620.07it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 15%|█▌        | 5008/32777 [00:07<00:42, 656.29it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 15%|█▌        | 5076/32777 [00:07<00:46, 599.24it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 16%|█▌        | 5139/32777 [00:07<00:49, 553.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 16%|█▌        | 5204/32777 [00:07<00:47, 577.26it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 16%|█▌        | 5273/32777 [00:07<00:45, 606.03it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 16%|█▋        | 5336/32777 [00:07<00:45, 600.87it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 16%|█▋        | 5398/32777 [00:07<00:45, 606.13it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 17%|█▋        | 5460/32777 [00:07<00:45, 597.19it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 17%|█▋        | 5521/32777 [00:08<00:48, 566.86it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 17%|█▋        | 5579/32777 [00:08<00:47, 569.24it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 17%|█▋        | 5640/32777 [00:08<00:46, 577.49it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 17%|█▋        | 5703/32777 [00:08<00:46, 585.45it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 18%|█▊        | 5781/32777 [00:08<00:42, 641.11it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 18%|█▊        | 5862/32777 [00:08<00:39, 689.90it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 18%|█▊        | 5932/32777 [00:08<00:42, 631.98it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 18%|█▊        | 5997/32777 [00:08<00:44, 602.68it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 18%|█▊        | 6063/32777 [00:08<00:43, 611.05it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 19%|█▊        | 6136/32777 [00:09<00:41, 641.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 19%|█▉        | 6204/32777 [00:09<00:40, 648.90it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 19%|█▉        | 6270/32777 [00:09<00:40, 650.75it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 19%|█▉        | 6336/32777 [00:09<00:40, 649.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 20%|█▉        | 6416/32777 [00:09<00:38, 691.35it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 20%|█▉        | 6486/32777 [00:09<00:43, 610.01it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 20%|█▉        | 6549/32777 [00:09<00:43, 602.35it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 20%|██        | 6611/32777 [00:09<00:45, 574.66it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 20%|██        | 6677/32777 [00:09<00:44, 592.81it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 21%|██        | 6750/32777 [00:10<00:41, 627.88it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 21%|██        | 6818/32777 [00:10<00:40, 640.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 21%|██        | 6883/32777 [00:10<00:41, 625.86it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 21%|██▏       | 6976/32777 [00:10<00:36, 710.59it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 22%|██▏       | 7048/32777 [00:10<00:36, 700.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 22%|██▏       | 7119/32777 [00:10<00:37, 681.81it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 22%|██▏       | 7192/32777 [00:10<00:37, 686.49it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 22%|██▏       | 7261/32777 [00:10<00:37, 680.76it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 22%|██▏       | 7348/32777 [00:10<00:34, 734.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 23%|██▎       | 7422/32777 [00:11<00:37, 668.36it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 23%|██▎       | 7494/32777 [00:11<00:37, 678.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 23%|██▎       | 7563/32777 [00:11<00:39, 639.61it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 23%|██▎       | 7628/32777 [00:11<00:41, 608.96it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 23%|██▎       | 7690/32777 [00:11<00:42, 590.78it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 24%|██▎       | 7760/32777 [00:11<00:40, 619.34it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 24%|██▍       | 7823/32777 [00:11<00:42, 586.54it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 24%|██▍       | 7896/32777 [00:11<00:39, 623.69it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 24%|██▍       | 7960/32777 [00:11<00:41, 601.39it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 24%|██▍       | 8021/32777 [00:12<00:43, 567.52it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 25%|██▍       | 8115/32777 [00:12<00:37, 662.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 25%|██▍       | 8191/32777 [00:12<00:35, 684.26it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 25%|██▌       | 8270/32777 [00:12<00:34, 711.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 25%|██▌       | 8342/32777 [00:12<00:37, 647.49it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 26%|██▌       | 8409/32777 [00:12<00:43, 563.28it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 26%|██▌       | 8481/32777 [00:12<00:40, 598.97it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 26%|██▌       | 8551/32777 [00:12<00:38, 623.39it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 26%|██▋       | 8637/32777 [00:12<00:35, 685.47it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 27%|██▋       | 8708/32777 [00:13<00:41, 584.28it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 27%|██▋       | 8771/32777 [00:13<00:41, 577.32it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 27%|██▋       | 8832/32777 [00:13<00:41, 583.21it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 27%|██▋       | 8921/32777 [00:13<00:35, 665.22it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 27%|██▋       | 9000/32777 [00:13<00:34, 687.78it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 28%|██▊       | 9071/32777 [00:13<00:40, 588.36it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 28%|██▊       | 9134/32777 [00:13<00:40, 585.43it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 28%|██▊       | 9202/32777 [00:13<00:38, 606.67it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 28%|██▊       | 9265/32777 [00:14<00:41, 561.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 28%|██▊       | 9323/32777 [00:14<00:42, 549.72it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 29%|██▊       | 9385/32777 [00:14<00:41, 567.66it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 29%|██▉       | 9448/32777 [00:14<00:40, 580.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 29%|██▉       | 9507/32777 [00:14<00:41, 558.59it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 29%|██▉       | 9564/32777 [00:14<00:43, 531.18it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 29%|██▉       | 9618/32777 [00:14<00:47, 491.02it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 30%|██▉       | 9677/32777 [00:14<00:45, 509.99it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 30%|██▉       | 9737/32777 [00:14<00:43, 533.07it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 30%|██▉       | 9792/32777 [00:15<00:48, 476.86it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 30%|███       | 9847/32777 [00:15<00:46, 490.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 30%|███       | 9898/32777 [00:15<00:46, 487.13it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 30%|███       | 9955/32777 [00:15<00:44, 509.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 31%|███       | 10014/32777 [00:15<00:42, 530.27it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 31%|███       | 10068/32777 [00:15<00:43, 525.29it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 31%|███       | 10136/32777 [00:15<00:39, 569.68it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 31%|███       | 10194/32777 [00:15<00:41, 545.49it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 31%|███▏      | 10250/32777 [00:15<00:43, 516.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 31%|███▏      | 10305/32777 [00:16<00:42, 524.76it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 32%|███▏      | 10358/32777 [00:16<00:45, 497.13it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 32%|███▏      | 10409/32777 [00:16<00:47, 468.37it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 32%|███▏      | 10467/32777 [00:16<00:45, 492.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 32%|███▏      | 10517/32777 [00:16<00:45, 490.26it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 32%|███▏      | 10575/32777 [00:16<00:43, 505.75it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 32%|███▏      | 10642/32777 [00:16<00:40, 550.25it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 33%|███▎      | 10705/32777 [00:16<00:38, 571.55it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 33%|███▎      | 10773/32777 [00:16<00:36, 598.70it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 33%|███▎      | 10841/32777 [00:17<00:35, 621.96it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 33%|███▎      | 10919/32777 [00:17<00:32, 663.36it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 34%|███▎      | 10986/32777 [00:17<00:33, 651.94it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 34%|███▎      | 11052/32777 [00:17<00:34, 624.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 34%|███▍      | 11115/32777 [00:17<00:38, 558.07it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 34%|███▍      | 11173/32777 [00:17<00:42, 505.64it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 34%|███▍      | 11228/32777 [00:17<00:41, 516.08it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 34%|███▍      | 11281/32777 [00:17<00:42, 508.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 35%|███▍      | 11333/32777 [00:17<00:43, 489.80it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 35%|███▍      | 11413/32777 [00:18<00:37, 571.24it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 35%|███▌      | 11472/32777 [00:18<00:40, 528.20it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 35%|███▌      | 11529/32777 [00:18<00:39, 538.90it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 35%|███▌      | 11584/32777 [00:18<00:42, 497.51it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 36%|███▌      | 11643/32777 [00:18<00:40, 516.74it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 36%|███▌      | 11696/32777 [00:18<00:40, 518.17it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 36%|███▌      | 11760/32777 [00:18<00:38, 548.94it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 36%|███▌      | 11835/32777 [00:18<00:34, 605.40it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 36%|███▋      | 11897/32777 [00:18<00:37, 563.94it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 36%|███▋      | 11955/32777 [00:19<00:38, 545.93it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 37%|███▋      | 12011/32777 [00:19<00:37, 549.33it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 37%|███▋      | 12074/32777 [00:19<00:36, 570.25it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 37%|███▋      | 12134/32777 [00:19<00:35, 576.74it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 37%|███▋      | 12199/32777 [00:19<00:34, 595.25it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 37%|███▋      | 12259/32777 [00:19<00:37, 553.64it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 38%|███▊      | 12316/32777 [00:19<00:38, 530.76it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 38%|███▊      | 12372/32777 [00:19<00:38, 535.66it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 38%|███▊      | 12439/32777 [00:19<00:35, 571.23it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 38%|███▊      | 12512/32777 [00:20<00:32, 615.76it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 38%|███▊      | 12575/32777 [00:20<00:34, 587.06it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 39%|███▊      | 12641/32777 [00:20<00:33, 599.39it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 39%|███▉      | 12702/32777 [00:20<00:35, 573.20it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 39%|███▉      | 12762/32777 [00:20<00:34, 579.01it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 39%|███▉      | 12821/32777 [00:20<00:38, 524.26it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 39%|███▉      | 12892/32777 [00:20<00:34, 572.64it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 40%|███▉      | 12982/32777 [00:20<00:30, 655.98it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 40%|███▉      | 13050/32777 [00:20<00:31, 634.12it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 40%|████      | 13116/32777 [00:21<00:30, 639.77it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 40%|████      | 13188/32777 [00:21<00:29, 661.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 40%|████      | 13255/32777 [00:21<00:33, 578.26it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 41%|████      | 13347/32777 [00:21<00:29, 668.09it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 41%|████      | 13417/32777 [00:21<00:30, 643.61it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 41%|████      | 13484/32777 [00:21<00:31, 618.03it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 41%|████▏     | 13548/32777 [00:21<00:31, 606.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 42%|████▏     | 13610/32777 [00:21<00:32, 582.41it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 42%|████▏     | 13669/32777 [00:21<00:34, 549.02it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 42%|████▏     | 13738/32777 [00:22<00:32, 586.55it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 42%|████▏     | 13807/32777 [00:22<00:31, 610.71it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 42%|████▏     | 13869/32777 [00:22<00:32, 582.00it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 42%|████▏     | 13928/32777 [00:22<00:33, 569.43it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 43%|████▎     | 13986/32777 [00:22<00:33, 556.48it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 43%|████▎     | 14051/32777 [00:22<00:32, 577.83it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 43%|████▎     | 14125/32777 [00:22<00:30, 620.89it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 43%|████▎     | 14188/32777 [00:22<00:30, 608.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 43%|████▎     | 14250/32777 [00:22<00:30, 604.32it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 44%|████▎     | 14319/32777 [00:23<00:29, 625.63it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 44%|████▍     | 14393/32777 [00:23<00:28, 655.10it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 44%|████▍     | 14468/32777 [00:23<00:26, 680.50it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 44%|████▍     | 14552/32777 [00:23<00:25, 725.17it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 45%|████▍     | 14625/32777 [00:23<00:26, 674.87it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 45%|████▍     | 14694/32777 [00:23<00:28, 644.84it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 45%|████▌     | 14771/32777 [00:23<00:26, 679.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 45%|████▌     | 14840/32777 [00:23<00:26, 679.64it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 46%|████▌     | 14914/32777 [00:23<00:25, 690.72it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 46%|████▌     | 14984/32777 [00:24<00:27, 645.60it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 46%|████▌     | 15050/32777 [00:24<00:27, 643.41it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 46%|████▌     | 15115/32777 [00:24<00:31, 560.45it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 46%|████▋     | 15175/32777 [00:24<00:31, 566.28it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 47%|████▋     | 15250/32777 [00:24<00:28, 607.89it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 47%|████▋     | 15313/32777 [00:24<00:30, 565.41it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 47%|████▋     | 15373/32777 [00:24<00:30, 569.43it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 47%|████▋     | 15431/32777 [00:24<00:32, 540.68it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 47%|████▋     | 15504/32777 [00:24<00:29, 591.42it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 47%|████▋     | 15565/32777 [00:25<00:31, 542.22it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 48%|████▊     | 15629/32777 [00:25<00:30, 564.07it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 48%|████▊     | 15706/32777 [00:25<00:27, 619.67it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 48%|████▊     | 15783/32777 [00:25<00:25, 658.69it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 48%|████▊     | 15851/32777 [00:25<00:28, 584.37it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 49%|████▊     | 15933/32777 [00:25<00:26, 645.93it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 49%|████▉     | 16001/32777 [00:25<00:26, 642.84it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 49%|████▉     | 16069/32777 [00:25<00:25, 649.47it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 49%|████▉     | 16136/32777 [00:25<00:27, 616.27it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 49%|████▉     | 16211/32777 [00:26<00:25, 651.87it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 50%|████▉     | 16278/32777 [00:26<00:25, 644.41it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 50%|████▉     | 16344/32777 [00:26<00:25, 639.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 50%|█████     | 16409/32777 [00:26<00:29, 564.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 50%|█████     | 16469/32777 [00:26<00:28, 565.32it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 50%|█████     | 16551/32777 [00:26<00:25, 630.77it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 51%|█████     | 16616/32777 [00:26<00:25, 623.47it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 51%|█████     | 16683/32777 [00:26<00:25, 635.61it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 51%|█████     | 16748/32777 [00:26<00:25, 624.18it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 51%|█████▏    | 16811/32777 [00:27<00:25, 624.57it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 52%|█████▏    | 16898/32777 [00:27<00:23, 688.88it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 52%|█████▏    | 16976/32777 [00:27<00:22, 711.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 52%|█████▏    | 17065/32777 [00:27<00:20, 757.28it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 52%|█████▏    | 17141/32777 [00:27<00:22, 695.01it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 53%|█████▎    | 17222/32777 [00:27<00:21, 724.28it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 53%|█████▎    | 17296/32777 [00:27<00:22, 687.86it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 53%|█████▎    | 17366/32777 [00:27<00:25, 613.42it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 53%|█████▎    | 17430/32777 [00:27<00:27, 559.54it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 53%|█████▎    | 17488/32777 [00:28<00:28, 544.18it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 54%|█████▎    | 17544/32777 [00:28<00:28, 534.54it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 54%|█████▎    | 17599/32777 [00:28<00:29, 520.63it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 54%|█████▍    | 17652/32777 [00:28<00:29, 512.26it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 54%|█████▍    | 17704/32777 [00:28<00:31, 473.74it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 54%|█████▍    | 17767/32777 [00:28<00:29, 510.66it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 54%|█████▍    | 17819/32777 [00:28<00:29, 508.98it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 55%|█████▍    | 17891/32777 [00:28<00:26, 565.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 55%|█████▍    | 17950/32777 [00:28<00:26, 563.39it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 55%|█████▍    | 18007/32777 [00:29<00:27, 528.17it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 55%|█████▌    | 18070/32777 [00:29<00:26, 553.12it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 55%|█████▌    | 18126/32777 [00:29<00:29, 503.97it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 55%|█████▌    | 18178/32777 [00:29<00:31, 466.58it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 56%|█████▌    | 18244/32777 [00:29<00:28, 511.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 56%|█████▌    | 18297/32777 [00:29<00:29, 484.25it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 56%|█████▌    | 18353/32777 [00:29<00:28, 500.47it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 56%|█████▌    | 18409/32777 [00:29<00:27, 514.28it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 56%|█████▋    | 18465/32777 [00:30<00:27, 526.36it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 57%|█████▋    | 18535/32777 [00:30<00:24, 574.96it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 57%|█████▋    | 18626/32777 [00:30<00:21, 669.95it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 57%|█████▋    | 18698/32777 [00:30<00:20, 679.29it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 57%|█████▋    | 18767/32777 [00:30<00:25, 552.78it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 57%|█████▋    | 18827/32777 [00:30<00:25, 538.69it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 58%|█████▊    | 18894/32777 [00:30<00:24, 571.05it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 58%|█████▊    | 18966/32777 [00:30<00:22, 609.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 58%|█████▊    | 19030/32777 [00:30<00:23, 578.30it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 58%|█████▊    | 19095/32777 [00:31<00:22, 595.97it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 59%|█████▊    | 19178/32777 [00:31<00:20, 660.90it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 59%|█████▊    | 19246/32777 [00:31<00:20, 651.42it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 59%|█████▉    | 19313/32777 [00:31<00:20, 642.53it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 59%|█████▉    | 19378/32777 [00:31<00:21, 619.76it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 59%|█████▉    | 19441/32777 [00:31<00:22, 597.88it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 60%|█████▉    | 19503/32777 [00:31<00:22, 599.18it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 60%|█████▉    | 19571/32777 [00:31<00:21, 621.51it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 60%|█████▉    | 19634/32777 [00:31<00:22, 596.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 60%|██████    | 19702/32777 [00:31<00:21, 619.83it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 60%|██████    | 19765/32777 [00:32<00:21, 593.00it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 61%|██████    | 19833/32777 [00:32<00:21, 614.87it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 61%|██████    | 19895/32777 [00:32<00:22, 564.67it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 61%|██████    | 19968/32777 [00:32<00:21, 608.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 61%|██████    | 20031/32777 [00:32<00:22, 555.64it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 61%|██████▏   | 20089/32777 [00:32<00:24, 520.39it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 61%|██████▏   | 20143/32777 [00:32<00:24, 524.09it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 62%|██████▏   | 20197/32777 [00:32<00:24, 516.66it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 62%|██████▏   | 20258/32777 [00:33<00:23, 533.79it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 62%|██████▏   | 20312/32777 [00:33<00:23, 525.83it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 62%|██████▏   | 20367/32777 [00:33<00:23, 528.98it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 62%|██████▏   | 20430/32777 [00:33<00:22, 557.17it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 63%|██████▎   | 20487/32777 [00:33<00:21, 559.60it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 63%|██████▎   | 20558/32777 [00:33<00:20, 602.27it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 63%|██████▎   | 20628/32777 [00:33<00:19, 626.75it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 63%|██████▎   | 20702/32777 [00:33<00:18, 655.88it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 63%|██████▎   | 20778/32777 [00:33<00:17, 680.22it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 64%|██████▎   | 20861/32777 [00:33<00:16, 720.74it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 64%|██████▍   | 20934/32777 [00:34<00:17, 673.02it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 64%|██████▍   | 21002/32777 [00:34<00:18, 638.69it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 64%|██████▍   | 21077/32777 [00:34<00:17, 667.71it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 65%|██████▍   | 21145/32777 [00:34<00:18, 635.58it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 65%|██████▍   | 21225/32777 [00:34<00:16, 680.23it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 65%|██████▍   | 21303/32777 [00:34<00:16, 704.19it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 65%|██████▌   | 21384/32777 [00:34<00:15, 733.19it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 65%|██████▌   | 21461/32777 [00:34<00:15, 741.82it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 66%|██████▌   | 21539/32777 [00:34<00:15, 748.18it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 66%|██████▌   | 21628/32777 [00:35<00:14, 780.86it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 66%|██████▌   | 21714/32777 [00:35<00:13, 798.05it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 66%|██████▋   | 21794/32777 [00:35<00:14, 744.64it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 67%|██████▋   | 21870/32777 [00:35<00:16, 670.90it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 67%|██████▋   | 21939/32777 [00:35<00:17, 632.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 67%|██████▋   | 22007/32777 [00:35<00:16, 635.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 67%|██████▋   | 22082/32777 [00:35<00:16, 656.23it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 68%|██████▊   | 22149/32777 [00:35<00:18, 567.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 68%|██████▊   | 22209/32777 [00:36<00:18, 557.41it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 68%|██████▊   | 22268/32777 [00:36<00:18, 562.24it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 68%|██████▊   | 22326/32777 [00:36<00:18, 563.63it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 68%|██████▊   | 22384/32777 [00:36<00:27, 372.13it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 68%|██████▊   | 22431/32777 [00:36<00:26, 391.57it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 69%|██████▊   | 22496/32777 [00:36<00:22, 447.97it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 69%|██████▉   | 22567/32777 [00:36<00:20, 509.26it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 69%|██████▉   | 22625/32777 [00:36<00:19, 525.34it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 69%|██████▉   | 22688/32777 [00:37<00:18, 549.58it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 69%|██████▉   | 22750/32777 [00:37<00:17, 568.35it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 70%|██████▉   | 22819/32777 [00:37<00:16, 601.76it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 70%|██████▉   | 22911/32777 [00:37<00:14, 691.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 70%|███████   | 22995/32777 [00:37<00:13, 725.09it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 70%|███████   | 23069/32777 [00:37<00:13, 702.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 71%|███████   | 23162/32777 [00:37<00:12, 754.93it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 71%|███████   | 23239/32777 [00:37<00:13, 729.87it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 71%|███████   | 23313/32777 [00:37<00:14, 663.45it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 71%|███████▏  | 23381/32777 [00:37<00:14, 655.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 72%|███████▏  | 23448/32777 [00:38<00:14, 659.71it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 72%|███████▏  | 23526/32777 [00:38<00:13, 686.59it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 72%|███████▏  | 23604/32777 [00:38<00:12, 710.19it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 72%|███████▏  | 23676/32777 [00:38<00:13, 675.91it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 72%|███████▏  | 23745/32777 [00:38<00:13, 665.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 73%|███████▎  | 23821/32777 [00:38<00:12, 690.79it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 73%|███████▎  | 23926/32777 [00:38<00:11, 793.60it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 73%|███████▎  | 24007/32777 [00:38<00:11, 742.60it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 73%|███████▎  | 24083/32777 [00:38<00:11, 735.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 74%|███████▎  | 24158/32777 [00:39<00:13, 643.71it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 74%|███████▍  | 24225/32777 [00:39<00:13, 637.79it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 74%|███████▍  | 24291/32777 [00:39<00:13, 637.86it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 74%|███████▍  | 24371/32777 [00:39<00:12, 678.08it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 75%|███████▍  | 24441/32777 [00:39<00:12, 684.19it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 75%|███████▍  | 24511/32777 [00:39<00:12, 668.68it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 75%|███████▍  | 24581/32777 [00:39<00:12, 670.30it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 75%|███████▌  | 24655/32777 [00:39<00:11, 684.54it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 75%|███████▌  | 24734/32777 [00:39<00:11, 712.23it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 76%|███████▌  | 24822/32777 [00:40<00:10, 756.37it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 76%|███████▌  | 24898/32777 [00:40<00:11, 716.02it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 76%|███████▌  | 24971/32777 [00:40<00:11, 692.10it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 76%|███████▋  | 25048/32777 [00:40<00:10, 710.77it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 77%|███████▋  | 25120/32777 [00:40<00:10, 700.11it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 77%|███████▋  | 25214/32777 [00:40<00:09, 763.24it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 77%|███████▋  | 25315/32777 [00:40<00:09, 823.64it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 77%|███████▋  | 25398/32777 [00:40<00:09, 819.63it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 78%|███████▊  | 25481/32777 [00:40<00:09, 794.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 78%|███████▊  | 25572/32777 [00:41<00:08, 825.60it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 78%|███████▊  | 25663/32777 [00:41<00:08, 849.97it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 79%|███████▊  | 25749/32777 [00:41<00:09, 742.77it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 79%|███████▉  | 25826/32777 [00:41<00:09, 745.21it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 79%|███████▉  | 25904/32777 [00:41<00:09, 753.96it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 79%|███████▉  | 25981/32777 [00:41<00:09, 744.55it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 79%|███████▉  | 26057/32777 [00:41<00:09, 676.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 80%|███████▉  | 26142/32777 [00:41<00:09, 719.62it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 80%|████████  | 26259/32777 [00:41<00:07, 840.01it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 80%|████████  | 26346/32777 [00:42<00:08, 778.82it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 81%|████████  | 26427/32777 [00:42<00:08, 711.46it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 81%|████████  | 26501/32777 [00:42<00:08, 705.31it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 81%|████████  | 26574/32777 [00:42<00:08, 710.77it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 81%|████████▏ | 26647/32777 [00:42<00:08, 714.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 82%|████████▏ | 26722/32777 [00:42<00:08, 723.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 82%|████████▏ | 26796/32777 [00:42<00:08, 697.75it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 82%|████████▏ | 26868/32777 [00:42<00:08, 697.04it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 82%|████████▏ | 26939/32777 [00:42<00:08, 658.61it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 82%|████████▏ | 27028/32777 [00:43<00:08, 713.14it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 83%|████████▎ | 27101/32777 [00:43<00:08, 645.00it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 83%|████████▎ | 27168/32777 [00:43<00:08, 651.58it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 83%|████████▎ | 27264/32777 [00:43<00:07, 733.06it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 83%|████████▎ | 27339/32777 [00:43<00:07, 711.43it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 84%|████████▎ | 27412/32777 [00:43<00:07, 699.06it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 84%|████████▍ | 27489/32777 [00:43<00:07, 712.17it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 84%|████████▍ | 27582/32777 [00:43<00:06, 771.34it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 84%|████████▍ | 27671/32777 [00:43<00:06, 805.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 85%|████████▍ | 27753/32777 [00:44<00:06, 729.89it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 85%|████████▍ | 27828/32777 [00:44<00:06, 734.05it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 85%|████████▌ | 27922/32777 [00:44<00:06, 786.34it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 85%|████████▌ | 28002/32777 [00:44<00:06, 725.94it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 86%|████████▌ | 28095/32777 [00:44<00:06, 776.98it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 86%|████████▌ | 28175/32777 [00:44<00:06, 696.05it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 86%|████████▌ | 28248/32777 [00:44<00:07, 584.58it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 86%|████████▋ | 28311/32777 [00:44<00:08, 551.17it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 87%|████████▋ | 28369/32777 [00:45<00:07, 551.70it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 87%|████████▋ | 28427/32777 [00:45<00:08, 527.16it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 87%|████████▋ | 28509/32777 [00:45<00:07, 596.56it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 87%|████████▋ | 28571/32777 [00:45<00:07, 594.63it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 87%|████████▋ | 28652/32777 [00:45<00:06, 652.66it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 88%|████████▊ | 28721/32777 [00:45<00:06, 657.91it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 88%|████████▊ | 28788/32777 [00:45<00:06, 614.79it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 88%|████████▊ | 28859/32777 [00:45<00:06, 640.81it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 88%|████████▊ | 28925/32777 [00:45<00:06, 596.91it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 88%|████████▊ | 28987/32777 [00:46<00:06, 596.73it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 89%|████████▊ | 29051/32777 [00:46<00:06, 604.82it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 89%|████████▉ | 29121/32777 [00:46<00:05, 627.06it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 89%|████████▉ | 29194/32777 [00:46<00:05, 648.91it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 89%|████████▉ | 29260/32777 [00:46<00:05, 631.94it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 89%|████████▉ | 29324/32777 [00:46<00:05, 632.52it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 90%|████████▉ | 29395/32777 [00:46<00:05, 652.90it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 90%|████████▉ | 29461/32777 [00:46<00:05, 641.13it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 90%|█████████ | 29543/32777 [00:46<00:04, 684.35it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 90%|█████████ | 29612/32777 [00:47<00:05, 629.20it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 91%|█████████ | 29676/32777 [00:47<00:05, 616.24it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 91%|█████████ | 29747/32777 [00:47<00:04, 641.89it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 91%|█████████ | 29812/32777 [00:47<00:04, 613.73it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 91%|█████████ | 29875/32777 [00:47<00:04, 616.65it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 91%|█████████▏| 29941/32777 [00:47<00:04, 621.76it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 92%|█████████▏| 30012/32777 [00:47<00:04, 641.87it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 92%|█████████▏| 30079/32777 [00:47<00:04, 648.24it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 92%|█████████▏| 30145/32777 [00:47<00:04, 638.88it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 92%|█████████▏| 30217/32777 [00:47<00:03, 651.75it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 92%|█████████▏| 30295/32777 [00:48<00:03, 681.78it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 93%|█████████▎| 30368/32777 [00:48<00:03, 691.92it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 93%|█████████▎| 30438/32777 [00:48<00:03, 680.84it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 93%|█████████▎| 30507/32777 [00:48<00:03, 670.45it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 93%|█████████▎| 30575/32777 [00:48<00:03, 659.83it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 93%|█████████▎| 30645/32777 [00:48<00:03, 663.20it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 94%|█████████▎| 30726/32777 [00:48<00:02, 700.96it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 94%|█████████▍| 30820/32777 [00:48<00:02, 769.66it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 94%|█████████▍| 30898/32777 [00:48<00:02, 713.99it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 94%|█████████▍| 30971/32777 [00:49<00:02, 704.45it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 95%|█████████▍| 31055/32777 [00:49<00:02, 732.47it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 95%|█████████▍| 31132/32777 [00:49<00:02, 741.61it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 95%|█████████▌| 31211/32777 [00:49<00:02, 752.00it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 95%|█████████▌| 31297/32777 [00:49<00:01, 774.55it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 96%|█████████▌| 31384/32777 [00:49<00:01, 800.45it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 96%|█████████▌| 31488/32777 [00:49<00:01, 869.96it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 96%|█████████▋| 31576/32777 [00:49<00:01, 815.52it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 97%|█████████▋| 31659/32777 [00:49<00:01, 719.78it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 97%|█████████▋| 31734/32777 [00:50<00:01, 721.25it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 97%|█████████▋| 31821/32777 [00:50<00:01, 757.77it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 97%|█████████▋| 31899/32777 [00:50<00:01, 657.72it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 98%|█████████▊| 31973/32777 [00:50<00:01, 674.15it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 98%|█████████▊| 32043/32777 [00:50<00:01, 631.50it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 98%|█████████▊| 32125/32777 [00:50<00:00, 680.44it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 98%|█████████▊| 32201/32777 [00:50<00:00, 696.51it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 98%|█████████▊| 32273/32777 [00:50<00:00, 688.10it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 99%|█████████▊| 32356/32777 [00:50<00:00, 725.71it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 99%|█████████▉| 32431/32777 [00:51<00:00, 730.05it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      " 99%|█████████▉| 32532/32777 [00:51<00:00, 809.42it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "100%|█████████▉| 32614/32777 [00:51<00:00, 783.84it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "100%|█████████▉| 32711/32777 [00:51<00:00, 833.38it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\r",
      "100%|██████████| 32777/32777 [00:51<00:00, 637.32it/s]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "targets.shape: (65598,)\n",
      "contexts.shape: (65598, 5)\n",
      "labels.shape: (65598, 5)\n"
     ]
    }
   ],
   "source": [
    "targets, contexts, labels = generate_training_data(\n",
    "    sequences=sequences,\n",
    "    window_size=2,\n",
    "    num_ns=4,\n",
    "    vocab_size=vocab_size,\n",
    "    seed=SEED)\n",
    "\n",
    "targets = np.array(targets)\n",
    "contexts = np.array(contexts)\n",
    "labels = np.array(labels)\n",
    "\n",
    "print('\\n')\n",
    "print(f\"targets.shape: {targets.shape}\")\n",
    "print(f\"contexts.shape: {contexts.shape}\")\n",
    "print(f\"labels.shape: {labels.shape}\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "97PqsusOFEpc"
   },
   "source": [
    "### 성능을 높이기 위해 데이터세트 구성하기"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7jnFVySViQTj"
   },
   "source": [
    "잠재적으로 훈련 예제의 많은 수에 대한 효과적인 배치를 수행하려면 `tf.data.Dataset` API를 사용합니다. 이 단계 후, word2vec 모델 훈련을 위한 `(target_word, context_word), (label)` 요소의 `tf.data.Dataset` 객체를 갖게 됩니다!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:12.606906Z",
     "iopub.status.busy": "2022-12-14T21:21:12.606103Z",
     "iopub.status.idle": "2022-12-14T21:21:12.624238Z",
     "shell.execute_reply": "2022-12-14T21:21:12.623536Z"
    },
    "id": "nbu8PxPSnVY2"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<BatchDataset element_spec=((TensorSpec(shape=(1024,), dtype=tf.int64, name=None), TensorSpec(shape=(1024, 5), dtype=tf.int64, name=None)), TensorSpec(shape=(1024, 5), dtype=tf.int64, name=None))>\n"
     ]
    }
   ],
   "source": [
    "BATCH_SIZE = 1024\n",
    "BUFFER_SIZE = 10000\n",
    "dataset = tf.data.Dataset.from_tensor_slices(((targets, contexts), labels))\n",
    "dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)\n",
    "print(dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "tyrNX6Fs6K3F"
   },
   "source": [
    "`Dataset.cache` 및 `Dataset.prefetch`를 적용하여 성능을 개선합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:12.627895Z",
     "iopub.status.busy": "2022-12-14T21:21:12.627216Z",
     "iopub.status.idle": "2022-12-14T21:21:12.635004Z",
     "shell.execute_reply": "2022-12-14T21:21:12.634258Z"
    },
    "id": "Y5Ueg6bcFPVL"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<PrefetchDataset element_spec=((TensorSpec(shape=(1024,), dtype=tf.int64, name=None), TensorSpec(shape=(1024, 5), dtype=tf.int64, name=None)), TensorSpec(shape=(1024, 5), dtype=tf.int64, name=None))>\n"
     ]
    }
   ],
   "source": [
    "dataset = dataset.cache().prefetch(buffer_size=AUTOTUNE)\n",
    "print(dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1S-CmUMszyEf"
   },
   "source": [
    "## 모델 및 훈련"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "sQFqaBMPwBqC"
   },
   "source": [
    "word2vec 모델은 분류자로 구현되어 skip-grams의 True 콘텍스트 단어와 네거티브 샘플링을 통해 획득한 False 콘텍스트 단어를 식별할 수 있습니다. 대상 및 콘텍스트 단어의 임베딩 간 내적 곱셈을 수행하여 레이블에 대한 예측을 획득하고 데이터세트의 true 레이블에 대한 손실 함수를 계산할 수 있습니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "oc7kTbiwD9sy"
   },
   "source": [
    "### 하위 분류된 word2vec 모델"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Jvr9pM1G1sQN"
   },
   "source": [
    "[Keras 하위 클래스화 API](https://www.tensorflow.org/guide/keras/custom_layers_and_models)를 사용해 다음 레이어를 통해 word2vec 모델을 정의합니다.\n",
    "\n",
    "- `target_embedding`: 대상 단어로 나타났을 때 단어의 임베딩을 검색하는 `tf.keras.layers.Embedding` 레이어. 이 레이어의 매개변수 수는 `(vocab_size * embedding_dim)`입니다.\n",
    "- `context_embedding`: 콘텍스트 단어로 나타났을 때 단어의 임베딩을 검색하는 다른 `tf.keras.layers.Embedding` 레이어. 이 레이어의 매개변수 수는 `target_embedding`의 매개변수의 수와 같습니다(즉, `(vocab_size * embedding_dim)`).\n",
    "- `dots`: 대상의 내적과 훈련 쌍의 콘텍스트 임베딩을 계산하는 `tf.keras.layers.Dot` 레이어입니다.\n",
    "- `flatten`: `dots` 레이어의 결과를 로짓으로 평면화하는 `tf.keras.layers.Flatten` 레이어입니다.\n",
    "\n",
    "하위 분류된 모델로 해당 임베딩 레이어로 전달될 수 있는 `(target, context)` 쌍을 허용하는 `call()` 함수를 정의할 수 있습니다. `context_embedding`의 형상을 변경해 `target_embedding`로 내적을 수행하고 평면화된 결과를 반환합니다."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "KiAwuIqqw7-7"
   },
   "source": [
    "주요 포인트: `target_embedding` 및 `context_embedding` 레이어 역시 공유될 수 있습니다. 또한 최종 word2vec 임베딩으로 두 임베딩의 연결을 사용할 수도 있습니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:12.638620Z",
     "iopub.status.busy": "2022-12-14T21:21:12.638257Z",
     "iopub.status.idle": "2022-12-14T21:21:12.644767Z",
     "shell.execute_reply": "2022-12-14T21:21:12.643846Z"
    },
    "id": "i9ec-sS6xd8Z"
   },
   "outputs": [],
   "source": [
    "class Word2Vec(tf.keras.Model):\n",
    "  def __init__(self, vocab_size, embedding_dim):\n",
    "    super(Word2Vec, self).__init__()\n",
    "    self.target_embedding = layers.Embedding(vocab_size,\n",
    "                                      embedding_dim,\n",
    "                                      input_length=1,\n",
    "                                      name=\"w2v_embedding\")\n",
    "    self.context_embedding = layers.Embedding(vocab_size,\n",
    "                                       embedding_dim,\n",
    "                                       input_length=num_ns+1)\n",
    "\n",
    "  def call(self, pair):\n",
    "    target, context = pair\n",
    "    # target: (batch, dummy?)  # The dummy axis doesn't exist in TF2.7+\n",
    "    # context: (batch, context)\n",
    "    if len(target.shape) == 2:\n",
    "      target = tf.squeeze(target, axis=1)\n",
    "    # target: (batch,)\n",
    "    word_emb = self.target_embedding(target)\n",
    "    # word_emb: (batch, embed)\n",
    "    context_emb = self.context_embedding(context)\n",
    "    # context_emb: (batch, context, embed)\n",
    "    dots = tf.einsum('be,bce->bc', word_emb, context_emb)\n",
    "    # dots: (batch, context)\n",
    "    return dots"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-RLKz9LFECXu"
   },
   "source": [
    "### 손실 함수 정의 및 모델 컴파일\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "I3Md-9QanqBM"
   },
   "source": [
    "단순성을 위해, `tf.keras.losses.CategoricalCrossEntropy`를 네거티브 샘플링 손실에 대한 대안으로 사용할 수 있습니다. 자체 사용자 정의 손실 함수를 작성하고 싶다면 다음을 수행할 수도 있습니다.\n",
    "\n",
    "```python\n",
    "def custom_loss(x_logit, y_true):\n",
    "      return tf.nn.sigmoid_cross_entropy_with_logits(logits=x_logit, labels=y_true)\n",
    "```\n",
    "\n",
    "모델을 빌드할 시간입니다! 128 임베딩 차원으로 word2vec 클래스를 인스턴스화합니다(다른 값으로 실험할 수 있습니다), `tf.keras.optimizers.Adam` 옵티마이저로 모델을 컴파일합니다. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:12.648478Z",
     "iopub.status.busy": "2022-12-14T21:21:12.647844Z",
     "iopub.status.idle": "2022-12-14T21:21:12.670343Z",
     "shell.execute_reply": "2022-12-14T21:21:12.669641Z"
    },
    "id": "ekQg_KbWnnmQ"
   },
   "outputs": [],
   "source": [
    "embedding_dim = 128\n",
    "word2vec = Word2Vec(vocab_size, embedding_dim)\n",
    "word2vec.compile(optimizer='adam',\n",
    "                 loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),\n",
    "                 metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "P3MUMrluqNX2"
   },
   "source": [
    "또한 콜백을 정의하여 TensorBoard에 대한 훈련 통계를 기록합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:12.673798Z",
     "iopub.status.busy": "2022-12-14T21:21:12.673230Z",
     "iopub.status.idle": "2022-12-14T21:21:12.676854Z",
     "shell.execute_reply": "2022-12-14T21:21:12.676203Z"
    },
    "id": "9d-ftBCeEZIR"
   },
   "outputs": [],
   "source": [
    "tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=\"logs\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "h5wEBotlGZ7B"
   },
   "source": [
    "얼마간의 epoch 동안 `dataset`에서 모델을 훈련합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:12.680222Z",
     "iopub.status.busy": "2022-12-14T21:21:12.679646Z",
     "iopub.status.idle": "2022-12-14T21:21:26.688400Z",
     "shell.execute_reply": "2022-12-14T21:21:26.687638Z"
    },
    "id": "gmC1BJalEZIY"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 1/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 1:13 - loss: 1.6091 - accuracy: 0.2207"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      " 2/64 [..............................] - ETA: 9s - loss: 1.6092 - accuracy: 0.2217  "
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      " 3/64 [>.............................] - ETA: 9s - loss: 1.6093 - accuracy: 0.2181"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      " 4/64 [>.............................] - ETA: 9s - loss: 1.6094 - accuracy: 0.2153"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      " 5/64 [=>............................] - ETA: 9s - loss: 1.6094 - accuracy: 0.2117"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      " 6/64 [=>............................] - ETA: 9s - loss: 1.6094 - accuracy: 0.2114"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      " 7/64 [==>...........................] - ETA: 10s - loss: 1.6094 - accuracy: 0.2084"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      " 8/64 [==>...........................] - ETA: 9s - loss: 1.6093 - accuracy: 0.2111 "
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      " 9/64 [===>..........................] - ETA: 9s - loss: 1.6094 - accuracy: 0.2096"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "10/64 [===>..........................] - ETA: 9s - loss: 1.6093 - accuracy: 0.2119"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "11/64 [====>.........................] - ETA: 8s - loss: 1.6093 - accuracy: 0.2125"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "12/64 [====>.........................] - ETA: 8s - loss: 1.6093 - accuracy: 0.2142"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "13/64 [=====>........................] - ETA: 8s - loss: 1.6093 - accuracy: 0.2154"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "14/64 [=====>........................] - ETA: 8s - loss: 1.6092 - accuracy: 0.2169"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "15/64 [======>.......................] - ETA: 8s - loss: 1.6092 - accuracy: 0.2173"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "16/64 [======>.......................] - ETA: 7s - loss: 1.6092 - accuracy: 0.2160"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "17/64 [======>.......................] - ETA: 7s - loss: 1.6092 - accuracy: 0.2148"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "18/64 [=======>......................] - ETA: 7s - loss: 1.6092 - accuracy: 0.2150"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "19/64 [=======>......................] - ETA: 7s - loss: 1.6092 - accuracy: 0.2149"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 7s - loss: 1.6092 - accuracy: 0.2154"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "21/64 [========>.....................] - ETA: 6s - loss: 1.6091 - accuracy: 0.2150"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "22/64 [=========>....................] - ETA: 6s - loss: 1.6091 - accuracy: 0.2158"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "23/64 [=========>....................] - ETA: 6s - loss: 1.6091 - accuracy: 0.2151"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "24/64 [==========>...................] - ETA: 6s - loss: 1.6091 - accuracy: 0.2153"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "25/64 [==========>...................] - ETA: 6s - loss: 1.6091 - accuracy: 0.2153"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "26/64 [===========>..................] - ETA: 6s - loss: 1.6090 - accuracy: 0.2165"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "27/64 [===========>..................] - ETA: 5s - loss: 1.6090 - accuracy: 0.2170"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "28/64 [============>.................] - ETA: 5s - loss: 1.6090 - accuracy: 0.2180"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "29/64 [============>.................] - ETA: 5s - loss: 1.6090 - accuracy: 0.2177"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "30/64 [=============>................] - ETA: 5s - loss: 1.6090 - accuracy: 0.2180"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "31/64 [=============>................] - ETA: 5s - loss: 1.6089 - accuracy: 0.2191"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "32/64 [==============>...............] - ETA: 5s - loss: 1.6089 - accuracy: 0.2192"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "33/64 [==============>...............] - ETA: 4s - loss: 1.6089 - accuracy: 0.2201"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "34/64 [==============>...............] - ETA: 4s - loss: 1.6089 - accuracy: 0.2210"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "35/64 [===============>..............] - ETA: 4s - loss: 1.6088 - accuracy: 0.2211"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "36/64 [===============>..............] - ETA: 4s - loss: 1.6088 - accuracy: 0.2214"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "37/64 [================>.............] - ETA: 4s - loss: 1.6088 - accuracy: 0.2214"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "38/64 [================>.............] - ETA: 4s - loss: 1.6088 - accuracy: 0.2219"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 3s - loss: 1.6088 - accuracy: 0.2219"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "40/64 [=================>............] - ETA: 3s - loss: 1.6088 - accuracy: 0.2222"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "41/64 [==================>...........] - ETA: 3s - loss: 1.6087 - accuracy: 0.2223"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "42/64 [==================>...........] - ETA: 3s - loss: 1.6087 - accuracy: 0.2228"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "43/64 [===================>..........] - ETA: 3s - loss: 1.6087 - accuracy: 0.2232"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "44/64 [===================>..........] - ETA: 3s - loss: 1.6087 - accuracy: 0.2235"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "46/64 [====================>.........] - ETA: 2s - loss: 1.6087 - accuracy: 0.2240"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "47/64 [=====================>........] - ETA: 2s - loss: 1.6086 - accuracy: 0.2247"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "48/64 [=====================>........] - ETA: 2s - loss: 1.6086 - accuracy: 0.2253"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "49/64 [=====================>........] - ETA: 2s - loss: 1.6086 - accuracy: 0.2255"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "50/64 [======================>.......] - ETA: 2s - loss: 1.6086 - accuracy: 0.2256"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "53/64 [=======================>......] - ETA: 1s - loss: 1.6085 - accuracy: 0.2271"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "54/64 [========================>.....] - ETA: 1s - loss: 1.6085 - accuracy: 0.2275"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "55/64 [========================>.....] - ETA: 1s - loss: 1.6085 - accuracy: 0.2275"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "57/64 [=========================>....] - ETA: 1s - loss: 1.6084 - accuracy: 0.2287"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 1.6084 - accuracy: 0.2294"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "59/64 [==========================>...] - ETA: 0s - loss: 1.6084 - accuracy: 0.2300"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "60/64 [===========================>..] - ETA: 0s - loss: 1.6083 - accuracy: 0.2306"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "61/64 [===========================>..] - ETA: 0s - loss: 1.6083 - accuracy: 0.2309"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "62/64 [============================>.] - ETA: 0s - loss: 1.6083 - accuracy: 0.2314"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "63/64 [============================>.] - ETA: 0s - loss: 1.6082 - accuracy: 0.2324"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - ETA: 0s - loss: 1.6082 - accuracy: 0.2331"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 10s 147ms/step - loss: 1.6082 - accuracy: 0.2331\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 2/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 1.5897 - accuracy: 0.7617"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 1.5939 - accuracy: 0.5930"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 1.5920 - accuracy: 0.5697"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "59/64 [==========================>...] - ETA: 0s - loss: 1.5893 - accuracy: 0.5529"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 1.5883 - accuracy: 0.5508\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 3/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 1.5575 - accuracy: 0.7529"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 1.5568 - accuracy: 0.6419"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 1.5493 - accuracy: 0.6127"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 1.5415 - accuracy: 0.5931"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 1.5387 - accuracy: 0.5880\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 4/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 1.4859 - accuracy: 0.6543"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 1.4799 - accuracy: 0.5842"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "40/64 [=================>............] - ETA: 0s - loss: 1.4681 - accuracy: 0.5700"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "59/64 [==========================>...] - ETA: 0s - loss: 1.4573 - accuracy: 0.5638"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 1.4541 - accuracy: 0.5629\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 5/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 1.3890 - accuracy: 0.6201"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 1.3817 - accuracy: 0.5772"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 1.3699 - accuracy: 0.5728"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "59/64 [==========================>...] - ETA: 0s - loss: 1.3592 - accuracy: 0.5713"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 1.3560 - accuracy: 0.5719\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 6/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 1.2888 - accuracy: 0.6328"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 1.2833 - accuracy: 0.5996"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 1.2727 - accuracy: 0.5994"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 1.2640 - accuracy: 0.5994"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 1.2605 - accuracy: 0.6006\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 7/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 1.1948 - accuracy: 0.6641"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 1.1921 - accuracy: 0.6317"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "40/64 [=================>............] - ETA: 0s - loss: 1.1826 - accuracy: 0.6324"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "60/64 [===========================>..] - ETA: 0s - loss: 1.1743 - accuracy: 0.6340"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 1.1717 - accuracy: 0.6352\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 8/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 1.1081 - accuracy: 0.6895"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 1.1079 - accuracy: 0.6668"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 1.0987 - accuracy: 0.6680"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 1.0924 - accuracy: 0.6682"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 1.0892 - accuracy: 0.6698\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 9/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 1.0278 - accuracy: 0.7207"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 1.0297 - accuracy: 0.6993"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 1.0207 - accuracy: 0.7008"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 1.0152 - accuracy: 0.7020"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 1.0121 - accuracy: 0.7037\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 10/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.9531 - accuracy: 0.7422"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 0.9570 - accuracy: 0.7291"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 0.9480 - accuracy: 0.7311"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 0.9432 - accuracy: 0.7322"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.9403 - accuracy: 0.7342\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 11/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.8835 - accuracy: 0.7686"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 0.8893 - accuracy: 0.7554"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "38/64 [================>.............] - ETA: 0s - loss: 0.8808 - accuracy: 0.7585"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "57/64 [=========================>....] - ETA: 0s - loss: 0.8765 - accuracy: 0.7602"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.8735 - accuracy: 0.7619\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 12/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.8190 - accuracy: 0.7949"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "19/64 [=======>......................] - ETA: 0s - loss: 0.8250 - accuracy: 0.7809"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 0.8177 - accuracy: 0.7824"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 0.8142 - accuracy: 0.7836"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.8116 - accuracy: 0.7850\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 13/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.7594 - accuracy: 0.8105"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 0.7686 - accuracy: 0.7988"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 0.7598 - accuracy: 0.8022"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "59/64 [==========================>...] - ETA: 0s - loss: 0.7567 - accuracy: 0.8026"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.7545 - accuracy: 0.8042\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 14/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.7047 - accuracy: 0.8291"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "21/64 [========>.....................] - ETA: 0s - loss: 0.7139 - accuracy: 0.8175"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "40/64 [=================>............] - ETA: 0s - loss: 0.7072 - accuracy: 0.8202"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "59/64 [==========================>...] - ETA: 0s - loss: 0.7041 - accuracy: 0.8208"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.7021 - accuracy: 0.8224\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 15/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.6546 - accuracy: 0.8447"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 0.6663 - accuracy: 0.8325"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 0.6579 - accuracy: 0.8354"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 0.6562 - accuracy: 0.8362"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.6542 - accuracy: 0.8377\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 16/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.6090 - accuracy: 0.8594"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 0.6216 - accuracy: 0.8468"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 0.6135 - accuracy: 0.8492"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "57/64 [=========================>....] - ETA: 0s - loss: 0.6125 - accuracy: 0.8497"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.6104 - accuracy: 0.8513\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 17/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.5676 - accuracy: 0.8691"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "20/64 [========>.....................] - ETA: 0s - loss: 0.5807 - accuracy: 0.8614"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "39/64 [=================>............] - ETA: 0s - loss: 0.5730 - accuracy: 0.8631"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "58/64 [==========================>...] - ETA: 0s - loss: 0.5722 - accuracy: 0.8630"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.5705 - accuracy: 0.8643\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 18/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.5301 - accuracy: 0.8750"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "19/64 [=======>......................] - ETA: 0s - loss: 0.5422 - accuracy: 0.8730"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "37/64 [================>.............] - ETA: 0s - loss: 0.5367 - accuracy: 0.8736"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "55/64 [========================>.....] - ETA: 0s - loss: 0.5359 - accuracy: 0.8736"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.5342 - accuracy: 0.8746\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 19/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.4962 - accuracy: 0.8926"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "19/64 [=======>......................] - ETA: 0s - loss: 0.5084 - accuracy: 0.8834"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "37/64 [================>.............] - ETA: 0s - loss: 0.5031 - accuracy: 0.8842"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "55/64 [========================>.....] - ETA: 0s - loss: 0.5028 - accuracy: 0.8840"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.5011 - accuracy: 0.8848\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 20/20\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 1/64 [..............................] - ETA: 0s - loss: 0.4655 - accuracy: 0.9014"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "19/64 [=======>......................] - ETA: 0s - loss: 0.4776 - accuracy: 0.8919"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "38/64 [================>.............] - ETA: 0s - loss: 0.4721 - accuracy: 0.8927"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "57/64 [=========================>....] - ETA: 0s - loss: 0.4726 - accuracy: 0.8921"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r",
      "64/64 [==============================] - 0s 3ms/step - loss: 0.4711 - accuracy: 0.8930\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<keras.callbacks.History at 0x7f22a8241c40>"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "word2vec.fit(dataset, epochs=20, callbacks=[tensorboard_callback])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "wze38jG57XvZ"
   },
   "source": [
    "TensorBoard는 이제 word2vec 모델의 정확성과 손실을 표시합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "22E9eqS55rgz"
   },
   "outputs": [],
   "source": [
    "#docs_infra: no_execute\n",
    "%tensorboard --logdir logs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "awF3iRQCZOLj"
   },
   "source": [
    "<!-- <img class=\"tfo-display-only-on-site\" src=\"images/word2vec_tensorboard.png\"/> -->"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TaDW2tIIz8fL"
   },
   "source": [
    "## 임베딩 검색 및 분석"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Zp5rv01WG2YA"
   },
   "source": [
    "`Model.get_layer` 및 `Layer.get_weights`을 사용해 모델에서 가중치를 얻습니다. `TextVectorization.get_vocabulary` 함수는 어휘를 제공하여 라인당 하나의 토큰으로 메타데이터 파일을 빌드합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:26.692486Z",
     "iopub.status.busy": "2022-12-14T21:21:26.691948Z",
     "iopub.status.idle": "2022-12-14T21:21:26.704866Z",
     "shell.execute_reply": "2022-12-14T21:21:26.704104Z"
    },
    "id": "_Uamp1YH8RzU"
   },
   "outputs": [],
   "source": [
    "weights = word2vec.get_layer('w2v_embedding').get_weights()[0]\n",
    "vocab = vectorize_layer.get_vocabulary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gWzdmUzS8Sl4"
   },
   "source": [
    "벡터 및 메타데이터 파일을 생성하고 저장합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:26.708745Z",
     "iopub.status.busy": "2022-12-14T21:21:26.708135Z",
     "iopub.status.idle": "2022-12-14T21:21:27.020970Z",
     "shell.execute_reply": "2022-12-14T21:21:27.020196Z"
    },
    "id": "VLIahl9s53XT"
   },
   "outputs": [],
   "source": [
    "out_v = io.open('vectors.tsv', 'w', encoding='utf-8')\n",
    "out_m = io.open('metadata.tsv', 'w', encoding='utf-8')\n",
    "\n",
    "for index, word in enumerate(vocab):\n",
    "  if index == 0:\n",
    "    continue  # skip 0, it's padding.\n",
    "  vec = weights[index]\n",
    "  out_v.write('\\t'.join([str(x) for x in vec]) + \"\\n\")\n",
    "  out_m.write(word + \"\\n\")\n",
    "out_v.close()\n",
    "out_m.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1T8KcThhIU8-"
   },
   "source": [
    "`vectors.tsv` 및 `metadata.tsv`를 다운로드하여 [Embedding Projector](https://projector.tensorflow.org/)에서 획득한 임베딩을 분석합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-14T21:21:27.025522Z",
     "iopub.status.busy": "2022-12-14T21:21:27.024884Z",
     "iopub.status.idle": "2022-12-14T21:21:27.028950Z",
     "shell.execute_reply": "2022-12-14T21:21:27.028293Z"
    },
    "id": "lUsjQOKMIV2z"
   },
   "outputs": [],
   "source": [
    "try:\n",
    "  from google.colab import files\n",
    "  files.download('vectors.tsv')\n",
    "  files.download('metadata.tsv')\n",
    "except Exception:\n",
    "  pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "iS_uMeMw3Xpj"
   },
   "source": [
    "## 다음 단계\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BSgAZpwF5xF_"
   },
   "source": [
    "이 튜토리얼은 처음부터 네거티브 샘플링으로 skip-gram word2vec 모델을 구현하고 획득한 단어 임베딩을 시각화하는 방법을 보여주었습니다.\n",
    "\n",
    "- 단어 벡터와 수학적 표현에 대해 더 자세히 알아보려면 이러한 [참고](https://web.stanford.edu/class/cs224n/readings/cs224n-2019-notes01-wordvecs1.pdf)를 참조하세요.\n",
    "\n",
    "- 고급 텍스트 처리에 대해 더 자세히 알아보려면 [언어 이해를 위한 트랜스포머 모델](https://www.tensorflow.org/tutorials/text/transformer) 튜토리얼을 읽으세요.\n",
    "\n",
    "- 사전 훈련된 임베딩 모델에 관심이 있다면 [TF-Hub CORD-19 Swivel 임베딩 탐색](https://www.tensorflow.org/hub/tutorials/cord_19_embeddings_keras) 또는 [다국어 범용 문장 인코더](https://www.tensorflow.org/hub/tutorials/cross_lingual_similarity_with_tf_hub_multilingual_universal_encoder)에 관심이 있을 수 있습니다.\n",
    "\n",
    "- 또한 새로운 데이터세트에서 모델을 훈련하고 싶을 수도 있습니다([TensorFlow 데이터세트](https://www.tensorflow.org/datasets)에서 많은 것이 가능합니다).\n"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "word2vec.ipynb",
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}