{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "_as3tyDPAvzM" }, "source": [ "##### Copyright 2021 The TensorFlow Authors." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "cellView": "form", "execution": { "iopub.execute_input": "2024-07-19T12:50:42.457611Z", "iopub.status.busy": "2024-07-19T12:50:42.457384Z", "iopub.status.idle": "2024-07-19T12:50:42.461072Z", "shell.execute_reply": "2024-07-19T12:50:42.460473Z" }, "id": "-CoWjX1EBXJX" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "7hQmWrtkBBQB" }, "source": [ "# Converting TensorFlow Text operators to TensorFlow Lite" ] }, { "cell_type": "markdown", "metadata": { "id": "qmGnheU8BPKN" }, "source": [ "\n", " \n", " \n", " \n", " \n", "
\n", " View on TensorFlow.org\n", " \n", " Run in Google Colab\n", " \n", " View on GitHub\n", " \n", " Download notebook\n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "hz1hOEHPTF2n" }, "source": [ "## Overview\n", "\n", "Machine learning models are frequently deployed using TensorFlow Lite to mobile, embedded, and IoT devices to improve data privacy and lower response times. These models often require support for text processing operations. TensorFlow Text version 2.7 and higher provides improved performance, reduced binary sizes, and operations specifically optimized for use in these environments.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "_mdIyFfqTMjc" }, "source": [ "## Text operators\n", "\n", "The following TensorFlow Text classes and functions can be used from within a TensorFlow Lite model.\n", "\n", "* `text.ByteSplitter`\n", "* `text.FastBertNormalizer`\n", "* `text.FastBertTokenizer`\n", "* `text.FastWordpieceTokenizer`\n", "* `text.FastSentencepieceTokenizer`\n", "* `text.WhitespaceTokenizer`\n", "* `text.ngrams`\n" ] }, { "cell_type": "markdown", "metadata": { "id": "x6NAs1fcUwUn" }, "source": [ "## Model Example" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T12:50:42.464709Z", "iopub.status.busy": "2024-07-19T12:50:42.464493Z", "iopub.status.idle": "2024-07-19T12:51:07.301309Z", "shell.execute_reply": "2024-07-19T12:51:07.300465Z" }, "id": "8ZalFZQvTJf5" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Collecting tensorflow-text==2.11.*\r\n", " Using cached tensorflow_text-2.11.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.0 kB)\r\n", "Requirement already satisfied: tensorflow-hub>=0.8.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-text==2.11.*) (0.16.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting tensorflow<2.12,>=2.11.0 (from tensorflow-text==2.11.*)\r\n", " Using cached tensorflow-2.11.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.1 kB)\r\n", "Requirement already satisfied: absl-py>=1.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (2.1.0)\r\n", "Requirement already satisfied: astunparse>=1.6.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (1.6.3)\r\n", "Requirement already satisfied: flatbuffers>=2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (24.3.25)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting gast<=0.4.0,>=0.2.1 (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached gast-0.4.0-py3-none-any.whl.metadata (1.1 kB)\r\n", "Requirement already satisfied: google-pasta>=0.1.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (0.2.0)\r\n", "Requirement already satisfied: grpcio<2.0,>=1.24.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (1.65.1)\r\n", "Requirement already satisfied: h5py>=2.9.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (3.11.0)\r\n", "Collecting keras<2.12,>=2.11.0 (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached keras-2.11.0-py2.py3-none-any.whl.metadata (1.4 kB)\r\n", "Requirement already satisfied: libclang>=13.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from 
tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (18.1.1)\r\n", "Requirement already satisfied: numpy>=1.20 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (1.26.4)\r\n", "Requirement already satisfied: opt-einsum>=2.3.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (3.3.0)\r\n", "Requirement already satisfied: packaging in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (24.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting protobuf<3.20,>=3.9.2 (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached protobuf-3.19.6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (787 bytes)\r\n", "Requirement already satisfied: setuptools in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (71.0.3)\r\n", "Requirement already satisfied: six>=1.12.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (1.16.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting tensorboard<2.12,>=2.11 (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached tensorboard-2.11.2-py3-none-any.whl.metadata (1.9 kB)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting tensorflow-estimator<2.12,>=2.11.0 (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached tensorflow_estimator-2.11.0-py2.py3-none-any.whl.metadata (1.3 kB)\r\n", "Requirement already satisfied: termcolor>=1.1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (2.4.0)\r\n", "Requirement already satisfied: typing-extensions>=3.6.6 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (4.12.2)\r\n", "Requirement already satisfied: wrapt>=1.11.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (1.16.0)\r\n", "Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.23.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (0.37.1)\r\n", "Requirement already satisfied: tf-keras>=2.14.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-hub>=0.8.0->tensorflow-text==2.11.*) (2.17.0)\r\n", "Requirement already satisfied: wheel<1.0,>=0.23.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from astunparse>=1.6.0->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (0.43.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting google-auth<3,>=1.6.3 (from tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached google_auth-2.32.0-py2.py3-none-any.whl.metadata (4.7 kB)\r\n", "Collecting google-auth-oauthlib<0.5,>=0.4.1 (from tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached google_auth_oauthlib-0.4.6-py2.py3-none-any.whl.metadata (2.7 kB)\r\n", "Requirement already satisfied: markdown>=2.6.8 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (3.6)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: 
requests<3,>=2.21.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (2.32.3)\r\n", "Collecting tensorboard-data-server<0.7.0,>=0.6.0 (from tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached tensorboard_data_server-0.6.1-py3-none-manylinux2010_x86_64.whl.metadata (1.1 kB)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting tensorboard-plugin-wit>=1.6.0 (from tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached tensorboard_plugin_wit-1.8.1-py3-none-any.whl.metadata (873 bytes)\r\n", "Requirement already satisfied: werkzeug>=1.0.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (3.0.3)\r\n", "INFO: pip is looking at multiple versions of tf-keras to determine which version is compatible with other requirements. This could take a while.\r\n", "Collecting tf-keras>=2.14.1 (from tensorflow-hub>=0.8.0->tensorflow-text==2.11.*)\r\n", " Using cached tf_keras-2.16.0-py3-none-any.whl.metadata (1.6 kB)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Using cached tf_keras-2.15.1-py3-none-any.whl.metadata (1.7 kB)\r\n", " Using cached tf_keras-2.15.0-py3-none-any.whl.metadata (1.6 kB)\r\n", "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-auth<3,>=1.6.3->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (5.4.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting pyasn1-modules>=0.2.1 (from google-auth<3,>=1.6.3->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached pyasn1_modules-0.4.0-py3-none-any.whl.metadata (3.4 kB)\r\n", "Collecting rsa<5,>=3.1.4 (from google-auth<3,>=1.6.3->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached rsa-4.9-py3-none-any.whl.metadata (4.2 kB)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting requests-oauthlib>=0.7.0 (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached requests_oauthlib-2.0.0-py2.py3-none-any.whl.metadata (11 kB)\r\n", "Requirement already satisfied: importlib-metadata>=4.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from markdown>=2.6.8->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (8.0.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: charset-normalizer<4,>=2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from requests<3,>=2.21.0->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (3.3.2)\r\n", "Requirement already satisfied: idna<4,>=2.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from requests<3,>=2.21.0->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (3.7)\r\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from requests<3,>=2.21.0->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (2.2.2)\r\n", "Requirement already satisfied: certifi>=2017.4.17 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from 
requests<3,>=2.21.0->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (2024.7.4)\r\n", "Requirement already satisfied: MarkupSafe>=2.1.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from werkzeug>=1.0.1->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (2.1.5)\r\n", "Requirement already satisfied: zipp>=0.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*) (3.19.2)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Collecting pyasn1<0.7.0,>=0.4.6 (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached pyasn1-0.6.0-py2.py3-none-any.whl.metadata (8.3 kB)\r\n", "Collecting oauthlib>=3.0.0 (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.12,>=2.11->tensorflow<2.12,>=2.11.0->tensorflow-text==2.11.*)\r\n", " Using cached oauthlib-3.2.2-py3-none-any.whl.metadata (7.5 kB)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Using cached tensorflow_text-2.11.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.8 MB)\r\n", "Using cached tensorflow-2.11.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (588.3 MB)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Using cached gast-0.4.0-py3-none-any.whl (9.8 kB)\r\n", "Using cached keras-2.11.0-py2.py3-none-any.whl (1.7 MB)\r\n", "Using cached protobuf-3.19.6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)\r\n", "Using cached tensorboard-2.11.2-py3-none-any.whl (6.0 MB)\r\n", "Using cached tensorflow_estimator-2.11.0-py2.py3-none-any.whl (439 kB)\r\n", "Using cached tf_keras-2.15.0-py3-none-any.whl (1.7 MB)\r\n", "Using cached google_auth-2.32.0-py2.py3-none-any.whl (195 kB)\r\n", "Using cached google_auth_oauthlib-0.4.6-py2.py3-none-any.whl (18 kB)\r\n", "Using cached tensorboard_data_server-0.6.1-py3-none-manylinux2010_x86_64.whl (4.9 MB)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Using cached tensorboard_plugin_wit-1.8.1-py3-none-any.whl (781 kB)\r\n", "Using cached pyasn1_modules-0.4.0-py3-none-any.whl (181 kB)\r\n", "Using cached requests_oauthlib-2.0.0-py2.py3-none-any.whl (24 kB)\r\n", "Using cached rsa-4.9-py3-none-any.whl (34 kB)\r\n", "Using cached oauthlib-3.2.2-py3-none-any.whl (151 kB)\r\n", "Using cached pyasn1-0.6.0-py2.py3-none-any.whl (85 kB)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Installing collected packages: tensorboard-plugin-wit, tf-keras, tensorflow-estimator, tensorboard-data-server, pyasn1, protobuf, oauthlib, keras, gast, rsa, requests-oauthlib, pyasn1-modules, google-auth, google-auth-oauthlib, tensorboard, tensorflow, tensorflow-text\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Attempting uninstall: tf-keras\r\n", " Found existing installation: tf_keras 2.17.0\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Uninstalling tf_keras-2.17.0:\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Successfully uninstalled tf_keras-2.17.0\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Attempting uninstall: tensorboard-data-server\r\n", " Found existing installation: tensorboard-data-server 0.7.2\r\n", " Uninstalling tensorboard-data-server-0.7.2:\r\n", " Successfully uninstalled tensorboard-data-server-0.7.2\r\n" ] }, { 
"name": "stdout", "output_type": "stream", "text": [ " Attempting uninstall: protobuf\r\n", " Found existing installation: protobuf 3.20.3\r\n", " Uninstalling protobuf-3.20.3:\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Successfully uninstalled protobuf-3.20.3\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Attempting uninstall: keras\r\n", " Found existing installation: keras 3.4.1\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Uninstalling keras-3.4.1:\r\n", " Successfully uninstalled keras-3.4.1\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Attempting uninstall: gast\r\n", " Found existing installation: gast 0.6.0\r\n", " Uninstalling gast-0.6.0:\r\n", " Successfully uninstalled gast-0.6.0\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Attempting uninstall: tensorboard\r\n", " Found existing installation: tensorboard 2.17.0\r\n", " Uninstalling tensorboard-2.17.0:\r\n", " Successfully uninstalled tensorboard-2.17.0\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Attempting uninstall: tensorflow\r\n", " Found existing installation: tensorflow 2.17.0\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Uninstalling tensorflow-2.17.0:\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Successfully uninstalled tensorflow-2.17.0\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\r\n", "tensorflow-datasets 4.9.3 requires protobuf>=3.20, but you have protobuf 3.19.6 which is incompatible.\r\n", "tensorflow-metadata 1.15.0 requires protobuf<4.21,>=3.20.3; python_version < \"3.11\", but you have protobuf 3.19.6 which is incompatible.\u001b[0m\u001b[31m\r\n", "\u001b[0mSuccessfully installed gast-0.4.0 google-auth-2.32.0 google-auth-oauthlib-0.4.6 keras-2.11.0 oauthlib-3.2.2 protobuf-3.19.6 pyasn1-0.6.0 pyasn1-modules-0.4.0 requests-oauthlib-2.0.0 rsa-4.9 tensorboard-2.11.2 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorflow-2.11.1 tensorflow-estimator-2.11.0 tensorflow-text-2.11.0 tf-keras-2.15.0\r\n" ] } ], "source": [ "!pip install -U \"tensorflow-text==2.11.*\"" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T12:51:07.305553Z", "iopub.status.busy": "2024-07-19T12:51:07.305254Z", "iopub.status.idle": "2024-07-19T12:51:09.437845Z", "shell.execute_reply": "2024-07-19T12:51:09.437131Z" }, "id": "uL-I0CyPTXnN" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-07-19 12:51:07.630699: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2024-07-19 12:51:08.294550: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory\n", "2024-07-19 12:51:08.294636: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or 
directory\n", "2024-07-19 12:51:08.294645: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n" ] } ], "source": [ "from absl import app\n", "import numpy as np\n", "import tensorflow as tf\n", "import tensorflow_text as tf_text\n", "\n", "from tensorflow.lite.python import interpreter" ] }, { "cell_type": "markdown", "metadata": { "id": "qj_bJ-xVTfU1" }, "source": [ "The following code example shows the conversion process and interpretation in Python using a simple test model. Note that the output of a model cannot be a `tf.RaggedTensor` object when you are using TensorFlow Lite. However, you can return the components of a `tf.RaggedTensor` object or convert it using its `to_tensor` function. See [the RaggedTensor guide](https://www.tensorflow.org/guide/ragged_tensor) for more details." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T12:51:09.441810Z", "iopub.status.busy": "2024-07-19T12:51:09.441416Z", "iopub.status.idle": "2024-07-19T12:51:09.447007Z", "shell.execute_reply": "2024-07-19T12:51:09.446421Z" }, "id": "nqQjBcXqTf_0" }, "outputs": [], "source": [ "class TokenizerModel(tf.keras.Model):\n", "\n", " def __init__(self, **kwargs):\n", " super().__init__(**kwargs)\n", " self.tokenizer = tf_text.WhitespaceTokenizer()\n", "\n", " @tf.function(input_signature=[\n", " tf.TensorSpec(shape=[None], dtype=tf.string, name='input')\n", " ])\n", " def call(self, input_tensor):\n", " return { 'tokens': self.tokenizer.tokenize(input_tensor).flat_values }" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T12:51:09.449848Z", "iopub.status.busy": "2024-07-19T12:51:09.449607Z", "iopub.status.idle": "2024-07-19T12:51:10.703789Z", "shell.execute_reply": "2024-07-19T12:51:10.703000Z" }, "id": "jsPFI-55TiF_" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-07-19 12:51:09.979230: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n", "2024-07-19 12:51:09.979338: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory\n", "2024-07-19 12:51:09.979414: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory\n", "2024-07-19 12:51:09.979477: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory\n", "2024-07-19 12:51:10.036760: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory\n", "2024-07-19 12:51:10.036958: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1934] Cannot dlopen some GPU libraries. 
Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.\n", "Skipping registering GPU devices...\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "TensorFlow result = tf.Tensor([b'Some' b'minds' b'are' b'better' b'kept' b'apart'], shape=(6,), dtype=string)\n" ] } ], "source": [ "# Test input data.\n", "input_data = np.array(['Some minds are better kept apart'])\n", "\n", "# Define a Keras model.\n", "model = TokenizerModel()\n", "\n", "# Perform TensorFlow Text inference.\n", "tf_result = model(tf.constant(input_data))\n", "print('TensorFlow result = ', tf_result['tokens'])" ] }, { "cell_type": "markdown", "metadata": { "id": "YKpFsvJGTlPq" }, "source": [ "## Convert the TensorFlow model to TensorFlow Lite\n", "\n", "When converting a TensorFlow model with TensorFlow Text operators to TensorFlow Lite, you need to\n", "indicate to the `TFLiteConverter` that the model uses custom operators by setting the\n", "`allow_custom_ops` attribute, as in the example below. You can then run the model conversion as you normally would. Review the [TensorFlow Lite converter](https://www.tensorflow.org/lite/convert) documentation for a detailed guide on the basics of model conversion." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T12:51:10.707127Z", "iopub.status.busy": "2024-07-19T12:51:10.706878Z", "iopub.status.idle": "2024-07-19T12:51:12.927063Z", "shell.execute_reply": "2024-07-19T12:51:12.926314Z" }, "id": "6hYWezs1Tndo" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpgiujamnl/assets\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2024-07-19 12:51:12.711464: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:362] Ignored output_format.\n", "2024-07-19 12:51:12.711504: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:365] Ignored drop_control_dependency.\n", "2024-07-19 12:51:12.906497: W tensorflow/compiler/mlir/lite/flatbuffer_export.cc:2057] The following operation(s) need TFLite custom op implementation(s):\n", "Custom ops: TFText>WhitespaceTokenizeWithOffsetsV2\n", "Details:\n", "\ttf.TFText>WhitespaceTokenizeWithOffsetsV2(tensor, tensor) -> (tensor, tensor, tensor, tensor) : {device = \"\"}\n", "See instructions: https://www.tensorflow.org/lite/guide/ops_custom\n" ] } ], "source": [ "# Convert to TensorFlow Lite.\n", "converter = tf.lite.TFLiteConverter.from_keras_model(model)\n", "converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]\n", "converter.allow_custom_ops = True\n", "tflite_model = converter.convert()" ] }, { "cell_type": "markdown", "metadata": { "id": "cxCdhrHATpSR" }, "source": [ "## Inference\n", "\n", "For the TensorFlow Lite interpreter to properly read your model containing TensorFlow Text operators, you must configure it to use these custom operators and provide registration methods for them. Use `tf_text.tflite_registrar.SELECT_TFTEXT_OPS` to provide the full suite of registration functions for the supported TensorFlow Text operators to `InterpreterWithCustomOps`.\n", "\n", "Note that while the example below shows inference in Python, the steps are similar in other languages, with some minor API translations and the need to build the `tflite_registrar` into your binary. 
See [TensorFlow Lite Inference](https://www.tensorflow.org/lite/guide/inference) for more details." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T12:51:12.930522Z", "iopub.status.busy": "2024-07-19T12:51:12.930240Z", "iopub.status.idle": "2024-07-19T12:51:12.937969Z", "shell.execute_reply": "2024-07-19T12:51:12.937328Z" }, "id": "kykFg2pXTriw" }, "outputs": [ { "data": { "text/plain": [ "{'serving_default': {'inputs': ['input'], 'outputs': ['tokens']}}" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Perform TensorFlow Lite inference.\n", "interp = interpreter.InterpreterWithCustomOps(\n", "    model_content=tflite_model,\n", "    custom_op_registerers=tf_text.tflite_registrar.SELECT_TFTEXT_OPS)\n", "interp.get_signature_list()" ] }, { "cell_type": "markdown", "metadata": { "id": "rNGPpHCCTxVX" }, "source": [ "Next, the TensorFlow Lite interpreter is invoked with the input, producing a result that matches the TensorFlow result from above." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2024-07-19T12:51:12.941131Z", "iopub.status.busy": "2024-07-19T12:51:12.940889Z", "iopub.status.idle": "2024-07-19T12:51:12.945202Z", "shell.execute_reply": "2024-07-19T12:51:12.944510Z" }, "id": "vmSbfbgJTyKY" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "TensorFlow Lite result = [b'Some' b'minds' b'are' b'better' b'kept' b'apart']\n" ] } ], "source": [ "# Run the 'serving_default' signature on the test input.\n", "tokenize = interp.get_signature_runner('serving_default')\n", "output = tokenize(input=input_data)\n", "print('TensorFlow Lite result = ', output['tokens'])" ] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "text_tf_lite.ipynb", "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.19" } }, "nbformat": 4, "nbformat_minor": 0 }