{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "DjUA6S30k52h" }, "source": [ "##### Copyright 2021 The TensorFlow Authors." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:19:56.794046Z", "iopub.status.busy": "2024-05-08T09:19:56.793532Z", "iopub.status.idle": "2024-05-08T09:19:56.797635Z", "shell.execute_reply": "2024-05-08T09:19:56.796939Z" }, "id": "SpNWyqewk8fE" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "6x1ypzczQCwy" }, "source": [ "# Feature Engineering using TFX Pipeline and TensorFlow Transform\n", "\n", "***Transform input data and train a model with a TFX pipeline.***" ] }, { "cell_type": "markdown", "metadata": { "id": "HU9YYythm0dx" }, "source": [ "Note: We recommend running this tutorial in a Colab notebook, with no setup required! Just click \"Run in Google Colab\".\n", "\n", "
\n", "\n", "\n", "\n", "\n", "
\n", "View on TensorFlow.org\n", "Run in Google Colab\n", "View source on GitHubDownload notebook
" ] }, { "cell_type": "markdown", "metadata": { "id": "_VuwrlnvQJ5k" }, "source": [ "In this notebook-based tutorial, we will create and run a TFX pipeline\n", "to ingest raw input data and preprocess it appropriately for ML training.\n", "This notebook is based on the TFX pipeline we built in\n", "[Data validation using TFX Pipeline and TensorFlow Data Validation Tutorial](https://www.tensorflow.org/tfx/tutorials/tfx/penguin_tfdv).\n", "If you have not read that one yet, you should read it before proceeding with\n", "this notebook.\n", "\n", "You can increase the predictive quality of your data and/or reduce\n", "dimensionality with feature engineering. One of the benefits of using TFX is\n", "that you will write your transformation code once, and the resulting transforms\n", "will be consistent between training and serving in\n", "order to avoid training/serving skew.\n", "\n", "We will add a `Transform` component to the pipeline. The Transform component is\n", "implemented using the\n", "[tf.transform](https://www.tensorflow.org/tfx/transform/get_started) library.\n", "\n", "Please see\n", "[Understanding TFX Pipelines](https://www.tensorflow.org/tfx/guide/understanding_tfx_pipelines)\n", "to learn more about various concepts in TFX." ] }, { "cell_type": "markdown", "metadata": { "id": "Fmgi8ZvQkScg" }, "source": [ "## Set Up\n", "We first need to install the TFX Python package and download\n", "the dataset which we will use for our model.\n", "\n", "### Upgrade Pip\n", "\n", "To avoid upgrading Pip in a system when running locally,\n", "check to make sure that we are running in Colab.\n", "Local systems can of course be upgraded separately." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:19:56.801173Z", "iopub.status.busy": "2024-05-08T09:19:56.800717Z", "iopub.status.idle": "2024-05-08T09:19:56.809671Z", "shell.execute_reply": "2024-05-08T09:19:56.809016Z" }, "id": "as4OTe2ukSqm" }, "outputs": [], "source": [ "try:\n", " import colab\n", " !pip install --upgrade pip\n", "except:\n", " pass" ] }, { "cell_type": "markdown", "metadata": { "id": "MZOYTt1RW4TK" }, "source": [ "### Install TFX\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:19:56.812984Z", "iopub.status.busy": "2024-05-08T09:19:56.812384Z", "iopub.status.idle": "2024-05-08T09:20:07.661347Z", "shell.execute_reply": "2024-05-08T09:20:07.660462Z" }, "id": "iyQtljP-qPHY" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: tfx in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (1.15.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: ml-pipelines-sdk==1.15.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.15.0)\r\n", "Requirement already satisfied: absl-py<2.0.0,>=0.9 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.4.0)\r\n", "Requirement already satisfied: ml-metadata<1.16.0,>=1.15.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.15.0)\r\n", "Requirement already satisfied: packaging>=22 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (24.0)\r\n", "Requirement already satisfied: portpicker<2,>=1.3.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.6.0)\r\n", "Requirement already satisfied: protobuf<5,>=3.20.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (3.20.3)\r\n", 
"Requirement already satisfied: docker<5,>=4.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (4.4.4)\r\n", "Requirement already satisfied: google-apitools<1,>=0.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (0.5.31)\r\n", "Requirement already satisfied: google-api-python-client<2,>=1.8 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.12.11)\r\n", "Requirement already satisfied: jinja2<4,>=2.7.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (3.1.4)\r\n", "Requirement already satisfied: typing-extensions<5,>=3.10.0.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (4.11.0)\r\n", "Requirement already satisfied: apache-beam<3,>=2.47 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.56.0)\r\n", "Requirement already satisfied: attrs<24,>=19.3.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (23.2.0)\r\n", "Requirement already satisfied: click<9,>=7 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (8.1.7)\r\n", "Requirement already satisfied: google-api-core<3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (2.19.0)\r\n", "Requirement already satisfied: google-cloud-aiplatform<2,>=1.6.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.50.0)\r\n", "Requirement already satisfied: google-cloud-bigquery<4,>=3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (3.22.0)\r\n", "Requirement already satisfied: grpcio<2,>=1.28.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.63.0)\r\n", "Requirement already satisfied: keras-tuner!=1.4.0,!=1.4.1,<2,>=1.0.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.4.7)\r\n", "Requirement already satisfied: kubernetes<13,>=10.0.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (12.0.1)\r\n", "Requirement already satisfied: numpy<2,>=1.16 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.26.4)\r\n", "Requirement already satisfied: pyarrow<11,>=10 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (10.0.1)\r\n", "Requirement already satisfied: scipy<1.13 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.12.0)\r\n", "Requirement already satisfied: pyyaml<7,>=6 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (6.0.1)\r\n", "Requirement already satisfied: tensorflow<2.16,>=2.15.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (2.15.1)\r\n", "Requirement already satisfied: tensorflow-hub<0.16,>=0.15.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (0.15.0)\r\n", "Requirement already satisfied: tensorflow-data-validation<1.16.0,>=1.15.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.15.1)\r\n", "Requirement already satisfied: tensorflow-model-analysis<0.47.0,>=0.46.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (0.46.0)\r\n", "Requirement already satisfied: tensorflow-serving-api<2.16,>=2.15 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (2.15.1)\r\n", "Requirement already satisfied: tensorflow-transform<1.16.0,>=1.15.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.15.0)\r\n", "Requirement already satisfied: tfx-bsl<1.16.0,>=1.15.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tfx) (1.15.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: crcmod<2.0,>=1.7 in 
/tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (1.7)\r\n", "Requirement already satisfied: orjson<4,>=3.9.7 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (3.10.3)\r\n", "Requirement already satisfied: dill<0.3.2,>=0.3.1.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.3.1.1)\r\n", "Requirement already satisfied: cloudpickle~=2.2.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2.2.1)\r\n", "Requirement already satisfied: fastavro<2,>=0.23.6 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (1.9.4)\r\n", "Requirement already satisfied: fasteners<1.0,>=0.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.19)\r\n", "Requirement already satisfied: hdfs<3.0.0,>=2.1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2.7.3)\r\n", "Requirement already satisfied: httplib2<0.23.0,>=0.8 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.22.0)\r\n", "Requirement already satisfied: jsonschema<5.0.0,>=4.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (4.22.0)\r\n", "Requirement already satisfied: jsonpickle<4.0.0,>=3.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (3.0.4)\r\n", "Requirement already satisfied: objsize<0.8.0,>=0.6.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.7.0)\r\n", "Requirement already satisfied: pymongo<5.0.0,>=3.8.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (4.7.2)\r\n", "Requirement already satisfied: proto-plus<2,>=1.7.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (1.23.0)\r\n", "Requirement already satisfied: pydot<2,>=1.2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (1.4.2)\r\n", "Requirement already satisfied: python-dateutil<3,>=2.8.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2.9.0.post0)\r\n", "Requirement already satisfied: pytz>=2018.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2024.1)\r\n", "Requirement already satisfied: redis<6,>=5.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (5.0.4)\r\n", "Requirement already satisfied: regex>=2020.6.8 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2024.4.28)\r\n", "Requirement already satisfied: requests<3.0.0,>=2.24.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2.31.0)\r\n", "Requirement already satisfied: zstandard<1,>=0.18.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) 
(0.22.0)\r\n", "Requirement already satisfied: pyarrow-hotfix<1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.6)\r\n", "Requirement already satisfied: js2py<1,>=0.74 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.74)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: cachetools<6,>=3.1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (5.3.3)\r\n", "Requirement already satisfied: google-auth<3,>=1.18.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.29.0)\r\n", "Requirement already satisfied: google-auth-httplib2<0.3.0,>=0.1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (0.2.0)\r\n", "Requirement already satisfied: google-cloud-datastore<3,>=2.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.19.0)\r\n", "Requirement already satisfied: google-cloud-pubsub<3,>=2.1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.21.1)\r\n", "Requirement already satisfied: google-cloud-pubsublite<2,>=1.2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (1.10.0)\r\n", "Requirement already satisfied: google-cloud-storage<3,>=2.14.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.16.0)\r\n", "Requirement already satisfied: google-cloud-bigquery-storage<3,>=2.6.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.25.0)\r\n", "Requirement already satisfied: google-cloud-core<3,>=2.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.4.1)\r\n", "Requirement already satisfied: google-cloud-bigtable<3,>=2.19.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.23.1)\r\n", "Requirement already satisfied: google-cloud-spanner<4,>=3.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (3.46.0)\r\n", "Requirement already satisfied: google-cloud-dlp<4,>=3.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (3.17.0)\r\n", "Requirement already satisfied: google-cloud-language<3,>=2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.13.3)\r\n", "Requirement already satisfied: google-cloud-videointelligence<3,>=2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (2.13.3)\r\n", "Requirement already satisfied: google-cloud-vision<4,>=2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (3.7.2)\r\n", "Requirement already satisfied: google-cloud-recommendations-ai<0.11.0,>=0.1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from apache-beam[gcp]<3,>=2.47->tfx) (0.10.10)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: six>=1.4.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from docker<5,>=4.1->tfx) (1.16.0)\r\n", "Requirement already satisfied: websocket-client>=0.32.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from docker<5,>=4.1->tfx) (1.8.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", 
"text": [ "Requirement already satisfied: googleapis-common-protos<2.0.dev0,>=1.56.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-api-core<3->tfx) (1.63.0)\r\n", "Requirement already satisfied: uritemplate<4dev,>=3.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-api-python-client<2,>=1.8->tfx) (3.0.1)\r\n", "Requirement already satisfied: oauth2client>=1.4.12 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-apitools<1,>=0.5->tfx) (4.1.3)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: google-cloud-resource-manager<3.0.0dev,>=1.3.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-aiplatform<2,>=1.6.2->tfx) (1.12.3)\r\n", "Requirement already satisfied: shapely<3.0.0dev in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-aiplatform<2,>=1.6.2->tfx) (2.0.4)\r\n", "Requirement already satisfied: pydantic<3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-aiplatform<2,>=1.6.2->tfx) (1.10.15)\r\n", "Requirement already satisfied: docstring-parser<1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-aiplatform<2,>=1.6.2->tfx) (0.16)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: google-resumable-media<3.0dev,>=0.6.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-bigquery<4,>=3->tfx) (2.7.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: MarkupSafe>=2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jinja2<4,>=2.7.3->tfx) (2.1.5)\r\n", "Requirement already satisfied: keras in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from keras-tuner!=1.4.0,!=1.4.1,<2,>=1.0.4->tfx) (2.15.0)\r\n", "Requirement already satisfied: kt-legacy in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from keras-tuner!=1.4.0,!=1.4.1,<2,>=1.0.4->tfx) (1.0.5)\r\n", "Requirement already satisfied: certifi>=14.05.14 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from kubernetes<13,>=10.0.1->tfx) (2024.2.2)\r\n", "Requirement already satisfied: setuptools>=21.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from kubernetes<13,>=10.0.1->tfx) (69.5.1)\r\n", "Requirement already satisfied: requests-oauthlib in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from kubernetes<13,>=10.0.1->tfx) (2.0.0)\r\n", "Requirement already satisfied: urllib3>=1.24.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from kubernetes<13,>=10.0.1->tfx) (1.26.18)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: psutil in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from portpicker<2,>=1.3.1->tfx) (5.9.8)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: astunparse>=1.6.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (1.6.3)\r\n", "Requirement already satisfied: flatbuffers>=23.5.26 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (24.3.25)\r\n", "Requirement already satisfied: gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (0.5.4)\r\n", "Requirement already satisfied: google-pasta>=0.1.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (0.2.0)\r\n", "Requirement already 
satisfied: h5py>=2.9.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (3.11.0)\r\n", "Requirement already satisfied: libclang>=13.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (18.1.1)\r\n", "Requirement already satisfied: ml-dtypes~=0.3.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (0.3.2)\r\n", "Requirement already satisfied: opt-einsum>=2.3.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (3.3.0)\r\n", "Requirement already satisfied: termcolor>=1.1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (2.4.0)\r\n", "Requirement already satisfied: wrapt<1.15,>=1.11.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (1.14.1)\r\n", "Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.23.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (0.37.0)\r\n", "Requirement already satisfied: tensorboard<2.16,>=2.15 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (2.15.2)\r\n", "Requirement already satisfied: tensorflow-estimator<2.16,>=2.15.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow<2.16,>=2.15.0->tfx) (2.15.0)\r\n", "Requirement already satisfied: joblib>=1.2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-data-validation<1.16.0,>=1.15.1->tfx) (1.4.2)\r\n", "Requirement already satisfied: pandas<2,>=1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-data-validation<1.16.0,>=1.15.1->tfx) (1.5.3)\r\n", "Requirement already satisfied: pyfarmhash<0.4,>=0.2.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-data-validation<1.16.0,>=1.15.1->tfx) (0.3.2)\r\n", "Requirement already satisfied: tensorflow-metadata<1.16,>=1.15.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-data-validation<1.16.0,>=1.15.1->tfx) (1.15.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: ipython<8,>=7 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (7.34.0)\r\n", "Requirement already satisfied: ipywidgets<8,>=7 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (7.8.1)\r\n", "Requirement already satisfied: pillow>=9.4.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (10.3.0)\r\n", "Requirement already satisfied: rouge-score<2,>=0.1.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.1.2)\r\n", "Requirement already satisfied: sacrebleu<4,>=2.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.4.2)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: wheel<1.0,>=0.23.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from astunparse>=1.6.0->tensorflow<2.16,>=2.15.0->tfx) (0.43.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: grpcio-status<2.0.dev0,>=1.33.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from 
google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,<3.0.0dev,>=1.34.1->google-cloud-aiplatform<2,>=1.6.2->tfx) (1.48.2)\r\n", "Requirement already satisfied: pyasn1-modules>=0.2.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-auth<3,>=1.18.0->apache-beam[gcp]<3,>=2.47->tfx) (0.4.0)\r\n", "Requirement already satisfied: rsa<5,>=3.1.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-auth<3,>=1.18.0->apache-beam[gcp]<3,>=2.47->tfx) (4.9)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: grpc-google-iam-v1<1.0.0dev,>=0.12.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-bigtable<3,>=2.19.0->apache-beam[gcp]<3,>=2.47->tfx) (0.13.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: overrides<8.0.0,>=6.0.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-pubsublite<2,>=1.2.0->apache-beam[gcp]<3,>=2.47->tfx) (7.7.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: sqlparse>=0.4.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-spanner<4,>=3.0.0->apache-beam[gcp]<3,>=2.47->tfx) (0.5.0)\r\n", "Requirement already satisfied: grpc-interceptor>=0.15.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-spanner<4,>=3.0.0->apache-beam[gcp]<3,>=2.47->tfx) (0.15.4)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: google-crc32c<2.0dev,>=1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from google-cloud-storage<3,>=2.14.0->apache-beam[gcp]<3,>=2.47->tfx) (1.5.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: docopt in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from hdfs<3.0.0,>=2.1.0->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.6.2)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from httplib2<0.23.0,>=0.8->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (3.1.2)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: jedi>=0.16 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.19.1)\r\n", "Requirement already satisfied: decorator in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (5.1.1)\r\n", "Requirement already satisfied: pickleshare in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.7.5)\r\n", "Requirement already satisfied: traitlets>=4.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (5.14.3)\r\n", "Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (3.0.43)\r\n", "Requirement already satisfied: pygments in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.18.0)\r\n", "Requirement already satisfied: backcall in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages 
(from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.2.0)\r\n", "Requirement already satisfied: matplotlib-inline in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.1.7)\r\n", "Requirement already satisfied: pexpect>4.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (4.9.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: comm>=0.1.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.2.2)\r\n", "Requirement already satisfied: ipython-genutils~=0.2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.2.0)\r\n", "Requirement already satisfied: widgetsnbextension~=3.6.6 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (3.6.6)\r\n", "Requirement already satisfied: jupyterlab-widgets<3,>=1.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.1.7)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: tzlocal>=1.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from js2py<1,>=0.74->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (5.2)\r\n", "Requirement already satisfied: pyjsparser>=2.5.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from js2py<1,>=0.74->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2.7.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jsonschema<5.0.0,>=4.0.0->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2023.12.1)\r\n", "Requirement already satisfied: referencing>=0.28.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jsonschema<5.0.0,>=4.0.0->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.35.1)\r\n", "Requirement already satisfied: rpds-py>=0.7.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jsonschema<5.0.0,>=4.0.0->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (0.18.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: pyasn1>=0.1.7 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from oauth2client>=1.4.12->google-apitools<1,>=0.5->tfx) (0.6.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: dnspython<3.0.0,>=1.16.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from pymongo<5.0.0,>=3.8.0->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (2.6.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: async-timeout>=4.0.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from redis<6,>=5.0.0->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (4.0.3)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: charset-normalizer<4,>=2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from requests<3.0.0,>=2.24.0->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (3.3.2)\r\n", "Requirement already satisfied: idna<4,>=2.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from 
requests<3.0.0,>=2.24.0->apache-beam<3,>=2.47->apache-beam[gcp]<3,>=2.47->tfx) (3.7)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: nltk in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from rouge-score<2,>=0.1.2->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (3.8.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: portalocker in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from sacrebleu<4,>=2.3->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.8.2)\r\n", "Requirement already satisfied: tabulate>=0.8.9 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from sacrebleu<4,>=2.3->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.9.0)\r\n", "Requirement already satisfied: colorama in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from sacrebleu<4,>=2.3->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.4.6)\r\n", "Requirement already satisfied: lxml in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from sacrebleu<4,>=2.3->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (5.2.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: google-auth-oauthlib<2,>=0.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tfx) (1.2.0)\r\n", "Requirement already satisfied: markdown>=2.6.8 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tfx) (3.6)\r\n", "Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tfx) (0.7.2)\r\n", "Requirement already satisfied: werkzeug>=1.0.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tfx) (3.0.3)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: oauthlib>=3.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from requests-oauthlib->kubernetes<13,>=10.0.1->tfx) (3.2.2)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: parso<0.9.0,>=0.8.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jedi>=0.16->ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.8.4)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: importlib-metadata>=4.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from markdown>=2.6.8->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tfx) (7.1.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: ptyprocess>=0.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from pexpect>4.3->ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.7.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: wcwidth in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.2.13)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: notebook>=4.4.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (7.1.3)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ 
"Requirement already satisfied: tqdm in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nltk->rouge-score<2,>=0.1.2->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (4.66.4)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: zipp>=0.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard<2.16,>=2.15->tensorflow<2.16,>=2.15.0->tfx) (3.18.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: jupyter-server<3,>=2.4.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.14.0)\r\n", "Requirement already satisfied: jupyterlab-server<3,>=2.22.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.27.1)\r\n", "Requirement already satisfied: jupyterlab<4.2,>=4.1.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (4.1.8)\r\n", "Requirement already satisfied: notebook-shim<0.3,>=0.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.2.4)\r\n", "Requirement already satisfied: tornado>=6.2.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (6.4)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: anyio>=3.1.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (4.3.0)\r\n", "Requirement already satisfied: argon2-cffi>=21.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (23.1.0)\r\n", "Requirement already satisfied: jupyter-client>=7.4.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (8.6.1)\r\n", "Requirement already satisfied: jupyter-core!=5.0.*,>=4.12 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (5.7.2)\r\n", "Requirement already satisfied: jupyter-events>=0.9.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.10.0)\r\n", "Requirement already satisfied: jupyter-server-terminals>=0.4.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.5.3)\r\n", "Requirement already satisfied: nbconvert>=6.4.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from 
jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (7.16.4)\r\n", "Requirement already satisfied: nbformat>=5.3.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (5.10.4)\r\n", "Requirement already satisfied: prometheus-client>=0.9 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.20.0)\r\n", "Requirement already satisfied: pyzmq>=24 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (26.0.3)\r\n", "Requirement already satisfied: send2trash>=1.8.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.8.3)\r\n", "Requirement already satisfied: terminado>=0.8.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.18.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: async-lru>=1.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.0.4)\r\n", "Requirement already satisfied: httpx>=0.25.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.27.0)\r\n", "Requirement already satisfied: ipykernel>=6.5.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (6.29.4)\r\n", "Requirement already satisfied: jupyter-lsp>=2.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.2.5)\r\n", "Requirement already satisfied: tomli>=1.2.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.0.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: babel>=2.10 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyterlab-server<3,>=2.22.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.15.0)\r\n", "Requirement already satisfied: json5>=0.9.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyterlab-server<3,>=2.22.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.9.25)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: sniffio>=1.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from 
anyio>=3.1.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.3.1)\r\n", "Requirement already satisfied: exceptiongroup>=1.0.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from anyio>=3.1.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.2.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: argon2-cffi-bindings in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (21.2.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: httpcore==1.* in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from httpx>=0.25.0->jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.0.5)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: h11<0.15,>=0.13 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from httpcore==1.*->httpx>=0.25.0->jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.14.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: debugpy>=1.6.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipykernel>=6.5.0->jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.8.1)\r\n", "Requirement already satisfied: nest-asyncio in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from ipykernel>=6.5.0->jupyterlab<4.2,>=4.1.1->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.6.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: platformdirs>=2.5 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-core!=5.0.*,>=4.12->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (4.2.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: python-json-logger>=2.0.4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.0.7)\r\n", "Requirement already satisfied: rfc3339-validator in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.1.4)\r\n", "Requirement already satisfied: rfc3986-validator>=0.1.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.1.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: beautifulsoup4 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from 
nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (4.12.3)\r\n", "Requirement already satisfied: bleach!=5.0.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (6.1.0)\r\n", "Requirement already satisfied: defusedxml in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.7.1)\r\n", "Requirement already satisfied: jupyterlab-pygments in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.3.0)\r\n", "Requirement already satisfied: mistune<4,>=2.0.3 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (3.0.2)\r\n", "Requirement already satisfied: nbclient>=0.5.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.10.0)\r\n", "Requirement already satisfied: pandocfilters>=1.4.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.5.1)\r\n", "Requirement already satisfied: tinycss2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.3.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: fastjsonschema>=2.15 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from nbformat>=5.3.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.19.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: webencodings in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from bleach!=5.0.0->nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (0.5.1)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: fqdn in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.5.1)\r\n", "Requirement already satisfied: isoduration in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (20.11.0)\r\n", "Requirement already satisfied: jsonpointer>1.13 in 
/tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.4)\r\n", "Requirement already satisfied: uri-template in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.3.0)\r\n", "Requirement already satisfied: webcolors>=1.11 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.13)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: cffi>=1.0.1 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from argon2-cffi-bindings->argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.16.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: soupsieve>1.2 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from beautifulsoup4->nbconvert>=6.4.4->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.5)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: pycparser in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from cffi>=1.0.1->argon2-cffi-bindings->argon2-cffi>=21.1->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.22)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: arrow>=0.15.0 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from isoduration->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (1.3.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: types-python-dateutil>=2.8.10 in /tmpfs/src/tf_docs_env/lib/python3.9/site-packages (from arrow>=0.15.0->isoduration->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook>=4.4.1->widgetsnbextension~=3.6.6->ipywidgets<8,>=7->tensorflow-model-analysis<0.47.0,>=0.46.0->tfx) (2.9.0.20240316)\r\n" ] } ], "source": [ "!pip install -U tfx" ] }, { "cell_type": "markdown", "metadata": { "id": "EwT0nov5QO1M" }, "source": [ "### Did you restart the runtime?\n", "\n", "If you are using Google Colab, the first time that you run\n", "the cell above, you must restart the runtime by clicking\n", "above \"RESTART RUNTIME\" button or using \"Runtime > Restart\n", "runtime ...\" menu. This is because of the way that Colab\n", "loads packages." ] }, { "cell_type": "markdown", "metadata": { "id": "BDnPgN8UJtzN" }, "source": [ "Check the TensorFlow and TFX versions." 
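, "\n", "\n", "The `tfx` package installed above pins `tensorflow<2.16,>=2.15.0` (visible in the install log), so the TensorFlow version printed here should be a 2.15.x release alongside TFX 1.15."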
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:07.666159Z", "iopub.status.busy": "2024-05-08T09:20:07.665603Z", "iopub.status.idle": "2024-05-08T09:20:13.742031Z", "shell.execute_reply": "2024-05-08T09:20:13.741212Z" }, "id": "6jh7vKSRqPHb" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-05-08 09:20:08.101132: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n", "2024-05-08 09:20:08.101177: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n", "2024-05-08 09:20:08.102632: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "TensorFlow version: 2.15.1\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "TFX version: 1.15.0\n" ] } ], "source": [ "import tensorflow as tf\n", "print('TensorFlow version: {}'.format(tf.__version__))\n", "from tfx import v1 as tfx\n", "print('TFX version: {}'.format(tfx.__version__))" ] }, { "cell_type": "markdown", "metadata": { "id": "aDtLdSkvqPHe" }, "source": [ "### Set up variables\n", "\n", "There are some variables used to define a pipeline. You can customize these\n", "variables as you want. By default all output from the pipeline will be\n", "generated under the current directory." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:13.745748Z", "iopub.status.busy": "2024-05-08T09:20:13.745313Z", "iopub.status.idle": "2024-05-08T09:20:13.750011Z", "shell.execute_reply": "2024-05-08T09:20:13.749380Z" }, "id": "EcUseqJaE2XN" }, "outputs": [], "source": [ "import os\n", "\n", "PIPELINE_NAME = \"penguin-transform\"\n", "\n", "# Output directory to store artifacts generated from the pipeline.\n", "PIPELINE_ROOT = os.path.join('pipelines', PIPELINE_NAME)\n", "# Path to a SQLite DB file to use as an MLMD storage.\n", "METADATA_PATH = os.path.join('metadata', PIPELINE_NAME, 'metadata.db')\n", "# Output directory where created models from the pipeline will be exported.\n", "SERVING_MODEL_DIR = os.path.join('serving_model', PIPELINE_NAME)\n", "\n", "from absl import logging\n", "logging.set_verbosity(logging.INFO) # Set default logging level." ] }, { "cell_type": "markdown", "metadata": { "id": "qsO0l5F3dzOr" }, "source": [ "### Prepare example data\n", "We will download the example dataset for use in our TFX pipeline. The dataset\n", "we are using is\n", "[Palmer Penguins dataset](https://allisonhorst.github.io/palmerpenguins/articles/intro.html).\n", "\n", "However, unlike previous tutorials which used an already preprocessed dataset,\n", "we will use the **raw** Palmer Penguins dataset.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "11J7XiCq6AFP" }, "source": [ "Because the TFX ExampleGen component reads inputs from a directory, we need\n", "to create a directory and copy the dataset to it." 
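, "\n", "\n", "For orientation, here is a rough sketch (not the pipeline definition itself, which comes later in this tutorial) of how such a directory is handed to an ExampleGen component (here, `CsvExampleGen`) as its input base:\n", "\n", "```python\n", "# Sketch only: DATA_ROOT is the directory created in the next code cell.\n", "example_gen = tfx.components.CsvExampleGen(input_base=DATA_ROOT)\n", "```"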
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:13.753137Z", "iopub.status.busy": "2024-05-08T09:20:13.752872Z", "iopub.status.idle": "2024-05-08T09:20:13.817861Z", "shell.execute_reply": "2024-05-08T09:20:13.817203Z" }, "id": "4fxMs6u86acP" }, "outputs": [ { "data": { "text/plain": [ "('/tmpfs/tmp/tfx-data244l5nap/data.csv',\n", " )" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import urllib.request\n", "import tempfile\n", "\n", "DATA_ROOT = tempfile.mkdtemp(prefix='tfx-data') # Create a temporary directory.\n", "_data_path = 'https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins_size.csv'\n", "_data_filepath = os.path.join(DATA_ROOT, \"data.csv\")\n", "urllib.request.urlretrieve(_data_path, _data_filepath)" ] }, { "cell_type": "markdown", "metadata": { "id": "ASpoNmxKSQjI" }, "source": [ "Take a quick look at what the raw data looks like." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:13.820957Z", "iopub.status.busy": "2024-05-08T09:20:13.820708Z", "iopub.status.idle": "2024-05-08T09:20:13.957672Z", "shell.execute_reply": "2024-05-08T09:20:13.956832Z" }, "id": "-eSz28UDSnlG" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "species,island,culmen_length_mm,culmen_depth_mm,flipper_length_mm,body_mass_g,sex\r\n", "Adelie,Torgersen,39.1,18.7,181,3750,MALE\r\n", "Adelie,Torgersen,39.5,17.4,186,3800,FEMALE\r\n", "Adelie,Torgersen,40.3,18,195,3250,FEMALE\r\n", "Adelie,Torgersen,NA,NA,NA,NA,NA\r\n", "Adelie,Torgersen,36.7,19.3,193,3450,FEMALE\r\n", "Adelie,Torgersen,39.3,20.6,190,3650,MALE\r\n", "Adelie,Torgersen,38.9,17.8,181,3625,FEMALE\r\n", "Adelie,Torgersen,39.2,19.6,195,4675,MALE\r\n", "Adelie,Torgersen,34.1,18.1,193,3475,NA\r\n" ] } ], "source": [ "!head {_data_filepath}" ] }, { "cell_type": "markdown", "metadata": { "id": "OTtQNq1DdVvG" }, "source": [ "There are some entries with missing values which are represented as `NA`.\n", "We will just delete those entries in this tutorial." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:13.961327Z", "iopub.status.busy": "2024-05-08T09:20:13.961029Z", "iopub.status.idle": "2024-05-08T09:20:14.225707Z", "shell.execute_reply": "2024-05-08T09:20:14.224816Z" }, "id": "fQhpoaqff9ca" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "species,island,culmen_length_mm,culmen_depth_mm,flipper_length_mm,body_mass_g,sex\r\n", "Adelie,Torgersen,39.1,18.7,181,3750,MALE\r\n", "Adelie,Torgersen,39.5,17.4,186,3800,FEMALE\r\n", "Adelie,Torgersen,40.3,18,195,3250,FEMALE\r\n", "Adelie,Torgersen,36.7,19.3,193,3450,FEMALE\r\n", "Adelie,Torgersen,39.3,20.6,190,3650,MALE\r\n", "Adelie,Torgersen,38.9,17.8,181,3625,FEMALE\r\n", "Adelie,Torgersen,39.2,19.6,195,4675,MALE\r\n", "Adelie,Torgersen,41.1,17.6,182,3200,FEMALE\r\n", "Adelie,Torgersen,38.6,21.2,191,3800,MALE\r\n" ] } ], "source": [ "!sed -i '/\\bNA\\b/d' {_data_filepath}\n", "!head {_data_filepath}" ] }, { "cell_type": "markdown", "metadata": { "id": "z8EOfCy1dzO2" }, "source": [ "You should be able to see seven features which describe penguins. 
We will use\n", "the same set of features as the previous tutorials - 'culmen_length_mm',\n", "'culmen_depth_mm', 'flipper_length_mm', 'body_mass_g' - and will predict\n", "the 'species' of a penguin.\n", "\n", "**The only difference will be that the input data is not preprocessed.** Note\n", "that we will not use other features like 'island' or 'sex' in this tutorial." ] }, { "cell_type": "markdown", "metadata": { "id": "Jtbrkjjc-IKA" }, "source": [ "### Prepare a schema file\n", "\n", "As described in\n", "[Data validation using TFX Pipeline and TensorFlow Data Validation Tutorial](https://www.tensorflow.org/tfx/tutorials/tfx/penguin_tfdv),\n", "we need a schema file for the dataset. Because this dataset is different from the one used in the previous tutorial, the schema needs to be generated again. In this tutorial, we will skip those steps and just use a prepared schema file.\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:14.230007Z", "iopub.status.busy": "2024-05-08T09:20:14.229697Z", "iopub.status.idle": "2024-05-08T09:20:14.372362Z", "shell.execute_reply": "2024-05-08T09:20:14.371745Z" }, "id": "EDoB97m8B9nG" }, "outputs": [ { "data": { "text/plain": [ "('schema/schema.pbtxt', )" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import shutil\n", "\n", "SCHEMA_PATH = 'schema'\n", "\n", "_schema_uri = 'https://raw.githubusercontent.com/tensorflow/tfx/master/tfx/examples/penguin/schema/raw/schema.pbtxt'\n", "_schema_filename = 'schema.pbtxt'\n", "_schema_filepath = os.path.join(SCHEMA_PATH, _schema_filename)\n", "\n", "os.makedirs(SCHEMA_PATH, exist_ok=True)\n", "urllib.request.urlretrieve(_schema_uri, _schema_filepath)" ] }, { "cell_type": "markdown", "metadata": { "id": "gKJ_HDJQB94b" }, "source": [ "This schema file was created with the same pipeline as in the previous tutorial\n", "without any manual changes." ] }, { "cell_type": "markdown", "metadata": { "id": "nH6gizcpSwWV" }, "source": [ "## Create a pipeline\n", "\n", "TFX pipelines are defined using Python APIs. We will add a `Transform`\n", "component to the pipeline we created in the\n", "[Data Validation tutorial](https://www.tensorflow.org/tfx/tutorials/tfx/penguin_tfdv).\n", "\n", "A Transform component requires input data from an `ExampleGen` component and\n", "a schema from a `SchemaGen` component (in this pipeline, from the schema\n", "importer), and produces a \"transform graph\". The\n", "output will be used in a `Trainer` component. Transform can also optionally\n", "produce \"transformed data\", which is the materialized data after\n", "transformation.\n", "In this tutorial, however, we will transform the data during training without\n", "materializing the intermediate transformed data.\n", "\n", "One thing to note is that we need to define a Python function,\n", "`preprocessing_fn`, to describe how the input data should be transformed. This is\n", "similar to the Trainer component, which also requires user code for model\n", "definition.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "lOjDv93eS5xV" }, "source": [ "### Write preprocessing and training code\n", "\n", "We need to define two Python functions: one for Transform and one for Trainer.\n", "\n", "#### preprocessing_fn\n", "The Transform component will find a function named `preprocessing_fn` in the\n", "given module file, just as we did for the `Trainer` component.
You can also specify a\n", "specific function using the\n", "[`preprocessing_fn` parameter](https://github.com/tensorflow/tfx/blob/142de6e887f26f4101ded7925f60d7d4fe9d42ed/tfx/components/transform/component.py#L113)\n", "of the Transform component.\n", "\n", "In this example, we will do two kinds of transformation. For continuous numeric\n", "features like `culmen_length_mm` and `body_mass_g`, we will normalize these\n", "values using the\n", "[tft.scale_to_z_score](https://www.tensorflow.org/tfx/transform/api_docs/python/tft/scale_to_z_score)\n", "function. For the label feature, we need to convert string labels into numeric\n", "index values. We will use\n", "[`tf.lookup.StaticHashTable`](https://www.tensorflow.org/api_docs/python/tf/lookup/StaticHashTable)\n", "for conversion.\n", "\n", "#### run_fn\n", "\n", "The model itself is almost the same as in the previous tutorials, but this time\n", "we will transform the input data using the transform graph from the Transform\n", "component.\n", "\n", "One more important difference compared to the previous tutorial is that now we\n", "export a model for serving which includes not only the computation graph of the\n", "model, but also the transform graph for preprocessing, which is generated in\n", "Transform component. We need to define a separate function which will be used\n", "for serving incoming requests. You can see that the same function\n", "`_apply_preprocessing` was used for both of the training data and the\n", "serving request.\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:14.375690Z", "iopub.status.busy": "2024-05-08T09:20:14.375426Z", "iopub.status.idle": "2024-05-08T09:20:14.378709Z", "shell.execute_reply": "2024-05-08T09:20:14.378119Z" }, "id": "aES7Hv5QTDK3" }, "outputs": [], "source": [ "_module_file = 'penguin_utils.py'" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:14.381893Z", "iopub.status.busy": "2024-05-08T09:20:14.381396Z", "iopub.status.idle": "2024-05-08T09:20:14.389282Z", "shell.execute_reply": "2024-05-08T09:20:14.388671Z" }, "id": "Gnc67uQNTDfW" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Writing penguin_utils.py\n" ] } ], "source": [ "%%writefile {_module_file}\n", "\n", "\n", "from typing import List, Text\n", "from absl import logging\n", "import tensorflow as tf\n", "from tensorflow import keras\n", "from tensorflow_metadata.proto.v0 import schema_pb2\n", "import tensorflow_transform as tft\n", "from tensorflow_transform.tf_metadata import schema_utils\n", "\n", "from tfx import v1 as tfx\n", "from tfx_bsl.public import tfxio\n", "\n", "# Specify features that we will use.\n", "_FEATURE_KEYS = [\n", " 'culmen_length_mm', 'culmen_depth_mm', 'flipper_length_mm', 'body_mass_g'\n", "]\n", "_LABEL_KEY = 'species'\n", "\n", "_TRAIN_BATCH_SIZE = 20\n", "_EVAL_BATCH_SIZE = 10\n", "\n", "\n", "# NEW: TFX Transform will call this function.\n", "def preprocessing_fn(inputs):\n", " \"\"\"tf.transform's callback function for preprocessing inputs.\n", "\n", " Args:\n", " inputs: map from feature keys to raw not-yet-transformed features.\n", "\n", " Returns:\n", " Map from string feature key to transformed feature.\n", " \"\"\"\n", " outputs = {}\n", "\n", " # Uses features defined in _FEATURE_KEYS only.\n", " for key in _FEATURE_KEYS:\n", " # tft.scale_to_z_score computes the mean and variance of the given feature\n", " # and scales the output 
based on the result.\n", " outputs[key] = tft.scale_to_z_score(inputs[key])\n", "\n", " # For the label column we provide the mapping from string to index.\n", " # We could instead use `tft.compute_and_apply_vocabulary()` in order to\n", " # compute the vocabulary dynamically and perform a lookup.\n", " # Since in this example there are only 3 possible values, we use a hard-coded\n", " # table for simplicity.\n", " table_keys = ['Adelie', 'Chinstrap', 'Gentoo']\n", " initializer = tf.lookup.KeyValueTensorInitializer(\n", " keys=table_keys,\n", " values=tf.cast(tf.range(len(table_keys)), tf.int64),\n", " key_dtype=tf.string,\n", " value_dtype=tf.int64)\n", " table = tf.lookup.StaticHashTable(initializer, default_value=-1)\n", " outputs[_LABEL_KEY] = table.lookup(inputs[_LABEL_KEY])\n", "\n", " return outputs\n", "\n", "\n", "# NEW: This function will apply the same transform operation to training data\n", "# and serving requests.\n", "def _apply_preprocessing(raw_features, tft_layer):\n", " transformed_features = tft_layer(raw_features)\n", " if _LABEL_KEY in raw_features:\n", " transformed_label = transformed_features.pop(_LABEL_KEY)\n", " return transformed_features, transformed_label\n", " else:\n", " return transformed_features, None\n", "\n", "\n", "# NEW: This function will create a handler function which gets a serialized\n", "# tf.example, preprocess and run an inference with it.\n", "def _get_serve_tf_examples_fn(model, tf_transform_output):\n", " # We must save the tft_layer to the model to ensure its assets are kept and\n", " # tracked.\n", " model.tft_layer = tf_transform_output.transform_features_layer()\n", "\n", " @tf.function(input_signature=[\n", " tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')\n", " ])\n", " def serve_tf_examples_fn(serialized_tf_examples):\n", " # Expected input is a string which is serialized tf.Example format.\n", " feature_spec = tf_transform_output.raw_feature_spec()\n", " # Because input schema includes unnecessary fields like 'species' and\n", " # 'island', we filter feature_spec to include required keys only.\n", " required_feature_spec = {\n", " k: v for k, v in feature_spec.items() if k in _FEATURE_KEYS\n", " }\n", " parsed_features = tf.io.parse_example(serialized_tf_examples,\n", " required_feature_spec)\n", "\n", " # Preprocess parsed input with transform operation defined in\n", " # preprocessing_fn().\n", " transformed_features, _ = _apply_preprocessing(parsed_features,\n", " model.tft_layer)\n", " # Run inference with ML model.\n", " return model(transformed_features)\n", "\n", " return serve_tf_examples_fn\n", "\n", "\n", "def _input_fn(file_pattern: List[Text],\n", " data_accessor: tfx.components.DataAccessor,\n", " tf_transform_output: tft.TFTransformOutput,\n", " batch_size: int = 200) -> tf.data.Dataset:\n", " \"\"\"Generates features and label for tuning/training.\n", "\n", " Args:\n", " file_pattern: List of paths or patterns of input tfrecord files.\n", " data_accessor: DataAccessor for converting input to RecordBatch.\n", " tf_transform_output: A TFTransformOutput.\n", " batch_size: representing the number of consecutive elements of returned\n", " dataset to combine in a single batch\n", "\n", " Returns:\n", " A dataset that contains (features, indices) tuple where features is a\n", " dictionary of Tensors, and indices is a single Tensor of label indices.\n", " \"\"\"\n", " dataset = data_accessor.tf_dataset_factory(\n", " file_pattern,\n", " tfxio.TensorFlowDatasetOptions(batch_size=batch_size),\n", " 
schema=tf_transform_output.raw_metadata.schema)\n", "\n", " transform_layer = tf_transform_output.transform_features_layer()\n", " def apply_transform(raw_features):\n", " return _apply_preprocessing(raw_features, transform_layer)\n", "\n", " return dataset.map(apply_transform).repeat()\n", "\n", "\n", "def _build_keras_model() -> tf.keras.Model:\n", " \"\"\"Creates a DNN Keras model for classifying penguin data.\n", "\n", " Returns:\n", " A Keras Model.\n", " \"\"\"\n", " # The model below is built with Functional API, please refer to\n", " # https://www.tensorflow.org/guide/keras/overview for all API options.\n", " inputs = [\n", " keras.layers.Input(shape=(1,), name=key)\n", " for key in _FEATURE_KEYS\n", " ]\n", " d = keras.layers.concatenate(inputs)\n", " for _ in range(2):\n", " d = keras.layers.Dense(8, activation='relu')(d)\n", " outputs = keras.layers.Dense(3)(d)\n", "\n", " model = keras.Model(inputs=inputs, outputs=outputs)\n", " model.compile(\n", " optimizer=keras.optimizers.Adam(1e-2),\n", " loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", " metrics=[keras.metrics.SparseCategoricalAccuracy()])\n", "\n", " model.summary(print_fn=logging.info)\n", " return model\n", "\n", "\n", "# TFX Trainer will call this function.\n", "def run_fn(fn_args: tfx.components.FnArgs):\n", " \"\"\"Train the model based on given args.\n", "\n", " Args:\n", " fn_args: Holds args used to train the model as name/value pairs.\n", " \"\"\"\n", " tf_transform_output = tft.TFTransformOutput(fn_args.transform_output)\n", "\n", " train_dataset = _input_fn(\n", " fn_args.train_files,\n", " fn_args.data_accessor,\n", " tf_transform_output,\n", " batch_size=_TRAIN_BATCH_SIZE)\n", " eval_dataset = _input_fn(\n", " fn_args.eval_files,\n", " fn_args.data_accessor,\n", " tf_transform_output,\n", " batch_size=_EVAL_BATCH_SIZE)\n", "\n", " model = _build_keras_model()\n", " model.fit(\n", " train_dataset,\n", " steps_per_epoch=fn_args.train_steps,\n", " validation_data=eval_dataset,\n", " validation_steps=fn_args.eval_steps)\n", "\n", " # NEW: Save a computation graph including transform layer.\n", " signatures = {\n", " 'serving_default': _get_serve_tf_examples_fn(model, tf_transform_output),\n", " }\n", " model.save(fn_args.serving_model_dir, save_format='tf', signatures=signatures)" ] }, { "cell_type": "markdown", "metadata": { "id": "blaw0rs-emEf" }, "source": [ "Now you have completed all of the preparation steps to build a TFX pipeline." ] }, { "cell_type": "markdown", "metadata": { "id": "w3OkNz3gTLwM" }, "source": [ "### Write a pipeline definition\n", "\n", "We define a function to create a TFX pipeline. 
A `Pipeline` object\n", "represents a TFX pipeline, which can be run using one of the pipeline\n", "orchestration systems that TFX supports.\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:14.392378Z", "iopub.status.busy": "2024-05-08T09:20:14.391971Z", "iopub.status.idle": "2024-05-08T09:20:14.400355Z", "shell.execute_reply": "2024-05-08T09:20:14.399761Z" }, "id": "M49yYVNBTPd4" }, "outputs": [], "source": [ "def _create_pipeline(pipeline_name: str, pipeline_root: str, data_root: str,\n", " schema_path: str, module_file: str, serving_model_dir: str,\n", " metadata_path: str) -> tfx.dsl.Pipeline:\n", " \"\"\"Implements the penguin pipeline with TFX.\"\"\"\n", " # Brings data into the pipeline or otherwise joins/converts training data.\n", " example_gen = tfx.components.CsvExampleGen(input_base=data_root)\n", "\n", " # Computes statistics over data for visualization and example validation.\n", " statistics_gen = tfx.components.StatisticsGen(\n", " examples=example_gen.outputs['examples'])\n", "\n", " # Import the schema.\n", " schema_importer = tfx.dsl.Importer(\n", " source_uri=schema_path,\n", " artifact_type=tfx.types.standard_artifacts.Schema).with_id(\n", " 'schema_importer')\n", "\n", " # Performs anomaly detection based on statistics and data schema.\n", " example_validator = tfx.components.ExampleValidator(\n", " statistics=statistics_gen.outputs['statistics'],\n", " schema=schema_importer.outputs['result'])\n", "\n", " # NEW: Transforms input data using preprocessing_fn in the 'module_file'.\n", " transform = tfx.components.Transform(\n", " examples=example_gen.outputs['examples'],\n", " schema=schema_importer.outputs['result'],\n", " materialize=False,\n", " module_file=module_file)\n", "\n", " # Uses user-provided Python function that trains a model.\n", " trainer = tfx.components.Trainer(\n", " module_file=module_file,\n", " examples=example_gen.outputs['examples'],\n", "\n", " # NEW: Pass transform_graph to the trainer.\n", " transform_graph=transform.outputs['transform_graph'],\n", "\n", " train_args=tfx.proto.TrainArgs(num_steps=100),\n", " eval_args=tfx.proto.EvalArgs(num_steps=5))\n", "\n", " # Pushes the model to a filesystem destination.\n", " pusher = tfx.components.Pusher(\n", " model=trainer.outputs['model'],\n", " push_destination=tfx.proto.PushDestination(\n", " filesystem=tfx.proto.PushDestination.Filesystem(\n", " base_directory=serving_model_dir)))\n", "\n", " components = [\n", " example_gen,\n", " statistics_gen,\n", " schema_importer,\n", " example_validator,\n", "\n", " transform, # NEW: Transform component was added to the pipeline.\n", "\n", " trainer,\n", " pusher,\n", " ]\n", "\n", " return tfx.dsl.Pipeline(\n", " pipeline_name=pipeline_name,\n", " pipeline_root=pipeline_root,\n", " metadata_connection_config=tfx.orchestration.metadata\n", " .sqlite_metadata_connection_config(metadata_path),\n", " components=components)" ] }, { "cell_type": "markdown", "metadata": { "id": "mJbq07THU2GV" }, "source": [ "## Run the pipeline\n", "\n", "We will use `LocalDagRunner` as in the previous tutorial." 
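,
 "\n",
 "\n",
 "As a rough sketch, running the pipeline just hands the object built by `_create_pipeline` (with the variables defined earlier in this notebook) to the runner; the next code cell performs the actual call:\n",
 "\n",
 "```python\n",
 "# Sketch: run the pipeline defined above with the local orchestrator.\n",
 "tfx.orchestration.LocalDagRunner().run(\n",
 "    _create_pipeline(\n",
 "        pipeline_name=PIPELINE_NAME,\n",
 "        pipeline_root=PIPELINE_ROOT,\n",
 "        data_root=DATA_ROOT,\n",
 "        schema_path=SCHEMA_PATH,\n",
 "        module_file=_module_file,\n",
 "        serving_model_dir=SERVING_MODEL_DIR,\n",
 "        metadata_path=METADATA_PATH))\n",
 "```"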
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:14.403558Z", "iopub.status.busy": "2024-05-08T09:20:14.402930Z", "iopub.status.idle": "2024-05-08T09:20:50.085908Z", "shell.execute_reply": "2024-05-08T09:20:50.085145Z" }, "id": "fAtfOZTYWJu-" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Excluding no splits because exclude_splits is not set.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Excluding no splits because exclude_splits is not set.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Generating ephemeral wheel package for '/tmpfs/src/temp/docs/tutorials/tfx/penguin_utils.py' (including modules: ['penguin_utils']).\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:User module package has hash fingerprint version a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '/tmpfs/tmp/tmpkolg3eiy/_tfx_generated_setup.py', 'bdist_wheel', '--bdist-dir', '/tmpfs/tmp/tmp1yspllww', '--dist-dir', '/tmpfs/tmp/tmp0f9tyt4n']\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.\n", "!!\n", "\n", " ********************************************************************************\n", " Please avoid running ``setup.py`` directly.\n", " Instead, use pypa/build, pypa/installer or other\n", " standards-based tools.\n", "\n", " See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.\n", " ********************************************************************************\n", "\n", "!!\n", " self.initialize_options()\n", "INFO:absl:Successfully built user code wheel distribution at 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'; target user module is 'penguin_utils'.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Full user module path is 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Generating ephemeral wheel package for '/tmpfs/src/temp/docs/tutorials/tfx/penguin_utils.py' (including modules: ['penguin_utils']).\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:User module package has hash fingerprint version a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '/tmpfs/tmp/tmphxgs65wf/_tfx_generated_setup.py', 'bdist_wheel', '--bdist-dir', '/tmpfs/tmp/tmphmpuvvpe', '--dist-dir', '/tmpfs/tmp/tmpmyjp2nkq']\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "running bdist_wheel\n", "running build\n", "running build_py\n", "creating build\n", "creating build/lib\n", "copying penguin_utils.py -> build/lib\n", "installing to /tmpfs/tmp/tmp1yspllww\n", "running install\n", "running install_lib\n", "copying build/lib/penguin_utils.py -> /tmpfs/tmp/tmp1yspllww\n", "running install_egg_info\n", "running egg_info\n", "creating 
tfx_user_code_Transform.egg-info\n", "writing tfx_user_code_Transform.egg-info/PKG-INFO\n", "writing dependency_links to tfx_user_code_Transform.egg-info/dependency_links.txt\n", "writing top-level names to tfx_user_code_Transform.egg-info/top_level.txt\n", "writing manifest file 'tfx_user_code_Transform.egg-info/SOURCES.txt'\n", "reading manifest file 'tfx_user_code_Transform.egg-info/SOURCES.txt'\n", "writing manifest file 'tfx_user_code_Transform.egg-info/SOURCES.txt'\n", "Copying tfx_user_code_Transform.egg-info to /tmpfs/tmp/tmp1yspllww/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3.9.egg-info\n", "running install_scripts\n", "creating /tmpfs/tmp/tmp1yspllww/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/WHEEL\n", "creating '/tmpfs/tmp/tmp0f9tyt4n/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' and adding '/tmpfs/tmp/tmp1yspllww' to it\n", "adding 'penguin_utils.py'\n", "adding 'tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/METADATA'\n", "adding 'tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/WHEEL'\n", "adding 'tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/top_level.txt'\n", "adding 'tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/RECORD'\n", "removing /tmpfs/tmp/tmp1yspllww\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.\n", "!!\n", "\n", " ********************************************************************************\n", " Please avoid running ``setup.py`` directly.\n", " Instead, use pypa/build, pypa/installer or other\n", " standards-based tools.\n", "\n", " See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.\n", " ********************************************************************************\n", "\n", "!!\n", " self.initialize_options()\n", "INFO:absl:Successfully built user code wheel distribution at 'pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'; target user module is 'penguin_utils'.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Full user module path is 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Using deployment config:\n", " executor_specs {\n", " key: \"CsvExampleGen\"\n", " value {\n", " beam_executable_spec {\n", " python_executor_spec {\n", " class_path: \"tfx.components.example_gen.csv_example_gen.executor.Executor\"\n", " }\n", " }\n", " }\n", "}\n", "executor_specs {\n", " key: \"ExampleValidator\"\n", " value {\n", " python_class_executable_spec {\n", " class_path: \"tfx.components.example_validator.executor.Executor\"\n", " }\n", " }\n", "}\n", "executor_specs {\n", " key: \"Pusher\"\n", " value {\n", " python_class_executable_spec {\n", " class_path: \"tfx.components.pusher.executor.Executor\"\n", " }\n", " }\n", "}\n", "executor_specs {\n", " key: \"StatisticsGen\"\n", " 
value {\n", " beam_executable_spec {\n", " python_executor_spec {\n", " class_path: \"tfx.components.statistics_gen.executor.Executor\"\n", " }\n", " }\n", " }\n", "}\n", "executor_specs {\n", " key: \"Trainer\"\n", " value {\n", " python_class_executable_spec {\n", " class_path: \"tfx.components.trainer.executor.GenericExecutor\"\n", " }\n", " }\n", "}\n", "executor_specs {\n", " key: \"Transform\"\n", " value {\n", " beam_executable_spec {\n", " python_executor_spec {\n", " class_path: \"tfx.components.transform.executor.Executor\"\n", " }\n", " }\n", " }\n", "}\n", "custom_driver_specs {\n", " key: \"CsvExampleGen\"\n", " value {\n", " python_class_executable_spec {\n", " class_path: \"tfx.components.example_gen.driver.FileBasedDriver\"\n", " }\n", " }\n", "}\n", "metadata_connection_config {\n", " database_connection_config {\n", " sqlite {\n", " filename_uri: \"metadata/penguin-transform/metadata.db\"\n", " connection_mode: READWRITE_OPENCREATE\n", " }\n", " }\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Using connection config:\n", " sqlite {\n", " filename_uri: \"metadata/penguin-transform/metadata.db\"\n", " connection_mode: READWRITE_OPENCREATE\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component CsvExampleGen is running.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Running launcher for node_info {\n", " type {\n", " name: \"tfx.components.example_gen.csv_example_gen.component.CsvExampleGen\"\n", " }\n", " id: \"CsvExampleGen\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.CsvExampleGen\"\n", " }\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"examples\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Examples\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " properties {\n", " key: \"version\"\n", " value: INT\n", " }\n", " base_type: DATASET\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"input_base\"\n", " value {\n", " field_value {\n", " string_value: \"/tmpfs/tmp/tfx-data244l5nap\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"input_config\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"splits\\\": [\\n {\\n \\\"name\\\": \\\"single_split\\\",\\n \\\"pattern\\\": \\\"*\\\"\\n }\\n ]\\n}\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"output_config\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"split_config\\\": {\\n \\\"splits\\\": [\\n {\\n \\\"hash_buckets\\\": 2,\\n \\\"name\\\": \\\"train\\\"\\n },\\n {\\n \\\"hash_buckets\\\": 1,\\n \\\"name\\\": \\\"eval\\\"\\n }\\n ]\\n }\\n}\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"output_data_format\"\n", " value {\n", " field_value {\n", " int_value: 6\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"output_file_format\"\n", " value {\n", " field_value {\n", " int_value: 5\n", " }\n", " }\n", " }\n", "}\n", "downstream_nodes: 
\"StatisticsGen\"\n", "downstream_nodes: \"Trainer\"\n", "downstream_nodes: \"Transform\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "running bdist_wheel\n", "running build\n", "running build_py\n", "creating build\n", "creating build/lib\n", "copying penguin_utils.py -> build/lib\n", "installing to /tmpfs/tmp/tmphmpuvvpe\n", "running install\n", "running install_lib\n", "copying build/lib/penguin_utils.py -> /tmpfs/tmp/tmphmpuvvpe\n", "running install_egg_info\n", "running egg_info\n", "creating tfx_user_code_Trainer.egg-info\n", "writing tfx_user_code_Trainer.egg-info/PKG-INFO\n", "writing dependency_links to tfx_user_code_Trainer.egg-info/dependency_links.txt\n", "writing top-level names to tfx_user_code_Trainer.egg-info/top_level.txt\n", "writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'\n", "reading manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'\n", "writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'\n", "Copying tfx_user_code_Trainer.egg-info to /tmpfs/tmp/tmphmpuvvpe/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3.9.egg-info\n", "running install_scripts\n", "creating /tmpfs/tmp/tmphmpuvvpe/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/WHEEL\n", "creating '/tmpfs/tmp/tmpmyjp2nkq/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' and adding '/tmpfs/tmp/tmphmpuvvpe' to it\n", "adding 'penguin_utils.py'\n", "adding 'tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/METADATA'\n", "adding 'tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/WHEEL'\n", "adding 'tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/top_level.txt'\n", "adding 'tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/RECORD'\n", "removing /tmpfs/tmp/tmphmpuvvpe\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:[CsvExampleGen] Resolved inputs: ({},)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:select span and version = (0, None)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:latest span and version = (0, None)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution 1\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=1, input_dict={}, output_dict=defaultdict(, {'examples': [Artifact(artifact: uri: \"pipelines/penguin-transform/CsvExampleGen/examples/1\"\n", "custom_properties {\n", " key: \"input_fingerprint\"\n", " value {\n", " string_value: \"split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", ", artifact_type: name: \"Examples\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", 
" value: STRING\n", "}\n", "properties {\n", " key: \"version\"\n", " value: INT\n", "}\n", "base_type: DATASET\n", ")]}), exec_properties={'input_base': '/tmpfs/tmp/tfx-data244l5nap', 'output_data_format': 6, 'input_config': '{\\n \"splits\": [\\n {\\n \"name\": \"single_split\",\\n \"pattern\": \"*\"\\n }\\n ]\\n}', 'output_config': '{\\n \"split_config\": {\\n \"splits\": [\\n {\\n \"hash_buckets\": 2,\\n \"name\": \"train\"\\n },\\n {\\n \"hash_buckets\": 1,\\n \"name\": \"eval\"\\n }\\n ]\\n }\\n}', 'output_file_format': 5, 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013'}, execution_output_uri='pipelines/penguin-transform/CsvExampleGen/.system/executor_execution/1/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/CsvExampleGen/.system/stateful_working_dir/2a68db53-a342-4b70-b5ef-4bea08945c28', tmp_dir='pipelines/penguin-transform/CsvExampleGen/.system/executor_execution/1/.temp/', pipeline_node=node_info {\n", " type {\n", " name: \"tfx.components.example_gen.csv_example_gen.component.CsvExampleGen\"\n", " }\n", " id: \"CsvExampleGen\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.CsvExampleGen\"\n", " }\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"examples\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Examples\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " properties {\n", " key: \"version\"\n", " value: INT\n", " }\n", " base_type: DATASET\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"input_base\"\n", " value {\n", " field_value {\n", " string_value: \"/tmpfs/tmp/tfx-data244l5nap\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"input_config\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"splits\\\": [\\n {\\n \\\"name\\\": \\\"single_split\\\",\\n \\\"pattern\\\": \\\"*\\\"\\n }\\n ]\\n}\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"output_config\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"split_config\\\": {\\n \\\"splits\\\": [\\n {\\n \\\"hash_buckets\\\": 2,\\n \\\"name\\\": \\\"train\\\"\\n },\\n {\\n \\\"hash_buckets\\\": 1,\\n \\\"name\\\": \\\"eval\\\"\\n }\\n ]\\n }\\n}\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"output_data_format\"\n", " value {\n", " field_value {\n", " int_value: 6\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"output_file_format\"\n", " value {\n", " field_value {\n", " int_value: 5\n", " }\n", " }\n", " }\n", "}\n", "downstream_nodes: \"StatisticsGen\"\n", "downstream_nodes: \"Trainer\"\n", "downstream_nodes: \"Transform\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", ", pipeline_info=id: \"penguin-transform\"\n", ", pipeline_run_id='2024-05-08T09:20:15.209892', top_level_pipeline_run_id=None, frontend_url=None)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Generating examples.\n" ] 
}, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.\n" ] }, { "data": { "application/javascript": [ "\n", " if (typeof window.interactive_beam_jquery == 'undefined') {\n", " var jqueryScript = document.createElement('script');\n", " jqueryScript.src = 'https://code.jquery.com/jquery-3.4.1.slim.min.js';\n", " jqueryScript.type = 'text/javascript';\n", " jqueryScript.onload = function() {\n", " var datatableScript = document.createElement('script');\n", " datatableScript.src = 'https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js';\n", " datatableScript.type = 'text/javascript';\n", " datatableScript.onload = function() {\n", " window.interactive_beam_jquery = jQuery.noConflict(true);\n", " window.interactive_beam_jquery(document).ready(function($){\n", " \n", " });\n", " }\n", " document.head.appendChild(datatableScript);\n", " };\n", " document.head.appendChild(jqueryScript);\n", " } else {\n", " window.interactive_beam_jquery(document).ready(function($){\n", " \n", " });\n", " }" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Processing input csv data /tmpfs/tmp/tfx-data244l5nap/* to TFExample.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Examples generated.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Value type of key version in exec_properties is not supported, going to drop it\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Value type of key _beam_pipeline_args in exec_properties is not supported, going to drop it\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateless execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Execution 1 succeeded.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateful execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Deleted stateful_working_dir pipelines/penguin-transform/CsvExampleGen/.system/stateful_working_dir/2a68db53-a342-4b70-b5ef-4bea08945c28\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Publishing output artifacts defaultdict(, {'examples': [Artifact(artifact: uri: \"pipelines/penguin-transform/CsvExampleGen/examples/1\"\n", "custom_properties {\n", " key: \"input_fingerprint\"\n", " value {\n", " string_value: \"split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", ", artifact_type: name: \"Examples\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "properties {\n", " key: \"version\"\n", " value: INT\n", "}\n", "base_type: DATASET\n", ")]}) for execution 1\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection 
initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component CsvExampleGen is finished.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component schema_importer is running.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Running launcher for node_info {\n", " type {\n", " name: \"tfx.dsl.components.common.importer.Importer\"\n", " }\n", " id: \"schema_importer\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.schema_importer\"\n", " }\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"result\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"artifact_uri\"\n", " value {\n", " field_value {\n", " string_value: \"schema\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"output_key\"\n", " value {\n", " field_value {\n", " string_value: \"result\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"reimport\"\n", " value {\n", " field_value {\n", " int_value: 0\n", " }\n", " }\n", " }\n", "}\n", "downstream_nodes: \"ExampleValidator\"\n", "downstream_nodes: \"Transform\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Running as an importer node.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Processing source uri: schema, properties: {}, custom_properties: {}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component schema_importer is finished.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component StatisticsGen is running.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Running launcher for node_info {\n", " type {\n", " name: \"tfx.components.statistics_gen.component.StatisticsGen\"\n", " base_type: PROCESS\n", " }\n", " id: \"StatisticsGen\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.StatisticsGen\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"examples\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"CsvExampleGen\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: 
\"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.CsvExampleGen\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Examples\"\n", " base_type: DATASET\n", " }\n", " }\n", " output_key: \"examples\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"statistics\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleStatistics\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " base_type: STATISTICS\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"exclude_splits\"\n", " value {\n", " field_value {\n", " string_value: \"[]\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"CsvExampleGen\"\n", "downstream_nodes: \"ExampleValidator\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:ArtifactQuery.property_predicate is not supported.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:[StatisticsGen] Resolved inputs: ({'examples': [Artifact(artifact: id: 1\n", "type_id: 15\n", "uri: \"pipelines/penguin-transform/CsvExampleGen/examples/1\"\n", "properties {\n", " key: \"split_names\"\n", " value {\n", " string_value: \"[\\\"train\\\", \\\"eval\\\"]\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"file_format\"\n", " value {\n", " string_value: \"tfrecords_gzip\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"input_fingerprint\"\n", " value {\n", " string_value: \"split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"payload_format\"\n", " value {\n", " string_value: \"FORMAT_TF_EXAMPLE\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Examples\"\n", "create_time_since_epoch: 1715160016351\n", "last_update_time_since_epoch: 1715160016351\n", ", artifact_type: id: 15\n", "name: \"Examples\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "properties {\n", " key: \"version\"\n", " value: INT\n", "}\n", "base_type: DATASET\n", ")]},)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution 3\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=3, input_dict={'examples': [Artifact(artifact: id: 1\n", "type_id: 15\n", "uri: \"pipelines/penguin-transform/CsvExampleGen/examples/1\"\n", "properties {\n", " key: \"split_names\"\n", " value {\n", " string_value: \"[\\\"train\\\", \\\"eval\\\"]\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"file_format\"\n", " value {\n", " 
string_value: \"tfrecords_gzip\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"input_fingerprint\"\n", " value {\n", " string_value: \"split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"payload_format\"\n", " value {\n", " string_value: \"FORMAT_TF_EXAMPLE\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Examples\"\n", "create_time_since_epoch: 1715160016351\n", "last_update_time_since_epoch: 1715160016351\n", ", artifact_type: id: 15\n", "name: \"Examples\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "properties {\n", " key: \"version\"\n", " value: INT\n", "}\n", "base_type: DATASET\n", ")]}, output_dict=defaultdict(, {'statistics': [Artifact(artifact: uri: \"pipelines/penguin-transform/StatisticsGen/statistics/3\"\n", ", artifact_type: name: \"ExampleStatistics\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "base_type: STATISTICS\n", ")]}), exec_properties={'exclude_splits': '[]'}, execution_output_uri='pipelines/penguin-transform/StatisticsGen/.system/executor_execution/3/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/StatisticsGen/.system/stateful_working_dir/87ad10ec-9fbd-4d39-bccf-4085439aebed', tmp_dir='pipelines/penguin-transform/StatisticsGen/.system/executor_execution/3/.temp/', pipeline_node=node_info {\n", " type {\n", " name: \"tfx.components.statistics_gen.component.StatisticsGen\"\n", " base_type: PROCESS\n", " }\n", " id: \"StatisticsGen\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.StatisticsGen\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"examples\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"CsvExampleGen\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.CsvExampleGen\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Examples\"\n", " base_type: DATASET\n", " }\n", " }\n", " output_key: \"examples\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"statistics\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleStatistics\"\n", " 
properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " base_type: STATISTICS\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"exclude_splits\"\n", " value {\n", " field_value {\n", " string_value: \"[]\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"CsvExampleGen\"\n", "downstream_nodes: \"ExampleValidator\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", ", pipeline_info=id: \"penguin-transform\"\n", ", pipeline_run_id='2024-05-08T09:20:15.209892', top_level_pipeline_run_id=None, frontend_url=None)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Generating statistics for split train.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Statistics for split train written to pipelines/penguin-transform/StatisticsGen/statistics/3/Split-train.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Generating statistics for split eval.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Statistics for split eval written to pipelines/penguin-transform/StatisticsGen/statistics/3/Split-eval.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateless execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Execution 3 succeeded.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateful execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Deleted stateful_working_dir pipelines/penguin-transform/StatisticsGen/.system/stateful_working_dir/87ad10ec-9fbd-4d39-bccf-4085439aebed\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Publishing output artifacts defaultdict(, {'statistics': [Artifact(artifact: uri: \"pipelines/penguin-transform/StatisticsGen/statistics/3\"\n", ", artifact_type: name: \"ExampleStatistics\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "base_type: STATISTICS\n", ")]}) for execution 3\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component StatisticsGen is finished.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component Transform is running.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Running launcher for node_info {\n", " type {\n", " name: \"tfx.components.transform.component.Transform\"\n", " base_type: TRANSFORM\n", " }\n", " id: \"Transform\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Transform\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"examples\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"CsvExampleGen\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", 
" name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.CsvExampleGen\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Examples\"\n", " base_type: DATASET\n", " }\n", " }\n", " output_key: \"examples\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", " inputs {\n", " key: \"schema\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"schema_importer\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.schema_importer\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " output_key: \"result\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"post_transform_anomalies\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleAnomalies\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"post_transform_schema\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"post_transform_stats\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleStatistics\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " base_type: STATISTICS\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"pre_transform_schema\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"pre_transform_stats\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleStatistics\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " base_type: STATISTICS\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"transform_graph\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"TransformGraph\"\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"updated_analyzer_cache\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"TransformCache\"\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"custom_config\"\n", " value {\n", " field_value {\n", " string_value: \"null\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"disable_statistics\"\n", " value {\n", " field_value {\n", " int_value: 0\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"force_tf_compat_v1\"\n", " value {\n", " field_value {\n", " int_value: 0\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"module_path\"\n", " 
value {\n", " field_value {\n", " string_value: \"penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"CsvExampleGen\"\n", "upstream_nodes: \"schema_importer\"\n", "downstream_nodes: \"Trainer\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:ArtifactQuery.property_predicate is not supported.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:ArtifactQuery.property_predicate is not supported.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:[Transform] Resolved inputs: ({'schema': [Artifact(artifact: id: 2\n", "type_id: 17\n", "uri: \"schema\"\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 1\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Schema\"\n", "create_time_since_epoch: 1715160016371\n", "last_update_time_since_epoch: 1715160016371\n", ", artifact_type: id: 17\n", "name: \"Schema\"\n", ")], 'examples': [Artifact(artifact: id: 1\n", "type_id: 15\n", "uri: \"pipelines/penguin-transform/CsvExampleGen/examples/1\"\n", "properties {\n", " key: \"split_names\"\n", " value {\n", " string_value: \"[\\\"train\\\", \\\"eval\\\"]\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"file_format\"\n", " value {\n", " string_value: \"tfrecords_gzip\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"input_fingerprint\"\n", " value {\n", " string_value: \"split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"payload_format\"\n", " value {\n", " string_value: \"FORMAT_TF_EXAMPLE\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Examples\"\n", "create_time_since_epoch: 1715160016351\n", "last_update_time_since_epoch: 1715160016351\n", ", artifact_type: id: 15\n", "name: \"Examples\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "properties {\n", " key: \"version\"\n", " value: INT\n", "}\n", "base_type: DATASET\n", ")]},)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution 4\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=4, input_dict={'schema': [Artifact(artifact: id: 2\n", "type_id: 17\n", "uri: \"schema\"\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 1\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Schema\"\n", "create_time_since_epoch: 
1715160016371\n", "last_update_time_since_epoch: 1715160016371\n", ", artifact_type: id: 17\n", "name: \"Schema\"\n", ")], 'examples': [Artifact(artifact: id: 1\n", "type_id: 15\n", "uri: \"pipelines/penguin-transform/CsvExampleGen/examples/1\"\n", "properties {\n", " key: \"split_names\"\n", " value {\n", " string_value: \"[\\\"train\\\", \\\"eval\\\"]\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"file_format\"\n", " value {\n", " string_value: \"tfrecords_gzip\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"input_fingerprint\"\n", " value {\n", " string_value: \"split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"payload_format\"\n", " value {\n", " string_value: \"FORMAT_TF_EXAMPLE\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Examples\"\n", "create_time_since_epoch: 1715160016351\n", "last_update_time_since_epoch: 1715160016351\n", ", artifact_type: id: 15\n", "name: \"Examples\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "properties {\n", " key: \"version\"\n", " value: INT\n", "}\n", "base_type: DATASET\n", ")]}, output_dict=defaultdict(, {'post_transform_schema': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/post_transform_schema/4\"\n", ", artifact_type: name: \"Schema\"\n", ")], 'updated_analyzer_cache': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/updated_analyzer_cache/4\"\n", ", artifact_type: name: \"TransformCache\"\n", ")], 'pre_transform_stats': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/pre_transform_stats/4\"\n", ", artifact_type: name: \"ExampleStatistics\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "base_type: STATISTICS\n", ")], 'transform_graph': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/transform_graph/4\"\n", ", artifact_type: name: \"TransformGraph\"\n", ")], 'post_transform_stats': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/post_transform_stats/4\"\n", ", artifact_type: name: \"ExampleStatistics\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "base_type: STATISTICS\n", ")], 'post_transform_anomalies': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/post_transform_anomalies/4\"\n", ", artifact_type: name: \"ExampleAnomalies\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", ")], 'pre_transform_schema': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/pre_transform_schema/4\"\n", ", artifact_type: name: \"Schema\"\n", ")]}), exec_properties={'force_tf_compat_v1': 0, 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'disable_statistics': 0, 'custom_config': 'null'}, 
execution_output_uri='pipelines/penguin-transform/Transform/.system/executor_execution/4/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Transform/.system/stateful_working_dir/353ff412-c22b-442d-86fa-d000785c15c6', tmp_dir='pipelines/penguin-transform/Transform/.system/executor_execution/4/.temp/', pipeline_node=node_info {\n", " type {\n", " name: \"tfx.components.transform.component.Transform\"\n", " base_type: TRANSFORM\n", " }\n", " id: \"Transform\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Transform\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"examples\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"CsvExampleGen\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.CsvExampleGen\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Examples\"\n", " base_type: DATASET\n", " }\n", " }\n", " output_key: \"examples\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", " inputs {\n", " key: \"schema\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"schema_importer\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.schema_importer\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " output_key: \"result\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"post_transform_anomalies\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleAnomalies\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"post_transform_schema\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"post_transform_stats\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleStatistics\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " base_type: STATISTICS\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: 
\"pre_transform_schema\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"pre_transform_stats\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleStatistics\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " base_type: STATISTICS\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"transform_graph\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"TransformGraph\"\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"updated_analyzer_cache\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"TransformCache\"\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"custom_config\"\n", " value {\n", " field_value {\n", " string_value: \"null\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"disable_statistics\"\n", " value {\n", " field_value {\n", " int_value: 0\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"force_tf_compat_v1\"\n", " value {\n", " field_value {\n", " int_value: 0\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"module_path\"\n", " value {\n", " field_value {\n", " string_value: \"penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"CsvExampleGen\"\n", "upstream_nodes: \"schema_importer\"\n", "downstream_nodes: \"Trainer\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", ", pipeline_info=id: \"penguin-transform\"\n", ", pipeline_run_id='2024-05-08T09:20:15.209892', top_level_pipeline_run_id=None, frontend_url=None)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Analyze the 'train' split and transform all splits when splits_config is not set.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:udf_utils.get_fn {'module_file': None, 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'preprocessing_fn': None} 'preprocessing_fn'\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Installing 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' to a temporary directory.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmpfs/tmp/tmp0mjyiork', 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl']\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing ./pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Successfully installed 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:udf_utils.get_fn {'module_file': None, 'module_path': 
'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'stats_options_updater_fn': None} 'stats_options_updater_fn'\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Installing 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' to a temporary directory.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmpfs/tmp/tmp_hkp2fs0', 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl']\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Installing collected packages: tfx-user-code-Transform\n", "Successfully installed tfx-user-code-Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing ./pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Successfully installed 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Installing 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' to a temporary directory.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmpfs/tmp/tmprvrky6ts', 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl']\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Installing collected packages: tfx-user-code-Transform\n", "Successfully installed tfx-user-code-Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing ./pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Successfully installed 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". 
Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Installing collected packages: tfx-user-code-Transform\n", "Successfully installed tfx-user-code-Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". 
Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". 
Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:Tables initialized inside a tf.function will be re-initialized on every invocation of the function. This re-initialization can have significant impact on performance. Consider lifting them out of the graph context using `tf.init_scope`.: key_value_init/LookupTableImportV2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:Tables initialized inside a tf.function will be re-initialized on every invocation of the function. This re-initialization can have significant impact on performance. Consider lifting them out of the graph context using `tf.init_scope`.: key_value_init/LookupTableImportV2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:Tables initialized inside a tf.function will be re-initialized on every invocation of the function. This re-initialization can have significant impact on performance. Consider lifting them out of the graph context using `tf.init_scope`.: key_value_init/LookupTableImportV2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:Tables initialized inside a tf.function will be re-initialized on every invocation of the function. This re-initialization can have significant impact on performance. Consider lifting them out of the graph context using `tf.init_scope`.: key_value_init/LookupTableImportV2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". 
Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/c08ebbeca8814f288e2437f43b1224e5/assets\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/c08ebbeca8814f288e2437f43b1224e5/assets\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Writing fingerprint to pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/c08ebbeca8814f288e2437f43b1224e5/fingerprint.pb\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:Tables initialized inside a tf.function will be re-initialized on every invocation of the function. This re-initialization can have significant impact on performance. 
Consider lifting them out of the graph context using `tf.init_scope`.: key_value_init/LookupTableImportV2\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:struct2tensor is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:struct2tensor is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_decision_forests is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_decision_forests is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_text is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_text is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/922454c8ed954a439ec7388ab1479458/assets\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/922454c8ed954a439ec7388ab1479458/assets\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Writing fingerprint to pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/922454c8ed954a439ec7388ab1479458/fingerprint.pb\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:Tables initialized inside a tf.function will be re-initialized on every invocation of the function. This re-initialization can have significant impact on performance. Consider lifting them out of the graph context using `tf.init_scope`.: key_value_init/LookupTableImportV2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". 
Setting to DenseTensor.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:struct2tensor is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:struct2tensor is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_decision_forests is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_decision_forests is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_text is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_text is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:struct2tensor is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:struct2tensor is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_decision_forests is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_decision_forests is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_text is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_text is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateless execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Execution 4 succeeded.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateful execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Deleted stateful_working_dir pipelines/penguin-transform/Transform/.system/stateful_working_dir/353ff412-c22b-442d-86fa-d000785c15c6\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Publishing output artifacts defaultdict(, {'post_transform_schema': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/post_transform_schema/4\"\n", ", artifact_type: name: \"Schema\"\n", ")], 'updated_analyzer_cache': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/updated_analyzer_cache/4\"\n", ", artifact_type: name: \"TransformCache\"\n", ")], 'pre_transform_stats': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/pre_transform_stats/4\"\n", ", artifact_type: name: \"ExampleStatistics\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "base_type: STATISTICS\n", ")], 'transform_graph': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/transform_graph/4\"\n", ", artifact_type: name: \"TransformGraph\"\n", ")], 'post_transform_stats': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/post_transform_stats/4\"\n", ", artifact_type: name: \"ExampleStatistics\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "base_type: STATISTICS\n", ")], 'post_transform_anomalies': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/post_transform_anomalies/4\"\n", ", artifact_type: name: \"ExampleAnomalies\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", ")], 
'pre_transform_schema': [Artifact(artifact: uri: \"pipelines/penguin-transform/Transform/pre_transform_schema/4\"\n", ", artifact_type: name: \"Schema\"\n", ")]}) for execution 4\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component Transform is finished.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component ExampleValidator is running.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Running launcher for node_info {\n", " type {\n", " name: \"tfx.components.example_validator.component.ExampleValidator\"\n", " }\n", " id: \"ExampleValidator\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.ExampleValidator\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"schema\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"schema_importer\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.schema_importer\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " output_key: \"result\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", " inputs {\n", " key: \"statistics\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"StatisticsGen\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.StatisticsGen\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"ExampleStatistics\"\n", " base_type: STATISTICS\n", " }\n", " }\n", " output_key: \"statistics\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"anomalies\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleAnomalies\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"exclude_splits\"\n", " value {\n", " field_value {\n", " string_value: \"[]\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"StatisticsGen\"\n", "upstream_nodes: \"schema_importer\"\n", 
"execution_options {\n", " caching_options {\n", " }\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:ArtifactQuery.property_predicate is not supported.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:ArtifactQuery.property_predicate is not supported.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:[ExampleValidator] Resolved inputs: ({'statistics': [Artifact(artifact: id: 3\n", "type_id: 19\n", "uri: \"pipelines/penguin-transform/StatisticsGen/statistics/3\"\n", "properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value {\n", " string_value: \"[\\\"train\\\", \\\"eval\\\"]\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"stats_dashboard_link\"\n", " value {\n", " string_value: \"\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"ExampleStatistics\"\n", "create_time_since_epoch: 1715160019591\n", "last_update_time_since_epoch: 1715160019591\n", ", artifact_type: id: 19\n", "name: \"ExampleStatistics\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "base_type: STATISTICS\n", ")], 'schema': [Artifact(artifact: id: 2\n", "type_id: 17\n", "uri: \"schema\"\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 1\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Schema\"\n", "create_time_since_epoch: 1715160016371\n", "last_update_time_since_epoch: 1715160016371\n", ", artifact_type: id: 17\n", "name: \"Schema\"\n", ")]},)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution 5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=5, input_dict={'statistics': [Artifact(artifact: id: 3\n", "type_id: 19\n", "uri: \"pipelines/penguin-transform/StatisticsGen/statistics/3\"\n", "properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value {\n", " string_value: \"[\\\"train\\\", \\\"eval\\\"]\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"stats_dashboard_link\"\n", " value {\n", " string_value: \"\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"ExampleStatistics\"\n", "create_time_since_epoch: 1715160019591\n", "last_update_time_since_epoch: 1715160019591\n", ", artifact_type: id: 19\n", "name: \"ExampleStatistics\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "base_type: STATISTICS\n", ")], 'schema': [Artifact(artifact: id: 2\n", "type_id: 17\n", "uri: 
\"schema\"\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 1\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Schema\"\n", "create_time_since_epoch: 1715160016371\n", "last_update_time_since_epoch: 1715160016371\n", ", artifact_type: id: 17\n", "name: \"Schema\"\n", ")]}, output_dict=defaultdict(, {'anomalies': [Artifact(artifact: uri: \"pipelines/penguin-transform/ExampleValidator/anomalies/5\"\n", ", artifact_type: name: \"ExampleAnomalies\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", ")]}), exec_properties={'exclude_splits': '[]'}, execution_output_uri='pipelines/penguin-transform/ExampleValidator/.system/executor_execution/5/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/ExampleValidator/.system/stateful_working_dir/5c7a9496-8531-48e8-9884-9db5c9dd4bb2', tmp_dir='pipelines/penguin-transform/ExampleValidator/.system/executor_execution/5/.temp/', pipeline_node=node_info {\n", " type {\n", " name: \"tfx.components.example_validator.component.ExampleValidator\"\n", " }\n", " id: \"ExampleValidator\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.ExampleValidator\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"schema\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"schema_importer\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.schema_importer\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Schema\"\n", " }\n", " }\n", " output_key: \"result\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", " inputs {\n", " key: \"statistics\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"StatisticsGen\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.StatisticsGen\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"ExampleStatistics\"\n", " base_type: STATISTICS\n", " }\n", " }\n", " output_key: \"statistics\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: 
\"anomalies\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ExampleAnomalies\"\n", " properties {\n", " key: \"span\"\n", " value: INT\n", " }\n", " properties {\n", " key: \"split_names\"\n", " value: STRING\n", " }\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"exclude_splits\"\n", " value {\n", " field_value {\n", " string_value: \"[]\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"StatisticsGen\"\n", "upstream_nodes: \"schema_importer\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", ", pipeline_info=id: \"penguin-transform\"\n", ", pipeline_run_id='2024-05-08T09:20:15.209892', top_level_pipeline_run_id=None, frontend_url=None)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Validating schema against the computed statistics for split train.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Anomalies alerts created for split train.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Validation complete for split train. Anomalies written to pipelines/penguin-transform/ExampleValidator/anomalies/5/Split-train.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Validating schema against the computed statistics for split eval.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Anomalies alerts created for split eval.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Validation complete for split eval. Anomalies written to pipelines/penguin-transform/ExampleValidator/anomalies/5/Split-eval.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateless execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Execution 5 succeeded.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateful execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Deleted stateful_working_dir pipelines/penguin-transform/ExampleValidator/.system/stateful_working_dir/5c7a9496-8531-48e8-9884-9db5c9dd4bb2\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Publishing output artifacts defaultdict(, {'anomalies': [Artifact(artifact: uri: \"pipelines/penguin-transform/ExampleValidator/anomalies/5\"\n", ", artifact_type: name: \"ExampleAnomalies\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", ")]}) for execution 5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component ExampleValidator is finished.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component Trainer is running.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Running launcher for node_info {\n", " type {\n", " name: \"tfx.components.trainer.component.Trainer\"\n", " base_type: TRAIN\n", " }\n", " id: \"Trainer\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type 
{\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Trainer\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"examples\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"CsvExampleGen\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.CsvExampleGen\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Examples\"\n", " base_type: DATASET\n", " }\n", " }\n", " output_key: \"examples\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", " inputs {\n", " key: \"transform_graph\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"Transform\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Transform\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"TransformGraph\"\n", " }\n", " }\n", " output_key: \"transform_graph\"\n", " }\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"model\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Model\"\n", " base_type: MODEL\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"model_run\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ModelRun\"\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"custom_config\"\n", " value {\n", " field_value {\n", " string_value: \"null\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"eval_args\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"num_steps\\\": 5\\n}\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"module_path\"\n", " value {\n", " field_value {\n", " string_value: \"penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"train_args\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"num_steps\\\": 100\\n}\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"CsvExampleGen\"\n", "upstream_nodes: \"Transform\"\n", "downstream_nodes: \"Pusher\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:ArtifactQuery.property_predicate is not supported.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:ArtifactQuery.property_predicate is not supported.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:[Trainer] Resolved 
inputs: ({'examples': [Artifact(artifact: id: 1\n", "type_id: 15\n", "uri: \"pipelines/penguin-transform/CsvExampleGen/examples/1\"\n", "properties {\n", " key: \"split_names\"\n", " value {\n", " string_value: \"[\\\"train\\\", \\\"eval\\\"]\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"file_format\"\n", " value {\n", " string_value: \"tfrecords_gzip\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"input_fingerprint\"\n", " value {\n", " string_value: \"split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"payload_format\"\n", " value {\n", " string_value: \"FORMAT_TF_EXAMPLE\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Examples\"\n", "create_time_since_epoch: 1715160016351\n", "last_update_time_since_epoch: 1715160016351\n", ", artifact_type: id: 15\n", "name: \"Examples\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "properties {\n", " key: \"version\"\n", " value: INT\n", "}\n", "base_type: DATASET\n", ")], 'transform_graph': [Artifact(artifact: id: 7\n", "type_id: 22\n", "uri: \"pipelines/penguin-transform/Transform/transform_graph/4\"\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"TransformGraph\"\n", "create_time_since_epoch: 1715160039921\n", "last_update_time_since_epoch: 1715160039921\n", ", artifact_type: id: 22\n", "name: \"TransformGraph\"\n", ")]},)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution 6\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=6, input_dict={'examples': [Artifact(artifact: id: 1\n", "type_id: 15\n", "uri: \"pipelines/penguin-transform/CsvExampleGen/examples/1\"\n", "properties {\n", " key: \"split_names\"\n", " value {\n", " string_value: \"[\\\"train\\\", \\\"eval\\\"]\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"file_format\"\n", " value {\n", " string_value: \"tfrecords_gzip\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"input_fingerprint\"\n", " value {\n", " string_value: \"split:single_split,num_files:1,total_bytes:13161,xor_checksum:1715160013,sum_checksum:1715160013\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"payload_format\"\n", " value {\n", " string_value: \"FORMAT_TF_EXAMPLE\"\n", " }\n", "}\n", "custom_properties {\n", " key: \"span\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Examples\"\n", "create_time_since_epoch: 1715160016351\n", "last_update_time_since_epoch: 1715160016351\n", ", artifact_type: id: 15\n", "name: 
\"Examples\"\n", "properties {\n", " key: \"span\"\n", " value: INT\n", "}\n", "properties {\n", " key: \"split_names\"\n", " value: STRING\n", "}\n", "properties {\n", " key: \"version\"\n", " value: INT\n", "}\n", "base_type: DATASET\n", ")], 'transform_graph': [Artifact(artifact: id: 7\n", "type_id: 22\n", "uri: \"pipelines/penguin-transform/Transform/transform_graph/4\"\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"TransformGraph\"\n", "create_time_since_epoch: 1715160039921\n", "last_update_time_since_epoch: 1715160039921\n", ", artifact_type: id: 22\n", "name: \"TransformGraph\"\n", ")]}, output_dict=defaultdict(, {'model': [Artifact(artifact: uri: \"pipelines/penguin-transform/Trainer/model/6\"\n", ", artifact_type: name: \"Model\"\n", "base_type: MODEL\n", ")], 'model_run': [Artifact(artifact: uri: \"pipelines/penguin-transform/Trainer/model_run/6\"\n", ", artifact_type: name: \"ModelRun\"\n", ")]}), exec_properties={'train_args': '{\\n \"num_steps\": 100\\n}', 'custom_config': 'null', 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'eval_args': '{\\n \"num_steps\": 5\\n}'}, execution_output_uri='pipelines/penguin-transform/Trainer/.system/executor_execution/6/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Trainer/.system/stateful_working_dir/04611601-4dbf-412d-8c6c-4e4864bb335a', tmp_dir='pipelines/penguin-transform/Trainer/.system/executor_execution/6/.temp/', pipeline_node=node_info {\n", " type {\n", " name: \"tfx.components.trainer.component.Trainer\"\n", " base_type: TRAIN\n", " }\n", " id: \"Trainer\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Trainer\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"examples\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"CsvExampleGen\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.CsvExampleGen\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Examples\"\n", " base_type: DATASET\n", " }\n", " }\n", " output_key: \"examples\"\n", " }\n", " min_count: 1\n", " }\n", " }\n", " inputs {\n", " key: \"transform_graph\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"Transform\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: 
\"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Transform\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"TransformGraph\"\n", " }\n", " }\n", " output_key: \"transform_graph\"\n", " }\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"model\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"Model\"\n", " base_type: MODEL\n", " }\n", " }\n", " }\n", " }\n", " outputs {\n", " key: \"model_run\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"ModelRun\"\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"custom_config\"\n", " value {\n", " field_value {\n", " string_value: \"null\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"eval_args\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"num_steps\\\": 5\\n}\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"module_path\"\n", " value {\n", " field_value {\n", " string_value: \"penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"train_args\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"num_steps\\\": 100\\n}\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"CsvExampleGen\"\n", "upstream_nodes: \"Transform\"\n", "downstream_nodes: \"Pusher\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", ", pipeline_info=id: \"penguin-transform\"\n", ", pipeline_run_id='2024-05-08T09:20:15.209892', top_level_pipeline_run_id=None, frontend_url=None)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Train on the 'train' split when train_args.splits is not set.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Evaluate on the 'eval' split when eval_args.splits is not set.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:udf_utils.get_fn {'train_args': '{\\n \"num_steps\": 100\\n}', 'custom_config': 'null', 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'eval_args': '{\\n \"num_steps\": 5\\n}'} 'run_fn'\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Installing 'pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' to a temporary directory.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmpfs/tmp/tmpwu08fhdc', 'pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl']\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing ./pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Successfully installed 
'pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Training model.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Installing collected packages: tfx-user-code-Trainer\n", "Successfully installed tfx-user-code-Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx_bsl/tfxio/tf_example_record.py:343: parse_example_dataset (from tensorflow.python.data.experimental.ops.parsing_ops) is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "Use `tf.data.Dataset.map(tf.io.parse_example(...))` instead.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tfx_bsl/tfxio/tf_example_record.py:343: parse_example_dataset (from tensorflow.python.data.experimental.ops.parsing_ops) is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "Use `tf.data.Dataset.map(tf.io.parse_example(...))` instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:struct2tensor is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:struct2tensor is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_decision_forests is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_decision_forests is not available.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_text is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:tensorflow_text is not available.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". 
Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Model: \"model\"\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:__________________________________________________________________________________________________\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: Layer (type) Output Shape Param # Connected to \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:==================================================================================================\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: culmen_length_mm (InputLay [(None, 1)] 0 [] \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: er) \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: culmen_depth_mm (InputLaye [(None, 1)] 0 [] \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: r) \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: flipper_length_mm (InputLa [(None, 1)] 0 [] \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: yer) \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: body_mass_g (InputLayer) [(None, 1)] 0 [] \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: concatenate (Concatenate) (None, 4) 0 ['culmen_length_mm[0][0]', \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: 'culmen_depth_mm[0][0]', \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: 'flipper_length_mm[0][0]', \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: 'body_mass_g[0][0]'] \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: dense (Dense) (None, 8) 40 ['concatenate[0][0]'] \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: dense_1 (Dense) (None, 8) 72 ['dense[0][0]'] \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: dense_2 (Dense) (None, 3) 27 ['dense_1[0][0]'] \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl: \n" ] }, { "name": 
"stderr", "output_type": "stream", "text": [ "INFO:absl:==================================================================================================\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Total params: 139 (556.00 Byte)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Trainable params: 139 (556.00 Byte)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Non-trainable params: 0 (0.00 Byte)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:__________________________________________________________________________________________________\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING: All log messages before absl::InitializeLog() is called are written to STDERR\n", "I0000 00:00:1715160045.949607 17085 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 1/100 [..............................] - ETA: 2:22 - loss: 1.0291 - sparse_categorical_accuracy: 0.6500" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 18/100 [====>.........................] - ETA: 0s - loss: 0.8150 - sparse_categorical_accuracy: 0.7306 " ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 37/100 [==========>...................] - ETA: 0s - loss: 0.5600 - sparse_categorical_accuracy: 0.8189" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 56/100 [===============>..............] - ETA: 0s - loss: 0.4027 - sparse_categorical_accuracy: 0.8741" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 75/100 [=====================>........] - ETA: 0s - loss: 0.3119 - sparse_categorical_accuracy: 0.9027" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", " 94/100 [===========================>..] 
- ETA: 0s - loss: 0.2559 - sparse_categorical_accuracy: 0.9207" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "100/100 [==============================] - 2s 6ms/step - loss: 0.2421 - sparse_categorical_accuracy: 0.9250 - val_loss: 0.0074 - val_sparse_categorical_accuracy: 1.0000\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature body_mass_g has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_depth_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature culmen_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature flipper_length_mm has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature island has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature sex has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Feature species has a shape dim {\n", " size: 1\n", "}\n", ". Setting to DenseTensor.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Function `serve_tf_examples_fn` contains input name(s) 5332, resource with unsupported characters which will be renamed to transform_features_layer_5332, model_dense_2_biasadd_readvariableop_resource in the SavedModel.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: pipelines/penguin-transform/Trainer/model/6/Format-Serving/assets\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:tensorflow:Assets written to: pipelines/penguin-transform/Trainer/model/6/Format-Serving/assets\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Writing fingerprint to pipelines/penguin-transform/Trainer/model/6/Format-Serving/fingerprint.pb\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Training complete. Model written to pipelines/penguin-transform/Trainer/model/6/Format-Serving. 
ModelRun written to pipelines/penguin-transform/Trainer/model_run/6\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateless execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Execution 6 succeeded.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateful execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Deleted stateful_working_dir pipelines/penguin-transform/Trainer/.system/stateful_working_dir/04611601-4dbf-412d-8c6c-4e4864bb335a\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Publishing output artifacts defaultdict(, {'model': [Artifact(artifact: uri: \"pipelines/penguin-transform/Trainer/model/6\"\n", ", artifact_type: name: \"Model\"\n", "base_type: MODEL\n", ")], 'model_run': [Artifact(artifact: uri: \"pipelines/penguin-transform/Trainer/model_run/6\"\n", ", artifact_type: name: \"ModelRun\"\n", ")]}) for execution 6\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component Trainer is finished.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component Pusher is running.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Running launcher for node_info {\n", " type {\n", " name: \"tfx.components.pusher.component.Pusher\"\n", " base_type: DEPLOY\n", " }\n", " id: \"Pusher\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Pusher\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"model\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"Trainer\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Trainer\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Model\"\n", " base_type: MODEL\n", " }\n", " }\n", " output_key: \"model\"\n", " }\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"pushed_model\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"PushedModel\"\n", " base_type: MODEL\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"custom_config\"\n", " value {\n", " field_value {\n", " string_value: \"null\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"push_destination\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"filesystem\\\": {\\n \\\"base_directory\\\": \\\"serving_model/penguin-transform\\\"\\n }\\n}\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"Trainer\"\n", 
"execution_options {\n", " caching_options {\n", " }\n", "}\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:ArtifactQuery.property_predicate is not supported.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:[Pusher] Resolved inputs: ({'model': [Artifact(artifact: id: 12\n", "type_id: 26\n", "uri: \"pipelines/penguin-transform/Trainer/model/6\"\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Model\"\n", "create_time_since_epoch: 1715160050020\n", "last_update_time_since_epoch: 1715160050020\n", ", artifact_type: id: 26\n", "name: \"Model\"\n", "base_type: MODEL\n", ")]},)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution 7\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=7, input_dict={'model': [Artifact(artifact: id: 12\n", "type_id: 26\n", "uri: \"pipelines/penguin-transform/Trainer/model/6\"\n", "custom_properties {\n", " key: \"is_external\"\n", " value {\n", " int_value: 0\n", " }\n", "}\n", "custom_properties {\n", " key: \"tfx_version\"\n", " value {\n", " string_value: \"1.15.0\"\n", " }\n", "}\n", "state: LIVE\n", "type: \"Model\"\n", "create_time_since_epoch: 1715160050020\n", "last_update_time_since_epoch: 1715160050020\n", ", artifact_type: id: 26\n", "name: \"Model\"\n", "base_type: MODEL\n", ")]}, output_dict=defaultdict(, {'pushed_model': [Artifact(artifact: uri: \"pipelines/penguin-transform/Pusher/pushed_model/7\"\n", ", artifact_type: name: \"PushedModel\"\n", "base_type: MODEL\n", ")]}), exec_properties={'push_destination': '{\\n \"filesystem\": {\\n \"base_directory\": \"serving_model/penguin-transform\"\\n }\\n}', 'custom_config': 'null'}, execution_output_uri='pipelines/penguin-transform/Pusher/.system/executor_execution/7/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Pusher/.system/stateful_working_dir/b2701c8d-b1b7-4c37-a72d-ac97a07c8706', tmp_dir='pipelines/penguin-transform/Pusher/.system/executor_execution/7/.temp/', pipeline_node=node_info {\n", " type {\n", " name: \"tfx.components.pusher.component.Pusher\"\n", " base_type: DEPLOY\n", " }\n", " id: \"Pusher\"\n", "}\n", "contexts {\n", " contexts {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " contexts {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Pusher\"\n", " }\n", " }\n", " }\n", "}\n", "inputs {\n", " inputs {\n", " key: \"model\"\n", " value {\n", " channels {\n", " producer_node_query {\n", " id: \"Trainer\"\n", " }\n", " context_queries {\n", " type {\n", " name: \"pipeline\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type 
{\n", " name: \"pipeline_run\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"2024-05-08T09:20:15.209892\"\n", " }\n", " }\n", " }\n", " context_queries {\n", " type {\n", " name: \"node\"\n", " }\n", " name {\n", " field_value {\n", " string_value: \"penguin-transform.Trainer\"\n", " }\n", " }\n", " }\n", " artifact_query {\n", " type {\n", " name: \"Model\"\n", " base_type: MODEL\n", " }\n", " }\n", " output_key: \"model\"\n", " }\n", " }\n", " }\n", "}\n", "outputs {\n", " outputs {\n", " key: \"pushed_model\"\n", " value {\n", " artifact_spec {\n", " type {\n", " name: \"PushedModel\"\n", " base_type: MODEL\n", " }\n", " }\n", " }\n", " }\n", "}\n", "parameters {\n", " parameters {\n", " key: \"custom_config\"\n", " value {\n", " field_value {\n", " string_value: \"null\"\n", " }\n", " }\n", " }\n", " parameters {\n", " key: \"push_destination\"\n", " value {\n", " field_value {\n", " string_value: \"{\\n \\\"filesystem\\\": {\\n \\\"base_directory\\\": \\\"serving_model/penguin-transform\\\"\\n }\\n}\"\n", " }\n", " }\n", " }\n", "}\n", "upstream_nodes: \"Trainer\"\n", "execution_options {\n", " caching_options {\n", " }\n", "}\n", ", pipeline_info=id: \"penguin-transform\"\n", ", pipeline_run_id='2024-05-08T09:20:15.209892', top_level_pipeline_run_id=None, frontend_url=None)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "WARNING:absl:Pusher is going to push the model without validation. Consider using Evaluator or InfraValidator in your pipeline.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Model version: 1715160050\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Model written to serving path serving_model/penguin-transform/1715160050.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Model pushed to pipelines/penguin-transform/Pusher/pushed_model/7.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateless execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Execution 7 succeeded.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Cleaning up stateful execution info.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Deleted stateful_working_dir pipelines/penguin-transform/Pusher/.system/stateful_working_dir/b2701c8d-b1b7-4c37-a72d-ac97a07c8706\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Publishing output artifacts defaultdict(, {'pushed_model': [Artifact(artifact: uri: \"pipelines/penguin-transform/Pusher/pushed_model/7\"\n", ", artifact_type: name: \"PushedModel\"\n", "base_type: MODEL\n", ")]}) for execution 7\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:MetadataStore with DB connection initialized\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "INFO:absl:Component Pusher is finished.\n" ] } ], "source": [ "tfx.orchestration.LocalDagRunner().run(\n", " _create_pipeline(\n", " pipeline_name=PIPELINE_NAME,\n", " pipeline_root=PIPELINE_ROOT,\n", " data_root=DATA_ROOT,\n", " schema_path=SCHEMA_PATH,\n", " module_file=_module_file,\n", " serving_model_dir=SERVING_MODEL_DIR,\n", " metadata_path=METADATA_PATH))" ] }, { "cell_type": "markdown", "metadata": { "id": "ppERq0Mj6xvW" }, "source": [ "You should see \"INFO:absl:Component Pusher is finished.\" if the pipeline\n", "finished successfully.\n", "\n", "The pusher component pushes the trained model to the `SERVING_MODEL_DIR` 
which\n", "is the `serving_model/penguin-transform` directory if you did not change\n", "the variables in the previous steps. You can see the result from the file\n", "browser in the left-side panel in Colab, or using the following command:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:50.089766Z", "iopub.status.busy": "2024-05-08T09:20:50.089193Z", "iopub.status.idle": "2024-05-08T09:20:50.261316Z", "shell.execute_reply": "2024-05-08T09:20:50.260279Z" }, "id": "NTHROkqX6yHx" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "serving_model/penguin-transform\r\n", "serving_model/penguin-transform/1715160050\r\n", "serving_model/penguin-transform/1715160050/variables\r\n", "serving_model/penguin-transform/1715160050/variables/variables.index\r\n", "serving_model/penguin-transform/1715160050/variables/variables.data-00000-of-00001\r\n", "serving_model/penguin-transform/1715160050/assets\r\n", "serving_model/penguin-transform/1715160050/keras_metadata.pb\r\n", "serving_model/penguin-transform/1715160050/fingerprint.pb\r\n", "serving_model/penguin-transform/1715160050/saved_model.pb\r\n" ] } ], "source": [ "# List files in created model directory.\n", "!find {SERVING_MODEL_DIR}" ] }, { "cell_type": "markdown", "metadata": { "id": "VTqM-WiZkPbt" }, "source": [ "You can also check the signature of the generated model using the\n", "[`saved_model_cli` tool](https://www.tensorflow.org/guide/saved_model#show_command)." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:50.265368Z", "iopub.status.busy": "2024-05-08T09:20:50.265066Z", "iopub.status.idle": "2024-05-08T09:20:53.563299Z", "shell.execute_reply": "2024-05-08T09:20:53.562394Z" }, "id": "YBfUzD_OkOq_" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-05-08 09:20:50.856062: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\r\n", "2024-05-08 09:20:50.856134: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\r\n", "2024-05-08 09:20:50.857634: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "The given SavedModel SignatureDef contains the following input(s):\r\n", " inputs['examples'] tensor_info:\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " dtype: DT_STRING\r\n", " shape: (-1)\r\n", " name: serving_default_examples:0\r\n", "The given SavedModel SignatureDef contains the following output(s):\r\n", " outputs['output_0'] tensor_info:\r\n", " dtype: DT_FLOAT\r\n", " shape: (-1, 3)\r\n", " name: StatefulPartitionedCall_1:0\r\n", "Method name is: tensorflow/serving/predict\r\n" ] } ], "source": [ "!saved_model_cli show --dir {SERVING_MODEL_DIR}/$(ls -1 {SERVING_MODEL_DIR} | sort -nr | head -1) --tag_set serve --signature_def serving_default" ] }, { "cell_type": "markdown", "metadata": { "id": "DkAxFs_QszoZ" }, "source": [ "Because we defined `serving_default` with our own `serve_tf_examples_fn`\n", "function, the signature shows that it takes a single string.\n", "This string is a 
{ "cell_type": "markdown", "metadata": { "id": "DkAxFs_QszoZ" }, "source": [ "Because we defined `serving_default` with our own `serve_tf_examples_fn`\n", "function, the signature shows that it takes a single string tensor.\n", "Each string is a serialized tf.Example and will be parsed with the\n", "[tf.io.parse_example()](https://www.tensorflow.org/api_docs/python/tf/io/parse_example)\n", "function, as we defined earlier (learn more about tf.Examples [here](https://www.tensorflow.org/tutorials/load_data/tfrecord)).\n", "\n", "We can load the exported model and run inference on a few examples." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:53.567917Z", "iopub.status.busy": "2024-05-08T09:20:53.567623Z", "iopub.status.idle": "2024-05-08T09:20:53.984287Z", "shell.execute_reply": "2024-05-08T09:20:53.983597Z" }, "id": "Z1Yw5yYdvqKf" }, "outputs": [], "source": [ "# Find the model directory with the latest timestamp.\n", "model_dirs = (item for item in os.scandir(SERVING_MODEL_DIR) if item.is_dir())\n", "model_path = max(model_dirs, key=lambda i: int(i.name)).path\n", "\n", "loaded_model = tf.keras.models.load_model(model_path)\n", "inference_fn = loaded_model.signatures['serving_default']" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2024-05-08T09:20:53.988545Z", "iopub.status.busy": "2024-05-08T09:20:53.987983Z", "iopub.status.idle": "2024-05-08T09:20:54.058187Z", "shell.execute_reply": "2024-05-08T09:20:54.057519Z" }, "id": "xrOHIvnIv0-4" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-2.76344 -0.5130405 7.046433 ]]\n" ] } ], "source": [ "# Prepare a serialized tf.Example and run inference.\n", "features = {\n", " 'culmen_length_mm': tf.train.Feature(float_list=tf.train.FloatList(value=[49.9])),\n", " 'culmen_depth_mm': tf.train.Feature(float_list=tf.train.FloatList(value=[16.1])),\n", " 'flipper_length_mm': tf.train.Feature(int64_list=tf.train.Int64List(value=[213])),\n", " 'body_mass_g': tf.train.Feature(int64_list=tf.train.Int64List(value=[5400])),\n", "}\n", "example_proto = tf.train.Example(features=tf.train.Features(feature=features))\n", "examples = example_proto.SerializeToString()\n", "\n", "result = inference_fn(examples=tf.constant([examples]))\n", "print(result['output_0'].numpy())" ] }, { "cell_type": "markdown", "metadata": { "id": "cri3mTgZ0SQ2" }, "source": [ "The third element, which corresponds to the 'Gentoo' species, is expected to be\n", "the largest of the three." ] }, { "cell_type": "markdown", "metadata": { "id": "08R8qvweThRf" }, "source": [ "## Next steps\n", "\n", "If you want to learn more about the Transform component, see the\n", "[Transform Component guide](https://www.tensorflow.org/tfx/guide/transform).\n", "You can find more resources at https://www.tensorflow.org/tfx/tutorials.\n", "\n", "Please see\n", "[Understanding TFX Pipelines](https://www.tensorflow.org/tfx/guide/understanding_tfx_pipelines)\n", "to learn more about various concepts in TFX.\n" ] } ], "metadata": { "colab": { "collapsed_sections": [ "DjUA6S30k52h" ], "name": "penguin_transform.ipynb", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.19" } }, "nbformat": 4, "nbformat_minor": 0 }