{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "FhGuhbZ6M5tl"
},
"source": [
"##### Copyright 2018 The TensorFlow Authors."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"cellView": "form",
"execution": {
"iopub.execute_input": "2020-09-23T00:13:19.251944Z",
"iopub.status.busy": "2020-09-23T00:13:19.251312Z",
"iopub.status.idle": "2020-09-23T00:13:19.253787Z",
"shell.execute_reply": "2020-09-23T00:13:19.253203Z"
},
"id": "AwOEIRJC6Une"
},
"outputs": [],
"source": [
"#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"cellView": "form",
"execution": {
"iopub.execute_input": "2020-09-23T00:13:19.257382Z",
"iopub.status.busy": "2020-09-23T00:13:19.256735Z",
"iopub.status.idle": "2020-09-23T00:13:19.258597Z",
"shell.execute_reply": "2020-09-23T00:13:19.258994Z"
},
"id": "KyPEtTqk6VdG"
},
"outputs": [],
"source": [
"#@title MIT License\n",
"#\n",
"# Copyright (c) 2017 François Chollet\n",
"#\n",
"# Permission is hereby granted, free of charge, to any person obtaining a\n",
"# copy of this software and associated documentation files (the \"Software\"),\n",
"# to deal in the Software without restriction, including without limitation\n",
"# the rights to use, copy, modify, merge, publish, distribute, sublicense,\n",
"# and/or sell copies of the Software, and to permit persons to whom the\n",
"# Software is furnished to do so, subject to the following conditions:\n",
"#\n",
"# The above copyright notice and this permission notice shall be included in\n",
"# all copies or substantial portions of the Software.\n",
"#\n",
"# THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n",
"# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n",
"# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL\n",
"# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n",
"# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING\n",
"# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\n",
"# DEALINGS IN THE SOFTWARE."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EIdT9iu_Z4Rb"
},
"source": [
"# Basic regression: Predict fuel efficiency"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Au0QDqNXgRb_"
},
"source": [
"Note: Our TensorFlow community has translated these documents. Because community translations\n",
"are best-effort, there is no guarantee that this is an accurate and up-to-date reflection\n",
"of the [official English documentation](https://www.tensorflow.org/?hl=en).\n",
"If you have suggestions to improve this translation, please send a pull request\n",
"to the [tensorflow/docs](https://github.com/tensorflow/docs) repository.\n",
"To volunteer to write or review community translations,\n",
"please contact the [docs@tensorflow.org list](https://groups.google.com/a/tensorflow.org/forum/#!forum/docs)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AHp3M9ZmrIxj"
},
"source": [
"In a *regression* problem, we aim to predict a continuous output value, such as a price or a probability. In contrast, in a *classification* problem, we aim to select a class from a list of classes (for example, given a picture that contains an apple or an orange, recognizing which fruit is in the picture).\n",
"\n",
"This notebook uses the classic [Auto MPG](https://archive.ics.uci.edu/ml/datasets/auto+mpg) dataset and builds a model to predict the fuel efficiency of automobiles from the 1970s and early 1980s. To do this, we will provide the model with a description of many automobiles from that period. This description includes attributes such as cylinders, displacement, horsepower, and weight.\n",
"\n",
"This example uses the `tf.keras` API; see [this guide](https://www.tensorflow.org/guide/keras) for details."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:19.265623Z",
"iopub.status.busy": "2020-09-23T00:13:19.262712Z",
"iopub.status.idle": "2020-09-23T00:13:20.610944Z",
"shell.execute_reply": "2020-09-23T00:13:20.611527Z"
},
"id": "moB4tpEHxKB3"
},
"outputs": [
],
"source": [
"# Use seaborn for pairplot\n",
"!pip install -q seaborn"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:20.617330Z",
"iopub.status.busy": "2020-09-23T00:13:20.616344Z",
"iopub.status.idle": "2020-09-23T00:13:27.321654Z",
"shell.execute_reply": "2020-09-23T00:13:27.321020Z"
},
"id": "1rRo8oNqZ-Rj"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.3.0\n"
]
}
],
"source": [
"import pathlib\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"import tensorflow as tf\n",
"\n",
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"\n",
"print(tf.__version__)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "F_72b0LCNbjx"
},
"source": [
"## The Auto MPG dataset\n",
"\n",
"The dataset is available from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/).\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gFh9ne3FZ-On"
},
"source": [
"### Get the data\n",
"First, download the dataset."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:27.328243Z",
"iopub.status.busy": "2020-09-23T00:13:27.327585Z",
"iopub.status.idle": "2020-09-23T00:13:27.661960Z",
"shell.execute_reply": "2020-09-23T00:13:27.662382Z"
},
"id": "p9kxxgzvzlyz"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading data from http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data\n"
]
},
{
"data": {
"text/plain": [
"'/home/kbuilder/.keras/datasets/auto-mpg.data'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_path = keras.utils.get_file(\"auto-mpg.data\", \"http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data\")\n",
"dataset_path"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nslsRLh7Zss4"
},
"source": [
"Import it using pandas."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:27.670510Z",
"iopub.status.busy": "2020-09-23T00:13:27.667688Z",
"iopub.status.idle": "2020-09-23T00:13:27.689919Z",
"shell.execute_reply": "2020-09-23T00:13:27.689290Z"
},
"id": "CiX2FI4gZtTt"
},
"outputs": [
{
"data": {
"text/plain": [
" MPG Cylinders Displacement Horsepower Weight Acceleration \\\n",
"393 27.0 4 140.0 86.0 2790.0 15.6 \n",
"394 44.0 4 97.0 52.0 2130.0 24.6 \n",
"395 32.0 4 135.0 84.0 2295.0 11.6 \n",
"396 28.0 4 120.0 79.0 2625.0 18.6 \n",
"397 31.0 4 119.0 82.0 2720.0 19.4 \n",
"\n",
" Model Year Origin \n",
"393 82 1 \n",
"394 82 2 \n",
"395 82 1 \n",
"396 82 1 \n",
"397 82 1 "
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight',\n",
" 'Acceleration', 'Model Year', 'Origin']\n",
"raw_dataset = pd.read_csv(dataset_path, names=column_names,\n",
" na_values = \"?\", comment='\\t',\n",
" sep=\" \", skipinitialspace=True)\n",
"\n",
"dataset = raw_dataset.copy()\n",
"dataset.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3MWuJTKEDM-f"
},
"source": [
"### Clean the data\n",
"\n",
"The dataset contains a few unknown values."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:27.696269Z",
"iopub.status.busy": "2020-09-23T00:13:27.695586Z",
"iopub.status.idle": "2020-09-23T00:13:27.699098Z",
"shell.execute_reply": "2020-09-23T00:13:27.698583Z"
},
"id": "JEJHhN65a2VV"
},
"outputs": [
{
"data": {
"text/plain": [
"MPG 0\n",
"Cylinders 0\n",
"Displacement 0\n",
"Horsepower 6\n",
"Weight 0\n",
"Acceleration 0\n",
"Model Year 0\n",
"Origin 0\n",
"dtype: int64"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset.isna().sum()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9UPN0KBHa_WI"
},
"source": [
"To keep this initial tutorial simple, drop the rows that contain missing values."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:27.704846Z",
"iopub.status.busy": "2020-09-23T00:13:27.704211Z",
"iopub.status.idle": "2020-09-23T00:13:27.708906Z",
"shell.execute_reply": "2020-09-23T00:13:27.708399Z"
},
"id": "4ZUDosChC1UN"
},
"outputs": [],
"source": [
"dataset = dataset.dropna()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8XKitwaH4v8h"
},
"source": [
"The `\"Origin\"` column is really categorical, not numeric. So convert it to a one-hot encoding:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:27.713743Z",
"iopub.status.busy": "2020-09-23T00:13:27.713079Z",
"iopub.status.idle": "2020-09-23T00:13:27.715710Z",
"shell.execute_reply": "2020-09-23T00:13:27.715193Z"
},
"id": "gWNTD2QjBWFJ"
},
"outputs": [],
"source": [
"origin = dataset.pop('Origin')"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:27.731879Z",
"iopub.status.busy": "2020-09-23T00:13:27.731117Z",
"iopub.status.idle": "2020-09-23T00:13:27.734434Z",
"shell.execute_reply": "2020-09-23T00:13:27.733917Z"
},
"id": "ulXz4J7PAUzk"
},
"outputs": [
{
"data": {
"text/plain": [
" MPG Cylinders Displacement Horsepower Weight Acceleration \\\n",
"393 27.0 4 140.0 86.0 2790.0 15.6 \n",
"394 44.0 4 97.0 52.0 2130.0 24.6 \n",
"395 32.0 4 135.0 84.0 2295.0 11.6 \n",
"396 28.0 4 120.0 79.0 2625.0 18.6 \n",
"397 31.0 4 119.0 82.0 2720.0 19.4 \n",
"\n",
" Model Year USA Europe Japan \n",
"393 82 1.0 0.0 0.0 \n",
"394 82 0.0 1.0 0.0 \n",
"395 82 1.0 0.0 0.0 \n",
"396 82 1.0 0.0 0.0 \n",
"397 82 1.0 0.0 0.0 "
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset['USA'] = (origin == 1)*1.0\n",
"dataset['Europe'] = (origin == 2)*1.0\n",
"dataset['Japan'] = (origin == 3)*1.0\n",
"dataset.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cuym4yvk76vU"
},
"source": [
"### Split the data into training and test sets\n",
"\n",
"Now split the dataset into a training set and a test set.\n",
"\n",
"We will use the test set in the final evaluation of our model."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:27.740592Z",
"iopub.status.busy": "2020-09-23T00:13:27.739928Z",
"iopub.status.idle": "2020-09-23T00:13:27.742001Z",
"shell.execute_reply": "2020-09-23T00:13:27.742428Z"
},
"id": "qn-IGhUE7_1H"
},
"outputs": [],
"source": [
"train_dataset = dataset.sample(frac=0.8,random_state=0)\n",
"test_dataset = dataset.drop(train_dataset.index)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "J4ubs136WLNp"
},
"source": [
"### Inspect the data\n",
"\n",
"Take a quick look at the joint distribution of a few pairs of columns from the training set."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:27.749652Z",
"iopub.status.busy": "2020-09-23T00:13:27.748596Z",
"iopub.status.idle": "2020-09-23T00:13:33.056550Z",
"shell.execute_reply": "2020-09-23T00:13:33.057143Z"
},
"id": "oRKO_x8gWKv-"
},
"outputs": [
{
"data": {
"text/plain": [
"[Figure: pair plot of MPG, Cylinders, Displacement, and Weight from the training set, with KDE plots on the diagonal]"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.pairplot(train_dataset[[\"MPG\", \"Cylinders\", \"Displacement\", \"Weight\"]], diag_kind=\"kde\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gavKO_6DWRMP"
},
"source": [
"Also look at the overall statistics:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:33.068038Z",
"iopub.status.busy": "2020-09-23T00:13:33.067343Z",
"iopub.status.idle": "2020-09-23T00:13:33.100862Z",
"shell.execute_reply": "2020-09-23T00:13:33.100322Z"
},
"id": "yi2FzC3T21jR"
},
"outputs": [
{
"data": {
"text/plain": [
" count mean std min 25% 50% \\\n",
"Cylinders 314.0 5.477707 1.699788 3.0 4.00 4.0 \n",
"Displacement 314.0 195.318471 104.331589 68.0 105.50 151.0 \n",
"Horsepower 314.0 104.869427 38.096214 46.0 76.25 94.5 \n",
"Weight 314.0 2990.251592 843.898596 1649.0 2256.50 2822.5 \n",
"Acceleration 314.0 15.559236 2.789230 8.0 13.80 15.5 \n",
"Model Year 314.0 75.898089 3.675642 70.0 73.00 76.0 \n",
"USA 314.0 0.624204 0.485101 0.0 0.00 1.0 \n",
"Europe 314.0 0.178344 0.383413 0.0 0.00 0.0 \n",
"Japan 314.0 0.197452 0.398712 0.0 0.00 0.0 \n",
"\n",
" 75% max \n",
"Cylinders 8.00 8.0 \n",
"Displacement 265.75 455.0 \n",
"Horsepower 128.00 225.0 \n",
"Weight 3608.00 5140.0 \n",
"Acceleration 17.20 24.8 \n",
"Model Year 79.00 82.0 \n",
"USA 1.00 1.0 \n",
"Europe 0.00 1.0 \n",
"Japan 0.00 1.0 "
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train_stats = train_dataset.describe()\n",
"train_stats.pop(\"MPG\")\n",
"train_stats = train_stats.transpose()\n",
"train_stats"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Db7Auq1yXUvh"
},
"source": [
"### Split features from labels\n",
"\n",
"Separate the target value, or \"label\", \n",
"from the features. This label is the value that you will train the model to predict."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:33.107188Z",
"iopub.status.busy": "2020-09-23T00:13:33.106198Z",
"iopub.status.idle": "2020-09-23T00:13:33.108329Z",
"shell.execute_reply": "2020-09-23T00:13:33.108774Z"
},
"id": "t2sluJdCW7jN"
},
"outputs": [],
"source": [
"train_labels = train_dataset.pop('MPG')\n",
"test_labels = test_dataset.pop('MPG')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mRklxK5s388r"
},
"source": [
"### Normalize the data\n",
"\n",
"Look again at the `train_stats` block shown earlier and note how different the ranges of the features are."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-ywmerQ6dSox"
},
"source": [
"It is good practice to normalize features that use different scales and ranges. Although the model *might* converge without feature normalization, skipping it makes training more difficult and makes the resulting model dependent on the choice of units used in the input.\n",
"\n",
"Note: Although we intentionally generate these statistics from only the training dataset, these statistics will also be used to normalize the test dataset. We need to do that to project the test dataset into the same distribution that the model has been trained on."
]
},
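{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of the point about units, the sketch below standardizes the same three weights expressed in pounds and in kilograms. The helper name `standardize` and the use of NumPy are assumptions made for this example and are not part of the tutorial's code; the three sample values are the minimum, median, and maximum of `Weight` from `train_stats`.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def standardize(values):\n",
"    # Hypothetical helper: subtract the mean and divide by the sample\n",
"    # standard deviation (ddof=1, the same convention pandas uses).\n",
"    values = np.asarray(values, dtype=float)\n",
"    return (values - values.mean()) / values.std(ddof=1)\n",
"\n",
"# Min, median, and max of the Weight column, taken from train_stats.\n",
"weight_lb = np.array([1649.0, 2822.5, 5140.0])\n",
"weight_kg = weight_lb * 0.45359237  # the same cars, different unit\n",
"\n",
"z_lb = standardize(weight_lb)\n",
"z_kg = standardize(weight_kg)\n",
"\n",
"# Checks whether the standardized values agree regardless of the unit.\n",
"print(np.allclose(z_lb, z_kg))\n",
"```\n",
"\n",
"If the two sets of standardized values agree, the model's input no longer depends on whether the weights were recorded in pounds or kilograms."
]
},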
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:33.117479Z",
"iopub.status.busy": "2020-09-23T00:13:33.116765Z",
"iopub.status.idle": "2020-09-23T00:13:33.118784Z",
"shell.execute_reply": "2020-09-23T00:13:33.119274Z"
},
"id": "JlC5ooJrgjQF"
},
"outputs": [],
"source": [
"def norm(x):\n",
" return (x - train_stats['mean']) / train_stats['std']\n",
"normed_train_data = norm(train_dataset)\n",
"normed_test_data = norm(test_dataset)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BuiClDk45eS4"
},
"source": [
"This normalized data is what we will use to train the model.\n",
"\n",
"Caution: The statistics used to normalize the inputs here (mean and standard deviation) need to be applied to any other data that is fed to the model, along with the one-hot encoding that we did earlier. That includes the test set as well as live data when the model is used in production."
]
},
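{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to follow the caution above is to wrap both preprocessing steps in a single function and reuse it for every input. The sketch below is a minimal, self-contained illustration: the helper names `one_hot_origin` and `prepare_for_model` are hypothetical, and the tiny `toy_train` frame uses made-up values in place of the real training set.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"def one_hot_origin(raw):\n",
"    # Same encoding as earlier in this notebook: 1 = USA, 2 = Europe, 3 = Japan.\n",
"    features = raw.copy()\n",
"    origin = features.pop('Origin')\n",
"    features['USA'] = (origin == 1) * 1.0\n",
"    features['Europe'] = (origin == 2) * 1.0\n",
"    features['Japan'] = (origin == 3) * 1.0\n",
"    return features\n",
"\n",
"def prepare_for_model(raw, stats):\n",
"    # Encode first, then standardize with the *training* statistics.\n",
"    features = one_hot_origin(raw)\n",
"    return (features - stats['mean']) / stats['std']\n",
"\n",
"# Toy stand-in for the training features (values are made up).\n",
"toy_train = one_hot_origin(pd.DataFrame({\n",
"    'Weight': [2000.0, 3000.0, 4000.0],\n",
"    'Origin': [1, 2, 3],\n",
"}))\n",
"toy_stats = toy_train.describe().transpose()\n",
"\n",
"# A new, unseen record goes through exactly the same steps.\n",
"new_car = pd.DataFrame({'Weight': [3000.0], 'Origin': [2]})\n",
"prepared = prepare_for_model(new_car, toy_stats)\n",
"print(prepared)\n",
"```\n",
"\n",
"In this notebook the same idea would apply with `train_stats` in place of `toy_stats`, so that test data and production data are scaled the same way as the training data."
]
},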
{
"cell_type": "markdown",
"metadata": {
"id": "SmjdzxKzEu1-"
},
"source": [
"## The model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6SWtkIjhrZwa"
},
"source": [
"### Build the model\n",
"\n",
"Let's build our model. Here, we will use a `Sequential` model with two densely connected hidden layers and an output layer that returns a single, continuous value. The model-building steps are wrapped in a function, `build_model`, since we will create a second model later on."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:33.125030Z",
"iopub.status.busy": "2020-09-23T00:13:33.124402Z",
"iopub.status.idle": "2020-09-23T00:13:33.126998Z",
"shell.execute_reply": "2020-09-23T00:13:33.126470Z"
},
"id": "c26juK7ZG8j-"
},
"outputs": [],
"source": [
"def build_model():\n",
" model = keras.Sequential([\n",
" layers.Dense(64, activation='relu', input_shape=[len(train_dataset.keys())]),\n",
" layers.Dense(64, activation='relu'),\n",
" layers.Dense(1)\n",
" ])\n",
"\n",
" optimizer = tf.keras.optimizers.RMSprop(0.001)\n",
"\n",
" model.compile(loss='mse',\n",
" optimizer=optimizer,\n",
" metrics=['mae', 'mse'])\n",
" return model"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:33.130964Z",
"iopub.status.busy": "2020-09-23T00:13:33.130309Z",
"iopub.status.idle": "2020-09-23T00:13:34.872552Z",
"shell.execute_reply": "2020-09-23T00:13:34.871944Z"
},
"id": "cGbPb-PHGbhs"
},
"outputs": [],
"source": [
"model = build_model()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Sj49Og4YGULr"
},
"source": [
"### Inspect the model\n",
"\n",
"Use the `.summary` method to print a simple description of the model."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:34.877263Z",
"iopub.status.busy": "2020-09-23T00:13:34.876610Z",
"iopub.status.idle": "2020-09-23T00:13:34.879990Z",
"shell.execute_reply": "2020-09-23T00:13:34.880521Z"
},
"id": "ReAD0n6MsFK-"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential\"\n",
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"dense (Dense) (None, 64) 640 \n",
"_________________________________________________________________\n",
"dense_1 (Dense) (None, 64) 4160 \n",
"_________________________________________________________________\n",
"dense_2 (Dense) (None, 1) 65 \n",
"=================================================================\n",
"Total params: 4,865\n",
"Trainable params: 4,865\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model.summary()"
]
},
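{
"cell_type": "markdown",
"metadata": {},
"source": [
"The parameter counts in the summary above can be checked by hand. The sketch below assumes 9 input features (the six numeric columns left after removing `MPG`, plus the three one-hot origin columns) and uses a hypothetical helper, `dense_params`, that counts one weight per input-unit pair plus one bias per unit.\n",
"\n",
"```python\n",
"def dense_params(n_inputs, n_units):\n",
"    # A dense layer has n_inputs * n_units weights and n_units biases.\n",
"    return n_inputs * n_units + n_units\n",
"\n",
"n_features = 9             # assumption: feature columns after preprocessing\n",
"layer_sizes = [64, 64, 1]  # units per layer, as in build_model\n",
"\n",
"counts = []\n",
"n_inputs = n_features\n",
"for n_units in layer_sizes:\n",
"    counts.append(dense_params(n_inputs, n_units))\n",
"    n_inputs = n_units  # this layer's output feeds the next layer\n",
"\n",
"print(counts)       # [640, 4160, 65]\n",
"print(sum(counts))  # 4865\n",
"```\n",
"\n",
"These values agree with the `Param #` column and the total of 4,865 reported by `model.summary()`."
]
},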
{
"cell_type": "markdown",
"metadata": {
"id": "Vt6W50qGsJAL"
},
"source": [
"Now try out the model. Take a batch of `10` examples from the training data and call `model.predict` on it."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:34.885465Z",
"iopub.status.busy": "2020-09-23T00:13:34.884854Z",
"iopub.status.idle": "2020-09-23T00:13:35.279864Z",
"shell.execute_reply": "2020-09-23T00:13:35.280423Z"
},
"id": "-d-gBaVtGTSC"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.01398571],\n",
" [ 0.09287428],\n",
" [-0.06822745],\n",
" [ 0.01392132],\n",
" [ 0.2776244 ],\n",
" [-0.02446011],\n",
" [ 0.24629724],\n",
" [ 0.585566 ],\n",
" [-0.01556234],\n",
" [ 0.31573048]], dtype=float32)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"example_batch = normed_train_data[:10]\n",
"example_result = model.predict(example_batch)\n",
"example_result"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QlM8KrSOsaYo"
},
"source": [
"It seems to be working, and it produces a result of the expected shape and type."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0-qWCsh6DlyH"
},
"source": [
"### Train the model\n",
"\n",
"Train the model for 1000 epochs, and record the training and validation metrics in the `history` object."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:13:35.289253Z",
"iopub.status.busy": "2020-09-23T00:13:35.288254Z",
"iopub.status.idle": "2020-09-23T00:14:08.890777Z",
"shell.execute_reply": "2020-09-23T00:14:08.891232Z"
},
"id": "sD7qHCmNIOY0"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"..\n",
"....."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"....."
]
}
],
"source": [
"# Display training progress by printing a single dot for each completed epoch\n",
"class PrintDot(keras.callbacks.Callback):\n",
"  def on_epoch_end(self, epoch, logs):\n",
"    # Start a new line every 100 epochs so the dots stay readable\n",
"    if epoch % 100 == 0: print('')\n",
"    print('.', end='')\n",
"\n",
"EPOCHS = 1000\n",
"\n",
"history = model.fit(\n",
" normed_train_data, train_labels,\n",
" epochs=EPOCHS, validation_split = 0.2, verbose=0,\n",
" callbacks=[PrintDot()])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tQm3pc0FYPQB"
},
"source": [
"Visualize the model's training progress using the statistics stored in the `history` object."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:14:08.904442Z",
"iopub.status.busy": "2020-09-23T00:14:08.896689Z",
"iopub.status.idle": "2020-09-23T00:14:08.907100Z",
"shell.execute_reply": "2020-09-23T00:14:08.907542Z"
},
"id": "4Xj91b-dymEy"
},
"outputs": [
{
"data": {
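"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>loss</th>\n",
"      <th>mae</th>\n",
"      <th>mse</th>\n",
"      <th>val_loss</th>\n",
"      <th>val_mae</th>\n",
"      <th>val_mse</th>\n",
"      <th>epoch</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>995</th>\n",
"      <td>2.595970</td>\n",
"      <td>0.978913</td>\n",
"      <td>2.595970</td>\n",
"      <td>11.760169</td>\n",
"      <td>2.524347</td>\n",
"      <td>11.760169</td>\n",
"      <td>995</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>996</th>\n",
"      <td>2.479838</td>\n",
"      <td>0.976182</td>\n",
"      <td>2.479838</td>\n",
"      <td>11.067865</td>\n",
"      <td>2.483008</td>\n",
"      <td>11.067865</td>\n",
"      <td>996</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>997</th>\n",
"      <td>2.674930</td>\n",
"      <td>1.019684</td>\n",
"      <td>2.674930</td>\n",
"      <td>11.104455</td>\n",
"      <td>2.580837</td>\n",
"      <td>11.104455</td>\n",
"      <td>997</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>998</th>\n",
"      <td>2.610440</td>\n",
"      <td>0.966319</td>\n",
"      <td>2.610440</td>\n",
"      <td>10.906663</td>\n",
"      <td>2.532035</td>\n",
"      <td>10.906663</td>\n",
"      <td>998</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>999</th>\n",
"      <td>2.501746</td>\n",
"      <td>1.019541</td>\n",
"      <td>2.501746</td>\n",
"      <td>10.737952</td>\n",
"      <td>2.528607</td>\n",
"      <td>10.737952</td>\n",
"      <td>999</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],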
"text/plain": [
" loss mae mse val_loss val_mae val_mse epoch\n",
"995 2.595970 0.978913 2.595970 11.760169 2.524347 11.760169 995\n",
"996 2.479838 0.976182 2.479838 11.067865 2.483008 11.067865 996\n",
"997 2.674930 1.019684 2.674930 11.104455 2.580837 11.104455 997\n",
"998 2.610440 0.966319 2.610440 10.906663 2.532035 10.906663 998\n",
"999 2.501746 1.019541 2.501746 10.737952 2.528607 10.737952 999"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hist = pd.DataFrame(history.history)\n",
"hist['epoch'] = history.epoch\n",
"hist.tail()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:14:08.916363Z",
"iopub.status.busy": "2020-09-23T00:14:08.915232Z",
"iopub.status.idle": "2020-09-23T00:14:09.466039Z",
"shell.execute_reply": "2020-09-23T00:14:09.466618Z"
},
"id": "B6XriGbVPh2t"
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def plot_history(history):\n",
"  hist = pd.DataFrame(history.history)\n",
"  hist['epoch'] = history.epoch\n",
"\n",
"  plt.figure()\n",
"  plt.xlabel('Epoch')\n",
"  plt.ylabel('Mean Abs Error [MPG]')\n",
"  plt.plot(hist['epoch'], hist['mae'],\n",
"           label='Train Error')\n",
"  plt.plot(hist['epoch'], hist['val_mae'],\n",
"           label='Val Error')\n",
"  plt.ylim([0,5])\n",
"  plt.legend()\n",
"\n",
"  plt.figure()\n",
"  plt.xlabel('Epoch')\n",
"  plt.ylabel('Mean Square Error [$MPG^2$]')\n",
"  plt.plot(hist['epoch'], hist['mse'],\n",
"           label='Train Error')\n",
"  plt.plot(hist['epoch'], hist['val_mse'],\n",
"           label='Val Error')\n",
"  plt.ylim([0,20])\n",
"  plt.legend()\n",
"  plt.show()\n",
"\n",
"\n",
"plot_history(history)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AqsuANc11FYv"
},
"source": [
"This graph shows little improvement, or even degradation, in the validation error after about 100 epochs. Let's update the `model.fit` call to stop training automatically when the validation score stops improving. We'll use an *EarlyStopping callback* that tests a training condition at the end of every epoch. If a set number of epochs elapses without improvement, it stops the training automatically.\n",
"\n",
"You can learn more about this callback [here](https://www.tensorflow.org/versions/master/api_docs/python/tf/keras/callbacks/EarlyStopping)."
]
},
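{
"cell_type": "markdown",
"metadata": {},
"source": [
"The patience rule described above can be sketched in plain Python. This is a simplified illustration, not the Keras implementation: the function name `epochs_until_stop` and the `val_losses` list are hypothetical, and the real callback offers more options (such as `min_delta` and `restore_best_weights`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch only: a simplified patience-based stopping rule.\n",
"# It does not touch the model and is not how Keras implements EarlyStopping.\n",
"def epochs_until_stop(val_losses, patience):\n",
"  # Return how many epochs would run before training is stopped.\n",
"  best = float('inf')  # best validation loss seen so far\n",
"  wait = 0             # epochs elapsed since the last improvement\n",
"  for epoch, loss in enumerate(val_losses):\n",
"    if loss < best:\n",
"      best = loss\n",
"      wait = 0\n",
"    else:\n",
"      wait += 1\n",
"      if wait >= patience:\n",
"        return epoch + 1\n",
"  return len(val_losses)\n",
"\n",
"# Made-up validation losses: improvement stalls after the third epoch.\n",
"val_losses = [10.0, 8.0, 7.5, 7.6, 7.7, 7.8, 7.4]\n",
"print(epochs_until_stop(val_losses, patience=3))"
]
},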
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:14:09.475435Z",
"iopub.status.busy": "2020-09-23T00:14:09.474702Z",
"iopub.status.idle": "2020-09-23T00:14:13.240258Z",
"shell.execute_reply": "2020-09-23T00:14:13.239617Z"
},
"id": "fdMZuhUgzMZ4"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"......"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"........"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"...."
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"model = build_model()\n",
"\n",
"# The patience parameter is the number of epochs to wait for an improvement before stopping\n",
"early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)\n",
"\n",
"history = model.fit(normed_train_data, train_labels, epochs=EPOCHS,\n",
" validation_split = 0.2, verbose=0, callbacks=[early_stop, PrintDot()])\n",
"\n",
"plot_history(history)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3St8-DmrX8P4"
},
"source": [
"The graph shows that on the validation set, the average error is usually around +/- 2 MPG. Is this good? We'll leave that decision up to you.\n",
"\n",
"Let's see how well the model generalizes by using the **test** set, which we did not use when training the model. This tells us how well we can expect the model to predict when we use it in the real world."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:14:13.245748Z",
"iopub.status.busy": "2020-09-23T00:14:13.244728Z",
"iopub.status.idle": "2020-09-23T00:14:13.297849Z",
"shell.execute_reply": "2020-09-23T00:14:13.298268Z"
},
"id": "jl_yNr5n1kms"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3/3 - 0s - loss: 5.9382 - mae: 1.9334 - mse: 5.9382\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Testing set Mean Abs Error: 1.93 MPG\n"
]
}
],
"source": [
"loss, mae, mse = model.evaluate(normed_test_data, test_labels, verbose=2)\n",
"\n",
"print(\"Testing set Mean Abs Error: {:5.2f} MPG\".format(mae))"
]
},
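{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `mae` and `mse` values returned by `model.evaluate` are the mean absolute error and the mean squared error over the test set. As a rough sketch of what those numbers mean, the cell below computes both by hand with NumPy on made-up values (`y_true` and `y_pred` are hypothetical and are not taken from the Auto MPG data)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical values for illustration only (not from the Auto MPG data).\n",
"y_true = np.array([20.0, 30.0, 25.0])\n",
"y_pred = np.array([22.0, 27.0, 25.0])\n",
"\n",
"errors = y_pred - y_true\n",
"mae_by_hand = np.mean(np.abs(errors))  # mean absolute error, in MPG\n",
"mse_by_hand = np.mean(errors ** 2)     # mean squared error, in MPG^2\n",
"print('MAE: {:.2f} MPG, MSE: {:.2f} MPG^2'.format(mae_by_hand, mse_by_hand))"
]
},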
{
"cell_type": "markdown",
"metadata": {
"id": "ft603OzXuEZC"
},
"source": [
"### Make predictions\n",
"\n",
"Finally, predict MPG values using data in the testing set:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:14:13.304696Z",
"iopub.status.busy": "2020-09-23T00:14:13.303733Z",
"iopub.status.idle": "2020-09-23T00:14:13.491809Z",
"shell.execute_reply": "2020-09-23T00:14:13.491188Z"
},
"id": "Xe7RXH3N3CWU"
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"test_predictions = model.predict(normed_test_data).flatten()\n",
"\n",
"plt.scatter(test_labels, test_predictions)\n",
"plt.xlabel('True Values [MPG]')\n",
"plt.ylabel('Predictions [MPG]')\n",
"plt.axis('equal')\n",
"plt.axis('square')\n",
"plt.xlim([0,plt.xlim()[1]])\n",
"plt.ylim([0,plt.ylim()[1]])\n",
"_ = plt.plot([-100, 100], [-100, 100])\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "19wyogbOSU5t"
},
"source": [
"It looks like our model predicts reasonably well. Let's take a look at the error distribution."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"execution": {
"iopub.execute_input": "2020-09-23T00:14:13.505689Z",
"iopub.status.busy": "2020-09-23T00:14:13.504621Z",
"iopub.status.idle": "2020-09-23T00:14:13.660425Z",
"shell.execute_reply": "2020-09-23T00:14:13.659893Z"
},
"id": "f-OHX4DiXd8x"
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEGCAYAAABiq/5QAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAR0UlEQVR4nO3de5BkZX3G8e8jKwqKgsVojLgZokCiKEoNXkApFLVQDIhBwVLE6wYTCXgNaqX0j1SFROMlMUptUNGSgIrgXVFBEUousquGm3cB8cYa74RIwF/+6LM6jnPpGaf7TO/7/VRtbZ/Tp8/769nep995u8/7pqqQJLXjdn0XIEkaL4Nfkhpj8EtSYwx+SWqMwS9JjVnXdwHD2HXXXWt6errvMiRpomzatOlHVTU1d/9EBP/09DSXX35532VI0kRJct18+x3qkaTGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4JekxkzElbvStm76pI8u+zHXnnzoCCpRC+zxS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4JakxBr8kNcbgl6TGjCz4k7w9yY1Jrpy1725JPpXk693fu4yqfUnS/EbZ4z8NOGTOvpOA86pqD+C8bluSNEYjC/6q+hzw4zm7Dwfe2d1+J/CkUbUvSZrfuMf471FV3+9u/wC4x5jbl6Tm9fbhblUVUAvdn2RDksuTXL5ly5YxViZJ27ZxB/8Pk9wToPv7xoUOrKqNVTVTVTNTU1NjK1CStnXjDv4PAcd2t48FPjjm9iWpeaP8OucZwMXAXkluSPJc4GTgsUm+Djym25YkjdG6UZ24qp62wF0Hj6pNSdLSvHJXkhpj8EtSYwx+SWqMwS9JjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4JakxvQR/khcluSrJlUnOSHLHPuqQpBaNPfiT3Av4W2CmqvYGtgOOHncdktSqvoZ61gE7JFkH7Ah8r6c6JKk568bdYFV9N8nrgOuBm4FPVtUn5x6XZAOwAWD9+vXjLVKaANMnfXRZx1978qEjqkSTpo+hnl2Aw4HdgT8G7pTkGXOPq6qNVTVTVTNTU1PjLlOStll9DPU8Bvh2VW2pqv8Dzgb276EOSWpSH8F/PfCwJDsmCXAwcE0PdUhSk8Ye/FV1KXAWsBm4oqth47jrkKRWjf3DXYCqejXw6j7alqTWeeWuJDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSY3qZnVNabaNehrDFZQ5bfM6tsMcvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTFDBX+SA4bZJ0la+4bt8f/bkPskSWvcopO0JXk4sD8wleTFs+66C7DdKAuTJI3GUrNzbg/cuTtup1n7fw4cOaqiJEmjs2jwV9UFwAVJTquq68ZUkyRphIadj/8OSTYC07MfU1WPXkmjSXYGTgX2Bgp4TlVdvJJzSZKWZ9jgfx9wCoOwvm0V2n0T8ImqOjLJ9sCOq3BOSdIQhg3+W6vqravRYJK7AgcCzwKoqluAW1bj3JKkpQ0b/B9O8tfAOcCvtu6sqh+voM3dgS3AO5LsA2wCTqiqm2YflGQDsAFg/fr1K2hG0mzLXUpR265hv8d/LPAy4PMMgnoTcPkK21wH7Au8taoeDNwEnDT3oKraWFUzVTUzNTW1wqYkSXMN1eOvqt1Xsc0bgBuq6tJu+yzmCX5J0mgMFfxJnjnf/qp613IbrKofJPlOkr2q6qvAwc
DVyz2PJGllhh3j32/W7TsyCOvNwLKDv3M8cHr3jZ5vAc9e4XkkScs07FDP8bO3u+/hn7nSRqvqS8DMSh8vSVq5lU7LfBODb+dIkibMsGP8H2ZwhS0MJmf7c+C9oypKkjQ6w47xv27W7VuB66rqhhHUI0kasaGGerrJ2r7CYIbOXfBKW0maWMOuwPVU4DLgKcBTgUuTOC2zJE2gYYd6XgXsV1U3AiSZAj7N4OIrSdIEGfZbPbfbGvqd/17GYyVJa8iwPf5PJDkXOKPbPgr42GhKkiSN0lJr7t4XuEdVvSzJk4FHdHddDJw+6uIkSatvqR7/G4FXAFTV2cDZAEke0N33FyOtTpK06pYap79HVV0xd2e3b3okFUmSRmqp4N95kft2WM1CJEnjsVTwX57k+XN3Jnkeg8VYJEkTZqkx/hOBc5I8nd8G/QywPXDEKAuTRmnUyxC6zKHWskWDv6p+COyf5FHA3t3uj1bV+SOvTJI0EsPOx/8Z4DMjrkWSNAZefStJjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4JekxvQW/Em2S/LFJB/pqwZJalGfPf4TgGt6bF+SmtRL8CfZDTgUOLWP9iWpZUPNxz8CbwReDuy00AFJNgAbANavXz+msrRWuILV5Fnuv9m1Jx86okq0lLH3+JM8EbixqhZds7eqNlbVTFXNTE1Njak6Sdr29THUcwBwWJJrgTOBRyd5dw91SFKTxh78VfWKqtqtqqaBo4Hzq+oZ465Dklrl9/glqTF9fbgLQFV9FvhsnzVIUmvs8UtSYwx+SWqMwS9JjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDWm19k5tTa4ZJ7UFnv8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4JakxBr8kNWbswZ/k3kk+k+TqJFclOWHcNUhSy/pYiOVW4CVVtTnJTsCmJJ+qqqt7qEWSmjP2Hn9Vfb+qNne3fwFcA9xr3HVIUqt6XXoxyTTwYODSee7bAGwAWL9+/VjrGrVJX+pw0uvX2rDc1xEs/7Xka3V+vX24m+TOwPuBE6vq53Pvr6qNVTVTVTNTU1PjL1CStlG9BH+S2zMI/dOr6uw+apCkVvXxrZ4AbwOuqarXj7t9SWpdHz3+A4BjgEcn+VL35wk91CFJTRr7h7tVdRGQcbcrSRrwyl1JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4JakxBr8kNcbgl6TG9Lr04ji49Fr/VrLEnrQtWit5ZI9fkhpj8EtSYwx+SWqMwS9JjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4JekxvQS/EkOSfLVJN9IclIfNUhSq8Ye/Em2A/4deDxwP+BpSe437jokqVV99PgfAnyjqr5VVbcAZwKH91CHJDUpVTXeBpMjgUOq6nnd9jHAQ6vqhXOO2wBs6Db3Ar461kKXb1fgR30XMSRrHY1JqhUmq15rXZk/qaqpuTvX7Jq7VbUR2Nh3HcNKcnlVzfRdxzCsdTQmqVaYrHqtdXX1MdTzXeDes7Z36/ZJksagj+D/ArBHkt2TbA8cDXyohzokqUljH+qpqluTvBA4F9gOeHtVXTXuOkZgYoalsNZRmaRaYbLqtdZVNPYPdyVJ/fLKXUlqjMEvSY0x+FdRkuOTfCXJVUn+ue96hpHkJUkqya5917KQJK/tfq7/leScJDv3XdNckzINSZJ7J/lMkqu71+kJfde0lCTbJfliko/0XctSkuyc5Kzu9XpNkof3XdN8DP5VkuRRDK5A3qeq7g+8rueSlpTk3sDjgOv7rmUJnwL2rqoHAl8DXtFzPb9jwqYhuRV4SVXdD3
gY8DdruNatTgCu6buIIb0J+ERV/RmwD2u0boN/9bwAOLmqfgVQVTf2XM8w3gC8HFjTn/BX1Ser6tZu8xIG136sJRMzDUlVfb+qNne3f8EgmO7Vb1ULS7IbcChwat+1LCXJXYEDgbcBVNUtVfXTfquan8G/evYEHpnk0iQXJNmv74IWk+Rw4LtV9eW+a1mm5wAf77uIOe4FfGfW9g2s4TDdKsk08GDg0n4rWdQbGXROft13IUPYHdgCvKMbmjo1yZ36Lmo+a3bKhrUoyaeBP5rnrlcx+FnejcGvz/sB703yp9Xj92WXqPeVDIZ51oTFaq2qD3bHvIrBUMXp46xtW5TkzsD7gROr6ud91zOfJE8EbqyqTUkO6rueIawD9gWOr6pLk7wJOAn4+37L+n0G/zJU1WMWui/JC4Czu6C/LMmvGUzWtGVc9c21UL1JHsCgd/LlJDAYOtmc5CFV9YMxlvgbi/1sAZI8C3gicHCfb6YLmKhpSJLcnkHon15VZ/ddzyIOAA5L8gTgjsBdkry7qp7Rc10LuQG4oaq2/gZ1FoPgX3Mc6lk9HwAeBZBkT2B71s4Mfb+jqq6oqrtX1XRVTTN4we7bV+gvJckhDH7dP6yq/qfveuYxMdOQZPBO/zbgmqp6fd/1LKaqXlFVu3Wv0aOB89dw6NP9//lOkr26XQcDV/dY0oLs8a+etwNvT3IlcAtw7BrsmU6qNwN3AD7V/YZySVUd129JvzVh05AcABwDXJHkS92+V1bVx3qsaVtyPHB61wH4FvDsnuuZl1M2SFJjHOqRpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4NRZJbkvypSRXJnlfkh3/gHOdluTI7vapi80umeSgJPvP2j4uyTNX2vas80wnubl7Tlv//MHnXaS9a5NckWSm2/5skuu7C7K2HvOBJL+cp76rk5yS5HbdfXsk+UiSbybZ1E3TfGB331Hd1NJrfgpkrZwXcGlcbq6qBwEkOR04DvjNlaNJ1s2agXNoVfW8JQ45CPgl8Pnu+FOW28Yivrn1OS0kyXZVddtC2ws8JgyusZk7Mdmjqmr21eA/ZXBB1kXdGgX3nK++JOuA84EnJfkY8FHgpVX1oa69vYEZ4HNV9Z4kPwReuliNmmz2+NWHC4H7dr3xC5N8CLi6W3DjtUm+0C268lcwCMIkb85goZNPA3ffeqKu57u1F3xIks1JvpzkvG72yeOAF3U930cmeU2Sl3bHPyjJJfntAi+7zDrnPyW5LMnXkjxyOU8uyS+T/EuSLwMPn2f7xd1vPlcmObF7zHT3/N4FXMnvzv2zkDMZTGUA8GRg3nl3ujfUzwP3BZ4OXLw19Lv7r6yq05bzHDXZDH6NVdf7fDxwRbdrX+CEqtoTeC7ws6raj8EMp89PsjtwBLAXg0VOngnsP895p4D/AP6yqvYBnlJV1wKnAG+oqgdV1YVzHvYu4O+6BV6uAF496751VfUQ4MQ5+2e7z5yhnq1vEHcCLq2qfarqotnbwM0MLuN/KIOZXJ+f5MHd4/YA3lJV96+q6xb+Kf7GecCBGSwEczTwnvkO6obVDu6e4/2BzUOcW9swh3o0LjvMmhvmQgYThe0PXFZV3+72Pw544Nbxe+CuDMLwQOCMbojke0nOn+f8D2MwVPFtgKr68WLFZLBoxs5VdUG3653A+2YdsrX3vAmYXuA0Cw313MZg9sv5th8BnFNVN3V1nA08ksGkbtdV1SWL1T1POxcxCP0dquraWUP+0L0xMVho54NV9fEkj519QJJzGPyMv1ZVT15G25pgBr/G5ea5IdmF1E2zdzGYy/zcOcc9YfTl/Z5fdX/fxvL/n/zvnHH8udsLuWnpQ37PmcA5wGvmuW++N6arGLyRAlBVR3RDZWt+qVCtHod6tJacC7wgg/niSbJnBisYfQ44qvsM4J5001/PcQmDYY/du8ferdv/C2CnuQdX1c+An8wanjkGuGDucSNwIYMPWXfsntsR3b4/5Hz/CJ
wx5PH/CRyQ5LBZ+1b8DStNJnv8WktOZTCssrn7ZssW4EkMerSPZjC3+fXAxXMfWFVbkmwAzu6+tngj8Fjgw8BZGSw1efychx0LnNKNga9kCt37zBq+gsF0zP+62AOqanOS04DLul2nVtUXuw+il62b+nvo3npV3ZzBylavT/JG4IcM3hz/YSXtazI5LbM0AZJcC8zM+TrnqNo6iMHXPZ846rbUD4d6pMmwBThv61dXRyXJUcBbgJ+Msh31yx6/JDXGHr8kNcbgl6TGGPyS1BiDX5Ia8/+atbQK7wa75gAAAABJRU5ErkJggg==\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"error = test_predictions - test_labels\n",
"plt.hist(error, bins = 25)\n",
"plt.xlabel(\"Prediction Error [MPG]\")\n",
"_ = plt.ylabel(\"Count\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "m0CB5tBjSU5w"
},
"source": [
"It's not quite Gaussian, but we might expect that because the number of samples is very small."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vgGQuV-yqYZH"
},
"source": [
"## Conclusion\n",
"\n",
"This notebook introduced a few techniques to handle a regression problem.\n",
"\n",
"* Mean squared error (MSE) is a common loss function used for regression problems (different loss functions are used for classification problems).\n",
"* Similarly, the evaluation metrics used for regression differ from those used for classification. A common regression metric is mean absolute error (MAE).\n",
"* When numeric input data features have values with different ranges, each feature should be scaled independently to the same range.\n",
"* If there is not much training data, one technique is to prefer a small network with few hidden layers to avoid overfitting.\n",
"* Early stopping is a useful technique to prevent overfitting."
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "regression.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 0
}