{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "x8Q7Un821X1A" }, "source": [ "##### Copyright 2018 The TensorFlow Hub Authors.\n", "\n", "Licensed under the Apache License, Version 2.0 (the \"License\");" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:40:58.286948Z", "iopub.status.busy": "2023-05-23T08:40:58.286701Z", "iopub.status.idle": "2023-05-23T08:40:58.290474Z", "shell.execute_reply": "2023-05-23T08:40:58.289892Z" }, "id": "1W4rIAFt1Ui3" }, "outputs": [], "source": [ "# Copyright 2018 The TensorFlow Hub Authors. All Rights Reserved.\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# http://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License.\n", "# ==============================================================================" ] }, { "cell_type": "markdown", "metadata": { "id": "cDq0CIKc1vO_" }, "source": [ "# Action Recognition with an Inflated 3D CNN\n" ] }, { "cell_type": "markdown", "metadata": { "id": "MfBg1C5NB3X0" }, "source": [ "\n", " \n", " \n", " \n", " \n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "h6W3FhoP3TxC" }, "source": [ "This Colab demonstrates recognizing actions in video data using the\n", "[tfhub.dev/deepmind/i3d-kinetics-400/1](https://tfhub.dev/deepmind/i3d-kinetics-400/1) module. More models to detect actions in videos can be found [here](https://tfhub.dev/s?module-type=video-classification).\n", "\n", "The underlying model is described in the paper \"[Quo Vadis, Action Recognition? A New\n", "Model and the Kinetics Dataset](https://arxiv.org/abs/1705.07750)\" by Joao\n", "Carreira and Andrew Zisserman. The paper was posted on arXiv in May 2017, and\n", "was published as a CVPR 2017 conference paper.\n", "The source code is publicly available on\n", "[GitHub](https://github.com/deepmind/kinetics-i3d).\n", "\n", "\"Quo Vadis\" introduced a new architecture for video classification, the Inflated\n", "3D Convnet or I3D. This architecture achieved state-of-the-art results on the UCF101\n", "and HMDB51 datasets from fine-tuning these models. I3D models pre-trained on Kinetics\n", "also placed first in the CVPR 2017 [Charades challenge](http://vuchallenge.org/charades.html).\n", "\n", "The original module was trained on the [Kinetics-400 dataset](https://www.deepmind.com/open-source/kinetics)\n", "and knows about 400 different actions.\n", "Labels for these actions can be found in the\n", "[label map file](https://github.com/deepmind/kinetics-i3d/blob/master/data/label_map.txt).\n", "\n", "In this Colab we will use it to recognize activities in videos from the UCF101 dataset."
] }, { "cell_type": "markdown", "metadata": { "id": "R_0xc2jyNGRp" }, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:40:58.293704Z", "iopub.status.busy": "2023-05-23T08:40:58.293496Z", "iopub.status.idle": "2023-05-23T08:41:08.917601Z", "shell.execute_reply": "2023-05-23T08:41:08.916680Z" }, "id": "mOHMWsFnITdi" }, "outputs": [], "source": [ "!pip install -q imageio\n", "!pip install -q opencv-python\n", "!pip install -q git+https://github.com/tensorflow/docs" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "cellView": "both", "execution": { "iopub.execute_input": "2023-05-23T08:41:08.921575Z", "iopub.status.busy": "2023-05-23T08:41:08.921305Z", "iopub.status.idle": "2023-05-23T08:41:11.292367Z", "shell.execute_reply": "2023-05-23T08:41:11.291731Z" }, "id": "USf0UvkYIlKo" }, "outputs": [], "source": [ "#@title Import the necessary modules\n", "# TensorFlow and TF-Hub modules.\n", "from absl import logging\n", "\n", "import tensorflow as tf\n", "import tensorflow_hub as hub\n", "from tensorflow_docs.vis import embed\n", "\n", "logging.set_verbosity(logging.ERROR)\n", "\n", "# Some modules to help with reading the UCF101 dataset.\n", "import random\n", "import re\n", "import os\n", "import tempfile\n", "import ssl\n", "import cv2\n", "import numpy as np\n", "\n", "# Some modules to display an animation using imageio.\n", "import imageio\n", "from IPython import display\n", "\n", "from urllib import request # requires python3" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "cellView": "both", "execution": { "iopub.execute_input": "2023-05-23T08:41:11.296164Z", "iopub.status.busy": "2023-05-23T08:41:11.295791Z", "iopub.status.idle": "2023-05-23T08:41:11.305151Z", "shell.execute_reply": "2023-05-23T08:41:11.304593Z" }, "id": "IuMMS3TGdws7" }, "outputs": [], "source": [ "#@title Helper functions for the UCF101 dataset\n", "\n", "# Utilities to 
fetch videos from UCF101 dataset\n", "UCF_ROOT = \"https://www.crcv.ucf.edu/THUMOS14/UCF101/UCF101/\"\n", "_VIDEO_LIST = None\n", "_CACHE_DIR = tempfile.mkdtemp()\n", "# As of July 2020, crcv.ucf.edu doesn't use a certificate accepted by the\n", "# default Colab environment anymore.\n", "unverified_context = ssl._create_unverified_context()\n", "\n", "def list_ucf_videos():\n", "  \"\"\"Lists videos available in UCF101 dataset.\"\"\"\n", "  global _VIDEO_LIST\n", "  if not _VIDEO_LIST:\n", "    index = request.urlopen(UCF_ROOT, context=unverified_context).read().decode(\"utf-8\")\n", "    videos = re.findall(r\"(v_[\\w_]+\\.avi)\", index)\n", "    _VIDEO_LIST = sorted(set(videos))\n", "  return list(_VIDEO_LIST)\n", "\n", "def fetch_ucf_video(video):\n", "  \"\"\"Fetches a video and caches it in the local filesystem.\"\"\"\n", "  cache_path = os.path.join(_CACHE_DIR, video)\n", "  if not os.path.exists(cache_path):\n", "    urlpath = request.urljoin(UCF_ROOT, video)\n", "    print(\"Fetching %s => %s\" % (urlpath, cache_path))\n", "    data = request.urlopen(urlpath, context=unverified_context).read()\n", "    with open(cache_path, \"wb\") as f:\n", "      f.write(data)\n", "  return cache_path\n", "\n", "# Utilities to open video files using CV2\n", "def crop_center_square(frame):\n", "  y, x = frame.shape[0:2]\n", "  min_dim = min(y, x)\n", "  start_x = (x // 2) - (min_dim // 2)\n", "  start_y = (y // 2) - (min_dim // 2)\n", "  return frame[start_y:start_y+min_dim,start_x:start_x+min_dim]\n", "\n", "def load_video(path, max_frames=0, resize=(224, 224)):\n", "  cap = cv2.VideoCapture(path)\n", "  frames = []\n", "  try:\n", "    while True:\n", "      ret, frame = cap.read()\n", "      if not ret:\n", "        break\n", "      frame = crop_center_square(frame)\n", "      frame = cv2.resize(frame, resize)\n", "      # OpenCV returns BGR frames; reorder the channels to RGB.\n", "      frame = frame[:, :, [2, 1, 0]]\n", "      frames.append(frame)\n", "\n", "      if len(frames) == max_frames:\n", "        break\n", "  finally:\n", "    cap.release()\n", "  return np.array(frames) / 255.0\n", "\n", "def to_gif(images):\n", "  converted_images = 
np.clip(images * 255, 0, 255).astype(np.uint8)\n", "  imageio.mimsave('./animation.gif', converted_images, duration=40)\n", "  return embed.embed_file('./animation.gif')" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "cellView": "form", "execution": { "iopub.execute_input": "2023-05-23T08:41:11.308237Z", "iopub.status.busy": "2023-05-23T08:41:11.308029Z", "iopub.status.idle": "2023-05-23T08:41:11.440833Z", "shell.execute_reply": "2023-05-23T08:41:11.440260Z" }, "id": "pIKTs-KneUfz" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Found 400 labels.\n" ] } ], "source": [ "#@title Get the kinetics-400 labels\n", "# Get the kinetics-400 action labels from the GitHub repository.\n", "KINETICS_URL = \"https://raw.githubusercontent.com/deepmind/kinetics-i3d/master/data/label_map.txt\"\n", "with request.urlopen(KINETICS_URL) as obj:\n", "  labels = [line.decode(\"utf-8\").strip() for line in obj.readlines()]\n", "print(\"Found %d labels.\" % len(labels))" ] }, { "cell_type": "markdown", "metadata": { "id": "GBvmjVICIp3W" }, "source": [ "# Using the UCF101 dataset" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:11.477322Z", "iopub.status.busy": "2023-05-23T08:41:11.477088Z", "iopub.status.idle": "2023-05-23T08:41:12.818723Z", "shell.execute_reply": "2023-05-23T08:41:12.818064Z" }, "id": "V-QcxdhLIfi2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Found 13320 videos in 101 categories.\n", "ApplyEyeMakeup 145 videos (v_ApplyEyeMakeup_g01_c01.avi, v_ApplyEyeMakeup_g01_c02.avi, ...)\n", "ApplyLipstick 114 videos (v_ApplyLipstick_g01_c01.avi, v_ApplyLipstick_g01_c02.avi, ...)\n", "Archery 145 videos (v_Archery_g01_c01.avi, v_Archery_g01_c02.avi, ...)\n", "BabyCrawling 132 videos (v_BabyCrawling_g01_c01.avi, v_BabyCrawling_g01_c02.avi, ...)\n", "BalanceBeam 108 videos (v_BalanceBeam_g01_c01.avi, v_BalanceBeam_g01_c02.avi, ...)\n", "BandMarching 
155 videos (v_BandMarching_g01_c01.avi, v_BandMarching_g01_c02.avi, ...)\n", "BaseballPitch 150 videos (v_BaseballPitch_g01_c01.avi, v_BaseballPitch_g01_c02.avi, ...)\n", "BasketballDunk 131 videos (v_BasketballDunk_g01_c01.avi, v_BasketballDunk_g01_c02.avi, ...)\n", "Basketball 134 videos (v_Basketball_g01_c01.avi, v_Basketball_g01_c02.avi, ...)\n", "BenchPress 160 videos (v_BenchPress_g01_c01.avi, v_BenchPress_g01_c02.avi, ...)\n", "Biking 134 videos (v_Biking_g01_c01.avi, v_Biking_g01_c02.avi, ...)\n", "Billiards 150 videos (v_Billiards_g01_c01.avi, v_Billiards_g01_c02.avi, ...)\n", "BlowDryHair 131 videos (v_BlowDryHair_g01_c01.avi, v_BlowDryHair_g01_c02.avi, ...)\n", "BlowingCandles 109 videos (v_BlowingCandles_g01_c01.avi, v_BlowingCandles_g01_c02.avi, ...)\n", "BodyWeightSquats 112 videos (v_BodyWeightSquats_g01_c01.avi, v_BodyWeightSquats_g01_c02.avi, ...)\n", "Bowling 155 videos (v_Bowling_g01_c01.avi, v_Bowling_g01_c02.avi, ...)\n", "BoxingPunchingBag 163 videos (v_BoxingPunchingBag_g01_c01.avi, v_BoxingPunchingBag_g01_c02.avi, ...)\n", "BoxingSpeedBag 134 videos (v_BoxingSpeedBag_g01_c01.avi, v_BoxingSpeedBag_g01_c02.avi, ...)\n", "BreastStroke 101 videos (v_BreastStroke_g01_c01.avi, v_BreastStroke_g01_c02.avi, ...)\n", "BrushingTeeth 131 videos (v_BrushingTeeth_g01_c01.avi, v_BrushingTeeth_g01_c02.avi, ...)\n", "CleanAndJerk 112 videos (v_CleanAndJerk_g01_c01.avi, v_CleanAndJerk_g01_c02.avi, ...)\n", "CliffDiving 138 videos (v_CliffDiving_g01_c01.avi, v_CliffDiving_g01_c02.avi, ...)\n", "CricketBowling 139 videos (v_CricketBowling_g01_c01.avi, v_CricketBowling_g01_c02.avi, ...)\n", "CricketShot 167 videos (v_CricketShot_g01_c01.avi, v_CricketShot_g01_c02.avi, ...)\n", "CuttingInKitchen 110 videos (v_CuttingInKitchen_g01_c01.avi, v_CuttingInKitchen_g01_c02.avi, ...)\n", "Diving 150 videos (v_Diving_g01_c01.avi, v_Diving_g01_c02.avi, ...)\n", "Drumming 161 videos (v_Drumming_g01_c01.avi, v_Drumming_g01_c02.avi, ...)\n", "Fencing 111 videos 
(v_Fencing_g01_c01.avi, v_Fencing_g01_c02.avi, ...)\n", "FieldHockeyPenalty 126 videos (v_FieldHockeyPenalty_g01_c01.avi, v_FieldHockeyPenalty_g01_c02.avi, ...)\n", "FloorGymnastics 125 videos (v_FloorGymnastics_g01_c01.avi, v_FloorGymnastics_g01_c02.avi, ...)\n", "FrisbeeCatch 126 videos (v_FrisbeeCatch_g01_c01.avi, v_FrisbeeCatch_g01_c02.avi, ...)\n", "FrontCrawl 137 videos (v_FrontCrawl_g01_c01.avi, v_FrontCrawl_g01_c02.avi, ...)\n", "GolfSwing 139 videos (v_GolfSwing_g01_c01.avi, v_GolfSwing_g01_c02.avi, ...)\n", "Haircut 130 videos (v_Haircut_g01_c01.avi, v_Haircut_g01_c02.avi, ...)\n", "HammerThrow 150 videos (v_HammerThrow_g01_c01.avi, v_HammerThrow_g01_c02.avi, ...)\n", "Hammering 140 videos (v_Hammering_g01_c01.avi, v_Hammering_g01_c02.avi, ...)\n", "HandstandPushups 128 videos (v_HandstandPushups_g01_c01.avi, v_HandstandPushups_g01_c02.avi, ...)\n", "HandstandWalking 111 videos (v_HandstandWalking_g01_c01.avi, v_HandstandWalking_g01_c02.avi, ...)\n", "HeadMassage 147 videos (v_HeadMassage_g01_c01.avi, v_HeadMassage_g01_c02.avi, ...)\n", "HighJump 123 videos (v_HighJump_g01_c01.avi, v_HighJump_g01_c02.avi, ...)\n", "HorseRace 124 videos (v_HorseRace_g01_c01.avi, v_HorseRace_g01_c02.avi, ...)\n", "HorseRiding 164 videos (v_HorseRiding_g01_c01.avi, v_HorseRiding_g01_c02.avi, ...)\n", "HulaHoop 125 videos (v_HulaHoop_g01_c01.avi, v_HulaHoop_g01_c02.avi, ...)\n", "IceDancing 158 videos (v_IceDancing_g01_c01.avi, v_IceDancing_g01_c02.avi, ...)\n", "JavelinThrow 117 videos (v_JavelinThrow_g01_c01.avi, v_JavelinThrow_g01_c02.avi, ...)\n", "JugglingBalls 121 videos (v_JugglingBalls_g01_c01.avi, v_JugglingBalls_g01_c02.avi, ...)\n", "JumpRope 144 videos (v_JumpRope_g01_c01.avi, v_JumpRope_g01_c02.avi, ...)\n", "JumpingJack 123 videos (v_JumpingJack_g01_c01.avi, v_JumpingJack_g01_c02.avi, ...)\n", "Kayaking 141 videos (v_Kayaking_g01_c01.avi, v_Kayaking_g01_c02.avi, ...)\n", "Knitting 123 videos (v_Knitting_g01_c01.avi, v_Knitting_g01_c02.avi, ...)\n", "LongJump 131 
videos (v_LongJump_g01_c01.avi, v_LongJump_g01_c02.avi, ...)\n", "Lunges 127 videos (v_Lunges_g01_c01.avi, v_Lunges_g01_c02.avi, ...)\n", "MilitaryParade 125 videos (v_MilitaryParade_g01_c01.avi, v_MilitaryParade_g01_c02.avi, ...)\n", "Mixing 136 videos (v_Mixing_g01_c01.avi, v_Mixing_g01_c02.avi, ...)\n", "MoppingFloor 110 videos (v_MoppingFloor_g01_c01.avi, v_MoppingFloor_g01_c02.avi, ...)\n", "Nunchucks 132 videos (v_Nunchucks_g01_c01.avi, v_Nunchucks_g01_c02.avi, ...)\n", "ParallelBars 114 videos (v_ParallelBars_g01_c01.avi, v_ParallelBars_g01_c02.avi, ...)\n", "PizzaTossing 113 videos (v_PizzaTossing_g01_c01.avi, v_PizzaTossing_g01_c02.avi, ...)\n", "PlayingCello 164 videos (v_PlayingCello_g01_c01.avi, v_PlayingCello_g01_c02.avi, ...)\n", "PlayingDaf 151 videos (v_PlayingDaf_g01_c01.avi, v_PlayingDaf_g01_c02.avi, ...)\n", "PlayingDhol 164 videos (v_PlayingDhol_g01_c01.avi, v_PlayingDhol_g01_c02.avi, ...)\n", "PlayingFlute 155 videos (v_PlayingFlute_g01_c01.avi, v_PlayingFlute_g01_c02.avi, ...)\n", "PlayingGuitar 160 videos (v_PlayingGuitar_g01_c01.avi, v_PlayingGuitar_g01_c02.avi, ...)\n", "PlayingPiano 105 videos (v_PlayingPiano_g01_c01.avi, v_PlayingPiano_g01_c02.avi, ...)\n", "PlayingSitar 157 videos (v_PlayingSitar_g01_c01.avi, v_PlayingSitar_g01_c02.avi, ...)\n", "PlayingTabla 111 videos (v_PlayingTabla_g01_c01.avi, v_PlayingTabla_g01_c02.avi, ...)\n", "PlayingViolin 100 videos (v_PlayingViolin_g01_c01.avi, v_PlayingViolin_g01_c02.avi, ...)\n", "PoleVault 149 videos (v_PoleVault_g01_c01.avi, v_PoleVault_g01_c02.avi, ...)\n", "PommelHorse 123 videos (v_PommelHorse_g01_c01.avi, v_PommelHorse_g01_c02.avi, ...)\n", "PullUps 100 videos (v_PullUps_g01_c01.avi, v_PullUps_g01_c02.avi, ...)\n", "Punch 160 videos (v_Punch_g01_c01.avi, v_Punch_g01_c02.avi, ...)\n", "PushUps 102 videos (v_PushUps_g01_c01.avi, v_PushUps_g01_c02.avi, ...)\n", "Rafting 111 videos (v_Rafting_g01_c01.avi, v_Rafting_g01_c02.avi, ...)\n", "RockClimbingIndoor 144 videos 
(v_RockClimbingIndoor_g01_c01.avi, v_RockClimbingIndoor_g01_c02.avi, ...)\n", "RopeClimbing 119 videos (v_RopeClimbing_g01_c01.avi, v_RopeClimbing_g01_c02.avi, ...)\n", "Rowing 137 videos (v_Rowing_g01_c01.avi, v_Rowing_g01_c02.avi, ...)\n", "SalsaSpin 133 videos (v_SalsaSpin_g01_c01.avi, v_SalsaSpin_g01_c02.avi, ...)\n", "ShavingBeard 161 videos (v_ShavingBeard_g01_c01.avi, v_ShavingBeard_g01_c02.avi, ...)\n", "Shotput 144 videos (v_Shotput_g01_c01.avi, v_Shotput_g01_c02.avi, ...)\n", "SkateBoarding 120 videos (v_SkateBoarding_g01_c01.avi, v_SkateBoarding_g01_c02.avi, ...)\n", "Skiing 135 videos (v_Skiing_g01_c01.avi, v_Skiing_g01_c02.avi, ...)\n", "Skijet 100 videos (v_Skijet_g01_c01.avi, v_Skijet_g01_c02.avi, ...)\n", "SkyDiving 110 videos (v_SkyDiving_g01_c01.avi, v_SkyDiving_g01_c02.avi, ...)\n", "SoccerJuggling 147 videos (v_SoccerJuggling_g01_c01.avi, v_SoccerJuggling_g01_c02.avi, ...)\n", "SoccerPenalty 137 videos (v_SoccerPenalty_g01_c01.avi, v_SoccerPenalty_g01_c02.avi, ...)\n", "StillRings 112 videos (v_StillRings_g01_c01.avi, v_StillRings_g01_c02.avi, ...)\n", "SumoWrestling 116 videos (v_SumoWrestling_g01_c01.avi, v_SumoWrestling_g01_c02.avi, ...)\n", "Surfing 126 videos (v_Surfing_g01_c01.avi, v_Surfing_g01_c02.avi, ...)\n", "Swing 131 videos (v_Swing_g01_c01.avi, v_Swing_g01_c02.avi, ...)\n", "TableTennisShot 140 videos (v_TableTennisShot_g01_c01.avi, v_TableTennisShot_g01_c02.avi, ...)\n", "TaiChi 100 videos (v_TaiChi_g01_c01.avi, v_TaiChi_g01_c02.avi, ...)\n", "TennisSwing 166 videos (v_TennisSwing_g01_c01.avi, v_TennisSwing_g01_c02.avi, ...)\n", "ThrowDiscus 130 videos (v_ThrowDiscus_g01_c01.avi, v_ThrowDiscus_g01_c02.avi, ...)\n", "TrampolineJumping 119 videos (v_TrampolineJumping_g01_c01.avi, v_TrampolineJumping_g01_c02.avi, ...)\n", "Typing 136 videos (v_Typing_g01_c01.avi, v_Typing_g01_c02.avi, ...)\n", "UnevenBars 104 videos (v_UnevenBars_g01_c01.avi, v_UnevenBars_g01_c02.avi, ...)\n", "VolleyballSpiking 116 videos 
(v_VolleyballSpiking_g01_c01.avi, v_VolleyballSpiking_g01_c02.avi, ...)\n", "WalkingWithDog 123 videos (v_WalkingWithDog_g01_c01.avi, v_WalkingWithDog_g01_c02.avi, ...)\n", "WallPushups 130 videos (v_WallPushups_g01_c01.avi, v_WallPushups_g01_c02.avi, ...)\n", "WritingOnBoard 152 videos (v_WritingOnBoard_g01_c01.avi, v_WritingOnBoard_g01_c02.avi, ...)\n", "YoYo 128 videos (v_YoYo_g01_c01.avi, v_YoYo_g01_c02.avi, ...)\n" ] } ], "source": [ "# Get the list of videos in the dataset.\n", "ucf_videos = list_ucf_videos()\n", "\n", "# Group videos by category; file names look like v_<Category>_gXX_cYY.avi.\n", "categories = {}\n", "for video in ucf_videos:\n", "  category = video[2:-12]\n", "  if category not in categories:\n", "    categories[category] = []\n", "  categories[category].append(video)\n", "print(\"Found %d videos in %d categories.\" % (len(ucf_videos), len(categories)))\n", "\n", "for category, sequences in categories.items():\n", "  summary = \", \".join(sequences[:2])\n", "  print(\"%-20s %4d videos (%s, ...)\" % (category, len(sequences), summary))\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:12.821775Z", "iopub.status.busy": "2023-05-23T08:41:12.821526Z", "iopub.status.idle": "2023-05-23T08:41:13.346467Z", "shell.execute_reply": "2023-05-23T08:41:13.345660Z" }, "id": "c0ZvVDruN2nU" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Fetching https://www.crcv.ucf.edu/THUMOS14/UCF101/UCF101/v_CricketShot_g04_c02.avi => /tmpfs/tmp/tmpqxktb5xo/v_CricketShot_g04_c02.avi\n" ] } ], "source": [ "# Get a sample cricket video.\n", "video_path = fetch_ucf_video(\"v_CricketShot_g04_c02.avi\")\n", "sample_video = load_video(video_path)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:13.350443Z", "iopub.status.busy": "2023-05-23T08:41:13.349896Z", "iopub.status.idle": "2023-05-23T08:41:13.355785Z", "shell.execute_reply": "2023-05-23T08:41:13.355261Z" }, "id": "hASLA90YFPTO" }, 
"outputs": [ { "data": { "text/plain": [ "(116, 224, 224, 3)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample_video.shape" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:13.360714Z", "iopub.status.busy": "2023-05-23T08:41:13.360265Z", "iopub.status.idle": "2023-05-23T08:41:19.068163Z", "shell.execute_reply": "2023-05-23T08:41:19.067506Z" }, "id": "POf5XgffvXlD" }, "outputs": [], "source": [ "i3d = hub.load(\"https://tfhub.dev/deepmind/i3d-kinetics-400/1\").signatures['default']" ] }, { "cell_type": "markdown", "metadata": { "id": "mDXgaOD1zhMP" }, "source": [ "Run the I3D model and print the top-5 action predictions." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:19.072448Z", "iopub.status.busy": "2023-05-23T08:41:19.071852Z", "iopub.status.idle": "2023-05-23T08:41:19.076046Z", "shell.execute_reply": "2023-05-23T08:41:19.075449Z" }, "id": "3mTbqA5JGYUx" }, "outputs": [], "source": [ "def predict(sample_video):\n", "  # Add a batch axis to the sample video.\n", "  model_input = tf.constant(sample_video, dtype=tf.float32)[tf.newaxis, ...]\n", "\n", "  logits = i3d(model_input)['default'][0]\n", "  probabilities = tf.nn.softmax(logits)\n", "\n", "  print(\"Top 5 actions:\")\n", "  for i in np.argsort(probabilities)[::-1][:5]:\n", "    print(f\" {labels[i]:22}: {probabilities[i] * 100:5.2f}%\")" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:19.079229Z", "iopub.status.busy": "2023-05-23T08:41:19.078808Z", "iopub.status.idle": "2023-05-23T08:41:22.626220Z", "shell.execute_reply": "2023-05-23T08:41:22.625574Z" }, "id": "ykaXQcGRvK4E" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Top 5 actions:\n", " playing cricket : 97.77%\n", " skateboarding : 0.71%\n", " robot dancing : 0.56%\n", " 
roller skating : 0.56%\n", " golf putting : 0.13%\n" ] } ], "source": [ "predict(sample_video)" ] }, { "cell_type": "markdown", "metadata": { "id": "PHsq0lHXCsD4" }, "source": [ "Now try a new video, from: https://commons.wikimedia.org/wiki/Category:Videos_of_sports\n", "\n", "How about [this video](https://commons.wikimedia.org/wiki/File:End_of_a_jam.ogv) by Patrick Gillett: " ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:22.629445Z", "iopub.status.busy": "2023-05-23T08:41:22.629193Z", "iopub.status.idle": "2023-05-23T08:41:25.110834Z", "shell.execute_reply": "2023-05-23T08:41:25.110017Z" }, "id": "p-mZ9fFPCoNq" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " % Total % Received % Xferd Average Speed Time Time Time Current\r\n", " Dload Upload Total Spent Left Speed\r\n", "\r", " 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 8 55.0M 8 4564k 0 0 8437k 0 0:00:06 --:--:-- 0:00:06 8421k" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", " 60 55.0M 60 33.5M 0 0 21.7M 0 0:00:02 0:00:01 0:00:01 21.7M" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "100 55.0M 100 55.0M 0 0 24.1M 0 0:00:02 0:00:02 --:--:-- 24.1M\r\n" ] } ], "source": [ "!curl -O https://upload.wikimedia.org/wikipedia/commons/8/86/End_of_a_jam.ogv" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:25.114578Z", "iopub.status.busy": "2023-05-23T08:41:25.114305Z", "iopub.status.idle": "2023-05-23T08:41:25.117985Z", "shell.execute_reply": "2023-05-23T08:41:25.117304Z" }, "id": "lpLmE8rjEbAF" }, "outputs": [], "source": [ "video_path = \"End_of_a_jam.ogv\"" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:25.120693Z", "iopub.status.busy": "2023-05-23T08:41:25.120464Z", 
"iopub.status.idle": "2023-05-23T08:41:27.142733Z", "shell.execute_reply": "2023-05-23T08:41:27.142068Z" }, "id": "CHZJ9qTLErhV" }, "outputs": [ { "data": { "text/plain": [ "(100, 224, 224, 3)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample_video = load_video(video_path)[:100]\n", "sample_video.shape" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:27.145872Z", "iopub.status.busy": "2023-05-23T08:41:27.145637Z", "iopub.status.idle": "2023-05-23T08:41:33.828898Z", "shell.execute_reply": "2023-05-23T08:41:33.828185Z" }, "id": "2ZNLkEZ9Er-c" }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "to_gif(sample_video)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2023-05-23T08:41:33.875839Z", "iopub.status.busy": "2023-05-23T08:41:33.875313Z", "iopub.status.idle": "2023-05-23T08:41:35.686535Z", "shell.execute_reply": "2023-05-23T08:41:35.685771Z" }, "id": "yskHIRbxEtjS" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Top 5 actions:\n", " roller skating : 96.85%\n", " playing volleyball : 1.63%\n", " skateboarding : 0.21%\n", " playing ice hockey : 0.20%\n", " playing basketball : 0.16%\n" ] } ], "source": [ "predict(sample_video)" ] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [ "x8Q7Un821X1A" ], "name": "Action Recognition on the UCF101 Dataset", "private_outputs": true, "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, 
"nbformat_minor": 0 }