SCV dataset

Overview

StarCraft Video (SCV) is a dataset for generative models of video, based on the
StarCraft II Learning Environment.
For details, please refer to the accompanying publication:

"Towards Accurate Generative Models of Video: New Metrics & Challenges",
Unterthiner, van Steenkiste, Kurach, Marinier, Michalski, Gelly, arXiv (2018)

Download Links

Dataset names:
  Brawl_64x64_png
  Brawl_128x128_png
  CollectMineralShards_64x64_png
  CollectMineralShards_128x128_png
  MoveUnitToBorder_64x64_png
  MoveUnitToBorder_128x128_png
  RoadTripWithMedivac_64x64_png
  RoadTripWithMedivac_128x128_png

File locations:
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/test-00000-of-00001.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/valid-00000-of-00001.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00000-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00001-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00002-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00003-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00004-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00005-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00006-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00007-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00008-of-00010.tfrecords
https://storage.googleapis.com/scv_dataset/data/${DATASET_NAME}/train-00009-of-00010.tfrecords
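For convenience, all files of one scenario can be fetched programmatically. The
following is a minimal sketch (not part of the official release) that builds the
URL list from the pattern above and downloads the files with Python's standard
library; DATASET_NAME and OUT_DIR are placeholders to substitute. Using gsutil
or curl on the same URLs works just as well.

import os
import urllib.request

BASE_URL = "https://storage.googleapis.com/scv_dataset/data"
DATASET_NAME = "CollectMineralShards_64x64_png"   # any name from the list above
OUT_DIR = os.path.join("scv_data", DATASET_NAME)  # hypothetical local target directory

# One test shard, one validation shard and ten training shards per scenario.
files = (["test-00000-of-00001.tfrecords", "valid-00000-of-00001.tfrecords"]
         + ["train-%05d-of-00010.tfrecords" % i for i in range(10)])

os.makedirs(OUT_DIR, exist_ok=True)
for name in files:
    url = "%s/%s/%s" % (BASE_URL, DATASET_NAME, name)
    urllib.request.urlretrieve(url, os.path.join(OUT_DIR, name))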

Example frames

Data Format

The data distribution contains a subdirectory for each SCV scenario with videos
rendered at a given resolution (e.g., "RoadTripWithMedivac_64x64_png" contains
the scenario RoadTripWithMedivac at 64x64 resolution).

Each subdirectory contains the following files:
 train-0000x-of-00010.tfrecords: 10 shards of training input, each containing 1000 videos
 valid-00000-of-00001.tfrecords: validation set, containing 2000 videos
 test-00000-of-00001.tfrecords: test set, containing 2000 videos
Each of these files is in TFRecord format and contains one
tf.train.SequenceExample per video. A SequenceExample holds context features
with metadata about the video, as well as a feature list that contains the
actual video frames.
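
To sanity-check a downloaded shard (e.g., that a training shard really contains
1000 videos), you can count the serialized records without parsing them. This is
a small sketch using the TF 1.x record iterator; input_file is a placeholder
path:

import tensorflow as tf

# Placeholder: path to one downloaded shard.
input_file = "scv_data/CollectMineralShards_64x64_png/train-00000-of-00010.tfrecords"

# Each record is one serialized tf.train.SequenceExample, i.e. one video.
num_videos = sum(1 for _ in tf.python_io.tf_record_iterator(input_file))
print(num_videos)  # expected: 1000 for a train shard, 2000 for valid/test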

Many of the context features are used for internal housekeeping and can be
ignored when working with the dataset. The following context features are
available and may be of general interest:
context_features = {
  "game_duration_loops": tf.FixedLenFeature([1], tf.int64),     # number of game loops in the original SC2Replay file for this record
  "game_duration_seconds": tf.FixedLenFeature([1], tf.float32), # game duration in seconds of the original SC2Replay file
  "game_version": tf.FixedLenFeature([1], tf.string),           # version of the game used to render the replay
  "n_steps": tf.FixedLenFeature([1], tf.int64),                 # number of frames in the video (note: different from game_duration_loops)
  "screen_size": tf.FixedLenFeature([2], tf.int64),             # resolution of the video
}
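These context features can also be read on their own, which is handy for quick
metadata inspection without decoding any frames. A short sketch in the same
TF 1.x style, reusing the context_features dict above (input_file is again a
placeholder path to a SCV tfrecords file):

import tensorflow as tf

def decode_context(serialized_example):
  # Parse only the context features; the PNG frame data is not touched.
  ctx_feat, _ = tf.parse_single_sequence_example(
      serialized_example, context_features=context_features)
  return ctx_feat

dataset = tf.data.TFRecordDataset(input_file)
metadata = dataset.map(decode_context).make_one_shot_iterator().get_next()

with tf.Session() as sess:
  print(sess.run(metadata["n_steps"]))  # number of frames in the first video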
The actual video data is stored as a feature list. Each entry in the list is a
PNG-encoded frame of the video. Overall, you can read a single video (and its
context information) using the following code:
import tensorflow as tf  # this example uses the TensorFlow 1.x API

def decode_example(serialized_example):
  context_features = {
        "game_duration_loops": tf.FixedLenFeature([1], tf.int64),
        "game_duration_seconds": tf.FixedLenFeature([1], tf.float32),
        "game_version": tf.FixedLenFeature([1], tf.string),
        "n_steps": tf.FixedLenFeature([1], tf.int64),
        "screen_size": tf.FixedLenFeature([2], tf.int64),
  }

  sequence_features = {
      'rgb_screen': tf.FixedLenSequenceFeature([], tf.string),
  }

  ctx_feat, seq_feat = tf.parse_single_sequence_example(
      serialized_example,
      context_features=context_features,
      sequence_features=sequence_features)

  video_frames = tf.map_fn(tf.image.decode_png, seq_feat['rgb_screen'],
                           dtype=tf.uint8)

  # optionally, we could return ctx_feat as well
  return video_frames

# input_file contains the path to a SCV tfrecords file
dataset = tf.data.TFRecordDataset(input_file)
dataset = dataset.map(decode_example)
dataset = dataset.make_one_shot_iterator().get_next()

with tf.Session() as sess:
  video_frames = sess.run(dataset)
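
The snippet above uses the TensorFlow 1.x API (tf.FixedLenFeature,
tf.parse_single_sequence_example, tf.Session). Under TensorFlow 2.x the same
parsing can be written with the tf.io namespace and eager iteration; the
following is a rough sketch under that assumption, not code shipped with the
dataset:

import tensorflow as tf  # TensorFlow 2.x

def decode_example_v2(serialized_example):
  context_features = {
      "game_duration_loops": tf.io.FixedLenFeature([1], tf.int64),
      "game_duration_seconds": tf.io.FixedLenFeature([1], tf.float32),
      "game_version": tf.io.FixedLenFeature([1], tf.string),
      "n_steps": tf.io.FixedLenFeature([1], tf.int64),
      "screen_size": tf.io.FixedLenFeature([2], tf.int64),
  }
  sequence_features = {
      "rgb_screen": tf.io.FixedLenSequenceFeature([], tf.string),
  }
  ctx_feat, seq_feat = tf.io.parse_single_sequence_example(
      serialized_example,
      context_features=context_features,
      sequence_features=sequence_features)
  # Decode every PNG frame into a uint8 tensor of shape [height, width, channels].
  video_frames = tf.map_fn(tf.io.decode_png, seq_feat["rgb_screen"],
                           fn_output_signature=tf.uint8)
  return video_frames

# input_file contains the path to a SCV tfrecords file
dataset = tf.data.TFRecordDataset(input_file).map(decode_example_v2)
for video_frames in dataset.take(1):
  print(video_frames.shape)  # (n_steps, height, width, channels)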

Legalese

This dataset is released under a CC-BY license.

Citation

"Towards Accurate Generative Models of Video: New Metrics & Challenges",
Unterthiner, van Steenkiste, Kurach, Marinier, Michalski, Gelly, arXiv (2018)