dataset module

class dataset.Dataset(client, uuid)

Bases: object

static add(client, name, files, description=None, split=None, max_degree=10, notebook=True, user_config=None, use_gcs=True)

Add a new dataset

Args:

client: API client
name (str): name of the new dataset
files: list of paths to the files to add to the dataset, or path to a directory containing all the files to add
description: dict description of the dataset, defaults to None
split: dict representing the split, defaults to None
max_degree (int): maximum number of neighbours in the data, defaults to 10
user_config (dict): if provided, specifies a custom reader config, defaults to None
use_gcs (bool): use Google Cloud Storage uploads if available from the server, defaults to True

Returns:

The new dataset. Its conversion to npy_indiv starts automatically.
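A call to add might look like the sketch below. No server is available here, so FakeDataset is a hypothetical stand-in that only mirrors the documented signature and records its arguments; the client object, dataset name, file paths and uuid are placeholders as well. With the real class, the call at the bottom would be made on Dataset instead.

```python
class FakeDataset:
    """Hypothetical stand-in mirroring the documented Dataset.add signature."""

    def __init__(self, client, uuid):
        self.client = client
        self.uuid = uuid
        self.add_kwargs = {}

    @staticmethod
    def add(client, name, files, description=None, split=None,
            max_degree=10, notebook=True, user_config=None, use_gcs=True):
        # The real method uploads the files and starts the npy_indiv
        # conversion; this stub only records what it was given.
        dataset = FakeDataset(client, uuid="placeholder-uuid")
        dataset.add_kwargs = {
            "name": name,
            "files": files,
            "description": description,
            "split": split,
            "max_degree": max_degree,
            "notebook": notebook,
            "user_config": user_config,
            "use_gcs": use_gcs,
        }
        return dataset


client = object()  # placeholder for a real API client

dataset = FakeDataset.add(
    client,
    name="my-dataset",
    files=["data/a.csv", "data/b.csv"],  # a single directory path also works per the docs
    max_degree=5,
    use_gcs=False,
)
print(dataset.add_kwargs["name"], dataset.add_kwargs["max_degree"])
```

Arguments left out of the call (description, split, notebook, user_config) keep the defaults shown in the signature.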

convert(target_format='npy_indiv', user_config=None)

Convert the dataset

Args:

target_format (str): for now only npy_indiv is supported, defaults to npy_indiv
user_config (dict): if provided, specifies a custom reader config, defaults to None

Returns:

API response, as AttrDict
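Since only npy_indiv is supported for now, a caller may want to reject other values before contacting the server. The wrapper below is a sketch of that idea, not part of the module: checked_convert, SUPPORTED_TARGET_FORMATS and StubDataset are hypothetical names, and the stub returns a plain dict rather than a real API response.

```python
SUPPORTED_TARGET_FORMATS = ("npy_indiv",)  # per the docs above, the only supported value


def checked_convert(dataset, target_format="npy_indiv", user_config=None):
    """Validate target_format locally, then delegate to dataset.convert."""
    if target_format not in SUPPORTED_TARGET_FORMATS:
        raise ValueError(f"unsupported target_format: {target_format!r}")
    return dataset.convert(target_format=target_format, user_config=user_config)


class StubDataset:
    """Hypothetical stand-in exposing the documented convert signature."""

    def convert(self, target_format="npy_indiv", user_config=None):
        # Echo the arguments back instead of calling an API.
        return {"target_format": target_format, "user_config": user_config}


result = checked_convert(StubDataset())
print(result["target_format"])
```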

delete(force=False)

Delete the dataset

Args:

force (bool): force deletion of dependent resources, defaults to False

file(file_id)

Get verbose info about a file

Args:

file_id (str): id of the file

Returns:

AttrDict, the API response

file_delete(file_id)

Delete a file from the dataset

Args:

file_id (str): id of the file

Returns:

AttrDict, API response

property files

List of files

files_add(files, upload_url=None, use_gcs=True)

format(data_format)

Get verbose information about a format

Args:

data_format (str): data format

Returns:

API response as AttrDict

format_delete(data_format)

Delete a format from this dataset

Args:

data_format (str): Format to delete

Returns:

API response as AttrDict

property formats

List of formats

get_jobs()

Returns:

list of jobs related to this dataset

property info

AttrDict, verbose info for this dataset

sample(sample_id)

Return a dataset sample

Args:

sample_id (str): id of the sample

Returns:

The sample, as AttrDict

property samples

List of samples

property schema

AttrDict, schema info for this dataset

property split

Dataset split as AttrDict
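Many of the methods and properties above return an AttrDict. The module does not show its implementation, so the class below is only a sketch that assumes the common convention of a dict whose keys can also be read as attributes; the field names status and file_id are hypothetical and do not describe the real API response.

```python
class AttrDict(dict):
    """Minimal stand-in: a dict whose keys are also readable as attributes."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None


# Hypothetical response fields, for illustration only.
response = AttrDict({"status": "ok", "file_id": "abc123"})

print(response.status)       # attribute-style access
print(response["file_id"])   # regular dict access still works
```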