dataset module¶
-
class
dataset.
Dataset
(client, uuid)¶ Bases:
object
-
static
add
(client, name, files, description=None, split=None, max_degree=10, notebook=True, user_config=None, use_gcs=True)¶ Add a new dataset
- Args:
client: API client
name (str): name of the new dataset
files: list of paths to the files to add to the dataset, or path to a directory containing all the files to add
description: dict description of the dataset, defaults to None
split: dict representing the split, defaults to None
max_degree (int): maximum number of neighbours in the data, defaults to 10
user_config (dict): if provided, specifies a custom reader config, defaults to None
use_gcs (bool): use Google Cloud Storage uploads if available from the server
- Returns:
The new dataset. Its conversion to npy_indiv starts automatically.
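As an illustration of the `files` argument described above (either a list of paths or a directory), here is a minimal sketch of how the keyword arguments for `Dataset.add` might be prepared on the caller's side. The helpers `collect_files` and `build_add_kwargs` are hypothetical and not part of this module; only the parameter names and defaults are taken from the signature above.

```python
import os


def collect_files(files):
    """Normalize `files` to a list of paths (hypothetical helper).

    Accepts a directory path, a single file path, or an iterable of paths.
    """
    if isinstance(files, str):
        if os.path.isdir(files):
            # Keep only regular files directly inside the directory.
            return sorted(
                os.path.join(files, entry)
                for entry in os.listdir(files)
                if os.path.isfile(os.path.join(files, entry))
            )
        return [files]
    return list(files)


def build_add_kwargs(name, files, max_degree=10, use_gcs=True):
    """Assemble keyword arguments named after the documented add() parameters."""
    return {
        "name": name,
        "files": collect_files(files),
        "description": None,
        "split": None,
        "max_degree": max_degree,
        "user_config": None,
        "use_gcs": use_gcs,
    }


# Assumed usage, given an existing API client object:
# new_dataset = Dataset.add(client, **build_add_kwargs("my-dataset", "/path/to/data"))
```

Since `add` is documented as accepting a directory path directly, expanding the directory beforehand is optional; it is shown here only to make the two accepted forms of `files` explicit.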
-
convert
(target_format='npy_indiv', user_config=None)¶ Convert the dataset
- Args:
target_format (str): for now only npy_indiv is supported, defaults to npy_indiv
user_config (dict): if provided, specifies a custom reader config, defaults to None
- Returns:
API response, as AttrDict
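Because only `npy_indiv` is currently supported, a caller may want to reject other formats before contacting the API. The following is a minimal sketch under that assumption; `safe_convert` and `SUPPORTED_FORMATS` are hypothetical names, and `dataset` is assumed to be any object exposing the `convert` method documented above.

```python
# Formats accepted by convert(), per the documentation above.
SUPPORTED_FORMATS = ("npy_indiv",)


def safe_convert(dataset, target_format="npy_indiv", user_config=None):
    """Validate target_format locally, then delegate to dataset.convert()."""
    if target_format not in SUPPORTED_FORMATS:
        raise ValueError(
            "unsupported target_format %r, expected one of %r"
            % (target_format, SUPPORTED_FORMATS)
        )
    return dataset.convert(target_format=target_format, user_config=user_config)


# Assumed usage, given an existing Dataset instance:
# response = safe_convert(my_dataset)
```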
-
delete
(force=False)¶
- Args:
force (bool): force deletion of dependent resources, defaults to False
-
file
(file_id)¶ Get verbose info about a file
- Args:
file_id (str): id of the file
- Returns:
AttrDict, the API response
-
file_delete
(file_id)¶ Delete a file from the dataset
- Args:
file_id (str): id of the file
- Returns:
AttrDict, API response
-
property
files
¶ files list
-
files_add
(files, upload_url=None, use_gcs=True)¶
-
format
(data_format)¶ Get verbose information about a format
- Args:
data_format (str): data format
- Returns:
API response as AttrDict
-
format_delete
(data_format)¶ Delete a format from this dataset
- Args:
data_format (str): Format to delete
- Returns:
API response as AttrDict
-
property
formats
¶ List of formats
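The `formats` property and `format_delete` can be combined to remove every format except the ones still needed. This is only a sketch: it assumes that iterating `dataset.formats` yields format names as strings, which the documentation above does not state, and `prune_formats` is a hypothetical helper.

```python
def prune_formats(dataset, keep=("npy_indiv",)):
    """Delete every format not listed in `keep`; return the deleted names.

    Assumption: iterating dataset.formats yields format names (str).
    """
    deleted = []
    # Copy the list first so that deletions cannot disturb the iteration.
    for data_format in list(dataset.formats):
        if data_format not in keep:
            dataset.format_delete(data_format)
            deleted.append(data_format)
    return deleted


# Assumed usage, given an existing Dataset instance:
# removed = prune_formats(my_dataset)
```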
-
get_jobs
()¶
- Returns:
list of jobs related to this dataset
-
property
info
¶ AttrDict, verbose info for this dataset
-
sample
(sample_id)¶ Return a dataset sample
- Args:
sample_id (str): id of the sample
- Returns:
The sample, as AttrDict
-
property
samples
¶ samples list
-
property
schema
¶ AttrDict, schema info for this dataset
-
property
split
¶ Dataset split as AttrDict
-
static