ExperimentPlanTemplateTemplatePipelineEnvParamsArgs

data class ExperimentPlanTemplateTemplatePipelineEnvParamsArgs(val cpuPerWorker: Output<Int>, val cudaVersion: Output<String>? = null, val gpuDriverVersion: Output<String>? = null, val gpuPerWorker: Output<Int>, val memoryPerWorker: Output<Int>, val ncclVersion: Output<String>? = null, val pyTorchVersion: Output<String>? = null, val shareMemory: Output<Int>, val workerNum: Output<Int>) : ConvertibleToJava<ExperimentPlanTemplateTemplatePipelineEnvParamsArgs>

Constructors

constructor(cpuPerWorker: Output<Int>, cudaVersion: Output<String>? = null, gpuDriverVersion: Output<String>? = null, gpuPerWorker: Output<Int>, memoryPerWorker: Output<Int>, ncclVersion: Output<String>? = null, pyTorchVersion: Output<String>? = null, shareMemory: Output<Int>, workerNum: Output<Int>)
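
A minimal construction sketch, assuming the class is imported from the provider's Kotlin SDK and that com.pulumi.core.Output.of is used to wrap plain values; the import path, sizes, and version strings below are illustrative placeholders, not values prescribed by this documentation:

import com.pulumi.core.Output
// Import path is an assumption; it depends on the provider SDK package layout.
// import com.pulumi.alicloud.eflo.kotlin.inputs.ExperimentPlanTemplateTemplatePipelineEnvParamsArgs

// Required per-worker resources and worker count, plus the optional runtime versions.
val envParams = ExperimentPlanTemplateTemplatePipelineEnvParamsArgs(
    cpuPerWorker = Output.of(90),           // CPUs allocated per worker
    gpuPerWorker = Output.of(8),            // GPUs allocated per worker
    memoryPerWorker = Output.of(500),       // memory per worker
    shareMemory = Output.of(500),           // shared memory allocation in GB
    workerNum = Output.of(1),               // total number of worker nodes
    cudaVersion = Output.of("12.1"),        // optional, placeholder version
    gpuDriverVersion = Output.of("535.54"), // optional, placeholder version
    ncclVersion = Output.of("2.18"),        // optional, placeholder version
    pyTorchVersion = Output.of("2.1")       // optional, placeholder version
)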

Properties

val cpuPerWorker: Output<Int>

Number of central processing units (CPUs) allocated per worker. This parameter determines the processing power available for the computation, especially for tasks that require a large amount of parallel processing.

val cudaVersion: Output<String>? = null

The version of CUDA (Compute Unified Device Architecture) used. CUDA is a parallel computing platform and programming model provided by NVIDIA. The specific version may affect the available GPU features and performance optimizations.

val gpuDriverVersion: Output<String>? = null

The version of the GPU driver used. The driver version can affect GPU performance and compatibility, so make sure the correct version is specified.

val gpuPerWorker: Output<Int>

Number of graphics processing units (GPUs) allocated per worker. GPUs are a key component in deep learning and large-scale data processing, so this parameter is critical for tasks that require graphics-accelerated computing.

val memoryPerWorker: Output<Int>

The amount of memory available per worker. Memory size has a significant impact on the performance and stability of the program, especially when processing large or high-dimensional data sets.

val ncclVersion: Output<String>? = null

The version of the NVIDIA Collective Communications Library (NCCL) used. NCCL is a library for multi-GPU and multi-node communication; this parameter is particularly important for optimizing data transfer in distributed computing.

val pyTorchVersion: Output<String>? = null

The version of the PyTorch framework used. PyTorch is a widely used deep learning library, and differences between versions may affect the performance and feature support of model training and inference.

val shareMemory: Output<Int>

The amount of shared memory allocated, in GB.

val workerNum: Output<Int>

The total number of worker nodes. This parameter directly affects the parallelism and computing speed of the task; a higher number of worker nodes usually shortens the time to completion.

Functions

open override fun toJava(): ExperimentPlanTemplateTemplatePipelineEnvParamsArgs