ExperimentPlanTemplateTemplatePipelineEnvParams

data class ExperimentPlanTemplateTemplatePipelineEnvParams(val cpuPerWorker: Int, val cudaVersion: String? = null, val gpuDriverVersion: String? = null, val gpuPerWorker: Int, val memoryPerWorker: Int, val ncclVersion: String? = null, val pyTorchVersion: String? = null, val shareMemory: Int, val workerNum: Int)

Constructors

constructor(cpuPerWorker: Int, cudaVersion: String? = null, gpuDriverVersion: String? = null, gpuPerWorker: Int, memoryPerWorker: Int, ncclVersion: String? = null, pyTorchVersion: String? = null, shareMemory: Int, workerNum: Int)
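
A minimal instantiation sketch follows; every value shown is a hypothetical placeholder chosen for illustration, not a recommended setting:

val envParams = ExperimentPlanTemplateTemplatePipelineEnvParams(
    cpuPerWorker = 16,               // CPUs allocated per worker (hypothetical)
    cudaVersion = "12.1",            // optional; hypothetical CUDA version
    gpuDriverVersion = "535.54.03",  // optional; hypothetical driver version
    gpuPerWorker = 8,                // GPUs allocated per worker (hypothetical)
    memoryPerWorker = 128,           // memory per worker (hypothetical)
    ncclVersion = "2.18.3",          // optional; hypothetical NCCL version
    pyTorchVersion = "2.1.0",        // optional; hypothetical PyTorch version
    shareMemory = 32,                // shared memory in GB (hypothetical)
    workerNum = 4                    // total worker nodes (hypothetical)
)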

Types

object Companion

Properties

val cpuPerWorker: Int

Number of central processing units (CPUs) allocated per worker. This parameter affects the processing power of the computation, especially for tasks that require a large amount of parallel processing.

val cudaVersion: String? = null

The version of CUDA (Compute Unified Device Architecture) used. CUDA is a parallel computing platform and programming model provided by NVIDIA. The specific version may affect the available GPU functions and performance optimizations.

val gpuDriverVersion: String? = null

The version of the GPU driver used. The driver version may affect GPU performance and compatibility, so it is important to ensure that the correct version is used.

val gpuPerWorker: Int

Number of graphics processing units (GPUs) allocated per worker. GPUs are a key component in deep learning and large-scale data processing, so this parameter is very important for tasks that require graphics-accelerated computing.

val memoryPerWorker: Int

The amount of memory available per worker. Memory size has an important impact on the performance and stability of the program, especially when processing large data sets or high-dimensional data.

val ncclVersion: String? = null

The NVIDIA Collective Communications Library (NCCL) version used. NCCL is a library for multi-GPU and multi-node communication. This parameter is particularly important for optimizing data transfer in distributed computing.

val pyTorchVersion: String? = null

The version of the PyTorch framework used. PyTorch is a widely used deep learning library, and differences between versions may affect performance and feature support for model training and inference.

val shareMemory: Int

Shared memory allocation, in GB.

val workerNum: Int

The total number of worker nodes. This parameter directly affects the parallelism and computing speed of the task; a higher number of worker nodes usually accelerates task completion.