
InferenceComponentRollingUpdatePolicyArgs

data class InferenceComponentRollingUpdatePolicyArgs(val maximumBatchSize: Output<InferenceComponentCapacitySizeArgs>? = null, val maximumExecutionTimeoutInSeconds: Output<Int>? = null, val rollbackMaximumBatchSize: Output<InferenceComponentCapacitySizeArgs>? = null, val waitIntervalInSeconds: Output<Int>? = null) : ConvertibleToJava<InferenceComponentRollingUpdatePolicyArgs>

The rolling update policy for the inference component.

Constructors

constructor(maximumBatchSize: Output<InferenceComponentCapacitySizeArgs>? = null, maximumExecutionTimeoutInSeconds: Output<Int>? = null, rollbackMaximumBatchSize: Output<InferenceComponentCapacitySizeArgs>? = null, waitIntervalInSeconds: Output<Int>? = null)
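A minimal sketch of calling this constructor with named arguments, assuming the Pulumi Java core class `com.pulumi.core.Output` and its `Output.of` factory are on the classpath. Only the two `Int` parameters are set; the batch-size parameters are left at their `null` defaults because the fields of `InferenceComponentCapacitySizeArgs` are not described on this page, and the service may still require a batch size in practice. The values 3600 and 300 are illustrative, not recommendations.

```kotlin
import com.pulumi.awsnative.sagemaker.kotlin.inputs.InferenceComponentRollingUpdatePolicyArgs
import com.pulumi.core.Output

// Illustrative values only: a one-hour overall limit and a five-minute baking period.
val rollingUpdatePolicy = InferenceComponentRollingUpdatePolicyArgs(
    maximumExecutionTimeoutInSeconds = Output.of(3600),
    waitIntervalInSeconds = Output.of(300),
    // maximumBatchSize and rollbackMaximumBatchSize are omitted here (null by default).
)
```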

Properties

maximumBatchSize

val maximumBatchSize: Output<InferenceComponentCapacitySizeArgs>? = null

The batch size for each rolling step in the deployment process. For each step, SageMaker AI provisions capacity on the new endpoint fleet, routes traffic to that fleet, and terminates capacity on the old endpoint fleet. The value must be between 5% and 50% of the copy count of the inference component.
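The 5% to 50% constraint can be checked before deployment with a small helper. The sketch below is plain Kotlin and not part of the SDK: `batchSizeRange` is a hypothetical name, and the rounding (lower bound rounded up, upper bound rounded down) is an assumption, since the text above does not say how fractional copies are handled.

```kotlin
// Hypothetical helper, not part of the Pulumi SDK.
// Assumption: 5% rounds up and 50% rounds down to whole copies.
fun batchSizeRange(copyCount: Int): IntRange {
    require(copyCount > 0) { "copyCount must be positive" }
    val min = (copyCount * 5 + 99) / 100 // ceil(5% of copyCount) using integer math
    val max = copyCount * 50 / 100       // floor(50% of copyCount)
    return min..max                      // empty when max < min, e.g. for a single copy
}

fun main() {
    val range = batchSizeRange(20)
    println("Allowed batch sizes for 20 copies: ${range.first}..${range.last}")
}
```

Under these rounding assumptions a component with a single copy yields an empty range, which suggests checking the copy count before choosing a count-based batch size.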

maximumExecutionTimeoutInSeconds

val maximumExecutionTimeoutInSeconds: Output<Int>? = null

The time limit, in seconds, for the total deployment. Exceeding this limit causes a timeout.

rollbackMaximumBatchSize

val rollbackMaximumBatchSize: Output<InferenceComponentCapacitySizeArgs>? = null

The batch size for a rollback to the old endpoint fleet. If this field is absent, the value is set to the default, which is 100% of the total capacity. When the default is used, SageMaker AI provisions the entire capacity of the old fleet at once during rollback.
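The default described above can be expressed as a one-line fallback. This is a sketch in plain Kotlin with hypothetical names (`effectiveRollbackBatchSize`, `totalCapacity`); it models the batch size as a copy count for simplicity and does not reflect how the SDK or the service represents it.

```kotlin
// Hypothetical helper: resolves the rollback batch size, falling back to
// 100% of the total capacity when no explicit value is configured.
fun effectiveRollbackBatchSize(rollbackMaximumBatchSize: Int?, totalCapacity: Int): Int {
    require(totalCapacity > 0) { "totalCapacity must be positive" }
    return rollbackMaximumBatchSize ?: totalCapacity
}

fun main() {
    println(effectiveRollbackBatchSize(null, 20)) // no value set: whole fleet at once
    println(effectiveRollbackBatchSize(4, 20))    // explicit value: rollback in batches of 4
}
```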

waitIntervalInSeconds

val waitIntervalInSeconds: Output<Int>? = null

The length of the baking period, in seconds, during which SageMaker AI monitors alarms for each batch on the new fleet.
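Because each batch has its own baking period, the wait interval and batch size together set a floor on how long a rollout takes. The sketch below estimates that floor and compares it with the overall timeout. All function names are hypothetical, and the model assumes exactly one baking period per batch while ignoring provisioning and traffic-shifting time, so the result is a lower bound rather than a prediction.

```kotlin
// Hypothetical helpers for a rough feasibility check; not part of the SDK.

// Number of rolling steps needed to replace every copy, rounding up.
fun rollingSteps(copyCount: Int, batchSize: Int): Int {
    require(copyCount > 0 && batchSize > 0) { "copyCount and batchSize must be positive" }
    return (copyCount + batchSize - 1) / batchSize
}

// Assumption: one baking period per batch, nothing else counted.
fun minimumBakingSeconds(copyCount: Int, batchSize: Int, waitIntervalInSeconds: Int): Int =
    rollingSteps(copyCount, batchSize) * waitIntervalInSeconds

// True when the baking time alone stays within the overall deployment limit.
fun fitsWithinTimeout(
    copyCount: Int,
    batchSize: Int,
    waitIntervalInSeconds: Int,
    maximumExecutionTimeoutInSeconds: Int,
): Boolean =
    minimumBakingSeconds(copyCount, batchSize, waitIntervalInSeconds) <= maximumExecutionTimeoutInSeconds

fun main() {
    // 20 copies in batches of 5 with a 300-second baking period: 4 steps, 1200 seconds.
    println(minimumBakingSeconds(20, 5, 300))
    println(fitsWithinTimeout(20, 5, 300, 3600))
}
```

If this check fails, the deployment would be expected to time out even in the best case, so either the batch size or the timeout needs to increase.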

Functions

toJava

open override fun toJava(): InferenceComponentRollingUpdatePolicyArgs