EndpointConfigurationProductionVariant

data class EndpointConfigurationProductionVariant(val acceleratorType: String? = null, val containerStartupHealthCheckTimeoutInSeconds: Int? = null, val coreDumpConfig: EndpointConfigurationProductionVariantCoreDumpConfig? = null, val enableSsmAccess: Boolean? = null, val initialInstanceCount: Int? = null, val initialVariantWeight: Double? = null, val instanceType: String? = null, val modelDataDownloadTimeoutInSeconds: Int? = null, val modelName: String, val serverlessConfig: EndpointConfigurationProductionVariantServerlessConfig? = null, val variantName: String? = null, val volumeSizeInGb: Int? = null)

Constructors

constructor(acceleratorType: String? = null, containerStartupHealthCheckTimeoutInSeconds: Int? = null, coreDumpConfig: EndpointConfigurationProductionVariantCoreDumpConfig? = null, enableSsmAccess: Boolean? = null, initialInstanceCount: Int? = null, initialVariantWeight: Double? = null, instanceType: String? = null, modelDataDownloadTimeoutInSeconds: Int? = null, modelName: String, serverlessConfig: EndpointConfigurationProductionVariantServerlessConfig? = null, variantName: String? = null, volumeSizeInGb: Int? = null)
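
As a minimal sketch of the constructor shape above: only `modelName` has no default, so named arguments let a caller set just the fields it needs. `ProductionVariantSketch` is a hypothetical local stand-in that mirrors a subset of the documented parameters. It is not the SDK class, and the model name and instance type are illustrative values.

```kotlin
// Hypothetical stand-in mirroring a subset of the documented constructor.
// It is not the SDK class; it only illustrates the nullable-with-default shape.
data class ProductionVariantSketch(
    val initialInstanceCount: Int? = null,
    val initialVariantWeight: Double? = null,
    val instanceType: String? = null,
    val modelName: String,
    val variantName: String? = null,
)

fun main() {
    // Only modelName is required; every other field falls back to null.
    val minimal = ProductionVariantSketch(modelName = "my-model")

    // Named arguments make the declaration order irrelevant at the call site.
    val sized = ProductionVariantSketch(
        modelName = "my-model",
        instanceType = "ml.t2.medium",
        initialInstanceCount = 1,
    )

    println(minimal.instanceType)
    println(sized.instanceType)
}
```

Because `modelName` is declared after parameters that have defaults, positional calls would be awkward; named arguments are the practical way to construct a value of this shape.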

Types

object Companion

Properties

val acceleratorType: String? = null

The size of the Elastic Inference (EI) instance to use for the production variant.

val containerStartupHealthCheckTimeoutInSeconds: Int? = null

The timeout value, in seconds, for your inference container to pass the SageMaker Hosting health check. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests. Valid values are between 60 and 3600.

val coreDumpConfig: EndpointConfigurationProductionVariantCoreDumpConfig? = null

Specifies configuration for a core dump from the model container when the process crashes. Fields are documented below.

val enableSsmAccess: Boolean? = null

You can use this parameter to turn on native Amazon Web Services Systems Manager (SSM) access for a production variant behind an endpoint. By default, SSM access is disabled for all production variants behind an endpoint.

val initialInstanceCount: Int? = null

Initial number of instances used for auto-scaling.

val initialVariantWeight: Double? = null

Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. If unspecified, it defaults to 1.0.
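
To make the weight semantics concrete, here is a rough sketch that assumes each variant receives traffic in proportion to its weight divided by the sum of all variant weights. `trafficShares` and the variant names are hypothetical and not part of the SDK; a missing weight is treated as the documented default of 1.0.

```kotlin
// Hypothetical helper: maps each variant name to its share of traffic,
// treating a missing weight as the documented default of 1.0.
fun trafficShares(weights: Map<String, Double?>): Map<String, Double> {
    val effective = weights.mapValues { (_, w) -> w ?: 1.0 }
    val total = effective.values.sum()
    require(total > 0.0) { "at least one variant needs a positive weight" }
    // Assumption: share = weight / sum of all weights.
    return effective.mapValues { (_, w) -> w / total }
}

fun main() {
    // "green" has no explicit weight, so it counts as 1.0.
    val shares = trafficShares(mapOf("blue" to 3.0, "green" to null))
    println(shares) // prints {blue=0.75, green=0.25}
}
```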

val instanceType: String? = null

The type of instance to start.

val modelDataDownloadTimeoutInSeconds: Int? = null

The timeout value, in seconds, for downloading and extracting the model that you want to host from Amazon S3 to the individual inference instance associated with this production variant. Valid values are between 60 and 3600.

val modelName: String

The name of the model to use.

val serverlessConfig: EndpointConfigurationProductionVariantServerlessConfig? = null

Specifies configuration for how an endpoint performs serverless inference.

val variantName: String? = null

The name of the variant. If omitted, this provider will assign a random, unique name.

val volumeSizeInGb: Int? = null

The size, in GB, of the ML storage volume attached to each individual inference instance associated with the production variant. Valid values are between 1 and 512.
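
The numeric ranges stated in the property descriptions can be checked before a configuration is submitted. The sketch below is one possible pre-flight check: `rangeErrors` is a hypothetical helper, not an SDK function, and it assumes the documented bounds are inclusive.

```kotlin
// Hypothetical pre-flight check; the ranges are those stated in the property docs,
// assumed here to be inclusive at both ends.
fun rangeErrors(
    containerStartupHealthCheckTimeoutInSeconds: Int? = null,
    modelDataDownloadTimeoutInSeconds: Int? = null,
    volumeSizeInGb: Int? = null,
): List<String> {
    val checks = listOf(
        Triple("containerStartupHealthCheckTimeoutInSeconds", containerStartupHealthCheckTimeoutInSeconds, 60..3600),
        Triple("modelDataDownloadTimeoutInSeconds", modelDataDownloadTimeoutInSeconds, 60..3600),
        Triple("volumeSizeInGb", volumeSizeInGb, 1..512),
    )
    // Null means "not set", so only values that are present are checked.
    return checks.mapNotNull { (name, value, range) ->
        if (value != null && value !in range) "$name=$value is outside $range" else null
    }
}

fun main() {
    println(rangeErrors(volumeSizeInGb = 16)) // prints []
    println(rangeErrors(modelDataDownloadTimeoutInSeconds = 30))
}
```

Returning a list of messages rather than throwing on the first failure lets a caller report every out-of-range field at once.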