The number of runtime copies of the model container to deploy with the inference component. Each copy can serve inference requests.