Pipeline
The main pipeline entity and all the necessary metadata for launching and managing linked jobs. To get more information about Pipeline, see the How-to Guides.
Example Usage
Data Pipeline Pipeline
package generated_program;
import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.gcp.serviceAccount.Account;
import com.pulumi.gcp.serviceAccount.AccountArgs;
import com.pulumi.gcp.dataflow.Pipeline;
import com.pulumi.gcp.dataflow.PipelineArgs;
import com.pulumi.gcp.dataflow.inputs.PipelineWorkloadArgs;
import com.pulumi.gcp.dataflow.inputs.PipelineWorkloadDataflowLaunchTemplateRequestArgs;
import com.pulumi.gcp.dataflow.inputs.PipelineWorkloadDataflowLaunchTemplateRequestLaunchParametersArgs;
import com.pulumi.gcp.dataflow.inputs.PipelineWorkloadDataflowLaunchTemplateRequestLaunchParametersEnvironmentArgs;
import com.pulumi.gcp.dataflow.inputs.PipelineScheduleInfoArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var serviceAccount = new Account("serviceAccount", AccountArgs.builder()
            .accountId("my-account")
            .displayName("Service Account")
            .build());

        var primary = new Pipeline("primary", PipelineArgs.builder()
            .displayName("my-pipeline")
            .type("PIPELINE_TYPE_BATCH")
            .state("STATE_ACTIVE")
            .region("us-central1")
            .workload(PipelineWorkloadArgs.builder()
                .dataflowLaunchTemplateRequest(PipelineWorkloadDataflowLaunchTemplateRequestArgs.builder()
                    .projectId("my-project")
                    .gcsPath("gs://my-bucket/path")
                    .launchParameters(PipelineWorkloadDataflowLaunchTemplateRequestLaunchParametersArgs.builder()
                        .jobName("my-job")
                        .parameters(Map.of("name", "wrench"))
                        .environment(PipelineWorkloadDataflowLaunchTemplateRequestLaunchParametersEnvironmentArgs.builder()
                            .numWorkers(5)
                            .maxWorkers(5)
                            .zone("us-central1-a")
                            .serviceAccountEmail(serviceAccount.email())
                            .network("default")
                            .tempLocation("gs://my-bucket/tmp_dir")
                            .bypassTempDirValidation(false)
                            .machineType("E2")
                            .additionalUserLabels(Map.of("context", "test"))
                            .workerRegion("us-central1")
                            .workerZone("us-central1-a")
                            .enableStreamingEngine(false)
                            .build())
                        .update(false)
                        .transformNameMapping(Map.of("name", "wrench"))
                        .build())
                    .location("us-central1")
                    .build())
                .build())
            .scheduleInfo(PipelineScheduleInfoArgs.builder()
                .schedule("* */2 * * *")
                .build())
            .build());
    }
}
Import
Pipeline can be imported using any of these accepted formats:
$ pulumi import gcp:dataflow/pipeline:Pipeline default projects/{{project}}/locations/{{region}}/pipelines/{{name}}
$ pulumi import gcp:dataflow/pipeline:Pipeline default {{project}}/{{region}}/{{name}}
$ pulumi import gcp:dataflow/pipeline:Pipeline default {{region}}/{{name}}
$ pulumi import gcp:dataflow/pipeline:Pipeline default {{name}}
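For example, to adopt an existing pipeline under the resource name primary (the project, region, and pipeline name below are hypothetical placeholders):
$ pulumi import gcp:dataflow/pipeline:Pipeline primary projects/my-project/locations/us-central1/pipelines/my-pipeline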
Properties
createTime
The timestamp when the pipeline was initially created. Set by the Data Pipelines service. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
displayName
The display name of the pipeline. It can contain only letters (A-Za-z), numbers (0-9), hyphens (-), and underscores (_).
lastUpdateTime
The timestamp when the pipeline was last modified. Set by the Data Pipelines service. A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
"The pipeline name. For example': 'projects/PROJECT_ID/locations/LOCATION_ID/pipelines/PIPELINE_ID." "- PROJECT_ID can contain letters (A-Za-z), numbers (0-9), hyphens (-), colons (:), and periods (.). For more information, see Identifying projects." "LOCATION_ID is the canonical ID for the pipeline's location. The list of available locations can be obtained by calling google.cloud.location.Locations.ListLocations. Note that the Data Pipelines service is not available in all regions. It depends on Cloud Scheduler, an App Engine application, so it's only available in App Engine regions." "PIPELINE_ID is the ID of the pipeline. Must be unique for the selected project and location."
pipelineSources
The sources of the pipeline (for example, Dataplex). The keys and values are set by the corresponding sources during pipeline creation. An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
scheduleInfo
Internal scheduling information for a pipeline. If this information is provided, periodic jobs will be created per the schedule. If not, users are responsible for creating jobs externally. https://cloud.google.com/dataflow/docs/reference/data-pipelines/rest/v1/projects.locations.pipelines#schedulespec Structure is documented below.
schedulerServiceAccountEmail
Optional. A service account email to be used with the Cloud Scheduler job. If not specified, the default compute engine service account will be used.
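As a minimal sketch of how these two fields work together, the snippet below (reusing the imports from the Example Usage section) attaches a cron schedule and a dedicated Cloud Scheduler service account to a batch pipeline. The timeZone field and the service account email are illustrative assumptions and should be verified against the SDK.
var scheduled = new Pipeline("scheduled", PipelineArgs.builder()
    .displayName("my-scheduled-pipeline")
    .type("PIPELINE_TYPE_BATCH")
    .state("STATE_ACTIVE")
    .region("us-central1")
    // Periodic jobs are created according to this cron expression.
    .scheduleInfo(PipelineScheduleInfoArgs.builder()
        .schedule("0 */2 * * *")
        .timeZone("America/Los_Angeles") // assumed field mirroring ScheduleSpec.timeZone
        .build())
    // The Cloud Scheduler job runs as this account instead of the default compute service account.
    .schedulerServiceAccountEmail("scheduler-sa@my-project.iam.gserviceaccount.com")
    // workload omitted for brevity; see the Example Usage section above.
    .build());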
state
The state of the pipeline. When the pipeline is created, the state is set to 'STATE_ACTIVE' by default. State changes can be requested by setting the state to stopping, paused, or resuming. State cannot be changed through pipelines.patch requests. https://cloud.google.com/dataflow/docs/reference/data-pipelines/rest/v1/projects.locations.pipelines#state Possible values are: STATE_UNSPECIFIED, STATE_RESUMING, STATE_ACTIVE, STATE_STOPPING, STATE_ARCHIVED, STATE_PAUSED.
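The value reported by the service can also be surfaced as a stack output for inspection; a minimal sketch, assuming the primary resource from the Example Usage section above:
// Inside the stack(Context ctx) function: exposes the pipeline's reported state
// via `pulumi stack output pipelineState`.
ctx.export("pipelineState", primary.state());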
type
The type of the pipeline. This field affects the scheduling of the pipeline and the type of metrics to show for the pipeline. https://cloud.google.com/dataflow/docs/reference/data-pipelines/rest/v1/projects.locations.pipelines#pipelinetype Possible values are: PIPELINE_TYPE_UNSPECIFIED, PIPELINE_TYPE_BATCH, PIPELINE_TYPE_STREAMING.
workload
Workload information for creating new jobs. https://cloud.google.com/dataflow/docs/reference/data-pipelines/rest/v1/projects.locations.pipelines#workload Structure is documented below.