S3Endpoint
Provides a DMS (Data Migration Service) S3 endpoint resource. DMS S3 endpoints can be created, updated, deleted, and imported.
Note: AWS is deprecating extra_connection_attributes, such as used with aws.dms.Endpoint. This resource is an alternative to aws.dms.Endpoint and does not use extra_connection_attributes. (AWS currently includes extra_connection_attributes in the raw responses to AWS Provider requests, so they may be visible in the logs.)
Note: Some of this resource's arguments have default values that come from the AWS Provider. Other default values are provided by AWS and are subject to change without notice. When relying on AWS defaults, the provider state will often show a zero value. For example, the AWS Provider does not provide a default for cdc_max_batch_interval, but the AWS default is 60 (seconds). However, the provider state will show 0, since this is the value returned by AWS when no value is present. Below, we aim to flag the defaults that come from AWS (e.g., "AWS default...").
Example Usage
Minimal Configuration
This is the minimal configuration for an aws.dms.S3Endpoint. This endpoint will rely on the AWS Provider and AWS defaults. The IAM role and its policy below are illustrative stand-ins for whatever role grants DMS access to the bucket; adjust them to your own requirements.
package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.aws.dms.S3Endpoint;
import com.pulumi.aws.dms.S3EndpointArgs;
import com.pulumi.aws.iam.Role;
import com.pulumi.aws.iam.RoleArgs;
import com.pulumi.aws.iam.RolePolicy;
import com.pulumi.aws.iam.RolePolicyArgs;
import com.pulumi.resources.CustomResourceOptions;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        // Role that DMS assumes to reach the bucket (hypothetical trust policy).
        var exampleRole = new Role("example", RoleArgs.builder()
            .assumeRolePolicy("{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"dms.amazonaws.com\"},\"Action\":\"sts:AssumeRole\"}]}")
            .build());

        // Bucket access for the role (hypothetical, deliberately broad policy).
        var examplePolicy = new RolePolicy("example", RolePolicyArgs.builder()
            .role(exampleRole.id())
            .policy("{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":\"s3:*\",\"Resource\":\"*\"}]}")
            .build());

        var example = new S3Endpoint("example", S3EndpointArgs.builder()
            .endpointId("donnedtipi")
            .endpointType("target")
            .bucketName("bucket_name")
            .serviceAccessRoleArn(exampleRole.arn())
            .build(), CustomResourceOptions.builder()
                .dependsOn(examplePolicy)
                .build());
    }
}
Complete Configuration
This example exercises most of the resource's arguments. As above, the caller-identity lookup, KMS key, IAM role, and policy are illustrative stand-ins for the resources the original example referenced; adjust them to your own requirements.
package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.aws.AwsFunctions;
import com.pulumi.aws.dms.S3Endpoint;
import com.pulumi.aws.dms.S3EndpointArgs;
import com.pulumi.aws.iam.Role;
import com.pulumi.aws.iam.RoleArgs;
import com.pulumi.aws.iam.RolePolicy;
import com.pulumi.aws.iam.RolePolicyArgs;
import com.pulumi.aws.kms.Key;
import com.pulumi.resources.CustomResourceOptions;
import java.util.Map;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        // Current account ID, used below as the expected bucket owner.
        final var current = AwsFunctions.getCallerIdentity();

        // KMS key stand-in for the key the original example referenced.
        var exampleKey = new Key("example");

        // Role and policy that let DMS write to the bucket (hypothetical policies).
        var exampleRole = new Role("example", RoleArgs.builder()
            .assumeRolePolicy("{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"dms.amazonaws.com\"},\"Action\":\"sts:AssumeRole\"}]}")
            .build());
        var examplePolicy = new RolePolicy("example", RolePolicyArgs.builder()
            .role(exampleRole.id())
            .policy("{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":\"s3:*\",\"Resource\":\"*\"}]}")
            .build());

        var example = new S3Endpoint("example", S3EndpointArgs.builder()
            .endpointId("donnedtipi")
            .endpointType("target")
            .sslMode("none")
            .tags(Map.ofEntries(
                Map.entry("Name", "donnedtipi"),
                Map.entry("Update", "to-update"),
                Map.entry("Remove", "to-remove")))
            .addColumnName(true)
            .addTrailingPaddingCharacter(false)
            .bucketFolder("folder")
            .bucketName("bucket_name")
            .cannedAclForObjects("private")
            .cdcInsertsAndUpdates(true)
            .cdcInsertsOnly(false)
            .cdcMaxBatchInterval(100)
            .cdcMinFileSize(16)
            .cdcPath("cdc/path")
            .compressionType("GZIP")
            .csvDelimiter(";")
            .csvNoSupValue("x")
            .csvNullValue("?")
            .csvRowDelimiter("\\r\\n")
            .dataFormat("parquet")
            .dataPageSize(1100000)
            .datePartitionDelimiter("UNDERSCORE")
            .datePartitionEnabled(true)
            .datePartitionSequence("yyyymmddhh")
            .datePartitionTimezone("Asia/Seoul")
            .dictPageSizeLimit(1000000)
            .enableStatistics(false)
            .encodingType("plain")
            .encryptionMode("SSE_S3")
            .expectedBucketOwner(current.applyValue(identity -> identity.accountId()))
            .externalTableDefinition("etd")
            .ignoreHeaderRows(1)
            .includeOpForFullLoad(true)
            .maxFileSize(1000000)
            .parquetTimestampInMillisecond(true)
            .parquetVersion("parquet-2-0")
            .preserveTransactions(false)
            .rfc4180(false)
            .rowGroupLength(11000)
            .serverSideEncryptionKmsKeyId(exampleKey.arn())
            .serviceAccessRoleArn(exampleRole.arn())
            .timestampColumnName("tx_commit_time")
            .useCsvNoSupValue(false)
            .useTaskStartTimeForFullLoadTimestamp(true)
            .build(), CustomResourceOptions.builder()
                .dependsOn(examplePolicy)
                .build());
    }
}
Import
Endpoints can be imported using the endpoint_id, e.g.,
$ pulumi import aws:dms/s3Endpoint:S3Endpoint example example-dms-endpoint-tf
Properties
add_column_name: Whether to add column name information to the .csv output file. Default is false.
add_trailing_padding_character: Whether to add padding. Default is false. (Ignored for source endpoints.)
bucket_folder: S3 object prefix.
bucket_name: S3 bucket name.
canned_acl_for_objects: Predefined (canned) access control list for objects created in an S3 bucket. Valid values include none, private, public-read, public-read-write, authenticated-read, aws-exec-read, bucket-owner-read, and bucket-owner-full-control. Default is none.
cdc_inserts_and_updates: Whether to write insert and update operations to .csv or .parquet output files. Default is false.
cdc_inserts_only: Whether to write insert operations to .csv or .parquet output files. Default is false.
cdc_max_batch_interval: Maximum length of the interval, defined in seconds, after which to output a file to Amazon S3. (AWS default is 60.)
cdc_min_file_size: Minimum file size condition as defined in kilobytes to output a file to Amazon S3. (AWS default is 32000 KB.)
certificate_arn: ARN for the certificate.
compression_type: Set to compress target files. Valid values are GZIP and NONE. Default is NONE. (Ignored for source endpoints.)
csv_delimiter: Delimiter used to separate columns in the source files. Default is , (comma).
csv_no_sup_value: Only applies if output files for a CDC load are written in .csv format. If use_csv_no_sup_value is set to true, string to use for all columns not included in the supplemental log. If you do not specify a string value, DMS uses the null value for these columns regardless of use_csv_no_sup_value. (Ignored for source endpoints.)
csv_null_value: String to use as null when writing to the target. (AWS default is NULL.)
csv_row_delimiter: Delimiter used to separate rows in the source files. Default is newline (i.e., \n).
data_format: Output format for the files that AWS DMS uses to create S3 objects. Valid values are csv and parquet. (Ignored for source endpoints -- only csv is valid.)
data_page_size: Size of one data page in bytes. (AWS default is 1 MiB, i.e., 1048576.)
date_partition_delimiter: Date separating delimiter to use during folder partitioning. Valid values are SLASH, UNDERSCORE, DASH, and NONE. (AWS default is SLASH.) (Ignored for source endpoints.)
date_partition_enabled: Partition S3 bucket folders based on transaction commit dates. Default is false. (Ignored for source endpoints.)
date_partition_sequence: Date format to use during folder partitioning. Use this parameter when date_partition_enabled is set to true. Valid values are YYYYMMDD, YYYYMMDDHH, YYYYMM, MMYYYYDD, and DDMMYYYY. (AWS default is YYYYMMDD.) (Ignored for source endpoints.)
date_partition_timezone: Convert the current UTC time to a timezone. The conversion occurs when a date partition folder is created and a CDC filename is generated. The timezone format is Area/Location (e.g., Europe/Paris). Use this when date_partition_enabled is true. (Ignored for source endpoints.)
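The date partition settings above work together. The sketch below (with a hypothetical endpoint ID, bucket, and role ARN) shows a target that groups CDC files into hour-level folders named in Seoul local time:
package generated_program;

import com.pulumi.Pulumi;
import com.pulumi.aws.dms.S3Endpoint;
import com.pulumi.aws.dms.S3EndpointArgs;

public class DatePartitionExample {
    public static void main(String[] args) {
        Pulumi.run(ctx -> {
            var partitioned = new S3Endpoint("partitioned", S3EndpointArgs.builder()
                .endpointId("partitioned-target") // hypothetical identifier
                .endpointType("target")
                .bucketName("my-target-bucket") // hypothetical bucket
                .serviceAccessRoleArn("arn:aws:iam::123456789012:role/dms-s3-role") // hypothetical role
                // Group CDC files into folders named by transaction commit date.
                .datePartitionEnabled(true)
                .datePartitionSequence("YYYYMMDDHH") // hour-level folders
                .datePartitionDelimiter("UNDERSCORE") // e.g., 2023_11_05_09 rather than 2023/11/05/09
                .datePartitionTimezone("Asia/Seoul") // name folders in local time instead of UTC
                .build());
        });
    }
}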
detect_sql_injection: Undocumented argument for use as directed by AWS Support.
dict_page_size_limit: Maximum size in bytes of an encoded dictionary page of a column. (AWS default is 1 MiB, i.e., 1048576.)
enable_statistics: Whether to enable statistics for Parquet pages and row groups. Default is true.
encoding_type: Type of encoding to use. Valid values are rle_dictionary, plain, and plain_dictionary. (AWS default is rle_dictionary.)
encryption_mode: Server-side encryption mode used to encrypt your .csv or .parquet object files copied to S3. Valid values are SSE_S3 and SSE_KMS. (AWS default is SSE_S3.) (Ignored for source endpoints -- only SSE_S3 is valid.)
endpoint_arn: ARN for the endpoint.
endpoint_id: Database endpoint identifier. Identifiers must contain from 1 to 255 alphanumeric characters or hyphens, begin with a letter, contain only ASCII letters, digits, and hyphens, not end with a hyphen, and not contain two consecutive hyphens.
endpoint_type: Type of endpoint. Valid values are source and target.
engine_display_name: Expanded name for the engine name.
expected_bucket_owner: Bucket owner to prevent sniping. Value is an AWS account ID.
external_id: Can be used for cross-account validation. Use it in another account with aws.dms.S3Endpoint to create the endpoint cross-account.
external_table_definition: JSON document that describes how AWS DMS should interpret the data.
ignore_header_rows: When this value is set to 1, DMS ignores the first row header in a .csv file. (AWS default is 0.)
include_op_for_full_load: Whether to enable a full load to write INSERT operations to the .csv output files only to indicate how the rows were added to the source database. Default is false.
kms_key_arn: ARN for the KMS key that will be used to encrypt the connection parameters. If you do not specify a value for kms_key_arn, then AWS DMS will use your default encryption key. AWS KMS creates the default encryption key for your AWS account. Your AWS account has a different default encryption key for each AWS region.
max_file_size: Maximum size (in KB) of any .csv file to be created while migrating to an S3 target during full load. Valid values are from 1 to 1048576. (AWS default is 1 GB, i.e., 1048576.)
parquet_timestamp_in_millisecond: Specifies the precision of any TIMESTAMP column values written to an S3 object file in .parquet format. Default is false. (Ignored for source endpoints.)
parquet_version: Version of the .parquet file format. Valid values are parquet-1-0 and parquet-2-0. (AWS default is parquet-1-0.) (Ignored for source endpoints.)
preserve_transactions: Whether DMS saves the transaction order for a CDC load on the S3 target specified by cdc_path. Default is false. (Ignored for source endpoints.)
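Because preserve_transactions applies to the CDC files written under cdc_path, the two are typically set together. A minimal sketch, again with hypothetical names and ARNs:
package generated_program;

import com.pulumi.Pulumi;
import com.pulumi.aws.dms.S3Endpoint;
import com.pulumi.aws.dms.S3EndpointArgs;

public class OrderedCdcExample {
    public static void main(String[] args) {
        Pulumi.run(ctx -> {
            var ordered = new S3Endpoint("ordered", S3EndpointArgs.builder()
                .endpointId("ordered-cdc-target") // hypothetical identifier
                .endpointType("target")
                .bucketName("my-target-bucket") // hypothetical bucket
                .serviceAccessRoleArn("arn:aws:iam::123456789012:role/dms-s3-role") // hypothetical role
                .cdcPath("cdc/changes") // prefix under which change-data files are written
                .preserveTransactions(true) // keep those files in transaction commit order
                .build());
        });
    }
}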
row_group_length: Number of rows in a row group. (AWS default is 10000.)
server_side_encryption_kms_key_id: When encryption_mode is SSE_KMS, ARN for the AWS KMS key. (Ignored for source endpoints -- only the SSE_S3 encryption_mode is valid.)
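To encrypt target objects with a customer-managed key, set encryption_mode to SSE_KMS and pass the key's ARN. A sketch with a hypothetical key, bucket, and role (the service access role must also be allowed to use the key):
package generated_program;

import com.pulumi.Pulumi;
import com.pulumi.aws.dms.S3Endpoint;
import com.pulumi.aws.dms.S3EndpointArgs;
import com.pulumi.aws.kms.Key;

public class KmsEncryptedExample {
    public static void main(String[] args) {
        Pulumi.run(ctx -> {
            // Hypothetical customer-managed KMS key.
            var key = new Key("dms-target-key");

            var encrypted = new S3Endpoint("encrypted", S3EndpointArgs.builder()
                .endpointId("encrypted-target") // hypothetical identifier
                .endpointType("target")
                .bucketName("my-target-bucket") // hypothetical bucket
                .serviceAccessRoleArn("arn:aws:iam::123456789012:role/dms-s3-role") // hypothetical role
                .encryptionMode("SSE_KMS")
                .serverSideEncryptionKmsKeyId(key.arn())
                .build());
        });
    }
}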
service_access_role_arn: ARN of the IAM role with permissions to the S3 bucket.
timestamp_column_name: Column to add with timestamp information to the endpoint data for an Amazon S3 target.
use_csv_no_sup_value: Whether to use csv_no_sup_value for columns not included in the supplemental log. (Ignored for source endpoints.)
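use_csv_no_sup_value and csv_no_sup_value act as a pair for .csv CDC output. A sketch (hypothetical names) that writes a visible marker, rather than the null value, for columns missing from the supplemental log:
package generated_program;

import com.pulumi.Pulumi;
import com.pulumi.aws.dms.S3Endpoint;
import com.pulumi.aws.dms.S3EndpointArgs;

public class NoSupValueExample {
    public static void main(String[] args) {
        Pulumi.run(ctx -> {
            var marked = new S3Endpoint("marked", S3EndpointArgs.builder()
                .endpointId("marked-cdc-target") // hypothetical identifier
                .endpointType("target")
                .bucketName("my-target-bucket") // hypothetical bucket
                .serviceAccessRoleArn("arn:aws:iam::123456789012:role/dms-s3-role") // hypothetical role
                .dataFormat("csv") // these settings apply only to .csv CDC output
                .useCsvNoSupValue(true) // substitute a marker instead of the null value
                .csvNoSupValue("<not-logged>") // marker for columns absent from the supplemental log
                .build());
        });
    }
}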
use_task_start_time_for_full_load_timestamp: When set to true, uses the task start time as the timestamp column value instead of the time data is written to the target. For full load, when set to true, each row of the timestamp column contains the task start time. For CDC loads, each row of the timestamp column contains the transaction commit time. When set to false, the full load timestamp in the timestamp column increments with the time data arrives at the target. Default is false.