PySparkJobArgs

data class PySparkJobArgs(val archiveUris: Output<List<String>>? = null, val args: Output<List<String>>? = null, val fileUris: Output<List<String>>? = null, val jarFileUris: Output<List<String>>? = null, val loggingConfig: Output<LoggingConfigArgs>? = null, val mainPythonFileUri: Output<String>, val properties: Output<Map<String, String>>? = null, val pythonFileUris: Output<List<String>>? = null) : ConvertibleToJava<PySparkJobArgs>

A Dataproc job for running Apache PySpark (https://spark.apache.org/docs/0.9.0/python-programming-guide.html) applications on YARN.

Constructors

fun PySparkJobArgs(archiveUris: Output<List<String>>? = null, args: Output<List<String>>? = null, fileUris: Output<List<String>>? = null, jarFileUris: Output<List<String>>? = null, loggingConfig: Output<LoggingConfigArgs>? = null, mainPythonFileUri: Output<String>, properties: Output<Map<String, String>>? = null, pythonFileUris: Output<List<String>>? = null)
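A minimal sketch of calling this constructor with named arguments. Only mainPythonFileUri is required; every other parameter defaults to null. The import path for PySparkJobArgs is an assumption and should be adjusted to the module this page belongs to, and the bucket name and property values are placeholders.

```kotlin
import com.pulumi.core.Output
// Assumed package for illustration; use the package of the module you depend on.
import com.pulumi.googlenative.dataproc.v1.kotlin.inputs.PySparkJobArgs

// Placeholder URIs: "example-bucket" is hypothetical.
val pySparkJob = PySparkJobArgs(
    // Required: the driver script, which must be a .py file.
    mainPythonFileUri = Output.of("gs://example-bucket/jobs/main.py"),
    // Optional: arguments passed to the driver.
    args = Output.of(listOf("--input", "gs://example-bucket/data/")),
    // Optional: extra Python modules (.py, .egg, or .zip).
    pythonFileUris = Output.of(listOf("gs://example-bucket/jobs/helpers.zip")),
    // Optional: Spark settings go here rather than in args.
    properties = Output.of(mapOf("spark.executor.memory" to "4g")),
)
```

Plain values are wrapped with Output.of because every parameter is typed as Output<...>.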

Functions

open override fun toJava(): PySparkJobArgs

Properties

val archiveUris: Output<List<String>>? = null

Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

val args: Output<List<String>>? = null

Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
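One way to avoid such a collision is to move --conf pairs out of the argument list and into properties before building the job. The helper below is a hypothetical sketch, not part of the SDK. It assumes the two-token form `--conf key=value` and leaves every other argument untouched.

```kotlin
// Hypothetical helper, not part of the Pulumi SDK: separates `--conf key=value`
// pairs from the remaining driver arguments.
fun splitConfArgs(rawArgs: List<String>): Pair<List<String>, Map<String, String>> {
    val driverArgs = mutableListOf<String>()
    val properties = mutableMapOf<String, String>()
    var i = 0
    while (i < rawArgs.size) {
        val arg = rawArgs[i]
        val next = rawArgs.getOrNull(i + 1)
        if (arg == "--conf" && next != null && '=' in next) {
            // `--conf spark.foo=bar` becomes a job property instead of an argument.
            properties[next.substringBefore('=')] = next.substringAfter('=')
            i += 2
        } else {
            // Anything else stays in the driver argument list.
            driverArgs += arg
            i += 1
        }
    }
    return driverArgs to properties
}
```

The first element of the returned pair is suitable for args and the second for properties.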

val fileUris: Output<List<String>>? = null

Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

val jarFileUris: Output<List<String>>? = null

Optional. HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.

val loggingConfig: Output<LoggingConfigArgs>? = null

Optional. The runtime log config for job execution.

val mainPythonFileUri: Output<String>

Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file.

val properties: Output<Map<String, String>>? = null

Optional. A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code.

val pythonFileUris: Output<List<String>>? = null

Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
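The file-type restrictions on archiveUris and pythonFileUris can be checked before a job is submitted. The sketch below uses the extension lists from the descriptions on this page. The helper name is hypothetical, and the case-sensitive suffix match is an assumption, not documented service behavior.

```kotlin
// Extensions copied from the property descriptions on this page.
val archiveExtensions = listOf(".jar", ".tar", ".tar.gz", ".tgz", ".zip")
val pythonFileExtensions = listOf(".py", ".egg", ".zip")

// Hypothetical helper: returns the URIs whose suffix is not in the allowed list.
// The match is case-sensitive, which is an assumption.
fun unsupportedUris(uris: List<String>, allowed: List<String>): List<String> =
    uris.filterNot { uri -> allowed.any { ext -> uri.endsWith(ext) } }
```

An empty result means every URI ends in a supported extension. The same check with listOf(".py") covers mainPythonFileUri.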