Package-level declarations
Types
The purpose of the data you've provided in the augmented manifest. The data can be used for either training or testing. If you don't specify a purpose, the default is TRAIN. TRAIN - all of the documents in the manifest will be used for training. If no test documents are provided, Amazon Comprehend will automatically reserve a portion of the training documents for testing. TEST - all of the documents in the manifest will be used for testing.
This field defines the Amazon Textract API operation that Amazon Comprehend uses to extract text from PDF files and image files. Enter one of the following values:
Determines the text extraction actions for PDF files. Enter one of the following values:
The format of your training data:
The type of input documents for training the model. Provide plain-text documents to create a plain-text model, and provide semi-structured documents to create a native document model.
The language of the input documents. You can specify any of the languages supported by Amazon Comprehend. All documents must be in the same language.
Indicates the mode in which the classifier will be trained. The classifier can be trained in multi-class (single-label) mode or multi-label mode. Multi-class mode identifies a single class label for each document and multi-label mode identifies one or more class labels for each document. Multiple labels for an individual document are separated by a delimiter. The default delimiter between labels is a pipe (|).
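The difference between the two modes can be illustrated with a small, SDK-independent sketch of how a label field might be interpreted. The helper names `parse_labels` and `labels_for_mode` are hypothetical and not part of Amazon Comprehend or any SDK. Trimming whitespace and dropping empty labels are assumptions made for this example only.

```python
from typing import List


def parse_labels(field: str, delimiter: str = "|") -> List[str]:
    """Split a multi-label field into individual class labels.

    Assumption: surrounding whitespace is trimmed and empty labels are
    dropped. The service's own handling of these cases is not shown here.
    """
    return [part.strip() for part in field.split(delimiter) if part.strip()]


def labels_for_mode(field: str, mode: str, delimiter: str = "|") -> List[str]:
    """Return the class labels for one document under the given mode."""
    if mode == "MULTI_CLASS":
        # Multi-class (single-label): the whole field is one class label.
        return [field.strip()]
    if mode == "MULTI_LABEL":
        # Multi-label: one or more labels separated by the delimiter.
        return parse_labels(field, delimiter)
    raise ValueError(f"Unknown classifier mode: {mode!r}")


if __name__ == "__main__":
    print(labels_for_mode("COMEDY|DRAMA", "MULTI_LABEL"))
    print(labels_for_mode("COMEDY", "MULTI_CLASS"))
```

Passing a different `delimiter` argument mirrors the case where the default pipe character is overridden.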
Classification mode indicates whether the documents are MULTI_CLASS or MULTI_LABEL.
Model type of the flywheel's model.
Language code for the language that the model supports.