Automation rules overview
The confident automation framework is built on top of the following components:
- Built-in validation checks
- Extraction confidence scores
- History-based data checks
- Custom validation checks
- Fields not found
The outputs of all the components are combined and can result in the automation of a specific datapoint. The whole annotation can be automated if each of its datapoints has at least one validation source.
Below is an example of a datapoint that was validated by all of the automation components. If the validation_sources attribute of a datapoint is non-empty, the datapoint can be automated.
"validation_sources": [
"score",
"checks",
"history",
"connector",
"not_found"
]
Find more information about the annotation data and how validation_sources relates to it.
Nevertheless, with some setups of the Extraction schema and automation components, you might be unsure how the automation workflow behaves. The following sections describe such scenarios.
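The rule above can be sketched in a few lines of Python. This is a minimal illustration, not Rossum's implementation: the datapoints are assumed to be plain dictionaries carrying a schema_id and a validation_sources list, the schema IDs are made up for the example, and both helper names are hypothetical.

```python
from typing import Any


def can_automate_datapoint(datapoint: dict[str, Any]) -> bool:
    """A datapoint can be automated when its validation_sources list is non-empty."""
    return bool(datapoint.get("validation_sources"))


def can_automate_annotation(datapoints: list[dict[str, Any]]) -> bool:
    """The whole annotation can be automated only if every datapoint qualifies."""
    return all(can_automate_datapoint(dp) for dp in datapoints)


# Illustrative data only; real annotation content has more attributes.
datapoints = [
    {"schema_id": "invoice_id", "validation_sources": ["score", "checks"]},
    {"schema_id": "date_due", "validation_sources": ["not_found"]},
    {"schema_id": "amount_total", "validation_sources": []},
]

# Datapoints without any validation source are the ones holding automation back.
blocking = [dp["schema_id"] for dp in datapoints if not can_automate_datapoint(dp)]
print(blocking)                             # ['amount_total']
print(can_automate_annotation(datapoints))  # False
```

Listing the blocking datapoints this way can help when you investigate why a particular annotation was stopped for review.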
Blocking automation of duplicate documents
By default, duplicate documents are automated if an automation mode is set up. If you would like to turn this off, you can do so over the API, in the automation settings of the queue's settings attribute.
Read more about duplicate detection.
Blocking automation of batch files
Automation of batch files is blocked by default. Rossum assumes that such a file contains multiple invoices inside a single document and that the document should be stopped for human review. You can change the default setup in the automation settings of the given queue over the API, in the same way as when turning off the automation of duplicate documents.
Read more about batch file detection.
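As a rough sketch, the body of a PATCH request that changes both switches on a queue could be built as shown below. The key names automate_duplicates and automate_suggested_edit, and their nesting under settings.automation, are assumptions that should be checked against the queue API reference before use. The helper build_automation_patch is hypothetical, and the block only builds the payload without sending any request.

```python
import json


def build_automation_patch(automate_duplicates: bool, automate_batch_files: bool) -> dict:
    """Build a request body for the queue's automation settings.

    The key names below are assumptions; confirm them in the queue API reference.
    """
    return {
        "settings": {
            "automation": {
                # Assumed key: whether documents detected as duplicates are automated.
                "automate_duplicates": automate_duplicates,
                # Assumed key: whether documents detected as batch files are automated.
                "automate_suggested_edit": automate_batch_files,
            }
        }
    }


# Stop both duplicates and batch files for human review.
payload = build_automation_patch(automate_duplicates=False, automate_batch_files=False)
print(json.dumps(payload, indent=2))
```

Depending on how the API merges nested objects, a partial settings body may overwrite other queue settings, so reading the current settings first and modifying only the automation part is the safer route.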
Automation of fields not found in document
If a field is not required and no value is found for it in a document, Rossum automates the field with the validation source "not_found". If you would like Rossum to automate empty fields based on a confidence score, please upvote this feature in our feature portal.
Different automation scenarios
Have you been thinking about various automation scenarios but are not sure how the workflow will behave? The following overview can help you understand automation in practice.
Scenario | Influence on automation workflow |
---|---|
The field is hidden. | Such a field has no effect on automation. |
The field is visible and required. | The field will be automated if:
|
The field is visible and not required. | The field will be automated if:
|
The field is visible, not required, but a validation error occurred. | A validation error may occur, for example, if a date couldn't be converted to the correct format. A validation error always stops a document from being automated, for both the confident and always levels of automation. |
The field is visible, not required and should be automated anytime there is a value with no validation error | Set the score threshold to 0 for the field in order to make it work. The field will not be automated if:
|
The field is visible, not required, has an empty rir_field_names and the field's value is empty. | Such fields are stopped for validation. Set the score threshold to 0 for the field in order to automate it. The field will not be automated if:
|
Field is populated by upload parameters upload:id | Set the score threshold to 0 for the field in order to automate it. The field will not be automated if:
|
Field's value is filled with default_value |
The field will be automated if:
The field will not be automated if:
|
Enum fields with an extracted value but no confidence score. | Sometimes a value is predicted with no confidence score. This can happen, for example, when you have custom enums that are not predicted by the AI Engine outputs (rir_field_names). If you would like to automate such fields, set the confidence score threshold to 0. |
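The threshold advice in the table can be read as a simple comparison. The sketch below is a simplified model under stated assumptions, not Rossum's actual logic: a missing confidence score is treated as 0, a validation error always blocks automation, and score_allows_automation is a hypothetical helper.

```python
from typing import Optional


def score_allows_automation(
    score: Optional[float],
    threshold: float,
    has_validation_error: bool = False,
) -> bool:
    """Return True when the confidence score alone would let the field be automated."""
    # A validation error stops automation regardless of the score.
    if has_validation_error:
        return False
    # Assumption: a value without a confidence score behaves like a score of 0.
    effective_score = 0.0 if score is None else score
    return effective_score >= threshold


print(score_allows_automation(0.97, threshold=0.95))  # True
print(score_allows_automation(None, threshold=0.95))  # False
print(score_allows_automation(None, threshold=0.0))   # True
print(score_allows_automation(0.99, threshold=0.0, has_validation_error=True))  # False
```

Under this model, a threshold of 0 is what lets a field with no confidence score pass, which matches the recommendation for custom enums and fields filled from upload parameters.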