Automation rules overview

The confident automation framework is built on top of the following components:

Built-in validation checks
Extraction confidence scores
History-based data checks
Custom validation checks
Fields not found

The outputs of all these components are combined and can result in the automation of a specific datapoint. The whole annotation can be automated if every one of its datapoints has at least one validation source.

Below is an example of a datapoint that was validated by all of the automation components. If the validation_sources attribute of a datapoint is non-empty, the datapoint can be automated.

"validation_sources": [
  "score",
  "checks",
  "history",
  "connector",
  "not_found"
]
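The rule above can be sketched in a few lines of Python. This is a minimal illustration, not Rossum's implementation: it assumes each datapoint is available as a plain dictionary carrying a validation_sources list, and the function names are hypothetical.

```python
from typing import Dict, List


def datapoint_can_be_automated(datapoint: Dict) -> bool:
    """A datapoint qualifies when its validation_sources list is non-empty."""
    return bool(datapoint.get("validation_sources"))


def annotation_can_be_automated(datapoints: List[Dict]) -> bool:
    """The annotation qualifies only when every datapoint qualifies."""
    return all(datapoint_can_be_automated(dp) for dp in datapoints)


# Illustrative data only: one datapoint has no validation source.
datapoints = [
    {"schema_id": "invoice_id", "validation_sources": ["score", "checks"]},
    {"schema_id": "date_issue", "validation_sources": []},
]
print(annotation_can_be_automated(datapoints))
```

In this sketch a single datapoint with an empty list is enough to stop the whole annotation for review.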

For more information, see the documentation on annotation data and how the validation_sources attribute relates to it.

Even so, some combinations of Extraction schema settings and automation components can leave you unsure how the automation workflow will behave. The following sections describe such scenarios.

Blocking automation of duplicate documents

By default, duplicate documents will be automated if an automation mode is set up. If you would like to turn this off, you can do so via the API, in the automation settings of the queue's settings attribute.

Blocking automation of batch files

Automation of batch files is blocked by default. Rossum assumes that such files contain multiple invoices in a single document and that the document should be stopped for human review. You can change this default via the API in the automation settings of the given queue, in the same way as when turning off the automation of duplicate documents.
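As a rough sketch, the two queue-level switches described above could be expressed as a request body like the one built here. The key names automate_duplicates and automate_suggested_edit, and their nesting under settings.automation, are assumptions that should be verified against the Rossum API reference before use. The helper name is hypothetical, and the snippet only builds the JSON body; it does not send any request.

```python
import json


def build_automation_settings_body(automate_duplicates: bool,
                                   automate_batch_files: bool) -> str:
    """Build a JSON body for updating a queue's automation settings.

    The key names below are assumptions, not confirmed by this article.
    """
    body = {
        "settings": {
            "automation": {
                # Assumed key: whether duplicate documents may be automated.
                "automate_duplicates": automate_duplicates,
                # Assumed key: whether batch files may be automated.
                "automate_suggested_edit": automate_batch_files,
            }
        }
    }
    return json.dumps(body, indent=2)


# Block automation of duplicates and keep batch files blocked as well.
print(build_automation_settings_body(False, False))
```

The resulting body would typically be sent in an update request to the queue in question. Check how partial updates of the settings attribute are handled so that other queue settings are not overwritten.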

Automation of fields not found in document

If a field is not required and no value is found for it in a document, Rossum will automate the field with the validation source "not_found". If you would expect Rossum to automate empty fields with some confidence, please upvote this feature in our feature portal.

Different automation scenarios

Have you been thinking about various automation scenarios but are not sure how the workflow will behave? The following overview can help you understand automation in practice.

Scenario	Influence on automation workflow
The field is hidden.	Such a field has no effect on automation.
The field is visible and required.	The field will be automated if it has a value and at least one validation source, such as a built-in check ("checks") or a confidence score over the threshold ("score"). The field will not be automated if: (1) the confidence score is below the score threshold and no other validation source enables automation; (2) the confidence score is equal to or higher than the score threshold, but the field is evaluated with a negative built-in check; or (3) the value is empty. Required fields always need to have a value.
The field is visible and not required.	The field will be automated if it has at least one validation source, such as a built-in check ("checks") or a confidence score over the threshold ("score"). If nothing is extracted, the confidence score will be null and the document will be automated regardless of the threshold that is set. The field will not be automated if: (1) the confidence score is below the score threshold and no other validation source enables automation; or (2) the confidence score is equal to or higher than the score threshold, but the field is evaluated with a negative built-in check.
The field is visible, not required, but a validation error occurred.	A validation error may occur, for example, if a date cannot be converted to the correct format. A validation error always stops a document from being automated, for both the confident and the always levels of automation.
The field is visible, not required, and should be automated whenever it has a value with no validation error.	Set the score threshold for the field to 0. The field will not be automated if it is evaluated with a negative built-in check.
The field is visible, not required, has an empty rir_field_names and the field's value is empty.	Such fields are stopped for validation. To automate the field, set its score threshold to 0. The field will not be automated if it is evaluated with a negative built-in check.
The field is populated by upload parameters upload:id.	To automate the field, set its score threshold to 0. The field will not be automated if it is evaluated with a negative built-in check.
The field's value is filled with default_value.	The field will be automated if its score threshold is set to 0. The field will not be automated if: (1) the field is evaluated with a negative built-in check in relation to other fields; or (2) the default_value does not comply with the data type of the field and a formatting error is returned.
Enum fields with an extracted value but no confidence score.	Sometimes a value is predicted with no confidence score. This can happen, for example, when you have custom enums that are not predicted by the AI Engine outputs (rir_field_names). If you would like to automate such fields, set the confidence score threshold to 0.
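The scenarios in the table can be summarized as a small decision function. The sketch below is a simplified model under stated assumptions, not Rossum's actual logic: the FieldState class and its attribute names are hypothetical, the 0.8 threshold is only an illustrative value, a missing confidence score is treated as 0, and validation sources other than "score" and "not_found" (such as "history" or "connector") are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldState:
    """Hypothetical, simplified view of one schema field and its datapoint."""
    value: str = ""
    score: Optional[float] = None     # None when there is no confidence score
    score_threshold: float = 0.8      # illustrative value only
    hidden: bool = False
    required: bool = False
    has_rir_field_names: bool = True  # False when rir_field_names is empty
    check_failed: bool = False        # negative built-in check
    validation_error: bool = False    # e.g. a date that could not be converted


def blocks_automation(f: FieldState) -> bool:
    """Return True when this field would stop the document for review."""
    if f.hidden:
        return False                  # hidden fields have no effect
    if f.validation_error or f.check_failed:
        return True                   # errors and negative checks always block
    if not f.value:
        if f.required:
            return True               # required fields need a value
        if f.has_rir_field_names:
            return False              # nothing extracted: "not_found"
    # Assumption: a missing score behaves like 0, so only threshold 0 passes.
    score = f.score if f.score is not None else 0.0
    return score < f.score_threshold


# An enum value with no confidence score passes only with threshold 0.
print(blocks_automation(FieldState(value="EUR", score=None)))
print(blocks_automation(FieldState(value="EUR", score=None, score_threshold=0.0)))
```

Under this model, setting the score threshold to 0 lets any value through unless a validation error or a negative built-in check is present, which matches the advice repeated in the last rows of the table.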